In this post, we’ll explore a formula-based approach that does not use VBA to find the last occurrence of a delimiter and use it to retrieve the lowest sub account from a Quickbooks account list in Excel.
When I was doing some research for a project I had a couple of years ago, I was surprised by the many creative ways that Excel users have approached this task. There are several well documented methods, each creative and beautiful.
This post walks through the logic of one such solution, and explores each of the functions it uses. If you just want the answer, without all of the narrative, just skip down to the end for the formula and sample workbook.
Although this technique can be applied in many situations, the illustration used in this post is that we receive an exported account list from Quickbooks, and we need to split the full account name in order to find the lowest sub account.
Quickbooks typically uses a colon (:) between accounts, as shown in the screenshot below:
We need a single column that contains the lowest sub account. For row 11, the formula should return the account. But, for row 12, the formula should return the Design Income account. For row 15, the formula should return the Subcontracted Labor Income account. We need the lowest sub account, regardless of the number of sub accounts in any given row.
This is exactly what we will do with a sophisticated formula. But, before we do that, let’s discuss some features that offer alternative approaches.
There are at least two features that we could use to help with this task. One feature is text-to-columns (Data > Text to Columns), This works well when there are the same number of sub accounts in each row. If however, some accounts have 2 sub accounts, and some have 3 sub accounts, then the lowest sub account will not appear in the same column.
If you happen to be using Excel 2013, the FlashFill feature is easy to use and may be an alternative that would work well. See the FlashFill post for more info.
If this task was part of a one-time project, I would probably use FlashFill and be done. It is fast and easy to use. If however, this was a recurring-use workbook, then, I would probably consider using a formula-based approach instead.
You see, it is our preference to eliminate all manual steps from a recurring-use workbook. If a workbook will be used in future periods, we prefer to use a formula approach rather than a manual feature approach. That way, when we paste in the updated data, such as the updated account export, the formulas will automatically prepare the data for our use without any additional manual steps. We love storing our data in tables because they auto-expand and auto-fill calculated columns. So even if there are more rows than last period, the formula will be filled down to include any new rows.
In order to eliminate a manual step from a recurring-use workbook, I’ll invest a significant amount of time trying to get the formula dialed in. This upfront investment pays handsome efficiency dividends each subsequent period. The approach presented in this post is one such investment. It may take longer initially than using FlashFill, but the benefit is that each subsequent period you can just paste-and-go.
There are several great ways to accomplish this task using formulas. I had trouble figuring out who should get credit for the idea used below since this approach is presented in many web pages. But whoever you are, your approach is brilliant…thank you!
We want to set up a formula that retrieves the lowest sub account. If there were only one delimiter in a row, such as in row 12 above, then, this task is super easy. However, when there is more than one delimiter, such as in rows 15 and 16, this task becomes more difficult. When you are trying to write a consistent formula that you can fill down throughout a range, and you want it to work on cells with no delimiter (e.g., row 11), cells with one delimiter (e.g., row 12), cells with two delimiters (e.g., row 15), and cells with any number of delimiters, then, the task suddenly becomes challenging.
If finding a single delimiter is easy, but finding the last delimiter when there are multiple delimiters is hard, we can simplify our life by replacing the last (right-most) delimiter with a unique character.
To illustrate this idea, let’s simplify our account names for a moment to make it easier to visualize. Here is a full account name as it comes to us in the export:
SegmentA : SegmentB : SegmentC : SegmentD
Using built-in Excel functions, it is hard to retrieve SegmentD because there are multiple delimiters (:). But, if we had a way to somehow identify or tag the last delimiter, and replace it with a unique character, such as $, then our job is easy. Ideally, we could convert the full account to something like this:
SegmentA : SegmentB : SegmentC $ SegmentD
Now we could ask Excel to find the tag ($) and then retrieve the values to the right of it.
This is essentially what our formula will accomplish. Before we write the formula, let’s take a moment to explore the functions it will use.
The functions we will need in our formula are:
- LEN – returns the number of characters in a text string
- SUBSTITUTE – replaces one character for another character
- RIGHT – returns characters from the right of a text string
- FIND – returns the position number of a character in a text string; error if not found
- IFERROR – replaces an error with a specified value
Let’s take them in the order that we’ll need them. As you read through these steps, remember, the ultimate goal is to identify the final delimiter, replace it with a unique tag, and then retrieve the characters that lie to the right of it.
Step 1: Compute the Number of Delimiters
Our first task is to figure out how many delimiters (:) appear within the full account. For example, we need to return 3 for the following account:
If we could somehow count the number of characters in the full account, and then subtract from that the number of characters in a text string that excluded the delimiters, then we would have it.
For example, if we could count the number of characters in the text string above, and then subtract from that the number of characters in the text string below, we would have it:
In order to accomplish this we first need the LEN function. The LEN function returns the length of a text string, which basically means it counts the number of characters.
Assuming cell A1 contained the full account (SegmentA:SegmentB:SegmentC:SegmentD), the following function would return the number of characters, 35:
Now that we have the total length, we need to subtract from it the length of a text string that has no delimiters. The function we’ll need to accomplish this is the SUBSTITUTE function. This function returns a text string after having replaced selected characters with desired characters.
For example, we could use the function to replace every occurrence of the delimiter with any other character. Or, we could replace every delimiter with an empty text string, essentially removing all delimiters from the result. Assuming our full account is in A1, we could return SegmentASegmentBSegmentCSegmentD by using the following:
This function will find each occurrence of the delimiter (:), and replace it with an empty text string “”, effectively removing all delimiters.
Back to the task at hand, we compute the number of delimiters in A1 by taking the length of A1 and subtracting the length of a text string that has the delimiters removed. This is accomplished by using the following:
The formula takes the length of A1, which is 35, and subtracts from it the length of a text string without delimiters, or 32, which leaves the number of delimiters, 3.
Now that we have the total number of delimiters in the account, we can move to the next step, replacing the last delimiter with a unique character.
Step 2: Replace Final Delimiter with Unique Character Tag
We need to replace the final delimiter with a unique character that we can later use as a marker, so that we can retrieve the characters to the right of it.
Can you think of a function that we can use to substitute one character for another? Yes, the SUBSTITUTE function. However, one little detail. The SUBSTITUTE function as used above replaces all occurrences of the matching character. The SUBSTITUTE function has an optional fourth argument however, that enables us to replace one specific occurrence of the character. The fourth argument, known as instance_num, allows us to replace just the final delimiter with our unique character.
In order for us to know the instance number of the final delimiter, we need to know how many there are. Fortunately, we determined how many delimiters there are in the previous step. If we discovered that there are 3 delimiters in the text string, we can replace the third (final) instance with a unique character, such as $, by using the following formula:
Rather than replacing all delimiters, this function only replaces the third instance or occurrence of the delimiter, and returns the following text string:
Since we want to fill the formula down, we’ll need to replace the integer 3 with the underlying functions that compute it, as follows:
So far so good? OK, let’s move to the next step.
Step 3: Right
Now that we have tagged the final delimiter with a unique character, we can simply pull the values that lie to the right of it.
We can use the RIGHT function to retrieve characters from the right side of a text string. The function requires us to tell it the number of right-most characters to retrieve. The text string we have built thus far is:
We want to use the RIGHT function to retrieve SegmentD, so, we could use the following:
This would return the right 8 characters from the string. Since account names have a variable number of characters, we can’t use an integer value for the second argument, 8. We’ll instead need to use functions to compute the number of characters.
Fortunately, we have tagged the position with a unique character ($) that we can use to determine the number of characters in the account name. To compute the number of characters that lie to the right of the tag, we compute the total number of characters in the text string and then subtract the position number of the tag.
We already know how to compute the total number of characters in a text string using the LEN function, so, let’s figure out how to determine the position number of the tag.
We can use the FIND function to return the position number of our unique character tag $. Let’s assume for a moment that we stored the following value in cell A1:
We could compute the position of the tag by using the following function:
This formula returns 27, telling us that the tag $ appears as the 27th character within the text string.
If we subtracted the total length, 35, from the tag position, 27, then we are left with the number of characters to the right of the tag, 8.
Thus, we compute the number of characters to the right of the tag by using the following:
Now that we know how many characters are in the account name, we can use the RIGHT function to return it:
This will return the characters to the right of the tag, which in our case is the account name SegmentD.
Our tagged text string is not stored in A1, it has been calculated, and thus the actual formula we need, which combines all of the steps thus far, follows:
Step 4: Error Trap
In the case where a delimiter is not found, the FIND function will return an error. When no delimiter is found in the cell, we want to return the original cell value rather than an error. Thus, we’ll wrap an IFERROR function around the whole thing. The IFERROR function allows us to substitute a value when a function returns an error.
For example, if the FIND function can’t find a delimiter, we could substitute the original cell value for the error with the following formula:
Wrapping an IFERROR function around our formula results in the following:
This final formula works to retrieve the values to the right of the last occurrence of the delimiter. In a way, it is like performing a right-to-left search. But, since Excel’s FIND function doesn’t support searching from the right, we use the above formula instead.
When we use a similar formula with our Quickbooks account list, we are able to populate a column that contains the lowest sub account, as displayed below:
For more detailed information on the functions or their arguments, please consult the Excel help system.
Account numbers only?
Could we pull the account numbers into their own column? Sure, now that we have a column that is the lowest sub account, we could split that into account number and name columns with a variety of different approaches.
We could use the LEFT function to retrieve the left 5 characters. This of course assumes all account numbers are 5 characters. If they are variable length, we would need to rely on a delimiter and use the FIND function.
Since the LEFT function returns a text string, we could wrap the VALUE function around it to convert it to a numeric value if needed.
We could use the RIGHT function along with the LEN function to pull the account names into their own column.
Although Quickbooks uses the colon delimiter, this formula could be tweaked to be made more universal by setting up a cell that stores the delimiter character, and then naming it. Similarly, we could place the tag character in a cell and name it too. If we named the delimiter cell d, and the tag character cell tag, the updated formula would be:
The approach presented in this post certainly uses a sophisticated formula. Figuring out and implementing a formula like this makes sense for recurring-use workbooks since you’ll be able to improve your productivity in future periods.
The sample workbook provided below includes a working formula for reference:
If you have an alternative approach that you prefer, please post it in the comments below, I’d love to hear about it!