Use the Column Header to Retrieve Values from an Excel Table

This post discusses ways to retrieve aggregated values from a table based on the column labels.

Overview

Beginning with Excel 2007, we can store data in a table with the Insert > Table Ribbon command icon. If you haven’t yet explored this incredible feature, please check out this CalCPA Magazine article Excel Rules.

Frequently, we need to retrieve values out of data tables for reporting or analysis. This task is fairly easy using traditional lookup functions or conditional summing functions. However, when preparing workbooks to be used on an ongoing basis, we need to keep the formula consistency principle in mind. This means we write consistent formulas within a range, so that we can fill them down and right. Often, this is just a matter of setting the cell references properly, such as relative, absolute, or mixed. However, when our data is stored in a table, we can use structured table references and column headers to build consistent formulas.

Objectives

Let’s identify our goals and objectives before we begin. We have exported some invoice information from our accounting system, and have stored it in a table named tbl_inv, pictured below:

20140123c

We would like to retrieve values from the table, and aggregate them by customer ID, in order to populate our little summary report, pictured below:

20140123f

Our first goal is to write a single formula in C8, and then fill it down and to the right. That is, to use consistent formulas.

Next, let’s examine our data and our report. One thing to notice is that each customer may appear on many rows. Thus, we need to use a conditional summing function such as SUMIFS rather than a traditional lookup function, which would only return the related value from the first matching item. If you are not familiar with the SUMIFS function, please check out Multiple Condition Summing in Excel with SUMIFS.

Next, we notice that the report column order differs from the data column order. The report order is custid, amount, tax, shipping, and total. The order of the data columns are custid, shipping, tax, amount, and total. That means we need to write a formula that can accommodate the column order differences.

So, to summarize our objectives:

  • Use consistent formulas
  • Aggregate multiple rows
  • Accommodate column order differences between the data and report

We can meet our objectives by nesting the INDEX and MATCH functions inside of our SUMIFS function to dynamically select the proper sum column. Let’s unpack the formula step by step.

Formula

The first argument of the SUMIFS function is the column of numbers to add. When populating our report’s first column, the column to add is the table’s amount column tbl_inv[Amount]. Our formula would be something like this:

=SUMIFS(tbl_inv[Amount],tbl_inv[CustID],$B8)

However, when populating the next report column, the column of numbers to add is the table’s tax column tbl_inv[Tax]. Our formula would be something like this:

=SUMIFS(tbl_inv[Tax],tbl_inv[CustID],$B8)

Since the first function argument needs to be different for each report column, we are forced to write unique formulas for each column. This does not fall in line with our formula consistency objective. So, the question is, how do we express the first SUMIFS argument so it is dynamic?

One approach is to use the INDEX/MATCH functions. If you are not familiar with the INDEX/MATCH functions, please feel free to check out How to Return a Value Left of VLOOKUP’s Lookup Column for more information. In that blog post, we discussed how the INDEX function returns a cell value, but, it does much more than that.

The INDEX function can actually return a cell value or a range reference. Microsoft describes the function as having two “forms”, the array form and the reference form. It is a fancy way to say that the function can return either a cell value (array form) or a range reference (reference form). We’ll ask the function to return a range reference that can be used as the first SUMIFS argument.

When we are done, the first argument of the SUMIFS function will be dynamic, and it will use INDEX to return a range reference and MATCH to dynamically figure out which column. It will look a little something like this:

=SUMIFS(INDEX(MATCH(...)), tbl_inv[CustID],$B8)

The INDEX/MATCH functions will provide the SUMIFS function with the column of numbers to add. The basic idea is that we will ask the INDEX function to return a reference and we will ask the MATCH function to tell the INDEX function which column to refer to based on the header value. MATCH will look for our report column header, such as Amount, in the table’s header row. The assumption here is that the report header labels match the data header labels.

Since the MATCH function returns the relative position number of a list item, we ask it to tell us the column number of the matching report label. For example, the Amount column is the 6th column, so, it would return 6 to the INDEX function. INDEX uses this information to return the Amount column reference to the SUMIFS function. SUMIFS uses this reference as the column of numbers to add.

Since there are several moving parts, we’ll just ease into this formula. First, let’s replace the first argument of the SUMIFS function with an INDEX function. The resulting formula looks like this:

=SUMIFS(INDEX(tbl_inv,,6),tbl_inv[CustID],$B8)

This formula uses the SUMIFS function to add a column of numbers. The column of numbers to add is the first argument, which is an INDEX function. The first argument of the INDEX function provides the initial range, the whole table, tbl_inv. The second argument of the INDEX function is the row_num argument, and tells the INDEX function which row number to return. Since we want to return all rows, we leave this argument blank. The third argument of the INDEX function is the column_num argument, and we entered 6 because the table’s amount column is the 6th column. Thus, the INDEX function above returns the range reference corresponding to the 6th column in the table to the SUMIFS function.

However, we can’t stop here because we hard-coded the column_num argument 6. If we fill this formula to the right, the 6 would remain and all report column would return the same result, the sum of the amount column. Instead, we need to ask Excel to dynamically figure out which column number the INDEX function should use. And this is accomplished with the MATCH function. The following MATCH function will figure out which column has a matching column header label.

=MATCH(C$7,tbl_inv[#Headers],0) 

This function looks for the value in C$7 in the headers row of the table. We use a relative column reference (C) so that it updates as it is filled right, and an absolute row reference ($7) so that it is locked onto the report header row. Note the special structured table reference that refers to the headers row. It starts with the table’s name, tbl_inv, and then #Headers enclosed in square brackets. You can type in this reference or just use your mouse to select it interactively. The third argument, 0, tells the function we are looking for an exact match.

Since the MATCH function above returns 6, we need to nest it inside of the INDEX function. If we nest the MATCH function inside the INDEX function, we get the following:

INDEX(tbl_inv,,MATCH(C$7,tbl_inv[#Headers],0))

This formula segment returns the column reference needed by the SUMIFS function. So, nesting the INDEX/MATCH functions inside the SUMIFS function results in the following:

=SUMIFS(INDEX(tbl_inv,,MATCH(C$7,tbl_inv[#Headers],0)),tbl_inv[CustID],$B8)

We could write this formula in C8, and then fill it down and right to create the report. But, there is one more little detail.

If this workbook is designed to be used on a recurring basis and to remain in place for a long time, then we need to think about ways that a future user could break the workbook and address any risks up front, as discussed in Excel University Volume 1 Chapter 19. We therefore need to consider the different ways that future users may try to fill our formula down and to the right, because the way they fill right could accidentally break our formula.

Specifically, if a user tries to fill the formula right with the fill command, a copy/paste, or, with Ctrl+Enter, then Excel treats the tbl_inv[CustID] reference as absolute. This is good because as we fill the formula right we do want all formulas to reference the customer id column. However, if a user chooses to fill the formula right by dragging the fill handle, then, the reference is treated as relative, meaning, it will also slide to the right and change to tbl_inv[Shipping], tbl_inv[Tax], and so on.

So, to be safe, we’ll modify the reference slightly so that Excel treats it as absolute even if a user fills right with the fill handle. We’ll update it to a single-column range reference: tbl_inv[[CustID]:[CustID]]. The final formula is:

=SUMIFS(INDEX(tbl_inv,,MATCH(C$7,tbl_inv[#Headers],0)),tbl_inv[[CustID]:[CustID]],$B8)

We enter the formula in C8, and fill it down and right, and the resulting report is pictured below.

20140123g

Although it may feel like it took us a long time to get here, the end result is a formula that meets our objectives. It can be filled down and to the right and continue to work, it accommodates column order differences, and it aggregates values.

Conclusion

Generally, it is worth the time to create consistent formulas that can be filled down and right and continue to work for recurring use workbooks because it makes updating them over time faster. Additionally, writing formulas that can accommodate minor structure changes, such as the column orders moving around over time, will help reduce errors and improve efficiency. Although the upfront time investment getting a formula like the one above to work may be significant, you’ll receive your investment back each subsequent period through improved productivity.

Sample File 

To check out the Excel file that was used to prepare the screenshots, feel free to download the sample file:

RetrieveTableData

To check up the updated version that replaces the $B8 cell reference with a named reference, feel free to download the sample file:

RetrieveTableData2

Additional Notes

  • If you need to retrieve a text string, rather than the sum of numbers, you’ll probably want to use traditional lookup functions, such as VLOOKUP. The workbook sample file includes a VLOOKUP sheet with the relevant formulas.
  • If you’d like more assistance with the SUMIFS, INDEX, and MATCH functions, please feel free to check out our online Excel training courses.
  • If you have another approach you prefer, please post a comment, we would love to hear about it!

This article was written by Jeff Lenning

37 comments:

  1. Hanna Azar
    Reply

    WOW ,,, you solved a great problem for me ,,, very much appreciated …

    1. jefflenning Post author
      Reply

      You are very welcome!

  2. L E Boyd
    Reply

    Is there any way to replace the $b8 with something like [Invoice]8?

    1. jefflenning Post author
      Reply

      L E Boyd,
      Yes! One way to replace the cell reference $B8 is by naming the CustID report column with a name such as CustID. Then, you can use the name CustID in your formula, and, you can use it without a row reference since Excel will default to referencing the cell within the range that lies on the same row as the formula. I’ve uploaded an updated version that demonstrates this technique for reference. The download is RetrieveTableData2.
      Hope it helps!
      Thanks
      Jeff

  3. Mikef
    Reply

    This gets me halfway to where I want to be. I have a similar issue but I also have multiple tables that have identical structure. I need to be able to dynamically change the table name. In the past, I have been able to construct formulas that dynamically change by using the indirect formula. That appears to only create cell references. I have tried concatenate and text function but I keep getting #value errors. When I look at the the results of the function argument, it appears the table[col] argument has quotes around it.

    1. jefflenning Post author
      Reply

      Mikef,

      If you need to dynamically change the table name, you are exactly correct, you’ll want to use the INDIRECT function. Taking it one step at a time, you can replace the structured table references with the INDIRECT function as follows:

      =SUMIFS(INDEX(INDIRECT("tbl_inv"),,MATCH(C$10,INDIRECT("tbl_inv[#Headers]"),0)),INDIRECT("tbl_inv[[CustID]:[CustID]]"),$B11)

      Once that formula is working, we can put the table name in a cell and then use the cell reference in the formula. For example, if we stored the table name (tbl_inv) in cell C6, then, we could update the formula as follows:

      =SUMIFS(INDEX(INDIRECT($C$6),,MATCH(C$10,INDIRECT($C$6&"[#Headers]"),0)),INDIRECT($C$6&"[[CustID]:[CustID]]"),$B11)

      The key to making INDIRECT work is that you’ll want to store the table name in a cell to make it easy to update, and then use concatenation to join text string references to the table name, for example, INDIRECT($C$6&”[#Headers]”).

      Hope this helps!

      Thanks.
      Jeff

      1. Kris Lyttle
        Reply

        I have a slightly different problem: I have multiple reference tables on a separate sheet from my summary table… and I’m using Data Validation Drop-down lists in the summary. When I tried the formula, it doesn’t work.

        Is there a work-around? Thanks in advance.

        1. jefflenning Post author
          Reply

          Kris,
          If you have multiple reference tables, and you’d like your formula to retrieve values from different tables based on the selection in a drop-down, then you could consider using the INDIRECT worksheet function. This technique is discussed in the following Excel University blog post, I hope it helps!
          https://www.excel-university.com/referring-to-tables-indirectly-with-indirect/
          Thanks
          Jeff

  4. Gwen
    Reply

    I have a table of data, first name last name and company name as well as badge number and classification, I have written the range_vlookup but ultimately I want to be able to type into one cell and “find” a complete name and have the vlookup complete the other columns fields for there , for example if I type it in “jas” it will present me with the options of all the Jason’s that are available in the table I can then choose the right one and the rest off the formula will complete, for the life of me I cannot find how to create a search if the table to work with the vlookup is there a way?

  5. ron omegna
    Reply

    jeff,

    this is excellent – thank you for the insight on the headers row.
    can I extend the concept with a question?
    can you UPDATE a header value?

    I take a weekly feed of data; the fields I choose can be modified in each feed.
    the source sends the main extract AND a ‘key’ extract to match.
    the ‘key’ extract contains the mnemonic column ‘headers’ for the main extract, plus the English language equivalent of the mnemonic.

    so, is there a way to replace/update the cryptic mnemonic in the main extract with the English language equivalent? since I put the main extract in pivot tables and into the power pivot data model, the real names would far easier to use and remember. I suppose it would be better to rename the mnemonic only for the data model? and not in the xls side of excel2013? the main extract is the imported data, if I rename the headers, on the next ‘refresh all’, will I have compromised the import?
    [I apologize if I have confused the issue – just beginning to learn this whole import external data and the power pivot data model]

    thanks,
    ron

    1. jefflenning Post author
      Reply

      Ron,

      Once the extract has hit the worksheet, the only way that I can think of to quickly replace the mnemonic header with the English language equivalent would be via macro. Other ideas I considered are formulas, but, we are prevented from using formulas in table headers. If you weren’t going to store the data in a table, then, you could use formulas with the VLOOKUP function to retrieve the English value in a new row for each mnemonic value. The find/replace tool is too manual for a recurring process. I suppose another approach would be to switch the headers prior to the data hitting Excel, for example, store the export in Access and then retrieve it into Excel with the external data feature.

      But, I think if this were my project, I would probably try to write a macro that loops through the key extract and does a FIND/REPLACE in the main extract’s header row. That macro would probably be fairly easy to build, and you can start by recording a macro and then manually doing a FIND/REPLACE to learn the VBA syntax.

      Anyhow, those are my 2 cents…hope the ideas are helpful!

      Thanks,
      Jeff

  6. Raghava
    Reply

    Hi,

    What if i have multiple amount columns from which i have to get the values and sum matching the customer ID.

    Regards
    Raghava

    1. jefflenning Post author
      Reply

      Raghava,
      If you have multiple amount columns that you need to add, then, one approach is to use the addition operator with two or more SUMIFS function, for example:
      =SUMIFS(…)+SUMIFS(…)+SUMIFS(…)
      Hope this idea helps!
      Thanks
      Jeff

  7. Nade
    Reply

    HI Jeff,

    This has been very helpful, however I’m stuck with a formula because I have multiple conditionals, so far this is the formula I got:

    =SUMIFS(INDEX(tbl_inv,,MATCH(C$7,tbl_inv[#Headers],0)),tbl_inv[[CustID]:[CustID]],[@CustID],tbl_inv[Date],”01-01-2015″,tbl_inv[Invoice],”50756″)

    Can you please help to get the right formula?

    Thanks!

    1. jefflenning Post author
      Reply

      Most likely, the function is not returning the desired result due to the date criteria value argument, “01-01-2015”. Excel is expecting a valid Excel date serial, and “01-01-2015” is a text string. I would recommend storing the date value in a cell, such as A1, and then changing the argument from a text string to a cell reference, such as A1. This should resolve the issue.
      Thanks
      Jeff

  8. Jyothi Suresh
    Reply

    Hi Jeff,

    This has been very useful to me, however i am stuck if i have to add condition on the header column. In the above example i want to add the condition of sum all the amounts <500.

    Can you please help to get the right formula?

    Thanks!

    1. jefflenning Post author
      Reply

      Jyothi,

      The challenge with this task is that Excel automatically converts header cells into text strings, thus making comparisons difficult. One option would be to store the data in an ordinary worksheet range instead of a table so that you can use a horizontal SUMIFS function and compare the cell values in the column header row.

      Thanks,
      Jeff

  9. Linda
    Reply

    Jeff,
    This solution is exactly what I need! But I’m having trouble when I replace the absolute column reference (6, in the above example) with the MATCH function. I have made sure the formats for B$8 match the data table’s header row but it keeps returning a #N/A. My data table’s header row contains one text string in the first column while the remaining columns are dates. I have tried replacing #Headers with a named range that excludes the text string with no success. Any additional guidance would be most welcome!

    1. jefflenning Post author
      Reply

      Linda,

      This could be one of two issues. First, double check that you’ve defined the third argument in the MATCH function as 0. If not, you could be getting the #N/A error. If that doesn’t fix it, then, my best guess is that it is the dates in the table’s header row. You see, a date value is different than a text string, and when you convert an ordinary range into a Table, Excel automatically converts the data type of dates in the header row to text strings. Since the stored values are not longer date values, Excel can’t MATCH them to a date value. Fortunately, you can use the TEXT function to convert the date value to a text string so that Excel can make the MATCH. An updated MATCH function would look like this:
      =MATCH(TEXT(B$8,”m/d/yyyy”),Table1[#Header],0)
      And, you’d use whatever date format code matches to the text string in the table’s header row.

      Hope this helps!

      Thanks,
      Jeff

  10. Linda
    Reply

    Using the Text function did the trick!!
    Thanks so much!

    1. jefflenning Post author
      Reply

      Yay! Welcome 🙂

  11. Anna
    Reply

    Thanks so much for this, it has been so helpful! I have the formula correctly matching my version of CustID but run into trouble when I paste over the data table with a new dataset. The column header and values for my CustID are still in the new dataset, but has moved. So say my CustID was in column C originally, it is now in column F. Instead of finding the CustID data I want in column F, the formula changes itself to look up the new data that now exists in the original column C. Would you know if there is a fix for this? I need to account for my CustID column moving around. Any additional guidance would be most welcome – thanks!

    1. jefflenning Post author
      Reply

      Anna,
      Ah…yes, no problem. Essentially what we did in the post above is to use the INDEX/MATCH functions to dynamically figure out the column of numbers to sum (the SUMIFS first argument). And, in your case, you also want to dynamically figure out the criteria column (the SUMIFS second argument). This means that we’ll want to use the same technique (INDEX/MATCH) for the criteria_range (CustID) argument. To do so, we would replace the existing 2nd argument (tbl_inv[[CustID]:[CustID]] in the above formula) with an INDEX/MATCH argument. Assuming that you stored the label “CustID” in cell B7, then the new 2nd argument would be something like this: INDEX(tbl_inv,,MATCH($B$7,tbl_inv[#Headers],0)).
      The new formula would look something like this: =SUMIFS(INDEX(tbl_inv,,MATCH(C$7,tbl_inv[#Headers],0)),INDEX(tbl_inv,,MATCH($B$7,tbl_inv[#Headers],0)),$B8), where we use INDEX/MATCH arguments for both the first sum_range argument and the second criteria_range argument.
      Hope this helps!
      Thanks
      Jeff

  12. Monas Haregot
    Reply

    Hi Jeff! Many thanks indeed for your great help. I found your formula perfect and i m making use of it. But i miss something equally important too. That is the COUNTIFS that return the number of added items in the SUMIFS function.I tried to modify the SUMIFS function to provide for COUNTIFS but i didn’t succeed. Could you please help me with the latter, with the same example above? Thank you in advance.

  13. Michael
    Reply

    Hi Jeff I am having problems with this formula. The problem is that when I use”>=” it returns a zero value. When I use “=” a value is returned. It’s as if the tale does not like > or =”&11000,TBT[POSTED_TOTAL_SUM])

    1. Michael
      Reply

      Sorry,

      This is the formula: =SUMIF(TBT[FUND_CODE],”>=”&11000,TBT[POSTED_TOTAL_SUM])

      1. Kurt LeBlanc
        Reply

        Hey Michael

        I believe you have the order wrong…can you give me a little more detail about your scenario so that I can help you better?

        Kurt LeBlanc

  14. Cristi
    Reply

    This is super, you solved my problem.
    Thank you very much.

    1. jefflenning Post author
      Reply

      Welcome 🙂

  15. Christina
    Reply

    Hi Jeff,

    What a great post and saved me a lot of tears!

    However, I have to now replicate my workbook. My Table structure changes but still has the same headings. Is there a way of just pasting in my new Table so my formulas refer to the same heading text? The ‘MATCH’ formula refers to a different heading when I update my Table data.

    Hope that made sense!

    Christina

    1. Kurt LeBlanc
      Reply

      Hey Christina,

      I think you just missed something little:) Its difficult to visualize it from your description, but it sounds like you just missed the last section of this blog about how different ways of filling your report can affect the headings.

      Let me know if that fixes your issue and if I can be of more help please
      Kurt LeBlanc

  16. Smruti
    Reply

    Amount. Tax% tax amount
    3000. 5%. 150
    5000. 5%. 250
    8000. 12%. 960
    10000. 5%. 500
    1000. 12%. 120
    6000. 12%. 720

    I have a doubt!
    I want to know how to calculate the total of amount whose tax % is 5% and also total of amount whose tax % is 12% .using sumifs? Need help plz

  17. Cristian
    Reply

    Hello Jeff! I have an issue with a formula where a cell need to be locked in a formula used to delete duplicates from column [Key].

    The formula returns 1 for the duplicates (Except the first one found) and 0 for the non duplicates.

    This is the formula:

    =IF(COUNTIF($I$11:[Key];[Key])>1;”1″;”0″)

    What I would like to do is to replace the $I$11 with the named column [@Key].

    What I have so far is:

    =IF(COUNTIF([Key]:[Key];[Key])>1;”1″;”0″)

    Which almost does the trick but not quite, because I can’t put 0 for the first duplicate found.

    Could you help me with this?

  18. Blossom
    Reply

    My data is laid out like so:

    1/31/17 – I input a new date each month

    C R S
    # 1/31/17 2/28/17
    11 $226 $289
    35 $458 $693
    11 $56 $89
    Can you please tell me the formula that will both use the date at the top of the sheet to go to the correct column and then add only the dollar amounts in that column for the particular # (11 or 35). As you can see the # might show up several time in one month so vlookup isn’t working right, as it only picks up the first instance. I need the total $’s for all the 11’s in January. I have spent hours playing with every sumif, match, index, vlookup, and I just CANNOT get it. You would be my hero if you can give me the correct formula!

    Thank you!

Leave a Reply

Your email address will not be published. Required fields are marked *