Perform Approximate Match and Fuzzy Lookups in Excel

This post explores Excel’s lookup functions, approximate matches, fuzzy lookups, and exact matches.  The built-in Excel lookup functions, such as VLOOKUP, are amazing. When implemented in the right way for special projects or in recurring use workbooks, they are able to save a ton of time. The VLOOKUP function alone has saved countless hours in my recurring use workbooks. However, the VLOOKUP function, similar to Excel’s other lookup functions such as HLOOKUP and MATCH, is built to perform an exact match or a range lookup. Both of these are quite different from an approximate match or a fuzzy lookup. This post discusses the details of these ideas, and demonstrates how to perform a fuzzy lookup in Excel 2010 and later.

*** UPDATED BLOG POST: FUZZY MATCH IS AVAILABLE IN POWER QUERY ***

Understanding Built-In Lookup Functions

The built-in Excel lookup functions, such as VLOOKUP, HLOOKUP, and MATCH, work with similar lookup logic.  To simplify this post, we’ll use just one as the example.  Since the VLOOKUP function is probably the most used and most familiar lookup function, we’ll use it as we explore these ideas.

The basic idea of an Excel lookup function is to look for a value in a list.  For example, we could ask Excel to find “ABC Company” in a list of customer names.  That is the basic idea, but the application of lookup functions are numerous and the implementations can become quite sophisticated and powerful.

For this post, I’d like to split the tasks that a lookup function performs into two steps.  I’ll call step one the match, and step two the return. In the first step, the match, Excel must find the matching value.  You tell Excel the value to find, such as “ABC Company” and you tell Excel where to look, such as in a range of cells.  You are asking Excel to find the lookup value in the lookup range.

Step two, the return, is the function’s result. That is, what value the function should return to the cell.  Some lookup functions, such as the MATCH function, tell Excel to return the position number.  Other lookup functions, such as the VLOOKUP function, tell Excel to return a related value.  So, based on which lookup function you select, and which function argument values you enter, Excel knows what to return once it finds its match. So far so good?

Let’s do a quick example at this point.

I would like to find a specific customer name “ABC Company” in a list of customers, and if found, I would like Excel to return the customer id which is found in the next column.

I would use a VLOOKUP function, and I would ask it to find “ABC Company” in the Customer Table, and return the ID.  Assuming the customer name was entered in C7, and the customers were stored in a Table named Table1, then the following function would do the trick:

=VLOOKUP(C7, Table1, 2, FALSE)

Where:

  • C7 is the value to find
  • Table1 is the lookup range
  • 2 is the column that has the value we wish to return
  • FALSE means we are not performing a range lookup

This function is entered in C8 in the screenshot below.

Basic VLOOKUP Lookup

As you can see, the ID AC100 was successfully returned to the formula cell C8.  And that my friend is the basic idea of the VLOOKUP function.  Find a value (the match) and compute the result (the return).

It is important to note that the lookup value, the text string “ABC Company” must be found in the lookup range.  Except for case (upper and lower), the two values must match exactly.  “ABC Company” would not match “ABC Company, Inc.”, “ABC Co”, or “ABC Company   “.  No leading spaces, no trailing spaces, no extra abbreviations or characters.  They must be the same.  This is called an exact match.  If the value is not the same, the function will not match it, and you’ll get an error, as shown in the screenshot below.

VLOOKUP Error

Now that we have covered the basics, it is time to explore the VLOOKUP’s fourth argument.

 The Truth about the VLOOKUP Fourth Argument

The fourth argument of the VLOOKUP function is officially named: range_lookup.  It is a boolean argument, meaning you can pass it a value of TRUE or FALSE, or any other representation of TRUE or FALSE. The thing that tends to mislead Excel users is the description that Microsoft used for these options.  Excel describes the TRUE value as “Approximate Match” and FALSE as “Exact Match.” A clearer description would have been something like TRUE “You are doing a range lookup” and FALSE “You are not doing a range lookup” but in any event, the descriptions are what they are.

When you select TRUE (Approximate Match) you are not asking Excel to match values that are approximately the same as each other.  The description Approximate Match would tend to imply that the function would match “ABC Company” and “ABC Company, Inc.” since they are approximately the same name.  In some cases and in some data sets, this idea would work.  But this idea does not work in all cases, and thus, can’t be relied upon in our workbooks.  For example, in the screenshot below, the function did not find a match between “ABC Company” and “ABC Company, Inc.” as evidenced by the incorrect ID returned in C8:

VLOOKUP No Match

In the following screenshot however, the function did find a match between “ABC Company” and “ABC Co” as evidenced by the expected ID returned to C8:

VLOOKUP Approximate Match

The way that the function actually works when TRUE is selected is this: it walks down the list row by row, and ultimately stops on the row that is less than the value and where the next row is greater than the value.  This is why the lookup range must be sorted in ascending order for the function to return an accurate result when the fourth argument is TRUE.

This idea can be confusing when thinking about text strings, but makes more sense when thinking about numbers.  For example, when trying to find the correct commission rate based on the sales value.  In this case, you want to perform a range lookup.  You want to look up a value from within a range.  This is illustrated in the screenshot below.

Range Lookup

The function walks down row by row trying to determine which row to stop on.  It continues down until it finds a row that is greater than the lookup value, and then it stops on the previous row.  It stops on the row that is less than the value, and where the next row is greater than the lookup value.  This is pretty easy to understand when thinking about numbers, but can be harder to visualize when thinking about text strings. The key to understanding this function argument however, is to realize that the logic is identical when operating on text strings and numbers.  This is why “ABC Company” does not match “ABC Company, Inc.”, because “ABC Company Inc.” is greater than ABC Company.  This is why “ABC Company” will match “ABC Co”, because “ABC Co” is less than “ABC Company.”  As you can see, this is not what we have in mind when thinking about approximate match.

What is a Fuzzy Lookup aka Approximate Match

An approximate match, to us, means that two text strings that are about the same, but not necessarily identical, should match.  For example, “ABC Company” should match “ABC Company, Inc.,” “ABC Co,” and “ABC Company   .”  We think about an approximate match as kind of fuzzy, where some of the characters match but not all.

The idea of a fuzzy lookup is that the values are not a clear match, they are not identical. But that they are likely a match, there is a probability that they are a match.  They likely represent the same underlying entity.

Now that we realize the VLOOKUP function does not truly perform approximate match logic, at least, not in the way we want it, what do we do?

Add-In

When you hit a wall, go around it.  Since the built-in lookup functions do not perform fuzzy logic when performing the match, we hit a built-in limitation of Excel.  Microsoft has offered a way to work around this limitation by offering a free add-in.

Microsoft offers a free add-in that enables Excel to perform fuzzy lookups.  It is called “Fuzzy Lookup Add-In for Excel” and is available at the time of this post at the link below:

http://www.microsoft.com/en-us/download/details.aspx?id=15011

Once installed, this add-in performs fuzzy lookups.  It does not change the behavior of any of the built-in lookup functions. It does not enable your VLOOKUP functions to perform fuzzy lookups.  It is an add-in which basically processes two lists and computes the probability of a match.

You specify the two tables, and within each table the columns to inspect.  Basically, you define step one the match.  You then define step two by identifying which columns from the tables should be included in the result. You can also specify the probability threshold.  You hit go, and the add-in performs its work, and then outputs the resulting table starting at the active cell.  It basically generates a static report based on the settings you select.

Here is a screenshot of the output, showing that it successfully matched “ABC Company” and “ABC Company, Inc.” in the same data set that caused our VLOOKUP function to fail.

Fuzzy Lookup

For more information about the fuzzy lookup add-in, and more detail on how to use it, please visit the Microsoft link above.  The add-in comes with instructions, a sample Excel file, and a pdf file with background and the logic it uses to do its magic.  It also comes with a license, so, you’ll want to be sure to read the license terms in the LicenseTerms.rtf document included with the download.

There is some extremely interesting computer science and math working behind the scenes, including Jaccard similarity, tokenization of records, and transformations.  Pretty heavy mathematics in there. Thanks Microsoft Research for this add-in!!

Posted in , ,

Jeff Lenning

I love sharing the things I've learned about Excel, and I built Excel University to help me do that. My motto is: Learn Excel. Work Faster.

Excel is not what it used to be.

You need the Excel Proficiency Roadmap now. Includes 6 steps for a successful journey, 3 things to avoid, and weekly Excel tips.

Want to learn Excel?

Our training programs start at $29 and will help you learn Excel quickly.

86 Comments

  1. Marty Toth on June 20, 2014 at 11:08 am

    Jeff,
    This is one of the most impressive informational and detailed instruction posts I’ve ever see on the web. Short and to the point, with wonderful examples, and well thought out. GREAT job!

    • jefflenning on June 20, 2014 at 3:16 pm

      Thanks 🙂

  2. Luciano Evaristo Guerche on July 25, 2014 at 10:10 am

    Do you believe this add-in is able to fuzzy lookup on about 70 columns?

    • jefflenning on July 25, 2014 at 11:36 am

      Luciano – I’ve not tried it on a data set of 70 columns, so I’m not sure…but I’d love for you to let me know what you find out when you give it a try!

      Thanks
      Jeff

  3. muath alzubi on August 5, 2014 at 4:22 am

    man you are amazing. great article,, its very useful,, thanks a lot for your work 🙂 ,, also thanks microsoft 🙂

    • gv on September 7, 2023 at 8:09 am

      could you use “*apple*” in your search term?

  4. Donnchadh McKenna on August 5, 2014 at 10:22 pm

    I thought that this was a wonderful add on for excel. With my job i find i need to find partial matches in the same column of a table and identify them. This seemed to answer my prayers. However i am finding a big issue, which i am hoping is fixable.

    When i do a fuzzylookup on two columns and one has “apple” and the other “apple:” it brings back a perfect match not a partial match. The system is viewing this as a perfect match, it is ignoring the “:”
    This makes the add on useless as this sort of partial match is required to be identified is required as well as “colour” vs “color”. Any thoughts on how this maybe fixed? If it is, this would be perfect. Are there any other free add ons that i could use that would work.
    Thank you
    Donnchadh

    • jefflenning on August 14, 2014 at 7:18 am

      Donnchadh,

      I’m unaware of add-in settings or options that would allow you to modify the built-in behavior to meet your needs. I modified the Similarity Threshold setting and various others, but none seemed to produce the desired result. Thus, you may need to solve it in two steps. The first step is using the add-in to generate the results table and the second step is using a formula to let you know if the two items in the results table are identical. For example, assuming that the apple/apple: results were in cells C10 and D10, you could write a formula such as:
      =C10=D10
      You could fill this formula down and it would provide TRUE for rows that were an exact match and FALSE otherwise.

      Thanks,
      Jeff

      • Dennis on August 28, 2014 at 9:26 pm

        Would it be possible to have the fuzzy lookup match up using two columns wherein one column is fuzzy matching but the other one is a strict match? I’m doing it for name and birthdate where name doesn’t have to completely match but birthdate must be a perfect match. I’ve played around with the column configuration settings but the lookup is still returning records where the birthdate isn’t matching. Thanks.

        • jefflenning on September 16, 2014 at 9:17 am

          Dennis,

          I believe the add-in allows you to provide a “Transformation Threshold” for individual columns via the Column Configuration options. The default Column Configuration options are Default, PhoneNumber, SSN, and ZipCode. You can actually add a new custom configuration. On the bottom of the Add-In panel, you’ll see a Configure… button which opens the Configure dialog. In the Column Configurations tab, you can click the Add button to create a new configuration, for example, BDate. Set the EditTransformationThreshold value to 1 which means exact match. Then, close the Configuration dialog. In the Fuzzy Lookup panel, you want to select the two Name columns and then click the match icon to push the selection down into the Match Columns list box. Set the configuration for that one to say Default, which is a fuzzy match. Then, select the BDate columns from both tables and click the match icon to push the selection down into the Match Columns list box. Set the configuration to say BDate, the custom configuration you just made that uses 1 for exact match. Then, delete any other Match Column entries that may have been added automatically, so that you only have the two we added manually.

          I believe this will provide the result you are seeking…thanks!

          Thanks
          Jeff

        • LJ on October 20, 2016 at 3:26 pm

          why don’t you concatenate the date & name???
          Then use something like left({Cell},10)=left({cell],10) to filter exact date matches.

  5. Moishz on August 13, 2014 at 8:46 pm

    Thank you very much
    Great post, informative, to the point, well written, and all around very helpful.
    Thanks.

  6. Patrice Liautaud on October 5, 2014 at 1:06 pm

    Hi Jeff
    Your post is very informative, thank you for that, I am trying to match a string of 9 numbers from a list to a list that contains text and the numbers. What function would you suggest to match items in the cell and ignore the rest? Not sure if this exists but considering my two tables have over 120k lines each, this could save me enormous amount of time.

    Thank you

    Pat

    • jefflenning on October 9, 2014 at 12:05 pm

      Patrice,

      One approach to tell if the 9-number string appears in the list that contains text and numbers is with the COUNTIFS function. Since the COUNTIFS function supports the use of wildcards, it is a fairly easy task to determine if there is a matching value or not.

      For example, if your list of 9-number values are in column A, and the text/number list is in column G, you could write the following formula in Cell B1 and fill it down:

      =COUNTIFS(G:G,”*”&A1&”*”)

      This tells Excel to count the number of cells in G that contain the 9-number string in A1 even if it includes preceding or trailing text values. For rows where the function returns 0, it means the value can’t be found; any rows with 1 or more means it can be found in that number of cells. This approach assumes that the 9-numbers are sequential in the text/number string, for example 123456789 and AA123456789BB. It won’t work if the letters are in between the numbers, such as 1234A5678B9C.

      If you need to return a related value, rather than just determine if the value exists on the other list, then you could use VLOOKUP along with wildcards instead. The following blog post explains more about using VLOOKUP with wildcards:

      https://www.excel-university.com/vlookup-and-wildcards/

      Hope it helps!

      Thanks
      Jeff

  7. Betsy on November 19, 2014 at 4:47 pm

    I have a long column of names in my spreadsheet in which some names repeat multiple times. These names are not necessarily identical (e.g. John A. Smith and John Arnold Smith) and so I used Fuzzy Lookup to identify the near matches. I am now wondering if there is a quick and easy way to change the names that were entered incorrectly so that they are consistent throughout.

    • jefflenning on December 22, 2014 at 4:12 pm

      Betsy,

      A few different ideas come to mind. You could write a VLOOKUP function that retrieves the “correct” value from the related list in a column. Then, you could do a copy/paste values. This would be a great approach when the list is really long and there are many changes and you aren’t up for writing a macro. Another approach if the number of changes is small, like one or two, is to use the Find/Replace command. If there are lots of changes and the formula based approach with VLOOKUP won’t work, you could try a macro that performs the Find/Replace for you.

      Hope these ideas help!

      Thanks
      Jeff

    • Sara on July 22, 2019 at 1:39 am

      Betsy, I have data in a single column too. Please tell me how you used the Fuzzy function for this.

  8. Leslie McMahan on December 5, 2014 at 10:41 am

    Jeff,
    Great information and wonderfully presented! Thank you for your insight! I do have a tough question which I have not been able to find answered on the site. I have two sheets where I am comparing four columns for a match. If the columns match, I need to return a numeric value from one spreadsheet to the other. The data I am comparing includes , a date (mm/dd/ccyy), two number fields and a 12 byte mixed alpha/numeric field. The total size is 19,000 plus rows. Both sheets are sorted numerically by the first and second numerical columns. I have been researching the VLOOKUP, but have not been able to determine how multiple column criteria can be used. Do you have any suggestions of what function(s) can be used to complete this task?
    Thank you!

  9. Stephen on January 8, 2015 at 7:28 pm

    Jeff,

    I used to use Fuzzy Lookup on my old PC. Great tool! I recently received a new PC and I’ve had a lot of trouble getting Fuzzy Lookup installed. Is there a special trick to getting it incorporated into excel? I’ve downloaded the file and it doesn’t appear on the quick acccess toolbar within excel. Here are things I’ve done so far:

    – Gone into EXCEL OPTIONS under ADD-INS
    – The FUZZY LOOKUP appears under Inactive Applications Add-Ins.
    – I’ve gone to MANAGE ADD-INS and it does not appear
    – I’ve gone to COM ADD-INS and I see Fuzzy Lookup unchecked; I then check the box, however it doesn’t do anything; I then re-enter into COM ADD-INS and the box is unchecked again (it doesn’t seem to take)
    – I’ve uninstalled and reinstalled the program
    – I’ve rebooted my machine
    – I’ve tried on two computers (one with Excel 2007 and another with Excel 2010); same issue on both

    Any thoughts?

    • Kurt LeBlanc on July 1, 2016 at 4:15 am

      Try restarting Excel after checking the box.

    • Greg on November 17, 2016 at 4:08 pm

      I was having the same issue, running MS Excel 2013 (32 bit). I found another computer running 64 bit, and the add-in installed seamlessly. Could be that the Fuzzy Lookup Add-in is not compatible with 32 bit operating systems.

  10. abhinav on January 19, 2015 at 10:52 pm

    awesome post!! clear, easy to read and to the point!

  11. Ifeanyi on February 19, 2015 at 4:52 am

    Thank you so much for this post. Very helpful indeed.

  12. VJ on February 19, 2015 at 6:12 pm

    Great Job!!! Very Helpful.
    Thank You!!

  13. Abdullah Javed on February 25, 2015 at 9:36 pm

    Great great great great great post…. thanks so much for this info!!!!

  14. Tim on March 24, 2015 at 12:42 pm

    How good is this !
    Well done on this article mate – very well executed.

    • jefflenning on March 24, 2015 at 12:44 pm

      Thanks!

  15. Gemma on March 31, 2015 at 5:11 am

    I’m trying to match some specified account names to transaction references given containing all sorts of random punctuation, spaces and typos. I’ve tried vlookup, index/match in combination with iferror and with variations on whether the match should be less than/ greater than, true or false… chucked in some wildcard references also to see whether the results were any better. Some of the formula combinations came close to giving me the correct returned value most of the time but, with so much data, I just refuse to go through each one to see if I can spot the errors.
    I then tried the fuzzy match add-in as my head was hurting. Still returned incorrect values.
    Also tried a vba fuzzymatch function I plagiarized from the internet but it broke my excel (albeit temporarily).
    Any ideas on how I can solve this problem please?

    • jefflenning on April 13, 2015 at 2:07 pm

      Gemma,

      It sounds like you are asking Excel to match an account name to transaction references which contain all sorts of random punctuation, spaces, and typos. The built-in capabilities of Excel are great for doing exact matches, but, as you know, not good at this type of match. I believe the tool that will provide the most accuracy most often is the fuzzy lookup wizard.

      However, an approach which is more manual the first period but quick in subsequent periods is to store the values in a lookup table. Unfortunately, this means building the lookup table manually the first month, and then, updating it when new transaction references appear. For example, this is the approach often used when reconciling credit card transactions. Since the credit card statement description is often uniform, this can speed things along in subsequent periods. The lookup table would have two fields, the transaction description and then the account. For example, the description of AVIS-DFW-10024Z is the credit card activity description and then you map it to the Automobile Expense account. Then you map Chipotle-2499399900001 to Meals and Entertainment. And so on. Depending on the number of transactions and whether or not there is consistency each period, this may be helpful.

      Best of luck!

      Thanks
      Jeff

      • Gemma Erasmus on April 13, 2015 at 8:42 pm

        Thank you Jeff
        Lookup table is a great idea though discovered that the data is being entered manually (!!!) so tackling the issue at source as well – exporting data and lookup table should do the trick I reckon.
        Thanks again!

        • jefflenning on April 14, 2015 at 12:49 pm

          Ah…yes…tackling the issue at the source is the best plan!

          Best of luck!
          -Jeff

          • Carol-Anne on April 29, 2015 at 11:40 am

            Is there a way to make INDEX MATCH MATCH return the result from the row or column that is greater than the lookup value if it is not an exact match?



          • jefflenning on May 14, 2015 at 12:29 pm

            Carol-Anne,
            Yep! Just use 1 (or -1 depending on the sort order) as the MATCH function’s third argument (instead of 0).
            Thanks,
            Jeff



          • gemma on May 14, 2015 at 8:48 pm

            Carole-Ann,
            If you want the value that is greater to be returned only if there is no exact match you could use IFERROR(index(array, match(lookup, array, 0), (index(array, match(lookup, array, 1). So the formula performs the second INDEX MATCH lookup if the first one returns an error. HTH.

            Gemma



  16. Fuzzy on April 21, 2015 at 2:34 pm

    “and demonstrates how to perform a fuzzy lookup in Excel 2010 and later.”

    I think it should be noted, though, that the Microsoft Fuzzy Lookup Add-In for Excel also works for Excel 2007!

    • jefflenning on April 22, 2015 at 2:22 pm

      Thanks for the heads up!
      Thanks
      Jeff

  17. Fuzzy on April 21, 2015 at 2:44 pm

    “The add-in comes with instructions, a sample Excel file, and a pdf file with background and the logic it uses to do its magic.”

    As of 4/21/2015, it’s not a PDF, but rather a docx file.

    • jefflenning on April 22, 2015 at 2:23 pm

      Thanks for the heads-up!
      Thanks
      Jeff

  18. mastertuma on May 6, 2015 at 1:48 am

    I got trouble in lookup between two table

    Name&Date EffectiveDate Name Rate
    ADAM&42064 1-03-2015 ADAM 5%
    MARY&42064 1-03-2015 MARY 6%

    I want to get rate depend on Name&Date as my lookupvalue
    therefore for ADAM&42064 lookup value from effectivedate 1-03-2015 onwards the rate will be 5% otherwise it will be 0% if no data. The problem is when lets say I clear the data for the last row, the result will catch the upper row data instead become 0 rate. FYI the effectivedate, name and rate column key in by user. I need to catch the rate between two range of dates for example 1st march 2015- 31st march 2015.

    please help me. n tq in advance.

    • Kurt LeBlanc on July 1, 2016 at 5:38 am

      Mastertuma,

      I suggest looking into SUMIFS() with concatenation for your issue. It can be found in Chapter 20 of Excel University’s Volume 2.

      Thanks,
      Kurt

  19. Hoang on July 9, 2015 at 8:06 am

    Hi Jefflenning,

    Thank you for sharing this!

    I am currently working on a spreadsheet with 10k rows where there are names in column A that are almost similar. I would like to map them in column B.

    e.g. I have in cell A1 Jeff1, A2 Jeff_c, A3 Jeff, A4 Jeffs, I would like cell B1-B4 show me JEFF.

    Is that possible using the fuzzy look up?

  20. Pawan on July 22, 2015 at 11:05 am

    Hey, Thank you so much for sharing this wonderful feature. With Excel, there will be a constraint that we cannot perform the fuzzy lookup if the number of records are more. Since its gives ‘Out of Memory’ exception. In order to solve this, can we run this in Microsoft Access ?

    Please Respond. Thank you again.

    • jefflenning on August 11, 2015 at 11:00 am

      Hi Pawan,

      To my knowledge, Excel’s Fuzzy Lookup Wizard Add-in isn’t designed to work with an Access database. I wish it was though, that would be pretty cool!

      Thanks,
      Jeff

  21. Piyush Goel on September 9, 2015 at 2:36 am

    Is there a similar solution for Excel on Mac OS? The Fuzzy plugin is a .exe file that is windows only.

    Piyush

    • jefflenning on September 9, 2015 at 12:34 pm

      Piyush – hmmm…not to my knowledge…but if you find something please post back to share with other Mac users…thanks!
      Thanks
      Jeff

  22. James on September 14, 2015 at 11:25 pm

    Hi,

    Im currently validating my database, i want to match my two database which are in different files. so what i am doing is i will match my data from file 1 to file 2. example i want to check if column tradename in file 1 is in column tradename file 2. i believe this is the right syntax i should used but i dont know how to used it.

    • jefflenning on September 17, 2015 at 10:22 am

      James,
      If the values between the 2 databases you want to compare have exact matching values, then, I would probably use COUNTIFS or VLOOKUP instead of the Fuzzy Lookup Addin. The Fuzzy Lookup Addin is great when the values between the two lists may be different, for example ABC Co and ABC Company. But, when the values are exactly the same, such as ABC Co and ABC Co, it will probably be easier to compare with a built-in function. I have an article which discusses how to perform such list comparisons here:
      https://www.excel-university.com/articles/cal-cpa/comparing-spreadsheet-lists-with-ease/
      Hope it helps!
      Thanks
      Jeff

  23. mariana on September 27, 2015 at 5:42 pm

    very good you just save me

  24. Greg Jenkins on February 20, 2016 at 2:16 pm

    I tried using Excel fuzzy lookup on a few jobs after reading this article and was only able to do “stemming” matches, not real fuzzy logic. For example Excel fuzzy lookup can match “Company ABC” with “Company ABC Inc.” But even notepad can do that with “Find/Replace”, (e.g., Find “Company ABC” and Replace with “Company ABC Inc.” 🙂 My jobs which I assume are not uncommon are needing to match things like “122 Main Street #22” with “122 Mane Unit 22” along with typos/errors where a few random characters are incorrect. In any event I ended up going with a cloud based app: http://sullivansoftwaresystems.com/cgi-bin/fuzzy-lookup

    • Kurt LeBlanc on July 27, 2016 at 9:58 am

      Hey Greg,

      Thanks for your additional information!

      Kurt LeBlanc

  25. Mosab on April 11, 2016 at 12:43 pm

    This is exactly what I’ve been looking for. Thank you very much!

  26. Yangyang on April 12, 2016 at 12:49 pm

    You are my hero!! Your article is so clean, organized, and to the point. It saved me HOURS. Thank you!

  27. Lee on May 11, 2016 at 7:26 am

    Great article. I just ran it on tables of 5K and 3K records, so the add-in is scaleable!

    • jefflenning on May 11, 2016 at 10:52 am

      Awesome 🙂

  28. Mike Mason on June 8, 2016 at 2:00 am

    I have two tables with over 36000 rows each and tried fuzzy lookup on one column match in each.
    I continually get get out of memory despite shutting everything down, have 8GB RAM and still I don’t get a finished result?
    I am matching the description of assets, so less than 100 characters in a cell and trying to see which are thus missing from a given location from one group versus which ones they have on the same location and therefore any the other will be missing.
    Is there anything in set up I should be doing? I’ve adopted purely the default approach.

    • Kurt LeBlanc on July 26, 2016 at 1:00 pm

      Hey Mike

      The best way I know how is to keep the tables in a database and then connect them to Excel. That way the database stores most of the memory and Excel only does some calculations:)

      Let me know if that helps you!
      Kurt LeBlanc

  29. Kelvin Banbury on June 13, 2016 at 5:14 am

    Are you able to use fuzzy matching to ignore the exact match?
    e.g. In table 1 I have AB Company in table 2 I have AB Company and AB Co. I want to ignore the exact match and just return the fuzzy match to AB Co.
    My analysis doesn’t allow me to remove the exact match from table 2.

    • Kurt LeBlanc on July 26, 2016 at 8:22 am

      Hey Kelvin

      I have been playing around with the settings for a while and don’t see a way to find what you need…I hope someone else sees this and can help you:)

      Kurt LeBlanc

  30. Exis on August 2, 2016 at 4:14 pm

    You can also try a software with many more options such as http://www.fuzzy-logic.com to dedupe data with fuzzy logic techniques.

    • Kurt LeBlanc on August 3, 2016 at 6:49 am

      Thank you for the information Exis:)

      Kurt LeBlanc

  31. Ali on September 9, 2016 at 7:22 am

    Great post – thank you. May I just pose a suggestion, with or without which your article is still great!

    Could you please provide the link/answer to the topic of the article upfront (rather than in the middle) so that more advanced excel users such as yourself could quickly find the answer? I almost lost the thread when you went into the (very well) detailed and displayed explanation of Vlookup False/True but kept on reading!

    Thank you and please do not take my suggestion as criticism as it is not intended to be in anyway.

    • jefflenning on September 9, 2016 at 7:36 am

      Ali…thanks for your feedback!

  32. Brian on September 23, 2016 at 5:31 am

    When I install the add-in, the terms of use tell me that it is licensed as part of the MSDN licensing program and not for production use. This would mean that it cannot be used for work applications. Are you aware of that restriction? It seems that other commenters are using it for work use, so maybe they have not read the license and terms of use? It might be worth noting the current terms in your article, or at least directing users to read the terms during the install. The tool is free, but not licensed for all use yet so still has limited value for on the job application.

    • jefflenning on September 23, 2016 at 6:28 am

      Thanks Brian for your suggestion…I’ve gone ahead and updated the post to draw attention to the license and to the LicenseTerms.rtf file included.
      Thanks,
      Jeff

  33. AD on October 12, 2016 at 12:01 pm

    THIS IS GOLD. I’ve to estandarize a lot of “fuzzy data” and have been dreaming last days with something like this. This fuzzy lookup addin is EXACTLY what i was looking for. A % estimate of how similar the datas are.

    Thank you for also such a comprensive post, I almost never leave messages, but this time is worth it

    • jefflenning on October 12, 2016 at 12:02 pm

      Thanks, glad it helped 🙂

  34. JackDiggensIII on November 10, 2016 at 10:08 pm

    What’s interesting is it’s not in the standard load for Excel. No reason for that. Just like editable maps are not in the standard load for PowerPoint. No reason for that, either.

  35. Erik Curns on March 16, 2017 at 12:28 pm

    Jeff, what a great add-on! We have a unique situation where our suppliers are also setup as customers in our system. Of course, if we had the forethought we would have insured that the customer name mirrors the supplier name. Unfortunately, we did not do this and I have been manually matching customers to suppliers. With that tool, the manually exercise has been automated. Thank you again Jeff!

    • Jeff Lenning on March 16, 2017 at 6:31 pm

      Welcome 🙂

  36. ragasuthan pathmanathan on May 8, 2017 at 4:24 am

    Excellent tool. Thanks for introdcing

  37. vinco on July 7, 2017 at 11:09 pm

    Hi All,
    can someone help me? I’m just working on a long list of vehicle spares and want to detect all the descriptions that can have 4 similar letters or numeral at least. The test must be applied to whole the column containing the parts descriptions.
    Thanks in advance
    regards,

  38. Sage on July 18, 2017 at 10:47 am

    Hello Jeff,

    I love the fuzzy lookup add-in, but as our company has moved to 64-bit we are having compatibility issues. I have googled this issue and see that other users are having similar problems: the add-in installs, you set everything up, press Go, but then nothing happens. I reached out to Microsoft, but they were unable to help. Are you aware of an alternative or way to fix this issue with a seemingly incompatibility with 64-bit machines?

    Thanks!

    • Jeff Lenning on July 19, 2017 at 12:56 pm

      I’m not aware of a solution…sorry about that!

  39. John on January 9, 2018 at 10:33 am

    I know it has already been said here, but this is one, if not the best excel tutorial, I’ve ever seen on the web: simple, clean and user-fridenly!

    Thank you very much =D

  40. Marcelo Augusto on April 4, 2018 at 2:38 pm

    I did it with 3k rows, while a standard PROCV returned around 50% of results i got 90% with this extension at 85% treshold, loved it

    Awesome article also, thank you so much for the information

  41. Raymund Gray on February 26, 2019 at 8:40 pm

    HI, hope you can help me with this:

    You already give a configuration for using a DEFAULT match for name and EXACT match for born date, I’m trying something similar and apparently this is an THISMATCH OR THISMATCH results. Is there a way to force a 0 Similarity if the EXACT match doesn’t match?

    Hope you are still answering

    Thanks

  42. Bryan on April 26, 2019 at 9:40 pm

    Hey everyone, I have a question about this. I have a large exported dataset (about 500,000 rows.) One of the rows is more or less a comment section where the user wrote comments about the interaction with the customer. We are looking for a list of records where any of the 8 keywords appear within those cells.

    I’ve created a wordbank containing about 30 spelling, case and abbreviation variations each. Will it be possible to use the Fuzzy Lookup tool to look in that column for any of the words in the bank; and then return a list of say, the User ID’s located within the rows where it finds one of the bank words? I hope I’ve explained that well enough.

  43. Sara Saleem on July 22, 2019 at 1:37 am

    Bryan, I have data in a single column with similar names repeating multiple times with a variation of spellings, how can I use this Fuzzy add-in to work on a SINGLE COLUMN?
    Please help. Thanks!

  44. Jasmin on August 21, 2019 at 4:41 pm

    While I’m not interested in downloading an add-in, I found this post incredibly helpful to understand more about VLOOKUP. Thank you! What a useful page to deep-dive what’s going on under the hood. I, like so many others, it seems, have also been foiled by the “Approximate match” promise, which was confusing and frustrating when it didn’t function the way I expected. Now I understand. Thank you!

  45. Sheila on January 12, 2020 at 6:28 pm

    Hi,
    I just downloaded the application and am trying to apply to one column only. There is nothing in the instructions on this, and it looks like the logic is matching the same column against itself, resulting in a high degree of matching! What I’m trying to do is find instances of spelling differences in the same column. Any insight you could provide would be greatly appreciated, thanks. I see that some other posters are having this difficulty as well.

  46. Chase on June 4, 2023 at 8:15 pm

    Im trying to do a lookup for addresses. As you all know, address can fluctuate from 650 e abrahams st to 650 east abrahams street. Could you offer assistance as to look something like this up and be 99 percent accurate. I spoke to a collegue and he stated there was a dual look up as you can look up the number in the address then cross reference it with the zip. can this be down and if so how.

  47. Deane B. on October 2, 2023 at 1:36 pm

    I’ve been trying to find a way to match strings and most of what I found would only work with exact matches or approximate matches over whole documents. This is working very well for fuzzy-matching short text columns!

    Thank you!

Leave a Comment