Formula to Remove Duplicates

Removing duplicates in Excel is a common task, but what if you want more control than the Remove Duplicates command provides? Maybe you need a formula to handle duplicates dynamically. In this guide, we’ll cover how to remove duplicates from a single column, handle multiple columns, and use formulas for various scenarios like summing, finding maximums, or creating a list of related values. Let’s jump in!

Video


Step-by-step Tutorial

We’ll take this process one exercise at a time.

Exercise 1: Removing Duplicates from a Single Column

Let’s start with a sample case: one column. (We’ll get to multiple columns in the next exercise.)

We have a single column like this:

The simplest way to remove duplicates within a single column is to use the Remove Duplicates command:

Select your data range.

Go to Data > Remove Duplicates.

You’ll see the resulting Remove Dupicates dialog:

Note: if your data has headers, check the My data has headers box.

Click OK, and you’ll see a confirmation:

Excel removes duplicates, leaving the first occurrence of each value.

The result is a de-duplicated list!

But, what if our data contained multiple columns? Well, that takes us to our next exercise.

Exercise 2: Removing Duplicates from Multiple Columns

When working with multiple columns, we need to consider two things. First, we need to think about how we define a “duplicate” row. And if we say that not all column values need to match for the row to be considered a duplicate, we need to think about what we want to happen to the other columns. Let me clarify. Consider the following:

When we look at this range, we need to think about how we define a “duplicate” row. For example, should we only look at the Item column … meaning, only if a row has a duplicate Item values should it be removed? Or, should we define a duplicate row as those rows where the Invoice and Item need to match? Or, do all three column values need to match to be considered a duplicate? So, this is the first thing to clarify: which columns need to match in order for the row to be considered a duplicate. In this case, let’s say that we should only consider the Item column values when determining which rows are duplicates. And this leads us to the second thing we need to consider: what do we do with the other column values.

Since only the Item values need to match for the row to be considered a duplicate, what do we want for the other column values? For example, do we just want to keep the first row values? Do we want to sum the Amount column values? Do we want the max invoice number, or a list of all matching invoice numbers? You get the idea.

So, let’s revisit the Remove Duplicates command in this exercise, and we’ll use formulas in the next exercise.

Using the Remove Duplicates Command

Select your data range.

Go to Data > Remove Duplicates. The resulting Remove Duplicates dialog:

By default, all columns are selected. This means that as it analyzes each row, all column values must match for the row to be considered a duplicate. In our case, we only want it to consider the Item values when determining which rows are duplicates. So, we update the checkboxes accordingly:

Click OK and we get the confirmation:

Click OK and:

As you can see, Excel only considered a row as a duplicate if the Item values match. For the other columns, it just kept the first matching values. And, depending on what you are working on, that may be sufficient.

But, if you want to do something other than keeping the first values, we can use other techniques … including formulas.


Exercise 3: Advanced Scenarios with Formulas

If you need more control over how duplicates are handled (e.g., summing related values or returning the largest), formulas offer such flexibility. There are other tools too, such as PivotTables and Power Query. But for this post, we’ll focus on a few handy functions.

We’ve stored our data in a table (named Table1):

We’d like to start by writing a formula that creates a list of unique values found in the Item column.

Step 1: Creating a Unique List

We’ll use the UNIQUE function:

=UNIQUE(Table1[Item])

If we wanted these results to be sorted, we can wrap the SORT function around it:

=SORT(UNIQUE(Table1[Item]))

Now that we have our unique Item values, we can think about what we want to happen with the other columns.

Step 2: Handling Other Columns

What should happen to the other columns? Let’s explore common scenarios:

1. Summing Related Values

We can use the SUMIFS function to sum amounts related to each unique value:

=SUMIFS(Table1[Amount],Table1[Item],F9)

Fill this formula down for all unique items.

2. Returning the Maximum Value

Replace SUMIFS with MAXIFS to find the largest related value:

=MAXIFS(Table1[Amount],Table1[Item],F9)

Note: use MINIFS to find the minimum value.

3. Returning the First or Last Match

To return the first related value, use XLOOKUP:

=XLOOKUP(F9,Table1[Item],Table1[Amount])

For the last related value, use XLOOKUP with the search mode set to reverse:

=XLOOKUP(F9,Table1[Item],Table1[Amount],,,-1)

And what about the Invoice column? We could find the min, max, first, or last by using the functions presented above. But, what if instead we wanted a comma-separated list of all matching invoices? Well, let’s do that now.

4. Creating a Comma-Separated List

We use the FILTER function to return all matching invoices, and then we wrap the ARRAYTOTEXT function around it to combine the results into a comma-separated list:

=ARRAYTOTEXT(FILTER(Table1[Invoice],Table1[Item]=F9))

Wrapping Up

There are several ways to remove duplicates in Excel and the most appropriate option really depends on what you are working on. The Remove Duplicates command is great for quick tasks, but using a formula with functions like UNIQUE, SORT, SUMIFS, MAXIFS, MINIFS, XLOOKUP, FILTER, and ARRAYTOTEXT give you flexible options to control the results.

File

FAQ

FAQs for “Formula to Remove Duplicates”

Q: How can I quickly remove duplicates from a single column in Excel?
A: Use the “Remove Duplicates” command under the Data tab. Select your column, ensure the “My data has headers” box is checked if applicable, and click OK. Excel will keep the first occurrence of each value and remove duplicates.

Q: Can the “Remove Duplicates” command handle multiple columns?
A: Yes, when working with multiple columns, you can select which columns Excel should analyze for duplicates. Check or uncheck the relevant columns in the “Remove Duplicates” dialog to define how duplicates are identified.

Q: What is the UNIQUE function in Excel, and how is it used to remove duplicates?
A: The UNIQUE function dynamically generates a list of unique values from a range. For example, =UNIQUE(Table1[Item]) will return a list of unique values in the “Item” column.

Q: How can I sort a list of unique values in Excel?
A: Wrap the UNIQUE function with the SORT function. For instance, =SORT(UNIQUE(Table1[Item])) creates a sorted list of unique values.

Q: How can I sum related values for unique items in Excel?
A: Use the SUMIFS function. For example, =SUMIFS(Table1[Amount], Table1[Item], F9) sums the “Amount” column values corresponding to the unique item in cell F9.

Q: How can I find the maximum or minimum related value for unique items?
A: Use MAXIFS or MINIFS. For example, =MAXIFS(Table1[Amount], Table1[Item], F9) finds the maximum value related to the unique item in F9.

Q: How do I retrieve the first or last matching value for a unique item?
A: Use XLOOKUP. For the first value: =XLOOKUP(F9, Table1[Item], Table1[Amount]). For the last value: =XLOOKUP(F9, Table1[Item], Table1[Amount],,,-1).

Q: How can I create a comma-separated list of related values for a unique item?
A: Use the FILTER function wrapped with ARRAYTOTEXT. For example: =ARRAYTOTEXT(FILTER(Table1[Invoice], Table1[Item]=F9)) generates a list of all matching invoices for the unique item in F9.

Q: When should I use formulas instead of the “Remove Duplicates” command?
A: Use formulas like UNIQUE, SORT, SUMIFS, and FILTER when you need dynamic results or want more control over how duplicates are handled, such as summing related values, finding maximums, or creating custom lists.

Q: Are there other tools in Excel for handling duplicates apart from formulas?
A: Yes, PivotTables and Power Query are excellent alternatives for managing duplicates, especially for more complex data analysis tasks.

Posted in , ,

Jeff Lenning

I love sharing the things I've learned about Excel, and I built Excel University to help me do that. My motto is: Learn Excel. Work Faster.

Excel is not what it used to be.

You need the Excel Proficiency Roadmap now. Includes 6 steps for a successful journey, 3 things to avoid, and weekly Excel tips.

Want to learn Excel?

Our training programs start at $29 and will help you learn Excel quickly.

Leave a Comment