How to Remove Duplicates in Google Sheets

In this post, you’ll see how to remove duplicates in Google Sheets using the Remove Duplicates tool, the UNIQUE function, Pivot Tables, Conditional Formatting, Apps Script or an Add-On.

Firstly though, what are duplicates? And why are they a problem?

Duplicates (or duplicate entries, or duplicate rows) are instances of the same record appearing in your data more than once.

They’re a huge problem and the bane of any data analyst’s life. It’s critical that they are found and removed before any data analysis is performed.

Imagine you had two instances of the same client transaction for $5,000 in your database. When you summarize your data, you might think you have $10,000 in revenue from that client when in fact you only have $5,000. You’ll make decisions based on the wrong data.

And we all know that’s bad for business.

So it’s best that you deal with them head on.

Thankfully, it’s not too painful or time consuming to fix them. In this guide I’ll show you six different techniques on how to search for duplicates in Google Sheets and how to remove duplicates in Google Sheets.

Contents

  1. Method 1: How to remove duplicates with the Remove Duplicates tool
  2. Method 2: How to remove duplicates with formulas
  3. Method 3: How to remove duplicates with Pivot Tables
  4. Method 4: How to remove duplicates with Conditional Formatting
  5. Method 5: How to remove duplicates with Apps Script
  6. Method 6: How to remove duplicates with an Add-On

How to remove duplicates in Google Sheets? Recommendations

In most cases, I would advocate using the Remove Duplicates tool, because it’s quicker and easier than the other methods. However, here’s a rundown of when it might make sense to use the different methods:

Method 1: Remove Duplicates tool is the most robust, practical method of removing duplicates.

Method 2: Formulas The UNIQUE formula is great for small, simple datasets or when you need to remove duplicates inside a nested formula.

The IF formulas are a robust but rather old-school approach. They have the advantage of getting your hands dirty with the data so you really see the duplicate entries up close.

Method 3: Pivot Tables are a great way to search for duplicates in Google Sheets. Pivot Tables are extremely flexible and fast to use, so it’s great tool to use when you’re unsure if you have duplicates and you’re checking your data.

Method 4: Conditional Formatting is a great way to highlight duplicates in Google Sheets.

Method 5: Apps Script is useful for developers who want to remove duplicates from Sheets as part of their apps, or someone who needs to repeatedly and automatically de-duplicate their data.

Method 6: Add-on gives you more options for how you remove the duplicates (e.g. making fresh copy of the data, highlighting the duplicate entries first for review etc.) so is a good option if you need to understand the duplicates. However, it’s a paid add-on which seems unnecessary given the other free tools at your disposal.

I do use all of these different methods myself, for the different scenarios described above.

Video: how to remove duplicates in Google Sheets using the UNIQUE formula and Add-on methods

Method 1: How to remove duplicates in Google Sheets with the Remove Duplicates tool

The new feature is super easy to use. You find this feature under the menu: Data > Remove Duplicates

Remove Duplicates in Google Sheets

When you click Remove Duplicates, you’ll be prompted to choose which columns you want to check for duplicates.

You may want to remove duplicates where the rows entirely match, or you may wish to choose a specific column, such as an invoice number, regardless of what data is in the other columns.

Choose duplicate columns

The duplicates will then be removed and you’ll be presented with a summary report, telling you how many duplicates were removed:

Remove duplicates summary

For more information and more detail about the tool, you can check out Google’s G Suite announcement (includes details of other new features too!).

Method 2: How to remove duplicates in Google Sheets using formulas

2.1 Use the UNIQUE formula

This method deletes duplicates in the range of data you select.

It considers all the columns of your data range when determining the duplicates. In other words, it compares each row of data and removes any rows that are duplicates (identical to any others across the whole row).

It’s very easy to implement as it involves a single formula with a single argument — the range you want to de-duplicate (remove duplicates from).

=UNIQUE(A1:D11)

Here’s an example of the UNIQUE function in action. The function is in cell F1 and looks for duplicates in the data range in A1:D11:

Unique formula to remove duplicates in Google Sheets

And this is the result:

Unique formula to remove duplicates in Google Sheets

You can see the table on the right has fewer rows, because the duplicate rows have been removed.

2.2 Highlight duplicate values with COUNTIF

This method first highlights duplicates in your dataset.

First, create a new column next to the data column you want to check for duplicates (e.g. invoice number).

Then use this formula in cell B2 to highlight the duplicates in column A:

=COUNTIF(A$2:A2,A2)>1

You’ll notice the range is A$2:A2

The $ sign is key here because it locks the range to the top of the column, even as you copy the formula down column B. So this formula checks for duplicates in the current row back up to the top.

When a value shows up for the first time, the count will be 1, so the formula result will be false. But when the value shows up a second time, the count will be 2, so the formula result will be TRUE.

Highlight duplicates in Google Sheets

The final step is to highlight the TRUE values (the duplicates) and delete them.

Note: If you have a large dataset, with a lot of duplicates, then it’s best to turn the Duplicate column into values (Copy > Paste Special), sort by this column so all the duplicates (TRUEs) are in a block at the bottom of your dataset, and then delete them in one big group. It’s much quicker.

2.2 How to remove duplicates in Google Sheets using IF formulas

This is how I first learnt to find duplicates and how to remove duplicates.

It works by sorting your data and comparing adjacent rows. It’s rather cumbersome at first, because the formulas are a little tricky. However, it’s easy once you get used to it and has the benefit of getting you into the weeds of your data. You will see the duplicates up close and can investigate why they’re happening.

It involves using an IF formula to check whether a value on your current row is equal to the value on the previous row. If they’re equal, then it’s a duplicate entry.

Duplicate entries are assigned a value of 1, and unique (non-duplicate) entries are assigned a value of 0. That way you can easily count how many duplicates you have (by summing the column).

Here’s an image showing this method in action:

How to remove duplicates in Google Sheets using an IF formula

Only the first column, the invoice number, is considered for duplication purposes in this example.

That means if two rows have the same invoice number then they’ll be tagged as duplicates, regardless of whether any other datapoints, like date, are different.

You can see that invoice number “196-X” on row 4 has been marked with a 1 to indicate it’s a duplicate row.

If you want to consider more columns or the whole row use one of other methods described in this article.

The next step is to highlight the duplicate pairs:

Highlight duplicate groups in Google Sheets using an IF formula

This is done so that you can compare the unique entry and the duplicate entries side-by-side and investigate what’s happening before you blindly delete them.

The final step is to then delete the duplicate rows.

Here’s the whole process explained in more detail:

Step 1: Add a unique order column

This step is important. It lets you sort the data by whichever column you want and then easily return to the original order. (Note: This was not shown in the above screenshot.)

Step 2: Sort A-Z on whichever column is your “unique” column

Do you have an invoice number you can check for duplicates? Or a transaction reference number? Or a student ID?

Identify the column that should be unique. Then sort it from A-Z.

This sorts all matching entries next to each other (so duplicate entries will be next to their unique counterparts).

Step 3: Insert two new columns to the right of this “unique” column

Apply a color coding to inform users that these columns have been added and were not part of the original dataset.

Step 4: Identify duplicate entries with an IF formula

The basic formula to check for duplicates, in column B, is this:

=IF(B3=B2,1,0)

Step 5: Identify duplicate groupings with an IF and OR formula

And then the formula in the adjacent column, which identifies the duplicate groups (i.e. the first entry and subsequent duplicates):

=IF(OR(D3=1,D2=1),1,0)

Step 6: Review duplicate entries

This step is always worth doing. You may identify the source of your duplicate entries. Perhaps all the duplicates are specific to a certain date or customer, which you can then resolve.

Step 7: Keep copies of your duplicate formulas for future reference (optional)

If you think you may want to use these formulas again, you can keep copies in the rows above your Sheet so you have them “live”.

Step 8: Turn duplicate columns into values

Highlight the data (consisting of 0’s and 1’s) in your two duplicate columns.

Copy > Paste special > Paste values only

Step 9: Sort by the duplicate column to move duplicates

Move all of the duplicates to the bottom of your dataset, so they can easily be removed as a single contiguous range. This is much faster than simply filtering on the duplicates and deleting them, especially if your dataset is large.

Step 10: Sort your dataset by the original order column

Sort your data back to it’s original order, but now with the duplicate rows removed.

To recap, this method:

  • Allows you to highlight duplicates in Google Sheets
  • Allows you to identify duplicate groups in Google Sheets
  • Keeps copies of all your data until you are ready to delete rows
  • Is rather labor intensive however, involving tricky formulas
  • Modifies the original dataset (unless you make a copy)

2.3 Use IF formulas on multiple columns

This is the same as method 3 above, but applied to multiple columns.

In this case, you want to search for duplicates in Google Sheets based on two (or more) columns. Maybe an invoice number and a product ID.

First thing to do is to identify the columns you want to include in the duplicate consideration.

This time, insert three new columns to the right of your final “unique” column.

Use a formula to concatenate all of the “unique” columns you’re considering to highlight duplicates:

=B2&C2

or you can also use the CONCATENATE formula (including more columns this time):

=CONCATENATE(B2,C2,D2,E2)

This new column consists of all of the “unique” columns squashed together.

Now, use Method 3 to de-duplicate on this new “unique” column of joined values. You proceed through the same 10 steps from Method 3 to identify duplicates and duplicate groups.

Here’s an example:

How to identify duplicates in Google Sheets using multiple columns

Method 3: How to remove duplicates in Google Sheets using Pivot Tables

If you’re new to Pivot Tables, check out my article Pivot Tables 101: A Beginner’s Guide.

Pivot Tables are extremely useful for exploratory data analysis.

Pivot Tables are a great tool to use to search for duplicates in Google Sheets. They’re extremely flexible and fast to use, so they’re often a great place to start if you’re unsure whether you have any duplicates in your data.

Highlight your dataset and create a Pivot Table (under the Data menu).

A new tab opens with the Pivot Table editor.

Under ROWS, choose the column you want to check for duplicates (e.g. invoice number). Then in VALUES, choose another column (I often use the same one) and make sure it’s set to summarize by COUNT or COUNTA (if your column contains text), like this:

Highlight duplicates in Google Sheets using a Pivot Table

The Pivot Table will then look like this:

Highlight duplicates in Google Sheets using a Pivot Table

You can see that duplicates values (for example 196-X) will have a count greater than 1.

From here you can lookup these duplicate values in your original dataset and decide how to proceed.

As you can see, this method is most suitable when you’re checking for duplicates and want to go and investigate them.

Method 4: How to remove duplicates in Google Sheets using Conditional Formatting

This method has two steps: 1) highlight duplicates in Google Sheets, and 2) remove those highlighted duplicates (optional).

Select your dataset and open the conditional formatting sidebar (under the Format menu).

Under the “Format cells if…” option, choose custom formula is (the last option) and enter the following formula:

=COUNTIF($A$1:$A1,A1)>1

This formula checks for duplicates in column A.

Highlight duplicates in Google Sheets with Conditional Formatting

The output is highlighting applied to the duplicate values:

Highlight duplicates in Google Sheets with Conditional Formatting

What if you want to apply the highlight to the whole row?

You need to make one small tweak to the formula (highlighted in red) by adding a $ sign in front of the final A:

=COUNTIF($A$1:$A1,$A1)>1

Now your output will look like this, with the whole row highlighted:

Highlight duplicates in Google Sheets with Conditional Formatting

Check out this article for a more detailed look at how to highlight a whole row using conditional formatting.

Method 5: How to remove duplicates in Google Sheets using Apps Script

(New to Apps Script? Read my starter guide to Apps Script for a primer.)

It’s relatively straightforward to create a small script file that can remove duplicate rows from your datasets.

The advantage of writing an Apps Script program is that you can run it over and over, for example each time you add new data.

Sample Apps script program: How to remove duplicates in Google Sheets

This program removes duplicates from a dataset in Sheet 1. It’s very specific to the Sheet and data range, but it’s easy to create and modify.

It works as follows:

  • Get the values from the data range in Sheet1, using Apps Script
  • Turn the array rows into strings (blocks of text) for comparison
  • Filter out any duplicate rows
  • Check whether a de-duplicate sheet exists
  • If it does, clear out the old data and paste in the new de-duplicated data
  • If it does not exist, create a new sheet and paste in the new de-duplicated data
  • Add a custom menu to run from the Google Sheet

So it’s very specific to this use case, but it could be easily adapted if necessary for different datasets. Here it is in action:

Remove duplicates in google sheets with apps script

And here’s the Apps Script code for this program:

And you can also add a custom menu to run it from your Google Sheet rather than the script editor window:

The code for this simple duplicate program is also here on GitHub.

Suggestions for improvement

  • Set triggers to run the duplicate remover on certain conditions (e.g. once a day, when new data is added)
  • Better control over selecting the data (i.e. which Sheet, what range etc.)
  • Whether to consider all of the columns or not for duplicates
  • Better control over the output

I started coding something along these lines, but it gets complicated as you start to pile on more edge-cases and user options. I realized pretty quickly that all I was doing was reinventing the wheel, since a perfectly fine add-on exists (see method 2).🤔

The best thing about Apps Script is that it lets you build minimal viable products specific to your situation very quickly.

Once you’re familiar with Apps Script, it only takes 15 – 30 minutes to build custom scripts, like this one to remove duplicates in Google Sheets.

Method 6: How to remove duplicates in Google Sheets using an Add-on

Add-ons are specialized programs that you add to your Google Sheets, to give you extra functionality. They’re developed by third-party developers but are vetted by Google before they end up in the Add-On directory.

You add Add-ons to your Sheet through the Add-ons menu:

Add-On Menu in Google Sheets

As you’ll notice there a lot of options. The best one I’ve found is Remove Duplicates Add-On from Ablebits, which is the first one in this image:

remove Duplicates In Google Sheets With Add-On

This is a paid Add-On, although you can use the full functionality for free during a 30-day trial period. (Full disclosure: this is an affiliate link, meaning I’ll earn a small commission at no extra cost to you.)

Click here to test out Remove Duplicates Add-On from the Add-On store.

This add-on highlights duplicates in Google Sheets, deletes duplicates in Google Sheets and can even compare two columns for the same value.

It’s very easy to use and super handy if you deal with duplicates a lot, or if you’re not sure where or why duplicates are in your data. Before you delete them, you can investigate which rows of data are duplicates and see if you can identify anything that’s a problem.

It’s very easy to use.

It has a lot of options, such as whether you want to highlight duplicates, delete them, create a new copy of your data etc., so I’ll leave it to you to explore.

Here’s a GIF image showing the steps to highlight duplicates in your dataset:

How to highlight duplicates in Google Sheets with an Add-On

Now you know how to remove duplicates in Google Sheets with six different techniques, go forth and banish those duplicates from your datasets!

7 thoughts on “How to Remove Duplicates in Google Sheets”

  1. This is great! Thanks…I was going to ask how to show duplicates in multiple columns as that is what I needed last week and found how to do that using the conditional formatting I used the formula =countif($A:$BX,A1)>1

    Thanks for your info, super helpful!

Leave a Reply

Your email address will not be published. Required fields are marked *