# Tom’s Tutorials For Excel: Deleting Duplicate Records

Tom’s Tutorials For Excel: Deleting Duplicate Records

When you have a list of data, sometimes it is not enough to simply delete rows with duplicated information based only on the items in one column. Multi-column lists can have duplicated records based on the fact that every item in every column of a row’s data matches that of another row’s entire data, cell for cell. In those cases, you need to compare a concatenated (combined) string of each record’s (row’s) data, and compare that to the concatenated strings of all the other rows.

Take a close look at this Before and After picture. In the original list, all the items in rows 5 and 7 match, as do all the items in rows 3 and 10. This is a short list for demonstration purposes. If your list were a couple hundred thousand rows deep, such as an annual list of bank transactions or department store purchases, you would need a fast and easy way to delete duplicated records. The macro that follows the picture is one way to do the job. The comments in green explain each step. `Sub DeletingDuplicateRecords()`

`'Turn off ScreenUpdating to speed up the macro.`
`Application.ScreenUpdating = False`

`'Declare a range variable for the helper column being used.`
`Dim FilterRange As Range`
`'Define the range variables dynamic range.`
`Set FilterRange = Range("E1:E" & Cells(Rows.Count, 1).End(xlUp).Row)`

`'For efficiency, open a With structure for the FilterRange variable.`
`With FilterRange`
`'Enter the formula`
`'=SUMPRODUCT((\$A\$1:\$A1=\$A1)*(\$B\$1:\$B1=\$B1)*(\$C\$1:\$C1=\$C1)*(\$D\$1:\$D1=\$D1))>1`
`'in all cells in column E (the helper column) that returns either TRUE`
`'if the record is a duplicate of a previous one, or FALSE if the record`
`'is unique among the records in all previous rows in the list.`
`.FormulaR1C1 = _`
`"=SUMPRODUCT((R1C1:RC1=RC1)*(R1C2:RC2=RC2)*(R1C3:RC3=RC3)*(R1C4:RC4=RC4))>1"`
`'Turn the formulas into static values because they will be filtered,`
`'and maybe deleted if any return TRUE.`
`.Value = .Value`
`'AutoFilter the helper column for TRUE.`
`.AutoFilter Field:=1, Criteria1:="TRUE"`
`'Error bypass in case no TRUEs exist in the helper column.`
`On Error Resume Next`
`'This next line resizes the FilterRange variable to exclude the first row.`
`'Then, it deletes all visible filtered rows, abbreviated by SpecialCells`
`'constant number 12.`
`.Offset(1).Resize(.Rows.Count - 1).SpecialCells(12).EntireRow.Delete`
`'Clear the Error object in case a run time error would have occurred,`
`'that is, if no TRUEs existed in the helper column to be deleted.`
`Err.Clear`
`'Close the With structure for the FilterRange variable object.`
`End With`

`'Exit (stop using) AutoFilter.`
`ActiveSheet.AutoFilterMode = False`

`'Clear all helper values (there would only be FALSEs at this moment).`
`'Note that Columns(5) means column E which is the fifth column from the left`
`'on a standard Excel spreadsheet.`
`Columns(5).Clear`

`'Clear the range object variable to restore system memory.`
`Set FilterRange = Nothing`

`'Turn ScreenUpdating back on.`
`Application.ScreenUpdating = True`

`End Sub`

###### One comment on “Tom’s Tutorials For Excel: Deleting Duplicate Records”
1. Doris says:

Good clean and clear cut. Nice, lots of explanation easy to follow.
Thank you for sharing.