Tuesday, 23 October 2012

Finding Duplicates In An Excel Spreadsheet With VBA

If you've ever been confronted with an Excel spreadsheet that has duplicate entries, you'll know that some care is required. You don't want to delete what might be genuine unique entries, or miss out on other entries that might be duplicates but have slipped through.
Some duplicates might hold information that you need to retain or transfer to a new entryYou might need to add data to new or existing entries, for example to add a unique keySome entries may be partial duplicates requiring further investigation
Excel Tools for Finding Duplicates
Excel has some good tools for identifying and removing identical entries but these can be a little inflexible if you're wanting to do something slightly different with your data:
Conditional formatting enables highlighting of possible duplicates and is a quick way to find repeated entriesThe remove duplicates tool is a blunt instrument which deletes the offending entries and reports on the number of items removedVarious worksheet formulas can identify duplicate entries
But sometimes the standard tools won't quite do the job and it's good to know what to do if you need something just a little more complicated. With a little knowledge and planning you can write your own VBA code to find duplicate entries.
Finding Duplicates With VBA
We're going to write some VBA code that will identify duplicate entries and copy them to a new worksheet. This might be a common task in business as you might want to review the possible duplicates rather than just delete them.
A good example might be a customer list which you suspect might have repeated entries. We want our code to search the phone numbers looking for duplicate records; if a phone number is repeated there's a good chance the entries are duplicates.
Select the sheet and cells to search

Sheets("allData").Activate
dupsCol = "e1"
Range(dupsCol).Activate
lastCell = ActiveCell.End(xlDown).Address
allCells = ActiveCell.Address & ":" & lastCell
Range(allCells).Select
Now we'll loop through the data and compare each cell to the total data set. If we find a duplicate we'll copy the row to the "duplicates" worksheet.

ct = 0
For Each c In Selection
curCell = c.Value
If Application.WorksheetFunction.CountIf(Range(allCells), curCell) > 1 Then
ct = ct + 1
Sheets("alldata").Range("a" & c.Row).EntireRow.Copy Sheets("duplicates").Range("a" & ct)
End If
Next

Now we'll sort the duplicates to make them easier to work with.

Sheets("duplicates").Activate
Range("A1").CurrentRegion.Select
Selection.Sort Key1:=Range(dupsCol)
Summary
With a little planning and knowledge it can be easy to improve the standard tools that come with every version of MS Excel. Learning a little about VBA and how it can improve your spreadsheet processes will pay dividends both in saving time and improving productivity.




No comments:

Post a Comment