Wednesday, 12 September 2012

Replacing Unwanted Characters In Excel Data Using VBA

If you've ever been given an Excel spreadsheet file to analyze or extract information from you'll know how frustrating it can be if the data is not in a consistent format. Unfortunately this is a common occurrence, especially if the data has been entered by different staff members over a period of time.
This article will show you how to remove unwanted characters from Excel worksheets with just a few lines of VBA code.
Examples Of Inconsistent Format
Types of unwanted characters include abbreviations, bracketing of cities and states; and also phone numbers containing non-numeric characters. The following represent examples of inconsistent formats which might create problems later on.


07-526-2025
(021) 5262065
AZ
NY
(New York)
04 526 2065
021-526-2065.
Arizona.

The problem with using different formats is that you never know what the data may be used for in the future. Perhaps a dialling function on a customer management program might be relying on phone numbers that are in a pure numerical format. Or someone might be doing a search for customers in "Arizona" but the search won't be looking for "AZ".
Without instruction, some data entry staff tend to enter information in what they see as a user-friendly format. Each staff member might have a different view on the best data format but computers think differently from human beings and need information in a predictable format.
Using VBA To Clean Up Your Data
We're going to write a short routine to clean a list of phone numbers - removing unwanted characters and making the data searchable and computer-friendly. Excel has various in-built formulas to replace or substitute characters in a string, but the problem is that an entry may contain several different non-numeric characters and we want to remove them all.
Our VBA code will combine several techniques to ensure each phone number contains numeric values only.

Select the data range


Range("a1").CurrentRegion.Columns(1).Select
Loop through the selected data and

Assign the value of each cell into a temporary variable that
we'll use to find non-numeric characters
For Each c In Selection
txt = c.Text
Check each character in the entry to make sure it's numeric and

if it isn't, remove the character from the cell value
For x = 1 To Len(txt)
chkChar = Mid(txt, x, 1)
if Not IsNumeric(chkChar) Then
c.Value = Replace(c.Value, chkChar, "",)
End If
Next
Go to the next cell to check

Next
Once the phone numbers are purely numerical, your data can be a lot more useful. For example it could be imported into a database which requires numerical entries or be used as a search "key" when looking for duplicates.
Summary
You've seen how VBA can help clean up your data - although the best idea is to have solid business rules to prevent inconsistent formatting to begin with.
A good idea is to take the time to develop good processes before even beginning data entry but it's nice to know there are simple techniques available to make the best use of your business information whatever state it might be in.



No comments:

Post a Comment