Several different activities fall under the category of data cleaning, all of which are designed to confirm, reformat, update and enhance your information to make it more usable. Some of the most common cleaning processes are as follows:
Address cleaning: this is the process of smoothing out the variations in the way people fill out forms and enter data. By matching addresses to Postal Address Files, we can standardise the format of customer details, verify or determine gender, check that addresses exist, confirm phone numbers, flag customers that meet certain criteria and more. In the past, this service has enabled our clients to significantly reduce mailing waste and the staff time spent communicating with unengaged customers, and has allowed them to build robust datasets for analytical purposes.
De-duping: data deduplication removes duplicate copies of data points or entries. By comparing entries to one another, we can identify records that share the same name, birthday and address (for example). Such entries very likely refer to the same person, so they can be consolidated. This process, like address cleaning, is helpful for companies looking to minimise wasted resources in marketing communications, as well as improve storage utilisation.
Matching: this is a process in which we create rules to identify groups of records that ‘match’, such as people who live in the same household. Most of the matching we do is based on name and address information, but we can match on anything within a dataset (DOB, tax number, stores transacted at, phone numbers, email addresses, gender etc.). Matches are useful for cross-selling and for consolidating mail-outs (e.g. sending only one to each household).
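As a rough illustration of how de-duping and matching can work, here is a minimal Python sketch. It is not a description of any production system. It assumes records are plain dictionaries with hypothetical `name`, `dob` and `address` fields, and it treats two records as the same only when those fields agree exactly after basic normalisation (lower-casing and collapsing whitespace). Real-world data usually calls for fuzzier rules than this.

```python
from collections import defaultdict


def normalise(value):
    """Lower-case, trim and collapse whitespace so trivial variations compare equal."""
    return " ".join(str(value).lower().split())


def group_by(records, fields):
    """Group records whose normalised values agree on every field in `fields`."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalise(record[field]) for field in fields)
        groups[key].append(record)
    return list(groups.values())


def dedupe(records, fields=("name", "dob", "address")):
    """Keep the first record from each group of likely duplicates."""
    return [group[0] for group in group_by(records, fields)]


def households(records):
    """Example match rule (an assumption): same address means same household."""
    return group_by(records, ("address",))


# Hypothetical sample data, invented purely for illustration.
customers = [
    {"name": "Jane Smith", "dob": "1980-04-12", "address": "1 High Street"},
    {"name": "jane  smith ", "dob": "1980-04-12", "address": "1 high street"},
    {"name": "Tom Smith", "dob": "1978-11-02", "address": "1 High Street"},
    {"name": "Ana Lee", "dob": "1991-07-30", "address": "22 Mill Road"},
]

unique_customers = dedupe(customers)             # the two Jane Smith rows collapse into one
household_groups = households(unique_customers)  # Jane and Tom share a household
```

Using one grouping helper for both steps reflects the idea that de-duping is simply a strict match rule (all identifying fields agree), while household matching is a looser one (only the address agrees). Swapping the `fields` tuple is enough to match on other columns such as email address or phone number.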