Data Cleansing


Data cleaning varies depending on what your organisation is trying to achieve, but the overall goal is to get sections or components of your data into better shape so that analytics can be accurate and streamlined. It is an important step to complete before you start any significant analytical project.

WHAT CAN DATA CLEANING HELP WITH?

A number of different activities fall under the category of data cleaning, all of which are designed to confirm, reformat, update and enhance your information in order to make it more usable. Some of the most common cleaning processes are as follows:

Address cleaning: this is the process of smoothing out the variations in the way people fill out forms and enter data. By matching addresses to Postal Address Files, we can standardise the format of customer details, verify or determine gender, check that addresses exist, confirm phone numbers, flag customers that meet certain criteria and more. In the past, this service has enabled our clients to significantly reduce mailing waste and the staff time spent communicating with unengaged customers, and has allowed them to build robust datasets for analytical purposes.
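As a rough illustration of the address cleaning step, the sketch below normalises free-text addresses and checks them against a reference list. It is not Datamine's method: the abbreviation map, the tiny `REFERENCE_ADDRESSES` set (a stand-in for a Postal Address File) and the function names are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation map; a real project would use a much fuller list.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue", "dr": "drive"}

# Tiny stand-in for a Postal Address File, stored in normalised form (assumption).
REFERENCE_ADDRESSES = {
    "12 queen street auckland",
    "5 lambton quay wellington",
}


def normalise_address(raw):
    """Lower-case, replace punctuation with spaces, collapse whitespace
    and expand common abbreviations."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    words = [ABBREVIATIONS.get(word, word) for word in cleaned.split()]
    return " ".join(words)


def verify_address(raw):
    """Return the normalised address and whether it appears in the reference set."""
    normalised = normalise_address(raw)
    return normalised, normalised in REFERENCE_ADDRESSES
```

Calling `verify_address("12 Queen St., Auckland")` returns the standardised form together with a flag saying whether the address was found, so unverified addresses can be held back from a mail-out.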

De-duping: data deduplication removes duplicate copies of repeated data points or entries. By comparing entries to one another, we can identify customers with the same name, birthday and address (for example). It is very likely that two such entries are actually the same person, so they can be consolidated. This process, like address cleaning, is helpful for companies looking to minimise wasted resource in marketing communications, as well as improve storage utilisation.
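A minimal sketch of de-duping on name, birthday and address might look like the following. The record layout (dictionaries with `name`, `dob` and `address` keys) and the keep-the-first-entry consolidation rule are assumptions for illustration; real consolidation usually merges fields rather than discarding the later entry.

```python
def dedupe_key(record):
    """Build a comparison key that ignores case and extra whitespace."""
    return (
        " ".join(record["name"].lower().split()),
        record["dob"],
        " ".join(record["address"].lower().split()),
    )


def deduplicate(records):
    """Keep the first record seen for each (name, dob, address) key."""
    unique = {}
    for record in records:
        # setdefault only stores the record if the key is not already present.
        unique.setdefault(dedupe_key(record), record)
    return list(unique.values())
```

Exact matching on a normalised key is the simplest option. It will miss misspellings such as "Jon" versus "John", which is where fuzzier comparison would come in.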

Matching: this is a process in which we create rules to identify groups of records that ‘match’ on specific criteria, such as people who live in the same household. Most of the matching we do is around name and address information, but we have the ability to match on anything within a dataset (DOB, tax number, stores transacted at, phone numbers, email addresses, gender etc.). Matches are useful when it comes to cross-selling or consolidating mail-outs (e.g. only sending one to a household).
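The household example can be sketched as a rule-based grouping. Here a "rule" is assumed to be a function that turns a record into a match key; `same_household`, `match_records` and `one_per_group` are hypothetical names, and treating "same address" as "same household" is a simplification.

```python
from collections import defaultdict


def same_household(record):
    """Example rule: records sharing a normalised address match."""
    return " ".join(record["address"].lower().split())


def match_records(records, rule):
    """Group records by the key that the rule produces for each one."""
    groups = defaultdict(list)
    for record in records:
        groups[rule(record)].append(record)
    return dict(groups)


def one_per_group(groups):
    """Pick the first record of each group, e.g. one mail-out per household."""
    return [members[0] for members in groups.values()]
```

Because the rule is passed in as a function, the same grouping works for other fields: a rule returning `record["email"].lower()` would match on email address instead.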



Benefits:

  • Reduce wasted resource and people time
  • Reduce storage with cleaner data
  • Enable accurate and streamlined analytics
  • Keep customer data accurate and up to date
  • High rate of accuracy once clean

Industries covered:

  • Banking and finance
  • FMCG
  • Utilities
  • Telecommunications
  • Retail
  • Travel and Tourism
  • Entertainment
  • Insurance
  • Government


Schedule a free phone chat with a Datamine consultant

Interested in learning more about how Datamine could help you get your data clean and ready to analyse?  Fill out the form below to schedule a call with us.