Rediscovering the data audit
Whether trading online or off, any UK data processing supplier worth its salt will offer a free data quality audit. This will at least show how many goneaways or deceased records a list has, and let the bureau give potential clients a cost estimate for the job. Beyond using audits to compare costs from multiple suppliers, it’s possible to pay for further, more detailed investigative work that can uncover fundamental issues in how a company manages its customer data.
Essential precursor
“We always encourage our clients to run a free data audit,” says Linda Churches, Solutions Consultant at Occam. “It gives an idea of what can be done for a given budget. We are also able to provide consultancy to show how other processes can be applied pre- and post-processing.”
The audit process is the same as any cleansing job: brief a service provider on the goals of the work and the reference files to be used. The provider runs the processing job and forwards the report with the aggregated results, but without actually appending any flags to the client file or making any chargeable hits on the reference files.
“It’s important to chat through the project first,” says Emma Thwaite, client services director at Alchemetrics. “The main thing is to understand the client’s needs and what they are trying to achieve.”
This need to consult in advance in order to decide on the data processing set-up is emphasised by Steven Day, director at UKChanges. “Are you going to use confirmed rather than assumed data to match against for customer files?” he asks. “An audit is useful but it must be seen in the context of the overall brief.”
Online audits work in the same way as offline ones, though there’s usually much less scope for stating your goals or doing much in the way of varying processing parameters. However, most online service providers will be happy to tune their audits and other services to clients’ requirements at little or no extra cost so it’s well worth picking up the phone.
A free audit fulfils a number of functions, some important and obvious, others minor or unexpected. The key measure for a mailing list is deliverability, so name and address quality tends to be the top priority in both b2b and b2c list audits. Measures of data quality here can include whether or not the list has full postcodes or less precise variables such as postcode areas, or if surname, address and postcode are well populated and accurate but first names are largely missing. Similarly, out-of-date postcodes and non-standard address formatting are also causes for concern. Other examples of issues include gender conflicts such as “Mr Mabel Smith”.
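Two of the checks above, postcode completeness and title/forename conflicts, can be sketched in a few lines of Python. This is an illustration only: the postcode pattern is a simplified assumption rather than validation against PAF, and the title and forename lookups are tiny hypothetical stand-ins for the reference tables a provider would actually use.

```python
import re

# Simplified shape of a full UK postcode (e.g. "SW1A 1AA"). An assumption for
# illustration; it does not confirm that the postcode exists or is current.
FULL_POSTCODE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")

# Hypothetical lookups; real audits rely on much larger reference tables.
MALE_TITLES = {"MR"}
FEMALE_TITLES = {"MRS", "MISS", "MS"}
MALE_FORENAMES = {"STEVEN", "RICHARD", "NIGEL"}
FEMALE_FORENAMES = {"MABEL", "LINDA", "EMMA"}


def has_full_postcode(postcode):
    """True if the value looks like a full postcode, not just an area or district."""
    return bool(FULL_POSTCODE.match(postcode.strip().upper()))


def has_gender_conflict(title, forename):
    """True if the title implies one gender and the forename the other."""
    t = title.strip().rstrip(".").upper()
    f = forename.strip().upper()
    return (t in MALE_TITLES and f in FEMALE_FORENAMES) or (
        t in FEMALE_TITLES and f in MALE_FORENAMES
    )
```

Counting how many records fail each check gives the kind of headline figures a basic audit report would show.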
At Alchemetrics and many other MSPs, the company also lists the data population of key list fields in a simple data profiling exercise: which fields are empty, whether they contain alphanumeric or control characters and so forth. This basic information could be critical to a company that’s only starting to look at its data for the first time.
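A field population report of this kind might look something like the sketch below. It is not any provider's actual method: the record layout (a list of dictionaries) and the output keys are assumptions made for the example.

```python
def profile_fields(records, fields):
    """Report, per field, the percentage of records with a non-blank value
    and the number of values containing control characters."""
    total = len(records)
    report = {}
    for field in fields:
        # Treat a missing key or None as an empty value.
        values = [(rec.get(field) or "") for rec in records]
        populated = sum(1 for v in values if v.strip())
        control = sum(1 for v in values if any(ord(ch) < 32 for ch in v))
        report[field] = {
            "populated_pct": round(100 * populated / total, 1) if total else 0.0,
            "with_control_chars": control,
        }
    return report
```

Run against the same fields at regular intervals, the output also gives the benchmark over time that more experienced users look for.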
“The audit identifies gaps that the client may need to fill, and it confirms the file specification to clients too,” says Thwaite. “It shows we have got what they expected to send.”
For those more experienced in customer data management, reports on data populations are unlikely to surprise but they still perform a useful function in providing a benchmark from an external expert that can be tracked over time.
At Data8, the audit flags up records where address field positions vary, where unexpected data types are encountered and many other issues. “It highlights where a file may need pre-processing work; it’s impossible to work with really dirty data in its original form,” says managing director Antony Allen.
The number of duplicates is obviously important, while the overall percentages of goneaway and deceased records are two of the most important results for a number of reasons, not least processing cost. According to Alchemetrics, a goneaway rate of around 3% and a deceased rate of 1% are a general industry standard for an active customer file.
More than that and, as well as cleaning the file, further investigation may well be warranted. As these rates vary in different industries, it’s worth consulting with the provider on this. “We offer a benchmarking service based on aggregated data we hold from other companies in the same sector,” says Churches.
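Comparing audit results with these figures is simple arithmetic. The sketch below uses the 3% and 1% rates quoted above as defaults; the function name and the shape of the inputs are assumptions, and sector-specific benchmarks from a provider could be passed in instead.

```python
# Rates quoted above for an active customer file; other sectors may differ.
BENCHMARKS = {"goneaway": 3.0, "deceased": 1.0}


def compare_to_benchmark(total_records, match_counts, benchmarks=BENCHMARKS):
    """Return each match rate as a percentage and whether it exceeds its benchmark."""
    result = {}
    for category, limit in benchmarks.items():
        rate = 100 * match_counts.get(category, 0) / total_records
        result[category] = {
            "rate_pct": round(rate, 2),
            "above_benchmark": rate > limit,
        }
    return result
```

A rate flagged as above benchmark is a prompt for the further investigation described here, not a verdict on the file.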
It may be that a file has simply not been updated for a long time or there could be more serious problems with the feeds to the customer database or the internal updating and data collection process. One useful technique is to break down the suppression matches by initial source to see if one source predominates.
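The source breakdown can be illustrated as below, assuming each record carries a hypothetical `source` label and a `suppressed` flag set during the audit. Neither field name comes from a particular provider's file layout.

```python
from collections import Counter


def suppression_rate_by_source(records):
    """Return {source: percentage of that source's records that matched
    a suppression file}."""
    totals = Counter(rec["source"] for rec in records)
    matched = Counter(rec["source"] for rec in records if rec["suppressed"])
    return {src: round(100 * matched[src] / totals[src], 1) for src in totals}
```

Expressing the result as a rate per source, rather than a raw count, stops the largest feed from looking like the worst offender simply because it supplies the most records.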
Profile information based on standard industry geodemographic classifications like Cameo and Acorn is increasingly included as part of a standard audit. If this is the first profile a client has seen then it may give an eye-opening rundown of their customer base. More advanced work using transactional indicators to profile top spenders can give valuable insights into the attributes and locations of the best customers.
Verifying email addresses as part of a free audit is more common these days. Work might include checking for correct formatting, a valid domain or even pinging the email to check its currency, though any web data investigation is unlikely to be part of a free audit.
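The formatting check is the easiest of these to illustrate. The sketch below tests only the shape of an address, using a deliberately loose pattern chosen for this example. It does not look up the domain or contact the mailbox, both of which need network access and are left out.

```python
import re

# Loose shape check: a single "@", no whitespace, at least one dot in the domain.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@([^@\s.]+\.)+[^@\s.]+$")


def email_format_ok(address):
    """True if the address has a plausible local@domain.tld shape."""
    return bool(EMAIL_SHAPE.match(address.strip()))


def email_domain(address):
    """Return the lower-cased domain part, or None if the format check fails."""
    if not email_format_ok(address):
        return None
    return address.strip().rsplit("@", 1)[1].lower()
```

Grouping records by the output of `email_domain` is one way to build the list of distinct domains that a later validity check would work through.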
Other audit information might include the range of external data that might be appended to a file to either fill in missing fields or to enhance the file to make it suitable for other marketing initiatives. This could be anything from phone numbers or insurance renewal dates to SIC codes in a b2b file or a count of the number of new addresses available for identified goneaways. Other companies include a simple ROI calculation for the processing indicated in the audit based on client-entered pack and mailing costs, while Data8 even offers a CO2 calculator.
“This is important to a number of companies and should be important to all companies,” says Allen. “They can use it to check against their corporate goals and it’s also relevant to PAS 2020.”
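The simple ROI calculation mentioned above might work along the following lines. The formula is an assumption made for illustration, not any provider's published method: it treats every flagged record as one wasted pack and one wasted postage charge in a single mailing, and sets that saving against the cost of the processing.

```python
def cleansing_roi(records_flagged, pack_cost, postage_cost, processing_cost):
    """Return (mailing_saving, roi) for removing flagged records before one mailing.

    roi is the net saving divided by the processing cost, so 1.0 means the
    job pays for itself twice over.
    """
    if processing_cost <= 0:
        raise ValueError("processing_cost must be positive")
    saving = records_flagged * (pack_cost + postage_cost)
    roi = (saving - processing_cost) / processing_cost
    return saving, roi
```

Because the saving recurs with every mailing to the same file, a single-mailing figure like this understates the return for frequent mailers.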
If the intention is to compare audits in order to help in picking a supplier, it’s definitely worth setting up the processing parameters to be exactly the same – insofar as that is possible. It’s also worth remembering that there are other factors in the decision to choose a data processing provider besides match results and bottom line costs. Strong consultative service alongside demonstrable expertise and experience in providing data processing to expert users are just as important.
Using the same reference files, matching to PAF beforehand (or not), specifying whether the list is made up of customers or prospects: setting standard job specs will make the comparison process more useful and accurate. The chief hurdle here is the variation in matching techniques used by different MSPs, and how that matching is set up, for example the order in which the file is matched against reference data (file hierarchy).
This is likely to be harder when using online self-service processing where there tends to be little or no facility to alter the processing parameters, though there are notable exceptions here such as UKChanges and dbg. The latter offers gold, silver and bronze options for matching “tightness”. There are also reference files such as Experian’s Absolute Movers which no other provider offers.
Because of these issues, comparisons are never going to be an exact science. However, by running, say, four audits, it should be possible to identify any real outliers in the results. For example, if three providers come up with a similar level of matches to goneaway and suppression files (usually the main cost in any data cleansing exercise) but the fourth has match rates of twice the others, some investigation is needed. Simply picking up the phone should be the first move as the difference may be down to something as simple as an error in loading the data.
“Matching at forename/initial level might show fewer deceased matches than other audits but if we did it at family level, then the match rate goes up significantly,” says Day. “Don’t buy on match rate or price alone.”
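Spotting an outlier among a handful of audits needs nothing more elaborate than a comparison with the other results. The sketch below flags any provider whose match rate is at least twice, or at most half, the median of the others. The factor of two echoes the example above, while the use of the median and the provider labels are assumptions for illustration.

```python
from statistics import median


def find_outliers(match_rates, ratio=2.0):
    """Return providers whose match rate is at least `ratio` times the median
    of the other providers' rates, or at most 1/ratio of it."""
    outliers = []
    for provider, rate in match_rates.items():
        others = [r for p, r in match_rates.items() if p != provider]
        if not others:
            continue  # nothing to compare a single audit against
        baseline = median(others)
        if baseline > 0 and (rate >= ratio * baseline or rate <= baseline / ratio):
            outliers.append(provider)
    return outliers
```

A flagged result is a reason to call the provider and ask how the matching was set up, not proof that the audit is wrong.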
Most MSPs are happy to go well beyond a standard contact list audit to delve deeper into inconsistencies in a file, list the possible reasons for their existence and then to look at possible remedial action. The boundary between free and paid is flexible, as any provider should expect to do some upfront work in order to scope a project and give the client an accurate estimate. A simple example might be splitting a large file by the date of the last contact with each customer. This means that the oldest records - and those most likely to be in error - can be prioritised for cleansing.
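Splitting a file by last contact date could be sketched as follows, assuming each record holds a hypothetical `last_contact` field stored as a Python `date`. The cutoff is whatever the client and provider agree on.

```python
from datetime import date


def split_by_last_contact(records, cutoff):
    """Split records into (older, recent) around a cutoff date. The older
    group is sorted oldest first so it can be prioritised for cleansing."""
    older = sorted(
        (r for r in records if r["last_contact"] < cutoff),
        key=lambda r: r["last_contact"],
    )
    recent = [r for r in records if r["last_contact"] >= cutoff]
    return older, recent
```

Called with, for example, `cutoff=date(2010, 1, 1)`, the function returns the pre-2010 contacts first in line for cleansing and leaves the rest in their original order.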
“We still see a real split in the market between companies that have shocking data and don’t know, and those that are more advanced in their use and knowledge of customer,” says Richard Lees, managing director at dbg. “As data has moved to board level, so the concept of the data audit has become about much more than the contactability of customer records. It goes beyond marketing to the area of CDI (Customer Data Integration) and MDM (Master Data Management).”
According to dbg, it’s the complexity of a business that produces the need for more extensive audits. There’s a lot more digging to do to resolve data quality issues at a company that is pulling in data from multiple on- and offline channels, affiliate partners and reseller networks, rather than simply running merge/purge on a collection of mailing lists. “In most cases, you can’t divorce data cleansing from the overall data strategy,” confirms Lees.
The initial audit often acts as a trigger for any further work. For example, creating then examining derived variables can show the suitability or otherwise of a file for analyses like RFV. There might be more extensive profiling work or visualisation at a higher level of detail based on the initial audit, such as focusing on the completeness and accuracy of the key transactional variables, then using those variables to segment the file by various customer groups.
Then the MSP can start to look at questions like: do those entering the customer database via a free web trial tend to spend more or have higher retention rates than those recruited via direct mail? Or does seasonality affect recruitment rates, and how does the effectiveness of recruitment channels vary across the year? Comparing the transactional and demographic characteristics of different segments might then lead to more informed decisions as to how each group should be incentivised and treated in future.
Tangible Data typically runs two advanced types of audits for clients, according to managing director Nigel Magson. The first is an exhaustive investigation of every variable in the database to identify links between tables, existing record keys or orphan records, and might be a precursor to a system redesign or data model change.
The second is “more of a marketing analyst’s view”, says Magson, and might involve work such as categorising channel codes before deciding on a smaller number that might be incorporated within a look-up table. Apparently one client arrived with over 14,000 separate codes in its customer file.
Consult and decide
Standard free audits can uncover fundamental errors in a file and, with some care, can also aid in comparing providers. With further investment, MSPs can list many other file characteristics in an advanced audit but the crux comes in understanding why the results are as they are, and what can be done to improve matters. As Magson says, “There are no right and wrong answers in analysing the results of an audit; it’s all about having an intelligent conversation with the client.”

