Mapping the data universe


Adopting a continual, pro-active approach to information governance is essential if retailers are to meet the challenges presented by big data. Martin Bonney, Director International Consulting Services, Epiq Systems, suggests the sector has room for improvement.

Big data presents big challenges for companies in the retail sector. Every year, a typical Fortune 500 company can produce several petabytes of electronic information. Each employee is likely to send and receive around 100 e-mails per day, and each piece of data is likely to cross the desktops of dozens if not hundreds of individuals.[1] Whether stored on hard drives, in databases, on removable media like USB keys and CDs, or on backup tapes, that data is archived and replicated, and grows exponentially. According to a 2012 study, 2.8 zettabytes of information were created that year. This is projected to grow to 40 zettabytes in 2020, representing 50-fold growth from the beginning of 2010.[2]

Companies in the retail sector need to be prepared to respond to regulatory investigations. For example, there have been recent instances of large retailers facing investigations from both the Serious Fraud Office and the Financial Conduct Authority in relation to problems with accounting and reporting.[3] In large-scale enquiries like these, the accurate tracking of correspondence, documents, audio evidence and emails relating to the case is essential. An accurate data map is crucial in anticipation of these kinds of investigations, however unexpected they may be.

We recently conducted a survey* into the impact of growing data volumes on major corporations, including retailers. More than three-quarters of the corporations surveyed feel confident in their ability to locate key data in the face of litigation or investigation. However, further research suggests such confidence may be misplaced. The survey went on to reveal that only around half of these companies continually monitor and update their data map.

Regulatory deadlines for document production can be as short as 14 days. If data is not continually assessed, the ability to respond to requests quickly, accurately and defensibly is severely tested.

So where to begin? How best to navigate this data maze? Taking a pro-active approach begins with gaining a clear understanding of the data universe, the records retention policies and the legal and regulatory needs of the business. This has obvious benefits in terms of reducing the cost of storage (many estimates suggest by more than 40 per cent) and the concomitant cost of processing and reviewing documents. Less quantifiable, but possibly more significant, is the risk of not doing so: organisations may retain data that comes back to bite them in the future, when it could justifiably have been deleted had retention policies been applied in practice.

Planning and communication are the essence of a successful information governance strategy. Getting the key players (typically at least IT, legal/compliance and a third-party eDiscovery provider) talking to each other, and investing the time to build a data map (essentially a description of the organisation's data types, technical infrastructure and storage solutions) is an essential first step. Assessing the actions needed to preserve and collect data can bring an early understanding of the scale and nature of the challenge.
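To make the idea of a data map concrete, the sketch below records each data source with its type, storage location, owner and retention period, and flags entries that have not been reviewed recently. It is a minimal illustration only: the field names, the sample entries and the 90-day review window are all assumptions made for this example, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataSource:
    """One entry in a hypothetical data map (all fields are illustrative)."""
    name: str
    data_type: str        # e.g. "email", "documents"
    storage: str          # where the data physically lives
    owner: str            # team responsible for the source
    retention_days: int   # how long policy says the data is kept
    last_reviewed: date   # when this entry was last checked


def stale_entries(data_map, today, max_age_days=90):
    """Return names of entries not reviewed within max_age_days (assumed window)."""
    cutoff = today - timedelta(days=max_age_days)
    return [source.name for source in data_map if source.last_reviewed < cutoff]


# Sample entries invented purely for illustration.
data_map = [
    DataSource("Staff email", "email", "mail server", "IT", 2555, date(2014, 1, 10)),
    DataSource("Supplier contracts", "documents", "file share", "Legal", 3650, date(2013, 3, 1)),
    DataSource("Backup tapes", "mixed", "offsite tape", "IT", 365, date(2012, 11, 20)),
]

print(stale_entries(data_map, today=date(2014, 3, 1)))
```

Running a check like this on a schedule is one simple way to keep a data map from becoming a one-off exercise; in practice the map would more likely live in a shared register or database than in code.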

Another benefit of building a data map is that it enables organisations to quickly highlight data sets that can be removed from a disclosure requirement, for example duplicate emails held in back-ups. This "low-hanging fruit" can be useful whether relating to a regulatory investigation, an internal investigation or litigation. Showing practical responsiveness to disclosure requests is a way to gain essential goodwill from regulators or the courts.
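As a rough illustration of how duplicate emails might be identified, the sketch below hashes a few fields of each message and keeps only the first copy seen. The message fields and the choice of which fields define a "duplicate" are assumptions for this example; real eDiscovery platforms apply their own, more robust de-duplication rules.

```python
import hashlib


def deduplicate(messages):
    """Keep the first copy of each message, comparing a hash of key fields.

    Which fields identify a duplicate is an assumption made for this sketch.
    """
    seen = set()
    unique = []
    for msg in messages:
        fingerprint = "\n".join(
            (msg["sender"], msg["sent"], msg["subject"], msg["body"])
        )
        key = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(msg)
    return unique


# Invented sample data: the second message is a back-up copy of the first.
messages = [
    {"sender": "a@example.com", "sent": "2013-11-04T09:00", "subject": "Q3 figures", "body": "Attached."},
    {"sender": "a@example.com", "sent": "2013-11-04T09:00", "subject": "Q3 figures", "body": "Attached."},
    {"sender": "b@example.com", "sent": "2013-11-05T14:30", "subject": "Re: Q3 figures", "body": "Thanks."},
]

print(len(deduplicate(messages)))
```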

Our experience within the sector suggests that the challenge for retailers extends beyond that of typical discovery for litigation purposes to encompass questions of economics and the need to provide transparency around fair pricing. Business leaders recognise the importance of this transparent approach to pricing in winning back and maintaining consumer trust, and the industry is pressing for guidance around the provision and communication of cost and profitability data, both historic and future.[4]

Only by bringing all relevant parties together and working closely as a team to define what is most important to the organisation in terms of time, cost and scope can the appropriate solution be built and, crucially, maintained. The data landscape is continually shifting, so mapping this landscape cannot be a one-off exercise. It is not enough to adopt an irregular pattern of data monitoring, and leading businesses are recognising the benefit of partnering with experts to adopt a pro-active, continual approach to information governance.


* Survey conducted by telephone in November 2013, targeting 100 respondents from large "blue chip" companies (defined as having more than $500M annual revenues, with the majority in the study, 74 per cent, having more than $1B annual revenues) in the UK, Germany, the Netherlands and Switzerland. The largest European organisations in Manufacturing and Construction; Retail; Financial Services; Utilities; Pharmaceuticals; Professional Services and IT/Telecoms took part. Respondents were typically the CFO/Finance Director, Head of Compliance/Compliance Director or the Head of Legal/Legal Director/Head of Counsel.

[1] The Radicati Group, Inc. "Email Statistics Report, 2013-2017."
[2] Gantz, John and David Reinsel. "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East." IDC iView, December 2012.
[3] See for example
