In today’s blog post, our founder and CEO Lennard Stoever comments on the BIG DATA hype. The article also appears in INTERNET WORLD Business, issue 7/2014, p. 42.
Since last year, we have been experiencing a real BIG DATA hype. Because the topic has been covered well beyond the specialist press, the term has become heavily diluted: by now, every online retailer seems to think they should be doing BIG DATA as well, though for most of them it has no direct relevance at all.
So what does BIG DATA actually mean?
Its definition is built on three criteria: volume (large numbers of data sets), variety (data from various sources, structured and unstructured) and velocity (the speed at which data is generated and processed). Until now, only a few big enterprises have been able to cope with such huge amounts of data technologically, conceptually and financially. The revolution of BIG DATA in today’s sense is that innovative technology now enables everyone to process data of nearly any size at relatively low cost. This is a blessing and a curse at once, and not only for online retailers: those who focus on BIG DATA too early easily lose sight of the essentials, because BIG DATA, implemented and used sensibly, is a time-consuming and resource-intensive project. Anyone who wants to get involved should think twice about whether BIG DATA is actually relevant for them at all.
So for whom is BIG DATA actually relevant?
To answer that, let’s look at a small sample calculation: a shop with an annual revenue of €30 million (which makes it one of the top 200 online shops in Germany) generates about 10 terabytes of structured data per year. This is enough to create a basic set of relevant business KPIs. Whether this can already be considered BIG DATA is answered quickly when you compare it with a shop like eBay, which generates about 100 terabytes, ten times as much, and not per year but per day! Ladies and gentlemen, this is BIG DATA!
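To put the two figures on the same time scale, here is a minimal Python sketch of the comparison. It uses only the two rough numbers quoted above; the function names, the 365-day year and the decimal conversion (1 terabyte = 1,000 gigabytes) are assumptions made for illustration.

```python
# Rough figures quoted in the text; estimates, not measurements.
SHOP_TB_PER_YEAR = 10    # shop with about EUR 30 million annual revenue
EBAY_TB_PER_DAY = 100    # daily volume quoted for eBay

DAYS_PER_YEAR = 365      # assumption: ignore leap years
GB_PER_TB = 1000         # assumption: decimal units


def daily_volume_tb(tb_per_year: float) -> float:
    """Convert a yearly data volume into an average daily volume."""
    return tb_per_year / DAYS_PER_YEAR


def volume_ratio(big_tb_per_day: float, small_tb_per_year: float) -> float:
    """How many times more data the big shop generates per day."""
    return big_tb_per_day / daily_volume_tb(small_tb_per_year)


shop_daily_gb = daily_volume_tb(SHOP_TB_PER_YEAR) * GB_PER_TB
ratio = volume_ratio(EBAY_TB_PER_DAY, SHOP_TB_PER_YEAR)

print(f"Shop: about {shop_daily_gb:.0f} GB per day")
print(f"eBay: about {ratio:,.0f} times the shop's daily volume")
```

Compared on a per-day basis, the shop produces roughly 27 gigabytes a day, and the eBay figure is about 3,650 times larger, not merely ten times.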
First Small Data, then BIG DATA
Nonetheless, we see much smaller online retailers racking their brains over Hadoop, NoSQL, in-memory and the like, with the result that the important basics are neglected far too often. Instead of wondering how sales figures relate to weather forecasts, in the spirit of what is popularly considered BIG DATA, a company should first and foremost clarify what “sales” actually means for itself. Do controlling, marketing and category management share the same understanding of this metric? In our work with various online shops, we have found far too often that this is precisely not the case, and that the executives are not aware of it. Yet exactly these things are the basics that are crucial for a company’s efficient growth: What are the important metrics for my shop? How should these metrics be defined company-wide? Which figures do I have to look at on a monthly, weekly or even daily basis to keep track of my business processes and steer their development efficiently? Am I aware that I do not have to look at my sales reports, inventory turnover and costs per click every day, and that a weekly look serves me better, while my marketing department should actually monitor revenue, return rates and gross margins per channel and campaign on a daily basis? Small shops with limited resources that already concern themselves with BIG DATA are especially in danger of losing sight of these vital questions.
Focus on the essentials!
My advice to all online shops: postpone BIG DATA until tomorrow! A clearly structured, functioning data warehouse that intelligently connects the most important data sources (shop, ERP, web analytics) is all you need to make your business processes transparent, from acquisition costs to actual profits. Analyzing Twitter feeds with Hadoop means taking the second step before the first. To begin with, it’s crucial to do your homework.
Lennard Stoever, minubo Founder & CEO