Real-time is a Buzzword Bubble. The Truth is: 90% of BI Cases Don't Need it
Again and again, I cannot decide whether to scream or cry – or just leave the meeting room resignedly. "Can you do that in real-time?" There it is again, that question – posed in all naive innocence or in the trumpeting gesture of a person who thinks he or she has finally forced the service provider into a corner.
It is a clear symptom of the buzzword era, which we've been experiencing in the analytics context for quite some time now – it seems to be obligatory these days to throw smart phrases around in conversations on the subject. Otherwise, maybe one would immediately be considered clueless? I cannot help but get the impression that in many cases it's not even clear what the corresponding terms mean at all; and, more importantly, which of the countless concepts with the fancy names is a good solution for which problem – and which is not.
Back to the concrete case of real-time: First of all, dear questioner – what do you actually mean? Do you mean 1) real-time data queries à la "give me the development of key figure X over time period Y" based on a raw data set that remains unchanged throughout the day, or do you mean 2) real-time data updates, that is, the constant updating of raw data in the data warehouse, on the basis of which various information is then output in real-time again?
If you mean the former: Lord, yes! An analytics or BI solution that is not capable of doing that these days should be thrown out of the window. Every user must be able to access the data he or she needs at any time – autonomously and quickly (okay, whatever: in real-time). This is a basic prerequisite for any truly value-adding analytics or BI solution – even though most traditional BI monoliths, which are still thriving in the commerce world, are still not able to handle this. If I have to wait two minutes for my query results, that's not only inefficient; it also spoils my desire to work with data (yes, it even does that in real-time) – data-driven work culture adieu.
If you mean the latter: Also yes, but – why do you think you need this functionality so urgently that you have to ask this question right at the beginning of our conversation? Even if no one else may have dared to oppose your self-confident buzzword talk before, I'll tell you straightaway: For 90% of all analytics use cases in the BI context (vs. in the context of primary systems like web tracking or ERP, which collect or generate data themselves and display it directly, or in the context of operational cases like real-time bidding on ad placements) real-time data updates don't make any sense at all – given what we've discussed about your business and issues so far, I would even guess 100% for you personally. The truth is: Real-time data updates in business intelligence are, technology-wise, anything but trivial and, because the data has to be processed again and again, cost a lot of performance, or rather computing power – and therefore money. Are you sure that you want to give us this money for a service that probably no one in your company will ever use?
Of course, the situation changes if you happen to work with the 10% of use cases for which real-time data updates can actually be relevant in the BI context as well, e.g. in connection with behavior-based communication with users on your website – in real-time, while they are still there.
My suggestion: Take a moment, sit down with your users and see if you really have real-time use cases for your BI. If that's the case, then collectively define which raw data in your data warehouse (not in your web tracking) needs to be updated in real-time (probably very little). This way, an update process can be set up exclusively for the corresponding data set, while at the same time ensuring a cost-efficient, lean setup. Alright?
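For the technically inclined, here is roughly what such a selective setup could look like: a minimal Python sketch, not a description of any particular product. The data set names, the intervals and the `due_for_refresh` helper are all made up for illustration; the only point is that each data set carries its own update interval, so only the few that need near-real-time updates are reprocessed constantly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Dataset:
    """A raw data set in the warehouse with its own update interval (hypothetical model)."""
    name: str
    refresh_every: timedelta
    last_refreshed: datetime


def due_for_refresh(datasets, now):
    """Return the names of data sets whose update interval has elapsed."""
    return [d.name for d in datasets if now - d.last_refreshed >= d.refresh_every]


# Illustrative values only: one near-real-time data set, everything else daily.
now = datetime(2024, 1, 1, 12, 0, 0)
datasets = [
    Dataset("onsite_behavior_events", timedelta(seconds=30), now - timedelta(minutes=1)),
    Dataset("orders", timedelta(days=1), now - timedelta(hours=3)),
    Dataset("marketing_costs", timedelta(days=1), now - timedelta(hours=3)),
]

print(due_for_refresh(datasets, now))  # ['onsite_behavior_events']
```

In this sketch only the behavior events are picked up again at noon; the daily data sets wait for their regular run, which is exactly where the savings in computing power come from.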
Well, I had to get that off my chest. Until next time with another insight into my thoughts and experiences around business intelligence & co.