We get asked a lot about both historians and big data. Since our team works with both (and many other types of data), we thought we should add our perspective to help clear up, or further muddy, the topic of Big Data.
First, let's get our heads around big data with some definitions from around the web:
- Wikipedia: Big Data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single data set.
- Gartner: Big data is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
- McKinsey: “Big data” refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data—i.e., we don’t define big data in terms of being larger than a certain number of terabytes (thousands of gigabytes).
For the purposes of answering how Big Data and Enterprise Historians relate, I think it is best to look not just at the size of the data (which is huge no matter how you look at it; see the graphic below), but also at the characteristics of the data and the tools necessary to get value from it.
This is best summed up in a free guide from our friends at GE (makers of the Proficy Historian) called "The rise of Industrial Big Data." It offers quite a few insights and details about the size of historian data, the speed at which it moves, and much more. Here, for example, is a graphic from the guide that shows the data output of a single factory area:
So we know historian data is big and has high velocity, but what really makes historian data fit the category of big data is how unique the tools are for storing, retrieving, and analyzing it. The simple fact that historians are their own category, and that most successful historians (OSIsoft PI, GE Proficy, Wonderware Historian, Rockwell FactoryTalk, etc.) are not based on relational databases or data warehouses, demonstrates just how specialized these tools must be.
So, is there any importance to putting historians in the Big Data category? Does it even matter? I think the answer is yes, for several reasons:
- Historians are not as well known as they should be. They have been around for decades but are known only to a small subset of the technology world, and they have remained stuck in a narrow band of industrial uses for far too long. Attaching them to the 'hotness' of the Big Data category may shine some light on these powerful engines and the value they bring.
- Historians bring some history and deep experience to the world of Big Data. Tens of thousands of companies run historians, and some historian customers have been running their systems for decades and have real knowledge that should be shared with the Big Data community.
- Many of the tools that grew up in the historian age could translate very well to the Big Data age, and more people need to know about them.
In our next post about big data we will explore how two giant and fast-growing industry topics are set to collide: Big Data & Mobility.
Until next time...