Big data… We hear about it literally
everywhere these days, but what is it, really?
This past spring I was contacted by a
specialist network I am engaged in (Informed Decisions). They asked if I, on very
short notice, could participate as a speaker at a seminar arranged by
Embarcadero (http://www.embarcadero.com/)
in Manchester. The brief was short and
to the point: “just talk about your ‘standard’
information modeling and taxonomy management approaches”. “OK, I can do
that. 40 minutes, you said? Right, no problem!”
The next call was from the seminar planner at
Embarcadero. Ninety-nine percent of the call was super nice chatting, but the
comment “Bear in mind that the audience
is expecting talks to address Big Data and Information Architecture content”
almost made me choke. I called Informed Decisions back… “Listen guys, you know I am not a
Big Data Architect…”
I’ll spare you the details of the calls
that followed, but somehow they all managed to convince me that it was all a
really good idea. Personally, I felt like I was about to enter the lions’ den in
the ancient Colosseum. Not only do I not know that much about database models, I
also tend to resent databases as I work mainly with unstructured information
and find the theories behind database architectures too limiting. In any
case, this truly forced me to start thinking about what “Big Data” really is
and what it means to us.
After a rather quick read-through of a
number of papers on the matter, I came to an “astonishing conclusion”. My
revelation: Big Data simply equals an obscene
amount of data… Joking aside, it is quite clear that the label, rather
than referring to the actual number of petabytes, refers to the
challenges we face in dealing with these obscene amounts of data, very much from
a technology and hardware perspective.
Reading on, I also came across some analyst
reports on the topic that supported the same conclusion but also brought in a perspective
which, to me, is more interesting, namely that of how to use Big Data. There
was one particular quote that really caught my attention: “All the data in
the world doesn’t help if the right questions aren’t asked, and big data does
not generate such questions, or even contribute to their formulation.” (Jones/Silberzahn, Forbes Leadership, 7 Feb 2013). It reminded me of another quote, more
than a decade old (source forgotten), saying roughly that “the volume
of information which an individual in an eighteenth century village encountered
during a lifetime about the world outside the village equals one issue of the
Financial Times.” This, of course, is an illustration of sorts, but its
purpose is to put us humans at the center of the information tsunami that we are
experiencing to an ever greater extent.
Hence, the main theme of my presentation
ended up being that, when it comes to humans, Big Data is nothing new
whatsoever. The challenge has been with us for decades, and to the human brain
the difference between far too much and obscenely too much is nil. We just
cannot handle it. Hence, from a human usage perspective, we can still approach
the data volumes with the same models and tools we have used for some time now.
The true superpower of Big Data, however, comes when we can start combining
such practices with the new models growing out of the sheer hardware and
software needs driven by the obscene number of petabytes that need to be
handled. I further realized, during my “research” prior to the presentation,
that if this is to happen, people like me and true Big Data/architecture people
need to get together and start bouncing ideas off each other. Maybe that is
already happening everywhere, but for me this was a first encounter. I have
continued my “quest for more understanding in the field” by doing just that, and
it truly seems that there is a lot to learn in both directions,
so let’s start bouncing ideas off each
other!
So, did the presentation go down well? That
is not for me to judge, but from the feedback I got, yes, it did.
As always, I’d love to hear your thoughts
about this below.