In which we learn the true meaning of "Big Data" by asking existential questions. Allergy warning: contains references to cats.
Welcome to the first Epinomy blog! Over the coming months we will dive into the world of modern information management with a fun biological twist, often examining metaphors and similarities between these two worlds. We believe that big data and the management of information are all about the three T’s, “tables, text and triples,” and over the course of the blog we will attempt to simplify the highly complex process of enterprise information management.
Taxonomy, semantics and ontology, normally the jargon of biologists, have now become paramount in the world of information management as organizations strive to discover all the relevant information required for business-critical decision-making. During the 1980s and 1990s, biological words like “niche” and “symbiosis” entered the world of IT jargon. They became important and were followed by “ecosystem” at the beginning of this millennium. Now it’s not just about your products and services, what niche you compete in or how symbiotic your products are with other platforms. What is important is the robustness of your ecosystem and the additional value that it brings to the customers at its center.
Taxonomy, semantics and ontology are not new in the world of document, content and records management, but they are new to mainstream computer jargon, somewhat like NoSQL databases. According to the Oxford English Dictionary, sematology is of Greek origin and is defined as “the doctrine of the use of ‘signs’ (esp. words) in relation to thought and knowledge.” Semantic, originating around 1665, is again of Greek origin and simply means “to show or signify” or, in a broader sense, “relating to the significance of meaning.” Interestingly, there are many similarities between the worlds of technology and the biological sciences, primarily because both disciplines depend on interrelationships.
The classification of organisms on earth is well established and follows the Linnaean system of nomenclature; the interrelationships between all organisms, however, are still not well known. Another interesting similarity is that 76% of all animals on earth are little-known invertebrates, and nearly 80% of all data in organizations is in an unstructured format, making it even more difficult to discover and understand its meaning and interrelationships. In the sphere of enterprise information management, taxonomy tools follow a complex doctrine of signs, or what IT professionals call signals.
Typical enterprise content repositories generate signals in the form of folder structures, document titles and, sometimes, useful metadata associated with documents. This is in contrast to the WWW, which is a tapestry of metadata linking information together. Most enterprise documents live in isolated silos with little interaction outside of their folder cells. Good metadata makes it easier to find documents and can be used to refine search results and to navigate and drill down to the right answer in a few mouse clicks. See the Epinomy white paper on this site, Leveraging Semantics to Find Enterprise Big Data, for a deeper dive into signals, metadata creation and taxonomy management.
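For readers who like to see ideas in code, here is a minimal sketch of how metadata can refine a set of search results and support drill-down. It is only an illustration: the documents, the field names (folder, type, year) and the helper functions refine and facet_counts are all hypothetical and do not describe any particular product.

```python
from collections import Counter

# Hypothetical documents; titles, field names and values are illustrative only.
DOCUMENTS = [
    {"title": "Q3 sales forecast", "folder": "finance", "type": "spreadsheet", "year": 2013},
    {"title": "Supplier contract", "folder": "legal", "type": "contract", "year": 2012},
    {"title": "Q3 budget review", "folder": "finance", "type": "report", "year": 2013},
    {"title": "Employee handbook", "folder": "hr", "type": "policy", "year": 2011},
]


def refine(documents, **filters):
    """Keep only the documents whose metadata matches every filter."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in filters.items())
    ]


def facet_counts(documents, field):
    """Count how many documents carry each value of a metadata field."""
    return Counter(doc[field] for doc in documents if field in doc)


# First click: narrow the results to one folder.
finance_docs = refine(DOCUMENTS, folder="finance")

# Second click: drill down further by document type and year.
reports_2013 = refine(finance_docs, type="report", year=2013)

# Facet counts tell the user which refinements are available.
folder_facets = facet_counts(DOCUMENTS, "folder")
```

Each call to refine stands in for one mouse click: the richer and more consistent the metadata, the fewer clicks it takes to reach the right document.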
We don’t think big data is going away anytime soon, and it’s not really new. What we do see is a new “data-driven culture of real-time decision making” that enables organizations to gain and maintain competitive advantage by discovering interrelationships between data types and leveraging that knowledge. This “data-driven paradigm” and the pursuit of “all your data all the time” are not yet a reality in many organizations because of the disparate and siloed nature of enterprise data and what we call metadata madness.