Index everything and make it easy to find.
Data lakes are giant pools of information: a single central location that can hold every byte an organization produces. Every database record, transaction, patient or customer record, case narrative, time series, image, video, or document is poured into the data lake with minimal processing.
Why would I want to do that?
Databases are optimized to handle structured data. Structured data is rigid: it requires you to define every column before you store a single byte of data.
The vast majority of enterprise information, however, is unstructured: Word documents, PDFs, web pages, and anything else that is primarily text.
Data lakes accept both kinds of content and index it into a searchable body of knowledge that spans data silos and provides a Google-like search experience inside the enterprise.
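To make the idea concrete, here is a minimal sketch of cross-silo indexing: a structured database record and a free-text document are both flattened to tokens and fed into one inverted index, so a single search spans both silos. The documents, IDs, and helper names are hypothetical, and a real data lake would use a full search engine rather than a dictionary.

```python
from collections import defaultdict

def flatten(doc):
    """Reduce structured or unstructured content to searchable tokens."""
    if isinstance(doc, dict):                      # structured record
        text = " ".join(f"{k} {v}" for k, v in doc.items())
    else:                                          # free text
        text = doc
    return text.lower().split()

# One inverted index spanning every silo: token -> set of document IDs.
index = defaultdict(set)

def ingest(doc_id, doc):
    for token in flatten(doc):
        index[token].add(doc_id)

def search(term):
    return sorted(index[term.lower()])

# One record from a CRM silo, one paragraph from a document silo:
ingest("crm-001", {"customer": "Acme", "status": "churn-risk"})
ingest("doc-042", "Meeting notes: Acme raised concerns about renewal pricing.")

print(search("acme"))   # -> ['crm-001', 'doc-042']
```

A query for "acme" surfaces both the CRM row and the meeting notes, which is the "Google-like" experience in miniature: the searcher does not need to know which silo holds the answer.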
How does that help me?
When everything is indexed, it is much easier to mine the text for connections to the table columns or dimensions that it references.
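One way to picture that mining step, sketched below with hypothetical data: scan each indexed document for values drawn from a dimension table and record the links. Real entity linking is fuzzier than exact string matching, but the principle is the same.

```python
# A "product" dimension from the structured side of the lake.
products = {"WidgetPro", "GizmoMax"}

# Unstructured documents already sitting in the lake.
documents = {
    "ticket-7": "Customer reports WidgetPro crashing on startup.",
    "email-12": "Renewal quote for GizmoMax and WidgetPro attached.",
}

# Link each document to every dimension value it mentions.
links = {
    doc_id: sorted(p for p in products if p in text)
    for doc_id, text in documents.items()
}

print(links["email-12"])   # -> ['GizmoMax', 'WidgetPro']
```

Those links are what turn a pile of text into something you can join against the rest of your data: every support ticket or email now carries the dimension members it references.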
One of the defining features of a data lake is that it is very easy to set up and populate compared with competing technologies. Content is added with minimal processing and stored without requiring onerous data modeling and normalization.
A typical master data management application requires hundreds of engineer-hours of manual extraction, transformation, and loading from disparate sources.
Epinomy takes the data as-is. As long as it can be represented as raw text, XML, JSON, CSV, or another common interchange format, it can be imported natively, with all structure preserved, and instantly indexed and searchable.
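As a rough illustration of "as-is" ingestion (not Epinomy's actual implementation), each format below is handled by a stock parser, its structure preserved exactly as it arrives, with no target schema defined up front. The file contents and the `ingest` helper are invented for the example.

```python
import csv, io, json
import xml.etree.ElementTree as ET

def ingest(raw, fmt):
    """Parse incoming content with a stock parser; no schema required."""
    if fmt == "json":
        return json.loads(raw)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    if fmt == "xml":
        return ET.fromstring(raw)
    return raw  # plain text: stored verbatim

# Three sources, three formats, zero data modeling:
lake = {
    "a": ingest('{"patient": "P-17", "visits": 3}', "json"),
    "b": ingest("sku,qty\nA1,5\nB2,9\n", "csv"),
    "c": ingest("<case id='9'><note>open</note></case>", "xml"),
}

print(lake["a"]["visits"])      # -> 3
print(lake["b"][1]["sku"])      # -> B2
print(lake["c"].attrib["id"])   # -> 9
```

Note what is absent: no CREATE TABLE, no column mapping, no normalization pass. The original structure of each source is still fully navigable after import.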
A data lake stores all of your enterprise information, regardless of what format or structure it has.