Structured data is traditionally stored in rigid databases called "relational databases" or "relational database management systems" (RDBMS). For decades now, most organizations have had at least one of these relational databases, such as Oracle, SQL Server or MySQL. The language used to interact with these databases is called "SQL", or Structured Query Language.

Relational Databases

Relational (sometimes called SQL) databases are designed for efficient storage of highly structured information. To use a relational database effectively, you have to define the columns of each table in a very rigid way and define the relationships between the tables using keys (a primary key in one table, referenced by a foreign key in another). The process of defining these columns and relationships in the most efficient way possible is called "normalization", and is a royal pain in the neck.

The cardinal rule of normalization is that you should avoid duplicating values. For example, instead of putting the value "Red" in a table of cars, you should create a reference to a row in another table of colors. Instead of putting the customer name, address and contact information into a transaction table, you put a reference to a row in a separate customer table. This is storage efficient, because you only need to put the customer into the system once and can then reference that customer from multiple locations, multiple times. To retrieve the customer along with a transaction you use a "join" operation, which retrieves the information from both tables in a single query. It is quite elegant and cool, which is why it has been the dominant database structure since the mid-1980s.
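
As a rough illustration of the customer example above, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table names, column names and sample values are invented for the example and do not come from any real system.

```python
import sqlite3

# Hypothetical schema for illustration only; all names here are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT NOT NULL
    );
    CREATE TABLE transactions (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
""")

# The customer is stored exactly once...
conn.execute("INSERT INTO customers VALUES (1, 'Ada Example', '1 Main St')")

# ...and referenced by id from any number of transactions.
conn.executemany(
    "INSERT INTO transactions (customer_id, amount) VALUES (?, ?)",
    [(1, 19.99), (1, 5.00)],
)

# A join retrieves information from both tables in a single query.
rows = conn.execute("""
    SELECT t.id, c.name, t.amount
    FROM transactions AS t
    JOIN customers AS c ON c.id = t.customer_id
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

The customer's name appears in every result row even though it is stored only once, which is the whole point of the reference-plus-join approach.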

NoSQL Databases

MarkLogic is a NoSQL database. That means that you don't have to define the rigid data structures that normalization requires, and you can store things in a more natural way. The equivalent of a table row in MarkLogic is an XML document. A few things make MarkLogic and its XML documents a compelling way to store structured information.

  1. They do not require a rigid data structure. This means you do not have to define the shape of your documents before you add them to the system. As long as a document is well-formed XML, it will be loaded into MarkLogic without complaint.
  2. Fewer joins. Since you are encouraged to store actual values in a document (instead of a link to the actual values), you do not have to look up the color "Red" in a table of colors. You just put the word "Red" into the <color> element of the document.
  3. One of the biggest strengths of the MarkLogic database is also its greatest weakness: the XQuery language. XQuery is a much more powerful way to retrieve information than SQL, but it is much less popular than SQL. That means there is a learning curve for most people*.
  4. ACID compliance. One of the biggest complaints about NoSQL databases is that many of them are not suitable for transactional applications (like those at financial institutions) because they do not support ACID transactions. MarkLogic does support this crucial property, which sets it apart from many other NoSQL databases.

* MarkLogic 8 supports JavaScript as an alternative to XQuery. JavaScript is much more widely known, which significantly increases the number of developers available for MarkLogic.
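
Points 1 and 2 in the list above can be sketched without MarkLogic at all. The example below uses Python's standard xml.etree.ElementTree parser purely to show the shape of the data; it is not MarkLogic's API, and the element names and values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical car documents with different shapes.
# No schema is declared anywhere; both are simply well-formed XML.
car_a = "<car><make>Ford</make><color>Red</color></car>"
car_b = "<car><make>Fiat</make><color>Blue</color><sunroof>true</sunroof></car>"

docs = [ET.fromstring(text) for text in (car_a, car_b)]

# The color is stored in the document itself, so no lookup table or join is needed.
colors = [doc.findtext("color") for doc in docs]

# An element that only some documents carry comes back as None for the others.
sunroofs = [doc.findtext("sunroof") for doc in docs]

print(colors)
print(sunroofs)
```

The second document has an extra element the first one lacks, and nothing had to be altered up front to allow it. In a relational table the same change would mean adding a column for every row.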
