In order to compare relational and NoSQL architectures, we must begin with their historical roots. The focus of this paper will be on the business drivers that led to each architecture being readily adopted by the IT community, as opposed to an in-depth historical essay on each model’s technical genesis. Describing why the products became popular provides us with the foundation to perform an analysis of the relational and NoSQL models.
Why Did the Relational Model Become Popular?
I started my career as an IBM IMS hierarchical database DBA. During this time, I spent several years managing very large IMS database systems that supported tens of thousands of concurrent users and hundreds of nightly batch processing programs. I was a student of all database systems and was reasonably well versed in both hierarchical and CODASYL/network database architectures. I was administering those environments when the first relational database products were unveiled to the IT community.
The hierarchical and network architectures that were popular at that time required that the logical and physical layers be entirely dependent upon each other. Both data storage and data navigation were rigidly defined. In IMS, the application programs could not deviate from the data paths that were prebuilt using a combination of Database Descriptors (DBDs), which defined the physical structure of the data, and Program Specification Blocks (PSBs), which were the predefined navigation paths. Programs were required to follow the prebuilt paths to navigate through the stored data.
When a member of the development team wanted access to data that wasn’t predefined in an existing navigation path, the database administrator was required to modify an existing PSB or create an entirely new one. Changing DBDs to add new data elements and PSBs to establish new navigation paths often required that the data be reorganized and programs recompiled. These environments could fairly be described as “rigid.”
The Advent of Relational Systems
I had the extremely good fortune of working with Craig Mullins, who went on to become one of the thought leaders on IBM’s relational database management system, DB2. Craig spearheaded the initial DB2 implementation efforts in our environment. He would walk into my cube and espouse the benefits of set versus row processing, of separating the logical representation from its physical layer, and of the resulting flexibility for both application development and data storage. He described IMS as “legacy” and told me that I needed to work with him supporting relational products. Craig firmly believed that the relational model would become the de facto standard for the majority of future database implementations.
“Blasphemy!” I said. How could one ever expect to navigate through data without using predefined, physical pointers? Establishing relationships between the stored data elements using a combination of table definitions, the data values themselves and the SQL language was a totally foreign concept at that time. Our job as database administrators was to predefine those navigational constructs, allowing the application development teams to traverse them to access the desired information.
But the more I learned, the more I became a proponent of the relational model, eventually becoming a member of the shop’s relational database support team. We were pushing the envelope of new technology at that time, working on some of the first commercially viable releases of IBM’s flagship relational database product. Other, more conservative members of our organization would tell us that the relational model was “a flash in the pan” and that it would never gain widespread adoption. Relational databases, they said, would be relegated to niche implementations.
I think we can draw a parallel between the initial implementation of relational systems and NoSQL. As NoSQL matures and new features are added that allow it to be more universally implemented, it is a relatively safe assumption that it will follow the same natural path as its relational counterpart.
I then assumed for Oracle’s RDBMS product the same role Craig had played for DB2 at our shop, installing and administering Oracle Version 6. I built my organization’s first non-mainframe production application using Oracle Version 7 as the database. The only hardware I had available to use as a database server was my personal desktop.
Our first application was a phone directory that stored a little over 20,000 rows. I could tell when the phone operators were scanning the data using wildcards because my mouse pointer would begin to stutter across my screen.
Once again, I was astounded at the benefits and features this new upstart product provided. It could run anywhere; in other words, it didn’t need a mainframe. The ability to run on low-cost “commodity” hardware, reduced support requirements and the product’s flexibility were the primary driving factors for Oracle’s increasing popularity at that time.
I was subjected to the same comments as before. I was frequently told Oracle would never be a viable replacement for more traditional systems and that it would always be a niche player. Mainframes would never be replaced by an unwieldy and unmanageable collection of Linux and Windows servers.
Let’s fast-forward to NoSQL. NoSQL’s beginnings are subject to debate. The argument can be made that IBM’s hierarchical offering IMS, the IDMS/CODASYL network databases and several other systems that predate relational products could loosely be defined as NoSQL. For the sake of this discussion, we’ll focus our analysis on the inception of the more recent offerings, which include MongoDB, Redis, Oracle NoSQL and the numerous Amazon projects.
Like their relational counterparts in their early years, NoSQL databases are currently thought to be most appropriate for special-purpose implementations. In NoSQL’s case, those special purposes include storing semi-structured and unstructured data as well as accommodating large amounts of data and high numbers of concurrent users.
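To give a feel for what semi-structured data looks like in practice, here is a minimal sketch in plain Python. It does not use any NoSQL product’s API; the records, field names and helper functions are hypothetical and exist only for illustration. The point is that each record carries only the fields it needs, with no shared table definition.

```python
import json

# Hypothetical records for illustration. No two share the same set of fields,
# which would be awkward to model as a single fixed-column table.
documents = [
    {"_id": 1, "name": "Avery", "email": "avery@example.com"},
    {"_id": 2, "name": "Blake", "phones": ["555-0100", "555-0101"]},
    {"_id": 3, "name": "Casey", "address": {"city": "Pittsburgh", "state": "PA"}},
]


def find(docs, field):
    """Return the documents that contain the given top-level field."""
    return [doc for doc in docs if field in doc]


def field_names(docs):
    """Return the sorted union of top-level field names across all documents."""
    return sorted({key for doc in docs for key in doc})


# Each document can be serialized and stored as-is, nested values included.
serialized = [json.dumps(doc) for doc in documents]

print(field_names(documents))
print([doc["_id"] for doc in find(documents, "phones")])
```

In a relational design, the optional email, the repeating phone numbers and the nested address would typically be split across several tables or left as nullable columns. Document-oriented stores accept records shaped like these directly.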
The historical genesis of this new class of products is older than most realize. For example, Neo4j, a NoSQL graph database project, was started in 2000 with the first production deployment coming in 2003. The founders of Neo4j built the first prototype in an attempt to solve the performance problems they were experiencing with relational database management systems. The data they were attempting to store and process didn’t fit into the relational storage model.
The years from 2005 to 2010 were especially fruitful for the creation of NoSQL product offerings. MongoDB, CouchDB, Cassandra, Redis and HBase were all unveiled during those years.
The innovators behind many of the NoSQL product offerings were super-sized technology players that were being affected by new social media and internet retail traffic profiles. Facebook, Amazon, and Google were all contending with exponential increases in concurrent users and user-generated content that did not fit neatly into tabular rows and columns.
Although the middle-tier web servers could easily be scaled horizontally to accommodate increased demand, their database server counterparts presented more of a challenge. While traditional database clustering had been readily available for years, it was both costly and complex to administer. These companies also understood that buying progressively more powerful hardware to provide vertical database server scalability presented its own set of limitations.
Their need to store unstructured data, combined with their need for almost absurdly high degrees of scalability, data distribution and availability, was the business driver behind their innovation, which led to the creation of database management systems that did not adhere to the relational model. Many of the NoSQL products are specifically designed to leverage low-cost hardware to provide horizontal scalability and data redundancy at an affordable price point.
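The idea of spreading data across inexpensive servers while keeping redundant copies can be sketched in a few lines. The following is a simplified illustration, not the method used by any particular NoSQL product; the function names, node names and the choice of hash-modulo placement are all assumptions made for the example.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def replicas_for(key: str, nodes: list, copies: int = 2) -> list:
    """Pick `copies` distinct nodes for a key: a primary plus the nodes that follow it."""
    if copies > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    start = shard_for(key, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


# Hypothetical commodity servers.
nodes = ["node-a", "node-b", "node-c", "node-d"]

for user_id in ["user:1001", "user:1002", "user:1003"]:
    print(user_id, "->", replicas_for(user_id, nodes))
```

Because every key always hashes to the same nodes, any application server can locate a record without consulting a central directory, and losing one node still leaves a second copy available. A plain modulo scheme like this one relocates most keys when the node count changes, which is one reason production systems generally rely on more elaborate placement strategies.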
MongoDB was founded in 2007 by technologists who had worked at DoubleClick, ShopWiki and Gilt Groupe. Like their larger counterparts, they realized that a different solution was needed, one that addressed the needs of storing non-standard data, data distribution and horizontal scalability. Big players weren’t the only vendors that recognized that the relational model wasn’t a good fit for every application. Whenever there’s a gap, smart people will step in to fill it.
Drawing Parallels Between Relational and NoSQL Adoption
One of the key initial adoption drivers for relational systems was their flexibility. The hierarchical and network database competitors during the early years of relational product adoption were well known for rigidly coupling the logical and physical layers. There was no abstraction layer. Relational databases offered unparalleled flexibility when compared to their non-relational counterparts.
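The flexibility described above can be illustrated with a small sketch. It uses Python’s built-in sqlite3 module as a stand-in for any relational engine (not DB2 or Oracle specifically), and the tables, columns and sample rows are hypothetical. The relationship between the two tables exists only through matching data values, so a new question can be asked without an administrator predefining a navigation path.

```python
import sqlite3

# Hypothetical schema and data for illustration, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (10, 'Accounting'), (20, 'Engineering');
    INSERT INTO employee VALUES (1, 'Avery', 10), (2, 'Blake', 20), (3, 'Casey', 20);
""")


def employees_by_department(connection):
    """Relate the tables at query time by matching dept_id values."""
    cursor = connection.execute("""
        SELECT d.dept_name, e.emp_name
        FROM employee AS e
        JOIN department AS d ON d.dept_id = e.dept_id
        ORDER BY d.dept_name, e.emp_name
    """)
    return cursor.fetchall()


def headcount_by_department(connection):
    """Ask a different question of the same tables; no structural change is needed."""
    cursor = connection.execute("""
        SELECT d.dept_name, COUNT(*)
        FROM employee AS e
        JOIN department AS d ON d.dept_id = e.dept_id
        GROUP BY d.dept_name
        ORDER BY d.dept_name
    """)
    return cursor.fetchall()


print(employees_by_department(conn))
print(headcount_by_department(conn))
```

Both queries operate on sets of rows rather than walking pointers one record at a time, and the second query required no change to how the data is stored.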
Although DB2 was originally a mainframe product, during the early growth stages of the relational model there were competing product offerings that were marketed as being able to run on lower-cost Windows and UNIX operating systems. A parallel can easily be drawn between the marketing literature of the early relational database players (Oracle, Informix, Ingres, Sybase), which touted their products’ flexibility and ability to run on cost-effective hardware, and that of today’s NoSQL vendors.
An argument can now be made that the same business drivers that led to relational systems eclipsing their hierarchical and network counterparts will also fuel NoSQL’s increased acceptance, and that NoSQL is a natural progression of database technology.
Each new release of any database product contains numerous new features and functionality. Database vendors know that they must add new features to remain competitive. A competitive marketplace forces all software vendors to maximize their product’s inherent feature set. Constant innovation and integration of new features that differentiate their products from other vendors is an absolute requirement for their continued competitive survival.
It remains to be seen if NoSQL’s increasing feature set will allow it to directly compete with relational systems. Relational product vendors, during the early stages of their lifecycle, were also often defined as being niche players. As they matured, they listened to what customers wanted and improved their product offerings accordingly to gain competitive advantage. If they didn’t, they fell by the wayside.
As NoSQL database products continue to mature, they will become more robust, more intelligent and more standardized. As a result, their adoption rate will continue to grow, as it would with any technology possessing these traits. Organizations will increasingly view them as standard infrastructure choices for new database application implementations.
Will the NoSQL vendors’ desire to increase market share, which may require them to compete more directly with relational product manufacturers, drive them to add functionality that allows them to be more widely adopted? The larger relational vendors will attempt to co-opt any NoSQL technology that challenges their dominant role in the industry. As they identify offerings as tangible threats, their strategy will be to ensure that the technologies used by those vendors become a component of, not a replacement for, their traditional database products. The key to their continued dominance will be their ability to distinguish technologies that are destined to become more widely adopted from those that will continue as niche offerings, and to seamlessly integrate the former.