The Buzz about Open-Source Databases
Open-source databases and NoSQL databases aren't the same thing, but both are creating a buzz in the database marketplace.
By Rick Murphy
05/15/2017
You’ve likely heard the buzz around NoSQL databases and open-source databases, but aren’t these the same thing? This article explains these terms, and provides an overview of what’s happening in the database marketplace and why open-source databases are causing such a buzz. Finally, the article examines which open-source databases are available for Linux* on POWER*.
What Are NoSQL and Open-Source?
NoSQL generally refers to databases that don’t have an SQL interface, but more recently some so-called NoSQL databases have come to support some SQL or another query language. As a result, NoSQL has evolved to mean “Not only SQL.”
Although the term “NoSQL” was coined as long ago as 1998, it came to prominence in 2009 when it was used to describe the emergence of new, non-relational databases, which don’t view data in strictly defined tables of rows and columns. In this sense, a NoSQL database is simply one that is not relational.
Closed source refers to software whose source code is kept secret to prevent copying. Open source software’s source code is open and available for study, modification and even redistribution. Open-source software is often free to download and use.
An open-source database is a database system whose source code is open in this way. It can be relational or non-relational (NoSQL), which is why the two terms aren’t interchangeable.
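To make the relational versus non-relational distinction concrete, the sketch below uses Python's built-in sqlite3 module as a stand-in relational database and a plain list of dictionaries as a stand-in for a schema-less store. The table, column names and sample records are hypothetical, and the list of dictionaries only imitates the data model of a NoSQL database; it is not how any particular product is implemented.

```python
import sqlite3

# Relational: every row must fit the columns declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers (id, name, city) VALUES (1, 'Ada', 'London')")


def insert_with_extra_column():
    """Try to store a field the table was never told about."""
    try:
        conn.execute(
            "INSERT INTO customers (id, name, twitter) VALUES (2, 'Bob', '@bob')"
        )
        return True
    except sqlite3.OperationalError:
        # SQLite rejects the insert because the table has no "twitter" column.
        return False


relational_accepted = insert_with_extra_column()

# Non-relational (illustrative only): records in one collection may differ in shape.
collection = []
collection.append({"id": 1, "name": "Ada", "city": "London"})
collection.append({"id": 2, "name": "Bob", "twitter": "@bob", "tags": ["vip"]})

# Every field name that appears anywhere in the collection.
fields_seen = sorted({field for record in collection for field in record})
```

In this sketch the relational table refuses the record with the unexpected field unless the schema is changed first, while the schema-less collection accepts it as is. That flexibility is the property the rest of this article refers to when it talks about handling new, unstructured data.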
Trends Affecting the Database Market
Two forces are presently at work in the database market: the need for new applications and the need to lower costs. The need to lower costs doesn’t seem like anything new, but the need for new applications is driving the need to lower costs.
What are these new applications and why are they needed? Three factors are driving the accelerated pace of new applications:
- The advent of Web 2.0. Static webpages have become dynamic and social media is all around us: Everyone is tweeting, posting, blogging, vlogging, sharing photos, chatting and commenting.
- The advent of the smartphone. This has spawned new social media and new applications, some of which aren’t even available as traditional websites but only as apps. It’s now common to use your smartphone to book travel, check in, post your status, grab a ride, listen to music, find a coffee shop, upload photos, buy stuff and manage your finances. The list is endless and growing all the time.
- The advent of smart devices. Smart cars, smart homes, smart appliances and more are at the forefront of the rapidly growing network of connected things that collect, process and exchange data—the Internet of Things.
Together, these generate huge amounts of new data, much of which is unstructured and can’t be neatly stored in a tabular relational database. Accordingly, new flexible databases are needed to store, manage and process the new data—NoSQL databases. Community-based open-source developers have, in turn, developed many NoSQL databases.
Companies want to absorb and use the new data to stay ahead in business, and provide features such as product recommendations and a differentiating customer experience. The data can be analyzed in search of patterns for applications such as fraud detection and behavior analytics.
Driven to develop new dynamic applications, companies are looking at their IT budgets and discovering how much is spent on support and maintenance of their traditional relational database systems. And it’s a lot. Estimates vary, but EnterpriseDB says up to 35 percent of software infrastructure spend is on database management systems (bit.ly/2lAzrF6). Switching to lower-cost, open-source software saves money, which is why an estimated 78 percent of enterprises use it, according to a ZDNet article (zd.net/1yxglk5).
In short, new applications spawn an enormous amount of new unstructured data, which needs NoSQL databases to store and process it. This spawns even more new applications for better customer engagement. Open-source NoSQL databases are ideally suited to meet this need because of their flexibility and lower cost.
Types of NoSQL Databases
Because new data needs new databases, it follows that no single option can address the needs of all new data and new applications. You’ll want to do your own research if you’d like the deep technical details behind these, but the main categories are:
- Key-value. Data is stored as a collection of unique keys, each mapped to a value, and the values can differ in structure from one record to the next. The best-known example of this is Redis, which stores data in memory for very fast access. This page shows some use-cases: objectrocket.com/blog/how-to/top-5-redis-use-cases
- Document. This is used to store semi-structured, document-oriented information. The most popular example is MongoDB. This schema-less database allows documents (records) in the same collection (table) to have different fields. This page shows some use-cases: mongodb.com/use-cases
- Wide-column. In some ways, it is similar to key-value, but it allows a large number of dynamic columns. The best-known example is Cassandra, which is designed for clustered deployment across multiple nodes. Cassandra boasts linear scalability and high availability. Wide-column databases are well suited for analyzing huge datasets. This page shows some use-cases: opensourceforu.com/2016/04/the-many-uses-of-apache-cassandra
- Graph. Based on mathematical graph theory, where the emphasis is on the connections (relationships) that link data together, graph databases allow complex queries over millions of connections to be executed quickly. Classic use cases are real-time recommendations, social networks and fraud detection. The best-known example of a graph database is Neo4j. This video provides a great introduction and includes some more use-cases: youtube.com/watch?v=-dCeFEqDkUI
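The four categories above can be sketched with ordinary Python data structures. This is only an illustration of the shape of the data in each model; the keys, names and readings are made up, and none of it reflects how Redis, MongoDB, Cassandra or Neo4j store or query data internally.

```python
from collections import deque

# Key-value: one value per key, looked up by exact key.
key_value = {"session:42": "user=ada;cart=3"}

# Document: nested, semi-structured records; fields may vary per document.
documents = [
    {"_id": 1, "name": "Ada", "address": {"city": "London"}},
    {"_id": 2, "name": "Bob", "interests": ["chess", "jazz"]},
]

# Wide-column: each row key owns its own, possibly different, set of columns.
wide_column = {
    "sensor-1": {"2017-05-15T10:00": 21.5, "2017-05-15T10:01": 21.7},
    "sensor-2": {"2017-05-15T10:00": 19.0},
}

# Graph: nodes plus the relationships between them (here, an adjacency list).
follows = {
    "ada": ["bob", "cy"],
    "bob": ["dee"],
    "cy": ["dee"],
    "dee": [],
}


def reachable_within(graph, start, max_hops):
    """Breadth-first search: nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return sorted(seen - {start})
```

A call such as `reachable_within(follows, "ada", 2)` walks the relationships out to friends of friends, which is the kind of connection-centred query that graph databases are built to answer at much larger scale.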
Redis, MongoDB, Cassandra and Neo4j are all open source, with community editions available to download and use for free. Paid-for enterprise editions, which include support and additional features, are also available. Typically, users start with the community editions but progress to enterprise editions when applications move to production and become mission critical. Even so, enterprise editions are less expensive than traditional enterprise relational databases.
Open-Source Relational Databases
In addition to NoSQL databases, several open-source relational database systems are available. MySQL and PostgreSQL are the best known and many others are variants of them. Here are the main players:
- MySQL. The world’s most popular open-source relational database, MySQL was acquired by Oracle in 2010, and Oracle now charges for support. A free “community” version is still available. MySQL rose as part of the classic LAMP stack (Linux/Apache/MySQL/PHP) and often serves as the back-end database for websites, particularly those based on WordPress. But its use can extend far beyond that. For example, it’s particularly well-suited for online transaction processing applications.
- MariaDB. This fork of MySQL was created the day after Oracle announced it was acquiring Sun Microsystems, which had itself acquired MySQL in 2008. MariaDB is essentially a clone of MySQL and can be dropped in as a replacement. Uncertain about the direction in which Oracle is taking MySQL, many users have already migrated from MySQL to MariaDB.
- PostgreSQL. More robust, more performant and with more features than MySQL, PostgreSQL has a strong reputation for reliability and data integrity. The community edition is free.
- PostgresPURE. Its producer, Splendid Data in Europe, says it’s the world’s only truly 100 percent open-source enterprise-level alternative for Oracle. With community PostgreSQL at its core, Splendid Data has added tools and support to make an enterprise-ready package.
- EnterpriseDB Postgres Advanced Server. This product is based on PostgreSQL, to which the engineers at EnterpriseDB (EDB) have added many features and tools. Most notable are the Oracle compatibility features, which are closed source and enable Oracle developers and database administrators to transition to EDB more easily than to other PostgreSQL variants. EDB charges for these extras and for support.
Like the NoSQL databases, community editions are available to download and use for free. Paid-for enterprise editions are available with support and additional features as noted previously.
Flexibility and Innovation
While the initial attraction of open-source databases might seem to be their low cost, businesses often choose the enterprise editions to get support and additional features. Keep in mind that these are less expensive than traditional relational databases, but they are not necessarily low cost.
The real attraction of open-source databases is not even the fact that they are open source. Businesses choose them because of their flexibility with new data and their power to enable innovation. Select your open-source database based on the best fit for your use-case.
Open-Source Databases on Power
It likely goes without saying that all open-source software, including databases, runs on x86-based Linux systems. But what options are available for companies that have already invested in IBM Power Systems* or those that wish to do so?
MongoDB, EnterpriseDB, Redis, Cassandra, Neo4j and MariaDB are all available on IBM POWER*. And all are much more performant on POWER compared to a similarly configured x86 system.
In fact, IBM guarantees 2x price-performance over x86 for MongoDB and 1.8x for EnterpriseDB, which means that clients choosing IBM POWER can expect better performance and less server sprawl.
Companies choose the IBM POWER processor over x86 for its scalability, reliability, virtualization and lower total cost of ownership, as well as its superior performance. More information about open-source databases on IBM Power Systems is available online (ibm.com/systems/power).
Rick Murphy is a migration consultant in the IBM Systems Lab Services Migration Factory, helping clients migrate to IBM Systems.