This article was contributed by Denis Souza Rosa, Developer Advocacy Manager at Couchbase.
In 1777, the British instrument maker Jesse Ramsden published a paper describing the design of a screw-cutting lathe. The machine represented a major technological breakthrough: producing screws at scale allowed heavy, complex machinery to be built faster during the industrial revolution. Today, Kubernetes and Operators are the screw-cutting lathes for stateful applications. With this combination, any software vendor can provide fully managed services at a reasonable cost.
The most famous examples of stateful applications in tech are databases, which developers expect to work out of the box but which historically have been exactly the opposite. In small and medium-sized companies, the task of maintaining them falls on the shoulders of DevOps engineers, while in large enterprises databases are so critical that a specialized data-management department is common.
There is not much room for failure in this area; data is commonly one of a company's most valuable assets. Because of that, developers and database administrators (DBAs) have always been conservative when picking the next data store for a project, even if it means picking a suboptimal one.
The truth is, they are not wrong. Storing and retrieving data from different sources doesn't pose a challenge for most developers, but the learning curve to manage and size a database properly can be steep. The result is that only big enterprises have the resources to train their teams to produce scalable, cost-effective software, while other companies often prefer to stay on the "safe side" by using relational databases in suboptimal scenarios. In the medium term, this "safe side" behavior can lead to performance and scalability issues that, following the trend, are typically solved at the application level with things like microservices, which add a whole new layer of complexity. Simply staying within the same architecture but switching to a more suitable data store would often address the same problem.
This long introduction aims to state one simple problem: developers understand that specialized databases can be a crucial success factor for their applications, but the upfront investment is sometimes higher than what they can afford. AWS was the first big company to realize that when it launched DynamoDB in 2012.
Back then, launching a database-as-a-service (DBaaS) was something only big players could do. Frequent tasks like version upgrades, recovery of faulty nodes, and data replication, or even something as basic as provisioning a simple database, required infrastructure automation that had to be written from scratch. In most cases, the automation code was tightly coupled to the infrastructure it ran on, which also pushed companies to build their own private clouds to avoid anchoring their strategy and costs to third-party providers.
Borg was one of those in-house solutions, developed by Google, and it would later become the seed of what Kubernetes is today. One of the success factors of Kubernetes was its extensibility: it allows the deployment of applications called "Operators" that react to events triggered in the cluster. This feature enabled enterprise database vendors to build specialized apps that monitor their databases and act accordingly on a state change, which can provide a DBaaS-like experience in virtually any Kubernetes cluster. Couchbase was the first company to release an official operator back in 2017, which made some noise in the Kubernetes/NoSQL world and set off a wave of other companies trying to do something similar.
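The "react to a state change" behavior described above boils down to what Kubernetes calls a reconcile loop: the Operator compares the desired state declared in a custom resource against what is actually running, and acts on the difference. Below is a minimal, framework-free sketch of that pattern in Python; the names (`DesiredSpec`, `ObservedState`, `reconcile`, the action strings) are illustrative inventions, not part of any real Operator SDK, and a production Operator would issue real Kubernetes API calls via a framework such as controller-runtime instead of returning strings.

```python
# Illustrative sketch of an Operator's reconcile loop.
# All names here are hypothetical; a real Operator would talk to the
# Kubernetes API server rather than compare plain dataclasses.
from dataclasses import dataclass


@dataclass
class DesiredSpec:
    replicas: int  # how many database nodes the user declared


@dataclass
class ObservedState:
    replicas: int  # how many nodes are actually running


def reconcile(desired: DesiredSpec, observed: ObservedState) -> list:
    """Compare desired vs. observed state and return the actions needed.

    A real operator would create/delete pods, trigger rebalances, etc.,
    instead of returning action names as strings.
    """
    actions = []
    if observed.replicas < desired.replicas:
        actions += ["create-node"] * (desired.replicas - observed.replicas)
    elif observed.replicas > desired.replicas:
        actions += ["remove-node"] * (observed.replicas - desired.replicas)
    return actions


# A node failure drops the observed replica count; the next reconcile
# pass detects the drift and schedules a replacement.
print(reconcile(DesiredSpec(replicas=3), ObservedState(replicas=2)))
```

The key property is that the loop is level-triggered: it does not matter which event caused the drift (a crashed pod, a scaled-up spec), only that desired and observed state differ.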
Community-driven operators have also become quite popular; databases like PostgreSQL and MySQL have several operators available, including a few actively maintained by large organizations. Developer groups around this topic are starting to pop up everywhere; the DOK Community (Data on Kubernetes) is a clear example.
Despite fast community adoption and the stellar progress made in the last four years, including all major cloud providers launching fully managed Kubernetes services, the main obstacle to adopting this kind of technology remains the steep learning curve of Kubernetes itself.
The future of enterprise databases as a service
Providing fully managed services has become so accessible that it even turned into a business model for some cloud providers. Most of these technologies were open source, so all the providers had to do was add a user-friendly façade on top of them.
This strategy had a heavy impact on the revenue of some vendors, forcing them to change their licenses. MongoDB was the first, moving to the SSPL in 2018, followed by Redis (RSAL) and Elasticsearch (ELv2). Other databases, like MariaDB, took a different path, changing their licenses to the BSL, which converts to another license (often Apache 2) after two to four years on average. There is no right or wrong here, but open source has always been the foundation of software development, and a license that protects a company's intellectual property for a given time and then releases the code to the public while it is still relevant seems like a reasonable approach to me.
The rise of DBaaS, Kubernetes, and Operators should help NoSQL adoption skyrocket in the coming years, as these databases can deliver better performance, lower cost, and higher productivity, this time without the upfront cost of learning how to manage them. Because of that, a database market currently dominated by RDBMSs should become much more diverse. All this activity will benefit the whole developer community and how we build effective software.