In recent years, a consensus has grown among developers that distributed, elastic, highly available databases (think DynamoDB) are best consumed as managed services: worry only about your applications and let the managed service handle the data. But what if you could deploy, upgrade, and reconfigure the world’s highest-performance open-source database on your laptop with just a few simple commands? And what if you wanted to know how we built such a system?
-------------------
In this article we introduce managed RonDB on Docker. In less than one minute, on your laptop, you can install Docker, clone our repository, and start a highly available managed RonDB cluster - then scale it up, kill nodes, and upgrade versions, all without any database downtime. We then dive into the details of how we built managed RonDB using a distributed agent that takes a declarative desired state and decides on cluster reconfigurations.
In the quest to make RonDB easily accessible, we at Hopsworks spent the last year rethinking what RonDB as a managed database should look like. As long as we only dealt with fixed-size clusters, simple event-based protocols controlled by a cloud backend worked well - however, as we gradually migrated to dynamic configurations with both vertical and horizontal scaling, we hit their limits, especially in terms of testability. Testing was particularly important; if our “highly available” database went down due to a scaling reconfiguration, that would not be acceptable.
Managed RonDB consists of a number of services, each of which can be independently reconfigured and resized. These services are: MySQL servers, data nodes, and management servers. All three components can be scaled both in VM size (vertically) and in the number of VMs (horizontally). Furthermore, managed RonDB offers an interface to create backups in AWS/GCP/Azure and to restore a cluster from a backup. Lastly, managed also means offering online software upgrades (with zero database downtime).
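To make this concrete, the sketch below shows roughly what a declarative description of such a cluster could look like. It is only an illustration in Go: the struct and field names are hypothetical and are not the agent’s actual data structures.

```go
package spec

// ClusterSpec is a hypothetical, simplified desired state for a managed
// RonDB cluster. Every service can be scaled vertically (VM size) and
// horizontally (number of VMs).
type ClusterSpec struct {
	RonDBVersion string      // e.g. "21.04.9"; an online upgrade changes this field
	MySQLServers ServiceSpec // SQL API towards applications
	DataNodes    ServiceSpec // where the data lives
	MgmServers   ServiceSpec // management servers holding the cluster configuration
	Backup       BackupSpec  // backups to AWS/GCP/Azure object storage
}

// ServiceSpec captures the two scaling dimensions of one service.
type ServiceSpec struct {
	VMSize string // vertical scaling, e.g. "e2-standard-8"
	Count  int    // horizontal scaling
}

// BackupSpec describes where backups go and whether to restore from one.
type BackupSpec struct {
	CloudBucket     string // e.g. "s3://my-rondb-backups" (hypothetical)
	RestoreBackupID string // non-empty when the cluster should be restored from a backup
}
```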
Since we deploy managed RonDB on AWS/Azure/GCP, a major requirement was that our customer-facing control plane has no access to the VPCs where RonDB is deployed. The only communication allowed is an egress network call from the RonDB cluster inside the customer’s cloud account to the control plane; it carries minimal cluster status updates, and the response can include commands to execute on the cluster. Our architecture for sending control commands from the control plane to a deployed RonDB cluster therefore places a database controller (“agent”) on each VM to execute control commands on the database services, with a leader agent fetching the commands from the backend. This first approach was quick and simple: let the backend create events for the agents and thereby be the main controller of the state. Over time, however, we identified four issues with this:
To tackle these problems, we came up with a new approach. The concept, which we call “reconciliation”, is built around a desired state and has similarities to how Kubernetes operates (it has its roots in Dijkstra’s self-stabilizing distributed systems). Just like Kubernetes, we have a reconciliation loop, which checks the cluster’s internal state and decides on actions to take in order to reach the desired state. Whenever the internal state deviates from the desired state, the loop runs again. The big change for us, however, is that this loop runs on the agent side; the desired state is decided by the client, which is still the control plane. With this declarative approach, the control plane no longer requires knowledge of the set of control commands, nor of RonDB itself. If the control plane suggests a desired state which does not make sense, the agent will typically reject it.
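The sketch below shows the general shape of such a reconciliation loop in Go. The type and function names are hypothetical; how the desired state arrives (via the egress call described above) and how actions are planned are simplified placeholders, not the actual agent code.

```go
package reconcile

import (
	"log"
	"reflect"
	"time"
)

// State is a placeholder for the cluster's internal or desired state
// (services, VM sizes, counts, RonDB version, and so on).
type State struct {
	RonDBVersion string
	DataNodes    int
	MySQLServers int
}

// Action is one step the leader agent decides to take, e.g. "request VM",
// "rolling restart", "start backup". Execution happens via gRPC calls to
// the agent on the affected VM.
type Action interface {
	Execute() error
}

// Reconcile is a hypothetical leader-side loop: observe the internal state,
// fetch the desired state, plan the next actions, execute them, and repeat
// until the two states converge.
func Reconcile(observe func() State, desired func() State, plan func(internal, want State) []Action) {
	for {
		internal, want := observe(), desired()
		if reflect.DeepEqual(internal, want) {
			time.Sleep(10 * time.Second) // nothing to do; poll again later
			continue
		}
		for _, a := range plan(internal, want) {
			if err := a.Execute(); err != nil {
				log.Printf("action failed, will retry next loop: %v", err)
				break // re-observe and re-plan rather than continuing blindly
			}
		}
	}
}
```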
Our agent is thereby now a full state machine. This comes with challenges, but also advantages. Given 42 fields, of which around 20 are mutable, finding problematic state paths is obviously a challenge. However, the biggest advantage is testability. The agent has two responsibilities: one is figuring out the next actions to take given an internal and a desired state, which is the role of the leader agent; the other is a set of gRPC functions, which can execute on any agent, depending on what the leader decides. The agent is written in Go, where it is idiomatic for functions to return an error value, so we were able to build “state transformation” tests as unit tests by creating dummy return values for the gRPC functions with error=nil. This means we can generate and run thousands of random new and consecutive desired states without any side effects and check whether the agent can reach each of them within one reconciliation loop. If it cannot, the test fails.
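A minimal sketch of what such a generative state transformation test could look like is shown below, assuming the gRPC calls are stubbed to always return error=nil. All names are hypothetical and the planning step is reduced to a placeholder; the point is the structure: thousands of random, consecutive desired states, no side effects, and a failure whenever one of them is not reached within one reconciliation loop.

```go
package reconcile_test

import (
	"math/rand"
	"testing"
)

// state is a simplified stand-in for the agent's desired/internal state.
type state struct {
	RonDBVersion string
	DataNodes    int
	MySQLServers int
}

// planAndApply stands in for one reconciliation loop run against stubbed
// gRPC functions; here it simply pretends every planned action succeeds.
func planAndApply(internal, desired state) state {
	return desired
}

// randomState generates one random desired state.
func randomState(r *rand.Rand) state {
	versions := []string{"21.04.8", "21.04.9", "22.10.1"}
	return state{
		RonDBVersion: versions[r.Intn(len(versions))],
		DataNodes:    1 + r.Intn(4),
		MySQLServers: 1 + r.Intn(4),
	}
}

func TestRandomDesiredStates(t *testing.T) {
	r := rand.New(rand.NewSource(42)) // deterministic seed for reproducibility
	internal := randomState(r)
	for i := 0; i < 1000; i++ {
		desired := randomState(r)
		got := planAndApply(internal, desired)
		if got != desired {
			t.Fatalf("iteration %d: could not reach desired state %+v from %+v", i, desired, internal)
		}
		internal = desired // consecutive desired states build on each other
	}
}
```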
Another thing these state transformation tests quickly detect is accidental downtime. Say our RonDB database has a replication factor of 1 and we request an upgrade to a new RonDB version. If the agent does not first increase the replication factor to 2 (by asking for a new VM for a new data node) before performing a rolling upgrade, we will run into downtime. This is easily checked in the generative tests.
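For illustration, an invariant of roughly the following shape could be enforced by the simulated action executor in these tests; the function and its argument are hypothetical and do not correspond to the actual agent code.

```go
package reconcile

import "fmt"

// checkNoDowntime is a hypothetical invariant that the simulated action
// executor in the generative tests could enforce: restarting a data node
// (e.g. during a rolling upgrade) is only allowed if at least one other
// replica in its node group remains live.
func checkNoDowntime(liveReplicasInNodeGroup int) error {
	if liveReplicasInNodeGroup <= 1 {
		return fmt.Errorf("restart would take down the only live replica of its node group")
	}
	return nil
}
```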
Whenever a generative test fails, we serialize it as a JSON file containing the initial internal state and the desired state that the agent failed to reach from it. These JSON files accumulate into our regression test suite. Often, however, we realize that the new desired state should simply not have been accepted given the internal state.
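A sketch of how such a failing case could be persisted, reusing the hypothetical state type from the test sketch above; the file layout is illustrative, not our actual format.

```go
package reconcile_test

import (
	"encoding/json"
	"os"
	"path/filepath"
	"time"
)

// regressionCase is a hypothetical layout for the JSON files that failed
// generative tests are serialized to, so they can be replayed later as
// regression tests.
type regressionCase struct {
	InternalState state `json:"internal_state"`
	DesiredState  state `json:"desired_state"`
}

// saveRegressionCase writes a failing (internal, desired) pair to disk.
func saveRegressionCase(dir string, internal, desired state) error {
	data, err := json.MarshalIndent(regressionCase{internal, desired}, "", "  ")
	if err != nil {
		return err
	}
	name := time.Now().UTC().Format("20060102T150405") + ".json"
	return os.WriteFile(filepath.Join(dir, name), data, 0o644)
}
```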
Since all of the state logic is now part of the agent, we can also use Docker to run integration tests without our production control plane. The control plane simply has to return desired states to the agent and create/remove the VMs that the agent asks for in order to reach them. For this, we have written a Flask web server of around 1,000 lines of Python that does just that, except that instead of spawning VMs, it spawns containers. In the first stage of these integration tests, each database container runs only the agent and all system calls are replaced by dummies. This is a quick check that all the cross-agent gRPC calls work and that we again reach our desired states. In the second stage, we actually install RonDB in the containers and the whole thing becomes real.
The Docker image that we use as a base image for this is hopsworks/rondb-standalone, on top of which we install both the agent and supervisord (controlled via supervisorctl). The latter serves as a drop-in replacement for systemd, so that multiple processes can run in the container. The agent is thereby responsible for starting and stopping any RonDB service in its own container. All of this means that we have a running RonDB cluster in which we can test and showcase any reconfiguration, software up/downgrade, and backup creation without paying a cent for VMs (or waiting for them to spin up).
Since we wanted our customers to be able to convince themselves of the flexibility of managed RonDB, we have now also made the Docker image hopsworks/rondb-managed public and added a docker-compose file to our rondb-docker repository on GitHub. Anybody can thereby spin up a managed RonDB cluster locally within a minute.
Lastly, for those who want to try RonDB by itself (“standalone”) and perhaps even benchmark it locally, we once again recommend the rondb-docker repository on GitHub, which also offers quick spin-ups of non-managed RonDB clusters.
To follow up on this blog post, we have created a quick demo: