Definition of containers: containers are designed to solve the problem of “how to ensure that software runs correctly when it moves between runtime environments”.
Containers and Docker remain among the hottest topics in technology. Containerizing stateless services has become the norm, but it has also raised a question that everyone is debating: should the MySQL database be containerized?
Looking carefully at the various views, we find that supporters argue for containerizing MySQL only from the perspective of container advantages, with few business scenarios to back up their position. Opponents, on the other hand, argue against it on several grounds, such as performance and data security, and also cite business scenarios where it is a poor fit. Here are a few reasons why Docker is not suitable for running MySQL.
Data security issues
Do not store data in containers; this is one of the official Docker container usage tips. A container can be stopped or deleted at any time, and when it is deleted, the data inside it is lost. To avoid data loss, users can store data on mounted data volumes. However, volumes are designed to provide persistent storage that bypasses the Union FS image layers, and they do not by themselves guarantee data safety. If the container crashes suddenly and the database is not shut down properly, the data may be corrupted. In addition, volume groups shared between containers and the host take a heavier toll on the physical machine’s hardware.
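As a rough illustration of the volume mount mentioned above, the sketch below builds a `docker run` command that keeps MySQL’s data directory on a named volume. The container name, volume name, password, and image tag are placeholder assumptions; `/var/lib/mysql` is the data directory used by the official MySQL image. This only shows where the data lives; it does nothing to protect against corruption after an unclean shutdown.

```python
import shlex
from typing import List


def mysql_run_command(name: str, volume: str, root_password: str,
                      image: str = "mysql:8.0") -> List[str]:
    """Build a `docker run` argv that keeps MySQL's data on a named volume."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        # Mount the named volume at the official image's data directory,
        # so the data outlives the container itself.
        "--volume", f"{volume}:/var/lib/mysql",
        "--env", f"MYSQL_ROOT_PASSWORD={root_password}",
        image,
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    print(shlex.join(mysql_run_command("mysql-demo", "mysql-data", "change-me")))
```

Deleting the container leaves the named volume in place, which is the whole point of the mount; deleting the volume still deletes the data.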
Performance Issues
As we all know, MySQL is a relational database with high IO requirements. When one physical machine runs several MySQL instances, their IO adds up, creating an IO bottleneck and greatly reducing MySQL’s read and write performance.
In a special session on the top ten difficulties of Docker applications, an architect at a state-owned bank made the same point: “The performance bottleneck of a database generally appears in IO, and if we follow the Docker approach, the IO requests of multiple Docker containers all end up on the same storage. Nowadays, most databases on the Internet use a share-nothing architecture, which is probably one reason migration to Docker is not being considered.”
In fact, there are several strategies for addressing this problem, as follows.
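To make the accumulation of IO concrete, here is a small back-of-the-envelope sketch. It assumes the disk is shared evenly among the instances and uses made-up numbers (a disk that sustains 20,000 IOPS, instances that each want 6,000 IOPS); real storage behaves less neatly, so treat it as an illustration only.

```python
def per_instance_iops(disk_iops: float, demand_iops: float, instances: int) -> float:
    """IOPS each instance can obtain if the disk is shared evenly (an assumption)."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # An instance never gets more than it asks for, nor more than its even share.
    return min(demand_iops, disk_iops / instances)


if __name__ == "__main__":
    demand = 6_000  # hypothetical per-instance demand
    for n in (1, 2, 4, 8):
        got = per_instance_iops(disk_iops=20_000, demand_iops=demand, instances=n)
        print(f"{n} instance(s): {got:.0f} IOPS each ({got / demand:.0%} of demand)")
```

Under these assumptions, one or two instances get everything they ask for, but from four instances onward total demand exceeds what the disk can deliver and every instance is throttled.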
Database program and data separation
If you use Docker to run MySQL, separate the database program from the data: keep the data on shared storage and the program in the container. If the container or the MySQL service fails, a brand new container is started automatically. In addition, it is best not to store the data on the host itself: the host and the container would share volume groups, so any damage has a relatively large impact on the host.
Run lightweight or distributed databases
Lightweight or distributed databases are a better fit for Docker. Docker’s own recommendation is that when a service fails, a new container should be started automatically rather than repeatedly restarting the service in the failed container.
Lay out applications rationally
For applications or services with high IO requirements, it is more appropriate to deploy the database on a physical machine or in a KVM virtual machine. Currently, Tencent Cloud’s TDSQL and Alibaba’s OceanBase are deployed directly on physical machines rather than in Docker.
Stateful issues
Horizontal scaling in Docker can only be used for stateless compute services, not databases.
Statelessness is a key prerequisite for Docker’s rapid scaling. Anything that carries data state is not suitable for running directly inside Docker, and if a database is installed in Docker, storage must be provided to it as a separate service.
Currently, Tencent Cloud’s TDSQL (a financial-grade distributed database) and Alibaba Cloud’s OceanBase (a distributed database system) both run directly on physical machines rather than on Docker, even though Docker is easier to manage.
Resource Isolation
In terms of resource isolation, Docker is indeed inferior to KVM virtual machines. Docker uses cgroups to enforce resource limits, which can only cap the maximum amount of resources a container consumes; they cannot prevent other applications from taking resources the container needs. If other applications use up the physical machine’s resources, the read and write efficiency of MySQL in the container suffers.
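The cgroup limits described above are exposed as `docker run` flags. The sketch below assembles a few of them (`--memory`, `--cpus`, `--device-read-iops`, `--device-write-iops`); the device path, the numeric values, and the image tag are assumptions chosen for illustration. Note that every flag here is an upper bound on the container, which is exactly the limitation the paragraph points out: none of them reserves capacity against a noisy neighbor.

```python
from typing import List


def limit_flags(memory: str, cpus: float, device: str, iops: int) -> List[str]:
    """Return `docker run` flags that cap memory, CPU, and block-device IOPS."""
    return [
        "--memory", memory,                          # hard memory ceiling
        "--cpus", str(cpus),                         # CPU ceiling
        "--device-read-iops", f"{device}:{iops}",    # read IOPS ceiling
        "--device-write-iops", f"{device}:{iops}",   # write IOPS ceiling
    ]


if __name__ == "__main__":
    # /dev/sda, 4g, 2 CPUs and 1000 IOPS are hypothetical values.
    flags = limit_flags(memory="4g", cpus=2, device="/dev/sda", iops=1000)
    print(" ".join(["docker", "run", "--detach", *flags, "mysql:8.0"]))
```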
The more isolation you need, the more resource overhead you incur. Ease of horizontal scaling is a major advantage of Docker over dedicated environments. However, horizontal scaling in Docker can only be used for stateless compute services, not for databases.
Can’t MySQL run in a container?
This does not mean that MySQL can never be containerized.
- Workloads that are not sensitive to data loss (e.g., product searches by users) can be containerized, using database sharding to increase the number of instances and thus throughput.
- Docker is suitable for running lightweight or distributed databases. When a Docker service goes down, a new container is started automatically instead of repeatedly restarting the service in the failed container.
- Databases that sit behind middleware and a container platform providing automatic scaling, disaster recovery, failover, and multiple nodes can also be containerized.
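The sharding mentioned in the first point can be sketched roughly as follows. This is a minimal hash-modulo router, not a description of any particular middleware; the key format and the instance naming scheme are hypothetical.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable CRC32 hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # zlib.crc32 is deterministic across processes, unlike the built-in hash().
    return zlib.crc32(key.encode("utf-8")) % shard_count


if __name__ == "__main__":
    shards = 4  # hypothetical number of containerized MySQL instances
    for key in ("user:1001", "user:1002", "user:1003"):
        print(f"{key} -> mysql-shard-{shard_for(key, shards)}")
```

Adding instances spreads the keys over more containers, which is where the extra throughput comes from. A plain modulo remaps most keys whenever the shard count changes, which is tolerable only for the loss-insensitive data this point describes.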