DockOne Technology Sharing (6): Sina SCE Docker Best Practices

[Editor's Note] This article shares, mainly from an IaaS perspective, the practices SCE uses to support the containerization demands of the product lines above it. It first explains why we chose to support Docker at all, then walks through each aspect of that support in practice, and finally summarizes the lessons learned and the pits stepped in along the way, as well as the areas that still need deeper work.

It is assumed that tonight's audience has at least some hands-on experience with Docker and is familiar with the related concepts and principles.

DockOne technology sharing sessions have already analyzed several Docker topics in depth, so tonight I will mainly speak from the IaaS perspective, sharing the practices SCE uses to support the containerization demands of the upper product lines.

———-

Why support Docker technology

Why did we do this?

First, a brief introduction to SCE. SCE is the private cloud product promoted by Sina's R&D center and already covers all of the company's internal product lines. It is customized on top of OpenStack and integrated with the company's channel machines and CMDB, providing IaaS services to the entire company. A public cloud version recently entered closed beta.

First, OpenStack and Docker are complementary.

  • OpenStack is aimed at IaaS: resource-centric, packaging the OS; it provides mature resource limiting and isolation, and supports multiple OS families;
  • Docker is aimed at PaaS: service-centric, packaging the service.

At present the IaaS industry mainly provides cloud host services, with mature resource limiting and isolation capabilities, but a cloud host is essentially a packaged OS and cannot satisfy demands such as absorbing access peaks, rapid scale-out, and rapid deployment. Docker's innate "light, fast, good, and cheap" qualities are exactly what makes up for IaaS's shortcomings in this area. Of course, the OpenStack community has also put a lot of effort into supporting Docker better, which will be mentioned later.

Second, in day-to-day SCE operations we found that the product lines' demand for container technology is quite strong:

  • Rapid deployment;
  • Fast start/stop, create/destroy;
  • Consistent development and test environments;
  • Demo and trial environments;
  • Lower equipment costs by making fuller use of resources;
  • Rapid verification of technical solutions;
  • And more...

IaaS's weaknesses plus this demand made us realize that supporting container technology is both necessary for SCE and a natural fit.

Docker support among IaaS vendors

We researched several representative giants and upstarts in the IaaS circle, and the results show that IaaS vendors' support for Docker is still relatively weak.

Only Alibaba Cloud has done somewhat more for Docker users, providing an official Registry. But it offers no cloud hosts with newer Docker support; there is only one very old third-party image with almost no usability.

UnitedStack and QingCloud only offer CoreOS. In practice, user acceptance of CoreOS is very low. SCE also tried offering CoreOS, but it differs too much from the CentOS systems used throughout the company, and no product line was willing to use it or migrate onto it.

———-

Docker support practices

Based on the needs and research above, SCE has focused its practice on the Registry, the Hub, Docker support in virtual machine images, log storage and retrieval, networking, and storage drivers, committed to letting product line users adopt Docker more conveniently and efficiently, and to promoting Docker use within the company.

Registry + Hub solution

Registry back-end storage solutions have been shared many times already; most use dev (local storage) or S3. SCE naturally wanted to use Sina's own S3, so our first solution was the Docker Registry sinastorage driver + Sina S3. Its reliability goes without saying, but depending on a storage driver made problem tracing, debugging, and maintenance troublesome, and every update required building a new image.

Since we already provide reliable cloud disks, why not use our own service? We decisively replaced the solution we had put up with for ages with localstorage + SCE cloud disk. Life is much more comfortable without depending on the driver, and we also get advanced cloud disk features such as snapshots and resizing for free.

So, for friends choosing a Registry storage backend, some suggestions for reference (a runnable sketch follows the list):

  • For scenes with low reliability requirements on image storage, just use dev (local storage) directly;
  • For scenes with higher reliability requirements, if you run on an IaaS, the localstorage + cloud disk solution is strongly recommended;
  • For scenes with higher reliability requirements, if you do not run on an IaaS, you can go a bit fancy and use S3;
  • For scenes with higher reliability requirements, if you do not run on an IaaS, do not want to spend money, and want to use your own storage, then you can only write your own driver. I won't tell you how serious the problems there can be.
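As an illustration of the localstorage + cloud disk suggestion, here is a minimal sketch using the old Python docker-registry image; the mount point /mnt/cloud-vdisk is a hypothetical path standing in for an attached SCE cloud disk, not our actual layout.

```bash
# Assume an SCE cloud disk is attached and mounted at /mnt/cloud-vdisk (hypothetical path).
# The registry keeps its layers on the cloud disk via a plain volume mount,
# so no custom storage driver is needed.
docker run -d -p 5000:5000 \
    -e SETTINGS_FLAVOR=dev \
    -e STORAGE_PATH=/registry \
    -v /mnt/cloud-vdisk/registry:/registry \
    registry
```

Snapshots and resizing then come from the cloud disk itself rather than from the registry code.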

To give product lines convenient image browsing and retrieval services, SCE and the Weibo platform team jointly launched the SCE Docker Hub, developed on top of docker-registry-frontend. It is wired into SCE's existing services and supports viewing, retrieving, and managing repos, tags, detailed information, and Dockerfiles.

To make it easier for product lines to use official Docker images, our automatic synchronization tool, built on a mirror registry, periodically syncs official Docker images to SCE's distributed back-end storage cluster, so product line users can quickly pull official images over the internal network.

Since SCE cannot guarantee the security of the Docker Hub's official images, users are advised to use SCE's official images or build their own base images.

For product lines that need a private Registry, SCE also provides a corresponding one-click deployment tool.

CentOS 7 + Docker image

For Docker support, SCE has mainly done the following work:
1) integrated the Docker 1.5 and Docker-Compose 1.2 environment;
2) provided CLI helpers such as docker-ip, docker-pid, and docker-enter to simplify usage (a sketch of what such wrappers look like follows this list);
3) cooperated with DIP to support rsyslog-kafka, solving the log monitoring and search problem;
4) cooperated with the Weibo platform team on the muti-if-adapter tool, solving port conflicts when instances of the same service share a host;
5) and more...
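Helpers like docker-ip, docker-pid, and docker-enter are commonly thin wrappers around docker inspect and nsenter; the following is a minimal sketch of how such wrappers are typically written, not SCE's actual implementation.

```bash
#!/bin/bash
# docker-ip: print a container's IP address
docker-ip() {
    docker inspect -f '{{.NetworkSettings.IPAddress}}' "$1"
}

# docker-pid: print the host PID of a container's init process
docker-pid() {
    docker inspect -f '{{.State.Pid}}' "$1"
}

# docker-enter: join a container's namespaces via nsenter
docker-enter() {
    nsenter --target "$(docker-pid "$1")" --mount --uts --ipc --net --pid
}
```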

Use Docker on SCE

With the above work in place, using Docker on SCE becomes very convenient.
[Image: SCE上使用docker.png (using Docker on SCE)]

Log solutions

At present, SCE mainly supports three log solutions:

  • App writes log files inside the container;
  • App writes to stdout/stderr;
  • App + agent ships logs off the host.

The first two suit low-demand scenes such as development and testing.
The third suits real business scenarios and production environments, where there are strong demands for log persistence, retrieval, monitoring, and alerting.

Docker 1.6's syslog log driver is not yet ideal in availability or ease of use, but it is worth watching.

App + rsyslog-kafka solution

We implement the third log solution above through an ELK stack.

Architecture diagram

[Image: elk.png (ELK architecture)]

Log flow
App >>> container rsyslog-kafka >>> kafka >>> logstash >>> elasticsearch >>> kibana

Business flow

  1. The product line applies for access to DIP's real-time log analysis service;
  2. DIP approves the application:
    1. Config_web dynamically expands the logstash cluster through the Docker Swarm API;
    2. The user is given the access data needed, such as the Kafka broker and topic;
  3. The product line creates containers based on the access data:
    1. docker run -d -e KAFKA_ADDR=... -e KAFKA_TOPIC=... -e LOG_FILE=... -v $(pwd)/kafka_config.sh:${SOME_DIR}/kafka_config.sh ...
    2. Following the SCE log access specification, the container's run.sh calls the log configuration tool provided by SCE, docker/tools/rsyslog_config.sh;
    3. rsyslog_config.sh configures rsyslog automatically; the access process and its details are transparent to the product line (a sketch of the generated config follows this list).
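For illustration, here is a minimal sketch of the kind of rsyslog configuration such a tool might generate, assuming rsyslog 8.x with the omkafka and imfile modules; the broker, topic, and file path come from the environment variables shown above, and the output path and template are hypothetical, not SCE's actual script.

```bash
# Hypothetical fragment of what rsyslog_config.sh might write out.
cat > /etc/rsyslog.d/kafka.conf <<EOF
module(load="omkafka")      # Kafka output plugin
module(load="imfile")       # tails the app's log file

# Tail the log file named in LOG_FILE
input(type="imfile" File="${LOG_FILE}" Tag="app:")

# Forward every tailed line to the assigned Kafka topic
action(type="omkafka" broker=["${KAFKA_ADDR}"] topic="${KAFKA_TOPIC}")
EOF
```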

Network modes

Most product lines currently use bridge or host mode. Both have some problems, but they can still meet the networking needs of small and medium-sized clusters (see the examples below).
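For reference, this is how the two modes are selected on the command line; the image name and ports are placeholders.

```bash
# Bridge mode (the default): the container gets its own network stack,
# and ports are published through NAT on the host.
docker run -d -p 8080:80 my-web-image

# Host mode: the container shares the host's network stack directly,
# avoiding NAT overhead but exposing it to port conflicts on the host.
docker run -d --net=host my-web-image
```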

Once the scale grows, however, neither mode applies. For a large-scale networking solution we will follow up later; the main plan is to investigate ovs, weave, Flannel, and similar options.

Besides bridge, host, and null, the libnetwork drivers also include remote, to support a distributed bridge; more drivers are planned to solve the current networking problems, and this is worth following.

In addition, for meaningful product line needs, such as the Weibo platform's "port conflicts between instances of the same service on the same host", SCE actively explores solutions together with the product line.

Storage driver selection

This part covers some of the considerations behind our initial storage driver selection:

  • AUFS. The filesystem Docker used initially; it has never been merged into the mainline kernel, so compatibility is poor and only Ubuntu supports it out of the box. Users elsewhere must compile it themselves, which is troublesome;
  • Btrfs. Data is not written in place but journaled first, so under a continuous write stream performance can be cut in half;
  • OverlayFS. A newer union filesystem, but its kernel version requirement is steep (kernel 3.18+);
  • Devicemapper. The default driver, arguably the best general-purpose choice at present. SCE uses this driver.

Some devicemapper practices and pits will be mentioned later.

Cluster management

At present, SCE mainly recommends three cluster management tools: Shipyard, Swarm, and Kubernetes.

Shipyard

  • Supports cross-host container cluster management
  • Lightweight, low learning cost
  • Supports simple resource scheduling
  • Supports GUI chart display
  • Supports horizontal scaling of instances

Swarm

  • Docker's officially promoted cluster management solution
  • Relatively lightweight, low learning cost
  • Supports multiple discovery backends
  • Rich resource scheduling
  • REST API fully compatible with the Docker API
  • Still has some pits
  • Currently the solution the product lines accept, and use, the most (a minimal setup sketch follows this list)
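For orientation, a minimal classic Swarm cluster (the standalone swarm container of this era) using token discovery could be brought up like this; the addresses are placeholders, and each node's Docker daemon is assumed to listen on tcp port 2375.

```bash
# Create a cluster token via the hosted discovery service
TOKEN=$(docker run --rm swarm create)

# On each node, join the cluster
docker run -d swarm join --addr=<node_ip>:2375 token://$TOKEN

# Start the manager, then talk to the whole cluster through it
docker run -d -p 3375:2375 swarm manage token://$TOKEN
docker -H tcp://<manager_ip>:3375 ps
```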

Kubernetes

  • An open source implementation of Google's Borg/Omega
  • Iterates very quickly; the architecture and implementation are relatively complex, so the learning and migration costs are higher
  • Resource scheduling
  • Scaling
  • Automatic failure recovery
  • Multi-instance load balancing
  • Good support for OpenStack
  • We are following it up

Each of the three has its own advantages and disadvantages; which tool to use depends on the specific business needs. It is not the case that because Kubernetes is powerful it must be the right choice.

Judging from current product line usage and feedback, Swarm is still the easiest to accept.

Integration with OpenStack

Next, since we are an IaaS, we have to talk about integration with OpenStack. How to integrate with OpenStack well and give full play to the advantages of both has always been a focus of ours.

At present there are three main solutions:

  • Nova + docker driver
  • Heat + docker driver
  • Magnum

Both the nova driver and heat driver solutions have shortcomings. The nova driver treats a container as a VM, which sacrifices all of Docker's advanced features and is clearly unacceptable; the heat driver has no resource scheduling and requires specifying the host at creation time, which obviously only fits small or micro scale.

At the beginning of this year, the OpenStack community started pushing a new CaaS project, Magnum. Through integration with Heat, it supports Docker's advanced features; it can integrate Swarm, Gantt, or Mesos for Docker cluster resource scheduling (the current plan is to use Swarm for scheduling); and Magnum will also integrate deeply with Kubernetes.

Magnum has identified the pain points of the previous two solutions and is tackling the problem in the right way; it is well worth following.

In addition, at the Vancouver OpenStack Summit, the OpenStack Foundation said it would actively promote the Magnum sub-project to technically achieve deep integration between containers and OpenStack.

———-

Practical experience & pits stepped in

The lessons most product lines have run into, together with SCE's own Docker practice, are documented on SCE's official Docker wiki. Below are some practical experiences and stepped-on pits extracted from that wiki; presumably you have already practiced or stepped on more or less the same ones. If you have not hit these problems yet, I hope our summary can help you.

Image production

Build images with a Dockerfile, so every image is documented;
In a Dockerfile, do not wrap values in quotation marks, because docker treats the quotation marks as part of the value;
Minimize image size, design images in layers, and reuse layers as much as possible (a small example follows).
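A minimal Dockerfile sketch illustrating these points; the package name is a placeholder.

```dockerfile
FROM centos:centos7

# Per the tip above, ENV LANG "en_US.UTF-8" would store the quotes as part
# of the value in this form; write the value bare instead.
ENV LANG en_US.UTF-8

# Combine related steps into one layer and clean caches to keep the image
# small; a stable base layer like this can be reused by downstream images.
RUN yum install -y some-package && yum clean all
```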

Running containers

One process per container, which makes service monitoring and resource isolation easier;
Using the latest tag is not recommended;
For containers that provide services, develop the habit of adding --restart (an example follows).
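For instance (the image name is a placeholder):

```bash
# Pin a concrete tag rather than latest, and let the daemon restart the
# service container automatically if it exits.
docker run -d --restart=always my-service-image:1.0
```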

Data storage

Use volume mounts that do not depend on host directories, for easy sharing and migration (a sketch follows);
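One common way to do this in the Docker 1.5 era was the data-only container pattern; the names below are placeholders.

```bash
# A data-only container owns an anonymous volume at /data,
# with no dependence on any particular host directory.
docker create -v /data --name app-data centos:centos7 /bin/true

# Service containers share the volume and can be re-created or moved freely.
docker run -d --volumes-from app-data my-service-image:1.0
```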

Resource constraints

Cgroups allow overcommit: when resources are free, a process may use more than its limit.
Cgroups enforce limits only when necessary (when resources are scarce), for example when several processes contend for the CPU at the same time.
CPU share enforcement happens only when multiple processes run on the same core. In other words, even if two containers on the same machine have different CPU share limits, if you bind one container to core1 and the other to core2, each can make full use of its own core (see the example below).
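To make that concrete, a sketch using flags from the Docker 1.5/1.6 era (`--cpuset` was later renamed `--cpuset-cpus`; image names are placeholders):

```bash
# Shares only matter under contention on shared cores:
docker run -d -c 512  my-image   # relative CPU share 512
docker run -d -c 1024 my-image   # relative CPU share 1024

# Pinned to different cores, each container can saturate its own core
# regardless of the share values above.
docker run -d --cpuset=0 -c 512  my-image
docker run -d --cpuset=1 -c 1024 my-image
```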

Resource isolation

User namespaces achieve user isolation by mapping container uids/gids to node uids/gids;
In other words, on the node you can only see a container process's uid; the uname shown is whatever that uid maps to on the node, which matches the container's only if the same uid has the same uname in both places;
Put differently, the owner uname you see on the node for a container process is not necessarily the uname of the process inside the container, but the uid is always the uid of the process running in the container.

Swarm & Compose usage

Note that Swarm is not yet mature; the binpack strategy has implementation problems that can make the finally scheduled node different from what you expected.
Note that Compose is not yet mature either; you may find a container starts fine on its own but fails when started through Compose. For example, deploying containers with tight startup-order dependencies is likely to fail (see the example below).
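A sketch of that failure mode in the Compose 1.x file format; service and image names are placeholders. links only controls start order, not readiness, so web may come up before db can accept connections.

```yaml
# docker-compose.yml (v1 format, as used with Compose 1.2)
web:
  image: my-web-image:1.0
  links:
    - db        # starts db first, but does NOT wait for it to be ready
db:
  image: mysql:5.6
```

The usual workaround is to retry the connection inside the app or its entrypoint.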

Container side

Mind the default sizes of the dm pool and of container storage: containers default to 10G and the pool to 100G, so you may need to grow dm.basesize and dm.loopdatasize on demand (see the sketch below);
Mind that entering a container with nsenter does not show the env variables configured for the container; to see them, use docker exec or docker inspect;
Mind the three-level inheritance among the docker daemon's own ulimits, the daemon's --default-ulimit, and docker run --ulimit.
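A sketch of both tips, with placeholder sizes and container name; note the devicemapper options only take effect for a fresh daemon data directory.

```bash
# Grow the devicemapper base and loop data sizes at daemon start
# (Docker 1.5-era invocation; values are placeholders).
docker -d --storage-driver=devicemapper \
    --storage-opt dm.basesize=20G \
    --storage-opt dm.loopdatasize=500G

# Read a container's env without nsenter:
docker exec <container> env
docker inspect -f '{{.Config.Env}}' <container>
```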

Due to space constraints, I will not list more here.

———-

Follow-up plan

Below are the points SCE will continue to plow deeply in the future:

  • Provide CI & CD solutions
  • Explore large-scale cluster network solutions
  • Continue following up on large-scale cluster management solutions
  • Study OpenStack and Docker integration solutions in depth

———-

Q & A

Q: How do you achieve cross-host deployment?
A: Shipyard or Swarm can do it, and of course there are other options. Shipyard has a good web UI, supports multi-host container deployment and tag-based scheduling, and can scale instances. Swarm's scheduling strategy is more flexible: you can specify filters and weighters for scheduling and deployment.

Q: Can Compose do cross-host orchestration?
A: No. It currently only supports single-host orchestration.

Q: How is container monitoring implemented?
A: cAdvisor does the container monitoring. Monitoring still needs more thought and work; for example, we are considering combining it with EK (Elasticsearch + Kibana) to provide a standalone Docker cluster monitoring solution.

Q: How can a container be resized?
A: There is no good solution at present; our recommended practice is to delete the container and start a bigger one.

Q: How do you choose a data storage solution that keeps data safe and is easy to scale and migrate?
A:

  • Where data safety must be guaranteed, the localstorage + cloud disk solution is recommended;
  • Where easy scaling and migration matter, writing data into the image or using volume mounts is recommended;
  • Where high performance is needed and your containers run on physical machines, local volume + host dev is recommended.

No solution is optimal for everything; the specific choice depends on the specific usage scenario.

Q: In the SCE solution, at which layer do Docker containers run?
A: The stack is roughly IaaS >>> CaaS; SCE Docker containers run at the CaaS layer.

===========================

Thanks to our partner Capital Online for supporting this group.

The above content is based on the WeChat group sharing of June 2, 2015. The speaker, Zhao Haichuan, is a Sina SCE engineer mainly responsible for virtualization and Docker-related work, committed to promoting the use of Docker within the company. Weibo: wangzi19870227. DockOne organizes a focused technology sharing session every week; interested readers are welcome to join.
