レナート   PID EINS!   ﻟﻴﻨﺎﺭﺕ

Tue, 10 Apr 2012

Control Groups vs. Control Groups

TL;DR: systemd does not require the performance-sensitive bits of Linux control groups enabled in the kernel. However, it does require some non-performance-sensitive bits of the control group logic.

In some areas of the community there's still some confusion about Linux control groups and their performance impact, and what precisely it is that systemd requires of them. In the hope to clear this up a bit, I'd like to point out a few things:

Control Groups are two things: (A) a way to hierarchally group and label processes, and (B) a way to then apply resource limits to these groups. systemd only requires the former (A), and not the latter (B). That means you can compile your kernel without any control group resource controllers (B) and systemd will work perfectly on it. However, if you in addition disable the grouping feature entirely (A) then systemd will loudly complain at boot and proceed only reluctantly with a big warning and in a limited functionality mode.

At compile time, the grouping/labelling feature in the kernel is enabled by CONFIG_CGROUPS=y, the individual controllers by CONFIG_CGROUP_FREEZER=y, CONFIG_CGROUP_DEVICE=y, CONFIG_CGROUP_CPUACCT=y, CONFIG_CGROUP_MEM_RES_CTLR=y, CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y, CONFIG_CGROUP_MEM_RES_CTLR_KMEM=y, CONFIG_CGROUP_PERF=y, CONFIG_CGROUP_SCHED=y, CONFIG_BLK_CGROUP=y, CONFIG_NET_CLS_CGROUP=y, CONFIG_NETPRIO_CGROUP=y. And since (as mentioned) we only need the former (A), not the latter (B) you may disable all of the latter options while enabling CONFIG_CGROUPS=y, if you want to run systemd on your system.

What about the performance impact of these options? Well, every bit of code comes at some price, so none of these options come entirely for free. However, the grouping feature (A) alters the general logic very little, it just sticks hierarchial labels on processes, and its impact is minimal since that is usually not in any hot path of the OS. This is different for the various controllers (B) which have a much bigger impact since they influence the resource management of the OS and are full of hot paths. This means that the kernel feature that systemd mandatorily requires (A) has a minimal effect on system performance, but the actually performance-sensitive features of control groups (B) are entirely optional.

On boot, systemd will mount all controller hierarchies it finds enabled in the kernel to individual directories below /sys/fs/cgroup/. This is the official place where kernel controllers are mounted to these days. The /sys/fs/cgroup/ mount point in the kernel was created precisely for this purpose. Since the control group controllers are a shared facility that might be used by a number of different subsystems a few projects have agreed on a set of rules in order to avoid that the various bits of code step on each other's toes when using these directories.

systemd will also maintain its own, private, controller-less, named control group hierarchy which is mounted to /sys/fs/cgroup/systemd/. This hierarchy is private property of systemd, and other software should not try to interfere with it. This hierarchy is how systemd makes use of the naming and grouping feature of control groups (A) without actually requiring any kernel controller enabled for that.

Now, you might notice that by default systemd does create per-service cgroups in the "cpu" controller if it finds it enabled in the kernel. This is entirely optional, however. We chose to make use of it by default to even out CPU usage between system services. Example: On a traditional web server machine Apache might end up having 100 CGI worker processes around, while MySQL only has 5 processes running. Without the use of the "cpu" controller this means that Apache all together ends up having 20x more CPU available than MySQL since the kernel tries to provide every process with the same amount of CPU time. On the other hand, if we add these two services to the "cpu" controller in individual groups by default, Apache and MySQL get the same amount of CPU, which we think is a good default.

Note that if the CPU controller is not enabled in the kernel systemd will not attempt to make use of the "cpu" hierarchy as described above. Also, even if it is enabled in the kernel it is trivial to tell systemd not to make use of it: Simply edit /etc/systemd/system.conf and set DefaultControllers= to the empty string.

Let's discuss a few frequently heard complaints regarding systemd's use of control groups:

I hope this explanation was useful for a reader or two! Thank you for your time!

posted at: 19:09 | path: /projects | permanent link to this entry | comments


It should be obvious but in case it isn't: the opinions reflected here are my own. They are not the views of my employer, or Ronald McDonald, or anyone else.

Please note that I take the liberty to delete any comments posted here that I deem inappropriate, off-topic, or insulting. And I excercise this liberty quite agressively. So yes, if you comment here, I might censor you. If you don't want to be censored you are welcome to comment on your own blog instead.


Lennart Poettering <mzoybt (at) 0pointer (dot) net>
Syndicated on Planet GNOME, Planet Fedora, planet.freedesktop.org, Planet Debian Upstream. feed RSS 0.91, RSS 2.0
Archives: 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013

Valid XHTML 1.0 Strict!   Valid CSS!