Cfengine 3.1.2 was released on December 9th, 2010. The theme of the release is major efficiency improvements, but some bug fixes are also included. Looking in the ChangeLog file in the root of the tarball, we find the following goodies:
- the overhead of cf-promises is almost gone
- potential for much less network load when checking for policy updates
- the ps command runs less frequently
- faster access to classes
Below, we will have a look at how to take advantage of these improvements.
Caching policy checks
The biggest improvement in version 3.1.2 comes from caching the outcome of cf-promises runs. The main task of cf-promises is to verify that the syntax of the policy is correct (though it also does more advanced analysis). The other Cfengine components that read the policy (e.g. cf-agent and cf-serverd) require verification from cf-promises before they even touch the policy themselves. With the default schedule, cf-agent runs every five minutes. This means cf-promises is run at least every five minutes as well.
The amount of time it takes to run cf-promises depends on a number of factors, but perhaps most importantly on the size and complexity of the checked policy. My workstation, using a fairly small policy, seems to spend a bit less than one second on running cf-promises:
root@host:# time cf-promises
Using an enterprise-size policy, this check will probably take multiple seconds to complete. With twelve runs per hour under the default schedule for cf-agent, this means about 30 seconds of server time per hour per server. Now, if we assume 1000 servers in the organisation, we see that a total of more than 8 hours of precious server time is lost to this simple check every hour. Even though Cfengine is the most efficient configuration management solution on the planet, there seems to be room for improvement here.
How often does the configuration management policy of an organisation change? Hopefully not every day. Perhaps every week, or month? Assuming once a week, it means that cf-promises is run against the same configuration 12*24*7 = 2016 times in a row! How about reducing this to once?
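The back-of-the-envelope numbers above can be reproduced in a few lines of Python. The 2.5 seconds per run is an assumption, picked so that it matches the "about 30 seconds per hour" estimate; the schedule and the 1000 servers come from the text.

```python
# Rough cost of running cf-promises on every scheduled cf-agent run.
RUNS_PER_HOUR = 60 // 5      # default schedule: one run every five minutes
SECONDS_PER_RUN = 2.5        # assumed value for "multiple seconds" on a large policy
SERVERS = 1000               # fleet size assumed in the text

seconds_per_server_hour = RUNS_PER_HOUR * SECONDS_PER_RUN
fleet_hours_per_hour = seconds_per_server_hour * SERVERS / 3600
runs_per_week = RUNS_PER_HOUR * 24 * 7

print(seconds_per_server_hour)          # 30.0
print(round(fleet_hours_per_hour, 2))   # 8.33
print(runs_per_week)                    # 2016
```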
This is exactly what Cfengine 3.1.2 brings, simply by touching the file WORKDIR/masterfiles/cf_promises_validated. This file is created by cf-agent or any other Cfengine component after it has successfully verified the policy with cf-promises. Also, before running cf-promises, the components check if any file either
- included by body common control inputs
is newer than the file cf_promises_validated (based on modification time). If not, the run of cf-promises is skipped, and 8 hours of server time is reclaimed every hour!
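The idea of a modification-time stamp can be sketched in Python as follows. This is only an illustration of the technique, not Cfengine's own code (which is written in C); the function names needs_validation and mark_validated are made up for the example.

```python
import os

def needs_validation(stamp_path, policy_files):
    """Return True if any policy file is newer than the validation stamp."""
    try:
        stamp_mtime = os.path.getmtime(stamp_path)
    except OSError:
        return True  # no stamp yet, so the policy was never validated
    return any(os.path.getmtime(path) > stamp_mtime for path in policy_files)

def mark_validated(stamp_path):
    """Create the stamp if needed and set its modification time to now."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None)
```

A component would call needs_validation first, run the expensive check only when it returns True, and call mark_validated after a successful check.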
Reducing network load
The performance improvement from caching the outcome of cf-promises runs is applied automatically by Cfengine, without any need for a policy change. However, in the most common Cfengine architecture, there is still much to be gained from a tiny modification to the policy. We assume the hosts pull policy updates from the WORKDIR/masterfiles directory on distribution server(s), as seen in the picture (shamelessly copied from the Cfengine community site).
The trick is to try to copy the new cf_promises_validated file (based on modification time) before trying to copy anything else from the remote masterfiles directory. Note that the file resides in masterfiles on the distribution server(s), so the access policy should need only minor modifications, if any. The end hosts use if-repaired classes to check whether the file was copied, and only copy the rest of masterfiles if so. The important thing to note here is that the distribution servers must also run the policy, i.e. copy from their WORKDIR/masterfiles to WORKDIR/inputs locally and run cf-agent regularly, so that cf_promises_validated is updated whenever anything changes in WORKDIR/masterfiles and the end hosts are signalled of the change. This is, however, the normal way to do it. The following policy snippet demonstrates the more efficient copy.
# *) the class "am_distribution_server" is set on distribution servers
# *) the string variable "distribution_server" contains the address of the distribution server

files:
  !am_distribution_server::
    "/var/cfengine/inputs/cf_promises_validated"
      comment => "Check whether a validation stamp is available for a new policy update",
      copy_from => u_remote_cp("/var/cfengine/masterfiles/cf_promises_validated","$(distribution_server)"),
      classes => u_if_repaired("validated_updates_ready");

  # distribution server should always put masterfiles in inputs in order to check new policy
  am_distribution_server|validated_updates_ready::
    "/var/cfengine/inputs"
      comment => "Copy policy updates from master source on distribution server if a new validation was acquired",
      copy_from => u_remote_cp("/var/cfengine/masterfiles","$(distribution_server)"),
      depth_search => u_recurse("inf");
The bodies used are found in the Cfengine Open-Promise Body Library, but they are prefixed by "u_" to make the update policy (usually part of failsafe.cf) self-contained.
Using this method, we reduce the (possibly recursive) copy of the masterfiles directory, containing tens or hundreds of files, to a check of a single file whenever nothing has changed. Reclaim those bytes!
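To make the control flow of the two-stage copy concrete, here is a small local simulation in Python. It is a sketch under simplifying assumptions: both directories are on the same machine, "newer" means a later modification time, and the helper names copy_if_newer and update_policy are invented for the example. It is not how Cfengine implements copy_from.

```python
import os
import shutil

def copy_if_newer(src, dst):
    """Copy src to dst when dst is missing or older; return True if copied."""
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also carries over the modification time
    return True

def update_policy(masterfiles, inputs, stamp="cf_promises_validated"):
    """Fetch the stamp first; walk the whole tree only if the stamp changed."""
    stamp_src = os.path.join(masterfiles, stamp)
    stamp_dst = os.path.join(inputs, stamp)
    if not copy_if_newer(stamp_src, stamp_dst):
        return 0  # nothing new: one cheap check instead of a recursive copy
    copied = 0
    for root, _dirs, files in os.walk(masterfiles):
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(inputs, os.path.relpath(src, masterfiles))
            copied += copy_if_newer(src, dst)
    return copied
```

Note that a changed policy file is not picked up until the stamp itself is refreshed, which mirrors the requirement that the distribution server keeps cf_promises_validated up to date.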
As you might know, Cfengine relies on the ps command in order to acquire information about the running processes. Process information interfaces are not portable across Unices, so the ps command was deemed the only viable interface. Process information is required, for example, when making promises to start a process if it is not running, like the following.
restart_class => "start_apache";
restart_class => "start_mysql";
Previously, these processes promises would result in two executions of ps. However, running commands has significant overhead, and should thus be done only when strictly necessary. Cfengine 3.1.2 caches the output of a ps execution and reuses it within the same bundle, so ps will only run once in Cfengine 3.1.2 using the above snippet. The reason for not using a global cache for the whole run of cf-agent is to try to avoid staleness, as process information may change quite frequently.
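The per-bundle cache can be illustrated with a short Python sketch. The class ProcessTable, the fake_ps stand-in and the bundle names are all hypothetical; the point is only that the listing is refreshed when the bundle changes and reused otherwise.

```python
class ProcessTable:
    """Keeps one process listing per bundle and refreshes it on bundle change."""

    def __init__(self, run_ps):
        self._run_ps = run_ps   # callable returning a list of ps output lines
        self._bundle = None
        self._lines = None

    def lines(self, bundle):
        if bundle != self._bundle:  # new bundle: refresh to limit staleness
            self._lines = self._run_ps()
            self._bundle = bundle
        return self._lines

    def is_running(self, bundle, name):
        return any(name in line for line in self.lines(bundle))

calls = []

def fake_ps():
    """Stand-in for executing ps; records each invocation."""
    calls.append(1)
    return ["root 101 /usr/sbin/apache2", "mysql 202 /usr/sbin/mysqld"]

table = ProcessTable(fake_ps)
table.is_running("webserver", "apache")
table.is_running("webserver", "mysql")
print(len(calls))  # 1: both checks in the same bundle share one listing
table.is_running("database", "mysql")
print(len(calls))  # 2: a new bundle triggers a fresh listing
```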
Another internal optimisation is to index class access through a simple hash table. The hash function simply picks the first character of the class name (so "aix" hashes to 0, "linux" hashes to 11, etc.). Because classes seem to be distributed quite well alphabetically, this should be quite efficient. There is a trade-off between the computational burden of the hash function and the quality of the probability distribution it produces. This hash function is very fast, and should have a decent distribution on classes. As a flat list of classes was used previously, the optimisation should have a significant effect on all access to classes, including class evaluation, definition, and reporting.
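A first-character hash of this kind might look as follows in Python. This is a sketch, not Cfengine's C implementation: the catch-all bucket for names that do not start with a letter, and the names class_bucket and ClassTable, are assumptions made for the example. Class names are assumed to be non-empty.

```python
import string

def class_bucket(name):
    """Bucket index from the first character: 'a' -> 0, 'b' -> 1, and so on."""
    first = name[0].lower()
    if first in string.ascii_lowercase:
        return ord(first) - ord("a")
    return 26  # assumed catch-all bucket for digits, underscores, etc.

class ClassTable:
    """Class names spread over 27 short lists instead of one flat list."""

    def __init__(self):
        self.buckets = [[] for _ in range(27)]

    def define(self, name):
        bucket = self.buckets[class_bucket(name)]
        if name not in bucket:
            bucket.append(name)

    def is_defined(self, name):
        # Only the one bucket matching the first character is searched.
        return name in self.buckets[class_bucket(name)]

classes = ClassTable()
for name in ("aix", "linux", "start_apache"):
    classes.define(name)

print(class_bucket("aix"), class_bucket("linux"))                   # 0 11
print(classes.is_defined("linux"), classes.is_defined("solaris"))   # True False
```

A lookup then scans only the names sharing a first character, rather than every defined class.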
As usual, Cfengine 3.1.2 is provided not only as a source tarball, but also prepackaged for the most popular Linux distributions; the packages are available by logging into the Engine Room (free registration required). Users of the following distributions enjoy free packages, in both 32- and 64-bit versions.
- CentOS 5
- Debian 5 and 6
- Fedora 14
- Red Hat Enterprise Linux 3, 4, 5 and 6
- Suse 9, 10 and 11
- Ubuntu 8, 9 and 10
Note that most distributions also maintain a Cfengine 3 package, but this is usually older and may not be built in a uniform way.
Please leave a comment if you found this useful, or have suggestions for improvements.