Posted on Do 25 Oktober 2012

systemd for Developers III

Here's the third episode of of my systemd for Developers series.

Logging to the Journal

In a recent blog story intended for administrators I shed some light on how to use the journalctl(1) tool to browse and search the systemd journal. In this blog story for developers I want to explain a little how to get log data into the systemd Journal in the first place.

The good thing is that getting log data into the Journal is not particularly hard, since there's a good chance the Journal already collects it anyway and writes it to disk. The journal collects:

  1. All data logged via libc syslog()
  2. The data from the kernel logged with printk()
  3. Everything written to STDOUT/STDERR of any system service

This covers pretty much all of the traditional log output of a Linux system, including messages from the kernel initialization phase, the initial RAM disk, the early boot logic, and the main system runtime.

syslog()

Let's have a quick look how syslog() is used again. Let's write a journal message using this call:

#include <syslog.h>

int main(int argc, char *argv[]) {
        syslog(LOG_NOTICE, "Hello World");
        return 0;
}

This is C code, of course. Many higher level languages provide APIs that allow writing local syslog messages. Regardless which language you choose, all data written like this ends up in the Journal.

Let's have a look how this looks after it has been written into the journal (this is the JSON output journalctl -o json-pretty generates):

{
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "_TRANSPORT" : "syslog",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "SYSLOG_FACILITY" : "1",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "_PID" : "3068",
        "SYSLOG_IDENTIFIER" : "test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351126905014938"
}

This nicely shows how the Journal implicitly augmented our little log message with various meta data fields which describe in more detail the context our message was generated from. For an explanation of the various fields, please refer to systemd.journal-fields(7)

printf()

If you are writing code that is run as a systemd service, generating journal messages is even easier:

#include <stdio.h>

int main(int argc, char *argv[]) {
        printf("Hello World\n");
        return 0;
}

Yupp, that's easy, indeed.

The printed string in this example is logged at a default log priority of LOG_INFO[1]. Sometimes it is useful to change the log priority for such a printed string. When systemd parses STDOUT/STDERR of a service it will look for priority values enclosed in < > at the beginning of each line[2], following the scheme used by the kernel's printk() which in turn took inspiration from the BSD syslog network serialization of messages. We can make use of this systemd feature like this:

#include <stdio.h>

#define PREFIX_NOTICE "<5>"

int main(int argc, char *argv[]) {
        printf(PREFIX_NOTICE "Hello World\n");
        return 0;
}

Nice! Logging with nothing but printf() but we still get log priorities!

This scheme works with any programming language, including, of course, shell:

#!/bin/bash

echo "<5>Hellow world"

Native Messages

Now, what I explained above is not particularly exciting: the take-away is pretty much only that things end up in the journal if they are output using the traditional message printing APIs. Yaaawn!

Let's make this more interesting, let's look at what the Journal provides as native APIs for logging, and let's see what its benefits are. Let's translate our little example into the 1:1 counterpart using the Journal's logging API sd_journal_print(3):

#include <systemd/sd-journal.h>

int main(int argc, char *argv[]) {
        sd_journal_print(LOG_NOTICE, "Hello World");
        return 0;
}

This doesn't look much more interesting than the two examples above, right? After compiling this with `pkg-config --cflags --libs libsystemd-journal` appended to the compiler parameters, let's have a closer look at the JSON representation of the journal entry this generates:

 {
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World",
        "CODE_LINE" : "4",
        "_PID" : "3516",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351128226954170"
}

This looks pretty much the same, right? Almost! I highlighted three new lines compared to the earlier output. Yes, you guessed it, by using sd_journal_print() meta information about the generating source code location is implicitly appended to each message[3], which is helpful for a developer to identify the source of a problem.

The primary reason for using the Journal's native logging APIs is a not just the source code location however: it is to allow passing additional structured log messages from the program into the journal. This additional log data may the be used to search the journal for, is available for consumption for other programs, and might help the administrator to track down issues beyond what is expressed in the human readable message text. Here's and example how to do that with sd_journal_send():

#include <systemd/sd-journal.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
        sd_journal_send("MESSAGE=Hello World!",
                        "MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555",
                        "PRIORITY=5",
                        "HOME=%s", getenv("HOME"),
                        "TERM=%s", getenv("TERM"),
                        "PAGE_SIZE=%li", sysconf(_SC_PAGESIZE),
                        "N_CPUS=%li", sysconf(_SC_NPROCESSORS_ONLN),
                        NULL);
        return 0;
}

This will write a log message to the journal much like the earlier examples. However, this times a few additional, structured fields are attached:

{
        "__CURSOR" : "s=ac9e9c423355411d87bf0ba1a9b424e8;i=5930;b=5335e9cf5d954633bb99aefc0ec38c25;m=16544f875b;t=4ccd863cdc4f0;x=896defe53cc1a96a",
        "__REALTIME_TIMESTAMP" : "1351129666274544",
        "__MONOTONIC_TIMESTAMP" : "95903778651",
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_PID" : "4049",
        "CODE_LINE" : "6",
        "MESSAGE_ID" : "52fb62f99e2c49d89cfbf9d6de5e3555",
        "HOME" : "/home/lennart",
        "TERM" : "xterm-256color",
        "PAGE_SIZE" : "4096",
        "N_CPUS" : "4",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351129666241467"
}

Awesome! Our simple example worked! The five meta data fields we attached to our message appeared in the journal. We used sd_journal_send() for this which works much like sd_journal_print() but takes a NULL terminated list of format strings each followed by its arguments. The format strings must include the field name and a '=' before the values.

Our little structured message included seven fields. The first three we passed are well-known fields:

  1. MESSAGE= is the actual human readable message part of the structured message.
  2. PRIORITY= is the numeric message priority value as known from BSD syslog formatted as an integer string.
  3. MESSAGE_ID= is a 128bit ID that identifies our specific message call, formatted as hexadecimal string. We randomly generated this string with journalctl --new-id128. This can be used by applications to track down all occasions of this specific message. The 128bit can be a UUID, but this is not a requirement or enforced.

Applications may relatively freely define additional fields as they see fit (we defined four pretty arbitrary ones in our example). A complete list of the currently well-known fields is available in systemd.journal-fields(7).

Let's see how the message ID helps us finding this message and all its occasions in the journal:

$ journalctl MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555
-- Logs begin at Thu, 2012-10-18 04:07:03 CEST, end at Thu, 2012-10-25 04:48:21 CEST. --
Oct 25 03:47:46 epsilon test-journal-se[4049]: Hello World!
Oct 25 04:40:36 epsilon test-journal-se[4480]: Hello World!

Seems I already invoked this example tool twice!

Many messages systemd itself generates have message IDs. This is useful for example, to find all occasions where a program dumped core (journalctl MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1), or when a user logged in (journalctl MESSAGE_ID=8d45620c1a4348dbb17410da57c60c66). If your application generates a message that might be interesting to recognize in the journal stream later on, we recommend attaching such a message ID to it. You can easily allocate a new one for your message with journalctl --new-id128.

This example shows how we can use the Journal's native APIs to generate structured, recognizable messages. You can do much more than this with the C API. For example, you may store binary data in journal fields as well, which is useful to attach coredumps or hard disk SMART states to events where this applies. In order to make this blog story not longer than it already is we'll not go into detail about how to do this, an I ask you to check out sd_journal_send(3) for further information on this.

Python

The examples above focus on C. Structured logging to the Journal is also available from other languages. Along with systemd itself we ship bindings for Python. Here's an example how to use this:

from systemd import journal
journal.send('Hello world')
journal.send('Hello, again, world', FIELD2='Greetings!', FIELD3='Guten tag')

Other binding exist for Node.js, PHP, Lua.

Portability

Generating structured data is a very useful feature for services to make their logs more accessible both for administrators and other programs. In addition to the implicit structure the Journal adds to all logged messages it is highly beneficial if the various components of our stack also provide explicit structure in their messages, coming from within the processes themselves.

Porting an existing program to the Journal's logging APIs comes with one pitfall though: the Journal is Linux-only. If non-Linux portability matters for your project it's a good idea to provide an alternative log output, and make it selectable at compile-time.

Regardless which way to log you choose, in all cases we'll forward the message to a classic syslog daemon running side-by-side with the Journal, if there is one. However, much of the structured meta data of the message is not forwarded since the classic syslog protocol simply has no generally accepted way to encode this and we shouldn't attempt to serialize meta data into classic syslog messages which might turn /var/log/messages into an unreadable dump of machine data. Anyway, to summarize this: regardless if you log with syslog(), printf(), sd_journal_print() or sd_journal_send(), the message will be stored and indexed by the journal and it will also be forwarded to classic syslog.

And that's it for today. In a follow-up episode we'll focus on retrieving messages from the Journal using the C API, possibly filtering for a specific subset of messages. Later on, I hope to give a real-life example how to port an existing service to the Journal's logging APIs. Stay tuned!

Footnotes

[1] This can be changed with the SyslogLevel= service setting. See systemd.exec(5) for details.

[2] Interpretation of the < > prefixes of logged lines may be disabled with the SyslogLevelPrefix= service setting. See systemd.exec(5) for details.

[3] Appending the code location to the log messages can be turned off at compile time by defining -DSD_JOURNAL_SUPPRESS_LOCATION.

© Lennart Poettering. Built using Pelican. Theme by Giulio Fidente on github. .