Overview
--------

DLog is meant to be a performant, system-wide logging solution. This means
that it has to do more than just harvest logs, which makes it a bit more
complex than what would suffice for a single program. Various users of DLog
also have their own unique needs, and while they are free to fork the code,
we try to accommodate their needs in the central repo for corporate reasons.
This document is meant to serve as a basic introduction to DLog and to shed
some light on the more "wtf"-inducing of its elements.

Components
----------

Module      │ Description
────────────┼────────────────────────────────────────────────────────────────
libdlog     │ The client-side API, responsible for a priori filtering and
            │ sending the log to the appropriate backend-dependent log sink.
────────────┼────────────────────────────────────────────────────────────────
dlogutil    │ The developer tool for printing logs collected by the framework
            │ and a handful of other administrative tasks.
────────────┼────────────────────────────────────────────────────────────────
dlog_logger │ Under the pipe backend (see below), this daemon is responsible
            │ for storing logs received from libdlog. Under all backends, it
            │ also periodically copies data from log storage to disk files as
            │ defined in its configuration.
────────────┼────────────────────────────────────────────────────────────────
dlogctl     │ A developer tool to control some aspects of DLog at run time.
            │ In particular, it changes a priori filters and disables buffers.
────────────┼────────────────────────────────────────────────────────────────
dlogsend    │ A simple command-line utility that sends logs. 'Nuff said.
────────────┼────────────────────────────────────────────────────────────────
libdlogutil │ A developer API to programmatically receive logs.
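For a quick taste of the client side, this is roughly what logging through
libdlog looks like (a minimal sketch; dlog_print() and the DLOG_* priorities
are the public dlog.h API, and the tag is just an example):

    #include <dlog.h>

    #define MY_TAG "EXAMPLE_TAG"  /* illustrative tag */

    int main(void)
    {
            /* libdlog applies a priori filtering and hands the entry to
             * whichever backend-dependent log sink is active. */
            dlog_print(DLOG_INFO, MY_TAG, "answer: %d", 42);
            return 0;
    }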
Backends
--------

DLog started out as an Android feature and was just a wrapper around the
/dev/log_* kernel devices. There were 4 of those (to keep platform, radio,
app and other logs separate), also introduced just for Android (officially,
Linux never had this feature). At some point the kernel guys decided to
throw those devices out (which finally happened in v5.0, after they spent
some time in staging with nobody reworking them) and suggested creating a
solution in userland instead. This led us to create the pipe backend. Due to
strict client performance requirements, we've shuffled a lot of code around,
including moving processing to the server. This was done in the hope that
the legacy Android devices would end up being dropped in production images,
which never happened. In the end, this resulted in two backends with little
in common regarding how they handle stuff under the hood, but sharing
idiosyncrasies (like the 4 buffers).

Another backend is the zero-copy backend, developed for stable products,
where reading logs only really happens at a service depot if something
breaks. This approach emphasises performance of log writing at all costs,
sacrificing various needless abstractions (like the 4 buffers) and reading
performance. It is experimental and fresh, so probably not recommended for
general use yet.

Here's a comparison:

               │ Android Logger Backend        │ Pipe Backend                  │ Zero-copy Backend
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Log storage    │ Four kernel devices, one for  │ dlog_logger daemon stores all │ One kernel device representing
               │ each core buffer.             │ core logs in its memory. Size │ a giant chunk of memory, split
               │                               │ of the buffers is controlled  │ into smaller blocks that are
               │                               │ through a configuration file. │ leased to programs for direct
               │                               │ They are cyclical as well.    │ writing. Size is set at
               │                               │                               │ compile time.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Mem footprint  │ Medium, usually on the order  │ Similar to Android Logger,    │ Large, on the order of 32 MB.
               │ of 2 MB. Entirely filled with │ with the size being set at    │ Chunks are filled individually
               │ log entries, so no waste. The │ boot time via text config.    │ in parallel so many can be
               │ size of each buffer is set at │                               │ partially empty (overhead due
               │ kernel module compile time.   │                               │ to waste). The size is set at
               │                               │                               │ kernel module compile time.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Cycling        │ Cyclical, so all logs are in  │ Cyclical.                     │ The whole block is recycled at
semantics      │ order and removal happens by  │ Same as Android Logger.       │ once, so it is possible for a
               │ individual log age.           │                               │ log to be recycled even if the
               │                               │                               │ previous log wasn't. Blocks
               │                               │                               │ are recycled in roughly cyclic
               │                               │                               │ order though.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Log interface  │ Devices can be opened, read   │ For each buffer, the daemon   │ Clients can use an ioctl to
               │ and written to like regular   │ exposes twin sockets: one to  │ obtain memory chunks which can
               │ files.                        │ rule the buffer and one to    │ be mmap()'d. Then they write
               │                               │ bind clients to the buffer.   │ or read mapped memory directly.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Timestamps     │ The device adds a timestamp   │ Both REALTIME and MONOTONIC   │ Clients procure a timestamp
               │ when it RECEIVES a log, using │ timestamps are added; both by │ using the MONOTONIC clock.
               │ the REALTIME clock.           │ the library on SENDING a log  │ The difference between SENDING
               │                               │ and the daemon on RECEIVING,  │ and RECEIVING isn't meaningful
               │                               │ for a total of 4 timestamps.  │ in this context; nominally it
               │                               │                               │ is a SENDING timestamp though.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Sorting        │ Logs are implicitly sorted by │ Logs are sorted in the daemon │ Logs are sorted by dlogutil.
               │ the order they arrived in the │ according to one of the four  │ The memory chunks are assigned
               │ buffer. dlogutil then simply  │ timestamps, as specified in   │ in an arbitrary fashion, but
               │ merges the buffers since they │ the config. dlogutil allows   │ since the size is known before
               │ are already sorted.           │ specifying a different one,   │ the action, a bucket sort can
               │                               │ in which case it sorts again; │ be used.
               │                               │ otherwise it merges the       │
               │                               │ already-sorted logs.          │
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Monitoring     │ Possible and cheap,           │ Possible and reasonably cheap │ Possible, but requires fairly
live logs      │ with no loss of logs.         │ with no loss of logs.         │ expensive polling of the whole
               │                               │                               │ memory area. Risks some logs
               │                               │                               │ slipping between the cracks.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
QoS            │ Available.                    │ Available.                    │ Not available, since it would
               │                               │                               │ require major overheads and
               │                               │                               │ the problem it solves (the
               │                               │                               │ overhead of logs) is already
               │                               │                               │ mitigated by zero-copy itself.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Performance    │ High, direct communication    │ Medium, but still the best    │ Extreme for writing.
               │ with the kernel.              │ available in userspace.       │ Mediocre for reading.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Availability   │ Only with kernel support.     │ Practically every *nix.       │ Only with kernel support.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Reliability    │ High, the kernel ensures that │ Medium, the daemon can die    │ Remains to be seen, but there
               │ nothing goes wrong.           │ and pipes can get clogged.    │ are sacrifices in the name of
               │                               │                               │ performance, so it's unlikely
               │                               │                               │ to be very robust, and it has
               │                               │                               │ not yet seen much live testing.
───────────────┼───────────────────────────────┼───────────────────────────────┼───────────────────────────────
Maintenance    │ Hard, involves tinkering with │ Easy, everything is in        │ Hard, Adeptus Mechanicus tier.
               │ kernel code and the internal  │ userspace and the internal    │
               │ format is fairly rigid.       │ format is forward-compatible. │

Backend support on the platform is autodetected by checking whether the
relevant devices are present, but it can be specified explicitly via config
if need be. The autodetection is not only for simplicity; in Tizen IoT
images, the kernel and modules are different for each target, but the root
filesystem is shared. The DLog config is part of the root filesystem, and so
is shared between all the targets. This means that we can't supply a static
config, because it would be used on targets with varying support for
kernel-based backends.
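In code, the autodetection amounts to little more than this kind of check (a
sketch; the device path here is an assumption, used only for illustration):

    #include <unistd.h>

    /* Sketch: consider a kernel-based backend usable if its device node
     * exists. "/dev/log_main" is an illustrative Android Logger path. */
    static int android_logger_available(void)
    {
            return access("/dev/log_main", F_OK) == 0;
    }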
dlog_logger interface
---------------------

As mentioned in the table above, when the pipe backend is being used, DLog's
daemon, dlog_logger, uses two sockets per buffer in order to interact with
other processes. One of them, the control socket, is used to perform
administrative tasks, such as clearing the buffer and fetching usage and
capacity, but also fetching the actual logs (to be returned by
(lib)dlogutil). To request any of these, the client just writes a simple
control message.

The other socket is used to write logs. In order to do so, the process which
wants to log writes a control message, and in response receives a pipe
connected to the daemon (hence the name: pipe backend), to which it can
finally write the logs. The reason for such a convoluted process is that
pipes are more performant than sockets. Therefore, in a long-running process
(which is the case that we optimize for), it is better to suffer some
overhead during connection than to use sockets, which are simpler but would
result in higher overhead for every message sent.

We use two sockets instead of one because they require different
permissions: almost everything on Tizen can log, but we only want
administrators to perform administrative tasks. Additionally, we eschew DBus
in favor of a simple custom solution, because DLog is supposed to be a very
low-level, low-dependency, highest-possible-performance solution.
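The handshake on the write socket boils down to receiving a file descriptor
from the daemon. Here is a hedged sketch of the client side, assuming the
descriptor comes back as SCM_RIGHTS ancillary data (the standard way to pass
an fd over a Unix socket; the actual wire protocol is not spelled out here):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Sketch: receive the pipe fd that the daemon sends in reply to a
     * "give me a pipe" control message on an AF_UNIX socket. */
    static int receive_pipe_fd(int sock)
    {
            char byte;  /* dummy data byte accompanying the descriptor */
            char ctrl[CMSG_SPACE(sizeof(int))];
            struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
            struct msghdr msg = {
                    .msg_iov = &iov, .msg_iovlen = 1,
                    .msg_control = ctrl, .msg_controllen = sizeof ctrl,
            };

            if (recvmsg(sock, &msg, 0) < 0)
                    return -1;

            struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
            if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
                cmsg->cmsg_type != SCM_RIGHTS)
                    return -1;

            int fd;
            memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
            return fd;  /* from now on, logging is a plain write(fd, ...) */
    }

After this one-time setup, every log entry costs only a write() on the pipe,
which is where the steady-state performance comes from.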
When the Android Logger or the zero-copy backend is being used, applications
can interact with the devices directly. Therefore, the sockets aren't used
and aren't created. In some cases, this might mean that the daemon is not
needed at all; it detects this situation and quits automatically. However,
there are other features that require the daemon. These include log
persistence, KMSG and Syslog support, and QoS. These features are explained
below, and when they are enabled, the daemon won't quit even if a
kernel-based backend is being used.

KMSG and Syslog
---------------

At some point, the idea to make DLog the logging overmind of the system was
floated around: to make it responsible (at least to an extent) for the
various other places that collect logs. Ultimately, nothing came of it
(syslog is handled by the systemd journal), but the code remains and
complicates things; see below. However, we're reluctant to simplify, because
unified logging could be genuinely good to have and we anticipate being
asked to get back to this at some point in the foreseeable future.

Those two buffers are handled using the pipe backend even if the logger
backend is being used (and sockets are created for them).

Timestamps and sorting
----------------------

The original kernel devices just slapped a timestamp on a log when they got
it and called it a day. The pipe-backend daemon initially replicated this,
but at some point people noticed that if the system is very busy, the
timestamp they gather locally and the one added by the daemon can differ a
bit, and requested that we use an "accurate" timestamp, i.e. add it
library-side. With multiple programs, it is possible for the daemon to
receive logs in a different order than the one in which their timestamps
were added, which then led to complaints about logs being printed seemingly
out of order, confusing developers. Sorting was added, which stumbled upon a
bunch of issues like "what if somebody changes the clock, especially
backwards" or "what if a client forked the kernel to change the device clock
type to be fundamentally incomparable". Ultimately, we had to collect all
possible timestamps, since they could potentially have to be compared
against each other (see below: KMSG, Syslog and Android devices all use
different timestamps; only pipe-based logs are flexible).

Here's what timestamp types are available from different log sources:

- KMSG gets a sent-monotonic timestamp when the log is sent, and the daemon
  adds both receive timestamps.
- Syslog gets a sent-realtime timestamp when the log is sent, and the daemon
  adds both receive timestamps.
- When the Android Logger backend is enabled, all its buffers have only the
  sent-realtime timestamp available.
- When the zero-copy backend is enabled, all its buffers have only the
  sent-monotonic timestamp available.
- When the pipe backend is enabled, all its buffers have all four timestamps
  available.

Note that if the stdout redirector is used, receive timestamps refer to the
moment when the last character (a newline or a zero byte) is received. Also,
in the case of the stdout redirector, sent timestamps are just copies of the
receive timestamps and can be wildly inaccurate -- the only reason we add
them is that they are available in the normal logs and we want to be able to
depend on their existence.

Having four different timestamps inflicts quad damage upon our ability to
compare such logs to each other. We are forced to copy both KMSG and Syslog
logs into the daemon just so that they have some sort of receive timestamp.
It would be great to simplify this, but I expect it would take some sneaky
gaslighting to implement without anybody raising a big fuss about it, and it
would still have to involve explaining to random people why their timestamp
is slightly off.
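For reference, the two send-side timestamps under the pipe backend
correspond to the two POSIX clocks, roughly like this (a sketch, not the
actual libdlog code):

    #include <time.h>

    struct send_timestamps {
            struct timespec realtime;   /* wall clock; jumps if changed */
            struct timespec monotonic;  /* steady clock; immune to changes */
    };

    /* Procure both SENDING timestamps; the daemon later adds the two
     * RECEIVING ones, for the total of four discussed above. */
    static void stamp_log(struct send_timestamps *ts)
    {
            clock_gettime(CLOCK_REALTIME, &ts->realtime);
            clock_gettime(CLOCK_MONOTONIC, &ts->monotonic);
    }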
Filtering and limiting
----------------------

A posteriori filtering, as done by dlogutil and by dlog_logger file logging,
is fairly obvious: people are usually only interested in a subset of logs.
This filtering can be done by a handful of parameters, such as pid, tag and
priority.

A priori filtering is done by the library. It exists to reduce the
responsibility a client program has over the amount of logs it produces.
Everybody wants to produce as many logs as possible, but if everybody does
that, then the system can grind to a halt (at least on Tizen, where
everything is too broken to go without logs, so people throw absolutely
disgusting amounts of them, and the hardware is not stellar either, so it
struggles a lot). Nipping as many logs in the bud as we can means that
everybody can spam as many logs as they want relatively cheaply, since they
all get thrown directly in the trash anyway (there's a metaphor for
something here).

This type of filtering can be based either on a static config loaded just
once at startup, or on a mode where the config can be changed dynamically at
runtime. The latter incurs a runtime cost for logging, as we could not do it
"properly" based on events without changing the API, so it can be disabled
if not wanted. Dynamic filtering is controlled through dlogctl.

Unlike a posteriori filtering, a priori filtering doesn't have to mean
dropping affected logs completely. Instead of denying the log completely in
configuration, the user can limit logs of each tag to a number per minute.
This is, however, a limit per process; for example, if we limit the tag
GENERIC_TAG to 10 logs per minute, and we have 10 processes spamming with
this tag, we will get 100 logs every minute. This is not an intentional
limitation, but it's hard to think of a better solution in the library, and
implementing it server-side would require either forking the kernel module
(which is not something we are going to do) or dropping the Android Logger
backend.

A priori filtering can also be done by pid, and again the user can either
deny the log completely or set a limit per time unit. However, currently, a
single filter can't use both pid and other (tag/priority) settings.
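To make the per-process nature of the limit concrete, here is the kind of
bookkeeping a library-side limiter has to do (an illustrative sketch; the
names and data layout are made up, not the actual libdlog internals):

    #include <string.h>
    #include <time.h>

    /* Illustrative per-tag limiter: allow at most 'limit' logs per minute.
     * This state lives inside each process, which is exactly why ten
     * processes sharing a tag collectively pass ten times the limit. */
    struct tag_limit {
            char tag[32];
            int limit;            /* allowed logs per minute */
            int used;             /* logs passed in the current window */
            time_t window_start;  /* start of the current minute */
    };

    static int should_pass(struct tag_limit *tl, const char *tag)
    {
            if (strcmp(tl->tag, tag) != 0)
                    return 1;  /* this entry doesn't limit the tag */

            time_t now = time(NULL);
            if (now - tl->window_start >= 60) {
                    tl->window_start = now;  /* new minute: reset count */
                    tl->used = 0;
            }
            return tl->used++ < tl->limit;
    }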
QoS
---

This is a simple mechanism for controlling the total amount of logs in the
system. From the user perspective, it amounts to choosing the total allowed
amount of logs; afterwards, logs will be dropped in order to stay under the
limit. It just works™.

As usual, it's not so simple from the technical standpoint. The main issue
is, again, the Android Logger backend, under which logging programs talk
directly to the kernel and not to any userspace service. Our solution is
located in the dlog_logger server, but works in a way that doesn't require a
connection to DLog clients. Specifically, the daemon counts the logs, and
when the limit is surpassed, it makes a list of logging processes and then
makes an attempt (using one of the heuristics) to give each one a limit per
time period. This limit is sent to the clients using dynamic filtering (see
above).

The above is not even implemented for the zero-copy backend, though the same
set of limitations would apply since it's a kernel module. On the other
hand, it would ruin performance due to the overhead of syscalls (plus
there's no need to limit logs if they are fast enough not to hog the entire
CPU), so we are probably going to leave it unsupported.

All of this means that the QoS limit is not a hard limit, but rather a
suggestion that the logger tries its best to follow; it will be broken from
time to time, and just as often fewer logs will be kept than the limit would
allow. A better solution would, again, either live in kernel space or be
pipe backend only.

libdlogutil
-----------

This is a work-in-progress library to receive logs from a working system. It
is meant to be used in, for example, log analysis daemons. A developer
wanting to use this library first calls the configuration function, then the
connect function, and can then get the logs one by one (with an optional
timeout). The library uses much of the code that used to be in dlogutil, and
dlogutil has been rewritten to use the library. This means that libdlogutil
should work exactly the same way as dlogutil, but with a different log
sink -- to the program and not to the user. Since developers are often
familiar with the command-line utility, they should have a decent idea about
what the various API functions do, and they've got dlogutil itself as the
ultimate reference.

Beware! The API is, so far, internal. This means that while we will try not
to introduce breaking changes, we are not bound not to do so, and anybody
using the library does so on their own responsibility.

Testing
-------

The unit tests are located in the src/tests directory. Each file is its own
binary and is built against the DLog source code directly (as you can see in
Makefile.am). We do not use any test framework; instead, we fail through
assert.h assertions or by manually returning a failure exit code. These
tests are run automatically during compilation.

Sometimes we'd like to check the nitty-gritty syscall details; we use the
linker's wrapping mechanism for that, which allows replacing an existing
function with a new one (which can call the original function in its body).
The flag is set in Makefile.am, and using this in .c test files is rather
intuitive (see existing files for comparison). Note, however, that this
mechanism is incredibly fragile (for example: what if the compiler inlines
the function?), so don't be surprised if it decides to break randomly.
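For the unfamiliar: this is GNU ld's --wrap option. A minimal sketch of how
a test can hook a function with it (the choice of write() is just an
example, not necessarily one our tests wrap):

    /* Build with: -Wl,--wrap=write
     * The linker then redirects calls to write() into __wrap_write(),
     * while __real_write() refers to the original implementation. */
    #include <assert.h>
    #include <unistd.h>

    static int write_calls;

    ssize_t __real_write(int fd, const void *buf, size_t count);

    ssize_t __wrap_write(int fd, const void *buf, size_t count)
    {
            ++write_calls;                        /* observe the call */
            return __real_write(fd, buf, count);  /* defer to the real one */
    }

    int main(void)
    {
            assert(write(1, "x\n", 2) == 2);
            assert(write_calls == 1);  /* the wrapper saw the call */
            return 0;
    }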
Also note that some of these tests don't make much sense (and actually were
added mostly for coverage/KPI reasons), in that they don't really check the
result of the function, but rather the concrete implementation (usually this
means the order of syscalls/glibc functions). This means that you sadly
shouldn't be surprised when, say, changing malloc to calloc, or some
similarly totally correct thing, breaks the tests.

The integration tests are located in the tests directory. The main test file
is dlog_test.in, which is a big POSIX shell script that runs all tests in
order. It mostly runs the executables exported by the main packages, like
dlogutil and dlogsend, but it also uses some purpose-built tools that are
also located in the tests directory, which, however, only use the DLog
public API. These tests are located in a separate package, dlog-tests, and
can be run manually; for details, see the message printed during the
installation of this package. They are also highly stateful, and for now
they don't really care about fully cleaning up their own state to a degree
higher than required for them to work. These tests also don't count toward
coverage statistics (for technical reasons). You can also find some
benchmarks in this directory, but they have been mostly written ad hoc and
are not really supposed to be run right now.

The TL;DR on how to test DLog 100% is as follows:

1. Compile the package. If the unit tests fail, you will get a log file in
   the build directory in the GBS root. If they succeed, you won't get a log
   file, so no need to worry.
2. Install all the packages, including dlog-tests. Make sure to reboot your
   target afterwards (you will get a message about this while installing the
   package).
3. Run `dlog_test pipe` and `dlog_test logger`. There are other options (see
   the script for details), but it is enough to do this.
4. Uninstall the test package and reboot afterwards.

You can test DLog manually now if you feel like it. See if the target boots,
call dlogutil to see if there are any logs, etc.

As you might have seen, both unit and integration tests could use some love.
Unfortunately, we have never really had any time to make improvements here
(except of course writing new tests, which we do try to do, especially if
there is a regression). We'd like to avoid adding dependencies just for
tests, but it would be great to improve these one day. In the meantime, both
unit and integration tests can catch (and have caught in the past) some real
bugs.