Under the hood
==============
-Architecture
-------------
+faultd is implemented in C, using many kernel-style idioms.
+To learn more about them, please check the Linux kernel documentation.
+Because faultd runs as root, it has to be reliable.
+That's why:
+
+.. WARNING::
+   **faultd is meant to be single threaded.
+   The internal faultd API does not support concurrent access at all.
+   So if for any reason (including D-Bus communication using glib) you run an additional thread, you have to ensure that faultd functions are called only in the context of the main thread.**
+
+Modules
+-------
+
+faultd has been designed to be extensible and modular.
+That's why the very first and basic abstraction that is defined inside faultd is *struct faultd_module*.
+Generally every piece of faultd code, apart from utilities like logs and helpers, is encapsulated in the module abstraction.
+
+For every module, name, type and init() should be defined.
+Name is just a string which identifies the module.
+Type is an enum value that determines in which order modules should be initialized.
+Generally, the bigger the number is, the later the module is initialized.
+Instead of a bare integer, a value of *enum faultd_module_type* should be used.
+The module life cycle starts with init() and ends with cleanup().
+It's mandatory for every module to have an init() set, while cleanup() is optional.
+To register the module, the *FAULTD_MODULE_REGISTER()* macro should be used.
+
+As use of the bare module API is not always enough, it's possible to build additional abstraction layers on top of the module API.
+To do so, *struct faultd_module* should be embedded in some bigger structure that provides this abstraction layer.
+To access the container structure in init() or cleanup(), the *container_of()* macro should be used.
+An example of a simple module is presented below.
+
+.. code:: C
+
+    struct my_module {
+        struct faultd_module mod;
+        char *my_data;
+    };
+
+    #define to_my_module(MOD) \
+        container_of(MOD, struct my_module, mod)
+
+    static int module_init(struct faultd_module *module,
+                           struct faultd_config *config,
+                           sd_event *event)
+    {
+        struct my_module *mmod = to_my_module(module);
+
+        /* Alloc your resources like memory, open fds etc. */
+        mmod->my_data = calloc(10, sizeof(char));
+        if (!mmod->my_data)
+            return -ENOMEM;
+
+        return 0;
+    }
+
+    static void module_cleanup(struct faultd_module *module)
+    {
+        struct my_module *mmod = to_my_module(module);
+
+        /* Release everything that was claimed in module_init() */
+        free(mmod->my_data);
+    }
+
+    struct my_module my_mod = {
+        .mod = {
+            .name = "my_module",
+            .type = FAULTD_MODULE_TYPE_LISTENER,
+            .init = module_init,
+            .cleanup = module_cleanup,
+            .node = LIST_HEAD_INIT(my_mod.mod.node),
+        },
+        .my_data = NULL,
+    };
+
+    FAULTD_MODULE_REGISTER(&my_mod.mod);
+
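+The container_of() pattern used above can be tried outside of faultd with a minimal, self-contained sketch.
+The macro definition, struct base_module, struct wrapper_module and example_wrapper are stand-ins written for this illustration only; they are not taken from the faultd sources.
+
```c
#include <stddef.h>

/* Stand-in for the kernel-style macro; faultd ships its own definition. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified counterpart of struct faultd_module. */
struct base_module {
	const char *name;
	int (*init)(struct base_module *module);
};

/* Hypothetical wrapper that embeds the base structure (not as first member). */
struct wrapper_module {
	int init_count;
	struct base_module mod;
};

/* init() only receives the embedded member and recovers the wrapper from it. */
static int wrapper_init(struct base_module *module)
{
	struct wrapper_module *wmod =
		container_of(module, struct wrapper_module, mod);

	wmod->init_count++;
	return 0;
}

static struct wrapper_module example_wrapper = {
	.init_count = 0,
	.mod = {
		.name = "wrapper",
		.init = wrapper_init,
	},
};
```
+
+Calling example_wrapper.mod.init(&example_wrapper.mod) increments init_count in the enclosing structure, even though the callback only ever sees a pointer to the embedded member.
+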
+Architecture Overview
+---------------------
+
+On a very general level faultd is just a collection of modules that are all initialized in the beginning and cleaned up just before returning from main().
+To ensure the right order of module initialization every module has to have a type.
+This type can be understood as run level.
+The greater the type value is, the later the module is initialized.
+In init() a module should allocate all required resources and release them in cleanup().
+A module should not block inside init() for a long period of time (in read(), poll() etc.).
+Waiting should instead be delegated to the main loop running inside the daemon.
+It can be accessed via the event parameter of init().
+By default all modules should use the sd_event main loop, but if necessary it is also possible to use the glib main loop.
+The glib main loop object can be obtained using g_main_loop_new() with NULL as the context, or just by passing NULL as the main context argument to other glib functions.
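+
+The type-ordered initialization described above can be sketched as follows.
+This is only an illustration under stated assumptions: enum sketch_module_type, struct sketch_module and sketch_init_all() are hypothetical names, and the real faultd core may order and initialize its modules differently.
+
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical run levels; the real enum faultd_module_type may differ. */
enum sketch_module_type {
	SKETCH_TYPE_CORE = 0,
	SKETCH_TYPE_LISTENER = 1,
	SKETCH_TYPE_ACTION = 2,
};

/* Hypothetical, simplified module descriptor. */
struct sketch_module {
	const char *name;
	enum sketch_module_type type;
	int (*init)(struct sketch_module *module);
	int initialized;
};

/* Example init() that only records that it ran. */
static int sketch_mark_initialized(struct sketch_module *module)
{
	module->initialized = 1;
	return 0;
}

/* qsort() comparator: smaller type value sorts (and initializes) first. */
static int sketch_compare_by_type(const void *a, const void *b)
{
	const struct sketch_module *const *ma = a;
	const struct sketch_module *const *mb = b;

	return (int)(*ma)->type - (int)(*mb)->type;
}

/*
 * Sort the modules by type and call init() in that order.
 * Stops at the first failure and returns its error code.
 * qsort() is not stable, so the relative order of modules
 * sharing the same type is unspecified in this sketch.
 */
static int sketch_init_all(struct sketch_module **mods, size_t count)
{
	size_t i;
	int ret;

	qsort(mods, count, sizeof(*mods), sketch_compare_by_type);

	for (i = 0; i < count; i++) {
		ret = mods[i]->init(mods[i]);
		if (ret < 0)
			return ret;
	}

	return 0;
}
```
+
+After sketch_init_all() returns 0, the array is ordered by type, so a core-level module is always initialized before listeners and actions that may depend on it.
+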
Core
----
-Modules
+faultd core is a set of modules that provide basic abstraction layers (including modules).
+Generally you should not modify core unless you have a very good reason to do so.
+In addition all modules in core directory should always be built-in.
+
+The most important abstraction which is commonly used in faultd is struct faultd_module.
+Every event that goes through the event processor (event_processor_report_event()) is stored in the database so it can be retrieved later.
+To keep the correct order of events, monotonic time and boot-id are used.
+
+The typical flow of events starts with a listener.
+Then control goes to a decision maker and ends in an action.
+In the very beginning an event of some particular type (like service_failed_event) is detected and reported by one of the listeners.
+After reporting it to the core a suitable decision maker is called to decide what to do with this event.
+When the decision is made, the decision maker creates the decision_made_event and fills it with data about the action which should be executed.
+This event is once again reported to the core and now it's handled by the action executor which tries to find a suitable action and execute it.
+After this, the action_executed_event is reported to the core and usually this is the last step of event processing in faultd.
+
+Event Types
+-----------
+
+The base abstraction on which most of faultd modules operate is struct faultd_event.
+As there may be many different types of events in the system, struct faultd_event can be easily extended.
+To do so, a structure which contains struct faultd_event as one of its fields should be defined.
+To allow this event type to be used in the common faultd core, a couple of methods have to be defined:
+
+- char *to_string(struct faultd_event *)
+ This function should print all the event's data to a newly allocated string.
+
+- void release(struct faultd_event *)
+ This function is called to release all the memory owned by the event when all references have been dropped.
+
+- void serialize(struct faultd_event*, struct faultd_object*)
+ This function is called to serialize the given event to a generic faultd_object.
+ This function should put all the event data to faultd_object to allow placing this event into the database.
+
+- int allocate_event(struct faultd_event_type *type, void *data, struct faultd_event **ev)
+ This function should allocate a new event based on its data.
+
+- int deserialize_event(struct faultd_event_type *type, struct faultd_object *data, struct faultd_event **ev)
+ This function should parse the given faultd_object (usually retrieved from the DB) and allocate an event based on it.
+
+Those functions should be passed to core using struct faultd_event_type.
+Apart from methods, a unique name for each event type should also be defined.
+Once struct faultd_event_type is initialized, it should be registered in the core using the FAULTD_EVENT_TYPE_REGISTER() macro.
+An example of an event type definition is shown below:
+
+.. code:: C
+
+    /* Header file */
+    #define EXAMPLE_EVENT_ID "example"
+    #define EXAMPLE_FIELD_ID "example"
+
+    struct example_event {
+        struct faultd_event event;
+        int example;
+    };
+
+    struct e_event_data {
+        int example;
+    };
+
+    #define to_example_event(EVENT) \
+        container_of(EVENT, struct example_event, event)
+
+    /* .c file */
+
+    static int allocate_e_event(struct faultd_event_type *type,
+                                void *data, struct faultd_event **ev)
+    {
+        struct example_event *e_ev;
+        struct e_event_data *e_ev_data = data;
+        int ret;
+
+        e_ev = calloc(1, sizeof(*e_ev));
+        if (!e_ev)
+            return -ENOMEM;
+
+        /* Initialize the common part shared by all events */
+        ret = faultd_event_init_internal(type, &e_ev->event);
+        if (ret)
+            goto free_e_ev;
+
+        e_ev->example = e_ev_data->example;
+
+        *ev = &e_ev->event;
+        return 0;
+
+    free_e_ev:
+        free(e_ev);
+
+        return ret;
+    }
+
+    static int deserialize_e_event(struct faultd_event_type *type,
+                                   struct faultd_object *data,
+                                   struct faultd_event **ev)
+    {
+        int ret = -EINVAL;
+        struct e_event_data e_ev_data;
+        struct faultd_object *obj;
+
+        memset(&e_ev_data, 0, sizeof(e_ev_data));
+
+        /* Look for the field that was stored by e_event_serialize() */
+        list_for_each_entry(obj, &data->val.children, node) {
+            if ((obj->type == TYPE_INT) &&
+                (strcmp(EXAMPLE_FIELD_ID, obj->key) == 0)) {
+
+                e_ev_data.example = obj->val.i;
+            }
+        }
+
+        /* A zero value is treated as a missing field */
+        if (!e_ev_data.example) {
+            ret = -EINVAL;
+            goto finish;
+        }
+
+        ret = allocate_e_event(type, &e_ev_data, ev);
+        if (ret < 0)
+            goto finish;
+
+        /* Restore the common part shared by all events */
+        ret = faultd_event_deserialize_internal(data, type, *ev);
+        if (ret < 0)
+            goto finish;
+
+        ret = 0;
+    finish:
+        return ret;
+    }
+
+    static void e_event_release(struct faultd_event *ev)
+    {
+        struct example_event *e_ev = to_example_event(ev);
+
+        free(e_ev);
+    }
+
+    static char *e_event_to_string(struct faultd_event *ev)
+    {
+        struct example_event *e_ev = to_example_event(ev);
+        char *str;
+        int ret;
+
+        ret = asprintf(&str, "Example Event:"
+                       " Example: %d",
+                       e_ev->example);
+
+        return ret > 0 ? str : NULL;
+    }
+
+    static void e_event_serialize(struct faultd_event *ev,
+                                  struct faultd_object *out)
+    {
+        struct example_event *e_ev = to_example_event(ev);
+
+        /* Store the common part first, then the type-specific field */
+        faultd_event_serialize_internal(ev, out);
+
+        faultd_object_append_int(out, EXAMPLE_FIELD_ID, e_ev->example);
+    }
+
+    static struct faultd_event_type example_event_type = {
+        .name = EXAMPLE_EVENT_ID,
+        .default_ops = {
+            .release = e_event_release,
+            .serialize = e_event_serialize,
+            .to_string = e_event_to_string,
+        },
+        .allocate_event = allocate_e_event,
+        .deserialize_event = deserialize_e_event,
+        .node = LIST_HEAD_INIT(example_event_type.node),
+    };
+
+    FAULTD_EVENT_TYPE_REGISTER(example_event_type, example_event_et)
+
+Listeners
+---------
+
+Listeners are just modules that watch the system and generate a suitable event when something happens in their area of interest.
+There is no special API for defining them; they just use the base module abstraction.
+The typical listener module consists of three basic functions:
+
+- module init()
+ Apart from memory allocation, the listener module should use this function to install its watchers, for example to start polling some file descriptor.
+ Because a module should not block in init(), the main loop infrastructure should be used instead of calling poll() directly.
+ It's advised to use sd_event for this purpose; however, it's also possible to use glib if faultd has been compiled with glib support.
+
+- module cleanup()
+ Here all resources claimed by the module should be released.
+ The module should also take care of unregistering all event sources from the main loop and closing related fds.
+
+- file descriptor/dbus callback
+ This is usually a function that you pass to the mainloop while registering a file descriptor.
+ In this callback the listener should collect all the required data about the event, allocate it using faultd_event_create() and pass it for further processing using event_processor_report_event().
+
+Decision Makers
+---------------
+
+Decision makers are modules that decide what action should be taken when an event arrives.
+A decision maker gets an event reported by some listener and checks what happened.
+It may also check previous events by querying the database.
+Then it chooses an action to be executed or simply decides to ignore the event.
+
+Decision makers have their own abstraction called struct faultd_event_handler.
+To create a new decision maker, the functions listed below should be implemented:
+
+- init()
+ This function should claim required resources and load configuration (if needed) from the config parameter.
+
+- cleanup()
+ Here all resources claimed by the module should be released.
+ It's also the last opportunity for the decision maker to do something with pending events (if needed), otherwise they will just be dropped.
+
+- event_match()
+ This function should check if the decision maker is interested in handling the event passed as a parameter.
+ It's called before placing the event in the decision maker's queue.
+ Usually it calls faultd_event_is_of_type() to check if the event is of some particular type.
+ It should return non-zero if the decision maker is interested in this event.
+
+- handle_event()
+ This is the main routine of every decision maker.
+ It is called when the event queue for this decision maker (field event_queue in struct faultd_event_handler) is not empty.
+ Firstly, the decision maker should pop an event from the queue.
+ Then it should perform its logic to determine which action should be executed.
+ This may include querying the database, contacting some other daemon or maybe even showing a pop-up.
+ Next, a decision_made_event should be allocated using faultd_event_create() and filled with details about the action.
+ Lastly, the event should be reported to the core with event_processor_report_event().
+
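+The interplay of event_match(), the event queue and handle_event() described above can be sketched in isolation.
+All names below (struct sketch_event, struct sketch_handler, sketch_dispatch() and the helpers) are hypothetical and simplified; the real struct faultd_event_handler, its queue type and the way the core schedules handle_event() may look different.
+
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified event with an intrusive queue link. */
struct sketch_event {
	const char *type_name;
	struct sketch_event *next;
};

/* Hypothetical, simplified counterpart of struct faultd_event_handler. */
struct sketch_handler {
	struct sketch_event *head;
	struct sketch_event *tail;
	int decisions_made;
	int (*event_match)(struct sketch_handler *handler,
			   struct sketch_event *ev);
	int (*handle_event)(struct sketch_handler *handler);
};

static void sketch_queue_push(struct sketch_handler *handler,
			      struct sketch_event *ev)
{
	ev->next = NULL;
	if (handler->tail)
		handler->tail->next = ev;
	else
		handler->head = ev;
	handler->tail = ev;
}

static struct sketch_event *sketch_queue_pop(struct sketch_handler *handler)
{
	struct sketch_event *ev = handler->head;

	if (!ev)
		return NULL;
	handler->head = ev->next;
	if (!handler->head)
		handler->tail = NULL;
	ev->next = NULL;
	return ev;
}

/* Non-zero means "this decision maker wants the event". */
static int sketch_event_match(struct sketch_handler *handler,
			      struct sketch_event *ev)
{
	(void)handler;
	return strcmp(ev->type_name, "service_failed") == 0;
}

/* Pop one event and "decide"; a real handler would report a new event. */
static int sketch_handle_event(struct sketch_handler *handler)
{
	struct sketch_event *ev = sketch_queue_pop(handler);

	if (!ev)
		return 0;
	handler->decisions_made++;
	return 0;
}

/* Core side: returns 1 if handled, 0 if not matched, <0 on error. */
static int sketch_dispatch(struct sketch_handler *handler,
			   struct sketch_event *ev)
{
	int ret;

	if (!handler->event_match(handler, ev))
		return 0;

	sketch_queue_push(handler, ev);
	ret = handler->handle_event(handler);
	return ret < 0 ? ret : 1;
}
```
+
+In this sketch an event is queued only after event_match() accepts it, so handle_event() never sees events the decision maker is not interested in.
+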
+Actions
-------
+Action modules are used to implement some operation that should be executed in response to some event.
+To implement this kind of module, struct faultd_action should be used.
+Only one function has to be implemented: execute().
+First, this function should pop an event from the queue (it is guaranteed to be a decision_made_event).
+Then it should check if the action data provided in that event is correct and perform the action.
+All messages related to action execution (including errors) should be logged as strings appended to the action_log object inside the action_executed_event passed as a parameter.
+Lastly, this function should set the result field of action_executed_event to indicate whether action execution failed.
+The value returned from this function should always be 0 unless a fatal error occurred and this action won't be able to handle any more requests.
+
+If an action requires a relatively long time to execute, it should be processed asynchronously.
+This means that execute() should prepare the whole operation, store the required data (including the exec_info param), set the result field to -EPROBE_DEFER and return.
+Once the action has actually been executed, it should set the new value of the result field and report action_executed_event on its own.
+
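+The deferred execution scheme described above can be sketched as follows.
+This is an illustration under stated assumptions: struct sketch_exec_info, struct sketch_async_action and both functions are hypothetical, the reported flag merely stands in for reporting action_executed_event, and EPROBE_DEFER is defined locally because it is not part of the userspace <errno.h> (517 is the value used inside the Linux kernel).
+
```c
#include <stddef.h>

/* Not available in userspace <errno.h>; 517 is the Linux kernel value. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the data carried by action_executed_event. */
struct sketch_exec_info {
	int result;   /* mirrors the result field */
	int reported; /* set once the event would have been reported */
};

/* Hypothetical action that remembers the request it is still working on. */
struct sketch_async_action {
	struct sketch_exec_info *pending;
};

/* execute(): store the request, mark it as deferred and return at once. */
static int sketch_execute(struct sketch_async_action *action,
			  struct sketch_exec_info *info)
{
	action->pending = info;
	info->result = -EPROBE_DEFER;
	return 0; /* not a fatal error, the action can take more requests */
}

/* Called later (e.g. from a main loop callback) when the work is done. */
static void sketch_complete(struct sketch_async_action *action, int result)
{
	struct sketch_exec_info *info = action->pending;

	if (!info)
		return;

	info->result = result;
	info->reported = 1; /* stands in for reporting the event */
	action->pending = NULL;
}
```
+
+Between sketch_execute() and sketch_complete() the result stays at -EPROBE_DEFER, which lets the caller tell a still-running action apart from one that has finished.
+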
Plugins
-------
+
+Generally all modules apart from core can be compiled as shared libraries (plugins).
+To do so, you just have to add your own library target in Makefile.am instead of adding it to the faultd main binary sources.
+Plugins are loaded by faultd at the very beginning, before any module is initialized.
+It's not possible to load new plugins once faultd has started.
+After loading plugins, all modules are initialized using the standard initialization process described earlier in this document.