--- /dev/null
+<?xml version='1.0' encoding='utf-8' ?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
+%BOOK_ENTITIES;
+]>
+<chapter id="chap-Wayland-Protocol">
+ <title>The Wayland Protocol</title>
+ <section id="sect-Wayland-Protocol-Basic-Principles">
+ <title>Basic Principles</title>
+ <para>
+ The Wayland protocol is an asynchronous, object-oriented protocol. All
+ requests are method invocations on some object. Each request includes
+ an object id that uniquely identifies an object on the server. Each
+ object implements an interface, and each request includes an opcode that
+ identifies which method in the interface to invoke.
+ </para>
+ <para>
+ The server sends events back to the client; each event is emitted from
+ an object. Events can be error conditions. The event includes the
+ object id and the event opcode, from which the client can determine
+ the type of event. Events are generated either in response to requests
+ (in which case the request and the event constitute a round trip) or
+ spontaneously when the server state changes.
+ </para>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ State is broadcast on connect, and events are sent
+ out when state changes. Clients must listen for
+ these changes and cache the state.
+ There is no need (or mechanism) to query server state.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The server will broadcast the presence of a number of global objects,
+ which in turn will broadcast their current state.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
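+ <para>
+ As a rough illustration of the model described above (not taken from
+ the Wayland sources), the following Python sketch shows how a client
+ might route an incoming event by object id and opcode and cache the
+ state it carries. The <literal>Proxy</literal> class, the
+ <literal>handle_event</literal> function and the two-event interface
+ are hypothetical names chosen for this example.
+ </para>
+ <programlisting><![CDATA[
```python
class Proxy:
    """Hypothetical client-side stand-in for one server object."""

    def __init__(self, object_id, event_names):
        self.object_id = object_id
        # The position of a name in this list plays the role of the opcode.
        self.event_names = event_names
        # Last arguments seen for each event: the client's cached state.
        self.state = {}

    def dispatch(self, opcode, args):
        name = self.event_names[opcode]
        self.state[name] = args
        return name


objects = {}  # object id -> Proxy


def register(proxy):
    objects[proxy.object_id] = proxy


def handle_event(object_id, opcode, args):
    """Look up the emitting object, then let its interface decode the opcode."""
    return objects[object_id].dispatch(opcode, args)


# Illustrative interface with two events; real event lists come from the XML.
output = Proxy(5, ["geometry", "mode"])
register(output)
handle_event(5, 1, (1920, 1080))
print(output.state)  # {'mode': (1920, 1080)}
```
+ ]]></programlisting>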
+ </section>
+ <section id="sect-Wayland-Protocol-Code-Generation">
+ <title>Code Generation</title>
+ <para>
+ The interfaces, requests and events are defined in
+ <filename>protocol/wayland.xml</filename>.
+ This XML file is used to generate the function prototypes that can be used by
+ clients and compositors.
+ </para>
+ <para>
+ The protocol entry points are generated as inline functions which just
+ wrap the <function>wl_proxy_*</function> functions. The inline functions aren't
+ part of the library ABI and language bindings should generate their
+ own stubs for the protocol entry points from the xml.
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Wire-Format">
+ <title>Wire Format</title>
+ <para>
+ The protocol is sent over a UNIX domain stream socket. Currently, the
+ endpoint is named <systemitem class="service">\0wayland</systemitem>,
+ but it is subject to change. The protocol is message-based. A
+ message sent by a client to the server is called a request. A message
+ from the server to a client is called an event. Every message is
+ structured as 32-bit words; values are represented in the host's
+ byte order.
+ </para>
+ <para>
+ The message header has 2 words in it:
+ <itemizedlist>
+ <listitem>
+ <para>
+ The first word is the sender's object id (32-bit).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The second word has two 16-bit parts. The upper 16 bits are the message
+ size in bytes, starting at the header (i.e. it has a minimum value of 8).
+ The lower 16 bits are the request/event opcode.
+ </para>
+ </listitem>
+ </itemizedlist>
+ The payload describes the request/event arguments. Every argument is
+ aligned to 32 bits. Where padding is required, the value of the padding
+ bytes is undefined. No prefix describes the type of an argument; the type
+ is inferred from the XML specification.
+ </para>
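+ <para>
+ The header layout can be sketched in a few lines of Python. This is an
+ illustration only, not code from the Wayland library; the names
+ <literal>pack_header</literal> and <literal>unpack_header</literal> are
+ made up for the example, and the check that the size is a multiple of
+ four follows from the statement that messages consist of 32-bit words.
+ </para>
+ <programlisting><![CDATA[
```python
import struct

HEADER_SIZE = 8  # two 32-bit words


def pack_header(object_id, opcode, payload_size):
    """Build the two-word header for a message with payload_size argument bytes."""
    size = HEADER_SIZE + payload_size
    if size % 4 != 0:
        raise ValueError("message size must be a multiple of 4 bytes")
    if size > 0xFFFF or not 0 <= opcode <= 0xFFFF:
        raise ValueError("size and opcode must each fit in 16 bits")
    # "=" selects the host's byte order with standard sizes (I is 32 bits).
    return struct.pack("=II", object_id, (size << 16) | opcode)


def unpack_header(data):
    """Return (object_id, opcode, size) from the first eight bytes of data."""
    object_id, word = struct.unpack("=II", data[:HEADER_SIZE])
    return object_id, word & 0xFFFF, word >> 16


header = pack_header(object_id=3, opcode=1, payload_size=4)
print(unpack_header(header))  # (3, 1, 12)
```
+ ]]></programlisting>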
+ <para>
+ The argument types are represented as follows:
+ <variablelist>
+ <varlistentry>
+ <term>int</term>
+ <term>uint</term>
+ <listitem>
+ <para>
+ The value is the 32-bit value of the signed/unsigned
+ int.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>string</term>
+ <listitem>
+ <para>
+ Starts with an unsigned 32-bit length, followed by the
+ string contents, including terminating NUL byte, then padding to a
+ 32-bit boundary.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>object</term>
+ <listitem>
+ <para>
+ 32-bit object ID.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>new_id</term>
+ <listitem>
+ <para>
+ The 32-bit object ID. On requests, the client
+ decides the ID. The only events with <type>new_id</type> are
+ advertisements of globals, and the server will use IDs below
+ 0x10000.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>array</term>
+ <listitem>
+ <para>
+ Starts with 32-bit array size in bytes, followed by the array
+ contents verbatim, and finally padding to a 32-bit boundary.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>fd</term>
+ <listitem>
+ <para>
+ The file descriptor is not stored in the message buffer, but in
+ the ancillary data of the UNIX domain socket message (msg_control).
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
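+ <para>
+ The encodings above can be illustrated with a small Python sketch. It
+ is not code from the Wayland library and the function names are
+ hypothetical. Two assumptions are made: the string length counts the
+ terminating NUL byte, and padding bytes are written as zeros (the
+ protocol leaves their value undefined). Strings are encoded as UTF-8
+ purely for the sake of the example.
+ </para>
+ <programlisting><![CDATA[
```python
import struct


def pad4(data):
    """Pad data to the next 32-bit boundary (zeros chosen arbitrarily)."""
    return data + b"\x00" * (-len(data) % 4)


def encode_int(value):
    return struct.pack("=i", value)   # signed 32-bit, host byte order


def encode_uint(value):
    return struct.pack("=I", value)   # unsigned 32-bit, host byte order


def encode_string(text):
    raw = text.encode("utf-8") + b"\x00"   # contents plus terminating NUL
    # Assumption: the length word counts the NUL byte but not the padding.
    return encode_uint(len(raw)) + pad4(raw)


def encode_array(contents):
    # The size word is the array size in bytes, without padding.
    return encode_uint(len(contents)) + pad4(contents)


print(len(encode_string("hi")))             # 8: length word + "hi\0" + 1 pad byte
print(len(encode_array(b"\x01\x02\x03\x04\x05")))  # 12: size word + 5 bytes + 3 pad
```
+ ]]></programlisting>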
+ </section>
+ <section id="sect-Wayland-Protocol-Interfaces">
+ <title>Interfaces</title>
+ <para>
+ The protocol includes several interfaces which are used for
+ interacting with the server. Each interface provides requests,
+ events, and errors (which are really just special events) as described
+ above. Specific compositor implementations may have their own
+ interfaces provided as extensions, but there are several which are
+ always expected to be present.
+ </para>
+ <para>
+ Core interfaces:
+ <variablelist>
+ <varlistentry>
+ <term>wl_display</term>
+ <listitem>
+ <para>
+ provides global functionality like object binding and
+ fatal error events
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_callback</term>
+ <listitem>
+ <para>
+ callback interface for done events
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_compositor</term>
+ <listitem>
+ <para>
+ core compositor interface, allows surface creation
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_shm</term>
+ <listitem>
+ <para>
+ buffer management interface with buffer creation and format
+ handling
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_buffer</term>
+ <listitem>
+ <para>
+ buffer handling interface for indicating damage and object
+ destruction, also provides buffer release events from the
+ server
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_data_offer</term>
+ <listitem>
+ <para>
+ for accepting and receiving specific mime types
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_data_source</term>
+ <listitem>
+ <para>
+ for offering specific mime types
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_data_device</term>
+ <listitem>
+ <para>
+ lets clients manage drag &amp; drop, provides pointer enter/leave events and motion
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_data_device_manager</term>
+ <listitem>
+ <para>
+ for managing data sources and devices
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_shell</term>
+ <listitem>
+ <para>
+ shell surface handling
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_shell_surface</term>
+ <listitem>
+ <para>
+ shell surface handling and desktop-like events (e.g. set a
+ surface to fullscreen, display a popup, etc.)
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_surface</term>
+ <listitem>
+ <para>
+ surface management (destruction, damage, buffer attach, frame
+ handling)
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_input_device</term>
+ <listitem>
+ <para>
+ cursor setting, motion, button, and key events, etc.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>wl_output</term>
+ <listitem>
+ <para>
+ events describing an attached output (subpixel orientation,
+ current mode &amp; geometry, etc.)
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Connect-Time">
+ <title>Connect Time</title>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ no fixed format connect block, the server emits a bunch of
+ events at connect time
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ presence events for global objects: output, compositor, input
+ devices
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Security-and-Authentication">
+ <title>Security and Authentication</title>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ mostly about access to underlying buffers, need new drm auth
+ mechanism (the grant-to ioctl idea), need to check the cmd stream?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ getting the server socket depends on the compositor type: it could
+ be a system wide name, it could be obtained through fd passing on the
+ session dbus, or the client is forked by the compositor and the fd is
+ already opened.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Creating-Objects">
+ <title>Creating Objects</title>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ client allocates object ID, uses range protocol
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ server tracks how many IDs are left in current range, sends
+ new range when client is about to run out.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
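+ <para>
+ The range scheme is only outlined above, so the following Python
+ sketch is one possible reading of it rather than the actual
+ mechanism. The <literal>IdAllocator</literal> class, the
+ <literal>low_water</literal> threshold and the range bases used in the
+ usage lines are all assumptions made for illustration.
+ </para>
+ <programlisting><![CDATA[
```python
class IdAllocator:
    """Hypothetical client-side allocator fed by ranges granted by the server."""

    def __init__(self, low_water=2):
        self.ranges = []            # list of [next_free, end) pairs
        self.low_water = low_water  # assumed refill threshold

    def add_range(self, base, count):
        """Record a new range of ids granted by the server."""
        self.ranges.append([base, base + count])

    def remaining(self):
        return sum(end - nxt for nxt, end in self.ranges)

    def needs_refill(self):
        """What the server would track to decide when to send another range."""
        return self.remaining() <= self.low_water

    def allocate(self):
        # Drop exhausted ranges before handing out the next id.
        while self.ranges and self.ranges[0][0] == self.ranges[0][1]:
            self.ranges.pop(0)
        if not self.ranges:
            raise RuntimeError("no object ids left in the granted ranges")
        object_id = self.ranges[0][0]
        self.ranges[0][0] += 1
        return object_id


ids = IdAllocator(low_water=1)
ids.add_range(0x10000, 3)      # example base, chosen arbitrarily
first = ids.allocate()
second = ids.allocate()
refill = ids.needs_refill()    # one id left, so a new range is due
ids.add_range(0x20000, 2)
```
+ ]]></programlisting>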
+ </section>
+ <section id="sect-Wayland-Protocol-Compositor">
+ <title>Compositor</title>
+ <para>
+ The compositor is a global object, advertised at connect time.
+ </para>
+ <para>
+ See <xref linkend="protocol-spec-interface-wl_compositor"/> for the
+ protocol description.
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Surface">
+ <title>Surface</title>
+ <para>
+ Created by the client.
+ </para>
+ <para>
+ See <xref linkend="protocol-interface-wl_surface"/> for the protocol
+ description.
+ </para>
+ <para>
+ Needs a way to set input region, opaque region.
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Input">
+ <title>Input</title>
+ <para>
+ Represents a group of input devices, such as mice and keyboards. Has a
+ keyboard and pointer focus. Global object. Pointer events are
+ delivered in both screen coordinates and surface local coordinates.
+ </para>
+ <para>
+ See <xref linkend="protocol-interface-wl_input_device"/> for the
+ protocol description.
+ </para>
+ <para>
+ Talk about:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ keyboard map, change events
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ xkb on wayland
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ multi pointer wayland
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ A surface can change the pointer image when the surface is the pointer
+ focus of the input device. Wayland doesn't automatically change the
+ pointer image when a pointer enters a surface, but expects the
+ application to set the cursor it wants in response to the pointer
+ focus and motion events. The rationale is that a client has to manage
+ changing pointer images for UI elements within the surface in response
+ to motion events anyway, so we'll make that the only mechanism for
+ changing the pointer image. If the server receives a request
+ to set the pointer image after the surface loses pointer focus, the
+ request is ignored. To the client this will look like it successfully
+ set the pointer image.
+ </para>
+ <para>
+ The compositor will revert the pointer image back to a default image
+ when no surface has the pointer focus for that device. Clients can
+ revert the pointer image back to the default image by setting a NULL
+ image.
+ </para>
+ <para>
+ What if the pointer moves from one window which has set a special
+ pointer image to a surface that doesn't set an image in response to
+ the motion event? The new surface will be stuck with the special
+ pointer image. We can't just revert the pointer image on leaving a
+ surface, since if we immediately enter a surface that sets a different
+ image, the image will flicker. Broken app, I suppose.
+ </para>
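+ <para>
+ The pointer image rules above can be modelled with a short Python
+ sketch. It is a toy model, not compositor code; the
+ <literal>InputDevice</literal> class and its method names are
+ hypothetical, and images are represented by plain strings.
+ </para>
+ <programlisting><![CDATA[
```python
class InputDevice:
    """Toy server-side model of the pointer image rules (hypothetical names)."""

    DEFAULT_IMAGE = "default"

    def __init__(self):
        self.pointer_focus = None
        self.pointer_image = self.DEFAULT_IMAGE

    def set_pointer_focus(self, surface):
        self.pointer_focus = surface
        if surface is None:
            # No surface has the focus: revert to the default image.
            self.pointer_image = self.DEFAULT_IMAGE
        # Entering a surface does not touch the image; the client must set it.

    def set_pointer_image(self, surface, image):
        """Request from the client owning surface; image None means default."""
        if surface != self.pointer_focus:
            return  # silently ignored, the client sees no error
        self.pointer_image = self.DEFAULT_IMAGE if image is None else image


device = InputDevice()
device.set_pointer_focus("surface-a")
device.set_pointer_image("surface-a", "hand")
device.set_pointer_image("surface-b", "ibeam")   # ignored: no pointer focus
device.set_pointer_focus("surface-b")
print(device.pointer_image)  # hand
```
+ ]]></programlisting>
+ <para>
+ The last line shows the situation discussed above: a surface that never
+ sets an image inherits whatever image the previous surface left behind.
+ </para>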
+ </section>
+ <section id="sect-Wayland-Protocol-Output">
+ <title>Output</title>
+ <para>
+ An output is a global object, advertised at connect time or as it
+ comes and goes.
+ </para>
+ <para>
+ See <xref linkend="protocol-interface-wl_output"/> for the protocol
+ description.
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ laid out in a big (compositor) coordinate system
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ basically xrandr over wayland
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ geometry needs position in compositor coordinate system
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ events to advertise available modes, requests to move and change
+ modes
+ </para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ <section id="sect-Wayland-Protocol-Shared-Object-Cache">
+ <title>Shared Object Cache</title>
+ <para>
+ Cache for sharing glyphs, icons, cursors across clients. Lets clients
+ share identical objects. The cache is a global object, advertised at
+ connect time.
+ <programlisting>
+ Interface: cache
+ Requests: upload(key, visual, bo, stride, width, height)
+ Events: item(key, bo, x, y, stride)
+ retire(bo)
+ </programlisting>
+ </para>
+ <para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Upload by passing a visual, bo, stride, width, height to the
+ cache.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Upload returns a bo name, stride, and x, y location of object in
+ the buffer. Clients take a reference on the atlas bo.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Shared objects are refcounted, freed by client (when purging
+ glyphs from the local cache) or when a client exits.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Server can't delete individual items from an atlas, but it can
+ throw out an entire atlas bo if it becomes too sparse. The server
+ sends out a <type>retire</type> event when this happens, and clients
+ must throw away any objects from that bo and reupload. Between the
+ server dropping the atlas and the client receiving the retire event,
+ clients can still legally use the old atlas since they have a ref on
+ the bo.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ cairo needs to hook into the glyph cache, and maybe also a way
+ to create a read-only surface based on an object from the cache
+ (icons).
+ <function>cairo_wayland_create_cached_surface(surface-data)</function>
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </section>
+ <section id="sect-Wayland-Protocol-Drag-and-Drop">
+ <title>Drag and Drop</title>
+ <para>
+ Multi-device aware. Orthogonal to rest of wayland, as it is its own
+ toplevel object. Since the compositor determines the drag target, it
+ works with transformed surfaces (dragging to a scaled down window in
+ expose mode, for example).
+ </para>
+ <para>
+ See <xref linkend="protocol-interface-wl_data_offer"/>,
+ <xref linkend="protocol-interface-wl_data_source"/> and
+ <xref linkend="protocol-interface-wl_data_device"/> for
+ protocol descriptions.
+ </para>
+ <para>
+ Issues:
+ <itemizedlist>
+ <listitem>
+ <para>
+ we can set the cursor image to the current cursor + dragged
+ object, which will last as long as the drag, but maybe a request to
+ attach an image to the cursor will be more convenient?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Should drag.send() destroy the object? There's nothing to do
+ after the data has been transferred.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How do we marshal several mime-types? We could make the drag
+ setup a multi-step operation: dnd.create, drag.offer(mime-type1),
+ drag.offer(mime-type2), drag.activate(). The drag object could send
+ multiple offer events on each motion event. Or we could just
+ implement an array type, but that's a pain to work with.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Middle-click drag to pop up menu? Ctrl/Shift/Alt drag?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Send a file descriptor over the protocol to let initiator and
+ source exchange data out of band?
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Action? Specify action when creating the drag object? Ask
+ action?
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ Sequence of events:
+ <orderedlist>
+ <listitem>
+ <para>
+ The initiator surface receives a click (which grabs the input
+ device to that surface) and then enough motion to decide that a drag
+ is starting. Wayland has no subwindows, so it's entirely up to the
+ application to decide whether or not a draggable object within the
+ surface was clicked.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The initiator creates a drag object by calling the
+ <function>create_drag</function> method on the dnd global
+ object. As for any client created object, the client allocates
+ the id. The <function>create_drag</function> method also takes
+ the originating surface, the device that's dragging and the
+ mime-types supported. If the surface
+ has indeed grabbed the device passed in, the server will create an
+ active drag object for the device. If the grab was released in the
+ meantime, the drag object will be inactive, that is, the same state
+ as when the grab is released. In that case, the client will receive
+ a button up event, which will let it know that the drag finished.
+ To the client it will look like the drag was immediately cancelled
+ by the grab ending.
+ </para>
+ <para>
+ The special mime-type application/x-root-target indicates that the
+ initiator is looking for drag events to the root window as well.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ To indicate the object being dragged, the initiator can replace
+ the pointer image with a larger image representing the data being
+ dragged with the cursor image overlaid. The pointer image will
+ remain in place as long as the grab is in effect, since the
+ initiating surface keeps pointer focus, and no other surface
+ receives enter events.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ As long as the grab is active (or until the initiator cancels
+ the drag by destroying the drag object), the drag object will send
+ <function>offer</function> events to surfaces it moves across. As for motion
+ events, these events contain the surface local coordinates of the
+ device as well as the list of mime-types offered. When a device
+ leaves a surface, it will send an <function>offer</function> event with an empty
+ list of mime-types to indicate that the device left the surface.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If a surface receives an offer event and decides that it's in an
+ area that can accept a drag event, it should call the
+ <function>accept</function> method on the drag object in the event. The surface
+ passes a mime-type in the request, picked from the list in the offer
+ event, to indicate which of the types it wants. At this point, the
+ surface can update the appearance of the drop target to give
+ feedback to the user that the drag has a valid target. If the
+ <function>offer</function> event moves to a different drop target (the surface
+ decides the offer coordinates are outside the drop target) or leaves
+ the surface (the offer event has an empty list of mime-types) it
+ should revert the appearance of the drop target to the inactive
+ state. A surface can also decide to retract its drop target (if the
+ drop target disappears or moves, for example), by calling the accept
+ method with a NULL mime-type.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When a target surface sends an <function>accept</function> request, the drag
+ object will send a <function>target</function> event to the initiator surface.
+ This tells the initiator that the drag currently has a potential
+ target and which of the offered mime-types the target wants. The
+ initiator can change the pointer image or drag source appearance to
+ reflect this new state. If the target surface retracts its drop
+ target or if the surface disappears, a <function>target</function> event with a
+ NULL mime-type will be sent.
+ </para>
+ <para>
+ If the initiator listed application/x-root-target as a valid
+ mime-type, dragging into the root window will make the drag object
+ send a <function>target</function> event with the application/x-root-target
+ mime-type.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When the grab is released (indicated by the button release
+ event), if the drag has an active target, the initiator calls the
+ <function>send</function> method on the drag object to send the data to be
+ transferred by the drag operation, in the format requested by the
+ target. The initiator can then destroy the drag object by calling
+ the <function>destroy</function> method.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The drop target receives a <function>data</function> event from the drag
+ object with the requested data.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
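+ <para>
+ The sequence above can be traced with a small Python model. This is a
+ sketch of the message flow only, under the assumption that the drag
+ object remembers which surface last accepted the offer; the
+ <literal>Drag</literal> class and its <literal>events</literal> log are
+ hypothetical and do not correspond to any generated binding.
+ </para>
+ <programlisting><![CDATA[
```python
class Drag:
    """Hypothetical model of the drag object; events are recorded, not sent."""

    def __init__(self, mime_types):
        self.mime_types = tuple(mime_types)
        self.target = None        # surface that accepted the offer, if any
        self.target_type = None   # mime-type the target asked for
        self.events = []          # (recipient, event name, payload)

    def move_over(self, surface):
        self.events.append((surface, "offer", self.mime_types))

    def leave(self, surface):
        # An offer with an empty mime-type list tells the surface the device left.
        self.events.append((surface, "offer", ()))

    def accept(self, surface, mime_type):
        """Request from a target surface; None retracts the drop target."""
        if mime_type is not None and mime_type not in self.mime_types:
            raise ValueError("mime-type was not offered")
        self.target = surface if mime_type is not None else None
        self.target_type = mime_type
        self.events.append(("initiator", "target", mime_type))

    def send(self, data):
        """Request from the initiator after the grab is released."""
        if self.target is None:
            return False  # no active target, nothing to deliver
        self.events.append((self.target, "data", data))
        return True


drag = Drag(["text/plain", "text/uri-list"])
drag.move_over("surface-b")             # surface-b receives an offer event
drag.accept("surface-b", "text/plain")  # initiator receives a target event
drag.send(b"hello")                     # surface-b receives a data event
```
+ ]]></programlisting>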
+ <para>
+ MIME is defined in RFCs 2045-2049. A
+ <ulink url="ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/">
+ registry of MIME types</ulink> is maintained by the Internet Assigned
+ Numbers Authority (IANA).
+ </para>
+ </section>
+</chapter>
+++ /dev/null
-\documentclass{article}
-\usepackage{palatino}
-\usepackage{graphicx}
-
-\author{Kristian Høgsberg\\
-\texttt{krh@bitplanet.net}
-}
-
-\title{The Wayland Compositing System}
-
-\begin{document}
-
-\maketitle
-
-\section{Wayland Overview}
-
-\begin{itemize}
-\item wayland is a protocol for a new display server.
-\item weston is the open source project implementing a wayland based compositor
-\end{itemize}
-
-\subsection{Replacing X11}
-
-In Linux and other Unix-like systems, the X stack has grown to
-encompass functionality arguably belonging in client libraries,
-helper libraries, or the host operating system kernel. Support for
-things like PCI resource management, display configuration management,
-direct rendering, and memory management has been integrated into the X
-stack, imposing limitations like limited support for standalone
-applications, duplication in other projects (e.g. the Linux fb layer
-or the DirectFB project), and high levels of complexity for systems
-combining multiple elements (for example radeon memory map handling
-between the fb driver and X driver, or VT switching).
-
-Moreover, X has grown to incorporate modern features like offscreen
-rendering and scene composition, but subject to the limitations of the
-X architecture. For example, the X implementation of composition adds
-additional context switches and makes things like input redirection
-difficult.
-
-\begin{figure}
-\begin{center}
-\includegraphics[width=70mm]{x-architecture.png}
-\caption{\small \sl X with a compositing manager.\label{fig:X architecture}}
-\end{center}
-\end{figure}
-
-The diagram above illustrates the central role of the X server and
-compositor in operations, and the steps required to get contents on to
-the screen.
-
-Over time, X developers came to understand the shortcomings of this
-approach and worked to split things up. Over the past several years,
-a lot of functionality has moved out of the X server and into
-client-side libraries or kernel drivers. One of the first components
-to move out was font rendering, with freetype and fontconfig providing
-an alternative to the core X fonts. Direct rendering OpenGL as a
-graphics driver in a client side library went through some iterations,
-ending up as DRI2, which abstracted most of the direct rendering
-buffer management from client code. Then cairo came along and provided
-a modern 2D rendering library independent of X, and compositing
-managers took over control of the rendering of the desktop as toolkits
-like GTK+ and Qt moved away from using X APIs for rendering. Recently,
-memory and display management have moved to the Linux kernel, further
-reducing the scope of X and its driver stack. The end result is a
-highly modular graphics stack.
-
-\subsection{Make the compositing manager the display server}
-
-Wayland is a new display server and compositing protocol, and Weston
-is the implementation of this protocol which builds on top of all the
-components above. We are trying to distill out the functionality in
-the X server that is still used by the modern Linux desktop. This
-turns out to be not a whole lot. Applications can allocate their own
-off-screen buffers and render their window contents directly, using
-hardware accelerated libraries like libGL, or high quality software
-implementations like those found in Cairo. In the end, what’s needed
-is a way to present the resulting window surface for display, and a
-way to receive and arbitrate input among multiple clients. This is
-what Wayland provides, by piecing together the components already in
-the eco-system in a slightly different way.
-
-X will always be relevant, in the same way Fortran compilers and VRML
-browsers are, but it’s time that we think about moving it out of the
-critical path and provide it as an optional component for legacy
-applications.
-
-Overall, the philosophy of Wayland is to provide clients with a way to
-manage windows and how their contents is displayed. Rendering is left
-to clients, and system wide memory management interfaces are used to
-pass buffer handles between clients and the compositing manager.
-
-\begin{figure}
-\begin{center}
-\includegraphics[width=50mm]{wayland-architecture.png}
-\caption{\small \sl The Wayland system\label{fig:Wayland architecture}}
-\end{center}
-\end{figure}
-
-The figure above illustrates how Wayland clients interact with a
-Wayland server. Note that window management and composition are
-handled entirely in the server, significantly reducing complexity
-while marginally improving performance through reduced context
-switching. The resulting system is easier to build and extend than a
-similar X system, because often changes need only be made in one
-place. Or in the case of protocol extensions, two (rather than 3 or 4
-in the X case where window management and/or composition handling may
-also need to be updated).
-
-\section{Wayland protocol}
-
-\subsection{Basic Principles}
-
-The wayland protocol is an asynchronous object oriented protocol. All
-requests are method invocations on some object. The request include
-an object id that uniquely identifies an object on the server. Each
-object implements an interface and the requests include an opcode that
-identifies which method in the interface to invoke.
-
-The server sends back events to the client, each event is emitted from
-an object. Events can be error conditions. The event includes the
-object id and the event opcode, from which the client can determine
-the type of event. Events are generated both in response to requests
-(in which case the request and the event constitutes a round trip) or
-spontaneously when the server state changes.
-
-\begin{itemize}
-\item State is broadcast on connect, events are sent out when state
- changes. Clients must listen for these changes and cache the state.
- There is no need (or mechanism) to query server state.
-
-\item the server will broadcast the presence of a number of global objects,
- which in turn will broadcast their current state.
-\end{itemize}
-
-\subsection{Code generation}
-
-The interfaces, requests and events are defined in protocol/wayland.xml.
-This xml is used to generate the function prototypes that can be used by
-clients and compositors.
-
-The protocol entry points are generated as inline functions which just
-wrap the \verb:wl_proxy_*: functions. The inline functions aren't
-part of the library ABI and language bindings should generate their
-own stubs for the protocol entry points from the xml.
-
-\subsection{Wire format}
-
-The protocol is sent over a UNIX domain stream socket. Currently, the
-endpoint is named \texttt{\textbackslash0wayland}, but it is subject
-to change. The protocol is message-based. A message sent by a client
-to the server is called \texttt{request}. A message from the server
-to a client is called \texttt{event}. Every message is structured as
-32-bit words, values are represented in the host's byte-order.
-
-The message header has 2 words in it:
-\begin{itemize}
-\item The first word is the sender's object id (32-bit).
-\item The second has 2 parts of 16-bit. The upper 16-bits are the message
- size in bytes, starting at the header (i.e. it has a minimum value of 8).
- The lower is the request/event opcode.
-\end{itemize}
-
-The payload describes the request/event arguments. Every argument is always
-aligned to 32-bits. Where padding is required, the value of padding bytes is
-undefined. There is no prefix that describes the type, but it is
-inferred implicitly from the xml specification.
-
-The representation of argument types are as follows:
-\begin{itemize}
-\item "int" or "uint": The value is the 32-bit value of the signed/unsigned
- int.
-\item "string": Starts with an unsigned 32-bit length, followed by the
- string contents, including terminating NUL byte, then padding to a
- 32-bit boundary.
-\item "object": A 32-bit object ID.
-\item "new\_id": the 32-bit object ID. On requests, the client
- decides the ID. The only events with "new\_id" are advertisements of
- globals, and the server will use IDs below 0x10000.
-\item "array": Starts with 32-bit array size in bytes, followed by the array
- contents verbatim, and finally padding to a 32-bit boundary.
-\item "fd": the file descriptor is not stored in the message buffer, but in
- the ancillary data of the UNIX domain socket message (msg\_control).
-\end{itemize}
-
-\subsection{Interfaces}
-
-The protocol includes several interfaces which are used for
-interacting with the server. Each interface provides requests,
-events, and errors (which are really just special events) as described
-above. Specific compositor implementations may have their own
-interfaces provided as extensions, but there are several which are
-always expected to be present.
-
-Core interfaces:
-\begin{itemize}
-\item wl_display: provides global functionality like objecting binding and fatal error events
-\item wl_callback: callback interface for dnoe events
-\item wl_compositor: core compositor interface, allows surface creation
-\item wl_shm: buffer management interface with buffer creation and format handling
-\item wl_buffer: buffer handling interface for indicating damage and object destruction, also provides buffer release events from the server
-\item wl_data_offer: for accepting and receiving specific mime types
-\item wl_data_source: for offering specific mime types
-\item wl_data_Device: lets clients manage drag & drop, provides pointer enter/leave events and motion
-\item wl_data_device_manager: for managing data sources and devices
-\item wl_shell: shell surface handling
-\item wl_shell_surface: shell surface handling and desktop-like events (e.g. set a surface to fullscreen, display a popup, etc.)
-\item wl_surface: surface management (destruction, damage, buffer attach, frame handling)
-\item wl_input_device: cursor setting, motion, button, and key events, etc.
-\item wl_output: events describing an attached output (subpixel orientation, current mode & geometry, etc.)
-\end{itemize}
-
-\subsection{Connect Time}
-
-\begin{itemize}
-\item no fixed format connect block, the server emits a bunch of
- events at connect time
-\item presence events for global objects: output, compositor, input
- devices
-\end{itemize}
-\subsection{Security and Authentication}
-
-\begin{itemize}
-\item mostly about access to underlying buffers, need new drm auth
- mechanism (the grant-to ioctl idea), need to check the cmd stream?
-
-\item getting the server socket depends on the compositor type, could
- be a system wide name, through fd passing on the session dbus. or
- the client is forked by the compositor and the fd is already opened.
-\end{itemize}
-
-\subsection{Creating Objects}
-
-\begin{itemize}
-\item client allocates object ID, uses range protocol
-\item server tracks how many IDs are left in current range, sends new
- range when client is about to run out.
-\end{itemize}
-
-\subsection{Compositor}
-
-The compositor is a global object, advertised at connect time.
-
-\begin{tabular}{l}
- \hline
- Interface \texttt{compositor} \\ \hline
- Requests \\ \hline
- \texttt{create\_surface(id)} \\
- \texttt{commit()} \\ \hline
- Events \\ \hline
- \texttt{device(device)} \\
- \texttt{acknowledge(key, frame)} \\
- \texttt{frame(frame, time)} \\ \hline
-\end{tabular}
-
-
-\begin{itemize}
-\item a global object
-\item broadcasts drm file name, or at least a string like drm:/dev/dri/card0
-\item commit/ack/frame protocol
-\end{itemize}
-
-\subsection{Surface}
-
-Created by the client.
-
-\begin{tabular}{l}
- \hline
- Interface \texttt{surface} \\ \hline
- Requests \\ \hline
- \texttt{destroy()} \\
- \texttt{attach()} \\
- \texttt{map()} \\
- \texttt{damage()} \\ \hline
- Events \\ \hline
- no events \\ \hline
-\end{tabular}
-
-Needs a way to set input region, opaque region.
-
-\subsection{Input}
-
-Represents a group of input devices, including mice, keyboards. Has a
-keyboard and pointer focus. Global object. Pointer events are
-delivered in both screen coordinates and surface local coordinates.
-
-\begin{tabular}{l}
- \hline
- Interface \texttt{input\_device} \\ \hline
- Requests \\ \hline
- \texttt{attach(buffer, x, y)} \\ \hline
- Events \\ \hline
- \texttt{motion(x, y, sx, sy)} \\
- \texttt{button(button, state, x, y, sx, sy)} \\
- \texttt{key(key, state)} \\
- \texttt{pointer\_focus(surface)} \\
- \texttt{keyboard\_focus(surface, keys)} \\ \hline
-\end{tabular}
-
-Talk about:
-
-\begin{itemize}
-\item keyboard map, change events
-\item xkb on wayland
-\item multi pointer wayland
-\end{itemize}
-
-A surface can change the pointer image when the surface is the pointer
-focus of the input device. Wayland doesn't automatically change the
-pointer image when a pointer enters a surface, but expects the
-application to set the cursor it wants in response to the pointer
-focus and motion events. The rationale is that a client has to manage
-changing pointer images for UI elements within the surface in response
-to motion events anyway, so we'll make that the only mechanism for
-changing the pointer image. If the server receives a request
-to set the pointer image after the surface loses pointer focus, the
-request is ignored. To the client this will look like it successfully
-set the pointer image.
-
-The compositor will revert the pointer image back to a default image
-when no surface has the pointer focus for that device. Clients can
-revert the pointer image back to the default image by setting a NULL
-image.
-
-What if the pointer moves from one window which has set a special
-pointer image to a surface that doesn't set an image in response to
-the motion event? The new surface will be stuck with the special
-pointer image. We can't just revert the pointer image on leaving a
-surface, since if we immediately enter a surface that sets a different
-image, the image will flicker. Broken app, I suppose.
-
-\subsection{Output}
-
-An output is a global object, advertised at connect time or as outputs
-come and go.
-
-\begin{tabular}{l}
- \hline
- Interface \texttt{output} \\ \hline
- Requests \\ \hline
- no requests \\ \hline
- Events \\ \hline
- \texttt{geometry(width, height)} \\ \hline
-\end{tabular}
-
-\begin{itemize}
-\item laid out in a big (compositor) coordinate system
-\item basically xrandr over wayland
-\item geometry needs position in compositor coordinate system
-\item events to advertise available modes, requests to move and change
- modes
-\end{itemize}
-
-\subsection{Shared object cache}
-
-Cache for sharing glyphs, icons, cursors across clients. Lets clients
-share identical objects. The cache is a global object, advertised at
-connect time.
-
-\begin{tabular}{l}
- \hline
- Interface \texttt{cache} \\ \hline
- Requests \\ \hline
- \texttt{upload(key, visual, bo, stride, width, height)} \\ \hline
- Events \\ \hline
- \texttt{item(key, bo, x, y, stride)} \\
- \texttt{retire(bo)} \\ \hline
-\end{tabular}
-
-\begin{itemize}
-
-\item Upload by passing a visual, bo, stride, width, height to the
- cache.
-
-\item Upload returns a bo name, stride, and x, y location of object in
- the buffer. Clients take a reference on the atlas bo.
-
-\item Shared objects are refcounted, freed by client (when purging
- glyphs from the local cache) or when a client exits.
-
-\item Server can't delete individual items from an atlas, but it can
- throw out an entire atlas bo if it becomes too sparse. The server
- sends out a \texttt{retire} event when this happens, and clients
- must throw away any objects from that bo and reupload. Between the
- server dropping the atlas and the client receiving the retire event,
- clients can still legally use the old atlas since they have a ref on
- the bo.
-
-\item cairo needs to hook into the glyph cache, and maybe also a way
- to create a read-only surface based on an object from the cache
- (icons).
-
- \texttt{cairo\_wayland\_create\_cached\_surface(surface-data)}.
-
-\end{itemize}
-
-
-\subsection{Drag and Drop}
-
-Multi-device aware. Orthogonal to rest of wayland, as it is its own
-toplevel object. Since the compositor determines the drag target, it
-works with transformed surfaces (dragging to a scaled down window in
-expose mode, for example).
-
-Issues:
-
-\begin{itemize}
-\item we can set the cursor image to the current cursor + dragged
- object, which will last as long as the drag, but maybe a request to
- attach an image to the cursor will be more convenient?
-
-\item Should drag.send() destroy the object? There's nothing to do
- after the data has been transferred.
-
-\item How do we marshal several mime-types? We could make the drag
- setup a multi-step operation: dnd.create, drag.offer(mime-type1),
- drag.offer(mime-type2), drag.activate(). The drag object could send
- multiple offer events on each motion event. Or we could just
- implement an array type, but that's a pain to work with.
-
-\item Middle-click drag to pop up menu? Ctrl/Shift/Alt drag?
-
-\item Send a file descriptor over the protocol to let initiator and
- source exchange data out of band?
-
-\item Action? Specify action when creating the drag object? Ask
- action?
-\end{itemize}
-
-New objects, requests and events:
-
-\begin{itemize}
-\item New toplevel dnd global. One method, creates a drag object:
- \texttt{dnd.start(new object id, surface, input device, mime
- types)}. Starts drag for the device, if it's grabbed by the
- surface. The drag ends when the button is released. Caller is responsible
- for destroying the drag object.
-
-\item Drag object methods:
-
- \texttt{drag.destroy(id)}, destroy drag object.
-
- \texttt{drag.send(id, data)}, send drag data.
-
- \texttt{drag.accept(id, mime type)}, accept drag offer, called by
- target surface.
-
-\item Drag object events:
-
- \texttt{drag.offer(id, mime-types)}, sent to potential destination
- surfaces to offer drag data. If the device leaves the window or the
- originator cancels the drag, this event is sent with mime-types =
- NULL.
-
- \texttt{drag.target(id, mime-type)}, sent to drag originator when a
- target surface has accepted the offer. If a previous target goes
- away, this event is sent with mime-type = NULL.
-
- \texttt{drag.data(id, data)}, sent to target, contains dragged data.
- Ends the transaction on the target side.
-\end{itemize}
-
-Sequence of events:
-
-\begin{itemize}
-\item The initiator surface receives a click (which grabs the input
- device to that surface) and then enough motion to decide that a drag
- is starting. Wayland has no subwindows, so it's entirely up to the
- application to decide whether or not a draggable object within the
- surface was clicked.
-
-\item The initiator creates a drag object by calling the
- \texttt{create\_drag} method on the dnd global object. As for any
- client created object, the client allocates the id. The
- \texttt{create\_drag} method also takes the originating surface, the
- device that's dragging and the mime-types supported. If the surface
- has indeed grabbed the device passed in, the server will create an
- active drag object for the device. If the grab was released in the
- meantime, the drag object will be inactive, that is, the same state
- as when the grab is released. In that case, the client will receive
- a button up event, which will let it know that the drag finished.
- To the client it will look like the drag was immediately cancelled
- by the grab ending.
-
- The special mime-type application/x-root-target indicates that the
- initiator is looking for drag events to the root window as well.
-
-\item To indicate the object being dragged, the initiator can replace
- the pointer image with a larger image representing the data being
- dragged with the cursor image overlaid. The pointer image will
- remain in place as long as the grab is in effect, since the
- initiating surface keeps pointer focus, and no other surface
- receives enter events.
-
-\item As long as the grab is active (or until the initiator cancels
- the drag by destroying the drag object), the drag object will send
- \texttt{offer} events to surfaces it moves across. As for motion
- events, these events contain the surface local coordinates of the
- device as well as the list of mime-types offered. When a device
- leaves a surface, it will send an \texttt{offer} event with an empty
- list of mime-types to indicate that the device left the surface.
-
-\item If a surface receives an offer event and decides that it's in an
- area that can accept a drag event, it should call the
- \texttt{accept} method on the drag object in the event. The surface
- passes a mime-type in the request, picked from the list in the offer
- event, to indicate which of the types it wants. At this point, the
- surface can update the appearance of the drop target to give
- feedback to the user that the drag has a valid target. If the
- \texttt{offer} event moves to a different drop target (the surface
- decides the offer coordinates are outside the drop target) or leaves
- the surface (the offer event has an empty list of mime-types) it
- should revert the appearance of the drop target to the inactive
- state. A surface can also decide to retract its drop target (if the
- drop target disappears or moves, for example), by calling the accept
- method with a NULL mime-type.
-
-\item When a target surface sends an \texttt{accept} request, the drag
- object will send a \texttt{target} event to the initiator surface.
- This tells the initiator that the drag currently has a potential
- target and which of the offered mime-types the target wants. The
- initiator can change the pointer image or drag source appearance to
- reflect this new state. If the target surface retracts its drop
- target or if the surface disappears, a \texttt{target} event with a
- NULL mime-type will be sent.
-
- If the initiator listed application/x-root-target as a valid
- mime-type, dragging into the root window will make the drag object
- send a \texttt{target} event with the application/x-root-target
- mime-type.
-
-\item When the grab is released (indicated by the button release
- event), if the drag has an active target, the initiator calls the
- \texttt{send} method on the drag object to send the data to be
- transferred by the drag operation, in the format requested by the
- target. The initiator can then destroy the drag object by calling
- the \texttt{destroy} method.
-
-\item The drop target receives a \texttt{data} event from the drag
- object with the requested data.
-\end{itemize}
-
-MIME is defined in RFCs 2045-2049. A registry of MIME types is
-maintained by the Internet Assigned Numbers Authority (IANA).
-
-ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
-
-
-\section{Types of compositors}
-
-\subsection{System Compositor}
-
-\begin{itemize}
-\item ties in with graphical boot
-\item hosts different types of session compositors
-\item lets us switch between multiple sessions (fast user switching,
- secure/personal desktop switching)
-\item multiseat
-\item linux implementation using libudev, egl, kms, evdev, cairo
-\item for fullscreen clients, the system compositor can reprogram the
- video scanout address to source from the client provided buffer.
-\end{itemize}
-
-\subsection{Session Compositor}
-
-\begin{itemize}
-\item nested under the system compositor; nesting is feasible because
- the protocol is asynchronous, whereas round trips would break nesting
-\item gnome-shell
-\item moblin
-\item compiz?
-\item kde compositor?
-\item text mode using vte
-\item rdp session
-\item fullscreen X session under wayland
-\item can run without system compositor, on the hw where it makes
- sense
-\item root window less X server, bridging X windows into a wayland
- session compositor
-\end{itemize}
-
-\subsection{Embedding Compositor}
-
-X11 lets clients embed windows from other clients, or lets clients copy
-pixmap contents rendered by another client into their window. This is
-often used for applets in a panel, browser plugins and similar.
-Wayland doesn't directly allow this, but clients can communicate GEM
-buffer names out-of-band, for example, using d-bus or as command line
-arguments when the panel launches the applet. Another option is to
-use a nested wayland instance. For this, the wayland server will have
-to be a library that the host application links to. The host
-application will then pass the wayland server socket name to the
-embedded application, and will need to implement the wayland
-compositor interface. The host application composites the client
-surfaces as part of its window, that is, in the web page or in the
-panel. The benefit of nesting the wayland server is that it provides
-the requests the embedded client needs to inform the host about buffer
-updates and a mechanism for forwarding input events from the host
-application.
-
-\begin{itemize}
-\item firefox embedding flash by being a special purpose compositor to
- the plugin
-\end{itemize}
-
-\section{Implementation}
-
-what's currently implemented
-
-\subsection{Wayland Server Library}
-
-\texttt{libwayland-server.so}
-
-\begin{itemize}
-\item implements protocol side of a compositor
-\item minimal, doesn't include any rendering or input device handling
-\item helpers for running on egl and evdev, and for nested wayland
-\end{itemize}
-
-\subsection{Wayland Client Library}
-
-\texttt{libwayland.so}
-
-\begin{itemize}
-\item minimal, designed to support integration with real toolkits such as
- Qt, GTK+ or Clutter.
-
-\item doesn't cache state, but lets the toolkits cache server state in
- native objects (GObject or QObject or whatever).
-\end{itemize}
-
-\subsection{Wayland System Compositor}
-
-\begin{itemize}
-\item implementation of the system compositor
-
-\item uses libudev, eagle (egl), evdev and drm
-
-\item integrates with ConsoleKit, can create new sessions
-
-\item allows multi seat setups
-
-\item configurable through udev rules and maybe /etc/wayland.d type thing
-\end{itemize}
-
-\subsection{X Server Session}
-
-\begin{itemize}
-\item xserver module and driver support
-
-\item uses wayland client library
-
-\item same X.org server as we normally run, the front buffer is a wayland
- surface but all accel code, 3d and extensions are there
-
-\item when full screen the session compositor will scan out from the X
- server wayland surface, at which point X is running pretty much as it
- does natively.
-\end{itemize}
-
-\end{document}