doc: add documentation for accel subsystem

author Oded Gabbay <ogabbay@kernel.org>

Sat, 19 Nov 2022 19:54:20 +0000 (21:54 +0200)

committer Oded Gabbay <ogabbay@kernel.org>

Tue, 22 Nov 2022 11:14:52 +0000 (13:14 +0200)
author Oded Gabbay <ogabbay@kernel.org>
Sat, 19 Nov 2022 19:54:20 +0000 (21:54 +0200)
committer Oded Gabbay <ogabbay@kernel.org>
Tue, 22 Nov 2022 11:14:52 +0000 (13:14 +0200)
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst

new file mode 100644 (file)

index 0000000..2b43c9a
--- /dev/null
+++ b/Documentation/accel/index.rst
@@ -0,0 +1,17 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Compute Accelerators
+====================
+
+.. toctree::
+   :maxdepth: 1
+
+   introduction
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/accel/introduction.rst b/Documentation/accel/introduction.rst

new file mode 100644 (file)

index 0000000..6f31af1
--- /dev/null
+++ b/Documentation/accel/introduction.rst
@@ -0,0 +1,110 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+Introduction
+============
+
+The Linux compute accelerators subsystem is designed to expose compute
+accelerators in a common way to user-space and provide a common set of
+functionality.
+
+These devices can be either stand-alone ASICs or IP blocks inside an SoC/GPU.
+Although these devices are typically designed to accelerate
+Machine-Learning (ML) and/or Deep-Learning (DL) computations, the accel layer
+is not limited to handling these types of accelerators.
+
+Typically, a compute accelerator will belong to one of the following
+categories:
+
+- Edge AI - doing inference at an edge device. It can be an embedded ASIC/FPGA,
+  or an IP inside a SoC (e.g. laptop web camera). These devices
+  are typically configured using registers and can work with or without DMA.
+
+- Inference data-center - single/multi user devices in a large server. This
+  type of device can be stand-alone or an IP inside a SoC or a GPU. It will
+  have on-board DRAM (to hold the DL topology), DMA engines and
+  command submission queues (either kernel or user-space queues).
+  It might also have an MMU to manage multiple users and might also enable
+  virtualization (SR-IOV) to support multiple VMs on the same device. In
+  addition, these devices will usually have some tools, such as profiler and
+  debugger.
+
+- Training data-center - Similar to Inference data-center cards, but typically
+  have more computational power and memory b/w (e.g. HBM) and will likely have
+  a method of scaling-up/out, i.e. connecting to other training cards inside
+  the server or in other servers, respectively.
+
+All these devices typically have different runtime user-space software stacks,
+that are tailored-made to their h/w. In addition, they will also probably
+include a compiler to generate programs to their custom-made computational
+engines. Typically, the common layer in user-space will be the DL frameworks,
+such as PyTorch and TensorFlow.
+
+Sharing code with DRM
+=====================
+
+Because this type of devices can be an IP inside GPUs or have similar
+characteristics as those of GPUs, the accel subsystem will use the
+DRM subsystem's code and functionality. i.e. the accel core code will
+be part of the DRM subsystem and an accel device will be a new type of DRM
+device.
+
+This will allow us to leverage the extensive DRM code-base and
+collaborate with DRM developers that have experience with this type of
+devices. In addition, new features that will be added for the accelerator
+drivers can be of use to GPU drivers as well.
+
+Differentiation from GPUs
+=========================
+
+Because we want to prevent the extensive user-space graphic software stack
+from trying to use an accelerator as a GPU, the compute accelerators will be
+differentiated from GPUs by using a new major number and new device char files.
+
+Furthermore, the drivers will be located in a separate place in the kernel
+tree - drivers/accel/.
+
+The accelerator devices will be exposed to the user space with the dedicated
+261 major number and will have the following convention:
+
+- device char files - /dev/accel/accel*
+- sysfs             - /sys/class/accel/accel*/
+- debugfs           - /sys/kernel/debug/accel/accel*/
+
+Getting Started
+===============
+
+First, read the DRM documentation at Documentation/gpu/index.rst.
+Not only it will explain how to write a new DRM driver but it will also
+contain all the information on how to contribute, the Code Of Conduct and
+what is the coding style/documentation. All of that is the same for the
+accel subsystem.
+
+Second, make sure the kernel is configured with CONFIG_DRM_ACCEL.
+
+To expose your device as an accelerator, two changes are needed to
+be done in your driver (as opposed to a standard DRM driver):
+
+- Add the DRIVER_COMPUTE_ACCEL feature flag in your drm_driver's
+  driver_features field. It is important to note that this driver feature is
+  mutually exclusive with DRIVER_RENDER and DRIVER_MODESET. Devices that want
+  to expose both graphics and compute device char files should be handled by
+  two drivers that are connected using the auxiliary bus framework.
+
+- Change the open callback in your driver fops structure to accel_open().
+  Alternatively, your driver can use DEFINE_DRM_ACCEL_FOPS macro to easily
+  set the correct function operations pointers structure.
+
+External References
+===================
+
+email threads
+-------------
+
+* `Initial discussion on the New subsystem for acceleration devices <https://lkml.org/lkml/2022/7/31/83>`_ - Oded Gabbay (2022)
+* `patch-set to add the new subsystem <https://lkml.org/lkml/2022/10/22/544>`_ - Oded Gabbay (2022)
+
+Conference talks
+----------------
+
+* `LPC 2022 Accelerators BOF outcomes summary <https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html>`_ - Dave Airlie (2022)
diff --git a/Documentation/subsystem-apis.rst b/Documentation/subsystem-apis.rst

index af65004..b51f385 100644 (file)
--- a/Documentation/subsystem-apis.rst
+++ b/Documentation/subsystem-apis.rst
@@ -43,6 +43,7 @@ needed).
     input/index
     hwmon/index
     gpu/index
+   accel/index
     security/index
     sound/index
     crypto/index
diff --git a/MAINTAINERS b/MAINTAINERS

index 6c17511..29fff3b 100644 (file)
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6839,6 +6839,7 @@ L:        dri-devel@lists.freedesktop.org
  S:     Maintained
  C:     irc://irc.oftc.net/dri-devel
  T:     git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git
+F:     Documentation/accel/
  F:     drivers/accel/
  
  DRM DRIVERS FOR ALLWINNER A10
author	Oded Gabbay <ogabbay@kernel.org>
	Sat, 19 Nov 2022 19:54:20 +0000 (21:54 +0200)
committer	Oded Gabbay <ogabbay@kernel.org>
	Tue, 22 Nov 2022 11:14:52 +0000 (13:14 +0200)
Documentation/accel/index.rst	[new file with mode: 0644]	patch \| blob
Documentation/accel/introduction.rst	[new file with mode: 0644]	patch \| blob
Documentation/subsystem-apis.rst		patch \| blob \| history
MAINTAINERS		patch \| blob \| history