Nicolas Saenz Julienne <nsaenz@kernel.org> <nsaenzjulienne@suse.de>
Nicolas Saenz Julienne <nsaenz@kernel.org> <nsaenzjulienne@suse.com>
Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
+Oleksandr Natalenko <oleksandr@natalenko.name> <oleksandr@redhat.com>
Oleksij Rempel <linux@rempel-privat.de> <bug-track@fisher-privat.net>
Oleksij Rempel <linux@rempel-privat.de> <external.Oleksij.Rempel@de.bosch.com>
Oleksij Rempel <linux@rempel-privat.de> <fixed-term.Oleksij.Rempel@de.bosch.com>
Paul E. McKenney <paulmck@kernel.org> <paulmck@linux.ibm.com>
Paul E. McKenney <paulmck@kernel.org> <paulmck@linux.vnet.ibm.com>
Paul E. McKenney <paulmck@kernel.org> <paulmck@us.ibm.com>
+Paul Mackerras <paulus@ozlabs.org> <paulus@samba.org>
+Paul Mackerras <paulus@ozlabs.org> <paulus@au1.ibm.com>
Peter A Jonsson <pj@ludd.ltu.se>
Peter Oruba <peter.oruba@amd.com>
Peter Oruba <peter@oruba.de>
e) thrashing
f) direct compact
g) write-protect copy
+h) IRQ/SOFTIRQ
and makes these statistics available to userspace through
the taskstats interface.
for a description of the fields pertaining to delay accounting.
It will generally be in the form of counters returning the cumulative
delay seen for cpu, sync block I/O, swapin, memory reclaim, thrash page
-cache, direct compact, write-protect copy etc.
+cache, direct compact, write-protect copy, IRQ/SOFTIRQ etc.
Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
CPU count real total virtual total delay total delay average
8 7000000 6872122 3382277 0.423ms
IO count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
SWAP count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
RECLAIM count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
THRASHING count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
COMPACT count delay total delay average
- 0 0 0ms
- WPCOPY count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
+ WPCOPY count delay total delay average
+ 0 0 0.000ms
+ IRQ count delay total delay average
+ 0 0 0.000ms
Get IO accounting for pid 1, it works only with -p::
-kcov: code coverage for fuzzing
+KCOV: code coverage for fuzzing
===============================
-kcov exposes kernel code coverage information in a form suitable for coverage-
-guided fuzzing (randomized testing). Coverage data of a running kernel is
-exported via the "kcov" debugfs file. Coverage collection is enabled on a task
-basis, and thus it can capture precise coverage of a single system call.
+KCOV collects and exposes kernel code coverage information in a form suitable
+for coverage-guided fuzzing. Coverage data of a running kernel is exported via
+the ``kcov`` debugfs file. Coverage collection is enabled on a task basis, and
+thus KCOV can capture precise coverage of a single system call.
-Note that kcov does not aim to collect as much coverage as possible. It aims
-to collect more or less stable coverage that is function of syscall inputs.
-To achieve this goal it does not collect coverage in soft/hard interrupts
-and instrumentation of some inherently non-deterministic parts of kernel is
-disabled (e.g. scheduler, locking).
+Note that KCOV does not aim to collect as much coverage as possible. It aims
+to collect more or less stable coverage that is a function of syscall inputs.
+To achieve this goal, it does not collect coverage in soft/hard interrupts
+(unless remove coverage collection is enabled, see below) and from some
+inherently non-deterministic parts of the kernel (e.g. scheduler, locking).
-kcov is also able to collect comparison operands from the instrumented code
-(this feature currently requires that the kernel is compiled with clang).
+Besides collecting code coverage, KCOV can also collect comparison operands.
+See the "Comparison operands collection" section for details.
+
+Besides collecting coverage data from syscall handlers, KCOV can also collect
+coverage for annotated parts of the kernel executing in background kernel
+tasks or soft interrupts. See the "Remote coverage collection" section for
+details.
Prerequisites
-------------
-Configure the kernel with::
+KCOV relies on compiler instrumentation and requires GCC 6.1.0 or later
+or any Clang version supported by the kernel.
- CONFIG_KCOV=y
+Collecting comparison operands is supported with GCC 8+ or with Clang.
-CONFIG_KCOV requires gcc 6.1.0 or later.
+To enable KCOV, configure the kernel with::
-If the comparison operands need to be collected, set::
+ CONFIG_KCOV=y
+
+To enable comparison operands collection, set::
CONFIG_KCOV_ENABLE_COMPARISONS=y
-Profiling data will only become accessible once debugfs has been mounted::
+Coverage data only becomes accessible once debugfs has been mounted::
mount -t debugfs none /sys/kernel/debug
Coverage collection
-------------------
-The following program demonstrates coverage collection from within a test
-program using kcov:
+The following program demonstrates how to use KCOV to collect coverage for a
+single syscall from within a test program:
.. code-block:: c
perror("ioctl"), exit(1);
/* Reset coverage from the tail of the ioctl() call. */
__atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
- /* That's the target syscal call. */
+ /* Call the target syscall call. */
read(-1, NULL, 0);
/* Read number of PCs collected. */
n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
return 0;
}
-After piping through addr2line output of the program looks as follows::
+After piping through ``addr2line`` the output of the program looks as follows::
SyS_read
fs/read_write.c:562
fs/read_write.c:562
If a program needs to collect coverage from several threads (independently),
-it needs to open /sys/kernel/debug/kcov in each thread separately.
+it needs to open ``/sys/kernel/debug/kcov`` in each thread separately.
The interface is fine-grained to allow efficient forking of test processes.
-That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode,
-mmaps coverage buffer and then forks child processes in a loop. Child processes
-only need to enable coverage (disable happens automatically on thread end).
+That is, a parent process opens ``/sys/kernel/debug/kcov``, enables trace mode,
+mmaps coverage buffer, and then forks child processes in a loop. The child
+processes only need to enable coverage (it gets disabled automatically when
+a thread exits).
Comparison operands collection
------------------------------
return 0;
}
-Note that the kcov modes (coverage collection or comparison operands) are
-mutually exclusive.
+Note that the KCOV modes (collection of code coverage or comparison operands)
+are mutually exclusive.
Remote coverage collection
--------------------------
-With KCOV_ENABLE coverage is collected only for syscalls that are issued
-from the current process. With KCOV_REMOTE_ENABLE it's possible to collect
-coverage for arbitrary parts of the kernel code, provided that those parts
-are annotated with kcov_remote_start()/kcov_remote_stop().
-
-This allows to collect coverage from two types of kernel background
-threads: the global ones, that are spawned during kernel boot in a limited
-number of instances (e.g. one USB hub_event() worker thread is spawned per
-USB HCD); and the local ones, that are spawned when a user interacts with
-some kernel interface (e.g. vhost workers); as well as from soft
-interrupts.
-
-To enable collecting coverage from a global background thread or from a
-softirq, a unique global handle must be assigned and passed to the
-corresponding kcov_remote_start() call. Then a userspace process can pass
-a list of such handles to the KCOV_REMOTE_ENABLE ioctl in the handles
-array field of the kcov_remote_arg struct. This will attach the used kcov
-device to the code sections, that are referenced by those handles.
-
-Since there might be many local background threads spawned from different
-userspace processes, we can't use a single global handle per annotation.
-Instead, the userspace process passes a non-zero handle through the
-common_handle field of the kcov_remote_arg struct. This common handle gets
-saved to the kcov_handle field in the current task_struct and needs to be
-passed to the newly spawned threads via custom annotations. Those threads
-should in turn be annotated with kcov_remote_start()/kcov_remote_stop().
-
-Internally kcov stores handles as u64 integers. The top byte of a handle
-is used to denote the id of a subsystem that this handle belongs to, and
-the lower 4 bytes are used to denote the id of a thread instance within
-that subsystem. A reserved value 0 is used as a subsystem id for common
-handles as they don't belong to a particular subsystem. The bytes 4-7 are
-currently reserved and must be zero. In the future the number of bytes
-used for the subsystem or handle ids might be increased.
-
-When a particular userspace process collects coverage via a common
-handle, kcov will collect coverage for each code section that is annotated
-to use the common handle obtained as kcov_handle from the current
-task_struct. However non common handles allow to collect coverage
-selectively from different subsystems.
+Besides collecting coverage data from handlers of syscalls issued from a
+userspace process, KCOV can also collect coverage for parts of the kernel
+executing in other contexts - so-called "remote" coverage.
+
+Using KCOV to collect remote coverage requires:
+
+1. Modifying kernel code to annotate the code section from where coverage
+ should be collected with ``kcov_remote_start`` and ``kcov_remote_stop``.
+
+2. Using ``KCOV_REMOTE_ENABLE`` instead of ``KCOV_ENABLE`` in the userspace
+ process that collects coverage.
+
+Both ``kcov_remote_start`` and ``kcov_remote_stop`` annotations and the
+``KCOV_REMOTE_ENABLE`` ioctl accept handles that identify particular coverage
+collection sections. The way a handle is used depends on the context where the
+matching code section executes.
+
+KCOV supports collecting remote coverage from the following contexts:
+
+1. Global kernel background tasks. These are the tasks that are spawned during
+ kernel boot in a limited number of instances (e.g. one USB ``hub_event``
+ worker is spawned per one USB HCD).
+
+2. Local kernel background tasks. These are spawned when a userspace process
+ interacts with some kernel interface and are usually killed when the process
+ exits (e.g. vhost workers).
+
+3. Soft interrupts.
+
+For #1 and #3, a unique global handle must be chosen and passed to the
+corresponding ``kcov_remote_start`` call. Then a userspace process must pass
+this handle to ``KCOV_REMOTE_ENABLE`` in the ``handles`` array field of the
+``kcov_remote_arg`` struct. This will attach the used KCOV device to the code
+section referenced by this handle. Multiple global handles identifying
+different code sections can be passed at once.
+
+For #2, the userspace process instead must pass a non-zero handle through the
+``common_handle`` field of the ``kcov_remote_arg`` struct. This common handle
+gets saved to the ``kcov_handle`` field in the current ``task_struct`` and
+needs to be passed to the newly spawned local tasks via custom kernel code
+modifications. Those tasks should in turn use the passed handle in their
+``kcov_remote_start`` and ``kcov_remote_stop`` annotations.
+
+KCOV follows a predefined format for both global and common handles. Each
+handle is a ``u64`` integer. Currently, only the one top and the lower 4 bytes
+are used. Bytes 4-7 are reserved and must be zero.
+
+For global handles, the top byte of the handle denotes the id of a subsystem
+this handle belongs to. For example, KCOV uses ``1`` as the USB subsystem id.
+The lower 4 bytes of a global handle denote the id of a task instance within
+that subsystem. For example, each ``hub_event`` worker uses the USB bus number
+as the task instance id.
+
+For common handles, a reserved value ``0`` is used as a subsystem id, as such
+handles don't belong to a particular subsystem. The lower 4 bytes of a common
+handle identify a collective instance of all local tasks spawned by the
+userspace process that passed a common handle to ``KCOV_REMOTE_ENABLE``.
+
+In practice, any value can be used for common handle instance id if coverage
+is only collected from a single userspace process on the system. However, if
+common handles are used by multiple processes, unique instance ids must be
+used for each process. One option is to use the process id as the common
+handle instance id.
+
+The following program demonstrates using KCOV to collect coverage from both
+local tasks spawned by the process and the global task that handles USB bus #1:
.. code-block:: c
Gid: 100 100 100 100
FDSize: 256
Groups: 100 14 16
+ Kthread: 0
VmPeak: 5004 kB
VmSize: 5004 kB
VmLck: 0 kB
NSpid descendant namespace process ID hierarchy
NSpgid descendant namespace process group ID hierarchy
NSsid descendant namespace session ID hierarchy
+ Kthread kernel thread flag, 1 is yes, 0 is no
VmPeak peak virtual memory size
VmSize total program size
VmLck locked memory size
Fixes: 1f2e3d4c5b6a ("The first line of the commit specified by the first 12 characters of its SHA-1 ID")
Another tag is used for linking web pages with additional backgrounds or
-details, for example a report about a bug fixed by the patch or a document
-with a specification implemented by the patch::
+details, for example an earlier discussion which leads to the patch or a
+document with a specification implemented by the patch::
Link: https://example.com/somewhere.html optional-other-stuff
by tools like b4 or a git hook like the one described in
'Documentation/maintainer/configure-git.rst'.
-A third kind of tag is used to document who was involved in the development of
+If the URL points to a public bug report being fixed by the patch, use the
+"Closes:" tag instead::
+
+ Closes: https://example.com/issues/1234 optional-other-stuff
+
+Some bug trackers have the ability to close issues automatically when a
+commit with such a tag is applied. Some bots monitoring mailing lists can
+also track such tags and take certain actions. Private bug trackers and
+invalid URLs are forbidden.
+
+Another kind of tag is used to document who was involved in the development of
the patch. Each of these uses this format::
tag: Full Name <email address> optional-other-stuff
- Reported-by: names a user who reported a problem which is fixed by this
patch; this tag is used to give credit to the (often underappreciated)
people who test our code and let us know when things do not work
- correctly. Note, this tag should be followed by a Link: tag pointing to the
- report, unless the report is not available on the web.
+ correctly. Note, this tag should be followed by a Closes: tag pointing to
+ the report, unless the report is not available on the web. The Link: tag
+ can be used instead of Closes: if the patch fixes a part of the issue(s)
+ being reported.
- Cc: the named person received a copy of the patch and had the
opportunity to comment on it.
change five years from now.
If related discussions or any other background information behind the change
-can be found on the web, add 'Link:' tags pointing to it. In case your patch
-fixes a bug, for example, add a tag with a URL referencing the report in the
-mailing list archives or a bug tracker; if the patch is a result of some
-earlier mailing list discussion or something documented on the web, point to
-it.
+can be found on the web, add 'Link:' tags pointing to it. If the patch is a
+result of some earlier mailing list discussions or something documented on the
+web, point to it.
When linking to mailing list archives, preferably use the lore.kernel.org
message archiver service. To create the link URL, use the contents of the
summarize the relevant points of the discussion that led to the
patch as submitted.
+In case your patch fixes a bug, use the 'Closes:' tag with a URL referencing
+the report in the mailing list archives or a public bug tracker. For example::
+
+ Closes: https://example.com/issues/1234
+
+Some bug trackers have the ability to close issues automatically when a
+commit with such a tag is applied. Some bots monitoring mailing lists can
+also track such tags and take certain actions. Private bug trackers and
+invalid URLs are forbidden.
+
If your patch fixes a bug in a specific commit, e.g. you found an issue using
``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
the SHA-1 ID, and the one line summary. Do not split the tag across multiple
The Reported-by tag gives credit to people who find bugs and report them and it
hopefully inspires them to help us again in the future. The tag is intended for
bugs; please do not use it to credit feature requests. The tag should be
-followed by a Link: tag pointing to the report, unless the report is not
-available on the web. Please note that if the bug was reported in private, then
-ask for permission first before using the Reported-by tag.
+followed by a Closes: tag pointing to the report, unless the report is not
+available on the web. The Link: tag can be used instead of Closes: if the patch
+fixes a part of the issue(s) being reported. Please note that if the bug was
+reported in private, then ask for permission first before using the Reported-by
+tag.
A Tested-by: tag indicates that the patch has been successfully tested (in
some environment) by the person named. This tag informs maintainers that
CPU count real total virtual total delay total delay average
8 7000000 6872122 3382277 0.423ms
IO count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
SWAP count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
RECLAIM count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
THRASHING count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
COMPACT count delay total delay average
- 0 0 0ms
+ 0 0 0.000ms
WPCOPY count delay total delay average
0 0 0ms
F: Documentation/admin-guide/media/em28xx*
F: drivers/media/usb/em28xx/
-EMBEDDED LINUX
-M: Olivia Mackall <olivia@selenic.com>
-M: David Woodhouse <dwmw2@infradead.org>
-L: linux-embedded@vger.kernel.org
-S: Maintained
-
EMMC CMDQ HOST CONTROLLER INTERFACE (CQHCI) DRIVER
M: Adrian Hunter <adrian.hunter@intel.com>
M: Ritesh Harjani <riteshh@codeaurora.org>
* 'data' contains an integer that corresponds to the feature we're
* testing
*/
-static int proc_salinfo_show(struct seq_file *m, void *v)
+static int __maybe_unused proc_salinfo_show(struct seq_file *m, void *v)
{
unsigned long data = (unsigned long)v;
seq_puts(m, (sal_platform_features & data) ? "1\n" : "0\n");
return __per_cpu_start + __per_cpu_offset[smp_processor_id()];
}
-static inline void
+static inline __init void
alloc_per_cpu_data(void)
{
size_t size = PERCPU_PAGE_SIZE * num_possible_cpus();
pgd = pgd_offset(mm, taddr);
if (pgd_present(*pgd)) {
- p4d = p4d_offset(pgd, addr);
+ p4d = p4d_offset(pgd, taddr);
if (p4d_present(*p4d)) {
pud = pud_offset(p4d, taddr);
if (pud_present(*pud)) {
die("Unknown ELF version\n");
if (ehdr.e_ehsize != sizeof(Elf_Ehdr))
- die("Bad Elf header size\n");
+ die("Bad ELF header size\n");
if (ehdr.e_phentsize != sizeof(Elf_Phdr))
die("Bad program header entry\n");
/*
* arch/um/kernel/elf_aux.c
*
- * Scan the Elf auxiliary vector provided by the host to extract
+ * Scan the ELF auxiliary vector provided by the host to extract
* information about vsyscall-page, etc.
*
* Copyright (C) 2004 Fujitsu Siemens Computers GmbH
const Elf_Shdr *symtab);
#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
-void *arch_kexec_kernel_image_load(struct kimage *image);
-#define arch_kexec_kernel_image_load arch_kexec_kernel_image_load
-
int arch_kimage_file_post_load_cleanup(struct kimage *image);
#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
#endif
/* arch-dependent functionality related to kexec file-based syscall */
#ifdef CONFIG_KEXEC_FILE
-void *arch_kexec_kernel_image_load(struct kimage *image)
-{
- if (!image->fops || !image->fops->load)
- return ERR_PTR(-ENOEXEC);
-
- return image->fops->load(image, image->kernel_buf,
- image->kernel_buf_len, image->initrd_buf,
- image->initrd_buf_len, image->cmdline_buf,
- image->cmdline_buf_len);
-}
-
/*
* Apply purgatory relocations.
*
if (ehdr.e_version != EV_CURRENT)
die("Unknown ELF version\n");
if (ehdr.e_ehsize != sizeof(Elf_Ehdr))
- die("Bad Elf header size\n");
+ die("Bad ELF header size\n");
if (ehdr.e_phentsize != sizeof(Elf_Phdr))
die("Bad program header entry\n");
if (ehdr.e_shentsize != sizeof(Elf_Shdr))
*/
u8 dca_get_tag(int cpu)
{
- struct device *dev = NULL;
-
- return dca_common_get_tag(dev, cpu);
+ return dca_common_get_tag(NULL, cpu);
}
EXPORT_SYMBOL_GPL(dca_get_tag);
iounmap(priv->odb_base);
err_free_res:
pci_release_regions(pdev);
- pci_clear_master(pdev);
err_disable_pdev:
pci_disable_device(pdev);
err_clean:
pci_disable_msi(priv->pdev);
#endif
pci_release_regions(pdev);
- pci_clear_master(pdev);
pci_disable_device(pdev);
pci_set_drvdata(pdev, NULL);
kfree(priv);
tsi721_disable_ints(priv);
tsi721_dma_stop_all(priv);
- pci_clear_master(pdev);
pci_disable_device(pdev);
}
return;
if (class == ELFCLASSNONE) {
- dev_err(&rproc->dev, "Elf class is not set\n");
+ dev_err(&rproc->dev, "ELF class is not set\n");
return;
}
return;
if (class == ELFCLASSNONE) {
- dev_err(&rproc->dev, "Elf class is not set\n");
+ dev_err(&rproc->dev, "ELF class is not set\n");
return;
}
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Remote Processor Framework Elf loader
+ * Remote Processor Framework ELF loader
*
* Copyright (C) 2011 Texas Instruments, Inc.
* Copyright (C) 2011 Google, Inc.
const char *name = rproc->firmware;
struct device *dev = &rproc->dev;
/*
- * Elf files are beginning with the same structure. Thus, to simplify
+ * ELF files are beginning with the same structure. Thus, to simplify
* header parsing, we can use the elf32_hdr one for both elf64 and
* elf32.
*/
has_dumped = 1;
- offset += sizeof(elf); /* Elf header */
+ offset += sizeof(elf); /* ELF header */
offset += segs * sizeof(struct elf_phdr); /* Program headers */
/* Write notes phdr entry */
fill_note(&auxv_note, "CORE", NT_AUXV, i * sizeof(elf_addr_t), auxv);
thread_status_size += notesize(&auxv_note);
- offset = sizeof(*elf); /* Elf header */
+ offset = sizeof(*elf); /* ELF header */
offset += segs * sizeof(struct elf_phdr); /* Program headers */
/* Write notes phdr entry */
* LOCKING:
* There are three level of locking required by epoll :
*
- * 1) epmutex (mutex)
+ * 1) epnested_mutex (mutex)
* 2) ep->mtx (mutex)
* 3) ep->lock (rwlock)
*
* we need a lock that will allow us to sleep. This lock is a
* mutex (ep->mtx). It is acquired during the event transfer loop,
* during epoll_ctl(EPOLL_CTL_DEL) and during eventpoll_release_file().
- * Then we also need a global mutex to serialize eventpoll_release_file()
- * and ep_free().
- * This mutex is acquired by ep_free() during the epoll file
- * cleanup path and it is also acquired by eventpoll_release_file()
- * if a file has been pushed inside an epoll set and it is then
- * close()d without a previous call to epoll_ctl(EPOLL_CTL_DEL).
- * It is also acquired when inserting an epoll fd onto another epoll
- * fd. We do this so that we walk the epoll tree and ensure that this
+ * The epnested_mutex is acquired when inserting an epoll fd onto another
+ * epoll fd. We do this so that we walk the epoll tree and ensure that this
* insertion does not create a cycle of epoll file descriptors, which
* could lead to deadlock. We need a global mutex to prevent two
* simultaneous inserts (A into B and B into A) from racing and
* of epoll file descriptors, we use the current recursion depth as
* the lockdep subkey.
* It is possible to drop the "ep->mtx" and to use the global
- * mutex "epmutex" (together with "ep->lock") to have it working,
+ * mutex "epnested_mutex" (together with "ep->lock") to have it working,
* but having "ep->mtx" will make the interface more scalable.
- * Events that require holding "epmutex" are very rare, while for
+ * Events that require holding "epnested_mutex" are very rare, while for
* normal operations the epoll private "ep->mtx" will guarantee
* a better scalability.
*/
/* The file descriptor information this item refers to */
struct epoll_filefd ffd;
+ /*
+ * Protected by file->f_lock, true for to-be-released epitem already
+ * removed from the "struct file" items list; together with
+ * eventpoll->refcount orchestrates "struct eventpoll" disposal
+ */
+ bool dying;
+
/* List containing poll wait queues */
struct eppoll_entry *pwqlist;
u64 gen;
struct hlist_head refs;
+ /*
+ * usage count, used together with epitem->dying to
+ * orchestrate the disposal of this struct
+ */
+ refcount_t refcount;
+
#ifdef CONFIG_NET_RX_BUSY_POLL
/* used to track busy poll napi_id */
unsigned int napi_id;
/* Maximum number of epoll watched descriptors, per user */
static long max_user_watches __read_mostly;
-/*
- * This mutex is used to serialize ep_free() and eventpoll_release_file().
- */
-static DEFINE_MUTEX(epmutex);
+/* Used for cycles detection */
+static DEFINE_MUTEX(epnested_mutex);
static u64 loop_check_gen = 0;
/*
* List of files with newly added links, where we may need to limit the number
- * of emanating paths. Protected by the epmutex.
+ * of emanating paths. Protected by the epnested_mutex.
*/
struct epitems_head {
struct hlist_head epitems;
/*
* This function unregisters poll callbacks from the associated file
- * descriptor. Must be called with "mtx" held (or "epmutex" if called from
- * ep_free).
+ * descriptor. Must be called with "mtx" held.
*/
static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi)
{
kmem_cache_free(epi_cache, epi);
}
+static void ep_get(struct eventpoll *ep)
+{
+ refcount_inc(&ep->refcount);
+}
+
+/*
+ * Returns true if the event poll can be disposed
+ */
+static bool ep_refcount_dec_and_test(struct eventpoll *ep)
+{
+ if (!refcount_dec_and_test(&ep->refcount))
+ return false;
+
+ WARN_ON_ONCE(!RB_EMPTY_ROOT(&ep->rbr.rb_root));
+ return true;
+}
+
+static void ep_free(struct eventpoll *ep)
+{
+ mutex_destroy(&ep->mtx);
+ free_uid(ep->user);
+ wakeup_source_unregister(ep->ws);
+ kfree(ep);
+}
+
/*
* Removes a "struct epitem" from the eventpoll RB tree and deallocates
* all the associated resources. Must be called with "mtx" held.
+ * If the dying flag is set, do the removal only if force is true.
+ * This prevents ep_clear_and_put() from dropping all the ep references
+ * while running concurrently with eventpoll_release_file().
+ * Returns true if the eventpoll can be disposed.
*/
-static int ep_remove(struct eventpoll *ep, struct epitem *epi)
+static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
{
struct file *file = epi->ffd.file;
struct epitems_head *to_free;
/* Remove the current item from the list of epoll hooks */
spin_lock(&file->f_lock);
+ if (epi->dying && !force) {
+ spin_unlock(&file->f_lock);
+ return false;
+ }
+
to_free = NULL;
head = file->f_ep;
if (head->first == &epi->fllink && !epi->fllink.next) {
call_rcu(&epi->rcu, epi_rcu_free);
percpu_counter_dec(&ep->user->epoll_watches);
+ return ep_refcount_dec_and_test(ep);
+}
- return 0;
+/*
+ * ep_remove variant for callers owing an additional reference to the ep
+ */
+static void ep_remove_safe(struct eventpoll *ep, struct epitem *epi)
+{
+ WARN_ON_ONCE(__ep_remove(ep, epi, false));
}
-static void ep_free(struct eventpoll *ep)
+static void ep_clear_and_put(struct eventpoll *ep)
{
- struct rb_node *rbp;
+ struct rb_node *rbp, *next;
struct epitem *epi;
+ bool dispose;
/* We need to release all tasks waiting for these file */
if (waitqueue_active(&ep->poll_wait))
ep_poll_safewake(ep, NULL, 0);
- /*
- * We need to lock this because we could be hit by
- * eventpoll_release_file() while we're freeing the "struct eventpoll".
- * We do not need to hold "ep->mtx" here because the epoll file
- * is on the way to be removed and no one has references to it
- * anymore. The only hit might come from eventpoll_release_file() but
- * holding "epmutex" is sufficient here.
- */
- mutex_lock(&epmutex);
+ mutex_lock(&ep->mtx);
/*
* Walks through the whole tree by unregistering poll callbacks.
}
/*
- * Walks through the whole tree by freeing each "struct epitem". At this
- * point we are sure no poll callbacks will be lingering around, and also by
- * holding "epmutex" we can be sure that no file cleanup code will hit
- * us during this operation. So we can avoid the lock on "ep->lock".
- * We do not need to lock ep->mtx, either, we only do it to prevent
- * a lockdep warning.
+ * Walks through the whole tree and try to free each "struct epitem".
+ * Note that ep_remove_safe() will not remove the epitem in case of a
+ * racing eventpoll_release_file(); the latter will do the removal.
+ * At this point we are sure no poll callbacks will be lingering around.
+ * Since we still own a reference to the eventpoll struct, the loop can't
+ * dispose it.
*/
- mutex_lock(&ep->mtx);
- while ((rbp = rb_first_cached(&ep->rbr)) != NULL) {
+ for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = next) {
+ next = rb_next(rbp);
epi = rb_entry(rbp, struct epitem, rbn);
- ep_remove(ep, epi);
+ ep_remove_safe(ep, epi);
cond_resched();
}
+
+ dispose = ep_refcount_dec_and_test(ep);
mutex_unlock(&ep->mtx);
- mutex_unlock(&epmutex);
- mutex_destroy(&ep->mtx);
- free_uid(ep->user);
- wakeup_source_unregister(ep->ws);
- kfree(ep);
+ if (dispose)
+ ep_free(ep);
}
static int ep_eventpoll_release(struct inode *inode, struct file *file)
struct eventpoll *ep = file->private_data;
if (ep)
- ep_free(ep);
+ ep_clear_and_put(ep);
return 0;
}
{
struct eventpoll *ep;
struct epitem *epi;
- struct hlist_node *next;
+ bool dispose;
/*
- * We don't want to get "file->f_lock" because it is not
- * necessary. It is not necessary because we're in the "struct file"
- * cleanup path, and this means that no one is using this file anymore.
- * So, for example, epoll_ctl() cannot hit here since if we reach this
- * point, the file counter already went to zero and fget() would fail.
- * The only hit might come from ep_free() but by holding the mutex
- * will correctly serialize the operation. We do need to acquire
- * "ep->mtx" after "epmutex" because ep_remove() requires it when called
- * from anywhere but ep_free().
- *
- * Besides, ep_remove() acquires the lock, so we can't hold it here.
+ * Use the 'dying' flag to prevent a concurrent ep_clear_and_put() from
+ * touching the epitems list before eventpoll_release_file() can access
+ * the ep->mtx.
*/
- mutex_lock(&epmutex);
- if (unlikely(!file->f_ep)) {
- mutex_unlock(&epmutex);
- return;
- }
- hlist_for_each_entry_safe(epi, next, file->f_ep, fllink) {
+again:
+ spin_lock(&file->f_lock);
+ if (file->f_ep && file->f_ep->first) {
+ epi = hlist_entry(file->f_ep->first, struct epitem, fllink);
+ epi->dying = true;
+ spin_unlock(&file->f_lock);
+
+ /*
+ * ep access is safe as we still own a reference to the ep
+ * struct
+ */
ep = epi->ep;
- mutex_lock_nested(&ep->mtx, 0);
- ep_remove(ep, epi);
+ mutex_lock(&ep->mtx);
+ dispose = __ep_remove(ep, epi, true);
mutex_unlock(&ep->mtx);
+
+ if (dispose)
+ ep_free(ep);
+ goto again;
}
- mutex_unlock(&epmutex);
+ spin_unlock(&file->f_lock);
}
static int ep_alloc(struct eventpoll **pep)
ep->rbr = RB_ROOT_CACHED;
ep->ovflist = EP_UNACTIVE_PTR;
ep->user = user;
+ refcount_set(&ep->refcount, 1);
*pep = ep;
*/
list_del_init(&wait->entry);
/*
- * ->whead != NULL protects us from the race with ep_free()
- * or ep_remove(), ep_remove_wait_queue() takes whead->lock
- * held by the caller. Once we nullify it, nothing protects
- * ep/epi or even wait.
+ * ->whead != NULL protects us from the race with
+ * ep_clear_and_put() or ep_remove(), ep_remove_wait_queue()
+ * takes whead->lock held by the caller. Once we nullify it,
+ * nothing protects ep/epi or even wait.
*/
smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
}
* is connected to n file sources. In this case each file source has 1 path
* of length 1. Thus, the numbers below should be more than sufficient. These
* path limits are enforced during an EPOLL_CTL_ADD operation, since a modify
- * and delete can't add additional paths. Protected by the epmutex.
+ * and delete can't add additional paths. Protected by the epnested_mutex.
*/
static const int path_limits[PATH_ARR_SIZE] = { 1000, 500, 100, 50, 10 };
static int path_count[PATH_ARR_SIZE];
if (tep)
mutex_unlock(&tep->mtx);
+ /*
+ * ep_remove_safe() calls in the later error paths can't lead to
+ * ep_free() as the ep file itself still holds an ep reference.
+ */
+ ep_get(ep);
+
/* now check if we've created too many backpaths */
if (unlikely(full_check && reverse_path_check())) {
- ep_remove(ep, epi);
+ ep_remove_safe(ep, epi);
return -EINVAL;
}
if (epi->event.events & EPOLLWAKEUP) {
error = ep_create_wakeup_source(epi);
if (error) {
- ep_remove(ep, epi);
+ ep_remove_safe(ep, epi);
return error;
}
}
* high memory pressure.
*/
if (unlikely(!epq.epi)) {
- ep_remove(ep, epi);
+ ep_remove_safe(ep, epi);
return -ENOMEM;
}
out_free_fd:
put_unused_fd(fd);
out_free_ep:
- ep_free(ep);
+ ep_clear_and_put(ep);
return error;
}
* We do not need to take the global 'epumutex' on EPOLL_CTL_ADD when
* the epoll file descriptor is attaching directly to a wakeup source,
* unless the epoll file descriptor is nested. The purpose of taking the
- * 'epmutex' on add is to prevent complex toplogies such as loops and
+ * 'epnested_mutex' on add is to prevent complex toplogies such as loops and
* deep wakeup paths from forming in parallel through multiple
* EPOLL_CTL_ADD operations.
*/
if (READ_ONCE(f.file->f_ep) || ep->gen == loop_check_gen ||
is_file_epoll(tf.file)) {
mutex_unlock(&ep->mtx);
- error = epoll_mutex_lock(&epmutex, 0, nonblock);
+ error = epoll_mutex_lock(&epnested_mutex, 0, nonblock);
if (error)
goto error_tgt_fput;
loop_check_gen++;
error = -EEXIST;
break;
case EPOLL_CTL_DEL:
- if (epi)
- error = ep_remove(ep, epi);
- else
+ if (epi) {
+ /*
+ * The eventpoll itself is still alive: the refcount
+ * can't go to zero here.
+ */
+ ep_remove_safe(ep, epi);
+ error = 0;
+ } else {
error = -ENOENT;
+ }
break;
case EPOLL_CTL_MOD:
if (epi) {
if (full_check) {
clear_tfile_check_list();
loop_check_gen++;
- mutex_unlock(&epmutex);
+ mutex_unlock(&epnested_mutex);
}
fdput(tf);
{
struct posix_acl *sentinel = uncached_acl_sentinel(current);
- if (cmpxchg(p, ACL_NOT_CACHED, sentinel) != ACL_NOT_CACHED) {
- /* Not the first reader or sentinel already in place. */
- }
+ /* If the ACL isn't being read yet, set our sentinel. */
+ cmpxchg(p, ACL_NOT_CACHED, sentinel);
}
static void nfs3_complete_get_acl(struct posix_acl **p, struct posix_acl *acl)
* a better backward&forward compatibility, since a small piece of
* request will be less likely to be broken if disk layout get changed.
*/
-static int ocfs2_info_handle(struct inode *inode, struct ocfs2_info *info,
- int compat_flag)
+static noinline_for_stack int
+ocfs2_info_handle(struct inode *inode, struct ocfs2_info *info, int compat_flag)
{
int i, status = 0;
u64 req_addr;
long ocfs2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
struct inode *inode = file_inode(filp);
- int new_clusters;
- int status;
- struct ocfs2_space_resv sr;
- struct ocfs2_new_group_input input;
- struct reflink_arguments args;
- const char __user *old_path;
- const char __user *new_path;
- bool preserve;
- struct ocfs2_info info;
void __user *argp = (void __user *)arg;
+ int status;
switch (cmd) {
case OCFS2_IOC_RESVSP:
case OCFS2_IOC_RESVSP64:
case OCFS2_IOC_UNRESVSP:
case OCFS2_IOC_UNRESVSP64:
+ {
+ struct ocfs2_space_resv sr;
+
if (copy_from_user(&sr, (int __user *) arg, sizeof(sr)))
return -EFAULT;
return ocfs2_change_file_space(filp, cmd, &sr);
+ }
case OCFS2_IOC_GROUP_EXTEND:
+ {
+ int new_clusters;
+
if (!capable(CAP_SYS_RESOURCE))
return -EPERM;
status = ocfs2_group_extend(inode, new_clusters);
mnt_drop_write_file(filp);
return status;
+ }
case OCFS2_IOC_GROUP_ADD:
case OCFS2_IOC_GROUP_ADD64:
+ {
+ struct ocfs2_new_group_input input;
+
if (!capable(CAP_SYS_RESOURCE))
return -EPERM;
status = ocfs2_group_add(inode, &input);
mnt_drop_write_file(filp);
return status;
+ }
case OCFS2_IOC_REFLINK:
+ {
+ struct reflink_arguments args;
+ const char __user *old_path;
+ const char __user *new_path;
+ bool preserve;
+
if (copy_from_user(&args, argp, sizeof(args)))
return -EFAULT;
old_path = (const char __user *)(unsigned long)args.old_path;
preserve = (args.preserve != 0);
return ocfs2_reflink_ioctl(inode, old_path, new_path, preserve);
+ }
case OCFS2_IOC_INFO:
+ {
+ struct ocfs2_info info;
+
if (copy_from_user(&info, argp, sizeof(struct ocfs2_info)))
return -EFAULT;
return ocfs2_info_handle(inode, &info, 0);
+ }
case FITRIM:
{
struct super_block *sb = inode->i_sb;
seq_put_decimal_ull(m, "\t", task_session_nr_ns(p, pid->numbers[g].ns));
#endif
seq_putc(m, '\n');
+
+ seq_printf(m, "Kthread:\t%c\n", p->flags & PF_KTHREAD ? '1' : '0');
}
void render_sigset_t(struct seq_file *m, const char *header,
return error;
setattr_copy(&nop_mnt_idmap, inode, attr);
- mark_inode_dirty(inode);
return 0;
}
return error;
setattr_copy(&nop_mnt_idmap, inode, iattr);
- mark_inode_dirty(inode);
proc_set_user(de, inode->i_uid, inode->i_gid);
de->mode = inode->i_mode;
return error;
setattr_copy(&nop_mnt_idmap, inode, attr);
- mark_inode_dirty(inode);
return 0;
}
#define arch_irq_stat() 0
#endif
-#ifdef arch_idle_time
-
-u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
-{
- u64 idle;
-
- idle = kcs->cpustat[CPUTIME_IDLE];
- if (cpu_online(cpu) && !nr_iowait_cpu(cpu))
- idle += arch_idle_time(cpu);
- return idle;
-}
-
-static u64 get_iowait_time(struct kernel_cpustat *kcs, int cpu)
-{
- u64 iowait;
-
- iowait = kcs->cpustat[CPUTIME_IOWAIT];
- if (cpu_online(cpu) && nr_iowait_cpu(cpu))
- iowait += arch_idle_time(cpu);
- return iowait;
-}
-
-#else
-
u64 get_idle_time(struct kernel_cpustat *kcs, int cpu)
{
u64 idle, idle_usecs = -1ULL;
return iowait;
}
-#endif
-
static void show_irq_gap(struct seq_file *p, unsigned int gap)
{
static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
return acc;
}
- /* Read Elf note segment */
+ /* Read ELF note segment */
if (*fpos < elfcorebuf_sz + elfnotes_sz) {
void *kaddr;
ehdr_ptr = (Elf64_Ehdr *)elfptr;
phdr_ptr = (Elf64_Phdr*)(elfptr + sizeof(Elf64_Ehdr)); /* PT_NOTE hdr */
- /* Skip Elf header, program headers and Elf note segment. */
+ /* Skip ELF header, program headers and ELF note segment. */
vmcore_off = elfsz + elfnotes_sz;
for (i = 0; i < ehdr_ptr->e_phnum; i++, phdr_ptr++) {
ehdr_ptr = (Elf32_Ehdr *)elfptr;
phdr_ptr = (Elf32_Phdr*)(elfptr + sizeof(Elf32_Ehdr)); /* PT_NOTE hdr */
- /* Skip Elf header, program headers and Elf note segment. */
+ /* Skip ELF header, program headers and ELF note segment. */
vmcore_off = elfsz + elfnotes_sz;
for (i = 0; i < ehdr_ptr->e_phnum; i++, phdr_ptr++) {
loff_t vmcore_off;
struct vmcore *m;
- /* Skip Elf header, program headers and Elf note segment. */
+ /* Skip ELF header, program headers and ELF note segment. */
vmcore_off = elfsz + elfnotes_sz;
list_for_each_entry(m, vc_list, list) {
addr = elfcorehdr_addr;
- /* Read Elf header */
+ /* Read ELF header */
rc = elfcorehdr_read((char *)&ehdr, sizeof(Elf64_Ehdr), &addr);
if (rc < 0)
return rc;
addr = elfcorehdr_addr;
- /* Read Elf header */
+ /* Read ELF header */
rc = elfcorehdr_read((char *)&ehdr, sizeof(Elf32_Ehdr), &addr);
if (rc < 0)
return rc;
}
/**
- * vmcoredd_update_program_headers - Update all Elf program headers
+ * vmcoredd_update_program_headers - Update all ELF program headers
* @elfptr: Pointer to elf header
* @elfnotesz: Size of elf notes aligned to page size
* @vmcoreddsz: Size of device dumps to be added to elf note header
*
- * Determine type of Elf header (Elf64 or Elf32) and update the elf note size.
+ * Determine type of ELF header (Elf64 or Elf32) and update the elf note size.
* Also update the offsets of all the program headers after the elf note header.
*/
static void vmcoredd_update_program_headers(char *elfptr, size_t elfnotesz,
/**
* vmcoredd_update_size - Update the total size of the device dumps and update
- * Elf header
+ * ELF header
* @dump_size: Size of the current device dump to be added to total size
*
- * Update the total size of all the device dumps and update the Elf program
+ * Update the total size of all the device dumps and update the ELF program
* headers. Calculate the new offsets for the vmcore list and update the
* total vmcore size.
*/
* @data: dump info.
*
* Allocate a buffer and invoke the calling driver's dump collect routine.
- * Write Elf note at the beginning of the buffer to indicate vmcore device
+ * Write ELF note at the beginning of the buffer to indicate vmcore device
* dump and add the dump to global list.
*/
int vmcore_add_device_dump(struct vmcoredd_data *data)
u64 wpcopy_start;
u64 wpcopy_delay; /* wait for write-protect copy */
+ u64 irq_delay; /* wait for IRQ/SOFTIRQ */
+
u32 freepages_count; /* total count of memory reclaim */
u32 thrashing_count; /* total count of thrash waits */
u32 compact_count; /* total count of memory compact */
u32 wpcopy_count; /* total count of write-protect copy */
+ u32 irq_count; /* total count of IRQ/SOFTIRQ */
};
#endif
extern void __delayacct_compact_end(void);
extern void __delayacct_wpcopy_start(void);
extern void __delayacct_wpcopy_end(void);
+extern void __delayacct_irq(struct task_struct *task, u32 delta);
static inline void delayacct_tsk_init(struct task_struct *tsk)
{
__delayacct_wpcopy_end();
}
+static inline void delayacct_irq(struct task_struct *task, u32 delta)
+{
+ if (!static_branch_unlikely(&delayacct_key))
+ return;
+
+ if (task->delays)
+ __delayacct_irq(task, delta);
+}
+
#else
static inline void delayacct_init(void)
{}
{}
static inline void delayacct_wpcopy_end(void)
{}
+static inline void delayacct_irq(struct task_struct *task, u32 delta)
+{}
#endif /* CONFIG_TASK_DELAY_ACCT */
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_HEX_H
+#define _LINUX_HEX_H
+
+#include <linux/types.h>
+
+extern const char hex_asc[];
+#define hex_asc_lo(x) hex_asc[((x) & 0x0f)]
+#define hex_asc_hi(x) hex_asc[((x) & 0xf0) >> 4]
+
+static inline char *hex_byte_pack(char *buf, u8 byte)
+{
+ *buf++ = hex_asc_hi(byte);
+ *buf++ = hex_asc_lo(byte);
+ return buf;
+}
+
+extern const char hex_asc_upper[];
+#define hex_asc_upper_lo(x) hex_asc_upper[((x) & 0x0f)]
+#define hex_asc_upper_hi(x) hex_asc_upper[((x) & 0xf0) >> 4]
+
+static inline char *hex_byte_pack_upper(char *buf, u8 byte)
+{
+ *buf++ = hex_asc_upper_hi(byte);
+ *buf++ = hex_asc_upper_lo(byte);
+ return buf;
+}
+
+extern int hex_to_bin(unsigned char ch);
+extern int __must_check hex2bin(u8 *dst, const char *src, size_t count);
+extern char *bin2hex(char *dst, const void *src, size_t count);
+
+bool mac_pton(const char *s, u8 *mac);
+
+#endif
#include <linux/compiler.h>
#include <linux/container_of.h>
#include <linux/bitops.h>
+#include <linux/hex.h>
#include <linux/kstrtox.h>
#include <linux/log2.h>
#include <linux/math.h>
SYSTEM_SUSPEND,
} system_state;
-extern const char hex_asc[];
-#define hex_asc_lo(x) hex_asc[((x) & 0x0f)]
-#define hex_asc_hi(x) hex_asc[((x) & 0xf0) >> 4]
-
-static inline char *hex_byte_pack(char *buf, u8 byte)
-{
- *buf++ = hex_asc_hi(byte);
- *buf++ = hex_asc_lo(byte);
- return buf;
-}
-
-extern const char hex_asc_upper[];
-#define hex_asc_upper_lo(x) hex_asc_upper[((x) & 0x0f)]
-#define hex_asc_upper_hi(x) hex_asc_upper[((x) & 0xf0) >> 4]
-
-static inline char *hex_byte_pack_upper(char *buf, u8 byte)
-{
- *buf++ = hex_asc_upper_hi(byte);
- *buf++ = hex_asc_upper_lo(byte);
- return buf;
-}
-
-extern int hex_to_bin(unsigned char ch);
-extern int __must_check hex2bin(u8 *dst, const char *src, size_t count);
-extern char *bin2hex(char *dst, const void *src, size_t count);
-
-bool mac_pton(const char *s, u8 *mac);
-
/*
* General tracing related utility functions - trace_printk(),
* tracing_on/tracing_off and tracing_start()/tracing_stop
void *buf, unsigned int size,
bool get_value);
void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
-void *kexec_image_load_default(struct kimage *image);
#ifndef arch_kexec_kernel_image_probe
static inline int
}
#endif
-#ifndef arch_kexec_kernel_image_load
-static inline void *arch_kexec_kernel_image_load(struct kimage *image)
-{
- return kexec_image_load_default(image);
-}
-#endif
-
#ifdef CONFIG_KEXEC_SIG
#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_len);
long long ll;
} DWunion;
+long long notrace __ashldi3(long long u, word_type b);
+long long notrace __ashrdi3(long long u, word_type b);
+word_type notrace __cmpdi2(long long a, long long b);
+long long notrace __lshrdi3(long long u, word_type b);
+long long notrace __muldi3(long long u, long long v);
+word_type notrace __ucmpdi2(unsigned long long a, unsigned long long b);
+
#endif /* __ASM_LIBGCC_H */
static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
{
- rb->__rb_parent_color = rb_color(rb) | (unsigned long)p;
+ rb->__rb_parent_color = rb_color(rb) + (unsigned long)p;
}
static inline void rb_set_parent_color(struct rb_node *rb,
struct rb_node *p, int color)
{
- rb->__rb_parent_color = (unsigned long)p | color;
+ rb->__rb_parent_color = (unsigned long)p + color;
}
static inline void
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM notifier
+
+#if !defined(_TRACE_NOTIFIERS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NOTIFIERS_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(notifier_info,
+
+ TP_PROTO(void *cb),
+
+ TP_ARGS(cb),
+
+ TP_STRUCT__entry(
+ __field(void *, cb)
+ ),
+
+ TP_fast_assign(
+ __entry->cb = cb;
+ ),
+
+ TP_printk("%ps", __entry->cb)
+);
+
+/*
+ * notifier_register - called upon notifier callback registration
+ *
+ * @cb: callback pointer
+ *
+ */
+DEFINE_EVENT(notifier_info, notifier_register,
+
+ TP_PROTO(void *cb),
+
+ TP_ARGS(cb)
+);
+
+/*
+ * notifier_unregister - called upon notifier callback unregistration
+ *
+ * @cb: callback pointer
+ *
+ */
+DEFINE_EVENT(notifier_info, notifier_unregister,
+
+ TP_PROTO(void *cb),
+
+ TP_ARGS(cb)
+);
+
+/*
+ * notifier_run - called upon notifier callback execution
+ *
+ * @cb: callback pointer
+ *
+ */
+DEFINE_EVENT(notifier_info, notifier_run,
+
+ TP_PROTO(void *cb),
+
+ TP_ARGS(cb)
+);
+
+#endif /* _TRACE_NOTIFIERS_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
#define _BITUL(x) (_UL(1) << (x))
#define _BITULL(x) (_ULL(1) << (x))
-#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1)
+#define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (__typeof__(x))(a) - 1)
#define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
*/
-#define TASKSTATS_VERSION 13
+#define TASKSTATS_VERSION 14
#define TS_COMM_LEN 32 /* should be >= TASK_COMM_LEN
* in linux/sched.h */
/* v13: Delay waiting for write-protect copy */
__u64 wpcopy_count;
__u64 wpcopy_delay_total;
+
+ /* v14: Delay waiting for IRQ/SOFTIRQ */
+ __u64 irq_count;
+ __u64 irq_delay_total;
};
d->compact_delay_total = (tmp < d->compact_delay_total) ? 0 : tmp;
tmp = d->wpcopy_delay_total + tsk->delays->wpcopy_delay;
d->wpcopy_delay_total = (tmp < d->wpcopy_delay_total) ? 0 : tmp;
+ tmp = d->irq_delay_total + tsk->delays->irq_delay;
+ d->irq_delay_total = (tmp < d->irq_delay_total) ? 0 : tmp;
d->blkio_count += tsk->delays->blkio_count;
d->swapin_count += tsk->delays->swapin_count;
d->freepages_count += tsk->delays->freepages_count;
d->thrashing_count += tsk->delays->thrashing_count;
d->compact_count += tsk->delays->compact_count;
d->wpcopy_count += tsk->delays->wpcopy_count;
+ d->irq_count += tsk->delays->irq_count;
raw_spin_unlock_irqrestore(&tsk->delays->lock, flags);
return 0;
¤t->delays->wpcopy_delay,
¤t->delays->wpcopy_count);
}
+
+void __delayacct_irq(struct task_struct *task, u32 delta)
+{
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&task->delays->lock, flags);
+ task->delays->irq_delay += delta;
+ task->delays->irq_count++;
+ raw_spin_unlock_irqrestore(&task->delays->lock, flags);
+}
+
/*
* The number of tasks checked:
*/
-int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
+static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
/*
* Limit number of tasks checked in a batch.
/*
* Zero (default value) means use sysctl_hung_task_timeout_secs:
*/
-unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
+static unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
-int __read_mostly sysctl_hung_task_warnings = 10;
+static int __read_mostly sysctl_hung_task_warnings = 10;
static int __read_mostly did_panic;
static bool hung_task_show_lock;
* Should we panic (and reboot, if panic_timeout= is set) when a
* hung task is detected:
*/
-unsigned int __read_mostly sysctl_hung_task_panic =
- IS_ENABLED(CONFIG_BOOTPARAM_HUNG_TASK_PANIC);
+static unsigned int __read_mostly sysctl_hung_task_panic =
+ IS_ENABLED(CONFIG_BOOTPARAM_HUNG_TASK_PANIC);
static int
hung_task_panic(struct notifier_block *this, unsigned long event, void *ptr)
return ret;
}
-void *kexec_image_load_default(struct kimage *image)
+static void *kexec_image_load_default(struct kimage *image)
{
if (!image->fops || !image->fops->load)
return ERR_PTR(-ENOEXEC);
/* IMA needs to pass the measurement list to the next kernel. */
ima_add_kexec_buffer(image);
- /* Call arch image load handlers */
- ldata = arch_kexec_kernel_image_load(image);
+ /* Call image load handler */
+ ldata = kexec_image_load_default(image);
if (IS_ERR(ldata)) {
ret = PTR_ERR(ldata);
#include <linux/vmalloc.h>
#include <linux/reboot.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/notifier.h>
+
/*
* Notifier list for kernel code which wants to be called
* at shutdown. This is used to stop any idling DMA operations
}
n->next = *nl;
rcu_assign_pointer(*nl, n);
+ trace_notifier_register((void *)n->notifier_call);
return 0;
}
while ((*nl) != NULL) {
if ((*nl) == n) {
rcu_assign_pointer(*nl, n->next);
+ trace_notifier_unregister((void *)n->notifier_call);
return 0;
}
nl = &((*nl)->next);
continue;
}
#endif
+ trace_notifier_run((void *)nb->notifier_call);
ret = nb->notifier_call(nb, val, v);
if (nr_calls)
rq->prev_irq_time += irq_delta;
delta -= irq_delta;
psi_account_irqtime(rq->curr, irq_delta);
+ delayacct_irq(rq->curr, irq_delta);
#endif
#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
if (static_key_false((¶virt_steal_rq_enabled))) {
/**
* build_id_parse_buf - Get build ID from a buffer
- * @buf: Elf note section(s) to parse
+ * @buf: ELF note section(s) to parse
* @buf_size: Size of @buf in bytes
* @build_id: Build ID parsed from @buf, at least BUILD_ID_SIZE_MAX long
*
static inline void rb_set_black(struct rb_node *rb)
{
- rb->__rb_parent_color |= RB_BLACK;
+ rb->__rb_parent_color += RB_BLACK;
}
static inline struct rb_node *rb_red_parent(struct rb_node *red)
for (i = 0; i < UNESCAPE_ALL_MASK + 1; i++)
test_string_unescape("unescape", i, false);
test_string_unescape("unescape inplace",
- get_random_u32_below(UNESCAPE_ANY + 1), true);
+ get_random_u32_below(UNESCAPE_ALL_MASK + 1), true);
/* Without dictionary */
for (i = 0; i < ESCAPE_ALL_MASK + 1; i++)
*
* Return: newly allocated copy of @s or %NULL in case of error
*/
+noinline
char *kstrdup(const char *s, gfp_t gfp)
{
size_t len;
Cc:
)};
+our @link_tags = qw(Link Closes);
+
+#Create a search and print patterns for all these strings to be used directly below
+our $link_tags_search = "";
+our $link_tags_print = "";
+foreach my $entry (@link_tags) {
+ if ($link_tags_search ne "") {
+ $link_tags_search .= '|';
+ $link_tags_print .= ' or ';
+ }
+ $entry .= ':';
+ $link_tags_search .= $entry;
+ $link_tags_print .= "'$entry'";
+}
+$link_tags_search = "(?:${link_tags_search})";
+
our $tracing_logging_tags = qr{(?xi:
[=-]*> |
<[=-]* |
}
}
-# check if Reported-by: is followed by a Link:
+# check if Reported-by: is followed by a Closes: tag
if ($sign_off =~ /^reported(?:|-and-tested)-by:$/i) {
if (!defined $lines[$linenr]) {
WARN("BAD_REPORTED_BY_LINK",
- "Reported-by: should be immediately followed by Link: to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
- } elsif ($rawlines[$linenr] !~ m{^link:\s*https?://}i) {
+ "Reported-by: should be immediately followed by Closes: with a URL to the report\n" . $herecurr . "\n");
+ } elsif ($rawlines[$linenr] !~ /^closes:\s*/i) {
WARN("BAD_REPORTED_BY_LINK",
- "Reported-by: should be immediately followed by Link: with a URL to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
+ "Reported-by: should be immediately followed by Closes: with a URL to the report\n" . $herecurr . $rawlines[$linenr] . "\n");
}
}
}
# file delta changes
$line =~ /^\s*(?:[\w\.\-\+]*\/)++[\w\.\-\+]+:/ ||
# filename then :
- $line =~ /^\s*(?:Fixes:|Link:|$signature_tags)/i ||
- # A Fixes: or Link: line or signature tag line
+ $line =~ /^\s*(?:Fixes:|$link_tags_search|$signature_tags)/i ||
+ # A Fixes:, link or signature tag line
$commit_log_possible_stack_dump)) {
WARN("COMMIT_LOG_LONG_LINE",
"Possible unwrapped commit description (prefer a maximum 75 chars per line)\n" . $herecurr);
# Check for odd tags before a URI/URL
if ($in_commit_log &&
- $line =~ /^\s*(\w+):\s*http/ && $1 ne 'Link') {
+ $line =~ /^\s*(\w+:)\s*http/ && $1 !~ /^$link_tags_search$/) {
if ($1 =~ /^v(?:ersion)?\d+/i) {
WARN("COMMIT_LOG_VERSIONING",
"Patch version information should be after the --- line\n" . $herecurr);
} else {
WARN("COMMIT_LOG_USE_LINK",
- "Unknown link reference '$1:', use 'Link:' instead\n" . $herecurr);
+ "Unknown link reference '$1', use $link_tags_print instead\n" . $herecurr);
+ }
+ }
+
+# Check for misuse of the link tags
+ if ($in_commit_log &&
+ $line =~ /^\s*(\w+:)\s*(\S+)/) {
+ my $tag = $1;
+ my $value = $2;
+ if ($tag =~ /^$link_tags_search$/ && $value !~ m{^https?://}) {
+ WARN("COMMIT_LOG_WRONG_LINK",
+ "'$tag' should be followed by a public http(s) link\n" . $herecurr);
}
}
"'$spdx_license' is not supported in LICENSES/...\n" . $herecurr);
}
if ($realfile =~ m@^Documentation/devicetree/bindings/@ &&
- not $spdx_license =~ /GPL-2\.0.*BSD-2-Clause/) {
+ $spdx_license !~ /GPL-2\.0(?:-only)? OR BSD-2-Clause/) {
my $msg_level = \&WARN;
$msg_level = \&CHK if ($file);
if (&{$msg_level}("SPDX_LICENSE_TAG",
$fixed[$fixlinenr] =~ s/SPDX-License-Identifier: .*/SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)/;
}
}
+ if ($realfile =~ m@^include/dt-bindings/@ &&
+ $spdx_license !~ /GPL-2\.0(?:-only)? OR \S+/) {
+ WARN("SPDX_LICENSE_TAG",
+ "DT binding headers should be licensed (GPL-2.0-only OR .*)\n" . $herecurr);
+ }
}
}
}
$var !~ /^(?:[A-Z]+_){1,5}[A-Z]{1,3}[a-z]/ &&
#Ignore Page<foo> variants
$var !~ /^(?:Clear|Set|TestClear|TestSet|)Page[A-Z]/ &&
+#Ignore ETHTOOL_LINK_MODE_<foo> variants
+ $var !~ /^ETHTOOL_LINK_MODE_/ &&
#Ignore SI style variants like nS, mV and dB
#(ie: max_uV, regulator_min_uA_show, RANGE_mA_VALUE)
$var !~ /^(?:[a-z0-9_]*|[A-Z0-9_]*)?_?[a-z][A-Z](?:_[a-z0-9_]+|_[A-Z0-9_]+)?$/ &&
self.show_subtree(child, level + 1)
def invoke(self, arg, from_tty):
+ if utils.gdb_eval_or_none("clk_root_list") is None:
+ raise gdb.GdbError("No clocks registered")
gdb.write(" enable prepare protect \n")
gdb.write(" clock count count count rate \n")
gdb.write("------------------------------------------------------------------------\n")
#include <linux/clk-provider.h>
#include <linux/fs.h>
#include <linux/hrtimer.h>
+#include <linux/irq.h>
#include <linux/mount.h>
#include <linux/of_fdt.h>
+#include <linux/radix-tree.h>
#include <linux/threads.h>
/* We need to stringify expanded macros so that they can be parsed */
import gdb
+LX_CONFIG(CONFIG_DEBUG_INFO_REDUCED)
+
/* linux/clk-provider.h */
if IS_BUILTIN(CONFIG_COMMON_CLK):
LX_GDBPARSED(CLK_GET_RATE_NOCACHE)
/* linux/htimer.h */
LX_GDBPARSED(hrtimer_resolution)
+/* linux/irq.h */
+LX_GDBPARSED(IRQD_LEVEL)
+LX_GDBPARSED(IRQ_HIDDEN)
+
/* linux/module.h */
LX_GDBPARSED(MOD_TEXT)
/* linux/of_fdt.h> */
LX_VALUE(OF_DT_HEADER)
+/* linux/radix-tree.h */
+LX_GDBPARSED(RADIX_TREE_ENTRY_MASK)
+LX_GDBPARSED(RADIX_TREE_INTERNAL_NODE)
+LX_GDBPARSED(RADIX_TREE_MAP_SIZE)
+LX_GDBPARSED(RADIX_TREE_MAP_SHIFT)
+LX_GDBPARSED(RADIX_TREE_MAP_MASK)
+
/* Kernel Configs */
LX_CONFIG(CONFIG_GENERIC_CLOCKEVENTS)
LX_CONFIG(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST)
LX_CONFIG(CONFIG_NR_CPUS)
LX_CONFIG(CONFIG_OF)
LX_CONFIG(CONFIG_TICK_ONESHOT)
+LX_CONFIG(CONFIG_GENERIC_IRQ_SHOW_LEVEL)
+LX_CONFIG(CONFIG_X86_LOCAL_APIC)
+LX_CONFIG(CONFIG_SMP)
+LX_CONFIG(CONFIG_X86_THERMAL_VECTOR)
+LX_CONFIG(CONFIG_X86_MCE_THRESHOLD)
+LX_CONFIG(CONFIG_X86_MCE_AMD)
+LX_CONFIG(CONFIG_X86_MCE)
+LX_CONFIG(CONFIG_X86_IO_APIC)
+LX_CONFIG(CONFIG_HAVE_KVM)
task_ptr_type = task_type.get_type().pointer()
if utils.is_target_arch("x86"):
- var_ptr = gdb.parse_and_eval("&pcpu_hot.current_task")
- return per_cpu(var_ptr, cpu).dereference()
+ if gdb.lookup_global_symbol("cpu_tasks"):
+ # This is a UML kernel, which stores the current task
+ # differently than other x86 sub architectures
+ var_ptr = gdb.parse_and_eval("(struct task_struct *)cpu_tasks[0].task")
+ return var_ptr.dereference()
+ else:
+ var_ptr = gdb.parse_and_eval("&pcpu_hot.current_task")
+ return per_cpu(var_ptr, cpu).dereference()
elif utils.is_target_arch("aarch64"):
- current_task_addr = gdb.parse_and_eval("$SP_EL0")
- if((current_task_addr >> 63) != 0):
- current_task = current_task_addr.cast(task_ptr_type)
- return current_task.dereference()
- else:
- raise gdb.GdbError("Sorry, obtaining the current task is not allowed "
- "while running in userspace(EL0)")
+ current_task_addr = gdb.parse_and_eval("$SP_EL0")
+ if (current_task_addr >> 63) != 0:
+ current_task = current_task_addr.cast(task_ptr_type)
+ return current_task.dereference()
+ else:
+ raise gdb.GdbError("Sorry, obtaining the current task is not allowed "
+ "while running in userspace(EL0)")
else:
raise gdb.GdbError("Sorry, obtaining the current task is not yet "
"supported with this arch")
import gdb
import sys
-from linux.utils import CachedType
+from linux.utils import CachedType, gdb_eval_or_none
from linux.lists import list_for_each_entry
generic_pm_domain_type = CachedType('struct generic_pm_domain')
gdb.write(' %-50s %s\n' % (kobj_path, rtpm_status_str(dev)))
def invoke(self, arg, from_tty):
+ if gdb_eval_or_none("&gpd_list") is None:
+ raise gdb.GdbError("No power domain(s) registered")
gdb.write('domain status children\n');
gdb.write(' /device runtime status\n');
gdb.write('----------------------------------------------------------------------\n');
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright 2023 Broadcom
+
+import gdb
+
+from linux import constants
+from linux import cpus
+from linux import utils
+from linux import radixtree
+
+irq_desc_type = utils.CachedType("struct irq_desc")
+
+def irq_settings_is_hidden(desc):
+ return desc['status_use_accessors'] & constants.LX_IRQ_HIDDEN
+
+def irq_desc_is_chained(desc):
+ return desc['action'] and desc['action'] == gdb.parse_and_eval("&chained_action")
+
+def irqd_is_level(desc):
+ return desc['irq_data']['common']['state_use_accessors'] & constants.LX_IRQD_LEVEL
+
+def show_irq_desc(prec, irq):
+ text = ""
+
+ desc = radixtree.lookup(gdb.parse_and_eval("&irq_desc_tree"), irq)
+ if desc is None:
+ return text
+
+ desc = desc.cast(irq_desc_type.get_type())
+ if desc is None:
+ return text
+
+ if irq_settings_is_hidden(desc):
+ return text
+
+ any_count = 0
+ if desc['kstat_irqs']:
+ for cpu in cpus.each_online_cpu():
+ any_count += cpus.per_cpu(desc['kstat_irqs'], cpu)
+
+ if (desc['action'] == 0 or irq_desc_is_chained(desc)) and any_count == 0:
+ return text;
+
+ text += "%*d: " % (prec, irq)
+ for cpu in cpus.each_online_cpu():
+ if desc['kstat_irqs']:
+ count = cpus.per_cpu(desc['kstat_irqs'], cpu)
+ else:
+ count = 0
+ text += "%10u" % (count)
+
+ name = "None"
+ if desc['irq_data']['chip']:
+ chip = desc['irq_data']['chip']
+ if chip['name']:
+ name = chip['name'].string()
+ else:
+ name = "-"
+
+ text += " %8s" % (name)
+
+ if desc['irq_data']['domain']:
+ text += " %*lu" % (prec, desc['irq_data']['hwirq'])
+ else:
+ text += " %*s" % (prec, "")
+
+ if constants.LX_CONFIG_GENERIC_IRQ_SHOW_LEVEL:
+ text += " %-8s" % ("Level" if irqd_is_level(desc) else "Edge")
+
+ if desc['name']:
+ text += "-%-8s" % (desc['name'].string())
+
+ """ Some toolchains may not be able to provide information about irqaction """
+ try:
+ gdb.lookup_type("struct irqaction")
+ action = desc['action']
+ if action is not None:
+ text += " %s" % (action['name'].string())
+ while True:
+ action = action['next']
+ if action is not None:
+ break
+ if action['name']:
+ text += ", %s" % (action['name'].string())
+ except:
+ pass
+
+ text += "\n"
+
+ return text
+
+def show_irq_err_count(prec):
+ cnt = utils.gdb_eval_or_none("irq_err_count")
+ text = ""
+ if cnt is not None:
+ text += "%*s: %10u\n" % (prec, "ERR", cnt['counter'])
+ return text
+
+def x86_show_irqstat(prec, pfx, field, desc):
+ irq_stat = gdb.parse_and_eval("&irq_stat")
+ text = "%*s: " % (prec, pfx)
+ for cpu in cpus.each_online_cpu():
+ stat = cpus.per_cpu(irq_stat, cpu)
+ text += "%10u " % (stat[field])
+ text += " %s\n" % (desc)
+ return text
+
+def x86_show_mce(prec, var, pfx, desc):
+ pvar = gdb.parse_and_eval(var)
+ text = "%*s: " % (prec, pfx)
+ for cpu in cpus.each_online_cpu():
+ text += "%10u " % (cpus.per_cpu(pvar, cpu))
+ text += " %s\n" % (desc)
+ return text
+
+def x86_show_interupts(prec):
+ text = x86_show_irqstat(prec, "NMI", '__nmi_count', 'Non-maskable interrupts')
+
+ if constants.LX_CONFIG_X86_LOCAL_APIC:
+ text += x86_show_irqstat(prec, "LOC", 'apic_timer_irqs', "Local timer interrupts")
+ text += x86_show_irqstat(prec, "SPU", 'irq_spurious_count', "Spurious interrupts")
+ text += x86_show_irqstat(prec, "PMI", 'apic_perf_irqs', "Performance monitoring interrupts")
+ text += x86_show_irqstat(prec, "IWI", 'apic_irq_work_irqs', "IRQ work interrupts")
+ text += x86_show_irqstat(prec, "RTR", 'icr_read_retry_count', "APIC ICR read retries")
+ if utils.gdb_eval_or_none("x86_platform_ipi_callback") is not None:
+ text += x86_show_irqstat(prec, "PLT", 'x86_platform_ipis', "Platform interrupts")
+
+ if constants.LX_CONFIG_SMP:
+ text += x86_show_irqstat(prec, "RES", 'irq_resched_count', "Rescheduling interrupts")
+ text += x86_show_irqstat(prec, "CAL", 'irq_call_count', "Function call interrupts")
+ text += x86_show_irqstat(prec, "TLB", 'irq_tlb_count', "TLB shootdowns")
+
+ if constants.LX_CONFIG_X86_THERMAL_VECTOR:
+ text += x86_show_irqstat(prec, "TRM", 'irq_thermal_count', "Thermal events interrupts")
+
+ if constants.LX_CONFIG_X86_MCE_THRESHOLD:
+ text += x86_show_irqstat(prec, "THR", 'irq_threshold_count', "Threshold APIC interrupts")
+
+ if constants.LX_CONFIG_X86_MCE_AMD:
+ text += x86_show_irqstat(prec, "DFR", 'irq_deferred_error_count', "Deferred Error APIC interrupts")
+
+ if constants.LX_CONFIG_X86_MCE:
+ text += x86_show_mce(prec, "&mce_exception_count", "MCE", "Machine check exceptions")
+ text == x86_show_mce(prec, "&mce_poll_count", "MCP", "Machine check polls")
+
+ text += show_irq_err_count(prec)
+
+ if constants.LX_CONFIG_X86_IO_APIC:
+ cnt = utils.gdb_eval_or_none("irq_mis_count")
+ if cnt is not None:
+ text += "%*s: %10u\n" % (prec, "MIS", cnt['counter'])
+
+ if constants.LX_CONFIG_HAVE_KVM:
+ text += x86_show_irqstat(prec, "PIN", 'kvm_posted_intr_ipis', 'Posted-interrupt notification event')
+ text += x86_show_irqstat(prec, "NPI", 'kvm_posted_intr_nested_ipis', 'Nested posted-interrupt event')
+ text += x86_show_irqstat(prec, "PIW", 'kvm_posted_intr_wakeup_ipis', 'Posted-interrupt wakeup event')
+
+ return text
+
+def arm_common_show_interrupts(prec):
+ text = ""
+ nr_ipi = utils.gdb_eval_or_none("nr_ipi")
+ ipi_desc = utils.gdb_eval_or_none("ipi_desc")
+ ipi_types = utils.gdb_eval_or_none("ipi_types")
+ if nr_ipi is None or ipi_desc is None or ipi_types is None:
+ return text
+
+ if prec >= 4:
+ sep = " "
+ else:
+ sep = ""
+
+ for ipi in range(nr_ipi):
+ text += "%*s%u:%s" % (prec - 1, "IPI", ipi, sep)
+ desc = ipi_desc[ipi].cast(irq_desc_type.get_type().pointer())
+ if desc == 0:
+ continue
+ for cpu in cpus.each_online_cpu():
+ text += "%10u" % (cpus.per_cpu(desc['kstat_irqs'], cpu))
+ text += " %s" % (ipi_types[ipi].string())
+ text += "\n"
+ return text
+
+def aarch64_show_interrupts(prec):
+ text = arm_common_show_interrupts(prec)
+ text += "%*s: %10lu\n" % (prec, "ERR", gdb.parse_and_eval("irq_err_count"))
+ return text
+
+def arch_show_interrupts(prec):
+ text = ""
+ if utils.is_target_arch("x86"):
+ text += x86_show_interupts(prec)
+ elif utils.is_target_arch("aarch64"):
+ text += aarch64_show_interrupts(prec)
+ elif utils.is_target_arch("arm"):
+ text += arm_common_show_interrupts(prec)
+ elif utils.is_target_arch("mips"):
+ text += show_irq_err_count(prec)
+ else:
+ raise gdb.GdbError("Unsupported architecture: {}".format(target_arch))
+
+ return text
+
+class LxInterruptList(gdb.Command):
+ """Print /proc/interrupts"""
+
+ def __init__(self):
+ super(LxInterruptList, self).__init__("lx-interruptlist", gdb.COMMAND_DATA)
+
+ def invoke(self, arg, from_tty):
+ nr_irqs = gdb.parse_and_eval("nr_irqs")
+ prec = 3
+ j = 1000
+ while prec < 10 and j <= nr_irqs:
+ prec += 1
+ j *= 10
+
+ gdb.write("%*s" % (prec + 8, ""))
+ for cpu in cpus.each_online_cpu():
+ gdb.write("CPU%-8d" % cpu)
+ gdb.write("\n")
+
+ if utils.gdb_eval_or_none("&irq_desc_tree") is None:
+ return
+
+ for irq in range(nr_irqs):
+ gdb.write(show_irq_desc(prec, irq))
+ gdb.write(arch_show_interrupts(prec))
+
+
+LxInterruptList()
+# SPDX-License-Identifier: GPL-2.0
#
# gdb helper commands and functions for Linux kernel debugging
#
from linux import utils
from linux import tasks
from linux import lists
+from linux import vfs
from struct import *
gdb.write("{:^18} {:^15} {:>9} {} {} options\n".format(
"mount", "super_block", "devname", "pathname", "fstype"))
- for vfs in lists.list_for_each_entry(namespace['list'],
+ for mnt in lists.list_for_each_entry(namespace['list'],
mount_ptr_type, "mnt_list"):
- devname = vfs['mnt_devname'].string()
+ devname = mnt['mnt_devname'].string()
devname = devname if devname else "none"
pathname = ""
- parent = vfs
+ parent = mnt
while True:
mntpoint = parent['mnt_mountpoint']
- pathname = utils.dentry_name(mntpoint) + pathname
+ pathname = vfs.dentry_name(mntpoint) + pathname
if (parent == parent['mnt_parent']):
break
parent = parent['mnt_parent']
if (pathname == ""):
pathname = "/"
- superblock = vfs['mnt']['mnt_sb']
+ superblock = mnt['mnt']['mnt_sb']
fstype = superblock['s_type']['name'].string()
s_flags = int(superblock['s_flags'])
- m_flags = int(vfs['mnt']['mnt_flags'])
+ m_flags = int(mnt['mnt']['mnt_flags'])
rd = "ro" if (s_flags & constants.LX_SB_RDONLY) else "rw"
gdb.write("{} {} {} {} {} {}{}{} 0 0\n".format(
- vfs.format_string(), superblock.format_string(), devname,
+ mnt.format_string(), superblock.format_string(), devname,
pathname, fstype, rd, info_opts(FS_INFO, s_flags),
info_opts(MNT_INFO, m_flags)))
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0
+#
+# Radix Tree Parser
+#
+# Copyright (c) 2016 Linaro Ltd
+# Copyright (c) 2023 Broadcom
+#
+# Authors:
+# Kieran Bingham <kieran.bingham@linaro.org>
+# Florian Fainelli <f.fainelli@gmail.com>
+
+import gdb
+
+from linux import utils
+from linux import constants
+
+radix_tree_root_type = utils.CachedType("struct xarray")
+radix_tree_node_type = utils.CachedType("struct xa_node")
+
+def is_internal_node(node):
+ long_type = utils.get_long_type()
+ return ((node.cast(long_type) & constants.LX_RADIX_TREE_ENTRY_MASK) == constants.LX_RADIX_TREE_INTERNAL_NODE)
+
+def entry_to_node(node):
+ long_type = utils.get_long_type()
+ node_type = node.type
+ indirect_ptr = node.cast(long_type) & ~constants.LX_RADIX_TREE_INTERNAL_NODE
+ return indirect_ptr.cast(radix_tree_node_type.get_type().pointer())
+
+def node_maxindex(node):
+ return (constants.LX_RADIX_TREE_MAP_SIZE << node['shift']) - 1
+
+def lookup(root, index):
+ if root.type == radix_tree_root_type.get_type().pointer():
+ node = root.dereference()
+ elif root.type != radix_tree_root_type.get_type():
+ raise gdb.GdbError("must be {} not {}"
+ .format(radix_tree_root_type.get_type(), root.type))
+
+ node = root['xa_head']
+ if node == 0:
+ return None
+
+ if not (is_internal_node(node)):
+ if (index > 0):
+ return None
+ return node
+
+ node = entry_to_node(node)
+ maxindex = node_maxindex(node)
+
+ if (index > maxindex):
+ return None
+
+ shift = node['shift'] + constants.LX_RADIX_TREE_MAP_SHIFT
+
+ while True:
+ offset = (index >> node['shift']) & constants.LX_RADIX_TREE_MAP_MASK
+ slot = node['slots'][offset]
+
+ if slot == 0:
+ return None
+
+ node = slot.cast(node.type.pointer()).dereference()
+ if node == 0:
+ return None
+
+ shift -= constants.LX_RADIX_TREE_MAP_SHIFT
+ if (shift <= 0):
+ break
+
+ return node
+
+class LxRadixTree(gdb.Function):
+ """ Lookup and return a node from a RadixTree.
+
+$lx_radix_tree_lookup(root_node [, index]): Return the node at the given index.
+If index is omitted, the root node is dereference and returned."""
+
+ def __init__(self):
+ super(LxRadixTree, self).__init__("lx_radix_tree_lookup")
+
+ def invoke(self, root, index=0):
+ result = lookup(root, index)
+ if result is None:
+ raise gdb.GdbError("No entry in tree at index {}".format(index))
+
+ return result
+
+LxRadixTree()
def print_active_timers(base):
- curr = base['active']['next']['node']
- curr = curr.address.cast(rbtree.rb_node_type.get_type().pointer())
+ curr = base['active']['rb_root']['rb_leftmost']
idx = 0
while curr:
yield print_timer(curr, idx)
ts = cpus.per_cpu(tick_sched_ptr, cpu)
text = "cpu: {}\n".format(cpu)
- for i in xrange(max_clock_bases):
+ for i in range(max_clock_bases):
text += " clock {}:\n".format(i)
text += print_base(cpu_base['clock_base'][i])
num_bytes = (nr_cpu_ids + 7) / 8
buf = utils.read_memoryview(inf, bits, num_bytes).tobytes()
buf = binascii.b2a_hex(buf)
+ if type(buf) is not str:
+ buf=buf.decode()
chunks = []
i = num_bytes
if 0 < extra <= 4:
chunks[0] = chunks[0][0] # Cut off the first 0
- return "".join(chunks)
+ return "".join(str(chunks))
class LxTimerList(gdb.Command):
max_clock_bases = gdb.parse_and_eval("HRTIMER_MAX_CLOCK_BASES")
text = "Timer List Version: gdb scripts\n"
- text += "HRTIMER_MAX_CLOCK_BASES: {}\n".format(max_clock_bases)
+ text += "HRTIMER_MAX_CLOCK_BASES: {}\n".format(
+ max_clock_bases.type.fields()[max_clock_bases].enumval)
text += "now at {} nsecs\n".format(ktime_get())
for cpu in cpus.each_online_cpu():
def read_memoryview(inf, start, length):
- return memoryview(inf.read_memory(start, length))
+ m = inf.read_memory(start, length)
+ if type(m) is memoryview:
+ return m
+ return memoryview(m)
def read_u16(buffer, offset):
return gdb.parse_and_eval(expresssion)
except gdb.error:
return None
-
-
-def dentry_name(d):
- parent = d['d_parent']
- if parent == d or parent == 0:
- return ""
- p = dentry_name(d['d_parent']) + "/"
- return p + d['d_iname'].string()
--- /dev/null
+#
+# gdb helper commands and functions for Linux kernel debugging
+#
+# VFS tools
+#
+# Copyright (c) 2023 Glenn Washburn
+# Copyright (c) 2016 Linaro Ltd
+#
+# Authors:
+# Glenn Washburn <development@efficientek.com>
+# Kieran Bingham <kieran.bingham@linaro.org>
+#
+# This work is licensed under the terms of the GNU GPL version 2.
+#
+
+import gdb
+from linux import utils
+
+
+def dentry_name(d):
+ parent = d['d_parent']
+ if parent == d or parent == 0:
+ return ""
+ p = dentry_name(d['d_parent']) + "/"
+ return p + d['d_iname'].string()
+
+class DentryName(gdb.Function):
+ """Return string of the full path of a dentry.
+
+$lx_dentry_name(PTR): Given PTR to a dentry struct, return a string
+of the full path of the dentry."""
+
+ def __init__(self):
+ super(DentryName, self).__init__("lx_dentry_name")
+
+ def invoke(self, dentry_ptr):
+ return dentry_name(dentry_ptr)
+
+DentryName()
+
+
+dentry_type = utils.CachedType("struct dentry")
+
+class InodeDentry(gdb.Function):
+ """Return dentry pointer for inode.
+
+$lx_i_dentry(PTR): Given PTR to an inode struct, return a pointer to
+the associated dentry struct, if there is one."""
+
+ def __init__(self):
+ super(InodeDentry, self).__init__("lx_i_dentry")
+
+ def invoke(self, inode_ptr):
+ d_u = inode_ptr["i_dentry"]["first"]
+ if d_u == 0:
+ return ""
+ return utils.container_of(d_u, dentry_type.get_type().pointer(), "d_u")
+
+InodeDentry()
gdb.write("NOTE: gdb 7.2 or later required for Linux helper scripts to "
"work.\n")
else:
+ import linux.constants
+ if linux.constants.LX_CONFIG_DEBUG_INFO_REDUCED:
+ raise gdb.GdbError("Reduced debug information will prevent GDB "
+ "from having complete types.\n")
import linux.utils
import linux.symbols
import linux.modules
import linux.lists
import linux.rbtree
import linux.proc
- import linux.constants
import linux.timerlist
import linux.clk
import linux.genpd
import linux.device
+ import linux.vfs
import linux.mm
+ import linux.radixtree
+ import linux.interrupts
if is_enabled CONFIG_KALLSYMS; then
if ! cmp -s System.map ${kallsyms_vmlinux}.syms; then
echo >&2 Inconsistent kallsyms data
- echo >&2 Try "make KALLSYMS_EXTRA_PASS=1" as a workaround
+ echo >&2 'Try "make KALLSYMS_EXTRA_PASS=1" as a workaround'
exit 1
fi
fi
if (strncmp(elf_hdr->e_ident, ELFMAG, sizeof(ELFMAG) - 1))
dev_err(component->dev, "Wrong ELF header prefix\n");
if (elf_hdr->e_ehsize != sizeof(Elf32_Ehdr))
- dev_err(component->dev, "Wrong Elf header size\n");
+ dev_err(component->dev, "Wrong ELF header size\n");
if (elf_hdr->e_machine != EM_XTENSA)
dev_err(component->dev, "Wrong DSP code file\n");
printf("\n\nCPU %15s%15s%15s%15s%15s\n"
" %15llu%15llu%15llu%15llu%15.3fms\n"
"IO %15s%15s%15s\n"
- " %15llu%15llu%15llums\n"
+ " %15llu%15llu%15.3fms\n"
"SWAP %15s%15s%15s\n"
- " %15llu%15llu%15llums\n"
+ " %15llu%15llu%15.3fms\n"
"RECLAIM %12s%15s%15s\n"
- " %15llu%15llu%15llums\n"
+ " %15llu%15llu%15.3fms\n"
"THRASHING%12s%15s%15s\n"
- " %15llu%15llu%15llums\n"
+ " %15llu%15llu%15.3fms\n"
"COMPACT %12s%15s%15s\n"
- " %15llu%15llu%15llums\n"
+ " %15llu%15llu%15.3fms\n"
"WPCOPY %12s%15s%15s\n"
- " %15llu%15llu%15llums\n",
+ " %15llu%15llu%15.3fms\n"
+ "IRQ %15s%15s%15s\n"
+ " %15llu%15llu%15.3fms\n",
"count", "real total", "virtual total",
"delay total", "delay average",
(unsigned long long)t->cpu_count,
"count", "delay total", "delay average",
(unsigned long long)t->blkio_count,
(unsigned long long)t->blkio_delay_total,
- average_ms(t->blkio_delay_total, t->blkio_count),
+ average_ms((double)t->blkio_delay_total, t->blkio_count),
"count", "delay total", "delay average",
(unsigned long long)t->swapin_count,
(unsigned long long)t->swapin_delay_total,
- average_ms(t->swapin_delay_total, t->swapin_count),
+ average_ms((double)t->swapin_delay_total, t->swapin_count),
"count", "delay total", "delay average",
(unsigned long long)t->freepages_count,
(unsigned long long)t->freepages_delay_total,
- average_ms(t->freepages_delay_total, t->freepages_count),
+ average_ms((double)t->freepages_delay_total, t->freepages_count),
"count", "delay total", "delay average",
(unsigned long long)t->thrashing_count,
(unsigned long long)t->thrashing_delay_total,
- average_ms(t->thrashing_delay_total, t->thrashing_count),
+ average_ms((double)t->thrashing_delay_total, t->thrashing_count),
"count", "delay total", "delay average",
(unsigned long long)t->compact_count,
(unsigned long long)t->compact_delay_total,
- average_ms(t->compact_delay_total, t->compact_count),
+ average_ms((double)t->compact_delay_total, t->compact_count),
"count", "delay total", "delay average",
(unsigned long long)t->wpcopy_count,
(unsigned long long)t->wpcopy_delay_total,
- average_ms(t->wpcopy_delay_total, t->wpcopy_count));
+ average_ms((double)t->wpcopy_delay_total, t->wpcopy_count),
+ "count", "delay total", "delay average",
+ (unsigned long long)t->irq_count,
+ (unsigned long long)t->irq_delay_total,
+ average_ms((double)t->irq_delay_total, t->irq_count));
}
static void task_context_switch_counts(struct taskstats *t)
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/*
- * resolve_btfids scans Elf object for .BTF_ids section and resolves
+ * resolve_btfids scans ELF object for .BTF_ids section and resolves
* its symbols with BTF ID values.
*
* Each symbol points to 4 bytes data and is expected to have
goto errout;
}
- /* Elf is corrupted/truncated, avoid calling elf_strptr. */
+ /* ELF is corrupted/truncated, avoid calling elf_strptr. */
if (!elf_rawdata(elf_getscn(elf, obj->efile.shstrndx), NULL)) {
pr_warn("elf: failed to get section names strings from %s: %s\n",
obj->path, elf_errmsg(-1));
target->rel_ip = usdt_rel_ip;
target->sema_off = usdt_sema_off;
- /* notes.args references strings from Elf itself, so they can
+ /* notes.args references strings from ELF itself, so they can
* be referenced safely until elf_end() call
*/
target->spec_str = note.args;
Elf_Scn *sec = NULL;
size_t cnt = 1;
- /* Elf is corrupted/truncated, avoid calling elf_strptr. */
+ /* ELF is corrupted/truncated, avoid calling elf_strptr. */
if (!elf_rawdata(elf_getscn(elf, ep->e_shstrndx), NULL))
return NULL;