platform/kernel/linux-starfive.git
7 years agodrm/nouveau/core/memory: comptag allocation
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/memory: comptag allocation

nvkm_memory is going to be used by the upcoming mmu rework for the basic
representation of a memory allocation, as such, this commit adds support
for comptag allocation to nvkm_memory.

This is very simple for now, in that it requires comptags for the entire
memory allocation even if only certain ranges are compressed.

Support for tracking ranges will be added at a later date.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/ltc: init comptag mm in fb subdev
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/ltc: init comptag mm in fb subdev

A single location for the MM allows us to share allocation logic.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/fb/gf100: clear comptags at allocation time rather than mmu map
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb/gf100: clear comptags at allocation time rather than mmu map

We probably don't want to destroy compression data when doing multiple
mappings of a memory object.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/fb: move comptag init out of ram submodule
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb: move comptag init out of ram submodule

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/fb: move comptags mm into nvkm_fb
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb: move comptags mm into nvkm_fb

We're moving towards having a central place to handle comptag allocation,
and as some GPUs don't have a ram submodule (ie. Tegra), we need to move
the mm somewhere else.

It probably never belonged in ram anyways.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/mm: introduce functions to access info about a given allocation
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/mm: introduce functions to access info about a given allocation

These will be used in upcoming patches.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/mm: have users explicitly define heap identifiers
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/mm: have users explicitly define heap identifiers

Different sections of VRAM may have different properties (ie. can't be used
for compression/display, can't be mapped, etc).

We currently already support this, but it's a bit magic.  This change makes
it more obvious where we're allocating from.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: separate constant-va tracking from nvkm vma structure
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: separate constant-va tracking from nvkm vma structure

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: separate buffer object backing memory from nvkm structures
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: separate buffer object backing memory from nvkm structures

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: hang drm client of a master
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: hang drm client of a master

TTM memory allocations will be hanging off the DRM's client, but the
locking needed to do so gets really tricky with all the other use of
the DRM's object tree.

To solve this, we make the normal DRM client a child of a new master,
where the memory allocations will be done from instead.

This also solves a potential race with client creation.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: consolidate identical functions in nouveau_ttm.c
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: consolidate identical functions in nouveau_ttm.c

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: remove unnecessary use of ttm_mem_type_manager::priv
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: remove unnecessary use of ttm_mem_type_manager::priv

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: swap loop order in move_notify() hook
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: swap loop order in move_notify() hook

The conditional is the same for every mapping.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: simplify const-va map condition
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: simplify const-va map condition

We don't really care about where the memory is, just that it's compatible
with a VMA allocated for a given page size.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: split various bo flags out into their own members
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: split various bo flags out into their own members

It's far more convenient to deal with like this.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: remove unused sysmem fence code
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: remove unused sysmem fence code

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: store nouveau_drm in nouveau_cli, as opposed to drm_device
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: store nouveau_drm in nouveau_cli, as opposed to drm_device

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/gr/gf100-gk208: copy big page size setting from fb
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/gr/gf100-gk208: copy big page size setting from fb

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/gr/gf100-gk208: make use of init_gpc_mmu() hook to share setup
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/gr/gf100-gk208: make use of init_gpc_mmu() hook to share setup

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/fb: finalise big page size selection in constructor
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fb: finalise big page size selection in constructor

MMU will need to know this during its constructor, so we can't delay
deciding this until init-time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/mmu/nv04-nv4x: move global vmm to nvkm_mmu
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/mmu/nv04-nv4x: move global vmm to nvkm_mmu

In a future commit, this will be constructed by common code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: use fast-path for resume restore
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: use fast-path for resume restore

Before: "imem: init completed in 299277us"
 After: "imem: init completed in  11574us"

Suspend from Fedora 26 gnome desktop on GP102.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: use fast-path for suspend backup
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: use fast-path for suspend backup

Before: "imem: suspend completed in 5540487us"
 After: "imem: suspend completed in 1871526us"

Suspend from Fedora 26 gnome desktop on GP102.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: separate pre-BAR2-bootstrap objects from the rest
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: separate pre-BAR2-bootstrap objects from the rest

These will require slow-path access during suspend/resume.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: switch to kvmalloc/kvfree for suspend/resume backup
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: switch to kvmalloc/kvfree for suspend/resume backup

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: separate suspend/resume backup handling into their own functions
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: separate suspend/resume backup handling into their own functions

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: remove now-unused wrapper for backend objects
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: remove now-unused wrapper for backend objects

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: support eviction of BAR2 mappings
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: support eviction of BAR2 mappings

A good deal of the structures we map into here aren't accessed very often
at all, and Fedora 26 has exposed an issue where after creating a heap of
channels, BAR2 space would run out, and we'd need to make use of the slow
path while accessing important structures like page tables.

This implements an LRU on BAR2 space, which allows eviction of mappings
that aren't currently needed, to make space for other objects.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: prevent fast-path for mapped objects when BAR isn't ready
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: prevent fast-path for mapped objects when BAR isn't ready

Another piece of solving the "GP100 BAR2 VMM bootstrap" puzzle.

Without doing this, we'd attempt to write PDEs for the lower page table
levels through BAR2 before BAR2 access has been fully initialised.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: map bar2 write-combined
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: map bar2 write-combined

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj

This is not as simple as it was for earlier GPUs, due to the need to swap
accessor functions depending on whether BAR2 is usable or not.

We were previously protected by nvkm_instobj's accessor functions keeping
an object mapped permanently, with some unclear magic that managed to hit
the slow-path where needed even if an object was marked as mapped.

That's been replaced here by reference counting maps (some objects, like
page tables can be accessed concurrently), and swapping the functions as
necessary.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: move slow-path locking into rd/wr functions
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: move slow-path locking into rd/wr functions

This is to simplify upcoming changes.  The slow-path is something that
currently occurs during bootstrap of the BAR2 VMM, while backing up an
object during suspend/resume, or when BAR2 address space runs out.

The latter is a real problem that can happen at runtime, and occurs in
Fedora 26 already (due to some change that causes a lot of channels to
be created at login), so ideally we'd prefer not to make it any slower.

We'd also like suspend/resume speed to not suffer.

Upcoming commits will solve those problems in a better way, making the
extra overhead of moving the locking here a non-issue.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv50: split object map out from api functions
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv50: split object map out from api functions

acquire()/boot() will need different logic in addition to performing
the actual mapping.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv40: map bar2 write-combined
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv40: map bar2 write-combined

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv40: embed nvkm_instobj directly into nv04_instobj
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv40: embed nvkm_instobj directly into nv04_instobj

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem/nv04: directly embed nvkm_instobj into nv04_instobj
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem/nv04: directly embed nvkm_instobj into nv04_instobj

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: allow nvkm_instobj to be directly embedded in backend object
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: allow nvkm_instobj to be directly embedded in backend object

This will eliminate a step through the call chain, and give backends
more flexibility.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/memory: split info pointers from accessor pointers
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/memory: split info pointers from accessor pointers

The accessor functions can change as a result of acquire()/release() calls,
and are protected by any refcounting done there.

Other functions must remain constant, as they can be called any time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/imem: add some useful debug output
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/imem: add some useful debug output

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar/gm107-: wait for instance block binding to complete
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar/gm107-: wait for instance block binding to complete

Discovered by accident while working to use BAR2 access to instmem objects
on more paths.

We've apparently been relying on luck up until now!

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: initialise bar2 during oneinit
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: initialise bar2 during oneinit

If we initialise BAR2 earlier, we're able to complete BAR1 setup using
the instmem fast-path.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: prevent BAR2 mapping of objects during destructor
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: prevent BAR2 mapping of objects during destructor

GP100's page table nests a lot more deeply than the GF100-compatible
layout we're currently using, which means our hackish-but-simple way
of dealing with BAR2 VMM teardown won't work anymore.

In order to sanely handle the chicken-and-egg (BAR2's PTs get mapped
into themselves) problem, we need prevent page tables getting mapped
back into BAR2 during the destruction of its VMM.

To do this, we simply key off the state that's now maintained by the
BAR2 init/fini functions.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: modify interface to bar2 vmm mapping
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: modify interface to bar2 vmm mapping

Match API with the BAR1 version.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: modify interface to bar1 vmm mapping
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: modify interface to bar1 vmm mapping

Upcoming changes will remove the nvkm_vmm pointer from nvkm_vma, instead
requiring it to be explicitly specified on each operation.

It's not currently possible to get this information for BAR1 mappings,
so let's fix that ahead of time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: expose interface to bar2 teardown
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: expose interface to bar2 teardown

Will prevent spurious MMU fault interrupts if something decides to touch
BAR1 after we've unloaded the driver.

Exposed external to BAR so that INSTMEM can use it to better control the
suspend/resume fast-path access.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: expose interface to bar2 initialisation
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: expose interface to bar2 initialisation

If we want to be able to hit the instmem fast-path in a few trickier cases,
we need to be more flexible with when we can initialise BAR2 access.

There's probably a decent case to be made for merging BAR/INSTMEM into BUS,
but that's something to ponder another day.

Flushes have been added after the write to bind the instance block,
as later commits will reveal the need for them.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: implement bar1 teardown
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: implement bar1 teardown

Will prevent spurious MMU fault interrupts if something decides to touch
BAR1 after we've unloaded the driver.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: move bar1 initialisation into its own function
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: move bar1 initialisation into its own function

BAR2 being done for practical reasons, this is just for consistency.

Flushes have been added after the write to bind the instance block,
as later commits will reveal the need for them.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: swap oneinit/init ordering, and rename bar3 to bar2
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: swap oneinit/init ordering, and rename bar3 to bar2

NVIDIA call it BAR2, Linux APIs treat it as BAR3 due to BAR1 being a
64-bit BAR, which I presume take two slots or something.

No actual code changes here, just to make future commits less messy.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar: remove NV_PMC_ENABLE_PFIFO twiddling
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar: remove NV_PMC_ENABLE_PFIFO twiddling

It's handled by FIFO preinit() now.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bar/nv50,g84: drop mmu invalidate
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bar/nv50,g84: drop mmu invalidate

Will already be done by MMU as a result of the PT writes that occur
during BAR2 bootstrapping.

This is likely just a left-over from the days when it was hardcoded.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/fifo: perform reset from preinit
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/fifo: perform reset from preinit

RM appears to do this really early in its initialisation, before DEVINIT.

We currently do this before BAR2 initialisation for some reason.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/disp: add missing newline in ior debug messages
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/disp: add missing newline in ior debug messages

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/secboot: add missing newline in debug message
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/secboot: add missing newline in debug message

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/device: remove object include to prevent unnecessary rebuilds
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/device: remove object include to prevent unnecessary rebuilds

nvkm_device hasn't subclassed nvkm_object in a long time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/subdev: compile out messages for unwanted debug levels
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/subdev: compile out messages for unwanted debug levels

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/gpuobj: remove embedded struct nvkm_object
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/gpuobj: remove embedded struct nvkm_object

nvkm_gpuobj hasn't subclassed nvkm_object in a long time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/object: plumb the unmap ioctl through
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/object: plumb the unmap ioctl through

MMU will be using this for BAR mappings.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/object: allow arguments to be passed to map function
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/object: allow arguments to be passed to map function

MMU will be needing this to specify kind info on BAR mappings.

We have no userspace currently using these interfaces, so break the ABI
instead of supporting both.  NVIF version bump so any future use can be
guarded.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/core/object: separate oclass data out into its own header
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/core/object: separate oclass data out into its own header

Want to be able to include this from core/device.h without pulling in
core/object.h.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: fix handling of GART OOM on pre-NV50 chipsets
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: fix handling of GART OOM on pre-NV50 chipsets

The correct thing to do on OOM is to return 0 and set mm_node to NULL,
otherwise TTM will assume some other kind of error, and not attempt to
evict other buffers to make space.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/kms/nv50: prevent oops in failure paths
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/kms/nv50: prevent oops in failure paths

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/kms: add 8.1Gbps DP link rate
Ilia Mirkin [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/kms: add 8.1Gbps DP link rate

This was already done in dcb.c inside nvkm, but the other parser did not
get the update.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/bios/init: use ARRAY_SIZE
Jérémy Lefaure [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/bios/init: use ARRAY_SIZE

Using the ARRAY_SIZE macro improves the readability of the code. Also,
it is useless to re-invent it.

Found with Coccinelle with the following semantic patch:
@r depends on (org || report)@
type T;
T[] E;
position p;
@@
(
 (sizeof(E)@p /sizeof(*E))
|
 (sizeof(E)@p /sizeof(E[...]))
|
 (sizeof(E)@p /sizeof(T))
)

Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agoremove some useless semicolons
Ben Skeggs [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
remove some useless semicolons

Reported-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau: Document nouveau support for Tegra in DRIVER_DESC
Rhys Kidd [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau: Document nouveau support for Tegra in DRIVER_DESC

nouveau supports the Tegra K1 and higher after the SoC-based GPUs converged
with the main GeForce GPU families.

v2:
- Qualify that support is Tegra K1+ (Martin Peres)

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agodrm/nouveau/therm/gp100: initial implementation of new gp1xx temperature sensor
Rhys Kidd [Tue, 31 Oct 2017 17:56:19 +0000 (03:56 +1000)]
drm/nouveau/therm/gp100: initial implementation of new gp1xx temperature sensor

v2:
 - add nv138 and drop nv13b chipsets (Ilia Mirkin)
 - refactor out status variable and instead mask tsensor (Ilia Mirkin)
 - switch SHADOWed state message away from nvkm_error() (Ilia Mirkin)
 - rename internal temperature variable (Karol Herbst)

v3:
 - use nvkm_trace() for SHADOWed state message (Ben Skeggs)

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
7 years agoBackmerge tag 'v4.14-rc7' into drm-next
Dave Airlie [Thu, 2 Nov 2017 02:40:41 +0000 (12:40 +1000)]
Backmerge tag 'v4.14-rc7' into drm-next

Linux 4.14-rc7

Requested by Ben Skeggs for nouveau to avoid major conflicts,
and things were getting a bit conflicty already, esp around amdgpu
reverts.

7 years agoMerge tag 'drm-hisilicon-next-2017-11-01' of github.com:xin3liang/linux into drm...
Dave Airlie [Thu, 2 Nov 2017 01:36:55 +0000 (11:36 +1000)]
Merge tag 'drm-hisilicon-next-2017-11-01' of github.com:xin3liang/linux into drm-next

For 4.15

* tag 'drm-hisilicon-next-2017-11-01' of github.com:xin3liang/linux:
  drm/hisilicon: Ensure LDI regs are properly configured.

7 years agoMerge tag 'drm-msm-next-2017-11-01' of git://people.freedesktop.org/~robclark/linux...
Dave Airlie [Thu, 2 Nov 2017 01:29:28 +0000 (11:29 +1000)]
Merge tag 'drm-msm-next-2017-11-01' of git://people.freedesktop.org/~robclark/linux into drm-next

 + preemption support for a5xx[1][2]

 + display fixes for 8x96 (snapdragon 820) including fixes for 4k scanout
   (hwpipe assignment re-work to handle multiple hwpipe assigned to plane
   for wide scanout)

 + async cursor plane updates and fixes

 + refactor adreno_bind/hwinit.. still defer fw loading until device open,
   but move clk/irq/etc to probe/bind time to fix issues when fw isn't
   present in filesys

 + clk/dt bindings cleanups w/ backward compat via msm_clk_get() (dt docs
   part ack'ed by Rob Herring)

 + fw loading re-work with helper to handle either /lib/firmware/qcom/$fw
   or /lib/firmware/$fw.. background, we've started landing fw for some of
   generations in linux-firmware, but there is a preference to put fw files
   under 'qcom' subdirectory, which is not what was done on android or for
   people who copied fw from android.  So now we first look in qcom subdir
   and then fallback to the original location.

 + bunch of GPU debugging enhancements, to dump full cmdline of processes
   that trigger faults, and to add a new debugfs to capture cmdstream of
   just submits that triggered faults.. both quite useful for piglit ;-)

* tag 'drm-msm-next-2017-11-01' of git://people.freedesktop.org/~robclark/linux: (38 commits)
  drm/msm: use %z format modifier for printing size_t
  drm/msm/mdp5: Don't use async plane update path if plane visibility changes
  drm/msm/mdp5: mdp5_crtc: Restore cursor state only if LM cursors are enabled
  drm/msm/mdp5: Update mdp5_pipe_assign to spit out both planes
  drm/msm/mdp5: Prepare mdp5_pipe_assign for some rework
  drm/msm: remove mdp5_cursor_plane_funcs
  drm/msm: update cursors asynchronously through atomic
  drm/msm/atomic: switch to drm_atomic_helper_check
  drm/msm/mdp5: restore cursor state when enabling crtc
  drm/msm/mdp5: don't use autosuspend
  drm/msm/mdp5: ignore planes that are not visible
  drm/msm: dump submits which triggered gpu hang
  drm/msm: preserve IOVAs in submit's bo table
  drm/msm/rd: allow adding addition msg to top of dump
  drm/msm: split rd debugfs file
  drm/msm: add special _get_vaddr_active() for cmdstream dumps
  drm/msm: show task cmdline in gpu recovery messages
  drm/msm: dump a rd GPUADDR header for all buffers in the command
  drm/msm: Removed unused struct_mutex_task
  drm/msm: Implement preemption for A5XX targets
  ...

7 years agodrm/msm: use %z format modifier for printing size_t
Arnd Bergmann [Thu, 3 Aug 2017 11:50:48 +0000 (13:50 +0200)]
drm/msm: use %z format modifier for printing size_t

The return type of ARRAY_SIZE() is size_t, so we have to use
%zu instead of %lu to avoid this warning:

drivers/gpu/drm/msm/msm_gpu.c: In function 'msm_gpu_init':
drivers/gpu/drm/msm/msm_gpu.c:742:31: error: format '%lu' expects argument of type 'long unsigned int', but argument 7 has type 'unsigned int' [-Werror=format=]

The warning it otherwise harmless as size_t is always the
same size as unsigned long in all supported architectures,
but gcc doesn't know that.

Fixes: c2fceabca6d5 ("drm/msm: Support multiple ringbuffers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>
7 years agodrm/hisilicon: Ensure LDI regs are properly configured.
Peter Griffin [Tue, 15 Aug 2017 14:14:25 +0000 (15:14 +0100)]
drm/hisilicon: Ensure LDI regs are properly configured.

This patch fixes the following soft lockup:
  BUG: soft lockup - CPU#0 stuck for 23s! [weston:307]

On weston idle-timeout the IP is powered down and reset
asserted. On weston resume we get a massive vblank
IRQ storm due to the LDI registers having lost some state.

This state loss is caused by ade_crtc_atomic_begin() not
calling ade_ldi_set_mode(). With this patch applied
resuming from Weston idle-timeout works well.

Signed-off-by: Peter Griffin <peter.griffin@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Cc: stable@vger.kernel.org
Reviewed-by: Xinliang Liu <xinliang.liu@linaro.org>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
7 years agoLinux 4.14-rc7
Linus Torvalds [Sun, 29 Oct 2017 20:58:38 +0000 (13:58 -0700)]
Linux 4.14-rc7

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 29 Oct 2017 15:11:49 +0000 (08:11 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix route leak in xfrm_bundle_create().

 2) In mac80211, validate user rate mask before configuring it. From
    Johannes Berg.

 3) Properly enforce memory limits in fair queueing code, from Toke
    Hoiland-Jorgensen.

 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.

 5) Fix TSO header allocation and management in mvpp2 driver, from Yan
    Markman.

 6) Don't take socket lock in BH handler in strparser code, from Tom
    Herbert.

 7) Don't show sockets from other namespaces in AF_UNIX code, from
    Andrei Vagin.

 8) Fix double free in error path of tap_open(), from Girish Moodalbail.

 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
    and Alexander Duyck.

10) Fix DCB mode programming in stmmac driver, from Jose Abreu.

11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
    Long.

12) Properly align SKB head before building SKB in tuntap, from Jason
    Wang.

13) Avoid matching qdiscs with a zero handle during lookups, from Cong
    Wang.

14) Fix various endianness bugs in sctp, from Xin Long.

15) Fix tc filter callback races and add selftests which trigger the
    problem, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  selftests: Introduce a new test case to tc testsuite
  selftests: Introduce a new script to generate tc batch file
  net_sched: fix call_rcu() race on act_sample module removal
  net_sched: add rtnl assertion to tcf_exts_destroy()
  net_sched: use tcf_queue_work() in tcindex filter
  net_sched: use tcf_queue_work() in rsvp filter
  net_sched: use tcf_queue_work() in route filter
  net_sched: use tcf_queue_work() in u32 filter
  net_sched: use tcf_queue_work() in matchall filter
  net_sched: use tcf_queue_work() in fw filter
  net_sched: use tcf_queue_work() in flower filter
  net_sched: use tcf_queue_work() in flow filter
  net_sched: use tcf_queue_work() in cgroup filter
  net_sched: use tcf_queue_work() in bpf filter
  net_sched: use tcf_queue_work() in basic filter
  net_sched: introduce a workqueue for RCU callbacks of tc filter
  sctp: fix some type cast warnings introduced since very beginning
  sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
  sctp: fix some type cast warnings introduced by transport rhashtable
  sctp: fix some type cast warnings introduced by stream reconf
  ...

7 years agoMerge branch 'net_sched-fix-races-with-RCU-callbacks'
David S. Miller [Sun, 29 Oct 2017 13:49:32 +0000 (22:49 +0900)]
Merge branch 'net_sched-fix-races-with-RCU-callbacks'

Cong Wang says:

====================
net_sched: fix races with RCU callbacks

Recently, the RCU callbacks used in TC filters and TC actions keep
drawing my attention, they introduce at least 4 race condition bugs:

1. A simple one fixed by Daniel:

commit c78e1746d3ad7d548bdf3fe491898cc453911a49
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed May 20 17:13:33 2015 +0200

    net: sched: fix call_rcu() race on classifier module unloads

2. A very nasty one fixed by me:

commit 1697c4bb5245649a23f06a144cc38c06715e1b65
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Mon Sep 11 16:33:32 2017 -0700

    net_sched: carefully handle tcf_block_put()

3. Two more bugs found by Chris:
https://patchwork.ozlabs.org/patch/826696/
https://patchwork.ozlabs.org/patch/826695/

Usually RCU callbacks are simple, however for TC filters and actions,
they are complex because at least TC actions could be destroyed
together with the TC filter in one callback. And RCU callbacks are
invoked in BH context, without locking they are parallel too. All of
these contribute to the cause of these nasty bugs.

Alternatively, we could also:

a) Introduce a spinlock to serialize these RCU callbacks. But as I
said in commit 1697c4bb5245 ("net_sched: carefully handle
tcf_block_put()"), it is very hard to do because of tcf_chain_dump().
Potentially we need to do a lot of work to make it possible (if not
impossible).

b) Just get rid of these RCU callbacks, because they are not
necessary at all, callers of these call_rcu() are all on slow paths
and holding RTNL lock, so blocking is allowed in their contexts.
However, David and Eric dislike adding synchronize_rcu() here.

As suggested by Paul, we could defer the work to a workqueue and
gain the permission of holding RTNL again without any performance
impact, however, in tcf_block_put() we could have a deadlock when
flushing workqueue while hodling RTNL lock, the trick here is to
defer the work itself in workqueue and make it queued after all
other works so that we keep the same ordering to avoid any
use-after-free. Please see the first patch for details.

Patch 1 introduces the infrastructure, patch 2~12 move each
tc filter to the new tc filter workqueue, patch 13 adds
an assertion to catch potential bugs like this, patch 14
closes another rcu callback race, patch 15 and patch 16 add
new test cases.
====================

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests: Introduce a new test case to tc testsuite
Chris Mi [Fri, 27 Oct 2017 01:24:43 +0000 (18:24 -0700)]
selftests: Introduce a new test case to tc testsuite

In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
  # sudo ./tdc.py -d enp4s0f0

This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.

In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests: Introduce a new script to generate tc batch file
Chris Mi [Fri, 27 Oct 2017 01:24:42 +0000 (18:24 -0700)]
selftests: Introduce a new script to generate tc batch file

  # ./tdc_batch.py -h
  usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file

  TC batch file generator

  positional arguments:
    device                device name
    file                  batch file name

  optional arguments:
    -h, --help            show this help message and exit
    -n NUMBER, --number NUMBER
                          how many lines in batch file
    -o, --skip_sw         skip_sw (offload), by default skip_hw
    -s, --share_action    all filters share the same action
    -p, --prio            all filters have different prio

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: fix call_rcu() race on act_sample module removal
Cong Wang [Fri, 27 Oct 2017 01:24:41 +0000 (18:24 -0700)]
net_sched: fix call_rcu() race on act_sample module removal

Similar to commit c78e1746d3ad
("net: sched: fix call_rcu() race on classifier module unloads"),
we need to wait for flying RCU callback tcf_sample_cleanup_rcu().

Cc: Yotam Gigi <yotamg@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: add rtnl assertion to tcf_exts_destroy()
Cong Wang [Fri, 27 Oct 2017 01:24:40 +0000 (18:24 -0700)]
net_sched: add rtnl assertion to tcf_exts_destroy()

After previous patches, it is now safe to claim that
tcf_exts_destroy() is always called with RTNL lock.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in tcindex filter
Cong Wang [Fri, 27 Oct 2017 01:24:39 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in tcindex filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in rsvp filter
Cong Wang [Fri, 27 Oct 2017 01:24:38 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in rsvp filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in route filter
Cong Wang [Fri, 27 Oct 2017 01:24:37 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in route filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in u32 filter
Cong Wang [Fri, 27 Oct 2017 01:24:36 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in u32 filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in matchall filter
Cong Wang [Fri, 27 Oct 2017 01:24:35 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in matchall filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in fw filter
Cong Wang [Fri, 27 Oct 2017 01:24:34 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in fw filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in flower filter
Cong Wang [Fri, 27 Oct 2017 01:24:33 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in flower filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in flow filter
Cong Wang [Fri, 27 Oct 2017 01:24:32 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in flow filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in cgroup filter
Cong Wang [Fri, 27 Oct 2017 01:24:31 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in cgroup filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in bpf filter
Cong Wang [Fri, 27 Oct 2017 01:24:30 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in bpf filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: use tcf_queue_work() in basic filter
Cong Wang [Fri, 27 Oct 2017 01:24:29 +0000 (18:24 -0700)]
net_sched: use tcf_queue_work() in basic filter

Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: introduce a workqueue for RCU callbacks of tc filter
Cong Wang [Fri, 27 Oct 2017 01:24:28 +0000 (18:24 -0700)]
net_sched: introduce a workqueue for RCU callbacks of tc filter

This patch introduces a dedicated workqueue for tc filters
so that each tc filter's RCU callback could defer their
action destroy work to this workqueue. The helper
tcf_queue_work() is introduced for them to use.

Because we hold RTNL lock when calling tcf_block_put(), we
can not simply flush works inside it, therefore we have to
defer it again to this workqueue and make sure all flying RCU
callbacks have already queued their work before this one, in
other words, to ensure this is the last one to execute to
prevent any use-after-free.

On the other hand, this makes tcf_block_put() ugly and
harder to understand. Since David and Eric strongly dislike
adding synchronize_rcu(), this is probably the only
solution that could make everyone happy.

Please also see the code comments below.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sctp-endianness-fixes'
David S. Miller [Sun, 29 Oct 2017 09:03:25 +0000 (18:03 +0900)]
Merge branch 'sctp-endianness-fixes'

Xin Long says:

====================
sctp: a bunch of fixes for some sparse warnings

As Eric noticed, when running 'make C=2 M=net/sctp/', a plenty of
warnings or errors checked by sparse appear. They are all problems
about Endian and type cast.

Most of them are just warnings by which no issues could be caused
while some might be bugs.

This patchset fixes them with four patches basically according to
how they are introduced.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: fix some type cast warnings introduced since very beginning
Xin Long [Sat, 28 Oct 2017 11:43:57 +0000 (19:43 +0800)]
sctp: fix some type cast warnings introduced since very beginning

These warnings were found by running 'make C=2 M=net/sctp/'.
They are there since very beginning.

Note after this patch, there still one warning left in
sctp_outq_flush():
  sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM)

Since it has been moved to sctp_stream_outq_migrate on net-next,
to avoid the extra job when merging net-next to net, I will post
the fix for it after the merging is done.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: fix a type cast warnings that causes a_rwnd gets the wrong value
Xin Long [Sat, 28 Oct 2017 11:43:56 +0000 (19:43 +0800)]
sctp: fix a type cast warnings that causes a_rwnd gets the wrong value

These warnings were found by running 'make C=2 M=net/sctp/'.

Commit d4d6fb5787a6 ("sctp: Try not to change a_rwnd when faking a
SACK from SHUTDOWN.") expected to use the peers old rwnd and add
our flight size to the a_rwnd. But with the wrong Endian, it may
not work as well as expected.

So fix it by converting to the right value.

Fixes: d4d6fb5787a6 ("sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: fix some type cast warnings introduced by transport rhashtable
Xin Long [Sat, 28 Oct 2017 11:43:55 +0000 (19:43 +0800)]
sctp: fix some type cast warnings introduced by transport rhashtable

These warnings were found by running 'make C=2 M=net/sctp/'.

They are introduced by not aware of Endian for the port when
coding transport rhashtable patches.

Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: fix some type cast warnings introduced by stream reconf
Xin Long [Sat, 28 Oct 2017 11:43:54 +0000 (19:43 +0800)]
sctp: fix some type cast warnings introduced by stream reconf

These warnings were found by running 'make C=2 M=net/sctp/'.

They are introduced by not aware of Endian when coding stream
reconf patches.

Since commit c0d8bab6ae51 ("sctp: add get and set sockopt for
reconf_enable") enabled stream reconf feature for users, the
Fixes tag below would use it.

Fixes: c0d8bab6ae51 ("sctp: add get and set sockopt for reconf_enable")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: avoid matching qdisc with zero handle
Cong Wang [Sat, 28 Oct 2017 05:08:56 +0000 (22:08 -0700)]
net_sched: avoid matching qdisc with zero handle

Davide found the following script triggers a NULL pointer
dereference:

ip l a name eth0 type dummy
tc q a dev eth0 parent :1 handle 1: htb

This is because for a freshly created netdevice noop_qdisc
is attached and when passing 'parent :1', kernel actually
tries to match the major handle which is 0 and noop_qdisc
has handle 0 so is matched by mistake. Commit 69012ae425d7
tries to fix a similar bug but still misses this case.

Handle 0 is not a valid one, should be just skipped. In
fact, kernel uses it as TC_H_UNSPEC.

Fixes: 69012ae425d7 ("net: sched: fix handling of singleton qdiscs with qdisc_hash")
Fixes: 59cc1f61f09c ("net: sched:convert qdisc linked list to hashtable")
Reported-by: Davide Caratti <dcaratti@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosctp: reset owner sk for data chunks on out queues when migrating a sock
Xin Long [Fri, 27 Oct 2017 18:13:29 +0000 (02:13 +0800)]
sctp: reset owner sk for data chunks on out queues when migrating a sock

Now when migrating sock to another one in sctp_sock_migrate(), it only
resets owner sk for the data in receive queues, not the chunks on out
queues.

It would cause that data chunks length on the sock is not consistent
with sk sk_wmem_alloc. When closing the sock or freeing these chunks,
the old sk would never be freed, and the new sock may crash due to
the overflow sk_wmem_alloc.

syzbot found this issue with this series:

  r0 = socket$inet_sctp()
  sendto$inet(r0)
  listen(r0)
  accept4(r0)
  close(r0)

Although listen() should have returned error when one TCP-style socket
is in connecting (I may fix this one in another patch), it could also
be reproduced by peeling off an assoc.

This issue is there since very beginning.

This patch is to reset owner sk for the chunks on out queues so that
sk sk_wmem_alloc has correct value after accept one sock or peeloff
an assoc to one sock.

Note that when resetting owner sk for chunks on outqueue, it has to
sctp_clear_owner_w/skb_orphan chunks before changing assoc->base.sk
first and then sctp_set_owner_w them after changing assoc->base.sk,
due to that sctp_wfree and it's callees are using assoc->base.sk.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'sockmap-fixes'
David S. Miller [Sun, 29 Oct 2017 02:18:49 +0000 (11:18 +0900)]
Merge branch 'sockmap-fixes'

John Fastabend says:

====================
net: sockmap fixes

Last two fixes (as far as I know) for sockmap code this round.

First, we are using the qdisc cb structure when making the data end
calculation. This is really just wrong so, store it with the other
metadata in the correct tcp_skb_cb sturct to avoid breaking things.

Next, with recent work to attach multiple programs to a cgroup a
specific enumeration of return codes was agreed upon. However,
I wrote the sk_skb program types before seeing this work and used
a different convention. Patch 2 in the series aligns the return
codes to avoid breaking with this infrastructure and also aligns
with other programming conventions to avoid being the odd duck out
forcing programs to remember SK_SKB programs are different. Pusing
to net because its a user visible change. With this SK_SKB program
return codes are the same as other cgroup program types.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: rename sk_actions to align with bpf infrastructure
John Fastabend [Fri, 27 Oct 2017 16:45:53 +0000 (09:45 -0700)]
bpf: rename sk_actions to align with bpf infrastructure

Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.

To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.

This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove
SK_ABORTED to remove any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and allow programs to break existing policies.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>