csky: atomic: Optimize cmpxchg with acquire & release
author Guo Ren <guoren@linux.alibaba.com>
Wed, 6 Apr 2022 12:30:13 +0000 (20:30 +0800)
committer Guo Ren <guoren@linux.alibaba.com>
Mon, 25 Apr 2022 05:51:42 +0000 (13:51 +0800)
commit 186f69b64c80a594337211e8238e44a3863e9d94
tree 865d149558c77c8bd22a4efa31c6bb2b252e49a7
parent 8318f7c231d5be09e47410c5ab387b9bef6fe19e

Optimize cmpxchg with acquire/release fence ASM instructions
instead of the previous generic-based implementation. Skip the
fence when cmpxchg's first load != old.
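
As a rough illustration of the idea (a portable C11 sketch, not the csky
assembly from this commit), a compare-and-exchange can request ordering
only on the success path and none on the failure path. The helper name
cmpxchg_sketch is hypothetical, and mapping the kernel's fully ordered
cmpxchg() to memory_order_seq_cst is an assumption made for the example.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper, not the csky implementation.
 * Returns the value observed in *ptr, like the kernel's cmpxchg():
 * the caller compares the result with `old` to detect success.
 */
static int cmpxchg_sketch(_Atomic int *ptr, int old, int new_val)
{
    int expected = old;

    /*
     * Success ordering: seq_cst (assumed stand-in for "fully ordered").
     * Failure ordering: relaxed, so a mismatch on the first load does
     * not require any extra ordering.
     */
    atomic_compare_exchange_strong_explicit(ptr, &expected, new_val,
                                            memory_order_seq_cst,
                                            memory_order_relaxed);

    /* On failure, `expected` was overwritten with the current value. */
    return expected;
}
```

Whether the failure path actually avoids a fence instruction depends on
how the compiler and architecture lower the relaxed failure ordering.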

Comments by Rutland:

8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for
full barrier semantics")

Comments by Boqun:

FWIW, you probably need to make sure that a barrier instruction inside
an lr/sc loop is a good thing. IIUC, the execution time of a barrier
instruction is determined by the status of store buffers and invalidate
queues (and probably other things), so it may increase the execution
time of the lr/sc loop, and make it unlikely to succeed. But this really
depends on how the arch executes these instructions.
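
One way to picture the concern above is a retry loop that contains no
barrier at all, with the barrier issued once after the loop succeeds.
The following is a C11 sketch under that assumption only. The name
add_return_sketch is hypothetical, the weak compare-exchange merely
stands in for an lr/sc pair, and nothing here is taken from the csky
code. Whether this placement is faster is architecture dependent, as
the comment says.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add `delta` and return the new value. */
static int add_return_sketch(_Atomic int *ptr, int delta)
{
    int old = atomic_load_explicit(ptr, memory_order_relaxed);

    /*
     * Retry loop without any barrier inside it. The weak form may fail
     * spuriously, much like a store-conditional; on failure `old` is
     * refreshed with the current value and the loop tries again.
     */
    while (!atomic_compare_exchange_weak_explicit(ptr, &old, old + delta,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;

    /* Single barrier, issued only after the update has succeeded. */
    atomic_thread_fence(memory_order_seq_cst);

    return old + delta;
}
```

This sketch orders only what follows the update. A fully ordered
operation would also need ordering before the loop.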

Link: https://lore.kernel.org/linux-riscv/CAJF2gTSAxpAi=LbAdu7jntZRUa=-dJwL0VfmDfBV5MHB=rcZ-w@mail.gmail.com/T/#m27a0f1342995deae49ce1d0e1f2683f8a181d6c3
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
arch/csky/include/asm/barrier.h
arch/csky/include/asm/cmpxchg.h