xfs: use dedicated log worker wq to avoid deadlock with cil wq
authorBrian Foster <bfoster@redhat.com>
Tue, 28 Mar 2017 21:51:44 +0000 (14:51 -0700)
committerDarrick J. Wong <darrick.wong@oracle.com>
Mon, 3 Apr 2017 22:18:15 +0000 (15:18 -0700)
commit696a562072e3c14bcd13ae5acc19cdf27679e865
tree09f26df4208076edfb53e3e66cad7eecea5af038
parentf1c0e20243baca1194fccfdf0111831bee12647c
xfs: use dedicated log worker wq to avoid deadlock with cil wq

The log covering background task used to be part of the xfssyncd
workqueue. That workqueue was removed as of commit 5889608df ("xfs:
syncd workqueue is no more") and the associated work item scheduled
to the xfs-log wq. The latter is used for log buffer I/O completion.

Since xfs_log_worker() can invoke a log flush, a deadlock is
possible between the xfs-log and xfs-cil workqueues. Consider the
following codepath from xfs_log_worker():

xfs_log_worker()
  xfs_log_force()
    _xfs_log_force()
      xlog_cil_force()
        xlog_cil_force_lsn()
          xlog_cil_push_now()
            flush_work()

The above is in xfs-log wq context and blocked waiting on the
completion of an xfs-cil work item. Concurrently, the cil push in
progress can end up blocked here:

xlog_cil_push_work()
  xlog_cil_push()
    xlog_write()
      xlog_state_get_iclog_space()
        xlog_wait(&log->l_flush_wait, ...)

The above is in xfs-cil context waiting on log buffer I/O
completion, which executes in xfs-log wq context. In this scenario
both workqueues are deadlocked waiting on eachother.

Add a new workqueue specifically for the high level log covering and
ail pushing worker, as was the case prior to commit 5889608df.

Diagnosed-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
fs/xfs/xfs_log.c
fs/xfs/xfs_mount.h
fs/xfs/xfs_super.c