platform/kernel/linux-3.10.git
9 years agoconstify ->actor sandbox/pawelo/kernfs
Al Viro [Thu, 23 May 2013 02:22:04 +0000 (22:22 -0400)]
constify ->actor

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years ago->readdir() is gone
Al Viro [Thu, 23 May 2013 01:44:23 +0000 (21:44 -0400)]
->readdir() is gone

everything's converted to ->iterate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ecryptfs
Al Viro [Thu, 23 May 2013 01:23:40 +0000 (21:23 -0400)]
convert ecryptfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert coda
Al Viro [Thu, 23 May 2013 01:15:30 +0000 (21:15 -0400)]
convert coda

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ocfs2
Al Viro [Thu, 23 May 2013 01:06:00 +0000 (21:06 -0400)]
convert ocfs2

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert fatfs
Al Viro [Wed, 22 May 2013 22:37:16 +0000 (18:37 -0400)]
convert fatfs

... pox upon the idiotic ioctls; life would be much easier without
those.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert xfs
Al Viro [Wed, 22 May 2013 21:07:56 +0000 (17:07 -0400)]
convert xfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert btrfs
Al Viro [Wed, 22 May 2013 20:48:09 +0000 (16:48 -0400)]
convert btrfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert hostfs
Al Viro [Wed, 22 May 2013 20:34:19 +0000 (16:34 -0400)]
convert hostfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert afs
Al Viro [Wed, 22 May 2013 20:31:14 +0000 (16:31 -0400)]
convert afs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ncpfs
Al Viro [Wed, 22 May 2013 19:11:27 +0000 (15:11 -0400)]
convert ncpfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert hfsplus
Al Viro [Wed, 22 May 2013 18:59:39 +0000 (14:59 -0400)]
convert hfsplus

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert hfs
Al Viro [Wed, 22 May 2013 18:29:35 +0000 (14:29 -0400)]
convert hfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert befs
Al Viro [Wed, 22 May 2013 17:44:05 +0000 (13:44 -0400)]
convert befs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert cifs
Al Viro [Wed, 22 May 2013 20:17:25 +0000 (16:17 -0400)]
convert cifs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert freevxfs
Al Viro [Sat, 18 May 2013 07:15:00 +0000 (03:15 -0400)]
convert freevxfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert fuse
Al Viro [Sat, 18 May 2013 07:03:58 +0000 (03:03 -0400)]
convert fuse

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert hpfs
Al Viro [Sat, 18 May 2013 06:58:57 +0000 (02:58 -0400)]
convert hpfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoreiserfs: switch reiserfs_readdir_dentry to inode
Al Viro [Sat, 18 May 2013 02:58:58 +0000 (22:58 -0400)]
reiserfs: switch reiserfs_readdir_dentry to inode

... and clean the callers up a bit

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoreiserfs: is_privroot_deh() needs only directory inode, actually
Al Viro [Sat, 18 May 2013 02:45:29 +0000 (22:45 -0400)]
reiserfs: is_privroot_deh() needs only directory inode, actually

... and that - only to get the superblock.  Privroot is a directory
and we don't allow hardlinks to those...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert reiserfs
Al Viro [Sat, 18 May 2013 02:42:17 +0000 (22:42 -0400)]
convert reiserfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ntfs
Al Viro [Sat, 18 May 2013 01:22:31 +0000 (21:22 -0400)]
convert ntfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert isofs
Al Viro [Sat, 18 May 2013 01:11:59 +0000 (21:11 -0400)]
convert isofs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert jffs2
Al Viro [Fri, 17 May 2013 22:08:49 +0000 (18:08 -0400)]
convert jffs2

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert f2fs
Al Viro [Fri, 17 May 2013 22:02:17 +0000 (18:02 -0400)]
convert f2fs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert 9p
Al Viro [Fri, 17 May 2013 21:51:41 +0000 (17:51 -0400)]
convert 9p

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert affs
Al Viro [Fri, 17 May 2013 21:44:42 +0000 (17:44 -0400)]
convert affs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert adfs
Al Viro [Fri, 17 May 2013 21:30:10 +0000 (17:30 -0400)]
convert adfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert logfs
Al Viro [Fri, 17 May 2013 21:06:34 +0000 (17:06 -0400)]
convert logfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert jfs
Al Viro [Fri, 17 May 2013 21:00:34 +0000 (17:00 -0400)]
convert jfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ceph
Al Viro [Fri, 17 May 2013 20:52:26 +0000 (16:52 -0400)]
convert ceph

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert nfs
Al Viro [Fri, 17 May 2013 20:34:50 +0000 (16:34 -0400)]
convert nfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ext4
Al Viro [Fri, 17 May 2013 20:08:53 +0000 (16:08 -0400)]
convert ext4

and trim the living hell out bogosities in inline dir case

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert qnx6
Al Viro [Fri, 17 May 2013 19:32:10 +0000 (15:32 -0400)]
convert qnx6

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert qnx4
Al Viro [Fri, 17 May 2013 19:17:59 +0000 (15:17 -0400)]
convert qnx4

... and use strnlen() instead of strlen() - it's done on untrusted data,
after all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert omfs
Al Viro [Fri, 17 May 2013 19:05:25 +0000 (15:05 -0400)]
convert omfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert nilfs2
Al Viro [Thu, 16 May 2013 18:36:14 +0000 (14:36 -0400)]
convert nilfs2

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert gfs2
Al Viro [Thu, 16 May 2013 18:14:48 +0000 (14:14 -0400)]
convert gfs2

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert exofs
Al Viro [Thu, 16 May 2013 17:48:17 +0000 (13:48 -0400)]
convert exofs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert bfs
Al Viro [Thu, 16 May 2013 17:41:48 +0000 (13:41 -0400)]
convert bfs

... and get rid of that ridiculous mutex in bfs_readdir()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert procfs
Al Viro [Thu, 16 May 2013 16:07:31 +0000 (12:07 -0400)]
convert procfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert openpromfs
Al Viro [Thu, 16 May 2013 05:52:12 +0000 (01:52 -0400)]
convert openpromfs

what the hell is op_mutex for, BTW?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert efs
Al Viro [Thu, 16 May 2013 05:41:10 +0000 (01:41 -0400)]
convert efs

* sanity checks belong before risky operation, not after it
* don't quit as soon as we'd found an entry

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert configfs
Al Viro [Thu, 16 May 2013 05:28:34 +0000 (01:28 -0400)]
convert configfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert romfs
Al Viro [Thu, 16 May 2013 05:22:00 +0000 (01:22 -0400)]
convert romfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert squashfs
Al Viro [Thu, 16 May 2013 05:17:58 +0000 (01:17 -0400)]
convert squashfs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ubifs
Al Viro [Thu, 16 May 2013 05:14:46 +0000 (01:14 -0400)]
convert ubifs

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert udf
Al Viro [Thu, 16 May 2013 05:09:37 +0000 (01:09 -0400)]
convert udf

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoconvert ext3
Al Viro [Thu, 16 May 2013 01:02:48 +0000 (21:02 -0400)]
convert ext3

new helper: dir_relax(inode).  Call when you are in location that will
_not_ be invalidated by directory modifications (block boundary, in case
of ext*).  Returns whether the directory has survived (dropping i_mutex
allows rmdir to kill the sucker; if it returns false to us, ->iterate()
is obviously done)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoswitch dcache_readdir() users to ->iterate()
Al Viro [Thu, 16 May 2013 00:23:06 +0000 (20:23 -0400)]
switch dcache_readdir() users to ->iterate()

new helpers - dir_emit_dot(file, ctx, dentry), dir_emit_dotdot(file, ctx),
dir_emit_dots(file, ctx).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agosimple local unixlike: switch to ->iterate()
Al Viro [Wed, 15 May 2013 22:51:49 +0000 (18:51 -0400)]
simple local unixlike: switch to ->iterate()

ext2, ufs, minix, sysv

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agointroduce ->iterate(), ctx->pos, dir_emit()
Al Viro [Wed, 15 May 2013 22:49:12 +0000 (18:49 -0400)]
introduce ->iterate(), ctx->pos, dir_emit()

New method - ->iterate(file, ctx).  That's the replacement for ->readdir();
it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
calls dir_emit(ctx, ...) instead of filldir(data, ...).  It does *not*
update file->f_pos (or look at it, for that matter); iterate_dir() does the
update.

Note that dir_emit() takes the offset from ctx->pos (and eventually
filldir_t will lose that argument).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agointroduce iterate_dir() and dir_context
Al Viro [Wed, 15 May 2013 17:52:59 +0000 (13:52 -0400)]
introduce iterate_dir() and dir_context

iterate_dir(): new helper, replacing vfs_readdir().

struct dir_context: contains the readdir callback (and will get more stuff
in it), embedded into whatever data that callback wants to deal with;
eventually, we'll be passing it to ->readdir() replacement instead of
(data,filldir) pair.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agosysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning
Tejun Heo [Tue, 10 Dec 2013 14:29:17 +0000 (09:29 -0500)]
sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning

This is v3.14 fix for the same issue that a8b14744429f ("sysfs: give
different locking key to regular and bin files") addresses for v3.13.
Due to the extensive kernfs reorganization in v3.14 branch, the same
fix couldn't be ported as-is.  The v3.13 fix was ignored while merging
it into v3.14 branch.

027a485d12e0 ("sysfs: use a separate locking class for open files
depending on mmap") assigned different lockdep key to
sysfs_open_file->mutex depending on whether the file implements mmap
or not in an attempt to avoid spurious lockdep warning caused by
merging of regular and bin file paths.

While this restored some of the original behavior of using different
locks (at least lockdep is concerned) for the different clases of
files.  The restoration wasn't full because now the lockdep key
assignment depends on whether the file has mmap or not instead of
whether it's a regular file or not.

This means that bin files which don't implement mmap will get assigned
the same lockdep class as regular files.  This is problematic because
file_operations for bin files still implements the mmap file operation
and checking whether the sysfs file actually implements mmap happens
in the file operation after grabbing @sysfs_open_file->mutex.  We
still end up adding locking dependency from mmap locking to
sysfs_open_file->mutex to the regular file mutex which triggers
spurious circular locking warning.

For v3.13, a8b14744429f ("sysfs: give different locking key to regular
and bin files") fixed it by giving sysfs_open_file->mutex different
lockdep keys depending on whether the file is regular or bin instead
of whether mmap exists or not; however, due to the way sysfs is now
layered behind kernfs, this approach is no longer viable.  kernfs can
tell whether a sysfs node has mmap implemented or not but can't tell
whether a bin file from a regular one.

This patch updates kernfs such that kernfs_file_mmap() checks
SYSFS_FLAG_HAS_MMAP and bail before grabbing sysfs_open_file->mutex so
that it doesn't add spurious locking dependency from mmap to
sysfs_open_file->mutex and changes sysfs so that it specifies
kernfs_ops->mmap iff the sysfs file implements mmap.  Combined, this
ensures that sysfs_open_file->mutex is grabbed under mmap path iff the
sysfs file actually implements mmap.  As sysfs_open_file->mutex is
already given a different lockdep key if mmap is implemented, this
removes the spurious locking dependency.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <davej@redhat.com>
Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: remove cross inclusions of internal headers
Tejun Heo [Thu, 28 Nov 2013 19:54:47 +0000 (14:54 -0500)]
sysfs, kernfs: remove cross inclusions of internal headers

fs/kernfs/kernfs-internal.h needed to include fs/sysfs/sysfs.h because
part of kernfs core implementation was living in sysfs.

fs/sysfs/sysfs.h needed to include fs/kernfs/kernfs-internal.h because
include/linux/kernfs.h didn't expose enough interface.

The separation is complete and neither is true anymore.  Remove the
cross inclusion and make sysfs a proper user of kernfs.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: implement kernfs_ns_enabled()
Tejun Heo [Fri, 29 Nov 2013 22:19:09 +0000 (17:19 -0500)]
sysfs, kernfs: implement kernfs_ns_enabled()

fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for
SYSFS_FLAG_NS.  Let's add kernfs_ns_enabled() so that sysfs doesn't
have to test sysfs_dirent flag directly.  This makes things tidier for
kernfs proper too.

This is purely cosmetic.

v2: To avoid possible NULL deref, use noop dummy implementation which
    always returns false when !CONFIG_SYSFS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: make sysfs_dirent definition public
Tejun Heo [Fri, 29 Nov 2013 22:18:32 +0000 (17:18 -0500)]
sysfs, kernfs: make sysfs_dirent definition public

sysfs_dirent includes some information which should be available to
kernfs users - the type, flags, name and parent pointer.  This patch
moves sysfs_dirent definition from kernfs/kernfs-internal.h to
include/linux/kernfs.h so that kernfs users can access them.

The type part of flags is exported as enum kernfs_node_type, the flags
kernfs_node_flag, sysfs_type() and kernfs_enable_ns() are moved to
include/linux/kernfs.h and the former is updated to return the enum
type.  sysfs_dirent->s_parent and ->s_name are marked explicitly as
public.

This patch doesn't introduce any functional changes.

v2: Flags exported too and kernfs_enable_ns() definition moved.

v3: While moving kernfs_enable_ns() to include/linux/kernfs.h, v1 and
    v2 put the definition outside CONFIG_SYSFS replacing the dummy
    implementation with the actual implementation too.  Unfortunately,
    this can lead to oops when !CONFIG_SYSFS because
    kernfs_enable_ns() may be called on a NULL @sd and now tries to
    dereference @sd instead of not doing anything.  This issue was
    reported by Yuanhan Liu.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move mount core code to fs/kernfs/mount.c
Tejun Heo [Thu, 28 Nov 2013 19:54:44 +0000 (14:54 -0500)]
sysfs, kernfs: move mount core code to fs/kernfs/mount.c

Move core mount code to fs/kernfs/mount.c.  The respective
declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: prepare mount path for kernfs
Tejun Heo [Thu, 28 Nov 2013 19:54:43 +0000 (14:54 -0500)]
sysfs, kernfs: prepare mount path for kernfs

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges mount path so that the kernfs and sysfs parts are separate.

* As sysfs_super_info won't be visible outside kernfs proper,
  kernfs_super_ns() is added to allow kernfs users to access a
  super_block's namespace tag.

* Generic mount operation is separated out into kernfs_mount_ns().
  sysfs_mount() now just performs sysfs-specific permission check,
  acquires namespace tag, and invokes kernfs_mount_ns().

* Generic superblock release is separated out into kernfs_kill_sb()
  which can be used directly as file_system_type->kill_sb().  As sysfs
  needs to put the namespace tag, sysfs_kill_sb() wraps
  kernfs_kill_sb() with ns tag put.

* sysfs_dir_cachep init and sysfs_inode_init() are separated out into
  kernfs_init().  kernfs_init() uses only small amount of memory and
  trying to handle and propagate kernfs_init() failure doesn't make
  much sense.  Use SLAB_PANIC for sysfs_dir_cachep and make
  sysfs_inode_init() panic on failure.

  After this change, kernfs_init() should be called before
  sysfs_init(), fs/namespace.c::mnt_init() modified accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: make super_blocks bind to different kernfs_roots
Tejun Heo [Thu, 28 Nov 2013 19:54:42 +0000 (14:54 -0500)]
sysfs, kernfs: make super_blocks bind to different kernfs_roots

kernfs is being updated to allow multiple sysfs_dirent hierarchies so
that it can also be used by other users.  Currently, sysfs
super_blocks are always attached to one kernfs_root - sysfs_root - and
distinguished only by their namespace tags.

This patch adds sysfs_super_info->root and update
sysfs_fill/test_super() so that super_blocks are identified by the
combination of both the associated kernfs_root and namespace tag.
This allows mounting different kernfs hierarchies.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: make inode number ida per kernfs_root
Tejun Heo [Thu, 28 Nov 2013 19:54:41 +0000 (14:54 -0500)]
sysfs, kernfs: make inode number ida per kernfs_root

kernfs is being updated to allow multiple sysfs_dirent hierarchies so
that it can also be used by other users.  Currently, inode number is
allocated using a global ida, sysfs_ino_ida; however, inos for
different hierarchies should be handled separately.

This patch makes ino allocation per kernfs_root.  sysfs_ino_ida is
replaced by kernfs_root->ino_ida and sysfs_new_dirent() is updated to
take @root and allocate ino from it.  ida_simple_get/remove() are used
instead of sysfs_ino_lock and sysfs_alloc/free_ino().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: implement kernfs_create/destroy_root()
Tejun Heo [Thu, 28 Nov 2013 19:54:40 +0000 (14:54 -0500)]
sysfs, kernfs: implement kernfs_create/destroy_root()

There currently is single kernfs hierarchy in the whole system which
is used for sysfs.  kernfs needs to support multiple hierarchies to
allow other users.  This patch introduces struct kernfs_root which
serves as the root of each kernfs hierarchy and implements
kernfs_create/destroy_root().

* Each kernfs_root is associated with a root sd (sysfs_dentry).  The
  root is freed when the root sd is released and kernfs_destory_root()
  simply invokes kernfs_remove() on the root sd.  sysfs_remove_one()
  is updated to handle release of the root sd.  Note that ps_iattr
  update in sysfs_remove_one() is trivially updated for readability.

* Root sd's are now dynamically allocated using sysfs_new_dirent().
  Update sysfs_alloc_ino() so that it gives out ino from 1 so that the
  root sd still gets ino 1.

* While kernfs currently only points to the root sd, it'll soon grow
  fields which are specific to each hierarchy.  As determining a given
  sd's root will be necessary, sd->s_dir.root is added.  This backlink
  fits better as a separate field in sd; however, sd->s_dir is inside
  union with space to spare, so use it to save space and provide
  kernfs_root() accessor to determine the root sd.

* As hierarchies may be destroyed now, each mount needs to hold onto
  the hierarchy it's attached to.  Update sysfs_fill_super() and
  sysfs_kill_sb() so that they get and put the kernfs_root
  respectively.

* sysfs_root is replaced with kernfs_root which is dynamically created
  by invoking kernfs_create_root() from sysfs_init().

This patch doesn't introduce any visible behavior changes.

v2: kernfs_create_root() forgot to set @sd->priv.  Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce sysfs_root_sd
Tejun Heo [Thu, 28 Nov 2013 19:54:39 +0000 (14:54 -0500)]
sysfs, kernfs: introduce sysfs_root_sd

Currently, it's assumed that there's a single kernfs hierarchy in the
system anchored at sysfs_root which is defined as a global struct.  To
allow other users of kernfs, this will be made dynamic.  Introduce a
new global variable sysfs_root_sd which points to &sysfs_root and
convert all &sysfs_root users.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: no need to kern_mount() sysfs from sysfs_init()
Tejun Heo [Thu, 28 Nov 2013 19:54:38 +0000 (14:54 -0500)]
sysfs, kernfs: no need to kern_mount() sysfs from sysfs_init()

It has been very long since sysfs depended on vfs to keep track of
internal states and whether sysfs is mounted or not doesn't make any
difference to sysfs's internal operation.

In addition to init and filesystem type registration, sysfs_init()
invokes kern_mount() to create in-kernel mount of sysfs.  This
internal mounting doesn't server any purpose anymore.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: make sysfs_super_info->ns const
Tejun Heo [Thu, 28 Nov 2013 19:54:37 +0000 (14:54 -0500)]
sysfs, kernfs: make sysfs_super_info->ns const

Add const qualifier to sysfs_super_info->ns so that it's consistent
with other namespace tag usages in sysfs.  Because kobject doesn't use
const qualifier for namespace tags, this ends up requiring an explicit
cast to drop const qualifier in free_sysfs_super_info().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: drop unused params from sysfs_fill_super()
Tejun Heo [Thu, 28 Nov 2013 19:54:36 +0000 (14:54 -0500)]
sysfs, kernfs: drop unused params from sysfs_fill_super()

sysfs_fill_super() takes three params - @sb, @data and @silent - but
uses only @sb.  Drop the latter two.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move symlink core code to fs/kernfs/symlink.c
Tejun Heo [Thu, 28 Nov 2013 19:54:35 +0000 (14:54 -0500)]
sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c

Move core symlink code to fs/kernfs/symlink.c.  fs/sysfs/symlink.c now
only contains sysfs wrappers around kernfs interfaces.  The respective
declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move file core code to fs/kernfs/file.c
Tejun Heo [Thu, 28 Nov 2013 19:54:34 +0000 (14:54 -0500)]
sysfs, kernfs: move file core code to fs/kernfs/file.c

Move core file code to fs/kernfs/file.c.  fs/sysfs/file.c now contains
sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces,
and sysfs_schedule_callback().  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

v2: Refreshed on top of the v2 of "sysfs, kernfs: prepare read path
    for kernfs".

v3: Refreshed on top of the v3 of "sysfs, kernfs: prepare read path
    for kernfs".

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move dir core code to fs/kernfs/dir.c
Tejun Heo [Thu, 28 Nov 2013 19:54:33 +0000 (14:54 -0500)]
sysfs, kernfs: move dir core code to fs/kernfs/dir.c

Move core dir code to fs/kernfs/dir.c.  fs/sysfs/dir.c now only
contains sysfs_warn_dup() and sysfs wrappers around kernfs interfaces.
The respective declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

v2: sysfs_symlink_target_lock was mistakenly relocated to kernfs.  It
    should remain with sysfs.  Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move inode code to fs/kernfs/inode.c
Tejun Heo [Thu, 28 Nov 2013 19:54:32 +0000 (14:54 -0500)]
sysfs, kernfs: move inode code to fs/kernfs/inode.c

There's nothing sysfs-specific in fs/sysfs/inode.c.  Move everything
in it to fs/kernfs/inode.c.  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h
Tejun Heo [Thu, 28 Nov 2013 19:54:31 +0000 (14:54 -0500)]
sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h

Move data structure, constant and basic accessor declarations from
fs/sysfs/sysfs.h to fs/kernfs/kernfs-internal.h.  The two files
currently include each other.  Once kernfs / sysfs separation is
complete, the cross inclusions will be removed.  Inclusion protectors
are added to fs/sysfs/sysfs.h to allow cross-inclusion.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put()
Tejun Heo [Thu, 28 Nov 2013 19:54:30 +0000 (14:54 -0500)]
sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put()

Introduce kernfs interface for finding, getting and putting
sysfs_dirents.

* sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep
  assertion for sysfs_mutex is added.

* sysfs_get_dirent_ns() is renamed to kernfs_find_and_get().

* Macro inline dancing around __sysfs_get/put() are removed and
  kernfs_get/put() are made proper functions implemented in
  fs/sysfs/dir.c.

While the conversions are mostly equivalent, there's one difference -
kernfs_get() doesn't return the input param as its return value.  This
change is intentional.  While passing through the input increases
writability in some areas, it is unnecessary and has been shown to
cause confusion regarding how the last ref is handled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation
Tejun Heo [Thu, 28 Nov 2013 19:54:29 +0000 (14:54 -0500)]
sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation

Currently, sysfs_dirent active_ref lockdep annotation uses
attribute->[s]key as the lockdep key, which forces
kernfs_create_file_ns() to assume that sysfs_dirent->priv is pointing
to a struct attribute which may not be true for non-sysfs users.  This
patch restructures the lockdep annotation such that

* kernfs_ops contains lockdep_key which is used by default for files
  created kernfs_create_file_ns().

* kernfs_create_file_ns_key() is introduced which takes an extra @key
  argument.  The created file will use the specified key for
  active_ref lockdep annotation.  If NULL is specified, lockdep for
  the file is disabled.

* sysfs_add_file_mode_ns() is updated to use
  kernfs_create_file_ns_key() with the appropriate key from the
  attribute or NULL if ignore_lockdep is set.

This makes the lockdep annotation properly contained in kernfs while
allowing sysfs to cleanly keep its current behavior.  This patch
doesn't introduce any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: reorganize SYSFS_* constants
Tejun Heo [Thu, 28 Nov 2013 19:54:28 +0000 (14:54 -0500)]
sysfs, kernfs: reorganize SYSFS_* constants

We want to add one more SYSFS_FLAG_* but we can't use the next higher
bit, 0x10000, as the flag field is 16bits wide.  The flags are
currently arranged weirdly - 8 bits are set aside for the type flags
when there are only three three used, the first flag starts at 0x1000
instead of 0x0100 and flag literals have 5 digits (20 bits) when only
4 digits can be used.

Rearrange them so that type bits are only the lowest four, flags start
at 0x0010 and similar flags are grouped.

This patch doesn't cause any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_notify()
Tejun Heo [Thu, 28 Nov 2013 19:54:27 +0000 (14:54 -0500)]
sysfs, kernfs: introduce kernfs_notify()

Introduce kernfs interface to wake up poll(2) which takes and returns
sysfs_dirents.

sysfs_notify_dirent() is renamed to kernfs_notify() and sysfs_notify()
is updated so that it doesn't directly grab sysfs_mutex but acquires
the target sysfs_dirents using sysfs_get_dirent().
sysfs_notify_dirent() is reimplemented as a dumb inline wrapper around
kernfs_notify().

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: add kernfs_ops->seq_{start|next|stop}()
Tejun Heo [Thu, 28 Nov 2013 19:54:26 +0000 (14:54 -0500)]
sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}()

kernfs_ops currently only supports single_open() behavior which is
pretty restrictive.  Add optional callbacks ->seq_{start|next|stop}()
which, when implemented, are invoked for seq_file traversal.  This
allows full seq_file functionality for kernfs users.  This currently
doesn't have any user and doesn't change any behavior.

v2: Refreshed on top of the updated "sysfs, kernfs: prepare read path
    for kernfs".

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: remove sysfs_add_one()
Tejun Heo [Thu, 28 Nov 2013 19:54:25 +0000 (14:54 -0500)]
sysfs, kernfs: remove sysfs_add_one()

sysfs_add_one() is a wrapper around __sysfs_add_one() which prints out
duplicate name warning if __sysfs_add_one() fails with -EEXIST.  The
previous kernfs conversions moved all dup warnings to sysfs interface
functions and sysfs_add_one() doesn't have any user left.

Remove sysfs_add_one() and update __sysfs_add_one() to take its name.

This patch doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_create_file[_ns]()
Tejun Heo [Thu, 28 Nov 2013 19:54:24 +0000 (14:54 -0500)]
sysfs, kernfs: introduce kernfs_create_file[_ns]()

Introduce kernfs interface to create a file which takes and returns
sysfs_dirents.

The actual file creation part is separated out from
sysfs_add_file_mode_ns() into kernfs_create_file_ns().  The former now
only decides the kernfs_ops to use and the file's size and invokes the
latter.

This patch doesn't introduce behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR
Tejun Heo [Thu, 28 Nov 2013 19:54:23 +0000 (14:54 -0500)]
sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR

After kernfs_ops and sysfs_dirent->s_attr.size addition, the
distinction between SYSFS_KOBJ_BIN_ATTR and SYSFS_KOBJ_ATTR is only
necessary while creating files to decide which kernfs_ops to use.
Afterwards, they behave exactly the same.

This patch removes SYSFS_KOBJ_BIN_ATTR along with sysfs_is_bin().
sysfs_add_file[_mode_ns]() are updated to take bool @is_bin instead of
@type.

This patch doesn't introduce any behavior changes.  This completely
isolates the distinction between the two sysfs file types in the sysfs
layer proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: add sysfs_dirent->s_attr.size
Tejun Heo [Thu, 28 Nov 2013 19:54:22 +0000 (14:54 -0500)]
sysfs, kernfs: add sysfs_dirent->s_attr.size

sysfs sets the size of regular files unconditionally at PAGE_SIZE and
takes the size of bin files from bin_attribute.  The latter is a
pretty bad interface which forces bin_attribute users to create a
separate copy of bin_attribute for each instance of the file -
e.g. pci resource files.

Add sysfs_dirent->s_attr.size so that the size can be specified
separately.  This unifies inode init paths of ATTR and BIN_ATTR
identical and allows for generic size handling for kernfs.

Unfortunately, this grows the size of sysfs_dirent by sizeof(loff_t).

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_ops
Tejun Heo [Thu, 28 Nov 2013 19:54:21 +0000 (14:54 -0500)]
sysfs, kernfs: introduce kernfs_ops

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
introduces kernfs_ops which hosts methods kernfs users implement and
updates fs/sysfs/file.c such that sysfs_kf_*() functions populate
kernfs_ops and kernfs_file_*() functions call the matching entries
from kernfs_ops.

kernfs_ops contains the following groups of methods.

* seq_show() - for kernfs files which use seq_file for reads.

* read() - for direct read implementations.  Used iff seq_show() is
  not implemented.

* write() - for writes.

* mmap() - for mmaps.

Notes:

* sysfs_elem_attr->ops is added so that kernfs_ops can be accessed
  from sysfs_dirent.  kernfs_ops() helper is added to verify locking
  and access the field.

* SYSFS_FLAG_HAS_(SEQ_SHOW|MMAP) added.  sd->s_attr->ops is accessible
  only while holding active_ref and there are cases where we want to
  take different actions depending on which ops are implemented.
  These two flags cache whether the two ops are implemented for those.

* kernfs_file_*() no longer test sysfs type but chooses different
  behaviors depending on which methods in kernfs_ops are implemented.
  The conversions are trivial except for the open path.  As
  kernfs_file_open() now decides whether to allow read/write accesses
  depending on the kernfs_ops implemented, the presence of methods in
  kobjs and attribute_bin should be propagated to kernfs_ops.
  sysfs_add_file_mode_ns() is updated so that it propagates presence /
  absence of the callbacks through _empty, _ro, _wo, _rw kernfs_ops.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: use a separate locking class for open files depending on mmap
Tejun Heo [Sun, 17 Nov 2013 02:17:36 +0000 (11:17 +0900)]
sysfs: use a separate locking class for open files depending on mmap

The following two commits implemented mmap support in the regular file
path and merged bin file support into the regular path.

 73d9714627ad ("sysfs: copy bin mmap support from fs/sysfs/bin.c to fs/sysfs/file.c")
 3124eb1679b2 ("sysfs: merge regular and bin file handling")

After the merge, the following commands trigger a spurious lockdep
warning.  "test-mmap-read" simply mmaps the file and dumps the
content.

  $ cat /sys/block/sda/trace/act_mask
  $ test-mmap-read /sys/devices/pci0000\:00/0000\:00\:03.0/resource0 4096

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  3.12.0-work+ #378 Not tainted
  -------------------------------------------------------
  test-mmap-read/567 is trying to acquire lock:
   (&of->mutex){+.+.+.}, at: [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120

  but task is already holding lock:
   (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (&mm->mmap_sem){++++++}:
  ...
  -> #2 (sr_mutex){+.+.+.}:
  ...
  -> #1 (&bdev->bd_mutex){+.+.+.}:
  ...
  -> #0 (&of->mutex){+.+.+.}:
  ...

  other info that might help us debug this:

  Chain exists of:
   &of->mutex --> sr_mutex --> &mm->mmap_sem

   Possible unsafe locking scenario:

 CPU0                    CPU1
 ----                    ----
    lock(&mm->mmap_sem);
 lock(sr_mutex);
 lock(&mm->mmap_sem);
    lock(&of->mutex);

   *** DEADLOCK ***

  1 lock held by test-mmap-read/567:
   #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8114b399>] vm_mmap_pgoff+0x49/0xa0

  stack backtrace:
  CPU: 3 PID: 567 Comm: test-mmap-read Not tainted 3.12.0-work+ #378
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   ffffffff81ed41a0 ffff880009441bc8 ffffffff81611ad2 ffffffff81eccb80
   ffff880009441c08 ffffffff8160f215 ffff880009441c60 ffff880009c75208
   0000000000000000 ffff880009c751e0 ffff880009c75208 ffff880009c74ac0
  Call Trace:
   [<ffffffff81611ad2>] dump_stack+0x4e/0x7a
   [<ffffffff8160f215>] print_circular_bug+0x2b0/0x2bf
   [<ffffffff8109ca0a>] __lock_acquire+0x1a3a/0x1e60
   [<ffffffff8109d6ba>] lock_acquire+0x9a/0x1d0
   [<ffffffff81615547>] mutex_lock_nested+0x67/0x3f0
   [<ffffffff8120a8df>] sysfs_bin_mmap+0x4f/0x120
   [<ffffffff8115d363>] mmap_region+0x3b3/0x5b0
   [<ffffffff8115d8ae>] do_mmap_pgoff+0x34e/0x3d0
   [<ffffffff8114b3ba>] vm_mmap_pgoff+0x6a/0xa0
   [<ffffffff8115be3e>] SyS_mmap_pgoff+0xbe/0x250
   [<ffffffff81008282>] SyS_mmap+0x22/0x30
   [<ffffffff8161a4d2>] system_call_fastpath+0x16/0x1b

This happens because one file nests sr_mutex, which nests mm->mmap_sem
under it, under of->mutex while mmap implementation naturally nests
of->mutex under mm->mmap_sem.  The warning is false positive as
of->mutex is per open-file and the two paths belong to two different
files.  This warning didn't trigger before regular and bin file
supports were merged because only bin file supported mmap and the
other side of locking happened only on regular files which used
equivalent but separate locking.

It'd be best if we give separate locking classes per file but we can't
easily do that.  Let's differentiate on ->mmap() for now.  Later we'll
add explicit file operations struct and can add per-ops lockdep key
there.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h
Tejun Heo [Thu, 28 Nov 2013 19:54:20 +0000 (14:54 -0500)]
sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h

sysfs_open_file will be used as the primary handle for kernfs methods.
Move its definition from fs/sysfs/file.c to include/linux/kernfs.h and
mark the public and private fields.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: prepare open, release, poll paths for kernfs
Tejun Heo [Thu, 28 Nov 2013 19:54:19 +0000 (14:54 -0500)]
sysfs, kernfs: prepare open, release, poll paths for kernfs

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
prepares the rest - open, release and poll.  There isn't much to do.
Just renaming is enough.  As sysfs_file_operations and
sysfs_bin_operations are identical now, use the same file_operations
for both - kernfs_file_operations.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: prepare mmap path for kernfs
Tejun Heo [Thu, 28 Nov 2013 19:54:18 +0000 (14:54 -0500)]
sysfs, kernfs: prepare mmap path for kernfs

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges mmap path so that the kernfs and sysfs parts are separate.

sysfs_kf_bin_mmap() which handles the interaction with bin_attribute
mmap method is factored out of sysfs_bin_mmap(), which is renamed to
kernfs_file_mmap().  All vma ops are renamed accordingly.

sysfs_bin_mmap() is updated such that it can be used for both file
types.  This will eventually allow using the same file_operations for
both file types, which is necessary to separate out kernfs.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: prepare write path for kernfs
Tejun Heo [Thu, 28 Nov 2013 19:54:17 +0000 (14:54 -0500)]
sysfs, kernfs: prepare write path for kernfs

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges write path so that the kernfs and sysfs parts are separate.

kernfs_file_write() handles all boilerplate work including buffer
management and locking and invokes sysfs_kf_write() or
sysfs_kf_bin_write() depending on the file type which deals with the
interaction with kobj store or bin_attribute write method.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: prepare read path for kernfs
Tejun Heo [Thu, 28 Nov 2013 19:54:16 +0000 (14:54 -0500)]
sysfs, kernfs: prepare read path for kernfs

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges read path so that the kernfs and sysfs parts are separate.

* Regular file read path is refactored such that
  kernfs_seq_start/next/stop/show() handle all the boilerplate work
  including locking and updating event count for poll, while
  sysfs_kf_seq_show() deals with interaction with kobj show method.

* Bin file read path is refactored such that kernfs_file_direct_read()
  handles all the boilerplate work including buffer management and
  locking, while sysfs_kf_bin_read() deals with interaction with
  bin_attribute read method.

kernfs_file_read() is added.  It invokes either the seq_file or direct
read path depending on the file type.  This will eventually allow
using the same file_operations for both file types, which is necessary
to separate out kernfs.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

v2: Dropped unnecessary zeroing of @count from sysfs_kf_seq_show().
    Add comments explaining single_open() behavior.  Both suggested by
    Pavel.

v3: seq_stop() is called even after seq_start() failed.
    kernfs_seq_start() updated so that it doesn't unlock
    sysfs_open_file->mutex on failure so that kernfs_seq_stop()
    doesn't try to unlock an already unlocked mutex.  Reported by
    Fengguang.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_create_dir[_ns]()
Tejun Heo [Thu, 28 Nov 2013 19:54:15 +0000 (14:54 -0500)]
sysfs, kernfs: introduce kernfs_create_dir[_ns]()

Introduce kernfs interface to manipulate a directory which takes and
returns sysfs_dirents.

create_dir() is renamed to kernfs_create_dir_ns() and its argumantes
and return value are updated.  create_dir() usages are replaced with
kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
with kernfs_create_dir().  Dup warnings are handled explicitly by
sysfs users of the kernfs interface.

sysfs_enable_ns() is renamed to kernfs_enable_ns().

This patch doesn't introduce any behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

v3: kernfs_enable_ns() added.

v4: Refreshed on top of "sysfs: drop kobj_ns_type handling, take #2"
    so that this patch removes sysfs_enable_ns().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv
Tejun Heo [Thu, 28 Nov 2013 19:54:14 +0000 (14:54 -0500)]
sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv

A directory sysfs_dirent points to the associated kobj.  A regular or
bin file points to the associated [bin_]attribute.  This patch
replaces sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with void *
->priv.

This is to prepare for kernfs interface so that sysfs can specify the
private data in the same way for directories and files.  This lower
debuggability but not by much - the whole thing was overlaid in a
union anyway.  If debuggability becomes an issue, we can later add
->priv accessors which explicitly check for the sysfs_dirent type and
performs casting.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: separate out dup filename warning into a separate function
Tejun Heo [Thu, 24 Oct 2013 15:49:11 +0000 (11:49 -0400)]
sysfs: separate out dup filename warning into a separate function

Separate out sysfs_warn_dup() out of sysfs_add_one().  This will help
separating out the core sysfs functionalities into kernfs so that it
can be used by non-sysfs users too.

This doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: remove unused sysfs_get_dentry() prototype
Tejun Heo [Thu, 24 Oct 2013 15:49:09 +0000 (11:49 -0400)]
sysfs: remove unused sysfs_get_dentry() prototype

sysfs_get_dentry() has been gone for years now.  Remove the left-over
prototype.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: honor bin_attr.attr.ignore_lockdep
Tejun Heo [Thu, 24 Oct 2013 15:49:08 +0000 (11:49 -0400)]
sysfs: honor bin_attr.attr.ignore_lockdep

ignore_lockdep is currently honored only for regular files.  There's
no reason to ignore it for bin files.  Update sysfs_ignore_lockdep()
so that bin_attr.attr.ignore_lockdep works too.

While this doesn't have any in-kernel user, this unifies the behaviors
between regular and bin files and will help later changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr
Tejun Heo [Thu, 24 Oct 2013 15:49:07 +0000 (11:49 -0400)]
sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr

3124eb1679 ("sysfs: merge regular and bin file handling") folded bin
file handling into regular file handling.  Among other things, bin
file now shares the same open path including sysfs_open_dirent
association using sysfs_dirent->s_attr.open.  This is buggy because
->s_bin_attr lives in the same union and doesn't have the field.  This
bug doesn't trigger because sysfs_elem_bin_attr doesn't have an active
field at the conflicting position.  It does have a field "buffers" but
it isn't used anymore.

This patch collapses sysfs_elem_bin_attr into sysfs_elem_attr so that
the bin_attr is accessed through ->s_attr.bin_attr which lives with
->s_attr.attr in an anonymous union.  The code paths already assume
bin_attr contains attr as the first element, so this doesn't add any
more assumptions while making it explicit that the two types are
handled together.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: make sysfs_file_ops() follow ignore_lockdep flag
Tejun Heo [Mon, 14 Oct 2013 13:27:11 +0000 (09:27 -0400)]
sysfs: make sysfs_file_ops() follow ignore_lockdep flag

375b611e60 ("sysfs: remove sysfs_buffer->ops") introduced
sysfs_file_ops() which determines the associated file operation of a
given sysfs_dirent.  As file ops access should be protected by an
active reference, the new function includes a lockdep assertion on the
sysfs_dirent; unfortunately, I forgot to take attr->ignore_lockdep
flag into account and the lockdep assertion trips spuriously for files
which opt out from active reference lockdep checking.

# cat /sys/devices/pci0000:00/0000:00:01.2/usb1/authorized

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 540 at /work/os/work/fs/sysfs/file.c:79 sysfs_file_ops+0x4e/0x60()
 Modules linked in:
 CPU: 1 PID: 540 Comm: cat Not tainted 3.11.0-work+ #3
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000009 ffff880016205c08 ffffffff81ca0131 0000000000000000
  ffff880016205c40 ffffffff81096d0d ffff8800166cb898 ffff8800166f6f60
  ffffffff8125a220 ffff880011ab1ec0 ffff88000aff0c78 ffff880016205c50
 Call Trace:
  [<ffffffff81ca0131>] dump_stack+0x4e/0x82
  [<ffffffff81096d0d>] warn_slowpath_common+0x7d/0xa0
  [<ffffffff81096dea>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8125994e>] sysfs_file_ops+0x4e/0x60
  [<ffffffff8125a274>] sysfs_open_file+0x54/0x300
  [<ffffffff811df612>] do_dentry_open.isra.17+0x182/0x280
  [<ffffffff811df820>] finish_open+0x30/0x40
  [<ffffffff811f0623>] do_last+0x503/0xd90
  [<ffffffff811f0f6b>] path_openat+0xbb/0x6d0
  [<ffffffff811f23ba>] do_filp_open+0x3a/0x90
  [<ffffffff811e09a9>] do_sys_open+0x129/0x220
  [<ffffffff811e0abe>] SyS_open+0x1e/0x20
  [<ffffffff81caf3c2>] system_call_fastpath+0x16/0x1b
 ---[ end trace aa48096b111dafdb ]---

Rename fs/sysfs/dir.c::ignore_lockdep() to sysfs_ignore_lockdep() and
move it to fs/sysfs/sysfs.h and make sysfs_file_ops() skip lockdep
assertion if sysfs_ignore_lockdep() is true.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_setattr()
Tejun Heo [Sat, 23 Nov 2013 22:21:52 +0000 (17:21 -0500)]
sysfs, kernfs: introduce kernfs_setattr()

Introduce kernfs setattr interface - kernfs_setattr().

sysfs_sd_setattr() is renamed to __kernfs_setattr() and
kernfs_setattr() is a simple wrapper around it with sysfs_mutex
locking.  sysfs_chmod_file() is updated to get an explicit ref on
kobj->sd and then invoke kernfs_setattr() so that it doesn't have to
use internal interface.

This patch doesn't introduce any behavior differences.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_rename[_ns]()
Tejun Heo [Sat, 23 Nov 2013 22:21:51 +0000 (17:21 -0500)]
sysfs, kernfs: introduce kernfs_rename[_ns]()

Introduce kernfs rename interface, krenfs_rename[_ns]().

This is just rename of sysfs_rename().  No functional changes.
Function comment is added to kernfs_rename_ns() and @new_parent_sd is
renamed to @new_parent for consistency with other kernfs interfaces.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_create_link()
Tejun Heo [Sat, 23 Nov 2013 22:21:50 +0000 (17:21 -0500)]
sysfs, kernfs: introduce kernfs_create_link()

Separate out kernfs symlink interface - kernfs_create_link() - which
takes and returns sysfs_dirents, from sysfs_do_create_link_sd().
sysfs_do_create_link_sd() now just determines the parent and target
sysfs_dirents and invokes the new interface and handles dup warning.

This patch doesn't introduce behavior changes.

v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]()
Tejun Heo [Sat, 23 Nov 2013 22:21:49 +0000 (17:21 -0500)]
sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]()

Introduce kernfs removal interfaces - kernfs_remove() and
kernfs_remove_by_name[_ns]().

These are just renames of sysfs_remove() and sysfs_hash_and_remove().
No functional changes.

v2: Dummy kernfs_remove_by_name_ns() for !CONFIG_SYSFS updated to
    return -ENOSYS instead of 0.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c
Tejun Heo [Thu, 24 Oct 2013 15:49:10 +0000 (11:49 -0400)]
sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c

Most removal related logic is implemented in fs/sysfs/dir.c.  Move
sysfs_hash_and_remove() to fs/sysfs/dir.c so that __sysfs_remove()
doesn't have to be public.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agosysfs: make sure read buffer is zeroed
Tejun Heo [Mon, 19 May 2014 19:52:10 +0000 (15:52 -0400)]
sysfs: make sure read buffer is zeroed

13c589d5b0ac ("sysfs: use seq_file when reading regular files")
switched sysfs from custom read implementation to seq_file to enable
later transition to kernfs.  After the change, the buffer passed to
->show() is acquired through seq_get_buf(); unfortunately, this
introduces a subtle behavior change.  Before the commit, the buffer
passed to ->show() was always zero as it was allocated using
get_zeroed_page().  Because seq_file doesn't clear buffers on
allocation and neither does seq_get_buf(), after the commit, depending
on the behavior of ->show(), we may end up exposing uninitialized data
to userland thus possibly altering userland visible behavior and
leaking information.

Fix it by explicitly clearing the buffer.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ron <ron@debian.org>
Fixes: 13c589d5b0ac ("sysfs: use seq_file when reading regular files")
Cc: stable <stable@vger.kernel.org> # 3.13+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>