Tejun Heo [Thu, 12 Sep 2013 02:29:06 +0000 (22:29 -0400)]
sysfs: remove ktype->namespace() invocations in symlink code
There's no reason for sysfs to be calling ktype->namespace(). It is
backwards, obfuscates what's going on and unnecessarily tangles two
separate layers.
There are two places where symlink code calls ktype->namespace().
* sysfs_do_create_link_sd() calls it to find out the namespace tag of
the target directory. Unless symlinking races with cross-namespace
renaming, this equals @target_sd->s_ns.
* sysfs_rename_link() uses it to find out the new namespace to rename
to and the new namespace can be different from the existing one.
The function is renamed to sysfs_rename_link_ns() with an explicit
@ns argument and the ktype->namespace() invocation is shifted to the
device layer.
While this patch replaces ktype->namespace() invocation with the
recorded result in @target_sd, this shouldn't result in any behvior
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tejun Heo [Thu, 12 Sep 2013 02:29:05 +0000 (22:29 -0400)]
sysfs: remove ktype->namespace() invocations in directory code
For some unrecognizable reason, namespace information is communicated
to sysfs through ktype->namespace() callback when there's *nothing*
which needs the use of a callback. The whole sequence of operations
is completely synchronous and sysfs operations simply end up calling
back into the layer which just invoked it in order to find out the
namespace information, which is completely backwards, obfuscates
what's going on and unnecessarily tangles two separate layers.
This patch doesn't remove ktype->namespace() but shifts its handling
to kobject layer. We probably want to get rid of the callback in the
long term.
This patch adds an explicit param to sysfs_{create|rename|move}_dir()
and renames them to sysfs_{create|rename|move}_dir_ns(), respectively.
ktype->namespace() invocations are moved to the calling sites of the
above functions. A new helper kboject_namespace() is introduced which
directly tests kobj_ns_type_operations->type which should give the
same result as testing sysfs_fs_type(parent_sd) and returns @kobj's
namespace tag as necessary. kobject_namespace() is extern as it will
be used from another file in the following patches.
This patch should be an equivalent conversion without any functional
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tejun Heo [Thu, 12 Sep 2013 02:29:04 +0000 (22:29 -0400)]
sysfs: make attr namespace interface less convoluted
sysfs ns (namespace) implementation became more convoluted than
necessary while trying to hide ns information from visible interface.
The relatively recent attr ns support is a good example.
* attr ns tag is determined by sysfs_ops->namespace() callback while
dir tag is determined by kobj_type->namespace(). The placement is
arbitrary.
* Instead of performing operations with explicit ns tag, the namespace
callback is routed through sysfs_attr_ns(), sysfs_ops->namespace(),
class_attr_namespace(), class_attr->namespace(). It's not simpler
in any sense. The only thing this convolution does is traversing
the whole stack backwards.
The namespace callbacks are unncessary because the operations involved
are inherently synchronous. The information can be provided in in
straight-forward top-down direction and reversing that direction is
unnecessary and against basic design principles.
This backward interface is unnecessarily convoluted and hinders
properly separating out sysfs from driver model / kobject for proper
layering. This patch updates attr ns support such that
* sysfs_ops->namespace() and class_attr->namespace() are dropped.
* sysfs_{create|remove}_file_ns(), which take explicit @ns param, are
added and sysfs_{create|remove}_file() are now simple wrappers
around the ns aware functions.
* ns handling is dropped from sysfs_chmod_file(). Nobody uses it at
this point. sysfs_chmod_file_ns() can be added later if necessary.
* Explicit @ns is propagated through class_{create|remove}_file_ns()
and netdev_class_{create|remove}_file_ns().
* driver/net/bonding which is currently the only user of attr
namespace is updated to use netdev_class_{create|remove}_file_ns()
with @bh->net as the ns tag instead of using the namespace callback.
This patch should be an equivalent conversion without any functional
difference. It makes the code easier to follow, reduces lines of code
a bit and helps proper separation and layering.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tejun Heo [Thu, 12 Sep 2013 02:29:03 +0000 (22:29 -0400)]
sysfs: drop semicolon from to_sysfs_dirent() definition
The expansion of to_sysfs_dirent() contains an unncessary trailing
semicolon making it impossible to use in the middle of statements.
Drop it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 26 Mar 2013 03:07:01 +0000 (20:07 -0700)]
sysfs: Restrict mounting sysfs
Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
over the net namespace. The principle here is if you create or have
capabilities over it you can mount it, otherwise you get to live with
what other people have mounted.
Instead of testing this with a straight forward ns_capable call,
perform this check the long and torturous way with kobject helpers,
this keeps direct knowledge of namespaces out of sysfs, and preserves
the existing sysfs abstractions.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Eric W. Biederman [Sun, 31 Mar 2013 02:57:41 +0000 (19:57 -0700)]
userns: Better restrictions on when proc and sysfs can be mounted
Rely on the fact that another flavor of the filesystem is already
mounted and do not rely on state in the user namespace.
Verify that the mounted filesystem is not covered in any significant
way. I would love to verify that the previously mounted filesystem
has no mounts on top but there are at least the directories
/proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
for other filesystems to mount on top of.
Refactor the test into a function named fs_fully_visible and call that
function from the mount routines of proc and sysfs. This makes this
test local to the filesystems involved and the results current of when
the mounts take place, removing a weird threading of the user
namespace, the mount namespace and the filesystems themselves.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:37:42 +0000 (16:37 -0700)]
sysfs: file.c: fix up broken string warnings
This fixes the coding style warnings in fs/sysfs/file.c for broken
strings across lines.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:34:59 +0000 (16:34 -0700)]
sysfs: fix up uaccess.h coding style warnings
This fixes the uaccess.h warnings in the sysfs.c files.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:33:34 +0000 (16:33 -0700)]
sysfs: fix up 80 column coding style issues
This fixes up the 80 column coding style issues in the sysfs .c files.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:28:26 +0000 (16:28 -0700)]
sysfs: fix up space coding style issues
This fixes up all of the space-related coding style issues for the sysfs
code.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:21:17 +0000 (16:21 -0700)]
sysfs: remove trailing whitespace
This removes all trailing whitespace errors in the sysfs code.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:17:47 +0000 (16:17 -0700)]
sysfs: fix placement of EXPORT_SYMBOL()
The export should happen after the function, not at the bottom of the
file, so fix that up.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:14:11 +0000 (16:14 -0700)]
sysfs: group: update copyright to add myself and the LF
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:12:34 +0000 (16:12 -0700)]
sysfs: group.c: add kerneldoc for sysfs_remove_group
sysfs_remove_group() never had kerneldoc, so add it, and fix up the
kerneldoc for sysfs_remove_groups() which didn't specify the parameters
properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:10:02 +0000 (16:10 -0700)]
sysfs: group.c: fix up broken string coding style
checkpatch complains about the broken string in the file, and it's
correct, so fix it up.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:07:29 +0000 (16:07 -0700)]
sysfs: group.c: fix up some * coding style issues
This fixes up the * coding style warnings for the group.c sysfs file.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:06:14 +0000 (16:06 -0700)]
sysfs: group.c: fix trailing whitespace
There was some trailing spaces in the file, fix that up.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 23:04:12 +0000 (16:04 -0700)]
sysfs: group.c: move EXPORT_SYMBOL_GPL() to the proper location
This fixes up the coding style issue of incorrectly placing the
EXPORT_SYMBOL_GPL() macro, it should be right after the function itself,
not at the end of the file.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 21 Aug 2013 20:47:50 +0000 (13:47 -0700)]
sysfs: add sysfs_create/remove_groups()
These functions are being open-coded in 3 different places in the driver
core, and other driver subsystems will want to start doing this as well,
so move it to the sysfs core to keep it all in one place, where we know
it is written properly.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Schinagl [Sun, 14 Jul 2013 23:05:56 +0000 (16:05 -0700)]
sysfs: prevent warning when only using binary attributes
When only using bin_attrs instead of attrs the kernel prints a warning
and refuses to create the sysfs entry. This fixes that.
Signed-off-by: Oliver Schinagl <oliver@schinagl.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sun, 14 Jul 2013 23:05:55 +0000 (16:05 -0700)]
sysfs: add support for binary attributes in groups
groups should be able to support binary attributes, just like it
supports "normal" attributes. This lets us only handle one type of
structure, groups, throughout the driver core and subsystems, making
binary attributes a "full fledged" part of the driver model, and not
something just "tacked on".
Reported-by: Oliver Schinagl <oliver@schinagl.nl>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nick Dyer [Fri, 7 Jun 2013 14:45:13 +0000 (15:45 +0100)]
sysfs_notify is only possible on file attributes
If sysfs_notify is called on a binary attribute, bad things can
happen, so prevent it.
Note, no in-kernel usage of this is currently present, but in the
future, it's good to be safe.
Changes in V2:
- Also ignore sysfs_notify on dirs, links
- Use WARN_ON rather than silently failing
- Compiled and tested (huge apologies about first submission)
Signed-off-by: Nick Dyer <nick.dyer@itdev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rami Rosen [Mon, 29 Apr 2013 13:05:32 +0000 (16:05 +0300)]
sysfs: kill sysfs_sb declaration in fs/sysfs/inode.c.
This patch removes sysfs_sb declaration from fs/sysfs/inode.c
(due to
0f4288ec6fcc1a47d1fa0241ec1c6dacd5a09e96,
"Kill unused sysfs_sb variable").
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Warner Wang [Mon, 13 May 2013 03:11:05 +0000 (11:11 +0800)]
sysfs: sysfs_link_sibling(): fix typo in comment
Fix a typo subling->sibling in the comment of sysfs_link_sibling().
Signed-off-by: Warner Wang <warner.wang@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anand Avati [Thu, 5 Sep 2013 09:44:44 +0000 (11:44 +0200)]
fuse: drop dentry on failed revalidate
Drop a subtree when we find that it has moved or been delated. This can be
done as long as there are no submounts under this location.
If the directory was moved and we come across the same directory in a
future lookup it will be reconnected by d_materialise_unique().
Signed-off-by: Anand Avati <avati@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:43 +0000 (11:44 +0200)]
fuse: clean up return in fuse_dentry_revalidate()
On errors unrelated to the filesystem's state (ENOMEM, ENOTCONN) return the
error itself from ->d_revalidate() insted of returning zero (invalid).
Also make a common label for invalidating the dentry. This will be used by
the next patch.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:42 +0000 (11:44 +0200)]
fuse: use d_materialise_unique()
Use d_materialise_unique() instead of d_splice_alias(). This allows dentry
subtrees to be moved to a new place if there moved, even if something is
referencing a dentry in the subtree (open fd, cwd, etc..).
This will also allow us to drop a subtree if it is found to be replaced by
something else. In this case the disconnected subtree can later be
reconnected to its new location.
d_materialise_unique() ensures that a directory entry only ever has one
alias. We keep fc->inst_mutex around the calls for d_materialise_unique()
on directories to prevent a race with mkdir "stealing" the inode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:41 +0000 (11:44 +0200)]
sysfs: use check_submounts_and_drop()
Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
check_submounts_and_drop() can deal with negative dentries and
non-directories as well.
Non-directories can also be mounted on. And just like directories we don't
want these to disappear with invalidation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:40 +0000 (11:44 +0200)]
nfs: use check_submounts_and_drop()
Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
check_submounts_and_drop() can deal with negative dentries and
non-directories as well.
Non-directories can also be mounted on. And just like directories we don't
want these to disappear with invalidation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:39 +0000 (11:44 +0200)]
gfs2: use check_submounts_and_drop()
Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
check_submounts_and_drop() can deal with negative dentries and
non-directories as well.
Non-directories can also be mounted on. And just like directories we don't
want these to disappear with invalidation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:38 +0000 (11:44 +0200)]
afs: use check_submounts_and_drop()
Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
check_submounts_and_drop() can deal with negative dentries as well.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 12:39:11 +0000 (14:39 +0200)]
vfs: check unlinked ancestors before mount
We check submounts before doing d_drop() on a non-empty directory dentry in
NFS (have_submounts()), but we do not exclude a racing mount. Nor do we
prevent mounts to be added to the disconnected subtree using relative paths
after the d_drop().
This patch fixes these issues by checking for unlinked (unhashed, non-root)
ancestors before proceeding with the mount. This is done with rename
seqlock taken for write and with ->d_lock grabbed on each ancestor in turn,
including our dentry itself. This ensures that the only one of
check_submounts_and_drop() or has_unlinked_ancestor() can succeed.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:36 +0000 (11:44 +0200)]
vfs: check submounts and drop atomically
We check submounts before doing d_drop() on a non-empty directory dentry in
NFS (have_submounts()), but we do not exclude a racing mount.
Process A: have_submounts() -> returns false
Process B: mount() -> success
Process A: d_drop()
This patch prepares the ground for the fix by doing the following
operations all under the same rename lock:
have_submounts()
shrink_dcache_parent()
d_drop()
This is actually an optimization since have_submounts() and
shrink_dcache_parent() both traverse the same dentry tree separately.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: David Howells <dhowells@redhat.com>
CC: Steven Whitehouse <swhiteho@redhat.com>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:35 +0000 (11:44 +0200)]
vfs: add d_walk()
This one replaces three instances open coded tree walking (have_submounts,
select_parent, d_genocide) with a common helper.
In addition to slightly reducing the kernel size, this simplifies the
callers and makes them less bug prone.
Change-Id: I82891c4cc0b3cd13cc4faef5656d4eb01f4f1e99
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Miklos Szeredi [Thu, 5 Sep 2013 09:44:34 +0000 (11:44 +0200)]
vfs: restructure d_genocide()
It shouldn't matter when we decrement the refcount during the walk as long
as we do it exactly once.
Restructure d_genocide() to do the killing on entering the dentry instead
of when leaving it. This helps creating a common helper for tree walking.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Yan, Zheng [Tue, 13 Aug 2013 07:42:02 +0000 (15:42 +0800)]
vfs: call d_op->d_prune() before unhashing dentry
The d_prune dentry operation is used to notify filesystem when VFS
about to prune a hashed dentry from the dcache. There are three
code paths that prune dentries: shrink_dcache_for_umount_subtree(),
prune_dcache_sb() and d_prune_aliases(). For the d_prune_aliases()
case, VFS unhashes the dentry first, then call the d_prune dentry
operation. This confuses ceph_d_prune() (ceph uses the d_prune
dentry operation to maintain a flag indicating whether the complete
contents of a directory are in the dcache, pruning unhashed dentry
does not affect dir's completeness)
This patch fixes the issue by calling the d_prune dentry operation
in d_prune_aliases(), before unhashing the dentry. Also make VFS
only call the d_prune dentry operation for hashed dentry, to avoid
calling the d_prune dentry operation twice when dentry is pruned
by d_prune_aliases().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Mon, 2 Sep 2013 18:38:06 +0000 (11:38 -0700)]
vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock()
This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c
and re-implements it using the lockref infrastructure instead. It also
adds a lot of comments about what is actually going on, because turning
a dentry that was looked up using RCU into a long-lived reference
counted entry is one of the more subtle parts of the rcu walk.
We also used to be _particularly_ subtle in unlazy_walk() where we
re-validate both the dentry and its parent using the same sequence
count. We used to do it by nesting the locks and then verifying the
sequence count just once.
That was silly, because nested locking is expensive, but the sequence
count check is not. So this just re-validates the dentry and the parent
separately, avoiding the nested locking, and making the lockref lookup
possible.
Acked-by: Waiman Long <waiman.long@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Mon, 2 Sep 2013 18:29:22 +0000 (11:29 -0700)]
vfs: use lockref_get_not_zero() for optimistic lockless dget_parent()
A valid parent pointer is always going to have a non-zero reference
count, but if we look up the parent optimistically without locking, we
have to protect against the (very unlikely) race against renaming
changing the parent from under us.
We do that by using lockref_get_not_zero(), and then re-checking the
parent pointer after getting a valid reference.
[ This is a re-implementation of a chunk from the original patch by
Waiman Long: "dcache: Enable lockless update of dentry's refcount".
I've completely rewritten the patch-series and split it up, but I'm
attributing this part to Waiman as it's close enough to his earlier
patch - Linus ]
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Thu, 29 Aug 2013 01:24:59 +0000 (18:24 -0700)]
vfs: make the dentry cache use the lockref infrastructure
This just replaces the dentry count/lock combination with the lockref
structure that contains both a count and a spinlock, and does the
mechanical conversion to use the lockref infrastructure.
There are no semantic changes here, it's purely syntactic. The
reference lockref implementation uses the spinlock exactly the same way
that the old dcache code did, and the bulk of this patch is just
expanding the internal "d_count" use in the dcache code to use
"d_lockref.count" instead.
This is purely preparation for the real change to make the reference
count updates be lockless during the 3.12 merge window.
[ As with the previous commit, this is a rewritten version of a concept
originally from Waiman, so credit goes to him, blame for any errors
goes to me.
Waiman's patch had some semantic differences for taking advantage of
the lockless update in dget_parent(), while this patch is
intentionally a pure search-and-replace change with no semantic
changes. - Linus ]
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peng Tao [Thu, 18 Jul 2013 14:09:08 +0000 (22:09 +0800)]
vfs: constify dentry parameter in d_count()
so that it can be used in places like d_compare/d_hash
without causing a compiler warning.
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 5 Jul 2013 14:59:33 +0000 (18:59 +0400)]
helper for reading ->d_count
Change-Id: I17f408c47173052817d0fb79f8506e418e47a5de
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Will Deacon [Wed, 27 Nov 2013 13:52:53 +0000 (13:52 +0000)]
lockref: include mutex.h rather than reinvent arch_mutex_cpu_relax
arch_mutex_cpu_relax is already conditionally defined in mutex.h, so
simply include that header rather than replicate the code here.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Thu, 14 Nov 2013 22:31:54 +0000 (14:31 -0800)]
lockref: use BLOATED_SPINLOCKS to avoid explicit config dependencies
Avoid the fragile Kconfig construct guestimating spinlock_t sizes; use a
friendly compile-time test to determine this.
[kirill.shutemov@linux.intel.com: drop CONFIG_CMPXCHG_LOCKREF]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Whitehouse [Tue, 15 Oct 2013 14:18:08 +0000 (15:18 +0100)]
GFS2: Use lockref for glocks
Currently glocks have an atomic reference count and also a spinlock
which covers various internal fields, such as the state. This intent of
this patch is to replace the spinlock and the atomic reference count
with a lockref structure. This contains a spinlock which we can continue
to use as before, and a reference counter which is used in conjuction
with the spinlock to replace the previous atomic counter.
As a result of this there are some new rules for reference counting on
glocks. We need to distinguish between reference count changes under
gl_spin (which are now just increment or decrement of the new counter,
provided the count cannot hit zero) and those which are outside of
gl_spin, but which now take gl_spin internally.
The conversion is relatively straight forward. There is probably some
further clean up which can be done, but the priority at this stage is to
make the change in as simple a manner as possible.
A consequence of this change is that the reference count is being
decoupled from the lru list processing. This should allow future
adoption of the lru_list code with glocks in due course.
The reason for using the "dead" state and not just relying on 0 being
the "invalid state" is so that in due course 0 ref counts can be
allowable. The intent is to eventually be able to remove the ref count
changes which are currently hidden away in state_change().
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Steven Whitehouse [Tue, 20 Aug 2013 08:35:09 +0000 (09:35 +0100)]
GFS2: Take glock reference in examine_bucket()
We need to check the glock ref counter in a race free way
in order to ensure that the gfs2_glock_hold() call will
succeed. The easiest way to do that is to simply take the
reference count early in the common code of examine_bucket,
skipping any glocks with zero ref count.
That means that the examiner functions all need to put their
reference on the glock once they've performed their function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>
Tested-by: David Teigland <teigland@redhat.com>
Heiko Carstens [Mon, 23 Sep 2013 10:59:56 +0000 (12:59 +0200)]
lockref: use arch_mutex_cpu_relax() in CMPXCHG_LOOP()
Make use of arch_mutex_cpu_relax() so architectures can override the
default cpu_relax() semantics.
This is especially useful for s390, where cpu_relax() means that we
yield() the current (virtual) cpu and therefore is very expensive,
and would contradict the whole purpose of the lockless cmpxchg loop.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Will Deacon [Thu, 26 Sep 2013 16:27:00 +0000 (17:27 +0100)]
lockref: allow relaxed cmpxchg64 variant for lockless updates
The 64-bit cmpxchg operation on the lockref is ordered by virtue of
hazarding between the cmpxchg operation and the reference count
manipulation. On weakly ordered memory architectures (such as ARM), it
can be of great benefit to omit the barrier instructions where they are
not needed.
This patch moves the lockless lockref code over to a cmpxchg64_relaxed
operation, which doesn't provide barrier semantics. If the operation
isn't defined, we simply #define it as the usual 64-bit cmpxchg macro.
Cc: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Thu, 19 Sep 2013 18:06:46 +0000 (19:06 +0100)]
lockref: use cmpxchg64 explicitly for lockless updates
The cmpxchg() function tends not to support 64-bit arguments on 32-bit
architectures. This could be either due to use of unsigned long
arguments (like on ARM) or lack of instruction support (cmpxchgq on
x86). However, these architectures may implement a specific cmpxchg64()
function to provide 64-bit cmpxchg support instead.
Since the lockref code requires a 64-bit cmpxchg and relies on the
architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64
instead of cmpxchg and allow 32-bit architectures to make use of the
lockless lockref implementation.
Cc: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 7 Sep 2013 22:49:18 +0000 (15:49 -0700)]
lockref: add ability to mark lockrefs "dead"
The only actual current lockref user (dcache) uses zero reference counts
even for perfectly live dentries, because it's a cache: there may not be
any users, but that doesn't mean that we want to throw away the dentry.
At the same time, the dentry cache does have a notion of a truly "dead"
dentry that we must not even increment the reference count of, because
we have pruned it and it is not valid.
Currently that distinction is not visible in the lockref itself, and the
dentry cache validation uses "lockref_get_or_lock()" to either get a new
reference to a dentry that already had existing references (and thus
cannot be dead), or get the dentry lock so that we can then verify the
dentry and increment the reference count under the lock if that
verification was successful.
That's all somewhat complicated.
This adds the concept of being "dead" to the lockref itself, by simply
using a count that is negative. This allows a usage scenario where we
can increment the refcount of a dentry without having to validate it,
and pushing the special "we killed it" case into the lockref code.
The dentry code itself doesn't actually use this yet, and it's probably
too late in the merge window to do that code (the dentry_kill() code
with its "should I decrement the count" logic really is pretty complex
code), but let's introduce the concept at the lockref level now.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 2 Sep 2013 19:12:15 +0000 (12:12 -0700)]
lockref: implement lockless reference count updates using cmpxchg()
Instead of taking the spinlock, the lockless versions atomically check
that the lock is not taken, and do the reference count update using a
cmpxchg() loop. This is semantically identical to doing the reference
count update protected by the lock, but avoids the "wait for lock"
contention that you get when accesses to the reference count are
contended.
Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
Even when the lockref reference counts are updated atomically with
cmpxchg, the fact that they also verify the state of the spinlock means
that the lockless updates can never happen while somebody else holds the
spinlock.
So while "lockref_put_or_lock()" looks a lot like just another name for
"atomic_dec_and_lock()", and both optimize to lockless updates, they are
fundamentally different: the decrement done by atomic_dec_and_lock() is
truly independent of any lock (as long as it doesn't decrement to zero),
so a locked region can still see the count change.
The lockref structure, in contrast, really is a *locked* reference
count. If you hold the spinlock, the reference count will be stable and
you can modify the reference count without using atomics, because even
the lockless updates will see and respect the state of the lock.
In order to enable the cmpxchg lockless code, the architecture needs to
do three things:
(1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
in an aligned u64, and have a "cmpxchg()" implementation that works
on such a u64 data type.
(2) define a helper function to test for a spinlock being unlocked
("arch_spin_value_unlocked()")
(3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
Kconfig file.
This enables it for x86-64 (but not 32-bit, we'd need to make sure
cmpxchg() turns into the proper cmpxchg8b in order to enable it for
32-bit mode).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 2 Sep 2013 18:58:20 +0000 (11:58 -0700)]
lockref: uninline lockref helper functions
They aren't very good to inline, since they already call external
functions (the spinlock code), and we're going to create rather more
complicated versions of them that can do the reference count updates
locklessly.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 2 Sep 2013 18:14:19 +0000 (11:14 -0700)]
lockref: add 'lockref_get_or_lock() helper
This behaves like "lockref_get_not_zero()", but instead of doing nothing
if the count was zero, it returns with the lock held.
This allows callers to revalidate the lockref-protected data structure
if required even if the count was zero to begin with, and possibly
increment the count if it passes muster.
In particular, the dentry code wants this when it wants to turn an
RCU-protected dentry into a stable refcounted one: if the dentry count
it zero, but the sequence number still validates the dentry, we can take
a reference to it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Thu, 29 Aug 2013 01:13:26 +0000 (18:13 -0700)]
Add new lockref infrastructure reference implementation
This introduces a new "lockref" structure that supports the concept of
lockless updates of reference counts that still honor an attached
spinlock.
NOTE! This reference implementation is not the optimized lockless
version, rather it is the fallback implementation using standard
spinlocks. The actual optimized versions will be merged into 3.12, but
I wanted to get the infrastructure in place and document the new
interfaces.
[ Also note that this particular commit is drastically cut-down minimal
version of the original patch by Waiman. In order to properly credit
the original author I'm marking Waiman as the author here, but in the
end this patch bears little resemblance to the patch by Waiman. So
blame any errors on me editing things down to the point where I can
introduce the infrastructure before the merge window for 3.12 actually
opens. - Linus ]
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maciej Wereski [Tue, 17 Mar 2015 15:01:17 +0000 (16:01 +0100)]
rinato: Enable CONFIG_DEVPTS_MULTIPLE_INSTANCES
This is required by systemd. Other configurations has this already
enabled.
Related: https://git.tizen.org/cgit/platform/upstream/systemd.git/commit/?h=upstream&id=
b52a4a3b05a2a0d69868d57fd54f6e4b8fa0e7ca
Change-Id: Ib8e4744c4cea1e0b6fe86ecb9e09cfa49be683be
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Joonyoung Shim [Fri, 6 Mar 2015 09:37:54 +0000 (18:37 +0900)]
arm: tizen_rinato_defconfig: enable mali r4p0
Use mali r4p0 instead of r3p2.
Change-Id: Ie777bf67c1887df0ca5180a42ac6c6a5d0753bd6
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Fri, 6 Mar 2015 09:35:54 +0000 (18:35 +0900)]
arm: tizen_defconfig: enable mali r4p0
Use mali r4p0 instead of r3p2.
Change-Id: I911044ee55e7830f56e93eeeb56adfc1fcfe2b0e
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Fri, 6 Mar 2015 09:28:55 +0000 (18:28 +0900)]
arm: tizen_odroid_defconfig: enable mali r4p0
Use mali r4p0 instead of r3p2.
Change-Id: I9a40493ea1d1dce95576c8d0f17af0a34fa13f28
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Marek Szyprowski [Thu, 26 Feb 2015 08:55:32 +0000 (09:55 +0100)]
drm/exynos: gsc: always use hw buffer 0 until queue management get fixed
Buffer sequence selection is broken and must be fixed. For the time being
always queue buffers for hw id 0, because hardware always operates on the
first src and dst buffer. This fixes IOMMU faults and makes the driver
usable from userspace.
Suggested-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Change-Id: I46f43a5ad8b714a78bad7383bc5e532bf5015ecd
Beata Michalska [Thu, 26 Feb 2015 12:18:46 +0000 (13:18 +0100)]
drm/exynos/ipp: Validate buffer enqueue requests
As for now there is no validation of incoming buffer
enqueue request as far as the gem buffers are being
concerned. This might lead to some undesired cases
when the driver tries to operate on invalid buffers
(wiht no valid gem object handle i.e.).
Add some basic checks to rule out those potential issues.
Change-Id: I117b5c566169d33fd46646068f835f48b333da73
Signed-off-by: Beata Michalska <b.michalska@samsung.com>
Seung-Woo Kim [Thu, 26 Feb 2015 02:11:23 +0000 (11:11 +0900)]
ARM: tizen_odroid_defconfig: enable audit options
This patch enables audit options to print log for security smack
from tizen_odroid_defconfig.
Change-Id: I5b6d034accdffce08c6320424960a1576ad03bca
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Sylwester Nawrocki [Mon, 16 Feb 2015 10:45:39 +0000 (11:45 +0100)]
V4L: s5c73m3: Use internal firmware by default
Set the boot_from_rom module parameter by default to 1 so the internal
sensors's firmware from ROM is used. The external firmware will be
loaded only if the module parameter is explicitly set to 0 by the user.
This prevents camera stream on failures for some S5C73M3 revisions.
Change-Id: I8c8c936c982df0db6f570e33b55596cce11b0b16
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Arun Kumar K [Mon, 15 Jul 2013 10:51:23 +0000 (07:51 -0300)]
[media] exynos4-is: Fix fimc-lite bayer formats
The 10-bit and 12-bit Bayer output formats supported by FIMC-LITE
actually use 16 bits where the extra bits are padded with zeros.
The patch corrects buffer allocation for these two formats by
modifying the depth field. This prevents memory corruption by the
output DMA due to insufficient buffer size.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: stable@vger.kernel.org
Change-Id: Id0e3f13ce13de51218aa0b99f86311fcf411e4ec
Beata Michalska [Thu, 12 Feb 2015 11:36:24 +0000 (12:36 +0100)]
[media] exynos5-is: Reset CMU-ISP
Reset CMU-ISP prior to entering low-power mode.
Change-Id: I2caf9ecbee728f07480ee8b18ff1d5558db77bad
Signed-off-by: Beata Michalska <b.michalska@samsung.com>
YoungJun Cho [Wed, 9 Jul 2014 04:01:10 +0000 (13:01 +0900)]
drm/exynos: debugfs: add debugfs interface and gem_info node
The memps requires gem_info with gem_names to analyze graphics(video)
shared memory, so adds gem_info node with debugfs interface.
Change-Id: Ia923aa53c1508174e874d36001f53b0c42daac21
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Daniel Vetter [Wed, 14 Aug 2013 22:02:30 +0000 (00:02 +0200)]
drm: use common drm_gem_dmabuf_release in i915/exynos drivers
Note that this is slightly tricky since both drivers store their
native objects in dma_buf->priv. But both also embed the base
drm_gem_object at the first position, so the implicit cast is ok.
To use the release helper we need to export it, too.
Change-Id: I37e9ffec79c90304d444ae9b6c47346f125feb49
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
[This patch is necessary for commit
7f663e197afa drm/prime: proper locking+refcounting for obj->dma_buf link]
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Chanho Park [Wed, 4 Feb 2015 12:18:22 +0000 (21:18 +0900)]
packaging: fix kernel-devel requires
"%{variant}-linux-kernel" is provided instead of "linux-kernel".
Change-Id: Iad93b6e0fb53a5ead25842cd1de800b5af1b630b
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Chanho Park [Mon, 2 Feb 2015 01:20:16 +0000 (10:20 +0900)]
exynos/drm: g2d: return 0 instead of ret in g2d_map_cmdlist_gem
This patch fixes a return value of g2d_map_cmdlist_gem. Since applied
dmabuf_sync, the return value was changed to ret when success return.
This is wrong value when everything is successful.
Change-Id: I0ddb735b8f894ec065ee90865d0a8b45bf892b8e
Reported-by: Voloshynov Sergii <s.voloshynov@samsung.com>
Signed-off-by: Voloshynov Sergii <s.voloshynov@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Yuvaraj Kumar C D [Tue, 22 Oct 2013 09:11:56 +0000 (14:41 +0530)]
mmc: dw_mmc: exynos: Revert the sdr_timing assignment
e6c784eded7b3 ("mmc: dw_mmc: exynos: move the exynos private init") was
wrongly assigning ddr_timing value to sdr_timing. This patch fixes this
by reverting the sdr_timing assignment statement to the earlier location.
Change-Id: I00d74956f2ce166063446b388b92c166d8b524dc
Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Stephane Desneux [Wed, 14 Jan 2015 08:40:55 +0000 (09:40 +0100)]
Workaround for building using perl 5.20.0
Change-Id: Iff1664f694da27ba209bf7c3febf2f3662c8b5cc
Signed-off-by: Stephane Desneux <stephane.desneux@open.eurogiciel.org>
Chanho Park [Mon, 5 Jan 2015 04:52:09 +0000 (13:52 +0900)]
Revert "ARM: exynos: add support secondary core bootup in big.LITTLE processor."
This reverts commit
c25aae8a02c0e3132df581d1d12be1d6738a08d6.
I think we need to investigate this patch more and more.
Change-Id: Idb69b33334f53ddd414123f6e9ac432840b99857
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Andrey Ryabinin [Sat, 8 Nov 2014 14:48:05 +0000 (17:48 +0300)]
security: smack: fix out-of-bounds access in smk_parse_smack()
Setting smack label on file (e.g. 'attr -S -s SMACK64 -V "test" test')
triggered following spew on the kernel with KASan applied:
==================================================================
BUG: AddressSanitizer: out of bounds access in strncpy+0x28/0x60 at addr
ffff8800059ad064
=============================================================================
BUG kmalloc-8 (Not tainted): kasan error
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Slab 0xffffea0000166b40 objects=128 used=7 fp=0xffff8800059ad080 flags=0x4000000000000080
INFO: Object 0xffff8800059ad060 @offset=96 fp=0xffff8800059ad080
Bytes b4
ffff8800059ad050: a0 df 9a 05 00 88 ff ff 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
Object
ffff8800059ad060: 74 65 73 74 6b 6b 6b a5 testkkk.
Redzone
ffff8800059ad068: cc cc cc cc cc cc cc cc ........
Padding
ffff8800059ad078: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
CPU: 0 PID: 528 Comm: attr Tainted: G B 3.18.0-rc1-mm1+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
0000000000000000 ffff8800059ad064 ffffffff81534cf2 ffff880005a5bc40
ffffffff8112fe1a 0000000100800006 0000000f059ad060 ffff880006000f90
0000000000000296 ffffea0000166b40 ffffffff8107ca97 ffff880005891060
Call Trace:
? dump_stack (lib/dump_stack.c:52)
? kasan_report_error (mm/kasan/report.c:102 mm/kasan/report.c:178)
? preempt_count_sub (kernel/sched/core.c:2651)
? __asan_load1 (mm/kasan/kasan.h:50 mm/kasan/kasan.c:248 mm/kasan/kasan.c:358)
? strncpy (lib/string.c:121)
? strncpy (lib/string.c:121)
? smk_parse_smack (security/smack/smack_access.c:457)
? setxattr (fs/xattr.c:343)
? smk_import_entry (security/smack/smack_access.c:514)
? smack_inode_setxattr (security/smack/smack_lsm.c:1093 (discriminator 1))
? security_inode_setxattr (security/security.c:602)
? vfs_setxattr (fs/xattr.c:134)
? setxattr (fs/xattr.c:343)
? setxattr (fs/xattr.c:360)
? get_parent_ip (kernel/sched/core.c:2606)
? preempt_count_sub (kernel/sched/core.c:2651)
? __percpu_counter_add (arch/x86/include/asm/preempt.h:98 lib/percpu_counter.c:90)
? get_parent_ip (kernel/sched/core.c:2606)
? preempt_count_sub (kernel/sched/core.c:2651)
? __mnt_want_write (arch/x86/include/asm/preempt.h:98 fs/namespace.c:359)
? path_setxattr (fs/xattr.c:380)
? SyS_lsetxattr (fs/xattr.c:397)
? system_call_fastpath (arch/x86/kernel/entry_64.S:423)
Read of size 1 by task attr:
Memory state around the buggy address:
ffff8800059ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8800059acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff8800059acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
ffff8800059ad000: 00 fc fc fc 00 fc fc fc 05 fc fc fc 04 fc fc fc
^
ffff8800059ad080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8800059ad100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8800059ad180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
strncpy() copies one byte more than the source string has.
Fix this by passing the correct length to strncpy().
Now we can remove initialization of the last byte in 'smack' string
because kzalloc() already did this for us.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Change-Id: I7bb84eed3c348711312434d98d6cc13cbe8f5d76
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Andrey Ryabinin [Fri, 8 Aug 2014 21:22:07 +0000 (14:22 -0700)]
lib/idr.c: fix out-of-bounds pointer dereference
I'm working on address sanitizer project for kernel. Recently we
started experiments with stack instrumentation, to detect out-of-bounds
read/write bugs on stack.
Just after booting I've hit out-of-bounds read on stack in idr_for_each
(and in __idr_remove_all as well):
struct idr_layer **paa = &pa[0];
while (id >= 0 && id <= max) {
...
while (n < fls(id)) {
n += IDR_BITS;
p = *--paa; <--- here we are reading pa[-1] value.
}
}
Despite the fact that after this dereference we are exiting out of loop
and never use p, such behaviour is undefined and should be avoided.
Fix this by moving pointer derference to the beggining of the loop,
right before we will use it.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alexey Preobrazhensky <preobr@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Id151fc7e874e3cff64da43eb3359f022de7e6cae
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
YoungJun Cho [Tue, 29 Oct 2013 11:30:26 +0000 (20:30 +0900)]
drm: delete unconsumed pending event list in drm_events_release
When there are unconsumed pending events, the events are
destroyed by calling destroy callback, but the events list
are remained, because there is no list_del().
It is possible that the page flip request is handled after
drm_events_release() is called and before drm_fb_release().
In this case a drm_pending_event is remained not freed.
So exynos driver checks again to remove it in its post
close routine. But the file_priv->event_list contains
undeleted ones, this can make oops for accessing invalid
memory.
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Change-Id: I25a471f4f4929150542eb6273c7673b9f44936b6
[back-ported from mainline to fix use after free issue]
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Tejun Heo [Thu, 11 Jul 2013 23:34:48 +0000 (16:34 -0700)]
cgroup: replace task_cgroup_path_from_hierarchy() with task_cgroup_path()
task_cgroup_path_from_hierarchy() was added for the planned new users
and none of the currently planned users wants to know about multiple
hierarchies. This patch drops the multiple hierarchy part and makes
it always return the path in the first non-dummy hierarchy.
As unified hierarchy will always have id 1, this is guaranteed to
return the path for the unified hierarchy if mounted; otherwise, it
will return the path from the hierarchy which happens to occupy the
lowest hierarchy id, which will usually be the first hierarchy mounted
after boot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kaluža <jkaluza@redhat.com>
Change-Id: Iaa199f7332f01a03f791def776b5403f6fa459b3
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
913ffdb54366f94eec65c656cae8c6e00e1ab1b0
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Tejun Heo [Mon, 15 Apr 2013 03:50:08 +0000 (20:50 -0700)]
cgroup: implement task_cgroup_path_from_hierarchy()
kdbus folks want a sane way to determine the cgroup path that a given
task belongs to on a given hierarchy, which is a reasonble thing to
expect from cgroup core.
Implement task_cgroup_path_from_hierarchy().
v2: Dropped unnecessary NULL check on the return value of
task_cgroup_from_root() as suggested by Li Zefan.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <greg@kroah.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <daniel@zonque.org>
Change-Id: Ifd630e09163b8272627c2ef8be1866c5e9dc05f9
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
857a2beb09ab83e9a8185821ae16db7dfbe8b837
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Tejun Heo [Sun, 14 Apr 2013 18:36:58 +0000 (11:36 -0700)]
cgroup: make hierarchy_id use cyclic idr
We want to be able to lookup a hierarchy from its id and cyclic
allocation is a whole lot simpler with idr. Convert to idr and use
idr_alloc_cyclc().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Change-Id: Ibd20ebe71ddb452178302cf86a22572b18cd82df
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
1a574231669f8c3065c83974e9557fcbbd94b8a6
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Tejun Heo [Sun, 14 Apr 2013 18:36:57 +0000 (11:36 -0700)]
cgroup: drop hierarchy_id_lock
Now that hierarchy_id alloc / free are protected by the cgroup
mutexes, there's no need for this separate lock. Drop it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Change-Id: I03fbc8bba08a785c6082a9b5bb1087c53c506c60
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
54e7b4eb15fc4354d5ada5469e3db4a220ddb3ed
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Tejun Heo [Sun, 14 Apr 2013 18:36:56 +0000 (11:36 -0700)]
cgroup: refactor hierarchy_id handling
We're planning to converting hierarchy_ida to an idr and use it to
look up hierarchy from its id. As we want the mapping to happen
atomically with cgroupfs_root registration, this patch refactors
hierarchy_id init / exit so that ida operations happen inside
cgroup_[root_]mutex.
* s/init_root_id()/cgroup_init_root_id()/ and make it return 0 or
-errno like a normal function.
* Move hierarchy_id initialization from cgroup_root_from_opts() into
cgroup_mount() block where the root is confirmed to be used and
being registered while holding both mutexes.
* Split cgroup_drop_id() into cgroup_exit_root_id() and
cgroup_free_root(), so that ID release can happen before dropping
the mutexes in cgroup_kill_sb(). The latter expects hierarchy_id to
be exited before being invoked.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Change-Id: Ie1433632bcde96c359fd1d488a81cc3255ee92e6
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
fa3ca07e96185aa1496b405472399a2a2a336a17
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Łukasz Stelmach [Wed, 10 Dec 2014 06:29:46 +0000 (07:29 +0100)]
smack: introduce a special case for tmpfs in smack_d_instantiate()
Files created with __shmem_file_stup() appear to have somewhat fake
dentries which make them look like root directories and not get
the label the current process or ("*") star meant for tmpfs files.
Change-Id: If0e2e3ceddeff55d5121e76e85dbea60414b786a
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Pranith Kumar [Mon, 15 Sep 2014 22:59:42 +0000 (18:59 -0400)]
selftests/memfd: Run test on all architectures
Remove the dependence on x86 to run the memfd test. Verfied on 32-bit powerpc.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Change-Id: I4e3a0d311842f5d0327abdb6bb8ce3ba1460f902
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
ce6a144a0d01c6628496e4c0d18fbf3a0362cc67
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Drysdale [Tue, 9 Sep 2014 21:50:57 +0000 (14:50 -0700)]
shm: add memfd.h to UAPI export list
The new header file memfd.h from commit
9183df25fe7b ("shm: add
memfd_create() syscall") should be exported.
Signed-off-by: David Drysdale <drysdale@google.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ibcd915aad320ddedcfcca0b7a098e03cc883fd88
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
b01d072065b6f36550f486fe77f05b092225ba1b
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Pranith Kumar [Thu, 4 Sep 2014 15:58:19 +0000 (11:58 -0400)]
memfd_test: Add missing argument to printf()
Add a missing path argument buf to printf()
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Change-Id: Ie4d1f23fc07a397971ee94c0fdd164fb7145771d
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
2ed36928373cc3dfb20a4d17042e9a6e05538e41
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Pranith Kumar [Wed, 3 Sep 2014 14:31:16 +0000 (10:31 -0400)]
memfd_test: Make it work on 32-bit systems
This test currently fails on 32-bit systems since we use u64 type to pass the
flags to fcntl.
This commit changes this to use 'unsigned int' type for flags to fcntl making it
work on 32-bit systems.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Change-Id: I80190741a7cfaf9517cf220a58bcc36177139993
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
57e67900d4c7949ad646a5f43a8ca5180170d2a0
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Will Deacon [Mon, 11 Aug 2014 13:24:47 +0000 (14:24 +0100)]
asm-generic: add memfd_create system call to unistd.h
Commit
9183df25fe7b ("shm: add memfd_create() syscall") added a new
system call (memfd_create) but didn't update the asm-generic unistd
header.
This patch adds the new system call to the asm-generic version of
unistd.h so that it can be used by architectures such as arm64.
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Change-Id: I7fff684716a86ad9f10e19755480c32ce9eeb861
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
503e6636b6f96056210062be703356f4253b6db9
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Will Deacon [Mon, 11 Aug 2014 13:23:37 +0000 (14:23 +0100)]
arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
arch/arm/ just grew support for the new memfd_create and getrandom
syscalls, so add them to our compat layer too.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
a97a42c47608d0bb6f2dfc2e162cc84a27beb43a
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
getrandom isn't wired.
Change-Id: I0bfb09e924d839ede0f998a73a4b9a395359a1b6
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Russell King [Sat, 9 Aug 2014 07:43:11 +0000 (08:43 +0100)]
ARM: wire up memfd_create syscall
Add the memfd_create syscall to ARM.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
e57e41931134e09fc6c03c8d4eb19d516cc6e59b
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Adjusted __NR_syscalls as in commit
eb6452537b280652eee66801ec97cc369e27e5d8.
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Change-Id: I8bcbec0d5fb6241cbb5c13f142552dbfe5307c9e
David Herrmann [Fri, 8 Aug 2014 21:25:36 +0000 (14:25 -0700)]
shm: wait for pins to be released when sealing
If we set SEAL_WRITE on a file, we must make sure there cannot be any
ongoing write-operations on the file. For write() calls, we simply lock
the inode mutex, for mmap() we simply verify there're no writable
mappings. However, there might be pages pinned by AIO, Direct-IO and
similar operations via GUP. We must make sure those do not write to the
memfd file after we set SEAL_WRITE.
As there is no way to notify GUP users to drop pages or to wait for them
to be done, we implement the wait ourself: When setting SEAL_WRITE, we
check all pages for their ref-count. If it's bigger than 1, we know
there's some user of the page. We then mark the page and wait for up to
150ms for those ref-counts to be dropped. If the ref-counts are not
dropped in time, we refuse the seal operation.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I952289df3c4261be68ab4dc590890fe20b0906a4
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
05f65b5c70909ef686f865f0a85406d74d75f70f
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Herrmann [Fri, 8 Aug 2014 21:25:34 +0000 (14:25 -0700)]
selftests: add memfd/sealing page-pinning tests
Setting SEAL_WRITE is not possible if there're pending GUP users. This
commit adds selftests for memfd+sealing that use FUSE to create pending
page-references. FUSE is very helpful here in that it allows us to delay
direct-IO operations for an arbitrary amount of time. This way, we can
force the kernel to pin pages and then run our normal selftests.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Ideaf47a24b3522183cbe5ae10c320f7d38e31931
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
87b2d44026e0e315a7401551e95b189ac4b28217
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Herrmann [Fri, 8 Aug 2014 21:25:32 +0000 (14:25 -0700)]
selftests: add memfd_create() + sealing tests
Some basic tests to verify sealing on memfds works as expected and
guarantees the advertised semantics.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I9a6bfd2205aa868f327fdb04788a1f6bae23eb17
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
4f5ce5e8d7e2da3c714df8a7fa42edb9f992fc52
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Herrmann [Fri, 8 Aug 2014 21:25:29 +0000 (14:25 -0700)]
shm: add memfd_create() syscall
memfd_create() is similar to mmap(MAP_ANON), but returns a file-descriptor
that you can pass to mmap(). It can support sealing and avoids any
connection to user-visible mount-points. Thus, it's not subject to quotas
on mounted file-systems, but can be used like malloc()'ed memory, but with
a file-descriptor to it.
memfd_create() returns the raw shmem file, so calls like ftruncate() can
be used to modify the underlying inode. Also calls like fstat() will
return proper information and mark the file as regular file. If you want
sealing, you can specify MFD_ALLOW_SEALING. Otherwise, sealing is not
supported (like on all other regular files).
Compared to O_TMPFILE, it does not require a tmpfs mount-point and is not
subject to a filesystem size limit. It is still properly accounted to
memcg limits, though, and to the same overcommit or no-overcommit
accounting as all user memory.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I2ac7e2b47a1d68d4b83680f4527e5ed2aa9a420c
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
9183df25fe7b194563db3fec6dc3202a5855839c
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Herrmann [Fri, 8 Aug 2014 21:25:27 +0000 (14:25 -0700)]
shm: add sealing API
If two processes share a common memory region, they usually want some
guarantees to allow safe access. This often includes:
- one side cannot overwrite data while the other reads it
- one side cannot shrink the buffer while the other accesses it
- one side cannot grow the buffer beyond previously set boundaries
If there is a trust-relationship between both parties, there is no need
for policy enforcement. However, if there's no trust relationship (eg.,
for general-purpose IPC) sharing memory-regions is highly fragile and
often not possible without local copies. Look at the following two
use-cases:
1) A graphics client wants to share its rendering-buffer with a
graphics-server. The memory-region is allocated by the client for
read/write access and a second FD is passed to the server. While
scanning out from the memory region, the server has no guarantee that
the client doesn't shrink the buffer at any time, requiring rather
cumbersome SIGBUS handling.
2) A process wants to perform an RPC on another process. To avoid huge
bandwidth consumption, zero-copy is preferred. After a message is
assembled in-memory and a FD is passed to the remote side, both sides
want to be sure that neither modifies this shared copy, anymore. The
source may have put sensible data into the message without a separate
copy and the target may want to parse the message inline, to avoid a
local copy.
While SIGBUS handling, POSIX mandatory locking and MAP_DENYWRITE provide
ways to achieve most of this, the first one is unproportionally ugly to
use in libraries and the latter two are broken/racy or even disabled due
to denial of service attacks.
This patch introduces the concept of SEALING. If you seal a file, a
specific set of operations is blocked on that file forever. Unlike locks,
seals can only be set, never removed. Hence, once you verified a specific
set of seals is set, you're guaranteed that no-one can perform the blocked
operations on this file, anymore.
An initial set of SEALS is introduced by this patch:
- SHRINK: If SEAL_SHRINK is set, the file in question cannot be reduced
in size. This affects ftruncate() and open(O_TRUNC).
- GROW: If SEAL_GROW is set, the file in question cannot be increased
in size. This affects ftruncate(), fallocate() and write().
- WRITE: If SEAL_WRITE is set, no write operations (besides resizing)
are possible. This affects fallocate(PUNCH_HOLE), mmap() and
write().
- SEAL: If SEAL_SEAL is set, no further seals can be added to a file.
This basically prevents the F_ADD_SEAL operation on a file and
can be set to prevent others from adding further seals that you
don't want.
The described use-cases can easily use these seals to provide safe use
without any trust-relationship:
1) The graphics server can verify that a passed file-descriptor has
SEAL_SHRINK set. This allows safe scanout, while the client is
allowed to increase buffer size for window-resizing on-the-fly.
Concurrent writes are explicitly allowed.
2) For general-purpose IPC, both processes can verify that SEAL_SHRINK,
SEAL_GROW and SEAL_WRITE are set. This guarantees that neither
process can modify the data while the other side parses it.
Furthermore, it guarantees that even with writable FDs passed to the
peer, it cannot increase the size to hit memory-limits of the source
process (in case the file-storage is accounted to the source).
The new API is an extension to fcntl(), adding two new commands:
F_GET_SEALS: Return a bitset describing the seals on the file. This
can be called on any FD if the underlying file supports
sealing.
F_ADD_SEALS: Change the seals of a given file. This requires WRITE
access to the file and F_SEAL_SEAL may not already be set.
Furthermore, the underlying file must support sealing and
there may not be any existing shared mapping of that file.
Otherwise, EBADF/EPERM is returned.
The given seals are _added_ to the existing set of seals
on the file. You cannot remove seals again.
The fcntl() handler is currently specific to shmem and disabled on all
files. A file needs to explicitly support sealing for this interface to
work. A separate syscall is added in a follow-up, which creates files that
support sealing. There is no intention to support this on other
file-systems. Semantics are unclear for non-volatile files and we lack any
use-case right now. Therefore, the implementation is specific to shmem.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I58642ae2db7fef5d952b22beada3525526dd3a20
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
40e041a2c858b3caefc757e26cb85bfceae5062b
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
David Herrmann [Fri, 8 Aug 2014 21:25:25 +0000 (14:25 -0700)]
mm: allow drivers to prevent new writable mappings
This patch (of 6):
The i_mmap_writable field counts existing writable mappings of an
address_space. To allow drivers to prevent new writable mappings, make
this counter signed and prevent new writable mappings if it is negative.
This is modelled after i_writecount and DENYWRITE.
This will be required by the shmem-sealing infrastructure to prevent any
new writable mappings after the WRITE seal has been set. In case there
exists a writable mapping, this operation will fail with EBUSY.
Note that we rely on the fact that iff you already own a writable mapping,
you can increase the counter without using the helpers. This is the same
that we do for i_writecount.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: If33fdcedbcf202ab177c4e21afc7eec261088a8b
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
4bb5f5d9395bc112d93a134d8f5b05611eddc9c0
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Oleg Nesterov [Wed, 11 Sep 2013 21:20:20 +0000 (14:20 -0700)]
mm: mmap_region: kill correct_wcount/inode, use allow_write_access()
correct_wcount and inode in mmap_region() just complicate the code. This
boolean was needed previously, when deny_write_access() was called before
vma_merge(), now we can simply check VM_DENYWRITE and do
allow_write_access() if it is set.
allow_write_access() checks file != NULL, so this is safe even if it was
possible to use VM_DENYWRITE && !file. Just we need to ensure we use the
same file which was deny_write_access()'ed, so the patch also moves "file
= vma->vm_file" down after allow_write_access().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@android.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I05df8842b7c4b7e3e29b35d914f297ce37af1685
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
e86867720e617774b560dfbc169b7f3d0d490950
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Simon Horman [Wed, 22 May 2013 05:50:31 +0000 (14:50 +0900)]
sched: add cond_resched_rcu() helper
This is intended for use in loops which read data protected by RCU and may
have a large number of iterations. Such an example is dumping the list of
connections known to IPVS: ip_vs_conn_array() and ip_vs_conn_seq_next().
The benefits are for CONFIG_PREEMPT_RCU=y where we save CPU cycles
by moving rcu_read_lock and rcu_read_unlock out of large loops
but still allowing the current task to be preempted after every
loop iteration for the CONFIG_PREEMPT_RCU=n case.
The call to cond_resched() is not needed when CONFIG_PREEMPT_RCU=y.
Thanks to Paul E. McKenney for explaining this and for the
final version that checks the context with CONFIG_DEBUG_ATOMIC_SLEEP=y
for all possible configurations.
The function can be empty in the CONFIG_PREEMPT_RCU case,
rcu_read_lock and rcu_read_unlock are not needed in this case
because the task can be preempted on indication from scheduler.
Thanks to Peter Zijlstra for catching this and for his help
in trying a solution that changes __might_sleep.
Initial cond_resched_rcu_lock() function suggested by Eric Dumazet.
Tested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Change-Id: I5f36f86484198f9064725d424c3d91d5fac8e1d4
Origin: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=
f6f3c437d09e2f62533034e67bfb4385191e992c
Backported-by: Maciej Wereski <m.wereski@partner.samsung.com>
Signed-off-by: Maciej Wereski <m.wereski@partner.samsung.com>
Chander Kashyap [Tue, 18 Jun 2013 15:29:34 +0000 (00:29 +0900)]
ARM: EXYNOS: use four additional chipid bits to identify EXYNOS family
Use chipid[27:20] bits to identify the EXYNOS family while setting
up the serial port during the uncompression setup. This uses four
additional bits of chipid to identify the EXYNOS family since this
is required for identifying EXYNOS5420 SoC.
Change-Id: Ic7cc14e68d16ae3da2a7d2177b40e40b0295d9a8
Signed-off-by: Chander Kashyap <chander.kashyap@linaro.org>
Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Hyungwon Hwang [Thu, 11 Dec 2014 07:47:30 +0000 (16:47 +0900)]
ARM: exynos: add support secondary core bootup in big.LITTLE processor.
This patch adds support secondary core bootup in big.LITTLE processor
for platsmp. Just core id cannot be used for identification, because
there is a pair of the cores which have same core id. Cluster id have
to be included for their identification. This patch makes cpu index
using cluster id and core id for the calculation of their register
address and the identification. But there is a problem to use cluster id
for core index creation. That is, cluster id does not start from 0 in
old processors which do not have more than one cluster. For example,
Exynos4412's cluster id for its 4 core is 0xa. So I makes all cluster id
to 0 when they are bigger than 1. Normally big.LITTLE processor does not
use platsmp. But at this moment, just for the minimal functionality of
big.LITTLE, this patch adds support for it. This patch can be reverted
after another CPU management method is adopted.
Change-Id: Ifa2d62545dd4174998f962c9608fd6d9b6034c16
Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com>
Seung-Woo Kim [Fri, 19 Dec 2014 05:49:32 +0000 (14:49 +0900)]
ARM: dts: exynos4: fix wrong compatible string for hdmi
The compatible string with exynos5 causes to get wrong configuration
data with recently updated hdmi driver. So this patch fixes to bind
hdmi with exynos4 family compatible strings.
Change-Id: I0c6cae24ad49197de47ce3ca047bd62bf190f8a4
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Hyungwon Hwang [Fri, 21 Nov 2014 10:25:41 +0000 (19:25 +0900)]
ARM: exynos: add exynos_firmware_init function for early init
This patch adds exynos_firmware_init function for early init.
Change-Id: Ibe8136ea9a08b422945acc7d3a36b3cfc0838602
Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com>
Hyungwon Hwang [Fri, 21 Nov 2014 10:16:48 +0000 (19:16 +0900)]
ARM: exynos: fix UART address selection for DEBUG_LL
The exynos5 SoCs using A15+A7 can boot to A15 or A7. If it boots using
A7, it can't detect right UART physical address only the part number of
CP15. It's possible to solve as checking Cluster ID additionally.
Change-Id: I1a227bf1186a988f7a8429ee3b5251528d0ee32a
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Hyungwon Hwang [Wed, 17 Dec 2014 00:30:20 +0000 (09:30 +0900)]
ARM: exynos: restrain the use of gpio driver for Exynos5420
Now, pinctrl is being used to control gpio of Exynos5420. So this
patch is needed not to try to use gpio driver for Exynos5420, and fail
making meaningless error log in boot process.
Change-Id: I41b2575229e6c3f571800f07e775b5a6a11e456a
Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com>