Arnaldo Carvalho de Melo [Mon, 18 Mar 2019 19:41:28 +0000 (16:41 -0300)]
perf evsel: Free evsel->counts in perf_evsel__exit()
[ Upstream commit
42dfa451d825a2ad15793c476f73e7bbc0f9d312 ]
Using gcc's ASan, Changbin reports:
=================================================================
==7494==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x5625e5330a5e in zalloc util/util.h:23
#2 0x5625e5330a9b in perf_counts__new util/counts.c:10
#3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
#7 0x5625e51528e6 in run_test tests/builtin-test.c:358
#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#10 0x5625e515572f in cmd_test tests/builtin-test.c:722
#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Indirect leak of 72 byte(s) in 1 object(s) allocated from:
#0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x5625e532560d in zalloc util/util.h:23
#2 0x5625e532566b in xyarray__new util/xyarray.c:10
#3 0x5625e5330aba in perf_counts__new util/counts.c:15
#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
#8 0x5625e51528e6 in run_test tests/builtin-test.c:358
#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#11 0x5625e515572f in cmd_test tests/builtin-test.c:722
#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
His patch took care of evsel->prev_raw_counts, but the above backtraces
are about evsel->counts, so fix that instead.
Reported-by: Changbin Du <changbin.du@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:49 +0000 (16:05 +0800)]
perf hist: Add missing map__put() in error case
[ Upstream commit
cb6186aeffda4d27e56066c79e9579e7831541d3 ]
We need to map__put() before returning from failure of
sample__resolve_callchain().
Detected with gcc's ASan.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes:
9c68ae98c6f7 ("perf callchain: Reference count maps")
Link: http://lkml.kernel.org/r/20190316080556.3075-10-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:48 +0000 (16:05 +0800)]
perf top: Fix error handling in cmd_top()
[ Upstream commit
70c819e4bf1c5f492768b399d898d458ccdad2b6 ]
We should go to the cleanup path, to avoid leaks, detected using gcc's
ASan.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190316080556.3075-9-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:46 +0000 (16:05 +0800)]
perf build-id: Fix memory leak in print_sdt_events()
[ Upstream commit
8bde8516893da5a5fdf06121f74d11b52ab92df5 ]
Detected with gcc's ASan:
Direct leak of 4356 byte(s) in 120 object(s) allocated from:
#0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215
#2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339
#3 0x55719af66272 in print_events util/parse-events.c:2542
#4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
#5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520
#9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes:
40218daea1db ("perf list: Show SDT and pre-cached events")
Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:45 +0000 (16:05 +0800)]
perf config: Fix a memory leak in collect_config()
[ Upstream commit
54569ba4b06d5baedae4614bde33a25a191473ba ]
Detected with gcc's ASan:
Direct leak of 66 byte(s) in 5 object(s) allocated from:
#0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x560c8761034d in collect_config util/config.c:597
#2 0x560c8760d9cb in get_value util/config.c:169
#3 0x560c8760dfd7 in perf_parse_file util/config.c:285
#4 0x560c8760e0d2 in perf_config_from_file util/config.c:476
#5 0x560c876108fd in perf_config_set__init util/config.c:661
#6 0x560c87610c72 in perf_config_set__new util/config.c:709
#7 0x560c87610d2f in perf_config__init util/config.c:718
#8 0x560c87610e5d in perf_config util/config.c:730
#9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442
#10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Fixes:
20105ca1240c ("perf config: Introduce perf_config_set class")
Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:44 +0000 (16:05 +0800)]
perf config: Fix an error in the config template documentation
[ Upstream commit
9b40dff7ba3caaf0d1919f98e136fa3400bd34aa ]
The option 'sort-order' should be 'sort_order'.
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes:
893c5c798be9 ("perf config: Show default report configuration in example and docs")
Link: http://lkml.kernel.org/r/20190316080556.3075-5-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changbin Du [Sat, 16 Mar 2019 08:05:42 +0000 (16:05 +0800)]
perf list: Don't forget to drop the reference to the allocated thread_map
[ Upstream commit
39df730b09774bd860e39ea208a48d15078236cb ]
Detected via gcc's ASan:
Direct leak of 2048 byte(s) in 64 object(s) allocated from:
6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514
12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes:
89896051f8da ("perf tools: Do not put a variable sized type not at the end of a struct")
Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Arcari [Tue, 12 Feb 2019 14:34:39 +0000 (09:34 -0500)]
tools/power turbostat: return the exit status of a command
[ Upstream commit
2a95496634a017c19641f26f00907af75b962f01 ]
turbostat failed to return a non-zero exit status even though the
supplied command (turbostat <command>) failed. Currently when turbostat
forks a command it returns zero instead of the actual exit status of the
command. Modify the code to return the exit status.
Signed-off-by: David Arcari <darcari@redhat.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Matteo Croce [Mon, 18 Mar 2019 21:24:03 +0000 (22:24 +0100)]
x86/mm: Don't leak kernel addresses
[ Upstream commit
a3151724437f54076cc10bc02b1c4f0003ae36cd ]
Since commit:
ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
at boot "____ptrval____" is printed instead of actual addresses:
found SMP MP-table at [mem 0x000f5cc0-0x000f5ccf] mapped at [(____ptrval____)]
Instead of changing the print to "%px", and leaking a kernel addresses,
just remove the print completely, like in:
071929dbdd865f77 ("arm64: Stop printing the virtual memory layout").
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Konstantin Khlebnikov [Wed, 6 Mar 2019 17:11:42 +0000 (20:11 +0300)]
sched/core: Fix buffer overflow in cgroup2 property cpu.max
[ Upstream commit
4c47acd824aaaa8fc6dc519fb4e08d1522105b7a ]
Add limit into sscanf format string for on-stack buffer.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes:
0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")
Link: https://lkml.kernel.org/r/155189230232.2620.13120481613524200065.stgit@buzz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Peter Zijlstra [Tue, 5 Mar 2019 08:32:02 +0000 (09:32 +0100)]
sched/cpufreq: Fix 32-bit math overflow
[ Upstream commit
a23314e9d88d89d49e69db08f60b7caa470f04e1 ]
Vincent Wang reported that get_next_freq() has a mult overflow bug on
32-bit platforms in the IOWAIT boost case, since in that case {util,max}
are in freq units instead of capacity units.
Solve this by moving the IOWAIT boost to capacity units. And since this
means @max is constant; simplify the code.
Reported-by: Vincent Wang <vincent.wang@unisoc.com>
Tested-by: Vincent Wang <vincent.wang@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190305083202.GU32494@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Maurizio Lombardi [Mon, 28 Jan 2019 14:24:42 +0000 (15:24 +0100)]
scsi: iscsi: flush running unbind operations when removing a session
[ Upstream commit
165aa2bfb42904b1bec4bf2fa257c8c603c14a06 ]
In some cases, the iscsi_remove_session() function is called while an
unbind_work operation is still running. This may cause a situation where
sysfs objects are removed in an incorrect order, triggering a kernel
warning.
[ 605.249442] ------------[ cut here ]------------
[ 605.259180] sysfs group 'power' not found for kobject 'target2:0:0'
[ 605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80
[ 605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
[ 605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1
[ 605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[ 605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi]
[ 605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80
[ 605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55
[ 606.122304] RSP: 0018:
ffffbadcc8d1bda8 EFLAGS:
00010286
[ 606.218492] RAX:
0000000000000000 RBX:
0000000000000000 RCX:
0000000000000000
[ 606.326381] RDX:
ffff98bdfe85eb40 RSI:
ffff98bdfe856818 RDI:
ffff98bdfe856818
[ 606.514498] RBP:
ffffffffa7ab73e0 R08:
0000000000000268 R09:
0000000000000007
[ 606.529469] R10:
0000000000000000 R11:
ffffffffa860d9ad R12:
ffff98bdf978e838
[ 606.630535] R13:
ffff98bdc2cd4010 R14:
ffff98bdc2cd3ff0 R15:
ffff98bdc2cd4000
[ 606.824707] FS:
0000000000000000(0000) GS:
ffff98bdfe840000(0000) knlGS:
0000000000000000
[ 607.018333] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 607.117844] CR2:
00007f84b78ac024 CR3:
000000002c00a003 CR4:
00000000003606e0
[ 607.117844] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 607.420926] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 607.524236] Call Trace:
[ 607.530591] device_del+0x56/0x350
[ 607.624393] ? ata_tlink_match+0x30/0x30 [libata]
[ 607.727805] ? attribute_container_device_trigger+0xb4/0xf0
[ 607.829911] scsi_target_reap_ref_release+0x39/0x50
[ 607.928572] scsi_remove_target+0x1a2/0x1d0
[ 608.017350] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
[ 608.117435] process_one_work+0x1a7/0x360
[ 608.132917] worker_thread+0x30/0x390
[ 608.222900] ? pwq_unbound_release_workfn+0xd0/0xd0
[ 608.323989] kthread+0x112/0x130
[ 608.418318] ? kthread_bind+0x30/0x30
[ 608.513821] ret_from_fork+0x35/0x40
[ 608.613909] ---[ end trace
0b98c310c8a6138c ]---
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Zhang Rui [Mon, 18 Mar 2019 14:26:33 +0000 (22:26 +0800)]
thermal/intel_powerclamp: fix truncated kthread name
[ Upstream commit
e925b5be5751f6a7286bbd9a4cbbc4ac90cc5fa6 ]
kthread name only allows 15 characters (TASK_COMMON_LEN is 16).
Thus rename the kthreads created by intel_powerclamp driver from
"kidle_inject/ + decimal cpuid" to "kidle_inj/ + decimal cpuid"
to avoid truncated kthead name for cpu 100 and later.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Matthew Garrett [Wed, 10 Oct 2018 08:30:07 +0000 (01:30 -0700)]
thermal/int340x_thermal: fix mode setting
[ Upstream commit
396ee4d0cd52c13b3f6421b8d324d65da5e7e409 ]
int3400 only pushes the UUID into the firmware when the mode is flipped
to "enable". The current code only exposes the mode flag if the firmware
supports the PASSIVE_1 UUID, which not all machines do. Remove the
restriction.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Matthew Garrett [Wed, 10 Oct 2018 08:30:06 +0000 (01:30 -0700)]
thermal/int340x_thermal: Add additional UUIDs
[ Upstream commit
16fc8eca1975358111dbd7ce65e4ce42d1a848fb ]
Add more supported DPTF policies than the driver currently exposes.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Cc: Nisha Aram <nisha.aram@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Phil Elwell [Tue, 29 Jan 2019 09:55:57 +0000 (09:55 +0000)]
thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs
[ Upstream commit
35122495a8c6683e863acf7b05a7036b2be64c7a ]
"cat /sys/kernel/debug/bcm2835_thermal/regset" causes a NULL pointer
dereference in bcm2835_thermal_debugfs. The driver makes use of the
implementation details of the thermal framework to retrieve a pointer
to its private data from a struct thermal_zone_device, and gets it
wrong - leading to the crash. Instead, store its private data as the
drvdata and retrieve the thermal_zone_device pointer from it.
Fixes:
bcb7dd9ef206 ("thermal: bcm2835: add thermal driver for bcm2835 SoC")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Marek Szyprowski [Tue, 22 Jan 2019 15:47:41 +0000 (16:47 +0100)]
thermal: samsung: Fix incorrect check after code merge
[ Upstream commit
3b5236cc5d086dd3ddd01113ee9255421aab9fab ]
Merge commit
19785cf93b6c ("Merge branch 'linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal")
broke the code introduced by commit
ffe6e16f14fa ("thermal: exynos: Reduce
severity of too early temperature read"). Restore the original code from
the mentioned commit to finally fix the warning message during boot:
thermal thermal_zone0: failed to read out thermal zone (-22)
Reported-by: Marian Mihailescu <mihailescu2m@gmail.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes:
19785cf93b6c ("Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal")
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luc Van Oostenryck [Sat, 19 Jan 2019 16:15:23 +0000 (17:15 +0100)]
thermal/intel_powerclamp: fix __percpu declaration of worker_data
[ Upstream commit
aa36e3616532f82a920b5ebf4e059fbafae63d88 ]
This variable is declared as:
static struct powerclamp_worker_data * __percpu worker_data;
In other words, a percpu pointer to struct ...
But this variable not used like so but as a pointer to a percpu
struct powerclamp_worker_data.
So fix the declaration as:
static struct powerclamp_worker_data __percpu *worker_data;
This also quiets Sparse's warnings from __verify_pcpu_ptr(), like:
494:49: warning: incorrect type in initializer (different address spaces)
494:49: expected void const [noderef] <asn:3> *__vpp_verify
494:49: got struct powerclamp_worker_data *
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Colin Ian King [Sun, 17 Mar 2019 23:21:24 +0000 (23:21 +0000)]
ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
[ Upstream commit
b4748e7ab731e436cf5db4786358ada5dd2db6dd ]
The function snd_opl3_drum_switch declaration in the header file
has the order of the two arguments on_off and vel swapped when
compared to the definition arguments of vel and on_off. Fix this
by swapping them around to match the definition.
This error predates the git history, so no idea when this error
was introduced.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Arnd Bergmann [Thu, 7 Mar 2019 10:10:11 +0000 (11:10 +0100)]
mmc: davinci: remove extraneous __init annotation
[ Upstream commit
9ce58dd7d9da3ca0d7cb8c9568f1c6f4746da65a ]
Building with clang finds a mistaken __init tag:
WARNING: vmlinux.o(.text+0x5e4250): Section mismatch in reference from the function davinci_mmcsd_probe() to the function .init.text:init_mmcsd_host()
The function davinci_mmcsd_probe() references
the function __init init_mmcsd_host().
This is often because davinci_mmcsd_probe lacks a __init
annotation or the annotation of init_mmcsd_host is wrong.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Feng Tang [Thu, 14 Mar 2019 10:37:29 +0000 (18:37 +0800)]
i40iw: Avoid panic when handling the inetdev event
[ Upstream commit
ec4fe4bcc584b55e24e8d1768f5510a62c0fd619 ]
There is a panic reported that on a system with x722 ethernet, when doing
the operations like:
# ip link add br0 type bridge
# ip link set eno1 master br0
# systemctl restart systemd-networkd
The system will panic "BUG: unable to handle kernel null pointer
dereference at
0000000000000034", with call chain:
i40iw_inetaddr_event
notifier_call_chain
blocking_notifier_call_chain
notifier_call_chain
__inet_del_ifa
inet_rtm_deladdr
rtnetlink_rcv_msg
netlink_rcv_skb
rtnetlink_rcv
netlink_unicast
netlink_sendmsg
sock_sendmsg
__sys_sendto
It is caused by "local_ipaddr = ntohl(in->ifa_list->ifa_address)", while
the in->ifa_list is NULL.
So add a check for the "in->ifa_list == NULL" case, and skip the ARP
operation accordingly.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jack Morgenstein [Wed, 6 Mar 2019 17:17:56 +0000 (19:17 +0200)]
IB/mlx4: Fix race condition between catas error reset and aliasguid flows
[ Upstream commit
587443e7773e150ae29e643ee8f41a1eed226565 ]
Code review revealed a race condition which could allow the catas error
flow to interrupt the alias guid query post mechanism at random points.
Thiis is fixed by doing cancel_delayed_work_sync() instead of
cancel_delayed_work() during the alias guid mechanism destroy flow.
Fixes:
a0c64a17aba8 ("mlx4: Add alias_guid mechanism")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dave Airlie [Fri, 15 Mar 2019 01:37:20 +0000 (11:37 +1000)]
drm/udl: use drm_gem_object_put_unlocked.
[ Upstream commit
8f3b487685b2acf71b42bb30d68fd9271bec8695 ]
When Daniel removed struct_mutex he didn't fix this call to the unlocked
variant which is required since we no longer use struct mutex.
This fixes a bunch of:
WARNING: CPU: 4 PID: 1370 at drivers/gpu/drm/drm_gem.c:931 drm_gem_object_put+0x2b/0x30 [drm]
Modules linked in: udl xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t>
CPU: 4 PID: 1370 Comm: Xorg Not tainted 5.0.0+ #2
backtraces when you plug in a udl device.
Fixes:
ae358dacd217 (drm/udl: Get rid of dev->struct_mutex usage)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andy Shevchenko [Tue, 12 Mar 2019 14:44:28 +0000 (16:44 +0200)]
auxdisplay: hd44780: Fix memory leak on ->remove()
[ Upstream commit
41c8d0adf3c4df1867d98cee4a2c4531352a33ad ]
We have to free on ->remove() the allocated resources on ->probe().
Fixes:
d47d88361fee ("auxdisplay: Add HD44780 Character LCD support")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kangjie Lu [Fri, 15 Mar 2019 04:04:14 +0000 (23:04 -0500)]
ALSA: sb8: add a check for request_region
[ Upstream commit
dcd0feac9bab901d5739de51b3f69840851f8919 ]
In case request_region fails, the fix returns an error code to
avoid NULL pointer dereference.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kangjie Lu [Fri, 15 Mar 2019 03:58:29 +0000 (22:58 -0500)]
ALSA: echoaudio: add a check for ioremap_nocache
[ Upstream commit
6ade657d6125ec3ec07f95fa51e28138aef6208f ]
In case ioremap_nocache fails, the fix releases chip and returns
an error code upstream to avoid NULL pointer dereference.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lukas Czerner [Fri, 15 Mar 2019 04:22:28 +0000 (00:22 -0400)]
ext4: report real fs size after failed resize
[ Upstream commit
6c7328400e0488f7d49e19e02290ba343b6811b2 ]
Currently when the file system resize using ext4_resize_fs() fails it
will report into log that "resized filesystem to <requested block
count>". However this may not be true in the case of failure. Use the
current block count as returned by ext4_blocks_count() to report the
block count.
Additionally, report a warning that "error occurred during file system
resize"
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lukas Czerner [Fri, 15 Mar 2019 04:15:32 +0000 (00:15 -0400)]
ext4: add missing brelse() in add_new_gdb_meta_bg()
[ Upstream commit
d64264d6218e6892edd832dc3a5a5857c2856c53 ]
Currently in add_new_gdb_meta_bg() there is a missing brelse of gdb_bh
in case ext4_journal_get_write_access() fails.
Additionally kvfree() is missing in the same error path. Fix it by
moving the ext4_journal_get_write_access() before the ext4 sb update as
Ted suggested and release n_group_desc and gdb_bh in case it fails.
Fixes:
61a9c11e5e7a ("ext4: add missing brelse() add_new_gdb_meta_bg()'s error path")
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan Kara [Fri, 15 Mar 2019 03:46:05 +0000 (23:46 -0400)]
ext4: avoid panic during forced reboot
[ Upstream commit
1dc1097ff60e4105216da7cd0aa99032b039a994 ]
When admin calls "reboot -f" - i.e., does a hard system reboot by
directly calling reboot(2) - ext4 filesystem mounted with errors=panic
can panic the system. This happens because the underlying device gets
disabled without unmounting the filesystem and thus some syscall running
in parallel to reboot(2) can result in the filesystem getting IO errors.
This is somewhat surprising to the users so try improve the behavior by
switching to errors=remount-ro behavior when the system is running
reboot(2).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Stephane Eranian [Thu, 7 Mar 2019 18:52:33 +0000 (10:52 -0800)]
perf/core: Restore mmap record type correctly
[ Upstream commit
d9c1bb2f6a2157b38e8eb63af437cb22701d31ee ]
On mmap(), perf_events generates a RECORD_MMAP record and then checks
which events are interested in this record. There are currently 2
versions of mmap records: RECORD_MMAP and RECORD_MMAP2. MMAP2 is larger.
The event configuration controls which version the user level tool
accepts.
If the event->attr.mmap2=1 field then MMAP2 record is returned. The
perf_event_mmap_output() takes care of this. It checks attr->mmap2 and
corrects the record fields before putting it in the sampling buffer of
the event. At the end the function restores the modified MMAP record
fields.
The problem is that the function restores the size but not the type.
Thus, if a subsequent event only accepts MMAP type, then it would
instead receive an MMAP2 record with a size of MMAP record.
This patch fixes the problem by restoring the record type on exit.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Fixes:
13d7a2410fa6 ("perf: Add attr->mmap2 attribute to an event")
Link: http://lkml.kernel.org/r/20190307185233.225521-1-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ZhangXiaoxu [Sat, 2 Mar 2019 01:17:32 +0000 (09:17 +0800)]
inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch()
[ Upstream commit
62c9d2674b31d4c8a674bee86b7edc6da2803aea ]
Commit
4d97f7d53da7dc83 ("inotify: Add flag IN_MASK_CREATE for
inotify_add_watch()") forgot to call fsnotify_put_mark() with
IN_MASK_CREATE after fsnotify_find_mark()
Fixes:
4d97f7d53da7dc83 ("inotify: Add flag IN_MASK_CREATE for inotify_add_watch()")
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Corentin Labbe [Mon, 25 Feb 2019 09:45:38 +0000 (09:45 +0000)]
arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM
[ Upstream commit
0728aeb7ead99a9b0dac2f3c92b3752b4e02ff97 ]
We have now a HSDK device in our kernelci lab, but kernel builded via
the hsdk_defconfig lacks ramfs supports, so it cannot boot kernelci jobs
yet.
So this patch enable CONFIG_BLK_DEV_RAM in hsdk_defconfig.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eugeniy Paltsev [Mon, 25 Feb 2019 17:16:01 +0000 (20:16 +0300)]
ARC: u-boot args: check that magic number is correct
[ Upstream commit
edb64bca50cd736c6894cc6081d5263c007ce005 ]
In case of devboards we really often disable bootloader and load
Linux image in memory via JTAG. Even if kernel tries to verify
uboot_tag and uboot_arg there is sill a chance that we treat some
garbage in registers as valid u-boot arguments in JTAG case.
E.g. it is enough to have '1' in r0 to treat any value in r2 as
a boot command line.
So check that magic number passed from u-boot is correct and drop
u-boot arguments otherwise. That helps to reduce the possibility
of using garbage as u-boot arguments in JTAG case.
We can safely check U-boot magic value (0x0) in linux passed via
r1 register as U-boot pass it from the beginning. So there is no
backward-compatibility issues.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Greg Kroah-Hartman [Wed, 17 Apr 2019 06:38:55 +0000 (08:38 +0200)]
Linux 4.19.35
Marc Orr [Tue, 2 Apr 2019 06:56:00 +0000 (23:56 -0700)]
KVM: x86: nVMX: fix x2APIC VTPR read intercept
commit
c73f4c998e1fd4249b9edfa39e23f4fda2b9b041 upstream.
Referring to the "VIRTUALIZING MSR-BASED APIC ACCESSES" chapter of the
SDM, when "virtualize x2APIC mode" is 1 and "APIC-register
virtualization" is 0, a RDMSR of 808H should return the VTPR from the
virtual APIC page.
However, for nested, KVM currently fails to disable the read intercept
for this MSR. This means that a RDMSR exit takes precedence over
"virtualize x2APIC mode", and KVM passes through L1's TPR to L2,
instead of sourcing the value from L2's virtual APIC page.
This patch fixes the issue by disabling the read intercept, in VMCS02,
for the VTPR when "APIC-register virtualization" is 0.
The issue described above and fix prescribed here, were verified with
a related patch in kvm-unit-tests titled "Test VMX's virtualize x2APIC
mode w/ nested".
Signed-off-by: Marc Orr <marcorr@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Fixes:
c992384bde84f ("KVM: vmx: speed up MSR bitmap merge")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marc Orr [Tue, 2 Apr 2019 06:55:59 +0000 (23:55 -0700)]
KVM: x86: nVMX: close leak of L0's x2APIC MSRs (CVE-2019-3887)
commit
acff78477b9b4f26ecdf65733a4ed77fe837e9dc upstream.
The nested_vmx_prepare_msr_bitmap() function doesn't directly guard the
x2APIC MSR intercepts with the "virtualize x2APIC mode" MSR. As a
result, we discovered the potential for a buggy or malicious L1 to get
access to L0's x2APIC MSRs, via an L2, as follows.
1. L1 executes WRMSR(IA32_SPEC_CTRL, 1). This causes the spec_ctrl
variable, in nested_vmx_prepare_msr_bitmap() to become true.
2. L1 disables "virtualize x2APIC mode" in VMCS12.
3. L1 enables "APIC-register virtualization" in VMCS12.
Now, KVM will set VMCS02's x2APIC MSR intercepts from VMCS12, and then
set "virtualize x2APIC mode" to 0 in VMCS02. Oops.
This patch closes the leak by explicitly guarding VMCS02's x2APIC MSR
intercepts with VMCS12's "virtualize x2APIC mode" control.
The scenario outlined above and fix prescribed here, were verified with
a related patch in kvm-unit-tests titled "Add leak scenario to
virt_x2apic_mode_test".
Note, it looks like this issue may have been introduced inadvertently
during a merge---see
15303ba5d1cd.
Signed-off-by: Marc Orr <marcorr@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Erik Schmauss [Wed, 17 Oct 2018 21:09:35 +0000 (14:09 -0700)]
ACPICA: AML interpreter: add region addresses in global list during initialization
commit
4abb951b73ff0a8a979113ef185651aa3c8da19b upstream.
The table load process omitted adding the operation region address
range to the global list. This omission is problematic because the OS
queries the global list to check for address range conflicts before
deciding which drivers to load. This commit may result in warning
messages that look like the following:
[ 7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts with op_region 0x00000400-0x0000047F (\PMIO) (
20180531/utaddress-213)
[ 7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
However, these messages do not signify regressions. It is a result of
properly adding address ranges within the global address list.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011
Tested-by: Jean-Marc Lenoir <archlinux@jihemel.com>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tomohiro Mayama [Sat, 9 Mar 2019 16:10:12 +0000 (01:10 +0900)]
arm64: dts: rockchip: Fix vcc_host1_5v GPIO polarity on rk3328-rock64
commit
a8772e5d826d0f61f8aa9c284b3ab49035d5273d upstream.
This patch makes USB ports functioning again.
Fixes:
955bebde057e ("arm64: dts: rockchip: add rk3328-rock64 board")
Cc: stable@vger.kernel.org
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tomohiro Mayama <parly-gh@iris.mystia.org>
Tested-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Katsuhiro Suzuki [Thu, 6 Sep 2018 15:39:47 +0000 (00:39 +0900)]
arm64: dts: rockchip: fix vcc_host1_5v pin assign on rk3328-rock64
commit
ef05bcb60c1a8841e38c91923ba998181117a87c upstream.
This patch fixes pin assign of vcc_host1_5v. This regulator is
controlled by USB20_HOST_DRV signal.
ROCK64 schematic says that GPIO0_A2 pin is used as USB20_HOST_DRV.
GPIO0_D3 pin is for SPDIF_TX_M0.
Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mikulas Patocka [Fri, 5 Apr 2019 19:26:39 +0000 (15:26 -0400)]
dm integrity: fix deadlock with overlapping I/O
commit
4ed319c6ac08e9a28fca7ac188181ac122f4de84 upstream.
dm-integrity will deadlock if overlapping I/O is issued to it, the bug
was introduced by commit
724376a04d1a ("dm integrity: implement fair
range locks"). Users rarely use overlapping I/O so this bug went
undetected until now.
Fix this bug by correcting, likely cut-n-paste, typos in
ranges_overlap() and also remove a flawed ranges_overlap() check in
remove_range_unlocked(). This condition could leave unprocessed bios
hanging on wait_list forever.
Cc: stable@vger.kernel.org # v4.19+
Fixes:
724376a04d1a ("dm integrity: implement fair range locks")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Dryomov [Tue, 26 Mar 2019 19:20:58 +0000 (20:20 +0100)]
dm table: propagate BDI_CAP_STABLE_WRITES to fix sporadic checksum errors
commit
eb40c0acdc342b815d4d03ae6abb09e80c0f2988 upstream.
Some devices don't use blk_integrity but still want stable pages
because they do their own checksumming. Examples include rbd and iSCSI
when data digests are negotiated. Stacking DM (and thus LVM) on top of
these devices results in sporadic checksum errors.
Set BDI_CAP_STABLE_WRITES if any underlying device has it set.
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mikulas Patocka [Thu, 21 Mar 2019 20:46:12 +0000 (16:46 -0400)]
dm: revert
8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")
commit
75ae193626de3238ca5fb895868ec91c94e63b1b upstream.
The limit was already incorporated to dm-crypt with commit
4e870e948fba
("dm crypt: fix error with too large bios"), so we don't need to apply
it globally to all targets. The quantity BIO_MAX_PAGES * PAGE_SIZE is
wrong anyway because the variable ti->max_io_len it is supposed to be in
the units of 512-byte sectors not in bytes.
Reduction of the limit to 1048576 sectors could even cause data
corruption in rare cases - suppose that we have a dm-striped device with
stripe size 768MiB. The target will call dm_set_target_max_io_len with
the value 1572864. The buggy code would reduce it to 1048576. Now, the
dm-core will errorneously split the bios on 1048576-sector boundary
insetad of 1572864-sector boundary and pass these stripe-crossing bios
to the striped target.
Cc: stable@vger.kernel.org # v4.16+
Fixes:
8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mikulas Patocka [Wed, 13 Mar 2019 11:56:02 +0000 (07:56 -0400)]
dm integrity: change memcmp to strncmp in dm_integrity_ctr
commit
0d74e6a3b6421d98eeafbed26f29156d469bc0b5 upstream.
If the string opt_string is small, the function memcmp can access bytes
that are beyond the terminating nul character. In theory, it could cause
segfault, if opt_string were located just below some unmapped memory.
Change from memcmp to strncmp so that we don't read bytes beyond the end
of the string.
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sergey Miroshnichenko [Tue, 12 Mar 2019 12:05:48 +0000 (15:05 +0300)]
PCI: pciehp: Ignore Link State Changes after powering off a slot
commit
3943af9d01e94330d0cfac6fccdbc829aad50c92 upstream.
During a safe hot remove, the OS powers off the slot, which may cause a
Data Link Layer State Changed event. The slot has already been set to
OFF_STATE, so that event results in re-enabling the device, making it
impossible to safely remove it.
Clear out the Presence Detect Changed and Data Link Layer State Changed
events when the disabled slot has settled down.
It is still possible to re-enable the device if it remains in the slot
after pressing the Attention Button by pressing it again.
Fixes the problem that Micah reported below: an NVMe drive power button may
not actually turn off the drive.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237
Reported-by: Micah Parrish <micah.parrish@hpe.com>
Tested-by: Micah Parrish <micah.parrish@hpe.com>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
[bhelgaas: changelog, add bugzilla URL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andre Przywara [Fri, 5 Apr 2019 15:20:47 +0000 (16:20 +0100)]
PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
commit
9cde402a59770a0669d895399c13407f63d7d209 upstream.
There is a Marvell 88SE9170 PCIe SATA controller I found on a board here.
Some quick testing with the ARM SMMU enabled reveals that it suffers from
the same requester ID mixup problems as the other Marvell chips listed
already.
Add the PCI vendor/device ID to the list of chips which need the
workaround.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lendacky, Thomas [Tue, 2 Apr 2019 15:21:18 +0000 (15:21 +0000)]
x86/perf/amd: Remove need to check "running" bit in NMI handler
commit
3966c3feca3fd10b2935caa0b4a08c7dd59469e5 upstream.
Spurious interrupt support was added to perf in the following commit, almost
a decade ago:
63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters")
The two previous patches (resolving the race condition when disabling a
PMC and NMI latency mitigation) allow for the removal of this older
spurious interrupt support.
Currently in x86_pmu_stop(), the bit for the PMC in the active_mask bitmap
is cleared before disabling the PMC, which sets up a race condition. This
race condition was mitigated by introducing the running bitmap. That race
condition can be eliminated by first disabling the PMC, waiting for PMC
reset on overflow and then clearing the bit for the PMC in the active_mask
bitmap. The NMI handler will not re-enable a disabled counter.
If x86_pmu_stop() is called from the perf NMI handler, the NMI latency
mitigation support will guard against any unhandled NMI messages.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lendacky, Thomas [Tue, 2 Apr 2019 15:21:16 +0000 (15:21 +0000)]
x86/perf/amd: Resolve NMI latency issues for active PMCs
commit
6d3edaae16c6c7d238360f2841212c2b26774d5e upstream.
On AMD processors, the detection of an overflowed PMC counter in the NMI
handler relies on the current value of the PMC. So, for example, to check
for overflow on a 48-bit counter, bit 47 is checked to see if it is 1 (not
overflowed) or 0 (overflowed).
When the perf NMI handler executes it does not know in advance which PMC
counters have overflowed. As such, the NMI handler will process all active
PMC counters that have overflowed. NMI latency in newer AMD processors can
result in multiple overflowed PMC counters being processed in one NMI and
then a subsequent NMI, that does not appear to be a back-to-back NMI, not
finding any PMC counters that have overflowed. This may appear to be an
unhandled NMI resulting in either a panic or a series of messages,
depending on how the kernel was configured.
To mitigate this issue, add an AMD handle_irq callback function,
amd_pmu_handle_irq(), that will invoke the common x86_pmu_handle_irq()
function and upon return perform some additional processing that will
indicate if the NMI has been handled or would have been handled had an
earlier NMI not handled the overflowed PMC. Using a per-CPU variable, a
minimum value of the number of active PMCs or 2 will be set whenever a
PMC is active. This is used to indicate the possible number of NMIs that
can still occur. The value of 2 is used for when an NMI does not arrive
at the LAPIC in time to be collapsed into an already pending NMI. Each
time the function is called without having handled an overflowed counter,
the per-CPU value is checked. If the value is non-zero, it is decremented
and the NMI indicates that it handled the NMI. If the value is zero, then
the NMI indicates that it did not handle the NMI.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lendacky, Thomas [Tue, 2 Apr 2019 15:21:14 +0000 (15:21 +0000)]
x86/perf/amd: Resolve race condition when disabling PMC
commit
914123fa39042e651d79eaf86bbf63a1b938dddf upstream.
On AMD processors, the detection of an overflowed counter in the NMI
handler relies on the current value of the counter. So, for example, to
check for overflow on a 48 bit counter, bit 47 is checked to see if it
is 1 (not overflowed) or 0 (overflowed).
There is currently a race condition present when disabling and then
updating the PMC. Increased NMI latency in newer AMD processors makes this
race condition more pronounced. If the counter value has overflowed, it is
possible to update the PMC value before the NMI handler can run. The
updated PMC value is not an overflowed value, so when the perf NMI handler
does run, it will not find an overflowed counter. This may appear as an
unknown NMI resulting in either a panic or a series of messages, depending
on how the kernel is configured.
To eliminate this race condition, the PMC value must be checked after
disabling the counter. Add an AMD function, amd_pmu_disable_all(), that
will wait for the NMI handler to reset any active and overflowed counter
after calling x86_pmu_disable_all().
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org> # 4.14.x-
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Potapenko [Tue, 2 Apr 2019 11:28:13 +0000 (13:28 +0200)]
x86/asm: Use stricter assembly constraints in bitops
commit
5b77e95dd7790ff6c8fbf1cd8d0104ebed818a03 upstream.
There's a number of problems with how arch/x86/include/asm/bitops.h
is currently using assembly constraints for the memory region
bitops are modifying:
1) Use memory clobber in bitops that touch arbitrary memory
Certain bit operations that read/write bits take a base pointer and an
arbitrarily large offset to address the bit relative to that base.
Inline assembly constraints aren't expressive enough to tell the
compiler that the assembly directive is going to touch a specific memory
location of unknown size, therefore we have to use the "memory" clobber
to indicate that the assembly is going to access memory locations other
than those listed in the inputs/outputs.
To indicate that BTR/BTS instructions don't necessarily touch the first
sizeof(long) bytes of the argument, we also move the address to assembly
inputs.
This particular change leads to size increase of 124 kernel functions in
a defconfig build. For some of them the diff is in NOP operations, other
end up re-reading values from memory and may potentially slow down the
execution. But without these clobbers the compiler is free to cache
the contents of the bitmaps and use them as if they weren't changed by
the inline assembly.
2) Use byte-sized arguments for operations touching single bytes.
Passing a long value to ANDB/ORB/XORB instructions makes the compiler
treat sizeof(long) bytes as being clobbered, which isn't the case. This
may theoretically lead to worse code in the case of heavy optimization.
Practical impact:
I've built a defconfig kernel and looked through some of the functions
generated by GCC 7.3.0 with and without this clobber, and didn't spot
any miscompilations.
However there is a (trivial) theoretical case where this code leads to
miscompilation:
https://lkml.org/lkml/2019/3/28/393
using just GCC 8.3.0 with -O2. It isn't hard to imagine someone writes
such a function in the kernel someday.
So the primary motivation is to fix an existing misuse of the asm
directive, which happens to work in certain configurations now, but
isn't guaranteed to work under different circumstances.
[ --mingo: Added -stable tag because defconfig only builds a fraction
of the kernel and the trivial testcase looks normal enough to
be used in existing or in-development code. ]
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Y Knight <jyknight@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20190402112813.193378-1-glider@google.com
[ Edited the changelog, tidied up one of the defines. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rasmus Villemoes [Fri, 11 Jan 2019 08:49:30 +0000 (09:49 +0100)]
x86/asm: Remove dead __GNUC__ conditionals
commit
88ca66d8540ca26119b1428cddb96b37925bdf01 upstream.
The minimum supported gcc version is >= 4.6, so these can be removed.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190111084931.24601-1-linux@rasmusvillemoes.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Max Filippov [Thu, 4 Apr 2019 18:08:40 +0000 (11:08 -0700)]
xtensa: fix return_address
commit
ada770b1e74a77fff2d5f539bf6c42c25f4784db upstream.
return_address returns the address that is one level higher in the call
stack than requested in its argument, because level 0 corresponds to its
caller's return address. Use requested level as the number of stack
frames to skip.
This fixes the address reported by might_sleep and friends.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mel Gorman [Tue, 19 Mar 2019 12:36:10 +0000 (12:36 +0000)]
sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
commit
0e9f02450da07fc7b1346c8c32c771555173e397 upstream.
A NULL pointer dereference bug was reported on a distribution kernel but
the same issue should be present on mainline kernel. It occured on s390
but should not be arch-specific. A partial oops looks like:
Unable to handle kernel pointer dereference in virtual kernel address space
...
Call Trace:
...
try_to_wake_up+0xfc/0x450
vhost_poll_wakeup+0x3a/0x50 [vhost]
__wake_up_common+0xbc/0x178
__wake_up_common_lock+0x9e/0x160
__wake_up_sync_key+0x4e/0x60
sock_def_readable+0x5e/0x98
The bug hits any time between 1 hour to 3 days. The dereference occurs
in update_cfs_rq_h_load when accumulating h_load. The problem is that
cfq_rq->h_load_next is not protected by any locking and can be updated
by parallel calls to task_h_load. Depending on the compiler, code may be
generated that re-reads cfq_rq->h_load_next after the check for NULL and
then oops when reading se->avg.load_avg. The dissassembly showed that it
was possible to reread h_load_next after the check for NULL.
While this does not appear to be an issue for later compilers, it's still
an accident if the correct code is generated. Full locking in this path
would have high overhead so this patch uses READ_ONCE to read h_load_next
only once and check for NULL before dereferencing. It was confirmed that
there were no further oops after 10 days of testing.
As Peter pointed out, it is also necessary to use WRITE_ONCE() to avoid any
potential problems with store tearing.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Fixes:
685207963be9 ("sched: Move h_load calculation to task_h_load()")
Link: https://lkml.kernel.org/r/20190319123610.nsivgf3mjbjjesxb@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Thu, 4 Apr 2019 15:12:17 +0000 (18:12 +0300)]
xen: Prevent buffer overflow in privcmd ioctl
commit
42d8644bd77dd2d747e004e367cb0c895a606f39 upstream.
The "call" variable comes from the user in privcmd_ioctl_hypercall().
It's an offset into the hypercall_page[] which has (PAGE_SIZE / 32)
elements. We need to put an upper bound on it to prevent an out of
bounds access.
Cc: stable@vger.kernel.org
Fixes:
1246ae0bb992 ("xen: add variable hypercall caller")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Will Deacon [Mon, 8 Apr 2019 16:56:34 +0000 (17:56 +0100)]
arm64: backtrace: Don't bother trying to unwind the userspace stack
commit
1e6f5440a6814d28c32d347f338bfef68bc3e69d upstream.
Calling dump_backtrace() with a pt_regs argument corresponding to
userspace doesn't make any sense and our unwinder will simply print
"Call trace:" before unwinding the stack looking for user frames.
Rather than go through this song and dance, just return early if we're
passed a user register state.
Cc: <stable@vger.kernel.org>
Fixes:
1149aad10b1e ("arm64: Add dump_backtrace() in show_regs")
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Geis [Wed, 13 Mar 2019 18:45:36 +0000 (18:45 +0000)]
arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
commit
6fd8b9780ec1a49ac46e0aaf8775247205e66231 upstream.
Several rk3328 based boards experience high rgmii tx error rates.
This is due to several pins in the rk3328.dtsi rgmii pinmux that are
missing a defined pull strength setting.
This causes the pinmux driver to default to 2ma (bit mask 00).
These pins are only defined in the rk3328.dtsi, and are not listed in
the rk3328 specification.
The TRM only lists them as "Reserved"
(RK3328 TRM V1.1, 3.3.3 Detail Register Description, GRF_GPIO0B_IOMUX,
GRF_GPIO0C_IOMUX, GRF_GPIO0D_IOMUX).
However, removal of these pins from the rgmii pinmux definition causes
the interface to fail to transmit.
Also, the rgmii tx and rx pins defined in the dtsi are not consistent
with the rk3328 specification, with tx pins currently set to 12ma and
rx pins set to 2ma.
Fix this by setting tx pins to 8ma and the rx pins to 4ma, consistent
with the specification.
Defining the drive strength for the undefined pins eliminated the high
tx packet error rate observed under heavy data transfers.
Aligning the drive strength to the TRM values eliminated the occasional
packet retry errors under iperf3 testing.
This allows much higher data rates with no recorded tx errors.
Tested on the rk3328-roc-cc board.
Fixes:
52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Will Deacon [Mon, 8 Apr 2019 11:45:09 +0000 (12:45 +0100)]
arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
commit
045afc24124d80c6998d9c770844c67912083506 upstream.
Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
explicitly set the return value on the non-faulting path and instead
leaves it holding the result of the underlying atomic operation. This
means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
value will be reported as having failed. Regrettably, I wrote the buggy
code back in 2011 and it was upstreamed as part of the initial arm64
support in 2012.
The reasons we appear to get away with this are:
1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
exercised by futex() test applications
2. If the result of the atomic operation is zero, the system call
behaves correctly
3. Prior to version 2.25, the only operation used by GLIBC set the
futex to zero, and therefore worked as expected. From 2.25 onwards,
FUTEX_WAKE_OP is not used by GLIBC at all.
Fix the implementation by ensuring that the return value is either 0
to indicate that the atomic operation completed successfully, or -EFAULT
if we encountered a fault when accessing the user mapping.
Cc: <stable@kernel.org>
Fixes:
6170a97460db ("arm64: Atomic operations")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Engraf [Mon, 11 Mar 2019 07:57:42 +0000 (08:57 +0100)]
ARM: dts: at91: Fix typo in ISC_D0 on PC9
commit
e7dfb6d04e4715be1f3eb2c60d97b753fd2e4516 upstream.
The function argument for the ISC_D0 on PC9 was incorrect. According to
the documentation it should be 'C' aka 3.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Fixes:
7f16cb676c00 ("ARM: at91/dt: add sama5d2 pinmux")
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Ujfalusi [Fri, 15 Mar 2019 10:59:09 +0000 (12:59 +0200)]
ARM: dts: am335x-evm: Correct the regulators for the audio codec
commit
4f96dc0a3e79ec257a2b082dab3ee694ff88c317 upstream.
Correctly map the regulators used by tlv320aic3106.
Both 1.8V and 3.3V for the codec is derived from VBAT via fixed regulators.
Cc: <Stable@vger.kernel.org> # v4.14+
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Ujfalusi [Fri, 15 Mar 2019 10:59:17 +0000 (12:59 +0200)]
ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
commit
6691370646e844be98bb6558c024269791d20bd7 upstream.
Correctly map the regulators used by tlv320aic3106.
Both 1.8V and 3.3V for the codec is derived from VBAT via fixed regulators.
Cc: <Stable@vger.kernel.org> # v4.14+
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jonas Karlman [Sun, 24 Feb 2019 21:51:22 +0000 (21:51 +0000)]
ARM: dts: rockchip: fix rk3288 cpu opp node reference
commit
6b2fde3dbfab6ebc45b0cd605e17ca5057ff9a3b upstream.
The following error can be seen during boot:
of: /cpus/cpu@501: Couldn't find opp node
Change cpu nodes to use operating-points-v2 in order to fix this.
Fixes:
ce76de984649 ("ARM: dts: rockchip: convert rk3288 to operating-points-v2")
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cornelia Huck [Mon, 8 Apr 2019 12:33:22 +0000 (14:33 +0200)]
virtio: Honour 'may_reduce_num' in vring_create_virtqueue
commit
cf94db21905333e610e479688add629397a4b384 upstream.
vring_create_virtqueue() allows the caller to specify via the
may_reduce_num parameter whether the vring code is allowed to
allocate a smaller ring than specified.
However, the split ring allocation code tries to allocate a
smaller ring on allocation failure regardless of what the
caller specified. This may cause trouble for e.g. virtio-pci
in legacy mode, which does not support ring resizing. (The
packed ring code does not resize in any case.)
Let's fix this by bailing out immediately in the split ring code
if the requested size cannot be allocated and may_reduce_num has
not been specified.
While at it, fix a typo in the usage instructions.
Fixes:
2a2d1382fe9d ("virtio: Add improved queue allocation API")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kefeng Wang [Thu, 4 Apr 2019 07:45:12 +0000 (15:45 +0800)]
genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
commit
e8458e7afa855317b14915d7b86ab3caceea7eb6 upstream.
When CONFIG_SPARSE_IRQ is disable, the request_mutex in struct irq_desc
is not initialized which causes malfunction.
Fixes:
9114014cf4e6 ("genirq: Add mutex to irq desc to serialize request/free_irq()")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stephen Boyd [Mon, 25 Mar 2019 18:10:26 +0000 (11:10 -0700)]
genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
commit
325aa19598e410672175ed50982f902d4e3f31c5 upstream.
If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip
has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned.
This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when
the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to
walk the chain of parents and set irq wake on any chips that don't have the
flag set either. If the intent is to call the .irq_set_wake() callback of
the parent irqchip, then we expect irqchip implementations to omit the
IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that
calls irq_chip_set_wake_parent().
The problem has been observed on a Qualcomm sdm845 device where set wake
fails on any GPIO interrupts after applying work in progress wakeup irq
patches to the GPIO driver. The chain of chips looks like this:
QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP)
The GPIO controllers parent is the QCOM PDC irqchip which in turn has ARM
GIC as parent. The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE flag
set, and so does the grandparent ARM GIC.
The GPIO driver doesn't know if the parent needs to set wake or not, so it
unconditionally calls irq_chip_set_wake_parent() causing this function to
return a failure because the parent irqchip (PDC) doesn't have the
.irq_set_wake() callback set. Returning 0 instead makes everything work and
irqs from the GPIO controller can be configured for wakeup.
Make it consistent by returning 0 (success) from irq_chip_set_wake_parent()
when a parent chip has IRQCHIP_SKIP_SET_WAKE set.
[ tglx: Massaged changelog ]
Fixes:
08b55e2a9208e ("genirq: Add irqchip_set_wake_parent")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-gpio@vger.kernel.org
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Yan [Fri, 12 Apr 2019 02:09:16 +0000 (10:09 +0800)]
block: fix the return errno for direct IO
commit
a89afe58f1a74aac768a5eb77af95ef4ee15beaa upstream.
If the last bio returned is not dio->bio, the status of the bio will
not assigned to dio->bio if it is error. This will cause the whole IO
status wrong.
ksoftirqd/21-117 [021] ..s. 4017.966090: 8,0 C N 4883648 [0]
<idle>-0 [018] ..s. 4017.970888: 8,0 C WS 4924800 + 1024 [0]
<idle>-0 [018] ..s. 4017.970909: 8,0 D WS 4935424 + 1024 [<idle>]
<idle>-0 [018] ..s. 4017.970924: 8,0 D WS 4936448 + 321 [<idle>]
ksoftirqd/21-117 [021] ..s. 4017.995033: 8,0 C R 4883648 + 336 [65475]
ksoftirqd/21-117 [021] d.s. 4018.001988: myprobe1: (blkdev_bio_end_io+0x0/0x168) bi_status=7
ksoftirqd/21-117 [021] d.s. 4018.001992: myprobe: (aio_complete_rw+0x0/0x148) x0=0xffff802f2595ad80 res=0x12a000 res2=0x0
We always have to assign bio->bi_status to dio->bio.bi_status because we
will only check dio->bio.bi_status when we return the whole IO to
the upper layer.
Fixes:
542ff7bf18c6 ("block: new direct I/O implementation")
Cc: stable@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jérôme Glisse [Wed, 10 Apr 2019 20:27:51 +0000 (16:27 -0400)]
block: do not leak memory in bio_copy_user_iov()
commit
a3761c3c91209b58b6f33bf69dd8bb8ec0c9d925 upstream.
When bio_add_pc_page() fails in bio_copy_user_iov() we should free
the page we just allocated otherwise we are leaking it.
Cc: linux-block@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry V. Levin [Fri, 29 Mar 2019 17:12:21 +0000 (20:12 +0300)]
riscv: Fix syscall_get_arguments() and syscall_set_arguments()
commit
10a16997db3d99fc02c026cf2c6e6c670acafab0 upstream.
RISC-V syscall arguments are located in orig_a0,a1..a5 fields
of struct pt_regs.
Due to an off-by-one bug and a bug in pointer arithmetic
syscall_get_arguments() was reading s3..s7 fields instead of a1..a5.
Likewise, syscall_set_arguments() was writing s3..s7 fields
instead of a1..a5.
Link: http://lkml.kernel.org/r/20190329171221.GA32456@altlinux.org
Fixes:
e2c0cdfba7f69 ("RISC-V: User-facing API")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Cc: stable@vger.kernel.org # v4.15+
Acked-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anand Jain [Tue, 2 Apr 2019 10:07:40 +0000 (18:07 +0800)]
btrfs: prop: fix vanished compression property after failed set
commit
272e5326c7837697882ce3162029ba893059b616 upstream.
The compression property resets to NULL, instead of the old value if we
fail to set the new compression parameter.
$ btrfs prop get /btrfs compression
compression=lzo
$ btrfs prop set /btrfs compression zli
ERROR: failed to set compression for /btrfs: Invalid argument
$ btrfs prop get /btrfs compression
This is because the compression property ->validate() is successful for
'zli' as the strncmp() used the length passed from the userspace.
Fix it by using the expected string length in strncmp().
Fixes:
63541927c8d1 ("Btrfs: add support for inode properties")
Fixes:
5c1aab1dd544 ("btrfs: Add zstd support")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anand Jain [Tue, 2 Apr 2019 10:07:38 +0000 (18:07 +0800)]
btrfs: prop: fix zstd compression parameter validation
commit
50398fde997f6be8faebdb5f38e9c9c467370f51 upstream.
We let pass zstd compression parameter even if it is not fully valid.
For example:
$ btrfs prop set /btrfs compression zst
$ btrfs prop get /btrfs compression
compression=zst
zlib and lzo are fine.
Fix it by checking the correct prefix length.
Fixes:
5c1aab1dd544 ("btrfs: Add zstd support")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Filipe Manana [Tue, 26 Mar 2019 10:49:56 +0000 (10:49 +0000)]
Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
commit
f35f06c35560a86e841631f0243b83a984dc11a9 upstream.
Whan a filesystem is mounted with the nologreplay mount option, which
requires it to be mounted in RO mode as well, we can not allow discard on
free space inside block groups, because log trees refer to extents that
are not pinned in a block group's free space cache (pinning the extents is
precisely the first phase of replaying a log tree).
So do not allow the fitrim ioctl to do anything when the filesystem is
mounted with the nologreplay option, because later it can be mounted RW
without that option, which causes log replay to happen and result in
either a failure to replay the log trees (leading to a mount failure), a
crash or some silent corruption.
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Fixes:
96da09192cda ("btrfs: Introduce new mount option to disable tree log replay")
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
S.j. Wang [Wed, 27 Feb 2019 06:31:12 +0000 (06:31 +0000)]
ASoC: fsl_esai: fix channel swap issue when stream starts
commit
0ff4e8c61b794a4bf6c854ab071a1abaaa80f358 upstream.
There is very low possibility ( < 0.1% ) that channel swap happened
in beginning when multi output/input pin is enabled. The issue is
that hardware can't send data to correct pin in the beginning with
the normal enable flow.
This is hardware issue, but there is no errata, the workaround flow
is that: Each time playback/recording, firstly clear the xSMA/xSMB,
then enable TE/RE, then enable xSMB and xSMA (xSMB must be enabled
before xSMA). Which is to use the xSMA as the trigger start register,
previously the xCR_TE or xCR_RE is the bit for starting.
Fixes commit
43d24e76b698 ("ASoC: fsl_esai: Add ESAI CPU DAI driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Guenter Roeck [Fri, 22 Mar 2019 22:39:48 +0000 (15:39 -0700)]
ASoC: intel: Fix crash at suspend/resume after failed codec registration
commit
8f71370f4b02730e8c27faf460af7a3586e24e1f upstream.
If codec registration fails after the ASoC Intel SST driver has been probed,
the kernel will Oops and crash at suspend/resume.
general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 #15
Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014
RIP: 0010:snd_soc_suspend+0x5a/0xd21
Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89
fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03
<8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b
RSP: 0018:
ffff888035407750 EFLAGS:
00010202
RAX:
0000000000000034 RBX:
00000000000001a0 RCX:
0000000000000000
RDX:
dffffc0000000000 RSI:
0000000000000008 RDI:
ffff88805c417098
RBP:
ffff8880354077b0 R08:
dffffc0000000000 R09:
ffffed100b975718
R10:
0000000000000001 R11:
ffffffff949ea4a3 R12:
1ffff1100b975746
R13:
dffffc0000000000 R14:
ffff88805cba4588 R15:
dffffc0000000000
FS:
0000794a78e91b80(0000) GS:
ffff888068d00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007bd5283ccf58 CR3:
000000004b7aa000 CR4:
00000000001006e0
Call Trace:
? dpm_complete+0x67b/0x67b
? i915_gem_suspend+0x14d/0x1ad
sst_soc_prepare+0x91/0x1dd
? sst_be_hw_params+0x7e/0x7e
dpm_prepare+0x39a/0x88b
dpm_suspend_start+0x13/0x9d
suspend_devices_and_enter+0x18f/0xbd7
? arch_suspend_enable_irqs+0x11/0x11
? printk+0xd9/0x12d
? lock_release+0x95f/0x95f
? log_buf_vmcoreinfo_setup+0x131/0x131
? rcu_read_lock_sched_held+0x140/0x22a
? __bpf_trace_rcu_utilization+0xa/0xa
? __pm_pr_dbg+0x186/0x190
? pm_notifier_call_chain+0x39/0x39
? suspend_test+0x9d/0x9d
pm_suspend+0x2f4/0x728
? trace_suspend_resume+0x3da/0x3da
? lock_release+0x95f/0x95f
? kernfs_fop_write+0x19f/0x32d
state_store+0xd8/0x147
? sysfs_kf_read+0x155/0x155
kernfs_fop_write+0x23e/0x32d
__vfs_write+0x108/0x608
? vfs_read+0x2e9/0x2e9
? rcu_read_lock_sched_held+0x140/0x22a
? __bpf_trace_rcu_utilization+0xa/0xa
? debug_smp_processor_id+0x10/0x10
? selinux_file_permission+0x1c5/0x3c8
? rcu_sync_lockdep_assert+0x6a/0xad
? __sb_start_write+0x129/0x2ac
vfs_write+0x1aa/0x434
ksys_write+0xfe/0x1be
? __ia32_sys_read+0x82/0x82
do_syscall_64+0xcd/0x120
entry_SYSCALL_64_after_hwframe+0x49/0xbe
In the observed situation, the problem is seen because the codec driver
failed to probe due to a hardware problem.
max98090 i2c-
193C9890:00: Failed to read device revision: -1
max98090 i2c-
193C9890:00: ASoC: failed to probe component -1
cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1
cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1
cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1
The problem is similar to the problem solved with commit
2fc995a87f2e
("ASoC: intel: Fix crash at suspend/resume without card registration"),
but codec registration fails at a later point. At that time, the pointer
checked with the above mentioned commit is already set, but it is not
cleared if the device is subsequently removed. Adding a remove function
to clear the pointer fixes the problem.
Cc: stable@vger.kernel.org
Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Cc: Curtis Malainey <cujomalainey@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Thelen [Sat, 6 Apr 2019 01:39:18 +0000 (18:39 -0700)]
mm: writeback: use exact memcg dirty counts
commit
0b3d6e6f2dd0a7b697b1aa8c167265908940624b upstream.
Since commit
a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting") memcg dirty and writeback counters are managed
as:
1) per-memcg per-cpu values in range of [-32..32]
2) per-memcg atomic counter
When a per-cpu counter cannot fit in [-32..32] it's flushed to the
atomic. Stat readers only check the atomic. Thus readers such as
balance_dirty_pages() may see a nontrivial error margin: 32 pages per
cpu.
Assuming 100 cpus:
4k x86 page_size: 13 MiB error per memcg
64k ppc page_size: 200 MiB error per memcg
Considering that dirty+writeback are used together for some decisions the
errors double.
This inaccuracy can lead to undeserved oom kills. One nasty case is
when all per-cpu counters hold positive values offsetting an atomic
negative value (i.e. per_cpu[*]=32, atomic=n_cpu*-32).
balance_dirty_pages() only consults the atomic and does not consider
throttling the next n_cpu*32 dirty pages. If the file_lru is in the
13..200 MiB range then there's absolutely no dirty throttling, which
burdens vmscan with only dirty+writeback pages thus resorting to oom
kill.
It could be argued that tiny containers are not supported, but it's more
subtle. It's the amount the space available for file lru that matters.
If a container has memory.max-200MiB of non reclaimable memory, then it
will also suffer such oom kills on a 100 cpu machine.
The following test reliably ooms without this patch. This patch avoids
oom kills.
$ cat test
mount -t cgroup2 none /dev/cgroup
cd /dev/cgroup
echo +io +memory > cgroup.subtree_control
mkdir test
cd test
echo 10M > memory.max
(echo $BASHPID > cgroup.procs && exec /memcg-writeback-stress /foo)
(echo $BASHPID > cgroup.procs && exec dd if=/dev/zero of=/foo bs=2M count=100)
$ cat memcg-writeback-stress.c
/*
* Dirty pages from all but one cpu.
* Clean pages from the non dirtying cpu.
* This is to stress per cpu counter imbalance.
* On a 100 cpu machine:
* - per memcg per cpu dirty count is 32 pages for each of 99 cpus
* - per memcg atomic is -99*32 pages
* - thus the complete dirty limit: sum of all counters 0
* - balance_dirty_pages() only sees atomic count -99*32 pages, which
* it max()s to 0.
* - So a workload can dirty -99*32 pages before balance_dirty_pages()
* cares.
*/
#define _GNU_SOURCE
#include <err.h>
#include <fcntl.h>
#include <sched.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysinfo.h>
#include <sys/types.h>
#include <unistd.h>
static char *buf;
static int bufSize;
static void set_affinity(int cpu)
{
cpu_set_t affinity;
CPU_ZERO(&affinity);
CPU_SET(cpu, &affinity);
if (sched_setaffinity(0, sizeof(affinity), &affinity))
err(1, "sched_setaffinity");
}
static void dirty_on(int output_fd, int cpu)
{
int i, wrote;
set_affinity(cpu);
for (i = 0; i < 32; i++) {
for (wrote = 0; wrote < bufSize; ) {
int ret = write(output_fd, buf+wrote, bufSize-wrote);
if (ret == -1)
err(1, "write");
wrote += ret;
}
}
}
int main(int argc, char **argv)
{
int cpu, flush_cpu = 1, output_fd;
const char *output;
if (argc != 2)
errx(1, "usage: output_file");
output = argv[1];
bufSize = getpagesize();
buf = malloc(getpagesize());
if (buf == NULL)
errx(1, "malloc failed");
output_fd = open(output, O_CREAT|O_RDWR);
if (output_fd == -1)
err(1, "open(%s)", output);
for (cpu = 0; cpu < get_nprocs(); cpu++) {
if (cpu != flush_cpu)
dirty_on(output_fd, cpu);
}
set_affinity(flush_cpu);
if (fsync(output_fd))
err(1, "fsync(%s)", output);
if (close(output_fd))
err(1, "close(%s)", output);
free(buf);
}
Make balance_dirty_pages() and wb_over_bg_thresh() work harder to
collect exact per memcg counters. This avoids the aforementioned oom
kills.
This does not affect the overhead of memory.stat, which still reads the
single atomic counter.
Why not use percpu_counter? memcg already handles cpus going offline, so
no need for that overhead from percpu_counter. And the percpu_counter
spinlocks are more heavyweight than is required.
It probably also makes sense to use exact dirty and writeback counters
in memcg oom reports. But that is saved for later.
Link: http://lkml.kernel.org/r/20190329174609.164344-1-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org> [4.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Sat, 6 Apr 2019 01:38:53 +0000 (18:38 -0700)]
include/linux/bitrev.h: fix constant bitrev
commit
6147e136ff5071609b54f18982dea87706288e21 upstream.
clang points out with hundreds of warnings that the bitrev macros have a
problem with constant input:
drivers/hwmon/sht15.c:187:11: error: variable '__x' is uninitialized when used within its own initialization
[-Werror,-Wuninitialized]
u8 crc = bitrev8(data->val_status & 0x0F);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/bitrev.h:102:21: note: expanded from macro 'bitrev8'
__constant_bitrev8(__x) : \
~~~~~~~~~~~~~~~~~~~^~~~
include/linux/bitrev.h:67:11: note: expanded from macro '__constant_bitrev8'
u8 __x = x; \
~~~ ^
Both the bitrev and the __constant_bitrev macros use an internal
variable named __x, which goes horribly wrong when passing one to the
other.
The obvious fix is to rename one of the variables, so this adds an extra
'_'.
It seems we got away with this because
- there are only a few drivers using bitrev macros
- usually there are no constant arguments to those
- when they are constant, they tend to be either 0 or (unsigned)-1
(drivers/isdn/i4l/isdnhdlc.o, drivers/iio/amplifiers/ad8366.c) and
give the correct result by pure chance.
In fact, the only driver that I could find that gets different results
with this is drivers/net/wan/slic_ds26522.c, which in turn is a driver
for fairly rare hardware (adding the maintainer to Cc for testing).
Link: http://lkml.kernel.org/r/20190322140503.123580-1-arnd@arndb.de
Fixes:
556d2f055bf6 ("ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Cc: Yalin Wang <yalin.wang@sonymobile.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Rientjes [Tue, 19 Mar 2019 22:19:56 +0000 (15:19 -0700)]
kvm: svm: fix potential get_num_contig_pages overflow
commit
ede885ecb2cdf8a8dd5367702e3d964ec846a2d5 upstream.
get_num_contig_pages() could potentially overflow int so make its type
consistent with its usage.
Reported-by: Cfir Cohen <cfir@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Airlie [Fri, 5 Apr 2019 03:17:13 +0000 (13:17 +1000)]
drm/udl: add a release method and delay modeset teardown
commit
9b39b013037fbfa8d4b999345d9e904d8a336fc2 upstream.
If we unplug a udl device, the usb callback with deinit the
mode_config struct, however userspace will still have an open
file descriptor and a framebuffer on that device. When userspace
closes the fd, we'll oops because it'll try and look stuff up
in the object idr which we've destroyed.
This punts destroying the mode objects until release time instead.
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190405031715.5959-2-airlied@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yan Zhao [Wed, 27 Mar 2019 04:54:51 +0000 (00:54 -0400)]
drm/i915/gvt: do not deliver a workload if its creation fails
commit
dade58ed5af6365ac50ff4259c2a0bf31219e285 upstream.
in workload creation routine, if any failure occurs, do not queue this
workload for delivery. if this failure is fatal, enter into failsafe
mode.
Fixes:
6d76303553ba ("drm/i915/gvt: Move common vGPU workload creation into scheduler.c")
Cc: stable@vger.kernel.org #4.19+
Cc: zhenyuw@linux.intel.com
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrei Vagin [Mon, 8 Apr 2019 04:15:42 +0000 (21:15 -0700)]
alarmtimer: Return correct remaining time
commit
07d7e12091f4ab869cc6a4bb276399057e73b0b3 upstream.
To calculate a remaining time, it's required to subtract the current time
from the expiration time. In alarm_timer_remaining() the arguments of
ktime_sub are swapped.
Fixes:
d653d8457c76 ("alarmtimer: Implement remaining callback")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190408041542.26338-1-avagin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sven Schnelle [Thu, 4 Apr 2019 16:16:04 +0000 (18:16 +0200)]
parisc: also set iaoq_b in instruction_pointer_set()
commit
f324fa58327791b2696628b31480e7e21c745706 upstream.
When setting the instruction pointer on PA-RISC we also need
to set the back of the instruction queue to the new offset, otherwise
we will execute on instruction from the new location, and jumping
back to the old location stored in iaoq_b.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes:
75ebedf1d263 ("parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sven Schnelle [Thu, 4 Apr 2019 16:16:03 +0000 (18:16 +0200)]
parisc: regs_return_value() should return gpr28
commit
45efd871bf0a47648f119d1b41467f70484de5bc upstream.
While working on kretprobes for PA-RISC I was wondering while the
kprobes sanity test always fails on kretprobes. This is caused by
returning gpr20 instead of gpr28.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 2 Apr 2019 10:13:27 +0000 (12:13 +0200)]
parisc: Detect QEMU earlier in boot process
commit
d006e95b5561f708d0385e9677ffe2c46f2ae345 upstream.
While adding LASI support to QEMU, I noticed that the QEMU detection in
the kernel happens much too late. For example, when a LASI chip is found
by the kernel, it registers the LASI LED driver as well. But when we
run on QEMU it makes sense to avoid spending unnecessary CPU cycles, so
we need to access the running_on_QEMU flag earlier than before.
This patch now makes the QEMU detection the fist task of the Linux
kernel by moving it to where the kernel enters the C-coding.
Fixes:
310d82784fb4 ("parisc: qemu idle sleep support")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Geis [Wed, 13 Mar 2019 19:02:30 +0000 (19:02 +0000)]
arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
commit
09f91381fa5de1d44bc323d8bf345f5d57b3d9b5 upstream.
Various rk3328 based boards experience occasional sdmmc0 write errors.
This is due to the rk3328.dtsi tx drive levels being set to 4ma, vs
8ma per the rk3328 datasheet default settings.
Fix this by setting the tx signal pins to 8ma.
Inspiration from tonymac32's patch,
https://github.com/ayufan-rock64/linux-kernel/commit/
dc1212b347e0da17c5460bcc0a56b07d02bac3f8
Fixes issues on the rk3328-roc-cc and the rk3328-rock64 (as per the
above commit message).
Tested on the rk3328-roc-cc board.
Fixes:
52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aneesh Kumar K.V [Sat, 6 Apr 2019 01:39:10 +0000 (18:39 -0700)]
mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()
commit
c6f3c5ee40c10bb65725047a220570f718507001 upstream.
With some architectures like ppc64, set_pmd_at() cannot cope with a
situation where there is already some (different) valid entry present.
Use pmdp_set_access_flags() instead to modify the pfn which is built to
deal with modifying existing PMD entries.
This is similar to commit
cae85cb8add3 ("mm/memory.c: fix modifying of
page protection by insert_pfn()")
We also do similar update w.r.t insert_pfn_pud eventhough ppc64 don't
support pud pfn entries now.
Without this patch we also see the below message in kernel log "BUG:
non-zero pgtables_bytes on freeing mm:"
Link: http://lkml.kernel.org/r/20190402115125.18803-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Chandan Rajendra <chandan@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hui Wang [Mon, 8 Apr 2019 07:58:11 +0000 (15:58 +0800)]
ALSA: hda - Add two more machines to the power_save_blacklist
commit
cae30527901d9590db0e12ace994c1d58bea87fd upstream.
Recently we set CONFIG_SND_HDA_POWER_SAVE_DEFAULT to 1 when
configuring the kernel, then two machines were reported to have noise
after installing the new kernel. Put them in the blacklist, the
noise disappears.
https://bugs.launchpad.net/bugs/1821663
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Sailer [Tue, 2 Apr 2019 13:52:04 +0000 (15:52 +0200)]
ALSA: hda/realtek - Add quirk for Tuxedo XC 1509
commit
80690a276f444a68a332136d98bfea1c338bc263 upstream.
This adds a SND_PCI_QUIRK(...) line for the Tuxedo XC 1509.
The Tuxedo XC 1509 and the System76 oryp5 are the same barebone
notebooks manufactured by Clevo. To name the fixups both use after the
actual underlying hardware, this patch also changes System76_orpy5
to clevo_pb51ed in 2 enum symbols and one function name,
matching the other pci_quirk entries which are also named after the
device ODM.
Fixes:
7f665b1c3283 ("ALSA: hda/realtek - Headset microphone and internal speaker support for System76 oryp5")
Signed-off-by: Richard Sailer <rs@tuxedocomputers.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jian-Hong Pan [Mon, 1 Apr 2019 03:25:05 +0000 (11:25 +0800)]
ALSA: hda/realtek: Enable headset MIC of Acer TravelMate B114-21 with ALC233
commit
ea5c7eba216e832906e594799b8670f1954a588c upstream.
The Acer TravelMate B114-21 laptop cannot detect and record sound from
headset MIC. This patch adds the ALC233_FIXUP_ACER_HEADSET_MIC HDA verb
quirk chained with ALC233_FIXUP_ASUS_MIC_NO_PRESENCE pin quirk to fix
this issue.
[ fixed the missing brace and reordered the entry -- tiwai ]
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zubin Mithra [Thu, 4 Apr 2019 21:33:55 +0000 (14:33 -0700)]
ALSA: seq: Fix OOB-reads from strlcpy
commit
212ac181c158c09038c474ba68068be49caecebb upstream.
When ioctl calls are made with non-null-terminated userspace strings,
strlcpy causes an OOB-read from within strlen. Fix by changing to use
strscpy instead.
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Erik Schmauss [Mon, 8 Apr 2019 20:42:26 +0000 (13:42 -0700)]
ACPICA: Namespace: remove address node from global list after method termination
commit
c5781ffbbd4f742a58263458145fe7f0ac01d9e0 upstream.
ACPICA commit
b233720031a480abd438f2e9c643080929d144c3
ASL operation_regions declare a range of addresses that it uses. In a
perfect world, the range of addresses should be used exclusively by
the AML interpreter. The OS can use this information to decide which
drivers to load so that the AML interpreter and device drivers use
different regions of memory.
During table load, the address information is added to a global
address range list. Each node in this list contains an address range
as well as a namespace node of the operation_region. This list is
deleted at ACPI shutdown.
Unfortunately, ASL operation_regions can be declared inside of control
methods. Although this is not recommended, modern firmware contains
such code. New module level code changes unintentionally removed the
functionality of adding and removing nodes to the global address
range list.
A few months ago, support for adding addresses has been re-
implemented. However, the removal of the address range list was
missed and resulted in some systems to crash due to the address list
containing bogus namespace nodes from operation_regions declared in
control methods. In order to fix the crash, this change removes
dynamic operation_regions after control method termination.
Link: https://github.com/acpica/acpica/commit/b2337200
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202475
Fixes:
4abb951b73ff ("ACPICA: AML interpreter: add region addresses in global list during initialization")
Reported-by: Michael J Gruber <mjg@fedoraproject.org>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Furquan Shaikh [Wed, 20 Mar 2019 22:28:44 +0000 (15:28 -0700)]
ACPICA: Clear status of GPEs before enabling them
commit
c8b1917c8987a6fa3695d479b4d60fbbbc3e537b upstream.
Commit
18996f2db918 ("ACPICA: Events: Stop unconditionally clearing
ACPI IRQs during suspend/resume") was added to stop clearing event
status bits unconditionally in the system-wide suspend and resume
paths. This was done because of an issue with a laptop lid appaering
to be closed even when it was used to wake up the system from suspend
(see https://bugzilla.kernel.org/show_bug.cgi?id=196249), which
happened because event status bits were cleared unconditionally on
system resume. Though this change fixed the issue in the resume path,
it introduced regressions in a few suspend paths.
First regression was reported and fixed in the S5 entry path by commit
fa85015c0d95 ("ACPICA: Clear status of all events when entering S5").
Next regression was reported and fixed for all legacy sleep paths by
commit
f317c7dc12b7 ("ACPICA: Clear status of all events when entering
sleep states"). However, there still is a suspend-to-idle regression,
since suspend-to-idle does not follow the legacy sleep paths.
In the suspend-to-idle case, wakeup is enabled as part of device
suspend. If the status bits of wakeup GPEs are set when they are
enabled, it causes a premature system wakeup to occur.
To address that problem, partially revert commit
18996f2db918 to
restore GPE status bits clearing before the GPE is enabled in
acpi_ev_enable_gpe().
Fixes:
18996f2db918 ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume")
Signed-off-by: Furquan Shaikh <furquan@google.com>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Axel Lin [Mon, 11 Mar 2019 09:57:30 +0000 (17:57 +0800)]
hwmon: (w83773g) Select REGMAP_I2C to fix build error
commit
a165dcc923ada2ffdee1d4f41f12f81b66d04c55 upstream.
Select REGMAP_I2C to avoid below build error:
ERROR: "__devm_regmap_init_i2c" [drivers/hwmon/w83773g.ko] undefined!
Fixes:
ee249f271524 ("hwmon: Add W83773G driver")
Cc: stable@vger.kernel.org
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 21 Jan 2019 16:26:42 +0000 (17:26 +0100)]
tty: ldisc: add sysctl to prevent autoloading of ldiscs
commit
7c0cca7c847e6e019d67b7d793efbbe3b947d004 upstream.
By default, the kernel will automatically load the module of any line
dicipline that is asked for. As this sometimes isn't the safest thing
to do, provide a sysctl to disable this feature.
By default, we set this to 'y' as that is the historical way that Linux
has worked, and we do not want to break working systems. But in the
future, perhaps this can default to 'n' to prevent this functionality.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 5 Apr 2019 13:39:26 +0000 (15:39 +0200)]
tty: mark Siemens R3964 line discipline as BROKEN
commit
c7084edc3f6d67750f50d4183134c4fb5712a5c8 upstream.
The n_r3964 line discipline driver was written in a different time, when
SMP machines were rare, and users were trusted to do the right thing.
Since then, the world has moved on but not this code, it has stayed
rooted in the past with its lovely hand-crafted list structures and
loads of "interesting" race conditions all over the place.
After attempting to clean up most of the issues, I just gave up and am
now marking the driver as BROKEN so that hopefully someone who has this
hardware will show up out of the woodwork (I know you are out there!)
and will help with debugging a raft of changes that I had laying around
for the code, but was too afraid to commit as odds are they would break
things.
Many thanks to Jann and Linus for pointing out the initial problems in
this codebase, as well as many reviews of my attempts to fix the issues.
It was a case of whack-a-mole, and as you can see, the mole won.
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yueyi Li [Mon, 24 Dec 2018 07:40:07 +0000 (07:40 +0000)]
arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
[ Upstream commit
c8a43c18a97845e7f94ed7d181c11f41964976a2 ]
When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the top 4K of kernel
virtual address space may be mapped to physical addresses despite being
reserved for ERR_PTR values.
Fix the randomization of the linear region so that we avoid mapping the
last page of the virtual address space.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: liyueyi <liyueyi@live.com>
[will: rewrote commit message; merged in suggestion from Ard]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
Florian Westphal [Fri, 12 Apr 2019 17:55:03 +0000 (10:55 -0700)]
netfilter: nfnetlink_cttimeout: fetch timeouts for udplite and gre, too
commit
89259088c1b7fecb43e8e245dc931909132a4e03 upstream
syzbot was able to trigger the WARN in cttimeout_default_get() by
passing UDPLITE as l4protocol. Alias UDPLITE to UDP, both use
same timeout values.
Furthermore, also fetch GRE timeouts. GRE is a bit more complicated,
as it still can be a module and its netns_proto_gre struct layout isn't
visible outside of the gre module. Can't move timeouts around, it
appears conntrack sysctl unregister assumes net_generic() returns
nf_proto_net, so we get crash. Expose layout of netns_proto_gre instead.
A followup nf-next patch could make gre tracker be built-in as well
if needed, its not that large.
Last, make the WARN() mention the missing protocol value in case
anything else is missing.
Reported-by: syzbot+2fae8fa157dd92618cae@syzkaller.appspotmail.com
Fixes:
8866df9264a3 ("netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
Pablo Neira Ayuso [Fri, 12 Apr 2019 17:55:02 +0000 (10:55 -0700)]
netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr
commit
8866df9264a34e675b4ee8a151db819b87cce2d3 upstream
Otherwise, we hit a NULL pointer deference since handlers always assume
default timeout policy is passed.
netlink: 24 bytes leftover after parsing attributes in process `syz-executor2'.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9575 Comm: syz-executor1 Not tainted 4.19.0+ #312
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:icmp_timeout_obj_to_nlattr+0x77/0x170 net/netfilter/nf_conntrack_proto_icmp.c:297
Fixes:
c779e849608a ("netfilter: conntrack: remove get_timeout() indirection")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
Neil Armstrong [Thu, 11 Apr 2019 10:11:22 +0000 (12:11 +0200)]
Revert "clk: meson: clean-up clock registration"
This reverts commit
c8e4f8406842332fb55cd792016e5dac266f6354.
This patch was not initially a fix and is dependent on other
changes which are not fixes eithers.
With this change, multiple Amlogic based boards fails to boot,
as reported by kernelci.
Cc: stable@vger.kernel.org # 4.19.34
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nick Desaulniers [Sat, 6 Apr 2019 01:38:45 +0000 (18:38 -0700)]
lib/string.c: implement a basic bcmp
[ Upstream commit
5f074f3e192f10c9fade898b9b3b8812e3d83342 ]
A recent optimization in Clang (r355672) lowers comparisons of the
return value of memcmp against zero to comparisons of the return value
of bcmp against zero. This helps some platforms that implement bcmp
more efficiently than memcmp. glibc simply aliases bcmp to memcmp, but
an optimized implementation is in the works.
This results in linkage failures for all targets with Clang due to the
undefined symbol. For now, just implement bcmp as a tailcail to memcmp
to unbreak the build. This routine can be further optimized in the
future.
Other ideas discussed:
* A weak alias was discussed, but breaks for architectures that define
their own implementations of memcmp since aliases to declarations are
not permitted (only definitions). Arch-specific memcmp
implementations typically declare memcmp in C headers, but implement
them in assembly.
* -ffreestanding also is used sporadically throughout the kernel.
* -fno-builtin-bcmp doesn't work when doing LTO.
Link: https://bugs.llvm.org/show_bug.cgi?id=41035
Link: https://code.woboq.org/userspace/glibc/string/memcmp.c.html#bcmp
Link: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13
Link: https://github.com/ClangBuiltLinux/linux/issues/416
Link: http://lkml.kernel.org/r/20190313211335.165605-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: James Y Knight <jyknight@google.com>
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nick Desaulniers [Thu, 6 Dec 2018 19:12:31 +0000 (11:12 -0800)]
x86/vdso: Drop implicit common-page-size linker flag
commit
ac3e233d29f7f77f28243af0132057d378d3ea58 upstream.
GNU linker's -z common-page-size's default value is based on the target
architecture. arch/x86/entry/vdso/Makefile sets it to the architecture
default, which is implicit and redundant. Drop it.
Fixes:
2aae950b21e4 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu")
Reported-by: Dmitry Golovin <dima@golovin.in>
Reported-by: Bill Wendling <morbo@google.com>
Suggested-by: Dmitry Golovin <dima@golovin.in>
Suggested-by: Rui Ueyama <ruiu@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Fangrui Song <maskray@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=38774
Link: https://github.com/ClangBuiltLinux/linux/issues/31
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nick Desaulniers [Mon, 11 Feb 2019 19:30:04 +0000 (11:30 -0800)]
kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
commit
ad15006cc78459d059af56729c4d9bed7c7fd860 upstream.
This causes an issue when trying to build with `make LD=ld.lld` if
ld.lld and the rest of your cross tools aren't in the same directory
(ex. /usr/local/bin) (as is the case for Android's build system), as the
GCC_TOOLCHAIN_DIR then gets set based on `which $(LD)` which will point
where LLVM tools are, not GCC/binutils tools are located.
Instead, select the GCC_TOOLCHAIN_DIR based on another tool provided by
binutils for which LLVM does not provide a substitute for, such as
elfedit.
Fixes:
785f11aa595b ("kbuild: Add better clang cross build support")
Link: https://github.com/ClangBuiltLinux/linux/issues/341
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masahiro Yamada [Mon, 5 Nov 2018 07:52:34 +0000 (16:52 +0900)]
kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
[ Upstream commit
02826a6ba301b72461c3706e1cc66d5571cd327e ]
Ard Biesheuvel reports bindeb-pkg with O= option is broken in the
following way:
...
LD [M] sound/soc/rockchip/snd-soc-rk3399-gru-sound.ko
LD [M] sound/soc/rockchip/snd-soc-rockchip-pcm.ko
LD [M] sound/soc/rockchip/snd-soc-rockchip-rt5645.ko
LD [M] sound/soc/rockchip/snd-soc-rockchip-spdif.ko
LD [M] sound/soc/sh/rcar/snd-soc-rcar.ko
fakeroot -u debian/rules binary
make KERNELRELEASE=4.19.0-12677-g19beffaf7a99-dirty ARCH=arm64 KBUILD_SRC= intdeb-pkg
/bin/bash /home/ard/linux/scripts/package/builddeb
Makefile:600: include/config/auto.conf: No such file or directory
***
*** Configuration file ".config" not found!
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
make[12]: *** [syncconfig] Error 1
make[11]: *** [syncconfig] Error 2
make[10]: *** [include/config/auto.conf] Error 2
make[9]: *** [__sub-make] Error 2
...
Prior to commit
80463f1b7bf9 ("kbuild: add --include-dir flag only
for out-of-tree build"), both srctree and objtree were added to
--include-dir redundantly, and the wrong code '$MAKE image_name'
was working by relying on that. Now, the potential issue that had
previously been hidden just showed up.
'$MAKE image_name' recurses to the generated $(objtree)/Makefile and
ends up with running in srctree, which is incorrect. It should be
invoked with '-f $srctree/Makefile' (or KBUILD_SRC=) to be executed
in objtree.
Fixes:
80463f1b7bf9 ("kbuild: add --include-dir flag only for out-of-tree build")
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Huy Nguyen [Thu, 7 Mar 2019 20:07:32 +0000 (14:07 -0600)]
net/mlx5e: Update xon formula
[ Upstream commit
e28408e98bced123038857b6e3c81fa12a2e3e68 ]
Set xon = xoff - netdev's max_mtu.
netdev's max_mtu will give enough time for the pause frame to
arrive at the sender.
Fixes:
0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>