Skip to content

fix bug when removing admc_adc#3

Closed
njpillitteri wants to merge 1 commit intoanalogdevicesinc:xcomm_zynqfrom
njpillitteri:xcomm_zynq
Closed

fix bug when removing admc_adc#3
njpillitteri wants to merge 1 commit intoanalogdevicesinc:xcomm_zynqfrom
njpillitteri:xcomm_zynq

Conversation

@njpillitteri
Copy link
Copy Markdown
Contributor

No description provided.

lclausen-adi pushed a commit that referenced this pull request May 4, 2015
H_RST bit in H_CSR register may be found lit before reset is started,
for example if preceding reset flow hasn't completed.
In that case asserting H_RST will be ignored, therefore we need to clean
H_RST bit to start a successful reset sequence.

Cc: <stable@vger.kernel.org> #3.10+
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
lclausen-adi pushed a commit that referenced this pull request May 4, 2015
This patch is to fix two deadlock cases.
Deadlock 1:
CPU #1
 pinctrl_register-> pinctrl_get ->
 create_pinctrl
 (Holding lock pinctrl_maps_mutex)
 -> get_pinctrl_dev_from_devname
 (Trying to acquire lock pinctrldev_list_mutex)
CPU #0
 pinctrl_unregister
 (Holding lock pinctrldev_list_mutex)
 -> pinctrl_put ->> pinctrl_free ->
 pinctrl_dt_free_maps -> pinctrl_unregister_map
 (Trying to acquire lock pinctrl_maps_mutex)

Simply to say
CPU#1 is holding lock A and trying to acquire lock B,
CPU#0 is holding lock B and trying to acquire lock A.

Deadlock 2:
CPU #3
 pinctrl_register-> pinctrl_get ->
 create_pinctrl
 (Holding lock pinctrl_maps_mutex)
 -> get_pinctrl_dev_from_devname
 (Trying to acquire lock pinctrldev_list_mutex)
CPU #2
 pinctrl_unregister
 (Holding lock pctldev->mutex)
 -> pinctrl_put ->> pinctrl_free ->
 pinctrl_dt_free_maps -> pinctrl_unregister_map
 (Trying to acquire lock pinctrl_maps_mutex)
CPU #0
 tegra_gpio_request
 (Holding lock pinctrldev_list_mutex)
 -> pinctrl_get_device_gpio_range
 (Trying to acquire lock pctldev->mutex)

Simply to say
CPU#3 is holding lock A and trying to acquire lock D,
CPU#2 is holding lock B and trying to acquire lock A,
CPU#0 is holding lock D and trying to acquire lock B.

Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
lclausen-adi pushed a commit that referenced this pull request May 4, 2015
… into fixes

Merge " mvebu fixes for 3.19-rc (part #3)" from Andrew Lunn:

mvebu: completely disable hardware I/O coherency

* tag 'mvebu-fixes-3.19-3' of git://git.infradead.org/linux-mvebu:
  ARM: mvebu: completely disable hardware I/O coherency

Signed-off-by: Olof Johansson <olof@lixom.net>
lclausen-adi pushed a commit that referenced this pull request May 4, 2015
It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().

This problem is hard to reproduce making this patch hard to verify, but this
fix will prevent the race.

I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.

On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:

crash> struct ata_port.hsm_task_state ffff881c1121c000
  hsm_task_state = 0

Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.

PID: 11053  TASK: ffff8816e846cae0  CPU: 0   COMMAND: "sshd"
 #0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b
 #1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92
 #2 [ffff88008ba03a90] oops_end at ffffffff8152b510
 #3 [ffff88008ba03ac0] die at ffffffff81010e0b
 #4 [ffff88008ba03af0] do_trap at ffffffff8152ad74
 #5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95
 #6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b
    [exception RIP: ata_sff_hsm_move+317]
    RIP: ffffffff813a77ad  RSP: ffff88008ba03ca0  RFLAGS: 00010097
    RAX: 0000000000000000  RBX: ffff881c1121dc60  RCX: 0000000000000000
    RDX: ffff881c1121dd10  RSI: ffff881c1121dc60  RDI: ffff881c1121c000
    RBP: ffff88008ba03d00   R8: 0000000000000000   R9: 000000000000002e
    R10: 000000000001003f  R11: 000000000000009b  R12: ffff881c1121c000
    R13: 0000000000000000  R14: 0000000000000050  R15: ffff881c1121dd78
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd
 #8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e
 #9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0
--- <IRQ stack> ---
    [exception RIP: pipe_poll+48]
    RIP: ffffffff81192780  RSP: ffff880f26d459b8  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffff880f26d459c8  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: 0000000000000000  RDI: ffff881a0539fa80
    RBP: ffffffff8100bb8e   R8: ffff8803b23324a0   R9: 0000000000000000
    R10: ffff880f26d45dd0  R11: 0000000000000008  R12: ffffffff8109b646
    R13: ffff880f26d45948  R14: 0000000000000246  R15: 0000000000000246
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
    RIP: 00007f26017435c3  RSP: 00007fffe020c420  RFLAGS: 00000206
    RAX: 0000000000000017  RBX: ffffffff8100b072  RCX: 00007fffe020c45c
    RDX: 00007f2604a3f120  RSI: 00007f2604a3f140  RDI: 000000000000000d
    RBP: 0000000000000000   R8: 00007fffe020e570   R9: 0101010101010101
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007fffe020e5f0
    R13: 00007fffe020e5f4  R14: 00007f26045f373c  R15: 00007fffe020e5e0
    ORIG_RAX: 0000000000000017  CS: 0033  SS: 002b

Somewhere between the ata_sff_hsm_move() check and the ata_sff_host_intr() check, the value changed.
On examining the other cpus to see what else was running, another cpu was running the error handler
routines:

PID: 326    TASK: ffff881c11014aa0  CPU: 1   COMMAND: "scsi_eh_1"
 #0 [ffff88008ba27e90] crash_nmi_callback at ffffffff8102fee6
 #1 [ffff88008ba27ea0] notifier_call_chain at ffffffff8152d515
 #2 [ffff88008ba27ee0] atomic_notifier_call_chain at ffffffff8152d57a
 #3 [ffff88008ba27ef0] notify_die at ffffffff810a154e
 #4 [ffff88008ba27f20] do_nmi at ffffffff8152b1db
 #5 [ffff88008ba27f50] nmi at ffffffff8152aaa0
    [exception RIP: _spin_lock_irqsave+47]
    RIP: ffffffff8152a1ff  RSP: ffff881c11a73aa0  RFLAGS: 00000006
    RAX: 0000000000000001  RBX: ffff881c1121deb8  RCX: 0000000000000000
    RDX: 0000000000000246  RSI: 0000000000000020  RDI: ffff881c122612d8
    RBP: ffff881c11a73aa0   R8: ffff881c17083800   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff881c1121c000
    R13: 000000000000001f  R14: ffff881c1121dd50  R15: ffff881c1121dc60
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
 #6 [ffff881c11a73aa0] _spin_lock_irqsave at ffffffff8152a1ff
 #7 [ffff881c11a73aa8] ata_exec_internal_sg at ffffffff81396fb5
 #8 [ffff881c11a73b58] ata_exec_internal at ffffffff81397109
 #9 [ffff881c11a73bd8] atapi_eh_request_sense at ffffffff813a34eb

Before it tried to acquire a spinlock, ata_exec_internal_sg() called ata_sff_flush_pio_task().
This function will set ap->hsm_task_state to HSM_ST_IDLE, and has no locking around setting this
value. ata_sff_flush_pio_task() can then race with the interrupt handler and potentially set
HSM_ST_IDLE at a fatal moment, which will trigger a kernel BUG.

v2: Fixup comment in ata_sff_flush_pio_task()

tj: Further updated comment.  Use ap->lock instead of shost lock and
    use the [un]lock_irq variant instead of the irqsave/restore one.

Signed-off-by: David Milburn <dmilburn@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
lclausen-adi pushed a commit that referenced this pull request May 4, 2015
Ben Hutchings says:

====================
Fixes for sh_eth #3

I'm continuing review and testing of Ethernet support on the R-Car H2
chip.  This series fixes the last of the more serious issues I've found.

These are not tested on any of the other supported chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
@mhennerich mhennerich self-assigned this May 13, 2015
@mhennerich mhennerich added the bug label May 13, 2015
@mhennerich
Copy link
Copy Markdown
Contributor

For some reason this commit closed this pull request.
2257760

And github doesn't allow me to open it again.
Can you send your pull request again?

-Michael

lclausen-adi pushed a commit that referenced this pull request Jun 17, 2015
The UTMI clock must be selected by any high-speed USB IP. The logic behind it
needs this particular clock.
So, correct the clock in the device tree files affected.

Reported-by: Bo Shen <voice.shen@atmel.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: <stable@vger.kernel.org> #3.18
lclausen-adi pushed a commit that referenced this pull request Jul 1, 2015
The kbuild test robot noticed some problems with the newhaven
lcd driver.

1. lcd_of_match is not NULL terminated at line 606

2. Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element

3. Random config test warnings:
   drivers/built-in.o: In function `brightness_store':
>> newhaven_lcd.c:(.text+0xd2225): undefined reference to
  `i2c_master_send'
   drivers/built-in.o: In function `lcd_init':
>> newhaven_lcd.c:(.init.text+0x8a9e): undefined reference to
   `i2c_register_driver'
   drivers/built-in.o: In function `lcd_exit':
>> newhaven_lcd.c:(.exit.text+0x439): undefined reference to
   `i2c_del_driver'
   drivers/built-in.o: In function `lcd_remove':
>> newhaven_lcd.c:(.exit.text+0x462): undefined reference to
   `i2c_master_send'

This patch squashes the two patches they kindly sent us plus adds
a Kconfig fix for the warnings in #3 above.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alan Tull <atull@opensource.altera.com>
andreamerello pushed a commit to andreamerello/linux-analogdevices that referenced this pull request Nov 19, 2015
commit 3f1f9b8 upstream.

This fixes the following lockdep complaint:

[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ analogdevicesinc#7 Tainted: G           O
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
 (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0

but task is already holding lock:
 (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

which lock already depends on the new lock.

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_es_lock);
                               lock(&(&sbi->s_es_lru_lock)->rlock);
                               lock(&ei->i_es_lock);
  lock(&(&sbi->s_es_lru_lock)->rlock);

 *** DEADLOCK ***

6 locks held by kworker/u24:0/4356:
 #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 analogdevicesinc#1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 analogdevicesinc#2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
 analogdevicesinc#3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
 analogdevicesinc#4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
 analogdevicesinc#5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ analogdevicesinc#7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
 ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
 [<ffffffff815df0bb>] dump_stack+0x4e/0x68
 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
 [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff810a8707>] lock_acquire+0x87/0x120
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
	...

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Minchan Kim <minchan@kernel.org>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
lclausen-adi pushed a commit that referenced this pull request Feb 15, 2016
When a43eec3 ("bpf: introduce bpf_perf_event_output() helper") added
PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw
array that wasn't initialized, thus set to NULL, fix print_symbol_events()
to check for that case so that we don't crash if this happens again.

  (gdb) bt
  #0  __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198
  #1  strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252
  #2  0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall")
      at util/parse-events.c:1615
  #3  print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675
  #4  0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68
  #5  0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370
  #6  0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429
  #7  run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473
  #8  main (argc=2, argv=0x7fffffffe390) at perf.c:588
  (gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT]
  $4 = {symbol = 0x0, alias = 0x0}
  (gdb)

A patch to robustify perf to not segfault when the next counter gets added in
the kernel will follow this one.

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mhennerich pushed a commit that referenced this pull request Apr 5, 2016
Fixes segmentation fault using, for instance:

  (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
  Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".

 Program received signal SIGSEGV, Segmentation fault.
  0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  (gdb) bt
  #0  0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
  #1  0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:433
  #2  0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:498
  #3  0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
      at util/parse-events.c:936
  #4  0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
  #5  0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
  #6  0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
  #7  0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
  #8  0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
  #9  0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
  #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
  #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
  #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
  #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
  #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
  #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
(gdb)

Intel PT attempts to find the sched:sched_switch tracepoint but that seg
faults if tracefs is not readable, because the error reporting structure
is null, as errors are not reported when automatically adding
tracepoints.  Fix by checking before using.

Committer note:

This doesn't take place in a kernel that supports
perf_event_attr.context_switch, that is the default way that will be
used for tracking context switches, only in older kernels, like 4.2, in
a machine with Intel PT (e.g. Broadwell) for non-priviledged users.

Further info from a similar patch by Wang:

The error is in tracepoint_error: it assumes the 'e' parameter is valid.

However, there are many situation a parse_event() can be called without
parse_events_error. See result of

  $ grep 'parse_events(.*NULL)' ./tools/perf/ -r'

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tong Zhang <ztong@vt.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 1965817 ("perf tools: Enhance parsing events tracepoint error output")
Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mhennerich pushed a commit that referenced this pull request Apr 5, 2016
In the regular MIPS instruction set RDHWR is encoded with the SPECIAL3
(011111) major opcode.  Therefore it cannot trigger the CpU (Coprocessor
Unusable) exception, and certainly not for coprocessor 0, as the opcode
does not overlap with any of the older ISA reservations, i.e. LWC0
(110000), SWC0 (111000), LDC0 (110100) or SDC0 (111100).  The closest
match might be SDC3 (111111), possibly causing a CpU #3 exception,
however our code does not handle it anyway.  A quick check with a MIPS I
and a MIPS III processor:

CPU0 revision is: 00000220 (R3000)
CPU0 revision is: 0000044 (R4400SC)

indeed indicates that the RI (Reserved Instruction) exception is
triggered.  It's only LL and SC that require emulation in the CpU #0
exception handler as they reuse the LWC0 and SWC0 opcodes respectively.

In the microMIPS instruction set RDHWR is mandatory and triggering the
RI exception is required on unimplemented or disabled register accesses.
Therefore emulating the microMIPS instruction in the CpU #0 exception
handler is not required either.

Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12280/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
mhennerich pushed a commit that referenced this pull request Apr 5, 2016
Ilya reported following lockdep splat:

kernel: =========================
kernel: [ BUG: held lock freed! ]
kernel: 4.5.0-rc1-ceph-00026-g5e0a311 #1 Not tainted
kernel: -------------------------
kernel: swapper/5/0 is freeing memory
ffff880035c9d200-ffff880035c9dbff, with a lock still held there!
kernel: (&(&queue->rskq_lock)->rlock){+.-...}, at:
[<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0
kernel: 4 locks held by swapper/5/0:
kernel: #0:  (rcu_read_lock){......}, at: [<ffffffff8169ef6b>]
netif_receive_skb_internal+0x4b/0x1f0
kernel: #1:  (rcu_read_lock){......}, at: [<ffffffff816e977f>]
ip_local_deliver_finish+0x3f/0x380
kernel: #2:  (slock-AF_INET){+.-...}, at: [<ffffffff81685ffb>]
sk_clone_lock+0x19b/0x440
kernel: #3:  (&(&queue->rskq_lock)->rlock){+.-...}, at:
[<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0

To properly fix this issue, inet_csk_reqsk_queue_add() needs
to return to its callers if the child as been queued
into accept queue.

We also need to make sure listener is still there before
calling sk->sk_data_ready(), by holding a reference on it,
since the reference carried by the child can disappear as
soon as the child is put on accept queue.

Reported-by: Ilya Dryomov <idryomov@gmail.com>
Fixes: ebb516a ("tcp/dccp: fix race at listener dismantle phase")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dbogdan pushed a commit that referenced this pull request May 11, 2016
The kbuild test robot noticed some problems with the newhaven
lcd driver.

1. lcd_of_match is not NULL terminated at line 606

2. Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element

3. Random config test warnings:
   drivers/built-in.o: In function `brightness_store':
>> newhaven_lcd.c:(.text+0xd2225): undefined reference to
  `i2c_master_send'
   drivers/built-in.o: In function `lcd_init':
>> newhaven_lcd.c:(.init.text+0x8a9e): undefined reference to
   `i2c_register_driver'
   drivers/built-in.o: In function `lcd_exit':
>> newhaven_lcd.c:(.exit.text+0x439): undefined reference to
   `i2c_del_driver'
   drivers/built-in.o: In function `lcd_remove':
>> newhaven_lcd.c:(.exit.text+0x462): undefined reference to
   `i2c_master_send'

This patch squashes the two patches they kindly sent us plus adds
a Kconfig fix for the warnings in #3 above.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alan Tull <atull@opensource.altera.com>
lclausen-adi pushed a commit that referenced this pull request Jul 11, 2016
We must not attempt to send WMI packets while holding the data-lock,
as it may deadlock:

BUG: sleeping function called from invalid context at drivers/net/wireless/ath/ath10k/wmi.c:1824
in_atomic(): 1, irqs_disabled(): 0, pid: 2878, name: wpa_supplicant

=============================================
[ INFO: possible recursive locking detected ]
4.4.6+ #21 Tainted: G        W  O
---------------------------------------------
wpa_supplicant/2878 is trying to acquire lock:
 (&(&ar->data_lock)->rlock){+.-...}, at: [<ffffffffa0721511>] ath10k_wmi_tx_beacons_iter+0x26/0x11a [ath10k_core]

but task is already holding lock:
 (&(&ar->data_lock)->rlock){+.-...}, at: [<ffffffffa070251b>] ath10k_peer_create+0x122/0x1ae [ath10k_core]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&ar->data_lock)->rlock);
  lock(&(&ar->data_lock)->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by wpa_supplicant/2878:
 #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff816493ca>] rtnl_lock+0x12/0x14
 #1:  (&ar->conf_mutex){+.+.+.}, at: [<ffffffffa0706932>] ath10k_add_interface+0x3b/0xbda [ath10k_core]
 #2:  (&(&ar->data_lock)->rlock){+.-...}, at: [<ffffffffa070251b>] ath10k_peer_create+0x122/0x1ae [ath10k_core]
 #3:  (rcu_read_lock){......}, at: [<ffffffffa062f304>] rcu_read_lock+0x0/0x66 [mac80211]

stack backtrace:
CPU: 3 PID: 2878 Comm: wpa_supplicant Tainted: G        W  O    4.4.6+ #21
Hardware name: To be filled by O.E.M. To be filled by O.E.M./ChiefRiver, BIOS 4.6.5 06/07/2013
 0000000000000000 ffff8801fcadf8f0 ffffffff8137086d ffffffff82681720
 ffffffff82681720 ffff8801fcadf9b0 ffffffff8112e3be ffff8801fcadf920
 0000000100000000 ffffffff82681720 ffffffffa0721500 ffff8801fcb8d348
Call Trace:
 [<ffffffff8137086d>] dump_stack+0x81/0xb6
 [<ffffffff8112e3be>] __lock_acquire+0xc5b/0xde7
 [<ffffffffa0721500>] ? ath10k_wmi_tx_beacons_iter+0x15/0x11a [ath10k_core]
 [<ffffffff8112d0d0>] ? mark_lock+0x24/0x201
 [<ffffffff8112e908>] lock_acquire+0x132/0x1cb
 [<ffffffff8112e908>] ? lock_acquire+0x132/0x1cb
 [<ffffffffa0721511>] ? ath10k_wmi_tx_beacons_iter+0x26/0x11a [ath10k_core]
 [<ffffffffa07214eb>] ? ath10k_wmi_cmd_send_nowait+0x1ce/0x1ce [ath10k_core]
 [<ffffffff816f9e2b>] _raw_spin_lock_bh+0x31/0x40
 [<ffffffffa0721511>] ? ath10k_wmi_tx_beacons_iter+0x26/0x11a [ath10k_core]
 [<ffffffffa0721511>] ath10k_wmi_tx_beacons_iter+0x26/0x11a [ath10k_core]
 [<ffffffffa07214eb>] ? ath10k_wmi_cmd_send_nowait+0x1ce/0x1ce [ath10k_core]
 [<ffffffffa062eb18>] __iterate_interfaces+0x9d/0x13d [mac80211]
 [<ffffffffa062f609>] ieee80211_iterate_active_interfaces_atomic+0x32/0x3e [mac80211]
 [<ffffffffa07214eb>] ? ath10k_wmi_cmd_send_nowait+0x1ce/0x1ce [ath10k_core]
 [<ffffffffa071fa9f>] ath10k_wmi_tx_beacons_nowait.isra.13+0x14/0x16 [ath10k_core]
 [<ffffffffa0721676>] ath10k_wmi_cmd_send+0x71/0x242 [ath10k_core]
 [<ffffffffa07023f6>] ath10k_wmi_peer_delete+0x3f/0x42 [ath10k_core]
 [<ffffffffa0702557>] ath10k_peer_create+0x15e/0x1ae [ath10k_core]
 [<ffffffffa0707004>] ath10k_add_interface+0x70d/0xbda [ath10k_core]
 [<ffffffffa05fffcc>] drv_add_interface+0x123/0x1a5 [mac80211]
 [<ffffffffa061554b>] ieee80211_do_open+0x351/0x667 [mac80211]
 [<ffffffffa06158aa>] ieee80211_open+0x49/0x4c [mac80211]
 [<ffffffff8163ecf9>] __dev_open+0x88/0xde
 [<ffffffff8163ef6e>] __dev_change_flags+0xa4/0x13a
 [<ffffffff8163f023>] dev_change_flags+0x1f/0x54
 [<ffffffff816a5532>] devinet_ioctl+0x2b9/0x5c9
 [<ffffffff816514dd>] ? copy_to_user+0x32/0x38
 [<ffffffff816a6115>] inet_ioctl+0x81/0x9d
 [<ffffffff816a6115>] ? inet_ioctl+0x81/0x9d
 [<ffffffff81621cf8>] sock_do_ioctl+0x20/0x3d
 [<ffffffff816223c4>] sock_ioctl+0x222/0x22e
 [<ffffffff8121cf95>] do_vfs_ioctl+0x453/0x4d7
 [<ffffffff81625603>] ? __sys_recvmsg+0x4c/0x5b
 [<ffffffff81225af1>] ? __fget_light+0x48/0x6c
 [<ffffffff8121d06b>] SyS_ioctl+0x52/0x74
 [<ffffffff816fa736>] entry_SYSCALL_64_fastpath+0x16/0x7a

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
lclausen-adi pushed a commit that referenced this pull request Jul 11, 2016
In commit f9476a6 ("drm/i915: Refactor platform specifics out of
intel_get_shared_dpll()"), the ibx_get_dpll() function lacked an error
check, that can lead to a NULL pointer dereference when trying to enable
three pipes.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffffa0482275>] intel_reference_shared_dpll+0x15/0x100 [i915]
PGD cec87067 PUD d30ce067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: snd_hda_intel i915 drm_kms_helper drm intel_gtt sch_fq_codel cfg80211 binfmt_misc i2c_algo_bit cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp agpgart kvm_intel snd_hda_codec_hdmi kvm iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic irqbypass aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse pcspkr snd_hda_codec i2c_i801 snd_hwdep snd_hda_core snd_pcm snd_timer lpc_ich mfd_core snd soundcore wmi evdev tpm_tis tpm [last unloaded: drm]
CPU: 3 PID: 5810 Comm: kms_flip Tainted: G     U  W       4.6.0-test+ #3
Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0100.2013.0517.0942 05/17/2013
task: ffff8800d3908040 ti: ffff8801166c8000 task.ti: ffff8801166c8000
RIP: 0010:[<ffffffffa0482275>]  [<ffffffffa0482275>] intel_reference_shared_dpll+0x15/0x100 [i915]
RSP: 0018:ffff8801166cba60  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000000001 RSI: ffff8800d07f1bf8 RDI: 0000000000000000
RBP: ffff8801166cba88 R08: 0000000000000002 R09: ffff8800d32e5698
R10: 0000000000000001 R11: ffff8800cc89ac88 R12: ffff8800d07f1bf8
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f4c3fc8d8c0(0000) GS:ffff88011bcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 00000000d3b4c000 CR4: 00000000001406e0
Stack:
 0000000000000000 ffff8800d07f1bf8 0000000000000000 ffff8800d04c0000
 0000000000000000 ffff8801166cbaa8 ffffffffa04823a7 ffff8800d07f1bf8
 ffff8800d32e5698 ffff8801166cbab8 ffffffffa04840cf ffff8801166cbaf0
Call Trace:
 [<ffffffffa04823a7>] ibx_get_dpll+0x47/0xa0 [i915]
 [<ffffffffa04840cf>] intel_get_shared_dpll+0x1f/0x50 [i915]
 [<ffffffffa046d080>] ironlake_crtc_compute_clock+0x280/0x430 [i915]
 [<ffffffffa0472ac0>] intel_crtc_atomic_check+0x240/0x320 [i915]
 [<ffffffffa03da18e>] drm_atomic_helper_check_planes+0x14e/0x1d0 [drm_kms_helper]
 [<ffffffffa0474a0c>] intel_atomic_check+0x5dc/0x1110 [i915]
 [<ffffffffa029d3aa>] drm_atomic_check_only+0x14a/0x660 [drm]
 [<ffffffffa029d086>] ? drm_atomic_set_crtc_for_connector+0x96/0x100 [drm]
 [<ffffffffa029d8d7>] drm_atomic_commit+0x17/0x60 [drm]
 [<ffffffffa03dc3b7>] restore_fbdev_mode+0x237/0x260 [drm_kms_helper]
 [<ffffffffa029c65a>] ? drm_modeset_lock_all_ctx+0x9a/0xb0 [drm]
 [<ffffffffa03de9b3>] drm_fb_helper_restore_fbdev_mode_unlocked+0x33/0x80 [drm_kms_helper]
 [<ffffffffa03dea2d>] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
 [<ffffffffa03de93a>] drm_fb_helper_hotplug_event+0xaa/0xf0 [drm_kms_helper]
 [<ffffffffa03de9d6>] drm_fb_helper_restore_fbdev_mode_unlocked+0x56/0x80 [drm_kms_helper]
 [<ffffffffa0490f72>] intel_fbdev_restore_mode+0x22/0x80 [i915]
 [<ffffffffa04ba45e>] i915_driver_lastclose+0xe/0x20 [i915]
 [<ffffffffa02810de>] drm_lastclose+0x2e/0x130 [drm]
 [<ffffffffa028148c>] drm_release+0x2ac/0x4b0 [drm]
 [<ffffffff811a6b2d>] __fput+0xed/0x1f0
 [<ffffffff811a6c6e>] ____fput+0xe/0x10
 [<ffffffff81079156>] task_work_run+0x76/0xb0
 [<ffffffff8105aaab>] do_exit+0x3ab/0xc60
 [<ffffffff810a145f>] ? trace_hardirqs_on_caller+0x12f/0x1c0
 [<ffffffff8105c67e>] do_group_exit+0x4e/0xc0
 [<ffffffff8105c704>] SyS_exit_group+0x14/0x20
 [<ffffffff8158bb25>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 14 80 48 8d 34 90 b8 01 00 00 00 d3 e0 09 04 b3 5b 41 5c 5d c3 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 <44> 8b 67 68 48 89 f3 48 8b be 08 02 00 00 4c 8b 2e e8 15 9d fd
RIP  [<ffffffffa0482275>] intel_reference_shared_dpll+0x15/0x100 [i915]
 RSP <ffff8801166cba60>
CR2: 0000000000000068

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: f9476a6 ("drm/i915: Refactor platform specifics out of intel_get_shared_dpll()")
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463748426-5956-1-git-send-email-ander.conselvan.de.oliveira@intel.com
(cherry picked from commit bb14316)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
dbogdan pushed a commit that referenced this pull request Jul 29, 2016
The kbuild test robot noticed some problems with the newhaven
lcd driver.

1. lcd_of_match is not NULL terminated at line 606

2. Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element

3. Random config test warnings:
   drivers/built-in.o: In function `brightness_store':
>> newhaven_lcd.c:(.text+0xd2225): undefined reference to
  `i2c_master_send'
   drivers/built-in.o: In function `lcd_init':
>> newhaven_lcd.c:(.init.text+0x8a9e): undefined reference to
   `i2c_register_driver'
   drivers/built-in.o: In function `lcd_exit':
>> newhaven_lcd.c:(.exit.text+0x439): undefined reference to
   `i2c_del_driver'
   drivers/built-in.o: In function `lcd_remove':
>> newhaven_lcd.c:(.exit.text+0x462): undefined reference to
   `i2c_master_send'

This patch squashes the two patches they kindly sent us plus adds
a Kconfig fix for the warnings in #3 above.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alan Tull <atull@opensource.altera.com>
lclausen-adi pushed a commit that referenced this pull request Aug 5, 2016
The driver never calls platform_set_drvdata() , so platform_get_drvdata()
in .remove returns NULL and thus $indio_dev variable in .remove is NULL.
Then it's only a matter of dereferencing the indio_dev variable to make
the kernel blow as seen below. This patch adds the platform_set_drvdata()
call to fix the problem.

root@armhf:~# rmmod at91-sama5d2_adc

Unable to handle kernel NULL pointer dereference at virtual address 000001d4
pgd = dd57c000
[000001d4] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in: at91_sama5d2_adc(-)
CPU: 0 PID: 1334 Comm: rmmod Not tainted 4.6.0-rc3-next-20160418+ #3
Hardware name: Atmel SAMA5
task: dd4fcc40 ti: de910000 task.ti: de910000
PC is at mutex_lock+0x4/0x24
LR is at iio_device_unregister+0x14/0x6c
pc : [<c05f4624>]    lr : [<c0471f74>]    psr: a00d0013
               sp : de911f00  ip : 00000000  fp : be898bd8
r10: 00000000  r9 : de910000  r8 : c0107724
r7 : 00000081  r6 : bf001048  r5 : 000001d4  r4 : 00000000
r3 : bf000000  r2 : 00000000  r1 : 00000004  r0 : 000001d4
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c53c7d  Table: 3d57c059  DAC: 00000051
Process rmmod (pid: 1334, stack limit = 0xde910208)
Stack: (0xde911f00 to 0xde912000)
1f00: bf000000 00000000 df5c7e10 bf000010 bf000000 df5c7e10 df5c7e10 c0351ca8
1f20: c0351c84 df5c7e10 bf001048 c0350734 bf001048 df5c7e10 df5c7e44 c035087c
1f40: bf001048 7f62dd4c 00000800 c034fb30 bf0010c0 c0158ee8 de910000 31397461
1f60: 6d61735f 32643561 6364615f 00000000 de911f90 de910000 de910000 00000000
1f80: de911fb0 10c53c7d de911f9c c05f33d8 de911fa0 00910000 be898ecb 7f62dd10
1fa0: 00000000 c0107560 be898ecb 7f62dd10 7f62dd4c 00000800 6f844800 6f844800
1fc0: be898ecb 7f62dd10 00000000 00000081 00000000 7f62dd10 be898bd8 be898bd8
1fe0: b6eedab1 be898b6c 7f61056b b6eedab6 000d0030 7f62dd4c 00000000 00000000
[<c05f4624>] (mutex_lock) from [<c0471f74>] (iio_device_unregister+0x14/0x6c)
[<c0471f74>] (iio_device_unregister) from [<bf000010>] (at91_adc_remove+0x10/0x3c [at91_sama5d2_adc])
[<bf000010>] (at91_adc_remove [at91_sama5d2_adc]) from [<c0351ca8>] (platform_drv_remove+0x24/0x3c)
[<c0351ca8>] (platform_drv_remove) from [<c0350734>] (__device_release_driver+0x84/0x110)
[<c0350734>] (__device_release_driver) from [<c035087c>] (driver_detach+0x8c/0x90)
[<c035087c>] (driver_detach) from [<c034fb30>] (bus_remove_driver+0x4c/0xa0)
[<c034fb30>] (bus_remove_driver) from [<c0158ee8>] (SyS_delete_module+0x110/0x1d0)
[<c0158ee8>] (SyS_delete_module) from [<c0107560>] (ret_fast_syscall+0x0/0x3c)
Code: e3520001 1affffd5 eafffff4 f5d0f000 (e1902f9f)
---[ end trace 86914d7ad3696fca ]---

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
lclausen-adi pushed a commit that referenced this pull request Aug 5, 2016
…ing"

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f695.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: stable@vger.kernel.org
Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
dbogdan pushed a commit that referenced this pull request Sep 27, 2017
The kbuild test robot noticed some problems with the newhaven
lcd driver.

1. lcd_of_match is not NULL terminated at line 606

2. Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element

3. Random config test warnings:
   drivers/built-in.o: In function `brightness_store':
>> newhaven_lcd.c:(.text+0xd2225): undefined reference to
  `i2c_master_send'
   drivers/built-in.o: In function `lcd_init':
>> newhaven_lcd.c:(.init.text+0x8a9e): undefined reference to
   `i2c_register_driver'
   drivers/built-in.o: In function `lcd_exit':
>> newhaven_lcd.c:(.exit.text+0x439): undefined reference to
   `i2c_del_driver'
   drivers/built-in.o: In function `lcd_remove':
>> newhaven_lcd.c:(.exit.text+0x462): undefined reference to
   `i2c_master_send'

This patch squashes the two patches they kindly sent us plus adds
a Kconfig fix for the warnings in #3 above.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alan Tull <atull@opensource.altera.com>
commodo pushed a commit that referenced this pull request Apr 26, 2018
[ Upstream commit 72d5481 ]

It is unlikely request_threaded_irq will fail, but if it does for some
reason we should clear iommu->pr_irq in the error path. Also
intel_svm_finish_prq shouldn't try to clean up the page request
interrupt if pr_irq is 0. Without these, if request_threaded_irq were
to fail the following occurs:

fail with no fixes:

[    0.683147] ------------[ cut here ]------------
[    0.683148] NULL pointer, cannot free irq
[    0.683158] WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:1632 irq_domain_free_irqs+0x126/0x140
[    0.683160] Modules linked in:
[    0.683163] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2 #3
[    0.683165] Hardware name:                  /NUC7i3BNB, BIOS BNKBL357.86A.0036.2017.0105.1112 01/05/2017
[    0.683168] RIP: 0010:irq_domain_free_irqs+0x126/0x140
[    0.683169] RSP: 0000:ffffc90000037ce8 EFLAGS: 00010292
[    0.683171] RAX: 000000000000001d RBX: ffff880276283c00 RCX: ffffffff81c5e5e8
[    0.683172] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000246
[    0.683174] RBP: ffff880276283c00 R08: 0000000000000000 R09: 000000000000023c
[    0.683175] R10: 0000000000000007 R11: 0000000000000000 R12: 000000000000007a
[    0.683176] R13: 0000000000000001 R14: 0000000000000000 R15: 0000010010000000
[    0.683178] FS:  0000000000000000(0000) GS:ffff88027ec80000(0000) knlGS:0000000000000000
[    0.683180] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.683181] CR2: 0000000000000000 CR3: 0000000001c09001 CR4: 00000000003606e0
[    0.683182] Call Trace:
[    0.683189]  intel_svm_finish_prq+0x3c/0x60
[    0.683191]  free_dmar_iommu+0x1ac/0x1b0
[    0.683195]  init_dmars+0xaaa/0xaea
[    0.683200]  ? klist_next+0x19/0xc0
[    0.683203]  ? pci_do_find_bus+0x50/0x50
[    0.683205]  ? pci_get_dev_by_id+0x52/0x70
[    0.683208]  intel_iommu_init+0x498/0x5c7
[    0.683211]  pci_iommu_init+0x13/0x3c
[    0.683214]  ? e820__memblock_setup+0x61/0x61
[    0.683217]  do_one_initcall+0x4d/0x1a0
[    0.683220]  kernel_init_freeable+0x186/0x20e
[    0.683222]  ? set_debug_rodata+0x11/0x11
[    0.683225]  ? rest_init+0xb0/0xb0
[    0.683226]  kernel_init+0xa/0xff
[    0.683229]  ret_from_fork+0x1f/0x30
[    0.683259] Code: 89 ee 44 89 e7 e8 3b e8 ff ff 5b 5d 44 89 e7 44 89 ee 41 5c 41 5d 41 5e e9 a8 84 ff ff 48 c7 c7 a8 71 a7 81 31 c0 e8 6a d3 f9 ff <0f> ff 5b 5d 41 5c 41 5d 41 5
e c3 0f 1f 44 00 00 66 2e 0f 1f 84
[    0.683285] ---[ end trace f7650e42792627ca ]---

with iommu->pr_irq = 0, but no check in intel_svm_finish_prq:

[    0.669561] ------------[ cut here ]------------
[    0.669563] Trying to free already-free IRQ 0
[    0.669573] WARNING: CPU: 3 PID: 1 at kernel/irq/manage.c:1546 __free_irq+0xa4/0x2c0
[    0.669574] Modules linked in:
[    0.669577] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2 #4
[    0.669579] Hardware name:                  /NUC7i3BNB, BIOS BNKBL357.86A.0036.2017.0105.1112 01/05/2017
[    0.669581] RIP: 0010:__free_irq+0xa4/0x2c0
[    0.669582] RSP: 0000:ffffc90000037cc0 EFLAGS: 00010082
[    0.669584] RAX: 0000000000000021 RBX: 0000000000000000 RCX: ffffffff81c5e5e8
[    0.669585] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000046
[    0.669587] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000023c
[    0.669588] R10: 0000000000000007 R11: 0000000000000000 R12: ffff880276253960
[    0.669589] R13: ffff8802762538a4 R14: ffff880276253800 R15: ffff880276283600
[    0.669593] FS:  0000000000000000(0000) GS:ffff88027ed80000(0000) knlGS:0000000000000000
[    0.669594] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.669596] CR2: 0000000000000000 CR3: 0000000001c09001 CR4: 00000000003606e0
[    0.669602] Call Trace:
[    0.669616]  free_irq+0x30/0x60
[    0.669620]  intel_svm_finish_prq+0x34/0x60
[    0.669623]  free_dmar_iommu+0x1ac/0x1b0
[    0.669627]  init_dmars+0xaaa/0xaea
[    0.669631]  ? klist_next+0x19/0xc0
[    0.669634]  ? pci_do_find_bus+0x50/0x50
[    0.669637]  ? pci_get_dev_by_id+0x52/0x70
[    0.669639]  intel_iommu_init+0x498/0x5c7
[    0.669642]  pci_iommu_init+0x13/0x3c
[    0.669645]  ? e820__memblock_setup+0x61/0x61
[    0.669648]  do_one_initcall+0x4d/0x1a0
[    0.669651]  kernel_init_freeable+0x186/0x20e
[    0.669653]  ? set_debug_rodata+0x11/0x11
[    0.669656]  ? rest_init+0xb0/0xb0
[    0.669658]  kernel_init+0xa/0xff
[    0.669661]  ret_from_fork+0x1f/0x30
[    0.669662] Code: 7a 08 75 0e e9 c3 01 00 00 4c 39 7b 08 74 57 48 89 da 48 8b 5a 18 48 85 db 75 ee 89 ee 48 c7 c7 78 67 a7 81 31 c0 e8 4c 37 fa ff <0f> ff 48 8b 34 24 4c 89 ef e
8 0e 4c 68 00 49 8b 46 40 48 8b 80
[    0.669688] ---[ end trace 58a470248700f2fc ]---

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commodo pushed a commit that referenced this pull request Apr 26, 2018
[ Upstream commit d754941 ]

If, for any reason, userland shuts down iscsi transport interfaces
before proper logouts - like when logging in to LUNs manually, without
logging out on server shutdown, or when automated scripts can't
umount/logout from logged LUNs - kernel will hang forever on its
sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all
still existent paths.

PID: 1 TASK: ffff8801a69b8000 CPU: 1 COMMAND: "systemd-shutdow"
 #0 [ffff8801a69c3a30] __schedule at ffffffff8183e9ee
 #1 [ffff8801a69c3a80] schedule at ffffffff8183f0d5
 #2 [ffff8801a69c3a98] schedule_timeout at ffffffff81842199
 #3 [ffff8801a69c3b40] io_schedule_timeout at ffffffff8183e604
 #4 [ffff8801a69c3b70] wait_for_completion_io_timeout at ffffffff8183fc6c
 #5 [ffff8801a69c3bd0] blk_execute_rq at ffffffff813cfe10
 #6 [ffff8801a69c3c88] scsi_execute at ffffffff815c3fc7
 #7 [ffff8801a69c3cc8] scsi_execute_req_flags at ffffffff815c60fe
 #8 [ffff8801a69c3d30] sd_sync_cache at ffffffff815d37d7
 #9 [ffff8801a69c3da8] sd_shutdown at ffffffff815d3c3c

This happens because iscsi_eh_cmd_timed_out(), the transport layer
timeout helper, would tell the queue timeout function (scsi_times_out)
to reset the request timer over and over, until the session state is
back to logged in state. Unfortunately, during server shutdown, this
might never happen again.

Other option would be "not to handle" the issue in the transport
layer. That would trigger the error handler logic, which would also need
the session state to be logged in again.

Best option, for such case, is to tell upper layers that the command was
handled during the transport layer error handler helper, marking it as
DID_NO_CONNECT, which will allow completion and inform about the
problem.

After the session was marked as ISCSI_STATE_FAILED, due to the first
timeout during the server shutdown phase, all subsequent cmds will fail
to be queued, allowing upper logic to fail faster.

Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commodo pushed a commit that referenced this pull request Apr 26, 2018
[ Upstream commit 2c0aa08 ]

Scenario:
1. Port down and do fail over
2. Ap do rds_bind syscall

PID: 47039  TASK: ffff89887e2fe640  CPU: 47  COMMAND: "kworker/u:6"
 #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9
 #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3
 #2 [ffff898e35f15b30] oops_end at ffffffff8150f518
 #3 [ffff898e35f15b60] no_context at ffffffff8104854c
 #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675
 #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3
 #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8
 #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95
    [exception RIP: unknown or invalid address]
    RIP: 0000000000000000  RSP: ffff898e35f15dc8  RFLAGS: 00010282
    RAX: 00000000fffffffe  RBX: ffff889b77f6fc00  RCX:ffffffff81c99d88
    RDX: 0000000000000000  RSI: ffff896019ee08e8  RDI:ffff889b77f6fc00
    RBP: ffff898e35f15df0   R8: ffff896019ee08c8  R9:0000000000000000
    R10: 0000000000000400  R11: 0000000000000000  R12:ffff896019ee08c0
    R13: ffff889b77f6fe68  R14: ffffffff81c99d80  R15: ffffffffa022a1e0
    ORIG_RAX: ffffffffffffffff  CS: 0010 SS: 0018
 #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm]
 #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6
 #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0
 #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6

PID: 45659  TASK: ffff880d313d2500  CPU: 31  COMMAND: "oracle_45659_ap"
 #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4
 #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf
 #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7
 #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb
 #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm]
 #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma]
 #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds]
 #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds]
 #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670

PID: 45659                          PID: 47039
rds_ib_laddr_check
  /* create id_priv with a null event_handler */
  rdma_create_id
  rdma_bind_addr
    cma_acquire_dev
      /* add id_priv to cma_dev->id_list */
      cma_attach_to_dev
                                    cma_ndev_work_handler
                                      /* event_hanlder is null */
                                      id_priv->id.event_handler

Signed-off-by: Guanglei Li <guanglei.li@oracle.com>
Signed-off-by: Honglei Wang <honglei.wang@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Yanjun Zhu <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commodo added a commit that referenced this pull request May 8, 2018
This appears when using some GPIO drivers that enable `can_sleep` to true.
This (`can_sleep`) seems to be a debugging facility, that is also enabled by
the PCA953x driver (used for the SidekiqZ2 GPIO expanders).

Even though `can_sleep` is true, the kernel must still compile the `DEBUG`
macro to actually call more debug code, so in essence the cansleep()
variants are only useful when compiling the code for debugging.

The warning is more harmless (as it seems), and it's a lot of dmesg log-spam.

Kernel warnings look like this:
```
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/gpio/gpiolib.c:2626 gpiod_set_value+0x70/0xa8
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-04227-g8a37c19debc4 #3
Hardware name: Xilinx Zynq Platform
[<c0015a28>] (unwind_backtrace) from [<c0012730>] (show_stack+0x10/0x14)
[<c0012730>] (show_stack) from [<c017d2dc>] (dump_stack+0x80/0xa0)
[<c017d2dc>] (dump_stack) from [<c0022190>] (__warn+0xcc/0xfc)
[<c0022190>] (__warn) from [<c0022264>] (warn_slowpath_null+0x1c/0x24)
[<c0022264>] (warn_slowpath_null) from [<c01a2344>] (gpiod_set_value+0x70/0xa8)
[<c01a2344>] (gpiod_set_value) from [<c0365f0c>] (ad9361_reset+0x28/0xa0)
[<c0365f0c>] (ad9361_reset) from [<c036b13c>] (ad9361_probe+0x1a64/0x21cc)
[<c036b13c>] (ad9361_probe) from [<c02214f0>] (spi_drv_probe+0x84/0xa0)
[<c02214f0>] (spi_drv_probe) from [<c01df2e4>] (driver_probe_device+0x12c/0x294)
[<c01df2e4>] (driver_probe_device) from [<c01df4cc>] (__driver_attach+0x80/0xa4)
[<c01df4cc>] (__driver_attach) from [<c01dda7c>] (bus_for_each_dev+0x6c/0x90)
[<c01dda7c>] (bus_for_each_dev) from [<c01dea14>] (bus_add_driver+0xc8/0x1e4)
[<c01dea14>] (bus_add_driver) from [<c01dfcdc>] (driver_register+0x9c/0xe0)
[<c01dfcdc>] (driver_register) from [<c00097fc>] (do_one_initcall+0xac/0x150)
[<c00097fc>] (do_one_initcall) from [<c05e2d5c>] (kernel_init_freeable+0x120/0x1e8)
[<c05e2d5c>] (kernel_init_freeable) from [<c04796e8>] (kernel_init+0x8/0xf4)
[<c04796e8>] (kernel_init) from [<c000ef78>] (ret_from_fork+0x14/0x3c)
---[ end trace 0bb33d2be832b243 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/gpio/gpiolib.c:2626 gpiod_set_value+0x70/0xa8
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.9.0-04227-g8a37c19debc4 #3
Hardware name: Xilinx Zynq Platform
[<c0015a28>] (unwind_backtrace) from [<c0012730>] (show_stack+0x10/0x14)
[<c0012730>] (show_stack) from [<c017d2dc>] (dump_stack+0x80/0xa0)
[<c017d2dc>] (dump_stack) from [<c0022190>] (__warn+0xcc/0xfc)
[<c0022190>] (__warn) from [<c0022264>] (warn_slowpath_null+0x1c/0x24)
[<c0022264>] (warn_slowpath_null) from [<c01a2344>] (gpiod_set_value+0x70/0xa8)
[<c01a2344>] (gpiod_set_value) from [<c0365f28>] (ad9361_reset+0x44/0xa0)
[<c0365f28>] (ad9361_reset) from [<c036b13c>] (ad9361_probe+0x1a64/0x21cc)
[<c036b13c>] (ad9361_probe) from [<c02214f0>] (spi_drv_probe+0x84/0xa0)
[<c02214f0>] (spi_drv_probe) from [<c01df2e4>] (driver_probe_device+0x12c/0x294)
[<c01df2e4>] (driver_probe_device) from [<c01df4cc>] (__driver_attach+0x80/0xa4)
[<c01df4cc>] (__driver_attach) from [<c01dda7c>] (bus_for_each_dev+0x6c/0x90)
[<c01dda7c>] (bus_for_each_dev) from [<c01dea14>] (bus_add_driver+0xc8/0x1e4)
[<c01dea14>] (bus_add_driver) from [<c01dfcdc>] (driver_register+0x9c/0xe0)
[<c01dfcdc>] (driver_register) from [<c00097fc>] (do_one_initcall+0xac/0x150)
[<c00097fc>] (do_one_initcall) from [<c05e2d5c>] (kernel_init_freeable+0x120/0x1e8)
[<c05e2d5c>] (kernel_init_freeable) from [<c04796e8>] (kernel_init+0x8/0xf4)
[<c04796e8>] (kernel_init) from [<c000ef78>] (ret_from_fork+0x14/0x3c)
---[ end trace 0bb33d2be832b244 ]---
```

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
commodo added a commit that referenced this pull request May 8, 2018
This appears when using some GPIO drivers that enable `can_sleep` to true.
This (`can_sleep`) seems to be a debugging facility, that is also enabled
by the PCA953x driver (used for the SidekiqZ2 GPIO expanders).

Even though `can_sleep` is true, the kernel must still compile the `DEBUG`
macro to actually call more debug code, so in essence the cansleep()
variants are only useful when compiling the code for debugging.

The warning is more harmless (as it seems), and it's a lot of dmesg
log-spam.

Kernel warnings look like this:
```
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/gpio/gpiolib.c:2626 gpiod_set_value+0x70/0xa8
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-04227-g8a37c19debc4 #3
Hardware name: Xilinx Zynq Platform
[<c0015a28>] (unwind_backtrace) from [<c0012730>] (show_stack+0x10/0x14)
[<c0012730>] (show_stack) from [<c017d2dc>] (dump_stack+0x80/0xa0)
[<c017d2dc>] (dump_stack) from [<c0022190>] (__warn+0xcc/0xfc)
[<c0022190>] (__warn) from [<c0022264>] (warn_slowpath_null+0x1c/0x24)
[<c0022264>] (warn_slowpath_null) from [<c01a2344>] (gpiod_set_value+0x70/0xa8)
[<c01a2344>] (gpiod_set_value) from [<c0365f0c>] (ad9361_reset+0x28/0xa0)
[<c0365f0c>] (ad9361_reset) from [<c036b13c>] (ad9361_probe+0x1a64/0x21cc)
[<c036b13c>] (ad9361_probe) from [<c02214f0>] (spi_drv_probe+0x84/0xa0)
[<c02214f0>] (spi_drv_probe) from [<c01df2e4>] (driver_probe_device+0x12c/0x294)
[<c01df2e4>] (driver_probe_device) from [<c01df4cc>] (__driver_attach+0x80/0xa4)
[<c01df4cc>] (__driver_attach) from [<c01dda7c>] (bus_for_each_dev+0x6c/0x90)
[<c01dda7c>] (bus_for_each_dev) from [<c01dea14>] (bus_add_driver+0xc8/0x1e4)
[<c01dea14>] (bus_add_driver) from [<c01dfcdc>] (driver_register+0x9c/0xe0)
[<c01dfcdc>] (driver_register) from [<c00097fc>] (do_one_initcall+0xac/0x150)
[<c00097fc>] (do_one_initcall) from [<c05e2d5c>] (kernel_init_freeable+0x120/0x1e8)
[<c05e2d5c>] (kernel_init_freeable) from [<c04796e8>] (kernel_init+0x8/0xf4)
[<c04796e8>] (kernel_init) from [<c000ef78>] (ret_from_fork+0x14/0x3c)
---[ end trace 0bb33d2be832b243 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/gpio/gpiolib.c:2626 gpiod_set_value+0x70/0xa8
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.9.0-04227-g8a37c19debc4 #3
Hardware name: Xilinx Zynq Platform
[<c0015a28>] (unwind_backtrace) from [<c0012730>] (show_stack+0x10/0x14)
[<c0012730>] (show_stack) from [<c017d2dc>] (dump_stack+0x80/0xa0)
[<c017d2dc>] (dump_stack) from [<c0022190>] (__warn+0xcc/0xfc)
[<c0022190>] (__warn) from [<c0022264>] (warn_slowpath_null+0x1c/0x24)
[<c0022264>] (warn_slowpath_null) from [<c01a2344>] (gpiod_set_value+0x70/0xa8)
[<c01a2344>] (gpiod_set_value) from [<c0365f28>] (ad9361_reset+0x44/0xa0)
[<c0365f28>] (ad9361_reset) from [<c036b13c>] (ad9361_probe+0x1a64/0x21cc)
[<c036b13c>] (ad9361_probe) from [<c02214f0>] (spi_drv_probe+0x84/0xa0)
[<c02214f0>] (spi_drv_probe) from [<c01df2e4>] (driver_probe_device+0x12c/0x294)
[<c01df2e4>] (driver_probe_device) from [<c01df4cc>] (__driver_attach+0x80/0xa4)
[<c01df4cc>] (__driver_attach) from [<c01dda7c>] (bus_for_each_dev+0x6c/0x90)
[<c01dda7c>] (bus_for_each_dev) from [<c01dea14>] (bus_add_driver+0xc8/0x1e4)
[<c01dea14>] (bus_add_driver) from [<c01dfcdc>] (driver_register+0x9c/0xe0)
[<c01dfcdc>] (driver_register) from [<c00097fc>] (do_one_initcall+0xac/0x150)
[<c00097fc>] (do_one_initcall) from [<c05e2d5c>] (kernel_init_freeable+0x120/0x1e8)
[<c05e2d5c>] (kernel_init_freeable) from [<c04796e8>] (kernel_init+0x8/0xf4)
[<c04796e8>] (kernel_init) from [<c000ef78>] (ret_from_fork+0x14/0x3c)
---[ end trace 0bb33d2be832b244 ]---
```

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
commodo pushed a commit that referenced this pull request Jul 16, 2018
…sfers

This bug happens only when the UDC needs to sleep during usb_ep_dequeue,
as is the case for (at least) dwc3.

[  382.200896] BUG: scheduling while atomic: screen/1808/0x00000100
[  382.207124] 4 locks held by screen/1808:
[  382.211266]  #0:  (rcu_callback){....}, at: [<c10b4ff0>] rcu_process_callbacks+0x260/0x440
[  382.219949]  #1:  (rcu_read_lock_sched){....}, at: [<c1358ba0>] percpu_ref_switch_to_atomic_rcu+0xb0/0x130
[  382.230034]  #2:  (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>] free_ioctx_users+0x23/0xd0
[  382.230096]  #3:  (&(&ffs->eps_lock)->rlock){....}, at: [<f81e7710>] ffs_aio_cancel+0x20/0x60 [usb_f_fs]
[  382.230160] Modules linked in: usb_f_fs libcomposite configfs bnep btsdio bluetooth ecdh_generic brcmfmac brcmutil intel_powerclamp coretemp dwc3 kvm_intel ulpi udc_core kvm irqbypass crc32_pclmul crc32c_intel pcbc dwc3_pci aesni_intel aes_i586 crypto_simd cryptd ehci_pci ehci_hcd gpio_keys usbcore basincove_gpadc industrialio usb_common
[  382.230407] CPU: 1 PID: 1808 Comm: screen Not tainted 4.14.0-edison+ #117
[  382.230416] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
[  382.230425] Call Trace:
[  382.230438]  <SOFTIRQ>
[  382.230466]  dump_stack+0x47/0x62
[  382.230498]  __schedule_bug+0x61/0x80
[  382.230522]  __schedule+0x43/0x7a0
[  382.230587]  schedule+0x5f/0x70
[  382.230625]  dwc3_gadget_ep_dequeue+0x14c/0x270 [dwc3]
[  382.230669]  ? do_wait_intr_irq+0x70/0x70
[  382.230724]  usb_ep_dequeue+0x19/0x90 [udc_core]
[  382.230770]  ffs_aio_cancel+0x37/0x60 [usb_f_fs]
[  382.230798]  kiocb_cancel+0x31/0x40
[  382.230822]  free_ioctx_users+0x4d/0xd0
[  382.230858]  percpu_ref_switch_to_atomic_rcu+0x10a/0x130
[  382.230881]  ? percpu_ref_exit+0x40/0x40
[  382.230904]  rcu_process_callbacks+0x2b3/0x440
[  382.230965]  __do_softirq+0xf8/0x26b
[  382.231011]  ? __softirqentry_text_start+0x8/0x8
[  382.231033]  do_softirq_own_stack+0x22/0x30
[  382.231042]  </SOFTIRQ>
[  382.231071]  irq_exit+0x45/0xc0
[  382.231089]  smp_apic_timer_interrupt+0x13c/0x150
[  382.231118]  apic_timer_interrupt+0x35/0x3c
[  382.231132] EIP: __copy_user_ll+0xe2/0xf0
[  382.231142] EFLAGS: 00210293 CPU: 1
[  382.231154] EAX: bfd4508c EBX: 00000004 ECX: 00000003 EDX: f3d8fe50
[  382.231165] ESI: f3d8fe51 EDI: bfd4508d EBP: f3d8fe14 ESP: f3d8fe08
[  382.231176]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  382.231265]  core_sys_select+0x25f/0x320
[  382.231346]  ? __wake_up_common_lock+0x62/0x80
[  382.231399]  ? tty_ldisc_deref+0x13/0x20
[  382.231438]  ? ldsem_up_read+0x1b/0x40
[  382.231459]  ? tty_ldisc_deref+0x13/0x20
[  382.231479]  ? tty_write+0x29f/0x2e0
[  382.231514]  ? n_tty_ioctl+0xe0/0xe0
[  382.231541]  ? tty_write_unlock+0x30/0x30
[  382.231566]  ? __vfs_write+0x22/0x110
[  382.231604]  ? security_file_permission+0x2f/0xd0
[  382.231635]  ? rw_verify_area+0xac/0x120
[  382.231677]  ? vfs_write+0x103/0x180
[  382.231711]  SyS_select+0x87/0xc0
[  382.231739]  ? SyS_write+0x42/0x90
[  382.231781]  do_fast_syscall_32+0xd6/0x1a0
[  382.231836]  entry_SYSENTER_32+0x47/0x71
[  382.231848] EIP: 0xb7f75b05
[  382.231857] EFLAGS: 00000246 CPU: 1
[  382.231868] EAX: ffffffda EBX: 00000400 ECX: bfd4508c EDX: bfd4510c
[  382.231878] ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: bfd45020
[  382.231889]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[  382.232281] softirq: huh, entered softirq 9 RCU c10b4d90 with preempt_count 00000100, exited with 00000000?

Tested-by: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
commodo pushed a commit that referenced this pull request Sep 10, 2020
Current code enables TCSR.TE and RCSR.RE together, and disable
TCSR.TE and RCSR.RE together in trigger(), which only supports
one operation mode:
1. Rx synchronous with Tx: TE is last enabled and first disabled

Other operation mode need to be considered also:
2. Tx synchronous with Rx: RE is last enabled and first disabled.
3. Asynchronous mode: Tx and Rx are independent.

So the enable TCSR.TE and RCSR.RE sequence and the disable
sequence need to be refined accordingly for #2 and #3.

There is slightly against what RM recommennds with this change.
For example in Rx synchronous with Tx mode, case "aplay 1.wav;
arecord 2.wav" enable TE before RE. But it should be safe to
do so, judging by years of testing results.

Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>
Link: https://lore.kernel.org/r/20200805063413.4610-2-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
commodo pushed a commit that referenced this pull request Sep 10, 2020
…iu Wang <shengjiu.wang@nxp.com>:

refine and clean code for synchronous mode

Shengjiu Wang (3):
  ASoC: fsl_sai: Refine enable/disable TE/RE sequence in trigger()
  ASoC: fsl_sai: Drop TMR/RMR settings for synchronous mode
  ASoC: fsl_sai: Replace synchronous check with fsl_sai_dir_is_synced

changes in v3:
- Add reviewed-by Nicolin
- refine the commit log.
- Add one more patch #3

changes in v2:
- Split the commit
- refine the sequence in trigger stop

 sound/soc/fsl/fsl_sai.c | 173 +++++++++++++++++++++++-----------------
 1 file changed, 102 insertions(+), 71 deletions(-)

--
2.27.0
commodo pushed a commit that referenced this pull request Sep 10, 2020
This reverts commit 61eee4a ("ALSA: hda: Add support for Loongson
7A1000 controller") to fix the following error on the Loongson LS7A
platform:

rcu: INFO: rcu_preempt self-detected stall on CPU
<SNIP>
NMI backtrace for cpu 0
CPU: 0 PID: 68 Comm: kworker/0:2 Not tainted 5.8.0+ #3
Hardware name:  , BIOS
Workqueue: events azx_probe_work [snd_hda_intel]
<SNIP>
Call Trace:
[<ffffffff80211a64>] show_stack+0x9c/0x130
[<ffffffff8065a740>] dump_stack+0xb0/0xf0
[<ffffffff80665774>] nmi_cpu_backtrace+0x134/0x140
[<ffffffff80665910>] nmi_trigger_cpumask_backtrace+0x190/0x200
[<ffffffff802b1abc>] rcu_dump_cpu_stacks+0x12c/0x190
[<ffffffff802b08cc>] rcu_sched_clock_irq+0xa2c/0xfc8
[<ffffffff802b91d4>] update_process_times+0x2c/0xb8
[<ffffffff802cad80>] tick_sched_timer+0x40/0xb8
[<ffffffff802ba5f0>] __hrtimer_run_queues+0x118/0x1d0
[<ffffffff802bab74>] hrtimer_interrupt+0x12c/0x2d8
[<ffffffff8021547c>] c0_compare_interrupt+0x74/0xa0
[<ffffffff80296bd0>] __handle_irq_event_percpu+0xa8/0x198
[<ffffffff80296cf0>] handle_irq_event_percpu+0x30/0x90
[<ffffffff8029d958>] handle_percpu_irq+0x88/0xb8
[<ffffffff80296124>] generic_handle_irq+0x44/0x60
[<ffffffff80b3cfd0>] do_IRQ+0x18/0x28
[<ffffffff8067ace4>] plat_irq_dispatch+0x64/0x100
[<ffffffff80209a20>] handle_int+0x140/0x14c
[<ffffffff802402e8>] irq_exit+0xf8/0x100

Because AZX_DRIVER_GENERIC can not work well for Loongson LS7A HDA
controller, it needs some workarounds which are not merged into the
upstream kernel at this time, so it should revert this patch now.

Fixes: 61eee4a ("ALSA: hda: Add support for Loongson 7A1000 controller")
Cc: <stable@vger.kernel.org> # 5.9-rc1+
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1598348388-2518-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
commodo pushed a commit that referenced this pull request Sep 10, 2020
I got the following lockdep splat while testing:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc7-00172-g021118712e59 #932 Not tainted
  ------------------------------------------------------
  btrfs/229626 is trying to acquire lock:
  ffffffff828513f0 (cpu_hotplug_lock){++++}-{0:0}, at: alloc_workqueue+0x378/0x450

  but task is already holding lock:
  ffff889dd3889518 (&fs_info->scrub_lock){+.+.}-{3:3}, at: btrfs_scrub_dev+0x11c/0x630

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #7 (&fs_info->scrub_lock){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_scrub_dev+0x11c/0x630
	 btrfs_dev_replace_by_ioctl.cold.21+0x10a/0x1d4
	 btrfs_ioctl+0x2799/0x30a0
	 ksys_ioctl+0x83/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #6 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_run_dev_stats+0x49/0x480
	 commit_cowonly_roots+0xb5/0x2a0
	 btrfs_commit_transaction+0x516/0xa60
	 sync_filesystem+0x6b/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0xe/0x30
	 btrfs_kill_super+0x12/0x20
	 deactivate_locked_super+0x29/0x60
	 cleanup_mnt+0xb8/0x140
	 task_work_run+0x6d/0xb0
	 __prepare_exit_to_usermode+0x1cc/0x1e0
	 do_syscall_64+0x5c/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #5 (&fs_info->tree_log_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_commit_transaction+0x4bb/0xa60
	 sync_filesystem+0x6b/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0xe/0x30
	 btrfs_kill_super+0x12/0x20
	 deactivate_locked_super+0x29/0x60
	 cleanup_mnt+0xb8/0x140
	 task_work_run+0x6d/0xb0
	 __prepare_exit_to_usermode+0x1cc/0x1e0
	 do_syscall_64+0x5c/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #4 (&fs_info->reloc_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_record_root_in_trans+0x43/0x70
	 start_transaction+0xd1/0x5d0
	 btrfs_dirty_inode+0x42/0xd0
	 touch_atime+0xa1/0xd0
	 btrfs_file_mmap+0x3f/0x60
	 mmap_region+0x3a4/0x640
	 do_mmap+0x376/0x580
	 vm_mmap_pgoff+0xd5/0x120
	 ksys_mmap_pgoff+0x193/0x230
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #3 (&mm->mmap_lock#2){++++}-{3:3}:
	 __might_fault+0x68/0x90
	 _copy_to_user+0x1e/0x80
	 perf_read+0x141/0x2c0
	 vfs_read+0xad/0x1b0
	 ksys_read+0x5f/0xe0
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #2 (&cpuctx_mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 perf_event_init_cpu+0x88/0x150
	 perf_event_init+0x1db/0x20b
	 start_kernel+0x3ae/0x53c
	 secondary_startup_64+0xa4/0xb0

  -> #1 (pmus_lock){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 perf_event_init_cpu+0x4f/0x150
	 cpuhp_invoke_callback+0xb1/0x900
	 _cpu_up.constprop.26+0x9f/0x130
	 cpu_up+0x7b/0xc0
	 bringup_nonboot_cpus+0x4f/0x60
	 smp_init+0x26/0x71
	 kernel_init_freeable+0x110/0x258
	 kernel_init+0xa/0x103
	 ret_from_fork+0x1f/0x30

  -> #0 (cpu_hotplug_lock){++++}-{0:0}:
	 __lock_acquire+0x1272/0x2310
	 lock_acquire+0x9e/0x360
	 cpus_read_lock+0x39/0xb0
	 alloc_workqueue+0x378/0x450
	 __btrfs_alloc_workqueue+0x15d/0x200
	 btrfs_alloc_workqueue+0x51/0x160
	 scrub_workers_get+0x5a/0x170
	 btrfs_scrub_dev+0x18c/0x630
	 btrfs_dev_replace_by_ioctl.cold.21+0x10a/0x1d4
	 btrfs_ioctl+0x2799/0x30a0
	 ksys_ioctl+0x83/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    cpu_hotplug_lock --> &fs_devs->device_list_mutex --> &fs_info->scrub_lock

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&fs_info->scrub_lock);
				 lock(&fs_devs->device_list_mutex);
				 lock(&fs_info->scrub_lock);
    lock(cpu_hotplug_lock);

   *** DEADLOCK ***

  2 locks held by btrfs/229626:
   #0: ffff88bfe8bb86e0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: btrfs_scrub_dev+0xbd/0x630
   #1: ffff889dd3889518 (&fs_info->scrub_lock){+.+.}-{3:3}, at: btrfs_scrub_dev+0x11c/0x630

  stack backtrace:
  CPU: 15 PID: 229626 Comm: btrfs Kdump: loaded Not tainted 5.8.0-rc7-00172-g021118712e59 #932
  Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
  Call Trace:
   dump_stack+0x78/0xa0
   check_noncircular+0x165/0x180
   __lock_acquire+0x1272/0x2310
   lock_acquire+0x9e/0x360
   ? alloc_workqueue+0x378/0x450
   cpus_read_lock+0x39/0xb0
   ? alloc_workqueue+0x378/0x450
   alloc_workqueue+0x378/0x450
   ? rcu_read_lock_sched_held+0x52/0x80
   __btrfs_alloc_workqueue+0x15d/0x200
   btrfs_alloc_workqueue+0x51/0x160
   scrub_workers_get+0x5a/0x170
   btrfs_scrub_dev+0x18c/0x630
   ? start_transaction+0xd1/0x5d0
   btrfs_dev_replace_by_ioctl.cold.21+0x10a/0x1d4
   btrfs_ioctl+0x2799/0x30a0
   ? do_sigaction+0x102/0x250
   ? lockdep_hardirqs_on_prepare+0xca/0x160
   ? _raw_spin_unlock_irq+0x24/0x30
   ? trace_hardirqs_on+0x1c/0xe0
   ? _raw_spin_unlock_irq+0x24/0x30
   ? do_sigaction+0x102/0x250
   ? ksys_ioctl+0x83/0xc0
   ksys_ioctl+0x83/0xc0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x50/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens because we're allocating the scrub workqueues under the
scrub and device list mutex, which brings in a whole host of other
dependencies.

Because the work queue allocation is done with GFP_KERNEL, it can
trigger reclaim, which can lead to a transaction commit, which in turns
needs the device_list_mutex, it can lead to a deadlock. A different
problem for which this fix is a solution.

Fix this by moving the actual allocation outside of the
scrub lock, and then only take the lock once we're ready to actually
assign them to the fs_info.  We'll now have to cleanup the workqueues in
a few more places, so I've added a helper to do the refcount dance to
safely free the workqueues.

CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
commodo pushed a commit that referenced this pull request Sep 10, 2020
…s metrics" test

Linux 5.9 introduced perf test case "Parse and process metrics" and
on s390 this test case always dumps core:

  [root@t35lp67 perf]# ./perf test -vvvv -F 67
  67: Parse and process metrics                             :
  --- start ---
  metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
  parsing metric: inst_retired.any / cpu_clk_unhalted.thread
  Segmentation fault (core dumped)
  [root@t35lp67 perf]#

I debugged this core dump and gdb shows this call chain:

  (gdb) where
   #0  0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
   #1  0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
   #2  0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any",
            n=<optimized out>)
       at util/metricgroup.c:368
   #3  find_metric (map=<optimized out>, map=<optimized out>,
           metric=0x1e6ea20 "inst_retired.any")
      at util/metricgroup.c:765
   #4  __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0,
           metric_no_group=<optimized out>, m=<optimized out>)
      at util/metricgroup.c:844
   #5  resolve_metric (ids=0x0, map=0x0, metric_list=0x0,
          metric_no_group=<optimized out>)
      at util/metricgroup.c:881
   #6  metricgroup__add_metric (metric=<optimized out>,
        metric_no_group=metric_no_group@entry=false, events=<optimized out>,
        events@entry=0x3ffd84fb878, metric_list=0x0,
        metric_list@entry=0x3ffd84fb868, map=0x0)
      at util/metricgroup.c:943
   #7  0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>,
        metric_list=0x3ffd84fb868, events=0x3ffd84fb878,
        metric_no_group=<optimized out>, list=<optimized out>)
      at util/metricgroup.c:988
   #8  parse_groups (perf_evlist=perf_evlist@entry=0x1e70260,
          str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>,
          metric_no_merge=<optimized out>,
          fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>,
          metric_events=0x3ffd84fba58, map=0x1)
      at util/metricgroup.c:1040
   #9  0x0000000001103eb2 in metricgroup__parse_groups_test(
  	evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>,
  	str=str@entry=0x12f34b2 "IPC",
  	metric_no_group=metric_no_group@entry=false,
  	metric_no_merge=metric_no_merge@entry=false,
  	metric_events=0x3ffd84fba58)
      at util/metricgroup.c:1082
   #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0,
          ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC",
  	vals=0x3ffd84fbad8, name=0x12f34b2 "IPC")
      at tests/parse-metric.c:159
   #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8,
  	name=0x12f34b2 "IPC")
      at tests/parse-metric.c:189
   #12 test_ipc () at tests/parse-metric.c:208
.....
..... omitted many more lines

This test case was added with
commit 218ca91 ("perf tests: Add parse metric test for frontend metric").

When I compile with make DEBUG=y it works fine and I do not get a core dump.

It turned out that the above listed function call chain worked on a struct
pmu_event array which requires a trailing element with zeroes which was
missing. The marco map_for_each_event() loops over that array tests for members
metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes
the issue.

Output after:

  [root@t35lp46 perf]# ./perf test 67
  67: Parse and process metrics                             : Ok
  [root@t35lp46 perf]#

Committer notes:

As Ian remarks, this is not s390 specific:

<quote Ian>
  This also shows up with address sanitizer on all architectures
  (perhaps change the patch title) and perhaps add a "Fixes: <commit>"
  tag.

  =================================================================
  ==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
  0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp
  0x7ffd24327c58
  READ of size 8 at 0x55c93b4d59e8 thread T0
      #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
      #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
      #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
      #3 0x55c93a1528db in metricgroup__add_metric
  tools/perf/util/metricgroup.c:943:9
      #4 0x55c93a151996 in metricgroup__add_metric_list
  tools/perf/util/metricgroup.c:988:9
      #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
      #6 0x55c93a1513e1 in metricgroup__parse_groups_test
  tools/perf/util/metricgroup.c:1082:9
      #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
      #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
      #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
      #10 0x55c93a00f1e8 in test__parse_metric
  tools/perf/tests/parse-metric.c:345:2
      #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
      #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
      #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
      #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
      #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
      #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
      #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
      #18 0x55c939e45f7e in main tools/perf/perf.c:539:3

  0x55c93b4d59e8 is located 0 bytes to the right of global variable
  'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
  (0x55c93b4d54a0) of size 1352
  SUMMARY: AddressSanitizer: global-buffer-overflow
  tools/perf/util/metricgroup.c:764:2 in find_metric
  Shadow bytes around the buggy address:
    0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  =>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
    0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
    0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:	   fa
    Freed heap region:	   fd
    Stack left redzone:	   f1
    Stack mid redzone:	   f2
    Stack right redzone:     f3
    Stack after return:	   f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:	   f6
    Poisoned by user:        f7
    Container overflow:	   fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
</quote>

I'm also adding the missing "Fixes" tag and setting just .name to NULL,
as doing it that way is more compact (the compiler will zero out
everything else) and the table iterators look for .name being NULL as
the sentinel marking the end of the table.

Fixes: 0a507af ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
commodo pushed a commit that referenced this pull request Sep 15, 2020
Move all allocations outside of the regulator_lock()ed section.

======================================================
WARNING: possible circular locking dependency detected
5.7.13+ #535 Not tainted
------------------------------------------------------
f2fs_discard-179:7/702 is trying to acquire lock:
c0e5d920 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x2c0

but task is already holding lock:
cb95b080 (&dcc->cmd_lock){+.+.}-{3:3}, at: __issue_discard_cmd+0xec/0x5f8

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

[...]

-> #3 (fs_reclaim){+.+.}-{0:0}:
       fs_reclaim_acquire.part.11+0x40/0x50
       fs_reclaim_acquire+0x24/0x28
       __kmalloc_track_caller+0x54/0x218
       kstrdup+0x40/0x5c
       create_regulator+0xf4/0x368
       regulator_resolve_supply+0x1a0/0x200
       regulator_register+0x9c8/0x163c

[...]

other info that might help us debug this:

Chain exists of:
  regulator_list_mutex --> &sit_i->sentry_lock --> &dcc->cmd_lock

[...]

Fixes: f8702f9 ("regulator: core: Use ww_mutex for regulators locking")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/6eebc99b2474f4ffaa0405b15178ece0e7e4f608.1597195321.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Mark Brown <broonie@kernel.org>
commodo pushed a commit that referenced this pull request Feb 19, 2021
Like other tunneling interfaces, the bareudp doesn't need TXLOCK.
So, It is good to set the NETIF_F_LLTX flag to improve performance and
to avoid lockdep's false-positive warning.

Test commands:
    ip netns add A
    ip netns add B
    ip link add veth0 netns A type veth peer name veth1 netns B
    ip netns exec A ip link set veth0 up
    ip netns exec A ip a a 10.0.0.1/24 dev veth0
    ip netns exec B ip link set veth1 up
    ip netns exec B ip a a 10.0.0.2/24 dev veth1

    for i in {2..1}
    do
            let A=$i-1
            ip netns exec A ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec A ip link set bareudp$i up
            ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
            ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
		    dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i

            ip netns exec B ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec B ip link set bareudp$i up
            ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
            ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
		    dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
    done
    ip netns exec A ping 10.0.2.2

Splat looks like:
[   96.992803][  T822] ============================================
[   96.993954][  T822] WARNING: possible recursive locking detected
[   96.995102][  T822] 5.10.0+ #819 Not tainted
[   96.995927][  T822] --------------------------------------------
[   96.997091][  T822] ping/822 is trying to acquire lock:
[   96.998083][  T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   96.999813][  T822]
[   96.999813][  T822] but task is already holding lock:
[   97.001192][  T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.002908][  T822]
[   97.002908][  T822] other info that might help us debug this:
[   97.004401][  T822]  Possible unsafe locking scenario:
[   97.004401][  T822]
[   97.005784][  T822]        CPU0
[   97.006407][  T822]        ----
[   97.007010][  T822]   lock(_xmit_NONE#2);
[   97.007779][  T822]   lock(_xmit_NONE#2);
[   97.008550][  T822]
[   97.008550][  T822]  *** DEADLOCK ***
[   97.008550][  T822]
[   97.010057][  T822]  May be due to missing lock nesting notation
[   97.010057][  T822]
[   97.011594][  T822] 7 locks held by ping/822:
[   97.012426][  T822]  #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
[   97.014191][  T822]  #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.016045][  T822]  #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.017897][  T822]  #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.019684][  T822]  #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
[   97.021573][  T822]  #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.023424][  T822]  #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.025259][  T822]
[   97.025259][  T822] stack backtrace:
[   97.026349][  T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
[   97.027609][  T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   97.029407][  T822] Call Trace:
[   97.030015][  T822]  dump_stack+0x99/0xcb
[   97.030783][  T822]  __lock_acquire.cold.77+0x149/0x3a9
[   97.031773][  T822]  ? stack_trace_save+0x81/0xa0
[   97.032661][  T822]  ? register_lock_class+0x1910/0x1910
[   97.033673][  T822]  ? register_lock_class+0x1910/0x1910
[   97.034679][  T822]  ? rcu_read_lock_sched_held+0x91/0xc0
[   97.035697][  T822]  ? rcu_read_lock_bh_held+0xa0/0xa0
[   97.036690][  T822]  lock_acquire+0x1b2/0x730
[   97.037515][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.038466][  T822]  ? check_flags+0x50/0x50
[   97.039277][  T822]  ? netif_skb_features+0x296/0x9c0
[   97.040226][  T822]  ? validate_xmit_skb+0x29/0xb10
[   97.041151][  T822]  _raw_spin_lock+0x30/0x70
[   97.041977][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.042927][  T822]  __dev_queue_xmit+0x1f52/0x2960
[   97.043852][  T822]  ? netdev_core_pick_tx+0x290/0x290
[   97.044824][  T822]  ? mark_held_locks+0xb7/0x120
[   97.045712][  T822]  ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
[   97.046824][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.047771][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.048710][  T822]  ? trace_hardirqs_on+0x41/0x120
[   97.049626][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.050556][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.051509][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.052443][  T822]  ? check_chain_key+0x244/0x5f0
[   97.053352][  T822]  ? rcu_read_lock_bh_held+0x56/0xa0
[   97.054317][  T822]  ? ip_finish_output2+0x6ea/0x2020
[   97.055263][  T822]  ? pneigh_lookup+0x410/0x410
[   97.056135][  T822]  ip_finish_output2+0x6ea/0x2020
[ ... ]

Acked-by: Guillaume Nault <gnault@redhat.com>
Fixes: 571912c ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commodo pushed a commit that referenced this pull request Feb 19, 2021
Ido Schimmel says:

====================
nexthop: Various fixes

This series contains various fixes for the nexthop code. The bugs were
uncovered during the development of resilient nexthop groups.

Patches #1-#2 fix the error path of nexthop_create_group(). I was not
able to trigger these bugs with current code, but it is possible with
the upcoming resilient nexthop groups code which adds a user
controllable memory allocation further in the function.

Patch #3 fixes wrong validation of netlink attributes.

Patch #4 fixes wrong invocation of mausezahn in a selftest.
====================

Link: https://lore.kernel.org/r/20210107144824.1135691-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
commodo pushed a commit that referenced this pull request Feb 19, 2021
We had kernel panic, it is caused by unload module and last
close confirmation.

call trace:
[1196029.743127]  free_sess+0x15/0x50 [rtrs_client]
[1196029.743128]  rtrs_clt_close+0x4c/0x70 [rtrs_client]
[1196029.743129]  ? rnbd_clt_unmap_device+0x1b0/0x1b0 [rnbd_client]
[1196029.743130]  close_rtrs+0x25/0x50 [rnbd_client]
[1196029.743131]  rnbd_client_exit+0x93/0xb99 [rnbd_client]
[1196029.743132]  __x64_sys_delete_module+0x190/0x260

And in the crashdump confirmation kworker is also running.
PID: 6943   TASK: ffff9e2ac8098000  CPU: 4   COMMAND: "kworker/4:2"
 #0 [ffffb206cf337c30] __schedule at ffffffff9f93f891
 #1 [ffffb206cf337cc8] schedule at ffffffff9f93fe98
 #2 [ffffb206cf337cd0] schedule_timeout at ffffffff9f943938
 #3 [ffffb206cf337d50] wait_for_completion at ffffffff9f9410a7
 #4 [ffffb206cf337da0] __flush_work at ffffffff9f08ce0e
 #5 [ffffb206cf337e20] rtrs_clt_close_conns at ffffffffc0d5f668 [rtrs_client]
 #6 [ffffb206cf337e48] rtrs_clt_close at ffffffffc0d5f801 [rtrs_client]
 #7 [ffffb206cf337e68] close_rtrs at ffffffffc0d26255 [rnbd_client]
 #8 [ffffb206cf337e78] free_sess at ffffffffc0d262ad [rnbd_client]
 #9 [ffffb206cf337e88] rnbd_clt_put_dev at ffffffffc0d266a7 [rnbd_client]

The problem is both code path try to close same session, which lead to
panic.

To fix it, just skip the sess if the refcount already drop to 0.

Fixes: f7a7a5c ("block/rnbd: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
nunojsa pushed a commit that referenced this pull request Aug 19, 2021
"BUG: KASAN: stack-out-of-bounds in strncpy+0x30/0x68"

Linux-ATF interface is using 16 bytes of SMC payload. In case clock name is
longer than 15 bytes, string terminated NULL character will not be received
by Linux. Add explicit NULL character at last byte to fix issues when clock
name is longer.

This fixes below bug reported by KASAN:

[    7.522474] ==================================================================
[    7.529795] BUG: KASAN: stack-out-of-bounds in strncpy+0x30/0x68
[    7.535871] Read of size 1 at addr ffff0008c89a7410 by task swapper/0/1
[    7.542557]
[    7.544065] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.0-00396-g81ef9e7-dirty #3
[    7.551809] Hardware name: Xilinx Versal vck190 Eval board revA (QSPI) (DT)
[    7.558847] Call trace:
[    7.561321]  dump_backtrace+0x0/0x1e8
[    7.565023]  show_stack+0x14/0x20
[    7.568374]  dump_stack+0xd4/0x108
[    7.571817]  print_address_description.isra.0+0xbc/0x37c
[    7.577189]  __kasan_report+0x144/0x198
[    7.581068]  kasan_report+0xc/0x18
[    7.584507]  __asan_load1+0x5c/0x68
[    7.588032]  strncpy+0x30/0x68
[    7.591120]  zynqmp_clock_probe+0x238/0x7b8
[    7.595350]  platform_drv_probe+0x6c/0xc8
[    7.599405]  really_probe+0x14c/0x418
[    7.603108]  driver_probe_device+0x74/0x130
[    7.607339]  __device_attach_driver+0xc4/0xe8
[    7.611744]  bus_for_each_drv+0xec/0x150
[    7.615711]  __device_attach+0x160/0x1d8
[    7.619678]  device_initial_probe+0x10/0x18
[    7.623907]  bus_probe_device+0xe0/0xf0
[    7.627785]  device_add+0x528/0x950
[    7.631312]  of_device_add+0x5c/0x80
[    7.634926]  of_platform_device_create_pdata+0x120/0x168
[    7.640299]  of_platform_bus_create+0x244/0x4e0
[    7.644880]  of_platform_populate+0x50/0xe8
[    7.649110]  zynqmp_firmware_probe+0x370/0x3a8
[    7.653602]  platform_drv_probe+0x6c/0xc8
[    7.657656]  really_probe+0x14c/0x418
[    7.661359]  driver_probe_device+0x74/0x130
[    7.665589]  device_driver_attach+0x94/0xa0
[    7.669820]  __driver_attach+0x70/0x108
[    7.673698]  bus_for_each_dev+0xe4/0x158
[    7.677664]  driver_attach+0x30/0x40
[    7.681278]  bus_add_driver+0x21c/0x2b8
[    7.685157]  driver_register+0xbc/0x1d0
[    7.689035]  __platform_driver_register+0x7c/0x88
[    7.693793]  zynqmp_firmware_driver_init+0x1c/0x24
[    7.698637]  do_one_initcall+0xa4/0x234
[    7.702518]  kernel_init_freeable+0x1b0/0x24c
[    7.706924]  kernel_init+0x10/0x110
[    7.710450]  ret_from_fork+0x10/0x18
[    7.714058]
[    7.715559] The buggy address belongs to the page:
[    7.720405] page:ffff0008f9be1c88 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
[    7.728772] raw: 0008d00000000000 ffff0008f9be1c90 ffff0008f9be1c90 0000000000000000
[    7.736606] raw: 0000000000000000 0000000000000000 00000000ffffffff
[    7.742942] page dumped because: kasan: bad access detected
[    7.748572]
[    7.750076] addr ffff0008c89a7410 is located in stack of task swapper/0/1 at offset 112 in frame:
[    7.759052]  zynqmp_clock_probe+0x0/0x7b8
[    7.763103]
[    7.764604] this frame has 3 objects:
[    7.768306]  [32, 44) 'response'
[    7.768312]  [64, 80) 'ret_payload'
[    7.771573]  [96, 112) 'name'
[    7.775095]
[    7.779585] Memory state around the buggy address:
[    7.784430]  ffff0008c89a7300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    7.791735]  ffff0008c89a7380: 00 00 00 00 f1 f1 f1 f1 00 04 f2 f2 00 00 f2 f2
[    7.799040] >ffff0008c89a7400: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[    7.806342]                          ^
[    7.810132]  ffff0008c89a7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    7.817437]  ffff0008c89a7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    7.824738] ==================================================================

Signed-off-by: Ian Nam <young.kwan.nam@xilinx.com>
Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com>
nunojsa added a commit that referenced this pull request Mar 29, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so 'of_node_put()'
will be wrongly called on it (since that node is not something we really hold
a reference) leading to the following dump when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
nunojsa added a commit that referenced this pull request Mar 29, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
nunojsa added a commit that referenced this pull request Mar 31, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
nunojsa added a commit that referenced this pull request Mar 31, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so 'of_node_put()'
will be wrongly called on it (since that node is not something we really hold
a reference) leading to the following dump when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
nunojsa added a commit that referenced this pull request Mar 31, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
azure-pipelines bot pushed a commit that referenced this pull request Mar 31, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
(cherry picked from commit e6f48d8)
azure-pipelines bot pushed a commit that referenced this pull request Mar 31, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
(cherry picked from commit e6f48d8)
jeremymc526 pushed a commit to zainar/linux_analogdev that referenced this pull request Jul 4, 2022
Both loopback DMA engines are now supported in the AXIDMA testcode.

The following Device Tree snippet is needed in order to test both
DMA engines

/include/ "system-conf.dtsi"
/ {
        axidmatest_1: axidmatest@1 {
                      compatible = "zainar,axi-dma-test-1.00.a";
                      dmas = <&axi_dma_0 0  /* Write channel */
                              &axi_dma_0 1  /* Read channel */
                              &axi_dma_1 0  /* Write channel */
                              &axi_dma_1 1>; /* Read channel */
                      dma-names = "axidma0", "axidma1", "axidma2", "axidma3";
        };
};

Test results:

[    2.669085]  ===== ZAINAR AXI DUAL FPGA DMA TEST PROBE =====
[    2.674858]  -- xilinx_axidmatest_probe --
[    2.679529]  -- dmatest_add_slave_channels --
[    2.683945] -- dmatest_add_slave_threads --
[    2.688253] dmatest: Started 1 threads using dma1chan0 dma1chan1
[    2.694771] dma1chan0-dma1c: verifying source buffer...
[    2.702279] dma1chan0-dma1c: verifying dest buffer...
[    2.709575] dma1chan0-dma1c: #0: No errors with
[    2.709585] src_off=0x124 dst_off=0x4dc len=0x380c
[    2.723140] dma1chan0-dma1c: verifying source buffer...
[    2.730605] dma1chan0-dma1c: verifying dest buffer...
[    2.737954] dma1chan0-dma1c: analogdevicesinc#1: No errors with
[    2.737964] src_off=0x304 dst_off=0x1c80 len=0xeb0
[    2.753020] dma1chan0-dma1c: verifying source buffer...
[    2.760480] dma1chan0-dma1c: verifying dest buffer...
[    2.767814] dma1chan0-dma1c: analogdevicesinc#2: No errors with
[    2.767824] src_off=0x90 dst_off=0x88 len=0x3f40
[    2.781391] dma1chan0-dma1c: verifying source buffer...
[    2.788854] dma1chan0-dma1c: verifying dest buffer...
[    2.796184] dma1chan0-dma1c: analogdevicesinc#3: No errors with
[    2.796193] src_off=0x90c dst_off=0x728 len=0x16c8
[    2.811164] dma1chan0-dma1c: verifying source buffer...
[    2.818629] dma1chan0-dma1c: verifying dest buffer...
[    2.825951] dma1chan0-dma1c: analogdevicesinc#4: No errors with
[    2.825960] src_off=0x11c dst_off=0x248 len=0x3c44
[    2.835414] dma1chan0-dma1c: terminating after 5 tests, 0 failures 89 iops 10624 KB/s (status 0)

[    2.844241]  -- xilinx_axidmatest_probe -- Finished adding first loopback channels
[    2.852467]  -- dmatest_add_slave_channels --
[    2.856821] -- dmatest_add_slave_threads --
[    2.861159] dmatest: Started 1 threads using dma2chan0 dma2chan1
[    2.866909] dma2chan0-dma2c: verifying source buffer...
[    2.874653] dma2chan0-dma2c: verifying dest buffer...
[    2.881974] dma2chan0-dma2c: #0: No errors with
[    2.881985] src_off=0x188c dst_off=0x1360 len=0x2378
[    2.896049] dma2chan0-dma2c: verifying source buffer...
[    2.903541] dma2chan0-dma2c: verifying dest buffer...
[    2.910840] dma2chan0-dma2c: analogdevicesinc#1: No errors with
[    2.910849] src_off=0xe5c dst_off=0x720 len=0x1c64
[    2.924670] dma2chan0-dma2c: verifying source buffer...
[    2.932158] dma2chan0-dma2c: verifying dest buffer...
[    2.939457] dma2chan0-dma2c: analogdevicesinc#2: No errors with
[    2.939466] src_off=0x1df4 dst_off=0xfa0 len=0x188c
[    2.953057] dma2chan0-dma2c: verifying source buffer...
[    2.960519] dma2chan0-dma2c: verifying dest buffer...
[    2.967847] dma2chan0-dma2c: analogdevicesinc#3: No errors with
[    2.967857] src_off=0x17a0 dst_off=0x2a08 len=0xed0
[    2.982723] dma2chan0-dma2c: verifying source buffer...
[    2.990180] dma2chan0-dma2c: verifying dest buffer...
[    2.997508] dma2chan0-dma2c: analogdevicesinc#4: No errors with
[    2.997517] src_off=0x458 dst_off=0x770 len=0x365c
[    3.006967] dma2chan0-dma2c: terminating after 5 tests, 0 failures 89 iops 7757 KB/s (status 0)
[    3.015704]  -- xilinx_axidmatest_probe -- Finished adding second loopback channels
rbolboac pushed a commit that referenced this pull request Sep 5, 2022
We hit following warning when running tests on kernel
compiled with CONFIG_DEBUG_ATOMIC_SLEEP=y:

 WARNING: CPU: 19 PID: 4472 at mm/gup.c:2381 __get_user_pages_fast+0x1a4/0x200
 CPU: 19 PID: 4472 Comm: dummy Not tainted 5.6.0-rc6+ #3
 RIP: 0010:__get_user_pages_fast+0x1a4/0x200
 ...
 Call Trace:
  perf_prepare_sample+0xff1/0x1d90
  perf_event_output_forward+0xe8/0x210
  __perf_event_overflow+0x11a/0x310
  __intel_pmu_pebs_event+0x657/0x850
  intel_pmu_drain_pebs_nhm+0x7de/0x11d0
  handle_pmi_common+0x1b2/0x650
  intel_pmu_handle_irq+0x17b/0x370
  perf_event_nmi_handler+0x40/0x60
  nmi_handle+0x192/0x590
  default_do_nmi+0x6d/0x150
  do_nmi+0x2f9/0x3c0
  nmi+0x8e/0xd7

While __get_user_pages_fast() is IRQ-safe, it calls access_ok(),
which warns on:

  WARN_ON_ONCE(!in_task() && !pagefault_disabled())

Peter suggested disabling page faults around __get_user_pages_fast(),
which gets rid of the warning in access_ok() call.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200407141427.3184722-1-jolsa@kernel.org
rbolboac pushed a commit that referenced this pull request Sep 5, 2022
sched/core.c uses update_avg() for rq->avg_idle and sched/fair.c uses an
open-coded version (with the exact same decay factor) for
rq->avg_scan_cost. On top of that, select_idle_cpu() expects to be able to
compare these two fields.

The only difference between the two is that rq->avg_scan_cost is computed
using a pure division rather than a shift. Turns out it actually matters,
first of all because the shifted value can be negative, and the standard
has this to say about it:

  """
  The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] If E1
  has a signed type and a negative value, the resulting value is
  implementation-defined.
  """

Not only this, but (arithmetic) right shifting a negative value (using 2's
complement) is *not* equivalent to dividing it by the corresponding power
of 2. Let's look at a few examples:

  -4      -> 0xF..FC
  -4 >> 3 -> 0xF..FF == -1 != -4 / 8

  -8      -> 0xF..F8
  -8 >> 3 -> 0xF..FF == -1 == -8 / 8

  -9      -> 0xF..F7
  -9 >> 3 -> 0xF..FE == -2 != -9 / 8

Make update_avg() use a division, and export it to the private scheduler
header to reuse it where relevant. Note that this still lets compilers use
a shift here, but should prevent any unwanted surprise. The disassembly of
select_idle_cpu() remains unchanged on arm64, and ttwu_do_wakeup() gains 2
instructions; the diff sort of looks like this:

  - sub x1, x1, x0
  + subs x1, x1, x0 // set condition codes
  + add x0, x1, #0x7
  + csel x0, x0, x1, mi // x0 = x1 < 0 ? x0 : x1
    add x0, x3, x0, asr #3

which does the right thing (i.e. gives us the expected result while still
using an arithmetic shift)

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200330090127.16294-1-valentin.schneider@arm.com
nunojsa added a commit that referenced this pull request Sep 7, 2022
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
(cherry picked from commit e6f48d8)
nunojsa pushed a commit that referenced this pull request Mar 13, 2023
On the preemption path when updating a Xen guest's runstate times, this
lock is taken inside the scheduler rq->lock, which is a raw spinlock.
This was shown in a lockdep warning:

[   89.138354] =============================
[   89.138356] [ BUG: Invalid wait context ]
[   89.138358] 5.15.0-rc5+ #834 Tainted: G S        I E
[   89.138360] -----------------------------
[   89.138361] xen_shinfo_test/2575 is trying to lock:
[   89.138363] ffffa34a0364efd8 (&kvm->arch.pvclock_gtod_sync_lock){....}-{3:3}, at: get_kvmclock_ns+0x1f/0x130 [kvm]
[   89.138442] other info that might help us debug this:
[   89.138444] context-{5:5}
[   89.138445] 4 locks held by xen_shinfo_test/2575:
[   89.138447]  #0: ffff972bdc3b8108 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x77/0x6f0 [kvm]
[   89.138483]  #1: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_ioctl_run+0xdc/0x8b0 [kvm]
[   89.138526]  #2: ffff97331fdbac98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xff/0xbd0
[   89.138534]  #3: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_put+0x26/0x170 [kvm]
...
[   89.138695]  get_kvmclock_ns+0x1f/0x130 [kvm]
[   89.138734]  kvm_xen_update_runstate+0x14/0x90 [kvm]
[   89.138783]  kvm_xen_update_runstate_guest+0x15/0xd0 [kvm]
[   89.138830]  kvm_arch_vcpu_put+0xe6/0x170 [kvm]
[   89.138870]  kvm_sched_out+0x2f/0x40 [kvm]
[   89.138900]  __schedule+0x5de/0xbd0

Cc: stable@vger.kernel.org
Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c85 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <1b02a06421c17993df337493a68ba923f3bd5c0f.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nunojsa added a commit that referenced this pull request Mar 14, 2023
When handling the notifications sent by dts overlays, the code was
calling 'of_find_node_with_property()' to look for jesd204 devices. The
typical way to call this is:

```
for (dn = of_find_node_with_property(NULL, prop_name); dn; \
	     dn = of_find_node_with_property(dn, prop_name))
```

So when we are done with the loop, all nodes are properly handled (i.e:
of_node_put() was called). The problem here is that we first call
'of_find_node_with_property()' with a non NULL 'from' and so
'of_node_put()' will be wrongly called on it (since that node is not
something we really hold a reference for) leading to the following dump
when releasing a dt overlay:

[ 8590.192388] OF: ERROR: Bad of_node_put() on /fragment@0/__overlay__
[ 8590.192410] CPU: 0 PID: 1352 Comm: dtoverlay Tainted: G        WC        5.10.63-v7l+ #3
[ 8590.192420] Hardware name: BCM2711
[ 8590.192429] Backtrace:
[ 8590.192468] [<c0ffcedc>] (dump_backtrace) from [<c0ffd250>] (show_stack+0x20/0x24)
[ 8590.192483]  r7:ffffffff r6:00000000 r5:60000013 r4:c1ee76fc
[ 8590.192501] [<c0ffd230>] (show_stack) from [<c1001a80>] (dump_stack+0xc4/0xf0)
[ 8590.192519] [<c10019bc>] (dump_stack) from [<c0b9dae8>] (of_node_release+0x120/0x124)
[ 8590.192533]  r9:c49b1100 r8:c4fe3020 r7:c5fb0190 r6:c5fb01bc r5:00000000 r4:c5fb01bc
[ 8590.192553] [<c0b9d9c8>] (of_node_release) from [<c07a1470>] (kobject_put+0xb8/0xf8)
[ 8590.192565]  r7:00000000 r6:c1f30748 r5:00000000 r4:c5fb01bc
[ 8590.192580] [<c07a13b8>] (kobject_put) from [<c0b9cd40>] (of_node_put+0x24/0x28)
[ 8590.192592]  r7:c1f3097c r6:c4fe3000 r5:c4fe3000 r4:00000001
[ 8590.192609] [<c0b9cd1c>] (of_node_put) from [<c0ba2580>] (free_overlay_changeset+0x68/0xa8)
[ 8590.192626] [<c0ba2518>] (free_overlay_changeset) from [<c0ba27f8>] (of_overlay_remove+0x1b8/0x2b8)
...

This change fixes it by refactoring the way we detect jesd204 devices.
Note that with this, the following dts overlay example won't work:

```
__overlay__ {
	node1 {
		node2 {
			jesd204_device;
		};
	};
};
```

For now this should not be a problem as we don't really have jesd
devices defined in child nodes of a device.

Fixes: b61aebb ("jesd204: jesd204-core: Support for dynamic dt changes")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
(cherry picked from commit e6f48d8)
pcercuei pushed a commit that referenced this pull request Jun 21, 2023
When smb1 mount fails, KASAN detect slab-out-of-bounds in
init_smb2_rsp_hdr like the following one.
For smb1 negotiate(56bytes) , init_smb2_rsp_hdr() for smb2 is called.
The issue occurs while handling smb1 negotiate as smb2 server operations.
Add smb server operations for smb1 (get_cmd_val, init_rsp_hdr,
allocate_rsp_buf, check_user_session) to handle smb1 negotiate so that
smb2 server operation does not handle it.

[  411.400423] CIFS: VFS: Use of the less secure dialect vers=1.0 is
not recommended unless required for access to very old servers
[  411.400452] CIFS: Attempting to mount \\192.168.45.139\homes
[  411.479312] ksmbd: init_smb2_rsp_hdr : 492
[  411.479323] ==================================================================
[  411.479327] BUG: KASAN: slab-out-of-bounds in
init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[  411.479369] Read of size 16 at addr ffff888488ed0734 by task kworker/14:1/199

[  411.479379] CPU: 14 PID: 199 Comm: kworker/14:1 Tainted: G
 OE      6.1.21 #3
[  411.479386] Hardware name: ASUSTeK COMPUTER INC. Z10PA-D8
Series/Z10PA-D8 Series, BIOS 3801 08/23/2019
[  411.479390] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[  411.479425] Call Trace:
[  411.479428]  <TASK>
[  411.479432]  dump_stack_lvl+0x49/0x63
[  411.479444]  print_report+0x171/0x4a8
[  411.479452]  ? kasan_complete_mode_report_info+0x3c/0x200
[  411.479463]  ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[  411.479497]  kasan_report+0xb4/0x130
[  411.479503]  ? init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[  411.479537]  kasan_check_range+0x149/0x1e0
[  411.479543]  memcpy+0x24/0x70
[  411.479550]  init_smb2_rsp_hdr+0x1e2/0x1f4 [ksmbd]
[  411.479585]  handle_ksmbd_work+0x109/0x760 [ksmbd]
[  411.479616]  ? _raw_spin_unlock_irqrestore+0x50/0x50
[  411.479624]  ? smb3_encrypt_resp+0x340/0x340 [ksmbd]
[  411.479656]  process_one_work+0x49c/0x790
[  411.479667]  worker_thread+0x2b1/0x6e0
[  411.479674]  ? process_one_work+0x790/0x790
[  411.479680]  kthread+0x177/0x1b0
[  411.479686]  ? kthread_complete_and_exit+0x30/0x30
[  411.479692]  ret_from_fork+0x22/0x30
[  411.479702]  </TASK>

Fixes: 39b291b ("ksmbd: return unsupported error on smb1 mount")
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
pcercuei pushed a commit that referenced this pull request Jun 21, 2023
…kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.3, part #3

 - Ensure the guest PMU context is restored before the first KVM_RUN,
   fixing an issue where EL0 event counting is broken after vCPU
   save/restore

 - Actually initialize ID_AA64PFR0_EL1.{CSV2,CSV3} based on the
   sanitized, system-wide values for protected VMs
pcercuei pushed a commit that referenced this pull request Jun 21, 2023
…ct_cfm

This attempts to fix the following trace:

======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc2-g0b93eeba4454 #4703 Not tainted
------------------------------------------------------
kworker/u3:0/46 is trying to acquire lock:
ffff888001fd9130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at:
sco_connect_cfm+0x118/0x4a0

but task is already holding lock:
ffffffff831e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
hci_sync_conn_complete_evt+0x1ad/0x3d0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (hci_cb_list_lock){+.+.}-{3:3}:
       __mutex_lock+0x13b/0xcc0
       hci_sync_conn_complete_evt+0x1ad/0x3d0
       hci_event_packet+0x55c/0x7c0
       hci_rx_work+0x34c/0xa00
       process_one_work+0x575/0x910
       worker_thread+0x89/0x6f0
       kthread+0x14e/0x180
       ret_from_fork+0x2b/0x50

-> #1 (&hdev->lock){+.+.}-{3:3}:
       __mutex_lock+0x13b/0xcc0
       sco_sock_connect+0xfc/0x630
       __sys_connect+0x197/0x1b0
       __x64_sys_connect+0x37/0x50
       do_syscall_64+0x42/0x90
       entry_SYSCALL_64_after_hwframe+0x70/0xda

-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
       __lock_acquire+0x18cc/0x3740
       lock_acquire+0x151/0x3a0
       lock_sock_nested+0x32/0x80
       sco_connect_cfm+0x118/0x4a0
       hci_sync_conn_complete_evt+0x1e6/0x3d0
       hci_event_packet+0x55c/0x7c0
       hci_rx_work+0x34c/0xa00
       process_one_work+0x575/0x910
       worker_thread+0x89/0x6f0
       kthread+0x14e/0x180
       ret_from_fork+0x2b/0x50

other info that might help us debug this:

Chain exists of:
  sk_lock-AF_BLUETOOTH-BTPROTO_SCO --> &hdev->lock --> hci_cb_list_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);

 *** DEADLOCK ***

4 locks held by kworker/u3:0/46:
 #0: ffff8880028d1130 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4c0/0x910
 #1: ffff8880013dfde0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4c0/0x910
 #2: ffff8880025d8070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_sync_conn_complete_evt+0xa6/0x3d0
 #3: ffffffffb79e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
 hci_sync_conn_complete_evt+0x1ad/0x3d0

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
btogorean pushed a commit that referenced this pull request Aug 25, 2023
[ Upstream commit 93c660c ]

ASAN reports an use-after-free in btf_dump_name_dups:

ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
READ of size 2 at 0xffff927006db thread T0
    #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
    #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
    #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
    #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
    #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
    #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
    #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
    #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
    #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
    #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
    #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
    #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #15 0xaaaab5d65990  (test_progs+0x185990)

0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
freed by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
    #7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
    #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #12 0xaaaab5d65990  (test_progs+0x185990)

previously allocated by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
    #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
    #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
    #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #13 0xaaaab5d65990  (test_progs+0x185990)

The reason is that the key stored in hash table name_map is a string
address, and the string memory is allocated by realloc() function, when
the memory is resized by realloc() later, the old memory may be freed,
so the address stored in name_map references to a freed memory, causing
use-after-free.

Fix it by storing duplicated string address in name_map.

Fixes: 919d2b1 ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
btogorean pushed a commit that referenced this pull request Aug 25, 2023
…g the sock

[ Upstream commit 3cf7203 ]

There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting vxlan_sock vs from sk_user_data. Then in
later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
NULL pointer dereference. e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data reader to finish before
releasing the sock.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants