1. 06 Feb, 2018 1 commit
  2. 01 Feb, 2018 1 commit
    • lib/strscpy: Shut up KASAN false-positives in strscpy() · 1a3241ff
      Andrey Ryabinin authored
      
      strscpy() performs optimistic word-at-a-time reads.  So it may
      access the memory past the end of the object, which is perfectly fine
      since strscpy() doesn't use that (past-the-end) data and makes sure the
      optimistic read won't cross a page boundary.
      
      Use new read_word_at_a_time() to shut up the KASAN.
      
      Note that this could potentially hide some bugs.  In the example below,
      strscpy() will copy more than it should (1-3 extra uninitialized bytes):
      
              char dst[8];
              char *src;
      
              src = kmalloc(5, GFP_KERNEL);
              memset(src, 0xff, 5);
              strscpy(dst, src, 8);
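
The word-at-a-time idea described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel's strscpy() or read_word_at_a_time(): the helper names are hypothetical, and the caller is assumed to guarantee that all `cap` bytes are readable, which is the property the kernel gets from never crossing a page boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Classic bit trick: returns nonzero iff at least one byte of w is zero. */
static int word_has_zero_byte(uint64_t w)
{
	const uint64_t ones  = 0x0101010101010101ULL;
	const uint64_t highs = 0x8080808080808080ULL;

	return ((w - ones) & ~w & highs) != 0;
}

/*
 * Hypothetical helper: length of the string at s, scanning 8 bytes at a
 * time. Assumption: all cap bytes starting at s are readable, even those
 * after the NUL terminator.
 */
static size_t scan_len_wordwise(const char *s, size_t cap)
{
	size_t i = 0;

	while (i + 8 <= cap) {
		uint64_t w;

		/* Optimistic read: the word may include bytes past the NUL. */
		memcpy(&w, s + i, sizeof(w));
		if (word_has_zero_byte(w))
			break;
		i += 8;
	}
	/* Finish byte by byte to locate the exact terminator. */
	while (i < cap && s[i] != '\0')
		i++;
	return i;
}
```

The whole-word read is what a sanitizer sees as an access beyond a 5-byte object, even though the bytes after the terminator never influence the result.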
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 27 Jan, 2018 1 commit
  4. 23 Jan, 2018 2 commits
    • vsprintf: Do not have bprintf dereference pointers · 841a915d
      Steven Rostedt (VMware) authored
      When trace_printk() was introduced, it was discussed that, to keep its
      overhead as low as possible, the processing of the format string should be
      delayed until the trace is read. That is, trace_printk() should not convert
      %d into numbers and so on, but should instead save the fmt string and all the
      args in the buffer at the time of recording. When the trace_printk() data is
      read, the format string is parsed and the saved arguments in the tracing
      buffer are converted.
      
      The code to perform this was added to vsprintf where vbin_printf() would
      save the arguments of a specified format string in a buffer, then
      bstr_printf() could be used to convert the buffer with the same format
      string into the final output, as if vsprintf() was called in one go.
      
      The issue arises when dereferenced pointers are used. The problem is that
      something like %*pbl, which reads a bitmask, will save the pointer to the
      bitmask in the buffer. Reading the buffer via bstr_printf() will then
      dereference that pointer to produce the final output. Obviously the value of
      that pointer could have changed since the time it was recorded to the time
      the buffer is read. Worse yet, the bitmask could be unmapped, and the
      reading of the trace buffer could actually cause a kernel oops.
      
      Another problem is that user space tools such as perf and trace-cmd do not
      have access to the contents of these pointers, and they become useless when
      the tracing buffer is extracted.
      
      Instead of having vbin_printf() simply save the pointer in the buffer for
      later processing, have it perform the formatting at the time bin_printf() is
      called. This will fix the issue of dereferencing pointers at a later time,
      and has the extra benefit of having user space tools understand these
      values.
      
      Since perf and trace-cmd already can handle %p[sSfF] via saving kallsyms,
      their pointers are saved and not processed during vbin_printf(). If they
      were converted, it would break perf and trace-cmd, as they would not know
      how to deal with the conversion.
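
The difference between saving a pointer and formatting at record time can be illustrated with a small userspace sketch. The record layouts and function names below are hypothetical and are not the kernel's binary trace format; plain "%lx" stands in for the pointer-dereferencing specifiers such as %*pbl.

```c
#include <stdio.h>

/* Hypothetical record layouts, for illustration only. */
struct deferred_rec { const unsigned long *mask; }; /* saves only the pointer */
struct eager_rec    { char text[32]; };             /* saves formatted text */

/* Old behavior (sketch): remember where the data lives. */
static void record_deferred(struct deferred_rec *r, const unsigned long *mask)
{
	r->mask = mask;
}

/* New behavior (sketch): format the pointed-to data at record time. */
static void record_eager(struct eager_rec *r, const unsigned long *mask)
{
	snprintf(r->text, sizeof(r->text), "%lx", *mask);
}

/* Reading a deferred record dereferences the saved pointer much later. */
static void read_deferred(const struct deferred_rec *r, char *out, size_t n)
{
	snprintf(out, n, "%lx", *r->mask);
}
```

If the mask changes between recording and reading, the deferred record reports the new value, while the eager record still holds what was true when the event was recorded. If the memory were freed instead, the deferred read would be invalid.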
      
      Link: http://lkml.kernel.org/r/20171228204025.14a71d8f@gandalf.local.home
      
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • kobject: Export kobj_ns_grab_current() and kobj_ns_drop() · 172856ea
      Bart Van Assche authored
      
      Make it possible to call these two functions from a kernel module.
      Note: despite their name, these two functions can be used meaningfully
      independent of kobjects. A later patch will add calls to these
      functions from the SRP driver because this patch series modifies the
      SRP driver such that it can hold a reference to a namespace that can
      last longer than the lifetime of the process through which the
      namespace reference was obtained.
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  5. 22 Jan, 2018 2 commits
  6. 20 Jan, 2018 1 commit
  7. 19 Jan, 2018 1 commit
    • lib/scatterlist: Fix chaining support in sgl_alloc_order() · 8c7a8d1c
      Bart Van Assche authored
      This patch prevents workloads with large block sizes (megabytes) from
      triggering the following call stack with the ib_srpt driver (the only
      driver that chains scatterlists allocated by sgl_alloc_order()):
      
      BUG: Bad page state in process kworker/0:1H  pfn:2423a78
      page:fffffb03d08e9e00 count:-3 mapcount:0 mapping:          (null) index:0x0
      flags: 0x57ffffc0000000()
      raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff
      raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
      page dumped because: nonzero _count
      CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G          I      4.15.0-rc7.bart+ #1
      Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
      Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
      Call Trace:
       dump_stack+0x5c/0x83
       bad_page+0xf5/0x10f
       get_page_from_freelist+0xa46/0x11b0
       __alloc_pages_nodemask+0x103/0x290
       sgl_alloc_order+0x101/0x180
       target_alloc_sgl+0x2c/0x40 [target_core_mod]
       srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt]
       srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt]
       __ib_process_cq+0x55/0xa0 [ib_core]
       ib_cq_poll_work+0x1b/0x60 [ib_core]
       process_one_work+0x141/0x340
       worker_thread+0x47/0x3e0
       kthread+0xf5/0x130
       ret_from_fork+0x1f/0x30
      
      Fixes: e80a0af4 ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()")
      Reported-by: Laurence Oberman <loberman@redhat.com>
      Tested-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 15 Jan, 2018 16 commits
  9. 13 Jan, 2018 3 commits
    • error-injection: Support fault injection framework · 4b1a29a7
      Masami Hiramatsu authored
      Support the in-kernel fault-injection framework via debugfs.
      This allows you to inject a conditional error into a specified
      function using debugfs interfaces.
      
      Here is the result of the test script described in
      Documentation/fault-injection/fault-injection.txt
      
        ===========
        # ./test_fail_function.sh
        1+0 records in
        1+0 records out
        1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
        btrfs-progs v4.4
        See http://btrfs.wiki.kernel.org for more information.
      
        Label:              (null)
        UUID:               bfa96010-12e9-4360-aed0-42eec7af5798
        Node size:          16384
        Sector size:        4096
        Filesystem size:    1001.00MiB
        Block group profiles:
          Data:             single            8.00MiB
          Metadata:         DUP              58.00MiB
          System:           DUP              12.00MiB
        SSD detected:       no
        Incompat features:  extref, skinny-metadata
        Number of devices:  1
        Devices:
           ID        SIZE  PATH
            1  1001.00MiB  /dev/loop2
      
        mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
        SUCCESS!
        ===========
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • error-injection: Add injectable error types · 663faf9f
      Masami Hiramatsu authored
      
      Add injectable error types for each error-injectable function.
      
      One motivation for error-injection testing is to find software flaws,
      mistakes or mishandling of expected errors. If the test finds such
      flaws, they are program bugs that need to be fixed.
      
      But if the tester injects the wrong kind of error (e.g. just returns a
      success code without processing anything), it causes unexpected behavior
      even if the caller is correctly programmed to handle any errors.
      That is not what we want to test with error injection.
      
      To clarify what type of errors the caller must expect for each
      injectable function, this introduces injectable error types:
      
       - EI_ETYPE_NULL : means the function will return NULL if it
      		    fails. No ERR_PTR, just a NULL.
       - EI_ETYPE_ERRNO : means the function will return -ERRNO
      		    if it fails.
       - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
      		       (ERR_PTR) or NULL.
      
      ALLOW_ERROR_INJECTION() macro is expanded to get one of
      NULL, ERRNO, ERRNO_NULL to record the error type for
      each function. e.g.
      
       ALLOW_ERROR_INJECTION(open_ctree, ERRNO)
      
      These error types are shown in debugfs as below.
      
        ====
        / # cat /sys/kernel/debug/error_injection/list
        open_ctree [btrfs]	ERRNO
        io_ctl_init [btrfs]	ERRNO
        ====
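
The three error types can be read as a contract on which return values are legitimate to inject. The following userspace sketch encodes that contract. The enum names come from the commit message, but the numeric values, the SKETCH_MAX_ERRNO bound and the injectable_value_ok() helper are assumptions made for illustration, not the kernel's implementation.

```c
#include <stdbool.h>

/* Names from the commit message; the numeric values are arbitrary here. */
enum ei_etype {
	EI_ETYPE_NULL,		/* function returns NULL on failure */
	EI_ETYPE_ERRNO,		/* function returns -ERRNO on failure */
	EI_ETYPE_ERRNO_NULL,	/* function returns -ERRNO (ERR_PTR) or NULL */
};

/* Assumed upper bound on errno magnitudes for this sketch. */
#define SKETCH_MAX_ERRNO 4095

static bool is_errno_value(long v)
{
	return v < 0 && v >= -SKETCH_MAX_ERRNO;
}

/* Hypothetical check: is retval a sensible injected error for this type? */
static bool injectable_value_ok(enum ei_etype type, long retval)
{
	switch (type) {
	case EI_ETYPE_NULL:
		return retval == 0;
	case EI_ETYPE_ERRNO:
		return is_errno_value(retval);
	case EI_ETYPE_ERRNO_NULL:
		return retval == 0 || is_errno_value(retval);
	}
	return false;
}
```

Under this reading, injecting 0 into an ERRNO function such as open_ctree is exactly the tester mistake the commit message warns about: the caller sees success although nothing was done.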
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • error-injection: Separate error-injection from kprobe · 540adea3
      Masami Hiramatsu authored
      
      The error-injection framework is not limited to use by kprobes or
      bpf. Other kernel subsystems, e.g. livepatch, ftrace etc., can use it
      freely for checking the safety of error injection.
      So this separates the error-injection framework from kprobes.
      
      Some changes have been made:
      
      - The word "kprobe" is removed from all APIs/structures.
      - BPF_ALLOW_ERROR_INJECTION() is renamed to
        ALLOW_ERROR_INJECTION() since it is not limited to BPF either.
      - CONFIG_FUNCTION_ERROR_INJECTION is the config item for this
        feature. It is automatically enabled if the arch supports the
        error injection feature for kprobe or ftrace etc.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  10. 12 Jan, 2018 1 commit
  11. 10 Jan, 2018 1 commit
    • dma-mapping: move swiotlb arch helpers to a new header · ea8c64ac
      Christoph Hellwig authored
      
      phys_to_dma, dma_to_phys and dma_capable are helpers published by
      architecture code for use of swiotlb and xen-swiotlb only.  Drivers are
      not supposed to use these directly, but use the DMA API instead.
      
      Move these to a new asm/dma-direct.h helper, included by a
      linux/dma-direct.h wrapper that provides the default linear mapping
      unless the architecture wants to override it.
      
      In the MIPS case the existing dma-coherent.h is reused for now as
      untangling it will take a bit of work.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
  12. 09 Jan, 2018 3 commits
    • bpf: introduce BPF_JIT_ALWAYS_ON config · 290af866
      Alexei Starovoitov authored
      
      The BPF interpreter has been used as part of the Spectre variant 2 attack (CVE-2017-5715).
      
      A quote from the Google Project Zero blog:
      "At this point, it would normally be necessary to locate gadgets in
      the host kernel code that can be used to actually leak data by reading
      from an attacker-controlled location, shifting and masking the result
      appropriately and then using the result of that as offset to an
      attacker-controlled address for a load. But piecing gadgets together
      and figuring out which ones work in a speculation context seems annoying.
      So instead, we decided to use the eBPF interpreter, which is built into
      the host kernel - while there is no legitimate way to invoke it from inside
      a VM, the presence of the code in the host kernel's text section is sufficient
      to make it usable for the attack, just like with ordinary ROP gadgets."
      
      To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
      option, which removes the interpreter from the kernel in favor of JIT-only mode.
      So far the eBPF JIT is supported by:
      x64, arm64, arm32, sparc64, s390, powerpc64, mips64
      
      The start of a JITed program is randomized and the code page is marked as read-only.
      In addition, "constant blinding" can be turned on with net.core.bpf_jit_harden.
      
      v2->v3:
      - move __bpf_prog_ret0 under ifdef (Daniel)
      
      v1->v2:
      - fix init order, test_bpf and cBPF (Daniel's feedback)
      - fix offloaded bpf (Jakub's feedback)
      - add 'return 0' dummy in case something can invoke prog->bpf_func
      - retarget bpf tree. For bpf-next the patch would need one extra hunk.
        It will be sent when the trees are merged back to net-next
      
      Considered doing:
        int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
      but it seems better to land the patch as-is and in bpf-next remove
      bpf_jit_enable global variable from all JITs, consolidate in one place
      and remove this jit_init() function.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • treewide: Use DEVICE_ATTR_RW · b6b996b6
      Joe Perches authored
      
      Convert DEVICE_ATTR uses to DEVICE_ATTR_RW where possible.
      
      Done with perl script:
      
      $ git grep -w --name-only DEVICE_ATTR | \
        xargs perl -i -e 'local $/; while (<>) { s/\bDEVICE_ATTR\s*\(\s*(\w+)\s*,\s*\(?(\s*S_IRUGO\s*\|\s*S_IWUSR|\s*S_IWUSR\s*\|\s*S_IRUGO\s*|\s*0644\s*)\)?\s*,\s*\1_show\s*,\s*\1_store\s*\)/DEVICE_ATTR_RW(\1)/g; print;}'
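
The pattern the script rewrites can be pictured with simplified userspace stand-ins for the two macros. The struct layout and macro bodies below are assumptions made for illustration and differ from the real kernel definitions; only the naming convention (a `<name>_show`/`<name>_store` pair with mode 0644) is taken from the regular expression above.

```c
#include <stddef.h>

/* Simplified stand-in; the kernel's struct device_attribute differs. */
struct device_attribute {
	const char *name;
	unsigned short mode;
	long (*show)(char *buf);
	long (*store)(const char *buf, size_t count);
};

/* Mock of the long form: every field is spelled out by the caller. */
#define DEVICE_ATTR(_name, _mode, _show, _store) \
	struct device_attribute dev_attr_##_name = \
		{ #_name, _mode, _show, _store }

/* Mock of the short form: mode and callback names follow from _name. */
#define DEVICE_ATTR_RW(_name) \
	DEVICE_ATTR(_name, 0644, _name##_show, _name##_store)

/* Hypothetical attribute callbacks used only by this sketch. */
static long level_show(char *buf) { (void)buf; return 0; }
static long level_store(const char *buf, size_t count) { (void)buf; return (long)count; }
static long gain_show(char *buf) { (void)buf; return 0; }
static long gain_store(const char *buf, size_t count) { (void)buf; return (long)count; }

/* Before the conversion: explicit mode and callbacks. */
static DEVICE_ATTR(level, 0644, level_show, level_store);

/* After the conversion: the same information derived from the name. */
static DEVICE_ATTR_RW(gain);
```

The conversion is only safe where the callbacks already follow the naming convention, which is why the regular expression requires `\1_show` and `\1_store`.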
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Acked-by: Zhang Rui <rui.zhang@intel.com>
      Acked-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
      Acked-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • symbol lookup: introduce dereference_symbol_descriptor() · 04b8eb7a
      Sergey Senozhatsky authored
      dereference_symbol_descriptor() invokes appropriate ARCH specific
      function descriptor dereference callbacks:
      - dereference_kernel_function_descriptor() if the pointer is a
        kernel symbol;
      
      - dereference_module_function_descriptor() if the pointer is a
        module symbol.
      
      This is the last step needed to make '%pS/%ps' smart enough to
      handle function descriptor dereference on affected ARCHs and
      to retire '%pF/%pf'.
      
      To recap:
        Some architectures (ia64, ppc64, parisc64) use an indirect pointer
        for C function pointers - the function pointer points to a function
        descriptor and we need to dereference it to get the actual function
        pointer.
      
        Function descriptors live in the .opd ELF section and all affected
        ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and
        modules. So we, technically, can decide if the dereference is
        needed by simply looking at the pointer: if it belongs to .opd
        section then we need to dereference it.
      
        The kernel and modules have their own .opd sections, obviously,
        that's why we need to split dereference_function_descriptor()
        and use separate kernel and module dereference arch callbacks.
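
The "look at the pointer" decision can be sketched in userspace C. The descriptor layout, the fake_opd array standing in for an .opd section, and the deref_if_descriptor() helper are all hypothetical; the real arch callbacks and descriptor formats differ per architecture.

```c
#include <stdint.h>

/* Hypothetical descriptor: the entry address is all this sketch needs. */
struct func_desc {
	void *entry;
};

/* Stand-in for one .opd section (kernel or module) in this sketch. */
static struct func_desc fake_opd[4];

/*
 * If ptr falls inside the descriptor section, return the entry address it
 * describes; otherwise ptr is already a plain address and is returned as is.
 */
static void *deref_if_descriptor(void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t start = (uintptr_t)&fake_opd[0];
	uintptr_t end = (uintptr_t)&fake_opd[4]; /* one past the last entry */

	if (p >= start && p < end)
		return ((struct func_desc *)ptr)->entry;
	return ptr;
}
```

Because the kernel image and each module have their own sections, a real implementation needs one such range check per section, which is the reason for the separate kernel and module callbacks.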
      
      Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV
      
      
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: James Bottomley <jejb@parisc-linux.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Tony Luck <tony.luck@intel.com> #ia64
      Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc
      Tested-by: Helge Deller <deller@gmx.de> #parisc64
      Signed-off-by: Petr Mladek <pmladek@suse.com>
  13. 08 Jan, 2018 1 commit
  14. 06 Jan, 2018 1 commit
  15. 05 Jan, 2018 1 commit
    • lib: do not use print_symbol() · d202d47b
      Sergey Senozhatsky authored
      print_symbol() is a very old API that has been obsoleted by %pS format
      specifier in a normal printk() call.
      
      Replace print_symbol() with a direct printk("%pS") call.
      
      Link: http://lkml.kernel.org/r/20171211125025.2270-13-sergey.senozhatsky@gmail.com
      
      
      To: Andrew Morton <akpm@linux-foundation.org>
      To: Russell King <linux@armlinux.org.uk>
      To: Catalin Marinas <catalin.marinas@arm.com>
      To: Mark Salter <msalter@redhat.com>
      To: Tony Luck <tony.luck@intel.com>
      To: David Howells <dhowells@redhat.com>
      To: Yoshinori Sato <ysato@users.sourceforge.jp>
      To: Guan Xuetao <gxt@mprc.pku.edu.cn>
      To: Borislav Petkov <bp@alien8.de>
      To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      To: Thomas Gleixner <tglx@linutronix.de>
      To: Peter Zijlstra <peterz@infradead.org>
      To: Vineet Gupta <vgupta@synopsys.com>
      To: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-am33-list@redhat.com
      Cc: linux-sh@vger.kernel.org
      Cc: linux-edac@vger.kernel.org
      Cc: x86@kernel.org
      Cc: linux-snps-arc@lists.infradead.org
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      [pmladek@suse.com: updated commit message]
      Signed-off-by: Petr Mladek <pmladek@suse.com>
  16. 01 Jan, 2018 1 commit
  17. 29 Dec, 2017 1 commit
  18. 22 Dec, 2017 2 commits
    • blk-mq: improve heavily contended tag case · 4e5dff41
      Jens Axboe authored
      
      Even with a number of waitqueues, we can get into a situation where we
      are heavily contended on the waitqueue lock. I got a report on spc1
      where we're spending seconds doing this. Arguably the use case is nasty,
      I reproduce it with one device and 1000 threads banging on the device.
      But that doesn't mean we shouldn't be handling it better.
      
      What ends up happening is that a thread will fail to get a tag, add
      itself to the waitqueue, and subsequently get woken up when a tag is
      freed - only to find itself going back to sleep on the waitqueue.
      
      Instead of waking all threads, use an exclusive wait and wake up our
      sbitmap batch count instead. This seems to work well for me (massive
      improvement for this use case), and it survives basic testing. But I
      haven't fully verified it yet.
      
      An additional improvement is running the queue and checking for a new
      tag BEFORE needing to add ourselves to the waitqueue.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • lib/mpi: Fix umul_ppmm() for MIPS64r6 · bbc25bee
      James Hogan authored
      Current MIPS64r6 toolchains aren't able to generate efficient
      DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
      performs an unsigned 64 x 64 bit multiply and returns the upper and
      lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
      inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
      x 128 multiply. This is both inefficient, and it results in a link error
      since we don't include __multi3 in MIPS linux.
      
      For example commit 90a53e44 ("cfg80211: implement regdb signature
      checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
      64r6el_defconfig builds by indirectly selecting MPILIB. The same build
      errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:
      
      lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
      lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
      lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
      lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
      lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
      lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
      lib/mpi/mpih-div.o: In function `mpihelp_divrem':
      lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
      lib/mpi/mpih-div.c:142: undefined reference to `__multi3'
      
      Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
      inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
      calls being emitted.
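
For reference, the operation umul_ppmm() performs (a 64 x 64 bit unsigned multiply yielding the upper and lower 64-bit halves) can be written portably by splitting each operand into 32-bit halves. This is a sketch of the arithmetic only: it is neither the kernel's generic macro nor the MIPS64r6 inline assembly added by this commit, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: *hi:*lo = u * v as a full 128-bit product. */
static void umul_64x64(uint64_t u, uint64_t v, uint64_t *hi, uint64_t *lo)
{
	uint64_t ul = (uint32_t)u, uh = u >> 32;
	uint64_t vl = (uint32_t)v, vh = v >> 32;

	/* Four 32 x 32 -> 64 bit partial products. */
	uint64_t ll = ul * vl;
	uint64_t lh = ul * vh;
	uint64_t hl = uh * vl;
	uint64_t hh = uh * vh;

	/*
	 * Everything that lands in bits 32..63 of the result, plus the carry
	 * out of that range. Three 32-bit terms cannot overflow 64 bits.
	 */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	*lo = (mid << 32) | (uint32_t)ll;
	*hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```

On MIPS64r6 the same result is available from two instructions, DMULU for the low half and DMUHU for the high half, which is what makes the inline-assembly version attractive compared with a call to __multi3.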
      
      Fixes: 7fd08ca5 ("MIPS: Add build support for the MIPS R6 ISA")
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-mips@linux-mips.org
      Cc: linux-crypto@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>