- 02 Jun, 2017 1 commit
-
-
Michal Hocko authored
Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit. At the time commit 7e784422 ("lockdep: allow to disable reclaim lockup detection") was written we still had __GFP_OTHER_NODE, but I removed it in commit 41b6167e ("mm: get rid of __GFP_OTHER_NODE") and forgot to lower the bit value. The current value is outside of __GFP_BITS_SHIFT, so it cannot actually be used. Fixes: 7e784422 ("lockdep: allow to disable reclaim lockup detection") Signed-off-by:
Michal Hocko <mhocko@suse.com> Reported-by:
Igor Stoppa <igor.stoppa@nokia.com> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
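The constraint described in this commit can be sketched in plain C: a flag bit is only usable if it survives masking with the "all valid bits" mask. The names and numeric values below (EX_BITS_SHIFT, EX_FLAG_TOO_HIGH, EX_FLAG_LOWERED) are illustrative assumptions, not the kernel's real GFP definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; these are NOT the kernel's real GFP definitions. */
#define EX_BITS_SHIFT    25u                          /* number of usable flag bits */
#define EX_BITS_MASK     ((1u << EX_BITS_SHIFT) - 1u) /* covers bits 0..24 */

#define EX_FLAG_TOO_HIGH (1u << 25) /* one bit past the mask: silently dropped */
#define EX_FLAG_LOWERED  (1u << 24) /* highest bit the mask still covers */

/* A flag is usable only if masking with the valid-bits mask leaves it intact. */
static bool ex_flag_survives_mask(unsigned int flag)
{
    return (flag & EX_BITS_MASK) == flag;
}
```

In this sketch, lowering the flag by one bit position is enough to bring it back inside the mask, which is the kind of adjustment the commit message describes.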
-
- 31 May, 2017 1 commit
-
-
Nicholas Bellinger authored
This patch fixes an OOPS originally introduced by: commit bb048357 Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Thu Sep 5 14:54:04 2013 -0700 iscsi-target: Add sk->sk_state_change to cleanup after TCP failure which would trigger a NULL pointer dereference when a TCP connection was closed asynchronously via iscsi_target_sk_state_change(), but only when the initial PDU processing in iscsi_target_do_login() from iscsi_np process context was blocked waiting for backend I/O to complete. To address this issue, this patch makes the following changes. First, it introduces some common helper functions used for checking socket closing state, checking login_flags, and atomically checking socket closing state + setting login_flags. Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP connection has dropped via iscsi_target_sk_state_change(), but the initial PDU processing within iscsi_target_do_login() in iscsi_np context is still running. For this case, it sets LOGIN_FLAGS_CLOSED, but doesn't invoke schedule_delayed_work(). The original NULL pointer dereference case reported by MNC is now handled by iscsi_target_do_login() doing an iscsi_target_sk_check_close() before transitioning to FFP to determine when the socket has already closed, or iscsi_target_start_negotiation() if the login needs to exchange more PDUs (e.g. iscsi_target_do_login returned 0) but the socket has closed. For both of these cases, the cleanup of remaining connection resources will occur in iscsi_target_start_negotiation() from iscsi_np process context once the failure is detected. Finally, to handle the case where iscsi_target_sk_state_change() is called after the initial PDU processing is complete, it now invokes conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur in iscsi_target_do_login_rx() from delayed workqueue process context once the failure is detected. Reported-by:
Mike Christie <mchristi@redhat.com> Reviewed-by:
Mike Christie <mchristi@redhat.com> Tested-by:
Mike Christie <mchristi@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Reported-by:
Hannes Reinecke <hare@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Varun Prakash <varun@chelsio.com> Cc: <stable@vger.kernel.org> # v3.12+ Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org>
-
- 26 May, 2017 2 commits
-
-
Eric Dumazet authored
Andrey Konovalov reported crashes in ipv4_mtu(). I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152: 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time, run the following loop: while : ; do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ; ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ; done. Cong Wang attempted to add back rt->fi in commit 82486aa6 ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested adding a refcount to the metrics themselves, since they are a standalone object (in particular, holding no reference to other objects). I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dsts need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: 2860583f ("ipv4: Kill rt->fi") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
Andrey Konovalov <andreyknvl@google.com> Reviewed-by:
Julian Anastasov <ja@ssi.bg> Acked-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
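The idea of attaching a reference count to the metrics object itself can be sketched in userspace C as follows. This is a minimal illustration under stated assumptions: struct ex_metrics and the ex_metrics_* helpers are hypothetical names, and C11 atomics stand in for whatever primitive the kernel patch actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared metrics block; not the kernel's struct. */
struct ex_metrics {
    atomic_int refcnt;   /* number of holders that may still read the metrics */
    unsigned int mtu;
};

/* Allocate a metrics block owned by one holder. */
static struct ex_metrics *ex_metrics_alloc(unsigned int mtu)
{
    struct ex_metrics *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    atomic_init(&m->refcnt, 1);
    m->mtu = mtu;
    return m;
}

/* A new user (for example, a cached route) takes its own reference. */
static void ex_metrics_get(struct ex_metrics *m)
{
    atomic_fetch_add(&m->refcnt, 1);
}

/* Drop one reference; returns true if this call freed the object. */
static bool ex_metrics_put(struct ex_metrics *m)
{
    if (atomic_fetch_sub(&m->refcnt, 1) == 1) {
        free(m);
        return true;
    }
    return false;
}
```

Because the object holds no references to other objects, the last put can free it directly, which is what makes a standalone refcount attractive in this sketch.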
-
Christoph Hellwig authored
We need to return an error for any call that asks for MSI / MSI-X vectors only, so that non-trivial fallback logic can work properly. Also validate dev->irq and use the "correct" errno value based on feedback from Linus. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reported-by:
Steven Rostedt <rostedt@goodmis.org> Fixes: aff17164 ("PCI: Provide sensible IRQ vector alloc/free routines") Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 25 May, 2017 1 commit
-
-
Daniel Borkmann authored
This patch adds various verifier test cases: 1) A test case for the pruning issue when tracking alignment is used. 2) Various PTR_TO_MAP_VALUE_OR_NULL tests to make sure pointer arithmetic turns such register into UNKNOWN_VALUE type. 3) Test cases for the special treatment of LD_ABS/LD_IND to make sure verifier doesn't break calling convention here. Latter is needed, since f.e. arm64 JIT uses r1 - r5 for storing temporary data, so they really must be marked as NOT_INIT. Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 24 May, 2017 1 commit
-
-
Vlad Yasevich authored
It appears that TCP checksum offloading has been broken for Q-in-Q vlans. The behavior was exacerbated by the series commit afb0bc97 ("Merge branch 'stacked_vlan_tso'") that enabled acceleration features on stacked vlans. However, even without that series, it is possible to trigger this issue. It just requires a lot more specialized configuration. The root cause is the interaction between how netdev_intersect_features() works, the features actually set on the vlan devices and HW having the ability to run checksum with longer headers. The issue starts when netdev_intersect_features() replaces NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM, if the HW advertises IP|IPV6 specific checksums. This happens for tagged and multi-tagged packets. However, HW that enables IP|IPV6 checksum offloading doesn't guarantee that packets with arbitrarily long headers can be checksummed. This patch disables IP|IPV6 checksums on the packet for multi-tagged packets. CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> CC: Michal Kubecek <mkubecek@suse.cz> Signed-off-by:
Vladislav Yasevich <vyasevic@redhat.com> Acked-by:
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 23 May, 2017 8 commits
-
-
Imre Deak authored
Some drivers - like i915 - may not support the system suspend direct complete optimization due to differences in their runtime and system suspend sequence. Add a flag that when set resumes the device before calling the driver's system suspend handlers which effectively disables the optimization. Needed by a future patch fixing suspend/resume on i915. Suggested by Rafael. Signed-off-by:
Imre Deak <imre.deak@intel.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Acked-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org
-
Ilya Dryomov authored
Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Alex Elder <elder@linaro.org>
-
Jesper Dangaard Brouer authored
The masks for extracting parts of the Completion Queue Entry (CQE) field rss_hash_type were swapped, namely CQE_RSS_HTYPE_IP and CQE_RSS_HTYPE_L4. The bug resulted in setting skb->l4_hash even though the rss_hash_type indicated that the hash was NOT computed over the L4 (UDP or TCP) part of the packet. Added comments from the datasheet to make it clearer what these masks are selecting. Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Acked-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Oliver Neukum authored
Some devices need their multicast filter reset but others are crashed by that. So the methods need to be separated. Signed-off-by:
Oliver Neukum <oneukum@suse.com> Reported-by:
"Ridgway, Keith" <kridgway@harris.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Mohamad Haj Yahia authored
Currently, when a firmware command gets stuck or takes a long time to complete, the driver command times out, the command slot is freed, and the slot can be used for new commands. If the firmware then receives a new command on the old, still-busy slot, its behavior is unexpected and could be harmful. To fix this, when the driver command times out we return failure but do not free the command slot; instead we wait for the firmware to explicitly respond to that command. Once all the entries are busy we stop processing new firmware commands. Fixes: 9cba4ebc ('net/mlx5: Fix potential deadlock in command mode change') Signed-off-by:
Mohamad Haj Yahia <mohamad@mellanox.com> Cc: kernel-team@fb.com Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com>
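The slot-ownership rule from this commit message can be sketched as a tiny state machine: a driver-side timeout reports failure but leaves the slot busy, and only an explicit firmware completion releases it. Everything here (EX_NUM_SLOTS, struct ex_cmd_table, the ex_* functions) is a hypothetical illustration, not the mlx5 driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define EX_NUM_SLOTS 4 /* hypothetical command queue depth */

struct ex_cmd_table {
    bool busy[EX_NUM_SLOTS]; /* true while firmware may still touch the slot */
};

/* Claim the first free slot; returns its index, or -1 if every slot is busy. */
static int ex_slot_alloc(struct ex_cmd_table *t)
{
    for (int i = 0; i < EX_NUM_SLOTS; i++) {
        if (!t->busy[i]) {
            t->busy[i] = true;
            return i;
        }
    }
    return -1;
}

/* Driver-side timeout: fail the command but deliberately keep the slot busy. */
static int ex_cmd_timeout(struct ex_cmd_table *t, int slot)
{
    (void)t;
    (void)slot;
    return -ETIMEDOUT;
}

/* Firmware finally answered: this is the only place a slot is released. */
static void ex_fw_complete(struct ex_cmd_table *t, int slot)
{
    t->busy[slot] = false;
}
```

With this rule a timed-out slot is never handed to a new command while the firmware might still write to it; the cost is that the table can fill up, at which point no new commands are accepted.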
-
Or Gerlitz authored
Add the accessors for realizing if this is a csum action, and for which fields checksum is needed. Signed-off-by:
Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by:
Paul Blakey <paulb@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com>
-
Eric W. Biederman authored
When I introduced ptracer_cred I failed to consider the weirdness of fork where the task_struct copies the old value by default. This winds up leaving ptracer_cred set even when a process forks and the child process does not wind up being ptraced. Because ptracer_cred is not set on non-ptraced processes whose parents were ptraced this has broken the ability of the enlightenment window manager to start setuid children. Fix this by properly initializing ptracer_cred in ptrace_init_task. This must be done with a little bit of care to preserve the current value of ptracer_cred when ptrace carries through fork. Re-reading the ptracer_cred from the ptracing process at this point is inconsistent with how PT_PTRACE_CAP has been maintained all of these years. Tested-by:
Takashi Iwai <tiwai@suse.de> Fixes: 64b875f7 ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP") Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com>
-
Mika Westerberg authored
Sometimes it is more convenient to be able to match a whole family of products, as in the case of a bunch of Chromebooks based on Intel_Strago, to apply a driver quirk instead of quirking each machine one-by-one. This adds support for the DMI_PRODUCT_FAMILY identification string and also exports it to userspace through a sysfs attribute just like the existing ones. Suggested-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by:
Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by:
Linus Walleij <linus.walleij@linaro.org>
-
- 22 May, 2017 4 commits
-
-
Ming Lei authored
No one uses it any more, so remove it. Reviewed-by:
Keith Busch <keith.busch@intel.com> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Christoph Hellwig <hch@lst.de>
-
Jan Glauber authored
of_platform_device_destroy is the counterpart to of_platform_device_create which is a non-static function. After creating a platform device it might be necessary to destroy it to deal with -EPROBE_DEFER where a repeated of_platform_device_create call would fail otherwise. Therefore also make of_platform_device_destroy globally visible. Signed-off-by:
Jan Glauber <jglauber@cavium.com> Acked-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org>
-
Anatolij Gustschin authored
Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table() for the !GPIOLIB case to prevent build errors. Signed-off-by:
Anatolij Gustschin <agust@denx.de> Reviewed-by:
Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by:
Linus Walleij <linus.walleij@linaro.org>
-
Linus Walleij authored
This reverts commit 8c58f1a7. It turns out that applying these generic properties was premature: the properties used in the driver using this are of unclear electrical nature and the subject needs to be discussed. Signed-off-by:
Linus Walleij <linus.walleij@linaro.org>
-
- 20 May, 2017 2 commits
-
-
James Smart authored
Remove NVMET_FCTGTFEAT_NEEDS_CMD_CPUSCHED. It's unnecessary. Signed-off-by:
James Smart <james.smart@broadcom.com> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
James Smart authored
FC Port roles is a bit mask, not individual values. Correct nvme definitions to unique bits. Signed-off-by:
James Smart <james.smart@broadcom.com> Reviewed-by:
Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
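Why roles stored in a mask need one unique bit each can be shown with a short sketch. The role names below are hypothetical and are not claimed to match the nvme-fc header; the point is only that sequential values such as 1, 2, 3 would make the third role indistinguishable from "first and second together".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical role names; the real nvme-fc definitions may differ. */
enum ex_port_role {
    EX_ROLE_INITIATOR = 1u << 0,
    EX_ROLE_TARGET    = 1u << 1,
    EX_ROLE_DISCOVERY = 1u << 2, /* a sequential "3" here would equal INITIATOR|TARGET */
};

/* Test one role inside a combined roles mask. */
static bool ex_has_role(unsigned int roles, unsigned int role)
{
    return (roles & role) != 0;
}
```

A port that is both initiator and discovery controller is then simply EX_ROLE_INITIATOR | EX_ROLE_DISCOVERY, and testing for the target role on that mask correctly reports false.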
-
- 18 May, 2017 6 commits
-
-
Johan Hovold authored
Add a new interface for registering a serdev controller and clients, and a helper function to deregister serdev devices (or a tty device) that were previously registered using the new interface. Once every driver currently using the tty_port_register_device() helpers have been vetted and converted to use the new serdev registration interface (at least for deregistration), we can move serdev registration to the current helpers and get rid of the serdev-specific functions. Reviewed-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefan Wahren authored
Starting with commit 6fe729c4 ("serdev: Add serdev_device_write subroutine") the function serdev_device_write_buf cannot be used in atomic context anymore (mutex_lock is sleeping). So restore the old behavior. Signed-off-by:
Stefan Wahren <stefan.wahren@i2se.com> Fixes: 6fe729c4 ("serdev: Add serdev_device_write subroutine") Acked-by:
Rob Herring <robh@kernel.org> Reviewed-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
linzhang authored
The function x25_init does not properly unregister related resources in its error path. This results in a kernel oops if x25_init fails, so add the proper unregister calls to the error path. Also adjust the coding style and make x25_register_sysctl properly return failure. Signed-off-by:
linzhang <xiaolou4617@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
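The usual kernel idiom for this kind of fix is goto-based unwinding, where each failure jumps to a label that undoes only the steps that already succeeded. The sketch below uses hypothetical ex_register()/ex_unregister() steps and a test knob (ex_fail_step) in place of the real x25 registration calls.

```c
#include <assert.h>

static int ex_registered;     /* bit i is set while step i is registered */
static int ex_fail_step = -1; /* test knob: which step should fail, -1 for none */

/* Hypothetical registration step; fails on demand. */
static int ex_register(int step)
{
    if (step == ex_fail_step)
        return -1;
    ex_registered |= 1 << step;
    return 0;
}

static void ex_unregister(int step)
{
    ex_registered &= ~(1 << step);
}

/* Init with unwinding: every failure undoes the earlier steps in reverse order. */
static int ex_init(void)
{
    int rc;

    rc = ex_register(0);
    if (rc)
        goto out;
    rc = ex_register(1);
    if (rc)
        goto out_unreg0;
    rc = ex_register(2);
    if (rc)
        goto out_unreg1;
    return 0;

out_unreg1:
    ex_unregister(1);
out_unreg0:
    ex_unregister(0);
out:
    return rc;
}
```

If the third step fails, the labels fall through so that steps 1 and 0 are unregistered in that order and nothing is left behind for later code to trip over.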
-
Peter Chen authored
According to the xHCI spec, Figure 30 (Interrupt Throttle Flow Diagram): "If PCI Message Signaled Interrupts (MSI or MSI-X) are enabled, then the assertion of the Interrupt Pending (IP) flag in Figure 30 generates a PCI Dword write. The IP flag is automatically cleared by the completion of the PCI write." So MSI-enabled HCs don't need to clear the interrupt pending bit, but hcd->irq being 0 does not imply an MSI-enabled HCD. Some dual-role controller software designs set hcd->irq to 0 to keep the HCD from requesting the interrupt, because they want to decide in software when to call usb_hcd_irq. Signed-off-by:
Peter Chen <peter.chen@nxp.com> Signed-off-by:
Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoffer Dall authored
If userspace creates the VCPUs after initializing the VGIC, then we end up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because it is called prior to adding the VCPU into the vcpus array on the VM. There is no tight coupling between the VCPU index and the area of the redistributor region used for the VCPU, so we can simply ensure that all creations of redistributors are serialized per VM, and increment an offset when we successfully add a redistributor. The vgic_register_redist_iodev() function can be called from two paths: vgic_register_all_redist_iodev() which is called via the kvm_vgic_addr() device attribute handler. This path already holds the kvm->lock mutex. The other path is via kvm_vgic_vcpu_init, which is called through a longer chain from kvm_vm_ioctl_create_vcpu(), which releases the kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can simply take this mutex again later for our purposes. Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs") Signed-off-by:
Christoffer Dall <cdall@linaro.org> Tested-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Reviewed-by:
Eric Auger <eric.auger@redhat.com>
-
Thomas Gleixner authored
Enabling the tracer selftest triggers occasionally the warning in text_poke(), which warns when the to be modified page is not marked reserved. The reason is that the tracer selftest installs kprobes on functions marked __init for testing. These probes are removed after the tests, but that removal schedules the delayed kprobes_optimizer work, which will do the actual text poke. If the work is executed after the init text is freed, then the warning triggers. The bug can be reproduced reliably when the work delay is increased. Flush the optimizer work and wait for the optimizing/unoptimizing lists to become empty before returning from the kprobes tracer selftest. That ensures that all operations which were queued due to the probes removal have completed. Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 6274de49 ("kprobes: Support delayed unoptimizing") Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
-
- 17 May, 2017 1 commit
-
-
Johan Hovold authored
Add define for the maximum number of ports on a SuperSpeed hub as per USB 3.1 spec Table 10-5, and use it when verifying the retrieved hub descriptor. This specifically avoids benign attempts to update the DeviceRemovable mask for non-existing ports (should we get that far). Fixes: dbe79bbe ("USB 3.0 Hub Changes") Acked-by:
Alan Stern <stern@rowland.harvard.edu> Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
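A bounds check of this kind can be sketched as below. The limit of 15 ports for a SuperSpeed hub and 31 for other hubs reflects my understanding of the kernel's constants and should be treated as an assumption; the ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_SS_MAXPORTS 15u /* assumed SuperSpeed hub port limit */
#define EX_MAXCHILDREN 31u /* assumed limit for other hubs */

/* Reject hub descriptors that report zero ports or more than the bus allows. */
static bool ex_hub_nports_valid(unsigned int nports, bool superspeed)
{
    unsigned int max = superspeed ? EX_SS_MAXPORTS : EX_MAXCHILDREN;

    return nports >= 1 && nports <= max;
}
```

Validating the count up front means later code that walks per-port state, such as a removable-device mask, never indexes a port that cannot exist.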
-
- 16 May, 2017 2 commits
-
-
J. Bruce Fields authored
This reverts commit 51f56777 "nfsd: check for oversized NFSv2/v3 arguments", which breaks support for NFSv3 ACLs. That patch was actually an earlier draft of a fix for the problem that was eventually fixed by e6838a29 "nfsd: check for oversized NFSv2/v3 arguments". But somehow I accidentally left this earlier draft in the branch that was part of my 4.12 pull request. Reported-by:
Eryu Guan <eguan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
Gao Feng authored
The info->target comes from userspace and it would be used directly. So we need to add the sanity check to make sure it is a valid standard target, although the ebtables tool has already checked it. Kernel needs to validate anything coming from userspace. If the target is set as an evil value, it would break the ebtables and cause a panic. Because the non-standard target is treated as one offset. Now add one helper function ebt_invalid_target, and we would replace the macro INVALID_TARGET later. Signed-off-by:
Gao Feng <gfree.wind@vip.163.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
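A validity check along the lines the message describes might look like the sketch below. It assumes the standard verdicts are encoded as the negative values -1 through -4 and that non-negative values are interpreted as offsets; that encoding and the EX_/ex_ names are assumptions made for illustration, not a copy of the kernel helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: the standard verdicts are the values -1 .. -4. */
#define EX_NUM_STANDARD_TARGETS 4

/* True when a userspace-supplied target is not one of the standard verdicts. */
static bool ex_invalid_target(int target)
{
    return target < -EX_NUM_STANDARD_TARGETS || target >= 0;
}
```

Under these assumptions any non-negative value from userspace would otherwise be treated as an offset, so rejecting it at check time keeps a bad value from ever being followed.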
-
- 15 May, 2017 4 commits
-
-
Pablo Neira Ayuso authored
Andreas reports that the following incremental update using our commit protocol doesn't work. # nft -f incremental-update.nft delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 } delete chain ip filter CIn_1 ... Error: Could not process rule: Device or resource busy The existing code is not well-integrated into the commit phase protocol, since element deletions do not result in refcount decrement from the preparation phase. This results in bogus EBUSY errors like the one above. Two new functions come with this patch: * nft_set_elem_activate() function is used from the abort path, to restore the set element refcounting on objects that occurred from the preparation phase. * nft_set_elem_deactivate() that is called from nft_del_setelem() to decrement set element refcounting on objects from the preparation phase in the commit protocol. The nft_data_uninit() has been renamed to nft_data_release() since this function does not uninitialize any data store in the data register, instead just releases the references to objects. Moreover, a new function nft_data_hold() has been introduced to be used from nft_set_elem_activate(). Reported-by:
Andreas Schultz <aschultz@tpip.net> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Willem de Bruijn authored
When looking up an iptables rule, the iptables binary compares the aligned match and target data (XT_ALIGN). In some cases this can exceed the actual data size to include padding bytes. Before commit f77bc5b2 ("iptables: use match, target and data copy_to_user helpers") the malloc()ed bytes were overwritten by the kernel with kzalloced contents, zeroing the padding and making the comparison succeed. After this patch, the kernel copies and clears only data, leaving the padding bytes undefined. Extend the clear operation from data size to aligned data size to include the padding bytes, if any. Padding bytes can be observed in both match and target, and the bug triggered, by issuing a rule with match icmp and target ACCEPT: iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT Fixes: f77bc5b2 ("iptables: use match, target and data copy_to_user helpers") Reported-by:
Paul Moore <pmoore@redhat.com> Reported-by:
Richard Guy Briggs <rgb@redhat.com> Signed-off-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
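The fix described here, clearing from the data size up to the aligned size, can be sketched in userspace C. The 8-byte alignment and the EX_ALIGN macro are assumptions standing in for XT_ALIGN, and memcpy/memset stand in for the kernel's copy-to-user helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_ALIGN_TO 8u /* assumed alignment; a stand-in for XT_ALIGN */
#define EX_ALIGN(len) (((len) + EX_ALIGN_TO - 1u) & ~(size_t)(EX_ALIGN_TO - 1u))

/*
 * Copy data_len bytes, then zero the padding up to the aligned length.
 * dst must have room for EX_ALIGN(data_len) bytes. Returns bytes written.
 */
static size_t ex_copy_padded(unsigned char *dst, const void *src, size_t data_len)
{
    size_t aligned = EX_ALIGN(data_len);

    memcpy(dst, src, data_len);
    memset(dst + data_len, 0, aligned - data_len); /* the part the bug left undefined */
    return aligned;
}
```

A reader that compares the full aligned length, as the message says the iptables binary does, then sees deterministic zero bytes in the padding instead of whatever the destination buffer held before.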
-
Liping Zhang authored
We can still delete the ct helper even if it is in use, this will cause a use-after-free error. In more detail, I mean: # nfct helper add ssdp inet udp # iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp # nfct helper delete ssdp //--> oops, succeed! BUG: unable to handle kernel paging request at 000026ca IP: 0x26ca [...] Call Trace: ? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4] nf_hook_slow+0x21/0xb0 ip_output+0xe9/0x100 ? ip_fragment.constprop.54+0xc0/0xc0 ip_local_out+0x33/0x40 ip_send_skb+0x16/0x80 udp_send_skb+0x84/0x240 udp_sendmsg+0x35d/0xa50 So add reference count to fix this issue, if ct helper is used by others, reject the delete request. Apply this patch: # nfct helper delete ssdp nfct v1.4.3: netlink error: Device or resource busy Signed-off-by:
Liping Zhang <zlpnobody@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Liping Zhang authored
And convert module_put invocation to nf_conntrack_helper_put, this is prepared for the followup patch, which will add a refcnt for cthelper, so we can reject the deleting request when cthelper is in use. Signed-off-by:
Liping Zhang <zlpnobody@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- 14 May, 2017 2 commits
-
-
Yishai Hadas authored
Root flow table is dynamically changed by the underlying flow steering layer, and IPoIB/ULPs have no idea what will be the root flow table in the future, hence we need a dynamic infrastructure to move Underlay QPs with the root flow table. Fixes: b3ba5149 ("net/mlx5: Refactor create flow table method to accept underlay QP") Signed-off-by:
Erez Shitrit <erezsh@mellanox.com> Signed-off-by:
Maor Gottlieb <maorg@mellanox.com> Signed-off-by:
Yishai Hadas <yishaih@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com>
-
Dan Williams authored
Tetsuo reports: fs/built-in.o: In function `xfs_file_iomap_end': xfs_iomap.c:(.text+0xe0ef9): undefined reference to `put_dax' fs/built-in.o: In function `xfs_file_iomap_begin': xfs_iomap.c:(.text+0xe1a7f): undefined reference to `dax_get_by_host' make: *** [vmlinux] Error 1 $ grep DAX .config CONFIG_DAX=m # CONFIG_DEV_DAX is not set # CONFIG_FS_DAX is not set When FS_DAX=n we can/must throw away the dax code in filesystems. Implement 'fs_' versions of dax_get_by_host() and put_dax() that are nops in the FS_DAX=n case. Cc: <linux-xfs@vger.kernel.org> Cc: <linux-ext4@vger.kernel.org> Cc: Jan Kara <jack@suse.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Tested-by:
Tony Luck <tony.luck@intel.com> Fixes: ef510424 ("block, dax: move 'select DAX' from BLOCK to FS_DAX") Reported-by:
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- 12 May, 2017 5 commits
-
-
Ross Zwisler authored
Patch series "mm,dax: Fix data corruption due to mmap inconsistency", v4. This series fixes data corruption that can happen for DAX mounts when page faults race with write(2) and as a result page tables get out of sync with block mappings in the filesystem and thus data seen through mmap is different from data seen through read(2). The series passes testing with t_mmap_stale test program from Ross and also other mmap related tests on DAX filesystem. This patch (of 4): dax_invalidate_mapping_entry() currently removes DAX exceptional entries only if they are clean and unlocked. This is done via: invalidate_mapping_pages() invalidate_exceptional_entry() dax_invalidate_mapping_entry() However, for page cache pages removed in invalidate_mapping_pages() there is an additional criterion, which is that the page must not be mapped. This is noted in the comments above invalidate_mapping_pages() and is checked in invalidate_inode_page(). For DAX entries this means that we can end up in a situation where a DAX exceptional entry, either a huge zero page or a regular DAX entry, could end up mapped but without an associated radix tree entry. This is inconsistent with the rest of the DAX code and with what happens in the page cache case. We aren't able to unmap the DAX exceptional entry because according to its comments invalidate_mapping_pages() isn't allowed to block, and unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem. Since we essentially never have unmapped DAX entries to evict from the radix tree, just remove dax_invalidate_mapping_entry(). Fixes: c6dcf52c ("mm: Invalidate DAX radix tree entries only if appropriate") Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz Signed-off-by:
Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by:
Jan Kara <jack@suse.cz> Reported-by:
Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> [4.10+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Commit 1f5307b1 ("mm, vmalloc: properly track vmalloc users") has pulled asm/pgtable.h include dependency to linux/vmalloc.h and that turned out to be a bad idea for some architectures. E.g. m68k fails with In file included from arch/m68k/include/asm/pgtable_mm.h:145:0, from arch/m68k/include/asm/pgtable.h:4, from include/linux/vmalloc.h:9, from arch/m68k/kernel/module.c:9: arch/m68k/include/asm/mcf_pgtable.h: In function 'nocache_page': >> arch/m68k/include/asm/mcf_pgtable.h:339:43: error: 'init_mm' undeclared (first use in this function) #define pgd_offset_k(address) pgd_offset(&init_mm, address) as spotted by kernel build bot. nios2 fails for other reason In file included from include/asm-generic/io.h:767:0, from arch/nios2/include/asm/io.h:61, from include/linux/io.h:25, from arch/nios2/include/asm/pgtable.h:18, from include/linux/mm.h:70, from include/linux/pid_namespace.h:6, from include/linux/ptrace.h:9, from arch/nios2/include/uapi/asm/elf.h:23, from arch/nios2/include/asm/elf.h:22, from include/linux/elf.h:4, from include/linux/module.h:15, from init/main.c:16: include/linux/vmalloc.h: In function '__vmalloc_node_flags': include/linux/vmalloc.h:99:40: error: 'PAGE_KERNEL' undeclared (first use in this function); did you mean 'GFP_KERNEL'? which is due to the newly added #include <asm/pgtable.h>, which on nios2 includes <linux/io.h> and thus <asm/io.h> and <asm-generic/io.h> which again includes <linux/vmalloc.h>. Tweaking that around just turns out a bigger headache than necessary. This patch reverts 1f5307b1 and reimplements the original fix in a different way. __vmalloc_node_flags can stay static inline which will cover vmalloc* functions. We only have one external user (kvmalloc_node) and we can export __vmalloc_node_flags_caller and provide the caller directly. This is much simpler and it doesn't really need any games with header files. 
[akpm@linux-foundation.org: coding-style fixes] [mhocko@kernel.org: revert old comment] Link: http://lkml.kernel.org/r/20170509211054.GB16325@dhcp22.suse.cz Fixes: 1f5307b1 ("mm, vmalloc: properly track vmalloc users") Link: http://lkml.kernel.org/r/20170509153702.GR6481@dhcp22.suse.cz Signed-off-by:
Michal Hocko <mhocko@suse.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Deepa Dinamani authored
All uses of the current_fs_time() function have been replaced by other time interfaces. And, its use cases can be fulfilled by current_time() or ktime_get_* variants. Link: http://lkml.kernel.org/r/1491613030-11599-13-git-send-email-deepa.kernel@gmail.com Signed-off-by:
Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by:
Arnd Bergmann <arnd@arndb.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Daniel Borkmann authored
While working on the iproute2 generic XDP frontend, I noticed that as of right now it's possible to have native *and* generic XDP programs loaded at the same time for the case when a driver supports native XDP. The intended model for generic XDP from b5cdae32 ("net: Generic XDP") is, however, that only one out of the two can be present at once, which is also indicated as such in the XDP netlink dump part. The main rationale for generic XDP is to ease accessibility (in case a driver does not yet have XDP support) and to generically provide a semantical model as an example for driver developers wanting to add XDP support. The generic XDP option for an XDP aware driver can still be useful for comparing and testing both implementations. However, it is not intended to have a second XDP processing stage or layer with exactly the same functionality of the first native stage. The only reason could be to have a partial fallback for future XDP features that are not supported yet in the native implementation and we probably also shouldn't strive for such fallback and instead encourage native feature support in the first place. Given there's currently no such fallback issue or use case, let's not go there yet if we don't need to. Therefore, change semantics for loading XDP and bail out if the user tries to load a generic XDP program when a native one is present and vice versa. Another alternative to bailing out would be to handle the transition from one flavor to another gracefully, but that would require bringing the device down, exchanging both types of programs, and bringing it up again in order to avoid a tiny window where a packet could hit both hooks. Given this complicates the logic for just a debugging feature in the native case, I went with the simpler variant. For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae32 and reuse IFLA_XDP_ATTACHED for indicating the mode.
Dumping all or just a subset of flags that were used for loading the XDP prog is suboptimal in the long run since not all flags are useful for dumping and if we start to reuse the same flag definitions for load and dump, then we'll waste bit space. What we really just want is to dump the mode for now. Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0), a program is running at the native driver layer (1). Thus, add a mode that says that a program is running at generic XDP layer (2). Applications will handle this fine in that older binaries will just indicate that something is attached at XDP layer, effectively this is similar to IFLA_XDP_FLAGS attr that we would have had modulo the redundancy. Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
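The "only one flavor at a time" rule and the three dump values (0, 1, 2) from this message can be sketched as below. The enum names, struct ex_dev, and the choice of -EEXIST as the error are assumptions for illustration; only the numeric meanings of the modes come from the commit text.

```c
#include <assert.h>
#include <errno.h>

/* Dump values as described in the message: 0 none, 1 native, 2 generic. */
enum ex_xdp_mode {
    EX_XDP_NONE    = 0,
    EX_XDP_NATIVE  = 1,
    EX_XDP_GENERIC = 2,
};

struct ex_dev {
    enum ex_xdp_mode attached; /* hypothetical per-device state */
};

/* Install (or, with EX_XDP_NONE, remove) a program; refuse to mix flavors. */
static int ex_xdp_install(struct ex_dev *dev, enum ex_xdp_mode want)
{
    if (want == EX_XDP_NONE) {
        dev->attached = EX_XDP_NONE;
        return 0;
    }
    if (dev->attached != EX_XDP_NONE && dev->attached != want)
        return -EEXIST; /* the other flavor is already loaded: bail out */
    dev->attached = want;
    return 0;
}
```

An older reader that only tests the dumped value for non-zero still learns that something is attached, while a newer one can tell the native and generic cases apart.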
-
Daniel Borkmann authored
After commit b5cdae32 ("net: Generic XDP") we automatically fall back to a generic XDP variant if the driver does not support native XDP. Allow for an option where the user can specify that always the native XDP variant should be selected and in case it's not supported by a driver, just bail out. Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-