1. 30 Nov, 2013 6 commits
  2. 29 Nov, 2013 4 commits
  3. 28 Nov, 2013 5 commits
  4. 23 Nov, 2013 8 commits
  5. 21 Nov, 2013 6 commits
    • Herbert Xu's avatar
      gso: handle new frag_list of frags GRO packets · 9d8506cc
      Herbert Xu authored
      Recently GRO started generating packets with frag_lists of frags.
      This was not handled by GSO, thus leading to a crash.
      
      Thankfully these packets are of a regular form and are easy to
      handle.  This patch handles them in two ways.  For completely
      non-linear frag_list entries, we simply continue to iterate over
      the frag_list frags once we exhaust the normal frags.  For frag_list
      entries with linear parts, we call pskb_trim on the first part
      of the frag_list skb, and then process the rest of the frags in
      the usual way.
      
      This patch also kills a chunk of dead frag_list code that has
      obviously never ever been run since it ends up generating a bogus
      GSO-segmented packet with a frag_list entry.
      
      Future work is planned to split super big packets into TSO
      ones.
      
      Fixes: 8a29111c
      
       ("net: gro: allow to build full sized skb")
      Reported-by: default avatarChristoph Paasch <christoph.paasch@uclouvain.be>
      Reported-by: default avatarJerry Chu <hkchu@google.com>
      Reported-by: Sander Eikelenboom <linux@eik...
      9d8506cc
    • Johannes Berg's avatar
      genetlink: fix genlmsg_multicast() bug · 220815a9
      Johannes Berg authored
      
      Unfortunately, I introduced a tremendously stupid bug into
      genlmsg_multicast() when doing all those multicast group
      changes: it adjusts the group number, but then passes it
      to genlmsg_multicast_netns() which does that again.
      
      Somehow, my tests failed to catch this, so add a warning
      into genlmsg_multicast_netns() and remove the offending
      group ID adjustment.
      
      Also add a warning to the similar code in other functions
      so people who misuse them are more loudly warned.
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      220815a9
    • Daniel Borkmann's avatar
      packet: fix use after free race in send path when dev is released · e40526cb
      Daniel Borkmann authored
      Salam reported a use after free bug in PF_PACKET that occurs when
      we're sending out frames on a socket bound device and suddenly the
      net device is being unregistered. It appears that commit 827d9780
      introduced a possible race condition between {t,}packet_snd() and
      packet_notifier(). In the case of a bound socket, packet_notifier()
      can drop the last reference to the net_device and {t,}packet_snd()
      might end up suddenly sending a packet over a freed net_device.
      
      To avoid reverting 827d9780 and thus introducing a performance
      regression compared to the current state of things, we decided to
      hold a cached RCU protected pointer to the net device and maintain
      it on write side via bind spin_lock protected register_prot_hook()
      and __unregister_prot_hook() calls.
      
      In {t,}packet_snd() path, we access this pointer under rcu_read_lock
      through packet_cached_dev_get() that holds reference to the device
      to prevent it from being freed through packet_notifier() while
      we're in send path. This is okay to do as dev_put()/dev_hold() are
      per-cpu counters, so this should not be a performance issue. Also,
      the code simplifies a bit as we don't need need_rls_dev anymore.
      
      Fixes: 827d9780
      
       ("af-packet: Use existing netdev reference for bound sockets.")
      Reported-by: default avatarSalam Noureddine <noureddine@aristanetworks.com>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarSalam Noureddine <noureddine@aristanetworks.com>
      Cc: Ben Greear <greearb@candelatech.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e40526cb
    • Michael Opdenacker's avatar
      wimax: remove dead code · aec6f90d
      Michael Opdenacker authored
      
      This removes a code line that is between a "return 0;" and an error label.
      This code line can never be reached.
      
      Found by Coverity (CID: 1130529)
      Signed-off-by: default avatarMichael Opdenacker <michael.opdenacker@free-electrons.com>
      Acked-by: default avatarJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aec6f90d
    • Hannes Frederic Sowa's avatar
      net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) · 68c6beb3
      Hannes Frederic Sowa authored
      
      In that case it is probable that kernel code overwrote part of the
      stack. So we should bail out loudly here.
      
      The BUG_ON may be removed in future if we are sure all protocols are
      conformant.
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      68c6beb3
    • Hannes Frederic Sowa's avatar
      net: rework recvmsg handler msg_name and msg_namelen logic · f3d33426
      Hannes Frederic Sowa authored
      
      This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
      set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
      to return msg_name to the user.
      
      This prevents numerous uninitialized memory leaks we had in the
      recvmsg handlers and makes it harder for new code to accidentally leak
      uninitialized memory.
      
      Optimize for the case recvfrom is called with NULL as address. We don't
      need to copy the address at all, so set it to NULL before invoking the
      recvmsg handler. We can do so, because all the recvmsg handlers must
      cope with the case a plain read() is called on them. read() also sets
      msg_name to NULL.
      
      Also document these changes in include/linux/net.h as suggested by David
      Miller.
      
      Changes since RFC:
      
      Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
      non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
      affect sendto as it would bail out earlier while trying to copy-in the
      address. It also more naturally reflects the logic by the callers of
      verify_iovec.
      
      With this change in place I could remove "
      if (!uaddr || msg_sys->msg_namelen == 0)
      	msg->msg_name = NULL
      ".
      
      This change does not alter the user visible error logic as we ignore
      msg_namelen as long as msg_name is NULL.
      
      Also remove two unnecessary curly brackets in ___sys_recvmsg and change
      comments to netdev style.
      
      Cc: David Miller <davem@davemloft.net>
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f3d33426
  6. 20 Nov, 2013 3 commits
    • Ding Tianhong's avatar
      bridge: flush br's address entry in fdb when remove the · f8730420
      Ding Tianhong authored
      
       bridge dev
      
      When the following commands are executed:
      
      brctl addbr br0
      ifconfig br0 hw ether <addr>
      rmmod bridge
      
      The calltrace will occur:
      
      [  563.312114] device eth1 left promiscuous mode
      [  563.312188] br0: port 1(eth1) entered disabled state
      [  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
      [  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
      [  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
      [  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
      [  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
      [  563.468209] Call Trace:
      [  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
      [  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
      [  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
      [  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
      [  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
      [  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
      [  570.377958] Bridge firewalling registered
      
      --------------------------- cut here -------------------------------
      
      The reason is that when the bridge dev's address is changed, the
      br_fdb_change_mac_address() will add new address in fdb, but when
      the bridge was removed, the address entry in the fdb did not free,
      the bridge_fdb_cache still has objects when destroy the cache, Fix
      this by flushing the bridge address entry when removing the bridge.
      
      v2: according to the Toshiaki Makita and Vlad's suggestion, I only
          delete the vlan0 entry, it still have a leak here if the vlan id
          is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
          to flush all entries whose dst is NULL for the bridge.
      Suggested-by: default avatarToshiaki Makita <toshiaki.makita1@gmail.com>
      Suggested-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f8730420
    • Vlad Yasevich's avatar
      net: core: Always propagate flag changes to interfaces · d2615bf4
      Vlad Yasevich authored
      The following commit:
          b6c40d68
          net: only invoke dev->change_rx_flags when device is UP
      
      tried to fix a problem with VLAN devices and promiscuouse flag setting.
      The issue was that VLAN device was setting a flag on an interface that
      was down, thus resulting in bad promiscuity count.
      This commit blocked flag propagation to any device that is currently
      down.
      
      A later commit:
          deede2fa
          vlan: Don't propagate flag changes on down interfaces
      
      fixed VLAN code to only propagate flags when the VLAN interface is up,
      thus fixing the same issue as above, only localized to VLAN.
      
      The problem we have now is that if we have create a complex stack
      involving multiple software devices like bridges, bonds, and vlans,
      then it is possible that the flags would not propagate properly to
      the physical devices.  A simple examle of the scenario is the
      following:
      
        eth0----> bond0 ----> bridge0 ---> vlan50
      
      If bond0 or eth0 happen to be down at the time bond0 is added to
      the bridge, then eth0 will never have promisc mode set which is
      currently required for operation as part of the bridge.  As a
      result, packets with vlan50 will be dropped by the interface.
      
      The only 2 devices that implement the special flag handling are
      VLAN and DSA and they both have required code to prevent incorrect
      flag propagation.  As a result we can remove the generic solution
      introduced in b6c40d68
      
       and leave
      it to the individual devices to decide whether they will block
      flag propagation or not.
      Reported-by: default avatarStefan Priebe <s.priebe@profihost.ag>
      Suggested-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d2615bf4
    • Alexei Starovoitov's avatar
      ipv4: fix race in concurrent ip_route_input_slow() · dcdfdf56
      Alexei Starovoitov authored
      CPUs can ask for local route via ip_route_input_noref() concurrently.
      if nh_rth_input is not cached yet, CPUs will proceed to allocate
      equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
      via rt_cache_route()
      Most of the time they succeed, but on occasion the following two lines:
      	orig = *p;
      	prev = cmpxchg(p, orig, rt);
      in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
      But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
      so dst is leaking. dst_destroy() is never called and 'lo' device
      refcnt doesn't go to zero, which can be seen in the logs as:
      	unregister_netdevice: waiting for lo to become free. Usage count = 1
      Adding mdelay() between above two lines makes it easily reproducible.
      Fix it similar to nh_pcpu_rth_output case.
      
      Fixes: d2d68ba9
      
       ("ipv4: Cache input routes in fib_info nexthops.")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dcdfdf56
  7. 19 Nov, 2013 8 commits