Yesterday around noon, the internet at my company started acting up. No matter, slowdowns happen and there’s roadwork going on outside: maybe they hit the fiber or something. So we waited.

Then our Samba servers started getting flaky. And the database too. Uh oh… That’s different.

We started investigating. Some machines were dropping ICMP packets like crazy, then recovered, then other machines started to become unpingable too. I fired up Wireshark and discovered an absolute flood of IGMP packets on all the trunks, mostly broadcast from Windows machines. It was so bad that two Linux machines on the same switch couldn’t ping each other reliably if the switch was connected to the intranet.

So we suspected a DDOS attack initiated from within the intranet by an outside attacker. We cut off the internet, but the storm of packets kept on coming. Physically disconnecting machines from the intranet one by one didn’t do a thing either.

Eventually, we started disconnecting each trunk one by one from the main router until we disconnected one and all the activity lights immediately stopped on all the ports. We reconnected it and the crazy traffic resumed.

So we went to that trunk’s subrouter and did the same thing. When we found the cable that stopped all the traffic, we followed it and finally found one lonely $10 ethernet switch with… a cable with both ends plugged into the switch. We disconnected the cable and everything instantly returned to normal.

One measly cable brought the entire company to a standstill for hours! Because half of the software we have to use is cloud crap or needs to call its particular mothership to activate its license, many people couldn’t work anymore for no good technical reason at all while we investigated the networking issue.

Anyway, I thought switches had protections against that sort of loopback connection, and routers prevented circular routes. But there’s theory and there’s reality. Crazy!

  • Nooodel@lemmy.world
    3 days ago

    Turns out a large excellence cluster technical university can do the same and bring down an entire campus for 2 days. Everything is in one big intranet, has main lines with high throughput routed to a large network node and one backup line from the local internet provider. It killed the main lines and thousands of staff plus some tens of thousands of students were connected through a household class fiber connection. That was fun :)

  • mindbleach@sh.itjust.works
    3 days ago

    Accidental ring architecture.

    It is surprising the switch doesn’t occasionally check for zero-ping echo between plugs.

  • stringere@sh.itjust.works
    3 days ago

    I managed to accomplish this at my first IT job, but I used broadcast with Symantec Ghost on a 10 port 100k/1mb hub to bring our office down without knowing any better! They bought me a 10/100 switch to push laptop images with after that incident.

  • Socsa@sh.itjust.works
    3 days ago

    Yup, the good old “loopback FU.”

    Routers do have some protections which can mitigate this, but the entire problem is broadcast flooding, which can’t really be dealt with at layer 2, or even at layer 3 within the same segment. Most places will have no broadcast forwarding between segments, but even if you detect unusual broadcast activity and ban that class of traffic, you break other things. A lot of times it is ARP floods, so it doesn’t happen when the network is static and converged until someone plugs a new laptop in, and then everyone assumes it’s that laptop.

  • Randelung@lemmy.world
    3 days ago

    Our Unifi network collapsed and I have no clue why. One theory was the automatic WiFi bridges that might have acted as loops.

  • Possibly linux@lemmy.zip
    3 days ago

    If you are using a hub then that’s expected as they tend to be one of the main sources of floods on a network.

    If you have managed switches make sure you turn on loop protection and alerting. Ideally you should immediately know when something like that happens.

    Also bonus if you set up VLANs with different subnets. From there practice least privilege and block all forward traffic by default.
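
A minimal sketch of the alerting half of that advice, in Python: count the broadcast frames seen on a port within a sliding time window and flag the port once the count passes a limit. The class name `BroadcastRateMonitor`, its parameters and the thresholds are all made up for illustration; real managed switches expose this as vendor-specific loop-protection or storm-control settings, not as code like this.

```python
from collections import deque


class BroadcastRateMonitor:
    """Flags a port when too many broadcast frames arrive in a time window."""

    def __init__(self, max_frames, window_seconds):
        self.max_frames = max_frames          # hypothetical alert threshold
        self.window_seconds = window_seconds  # length of the sliding window
        self.timestamps = deque()             # arrival times still in the window

    def record(self, timestamp):
        """Record one broadcast frame; return True if the port should be flagged."""
        self.timestamps.append(timestamp)
        # Drop arrivals that have fallen out of the sliding window.
        while self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_frames


monitor = BroadcastRateMonitor(max_frames=3, window_seconds=1.0)
arrivals = [0.0, 0.5, 2.0, 2.1, 2.2, 2.3]
alerts = [monitor.record(t) for t in arrivals]
print(alerts)  # only the last frame pushes the window over the limit
```

The point of the sliding window is that a short, normal burst of broadcasts ages out on its own, while a loop keeps the count pinned above the limit.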

  • pastermil@sh.itjust.works
    3 days ago

    Does that kind of loop really mess with things? ELI5 please!

    Also, what do you mean a lonely switch? Does it have that loop and a port connected to another switch in the network?

    • Socsa@sh.itjust.works
      3 days ago

      Certain types of broadcast traffic always get re-broadcast out of every port on a switch. So if you directly connect two ports, and you get some broadcast coming into the switch, that broadcast will loop forever across that loopback, and then get propagated repeatedly until it hits a broadcast boundary. It’s surprisingly difficult to prevent even with managed switches unless you are willing to hand manage every port and significantly restrict the kind of network services which can flow through it.

      Some devices can detect these loops and break them, but that can have other unintended impacts if your network is designed (some would argue poorly) around using dumb switches to multiply limited Ethernet drops at the edge.
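
To see why a single looped cable never lets the traffic die down, here is a toy simulation in Python. It assumes one dumb switch that floods every broadcast out of all ports except the one it arrived on, a hypothetical `simulate` helper, and one host sending a fixed number of broadcasts per tick; real switches and timing are far messier.

```python
def simulate(num_ports, loop, broadcasts_per_tick, ticks):
    """Return how many frame copies reach host ports on each tick.

    loop is a (port_a, port_b) pair joined by one cable, or None.
    """
    peer = {}
    if loop is not None:
        a, b = loop
        peer = {a: b, b: a}  # a frame sent out of one end arrives at the other
    host_ports = [p for p in range(num_ports) if p not in peer]
    source = host_ports[0]   # the one host that keeps sending broadcasts
    arriving = []            # ingress ports of frames coming back over the loop
    delivered_per_tick = []
    for _ in range(ticks):
        ingress = arriving + [source] * broadcasts_per_tick
        arriving = []
        delivered = 0
        for in_port in ingress:
            # Flood: send a copy out of every port except the ingress port.
            for out_port in range(num_ports):
                if out_port == in_port:
                    continue
                if out_port in peer:
                    arriving.append(peer[out_port])  # re-enters next tick
                else:
                    delivered += 1                   # a host receives a copy
        delivered_per_tick.append(delivered)
    return delivered_per_tick


print(simulate(5, None, 1, 5))    # [4, 4, 4, 4, 4]
print(simulate(5, (3, 4), 1, 5))  # [2, 8, 14, 20, 26]
```

Without the loop, the hosts see the same number of copies every tick. With ports 3 and 4 cabled together, each broadcast leaves two copies circulating forever, so the load climbs for as long as anything on the network keeps broadcasting.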

    • stoy@lemmy.zip
      3 days ago

      IT tech here, yes, yes it can.

      Network infrastructure is incredibly smart in some ways and dumb in others.

      To do an ELI5 answer:

      Imagine you have a container of pearls that you need to sort: red, green and blue pearls each need to be dropped into the red, green or blue hole.

      The container is being refilled, but slowly enough that it only gets a new pearl once you have sorted the previous one.

      The holes are connected to pipes going to separate buckets.

      Everything is fine, but then someone adds a new hole that is multicolored and tells you that all pearls should go there.

      You tell your friends that you have a faster way to deal with the pearls and to send you their pearls.

      The new hole also has a pipe, but that is connected to the container that receives pearls, so every time you drop a pearl into the new hole, it appears in the container again.

      So now you have a situation where you not only get your normal amount of pearls, but everyone else’s pearls, and you also get every pearl you send back again.

      You are smart, quickly realize that something is wrong and call your teacher for help. Networking gear doesn’t have the capability to understand that something is wrong; it just looks at each pearl and not the big picture.

      If we go back to the real world, we have developed tools to deal with this situation: protocols like spanning tree let switches talk to each other and figure out whether there is a physical loop before sending traffic through it.

      There are other tools as well, but they all need to be configured and to be honest, it is easily forgotten or made a low priority since it happens rarely.

      It is something that is often implemented after a big outage.
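
As a rough illustration of what spanning tree ends up deciding, the Python sketch below picks the lowest-named switch as root, walks the network breadth-first, and reports every link that is not needed to reach a switch. This is a deliberate simplification: real spanning tree exchanges BPDUs and compares bridge IDs and path costs, and the `blocked_links` function and the switch names are hypothetical. It also assumes every switch is reachable from the root.

```python
from collections import deque


def blocked_links(links):
    """Given (switch_a, switch_b) links, return the links to block.

    The links that remain form a loop-free tree rooted at the
    lowest-named switch.
    """
    neighbours = {}
    for index, (a, b) in enumerate(links):
        neighbours.setdefault(a, []).append((b, index))
        neighbours.setdefault(b, []).append((a, index))

    root = min(neighbours)  # stand-in for the lowest bridge ID
    visited = {root}
    forwarding = set()      # indexes of links kept in the tree
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for neighbour, index in sorted(neighbours[current]):
            if neighbour not in visited:
                visited.add(neighbour)
                forwarding.add(index)
                queue.append(neighbour)

    return [link for i, link in enumerate(links) if i not in forwarding]


# A chain of three switches, where sw3 has a cable plugged into itself.
print(blocked_links([("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw3")]))
# [('sw3', 'sw3')]
```

A cable with both ends in the same switch shows up here as a link from a switch to itself, which can never be part of the tree and so is always blocked.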

  • dan@upvote.au
    3 days ago

    By “hub”, do you mean switch? I haven’t seen a hub in a very long time. I don’t think I’ve ever seen a 1Gbps one.

    • Possibly linux@lemmy.zip
      3 days ago

      There is such a thing as a small 1Gbps hub designed to handle just a small network. They scare me, as they are cheap on Amazon and could theoretically bring a network to its knees if a random user finds a port that isn’t authenticated.

      • Nougat@fedia.io
        3 days ago

        For the passers-by, in very simple terms:

        A switch maintains a table of the MAC addresses of the devices reachable through each of its ports (the MAC address table, which it learns from the source address of incoming frames). When a frame comes into the switch for a specific destination MAC address, the switch looks that address up in the table and only sends the frame out on the port the destination device (or the next switch towards that device) is connected to. Broadcasts, and frames for addresses it hasn’t learned yet, still go out of every other port.

        A hub doesn’t do any of that. Every packet that comes into the hub gets sent out of every port on the hub, to every device connected to the hub. It’s on the connected devices to discard packets that aren’t addressed to them. On anything but a very small and relatively slow network, this would create an unnecessarily large amount of traffic, not to mention the security issue around sending packets to devices they’re not addressed to.
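
The difference can be sketched in a few lines of Python. This is only a model under simplifying assumptions: the `Hub` and `LearningSwitch` classes are hypothetical, the switch's forwarding table is a plain dictionary from MAC address to port, and details like table ageing and VLANs are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class Hub:
    """Repeats every frame out of every port except the one it came in on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports

    def forward(self, in_port, src, dst):
        return [p for p in range(self.num_ports) if p != in_port]


class LearningSwitch:
    """Learns which port each source address lives on and forwards selectively."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src, dst):
        self.table[src] = in_port  # learn where the sender is
        out_port = self.table.get(dst)
        if dst == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood like a hub would.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []              # destination is on the ingress side already
        return [out_port]


switch = LearningSwitch(4)
print(switch.forward(0, "aa", "bb"))       # unknown destination: [1, 2, 3]
print(switch.forward(1, "bb", "aa"))       # "aa" was learned on port 0: [0]
print(switch.forward(0, "aa", BROADCAST))  # broadcasts always flood: [1, 2, 3]
```

The last line is the part that matters for the outage in this thread: even a switch floods broadcasts, so a physical loop still turns them into a storm.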

  • AstridWipenaugh@lemmy.world
    3 days ago

    I was diagnosing a network bottleneck at a customer site that didn’t make any sense. Literally everything had gigabit connections except one block of cubicles, but all the devices were connected to the same subnet router for that part of the building. Started tracing wires like you did and found that someone didn’t have a long enough cable when building the office and installed a 10 megabit Linksys switch in the drop ceiling to connect two short cables. Rather than fix the cable, the customer just went to Best Buy and bought a gigabit Linksys switch to replace it… A multi-million dollar operation is being held together by a $10 switch…

  • ramble81@lemm.ee
    3 days ago

    I really hope you meant “switch” when saying “hub”. I haven’t seen a hub used in decades. Also your switch should have some level of STP protection enabled to prevent that. Even if someone had a hub with a routing loop, STP would have disabled the ports.

    • dan@upvote.au
      3 days ago

      Basic unmanaged switches often don’t have any sort of protection, and on some fancier managed switches it’s disabled by default (no idea why)

      • Jajcus@sh.itjust.works
        3 days ago

        “no idea why”

        Because it makes initial connection much slower. Dumb switch: you insert a cable and it works. STP-enabled switch: you insert a cable and it takes a while until the port is enabled (unless you do extra configuration, appropriate for your network topology). This is annoying, and for inexperienced users it could seem like the switch ‘does not work’. It is easier to sell a switch without such a feature enabled by default.
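
For a sense of the delay involved: with classic 802.1D spanning tree defaults, a newly connected port spends one forward-delay interval (15 seconds) listening and another one learning before it forwards traffic. The helper below is a hypothetical illustration of that arithmetic in Python; the `edge_port` flag stands in for the kind of extra per-port configuration (often called PortFast or edge port) that skips the wait, and newer rapid spanning tree variants are typically much faster.

```python
FORWARD_DELAY_SECONDS = 15  # classic 802.1D default forward delay


def seconds_until_forwarding(forward_delay=FORWARD_DELAY_SECONDS, edge_port=False):
    """Rough wait before a freshly plugged-in port passes user traffic."""
    if edge_port:
        return 0                 # configured to skip listening and learning
    listening = forward_delay    # port listens for BPDUs, forwards nothing
    learning = forward_delay     # port learns addresses, still forwards nothing
    return listening + learning


print(seconds_until_forwarding())                # 30
print(seconds_until_forwarding(edge_port=True))  # 0
```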

  • mlg@lemmy.world
    3 days ago

    Lol imagine the poor dude in his office who was just bored and thought “what if I plug this cable back into the hub, probably won’t do anything”

    • Socsa@sh.itjust.works
      3 days ago

      In my experience it’s either someone doing it on purpose, or someone accidentally pulling the wrong cable out of a rat’s nest.

    • ExtremeDullard@lemmy.sdf.orgOP
      3 days ago

      Actually this happened in the lab. I know exactly who did this because he told me: we were discussing what had happened and he said “Oh yeah, Daniel and I needed to connect this Windows machine to the intranet quick because we had something urgent to do, and we connected all the ends of the nest of ethernet cables at random until the machine connected. And then we left everything as it was.” But bad luck for us: their machine was connected, but so was that fatal cable, on both ends. It just happened that their machine kept working well enough for them to finish what they were doing without noticing the problems right away.

      And in case you wonder, there’s no penalty in our company for owning up to honest mistakes, so that’s why he readily admitted to it. Only people who never do anything never do anything wrong.

      • GreyEyedGhost@lemmy.ca
        3 days ago

        I do hope you taught him the many better ways of doing this. I absolutely agree with making an environment where mistakes are easily owned up to (I made a mistake that ended up costing my employer over $10k in the last year), but if it isn’t coupled with turning those into learning experiences (here’s why you don’t do that, here’s why this is a better solution) then you just have a lot of mistakes happening over and over again.

  • Orbituary@lemmy.world
    3 days ago

    Just reading the title of the post I knew what happened. I read through the whole thing because your story was good and I was in suspense to figure out if it was a router or voip phone that was the culprit.

    Had this happen at work about a decade ago.

  • deadbeef@lemmy.nz
    3 days ago

    Most hubs didn’t protect you from anything in particular.

    Most of them would forward everything to every port, and some really insane ones would strip out the spanning tree frames that could have prevented a loop.

    It’s been a long time since I did anything that goes as far into a network as the desktop, but 15+ years ago we had a customer ring up with the same sort of complaint. After we followed the breadcrumbs on site we found a little 8 port hub (that we hadn’t supplied) plugged into two wall ports that went to two different Cisco edge switches in the server room, two Cisco phones also with their passthrough ports both patched into the same switch, and then two desktop PCs.

    Amazing.