• DigitalDilemma@lemmy.ml
    English
    arrow-up
    65
    ·
    4 months ago

    I lost a day’s holiday, and our team spent 8 man days on this entirely preventable mistake.

    $10? Try extending our licence by another year for free, that might start going towards it.

    • MrMcGasion@lemmy.world
      arrow-up
      11
      ·
      4 months ago

      Why would you want another year of their software for free? This is their second screw up (apparently they sent out a bad update that affected some Debian and RHEL machines a couple years ago). I’d be transitioning to a competitor at the first opportunity. It seems they aren’t testing releases before pushing them out to customers, which is about as crazy to me as running alpha software on a production system.

      I’m sure you have reasons, and this isn’t really meant to be directed at you personally, it’s just boggling to me that the IT sector as a whole hasn’t looked at this situation and collectively said “fuck that.”

      • DigitalDilemma@lemmy.ml
        English
        arrow-up
        4
        ·
        4 months ago

        Why would you want another year of their software for free?

        Because AV, like everything else, costs a fortune at enterprise scale.

        And yeah, I do understand your real point, but it’s really hard to choose good software. Every purchasing decision is a gamble and pretty much every time you choose something it’ll go bad sooner or later. (We didn’t imagine VMware would turn into an extortion racket, for example. And we were only saying a few months ago how good value and reliable PRTG was, and they’ve just quadrupled their prices.)

        It doesn’t matter how much due diligence and testing you put into software, it’s really hard to choose good stuff. CrowdStrike was the choice a year ago (the Linux thing was more recent than that), and its detection methods remain world class. Do we trust it? Hell no, but if we change to something else, there are risks and costs to that too.

        • xavier666@lemm.ee
          English
          arrow-up
          3
          ·
          4 months ago

          Do we trust it? Hell no, but if we change to something else, there are risks and costs to that too.

          An unfortunate reality for a lot of medium to large businesses.

        • ayyy@sh.itjust.works
          arrow-up
          2
          arrow-down
          1
          ·
          4 months ago

          Maybe AV, at an enterprise scale, is actually a horrible idea that reduces security, availability, and reliability and should be abolished through policy.

          • DigitalDilemma@lemmy.ml
            English
            arrow-up
            1
            ·
            4 months ago

            Maybe, but it’s not going to happen soon. Any malware type insurance requires effective AV on all devices, and C-levels do love their insurance.

      • Skull giver@popplesburger.hilciferous.nl
        arrow-up
        3
        arrow-down
        1
        ·
        4 months ago

        Tbh the RHEL/Debian bug only occurred because of bugs in Debian and RHEL, they couldn’t really do much about those. Especially the Debian one, which only took place in Linux kernels several versions above the normal Debian kernel.

        CrowdStrike releasing a buggy release can just happen sometimes. I just hope the entire industry may consider that relying on three or four vendors for auto-updating software installed on all corporate computers in the world may not be a good idea.

        This whole thing could’ve been malicious. We got lucky now that it only crashed these systems, just imagine the damage you can do if you hack CrowdStrike themselves and push out a cryptolocker.

        • DigitalDilemma@lemmy.ml
          English
          arrow-up
          1
          ·
          4 months ago

          Not just CrowdStrike - any vendor that does automatic updates, which is more and more each day. Microsoft too big for a bad actor to do as you describe? Nope. Anything relying on free software? Supply chain vulnerabilities are huge and well documented - it’s only a matter of time.

          • Skull giver@popplesburger.hilciferous.nl
            arrow-up
            2
            ·
            4 months ago

            The automatic update part was akin to virus definitions and triggered a bug in code released long before that. Not auto-updating your antivirus software would put a pretty high tax on the IT team as those updates can get released multiple times a day (and during weekends). I agree on not auto updating text editors and such, but there are types of software that need updates quickly and often.

            Supply chain attacks can always work, but this shows how ill-prepared companies are for their systems failing on a scale like this. The fix itself is maybe a minute or two per device if you use Microsoft’s dedicated repair tool, maybe even less if you use that thing with PXE boot, but we’re still weeks away from fixing the damage everywhere.

        • Scrubbles@poptalk.scrubbles.tech
          English
          arrow-up
          1
          ·
          4 months ago

          Nah, I don’t buy that. When you’re in critical infrastructure like that it’s your job to anticipate things like people being above or below versions. This isn’t the latest version of flappy bird, this is kernel level code that needs to be space station level accurate, that they’re pushing remotely to massive amounts of critical infrastructure.

          I won’t say this was one guy, and I definitely don’t think it was malicious. This is just standard corporate software engineering, where deadlines are pushed to the max and QA is seen as an expense, not an investment. They’re learning the harsh realities of cutting QA processes right now, and I say good. There is zero reason a bug of this magnitude should have gone out. I mean, it was an empty file of zeroes. How did they not have any pipelines to check that file, code in the kernel itself to validate the file, or anyone to put eyes on the file before pushing it?
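
          To illustrate the kind of pipeline check meant here, a minimal sketch follows. It is not CrowdStrike's actual tooling: the function name `looks_valid` and the `MIN_SIZE_BYTES` threshold are hypothetical, and a real check would also parse the file's actual format rather than only rejecting empty or all-zero content.

```python
from pathlib import Path

# Hypothetical threshold for illustration only, not a real vendor value.
MIN_SIZE_BYTES = 16


def looks_valid(path: Path, min_size: int = MIN_SIZE_BYTES) -> bool:
    """Reject files that are missing, too small, or made up entirely of zero bytes."""
    if not path.is_file():
        return False
    data = path.read_bytes()
    if len(data) < min_size:
        return False
    # any(data) is False only when every byte in the file is 0x00.
    return any(data)
```

          A release job could call something like this on every content file and refuse to publish if it returns False.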

          This was a massive company-wide fuckup, and it’s going to end up with them reporting to Congress and many, many courts on what happened.

          • suoko@feddit.it
            arrow-up
            1
            ·
            4 months ago

            Even an AI is good enough to avoid (or let someone avoid) pushing a similar bug 🫣

          • Skull giver@popplesburger.hilciferous.nl
            arrow-up
            1
            ·
            4 months ago

            The Windows ordeal was definitely a fuck-up of their testing pipeline, and no doubt has something to do with the mass layoffs earlier this year. I’m sure they’ll be sued into oblivion (though I wonder what making this company go bankrupt or extracting the money out of it through lawsuits will do to all the businesses that currently have it deployed).

            The channel file wasn’t entirely zeroes, not for every customer at least. The code pages that were mapped as callbacks were empty or garbled, but not the entire file (see this thread, for instance).

            However, society shouldn’t crumble because of something like this. It shows how fragile our critical infrastructure really is. I don’t care about airlines and such, but 911 shouldn’t go down because of CrowdStrike or even because of Windows. Even airlines should’ve been able to fly some planes, it’s not like Boeings run Windows.