• Hugin@lemmy.world
    arrow-up
    50
    ·
    2 days ago

    Where I worked we had a very important, time-sensitive project. The server had to do a lot of calculations on a terrain dataset that covered the entire planet.

    The server had a huge amount of RAM and each calculation block took about a week. A block could not be saved until the end of its calculation, and only that server had the RAM to do the work. So if it went down we could lose almost a week's work.

    Project was due in 6 months and calculation time was estimated to be about 5 1/2 months. So we couldn’t afford any interruptions.

    We had bought a huge UPS meant for a whole server rack. For this one server. It could keep the server up for three days. That way even if we lost power over the weekend it would keep going and we would have time to buy a generator.

    One Friday afternoon the building loses power and I go check on the server room. Sure enough, the big UPS with a sign saying "only for project xyz" has a bunch of other servers plugged into it.

    I quickly unplug all but ours. I tell my boss and we go home at 5. Later that day the power comes back on.

    On Monday there are a ton of departments bitching that they came in and their servers were unplugged. Lots of people wanted me fired. My boss backed me and nothing happened, but it was stressful.

    • bitchkat@lemmy.world
      English
      arrow-up
      2
      ·
      12 hours ago

      At a startup a long time ago, I was working on the weekend and brought my 3 year old with me. We had a customer coming in next week and this one machine was 5 days into a 7 day model build.

      We had to go into that office to help someone with something unrelated. The little shit saw the blinking light and headed straight for the button.

      On this computer (HP 710), it didn’t shut off until you released the button. He was actually just holding it down, but got spooked and let go when I tried to get to it.

      The next day our CEO told the guys that built that app that it had to be made so it could recover from crashes and restart from where it left off.
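
For anyone wondering what "restart from where it left off" looks like in practice, here is a minimal checkpointing sketch. It is not the app from the story: the function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON file format, and the toy sum-of-squares workload are all made up for illustration. The idea is just to write progress to disk every so often, atomically, so a power button or a dead UPS only costs you the work since the last save.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file in the same directory, then rename it
    over the old checkpoint so a crash mid-write never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic replacement of the old checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if none exists yet."""
    if not os.path.exists(path):
        return {"next_step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, n_steps, every=10, crash_at=None):
    """Toy long-running job (hypothetical): sums step*step, saving every `every` steps.
    `crash_at` simulates someone hitting the power button at that step."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], n_steps):
        if step == crash_at:
            raise RuntimeError("simulated power loss")
        state["total"] += step * step  # stand-in for the real calculation
        state["next_step"] = step + 1
        if state["next_step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same path after a crash picks up at the last saved step instead of step 0. Whether this is feasible for a real job depends on whether the in-memory state can be serialized at all, which was apparently the problem in the first story.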