

January 24, 2006 - 22:00 UTC
Well, we had an outage today to move some disks around. A replacement workunit storage server enclosure arrived and it should have been a simple disk exchange. But things kind of exploded and we didn't even get to that.
To shut down the workunit storage server we first had to unmount it on all the splitter/download machines. But koloth and kryten were having all kinds of mounting troubles. Eventually we had to reboot kryten to clear the pipes, but kryten then took 45 minutes to shut down for reasons that are still unclear to us. We didn't want to power cycle the machine, as the result storage disks attached to it were quite busy doing something.

Eventually the disks fell quiet, but then nothing happened for a good 15 minutes. We gave in and powered it down. We flipped the switch, but it didn't power back up. We tried again, and then smoke and sparks came out from around the power cord.

Uh oh.

The cord was slightly melted. We threw it out, got a new cord and a better surge protector in place, and kryten powered up (phew) but died within 15 seconds. Apparently we had a bad power supply.

By some divine luck we happened to have one spare E3500 power supply kicking around in the basement. It was an easy replacement, and kryten then powered up just fine. That was a major relief, as we don't really have a good backup server for kryten right now. After a careful inspection and remounting everything, we eventually came back online.