Digital Unix machine called a DECStation, similar to the one I messed up as root

How to Completely Mess up a Unix Cluster

In the previous post, I outlined how you could keep your web server running through automatic monitoring, running as the superuser, root. The warning was that doing anything as root is extremely dangerous. Here is what I did, about twenty-five years ago, to bring a cluster of Unix workstations to its knees.

I was a graduate student at a High Energy Physics experiment at Cornell University called CLEO. One of my duties as a research assistant was to take care of our code base on our newly acquired Unix machines – a cluster of what they called DECStations. Then my home university in Syracuse also got a bunch of them, and I became the de facto expert on that cluster, mainly because our professional sysadmin, Judith, was only familiar with VAX/VMS machines. Although built by the same company (Digital), the DECStations ran Unix, which she had no experience with.

Judith created a bunch of init files (.cshrc, .bashrc etc.) and kept them in a directory /init/ for any user to copy to their home directory. I looked at the init folder and realized that she had left these files as executable. Not a big deal, but since I knew that these files are only sourced into the shell, and not executed, and being a perfectionist OCD case, I wanted to turn the execute bit off. So I did (as root):
cd /init
chmod a-x *

To my surprise, ls -l showed no change. Then it occurred to me: * does not expand to files starting with a dot. No problem, I issued the following command.
chmod a-x .*

Everything looked okay in ls -l output. So I went home to kill the rest of the Saturday — like most physics graduate students, my Saturdays were not exactly busy.

Come Monday morning, I heard from Syracuse that the DECStation cluster was completely offline. Nobody (other than root) could log on. They kept getting a cryptic error message saying “no shell.” Judith tried to solve the issue, and called me to figure out what was going on. I couldn’t think of a reason either. So she called Digital engineers. They flew down to Syracuse and tried to diagnose the issue, and failed. Finally they restored the system from a tape backup – about ten hours of work. By Tuesday, everything was back to normal.

Next Saturday, I again looked at the /init/ folder and found the same files with the same file permission mode, execute bit turned on. I was about to issue the same commands, but stopped to think for a minute and typed in:
cd /init
echo .*

Then I saw what the problem was: it echoed . .. .cshrc .bashrc! My chmod a-x was actually taking out the execute bit from .., which happened to be /. On a directory, the execute bit is what lets you pass through it to reach the files underneath, so after that nobody (other than root) could execute anything, including their shells!
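
For what it's worth, here is a minimal sketch of how the same cleanup could be done without touching . or .., assuming a POSIX-style shell. It runs in a throwaway directory from mktemp, and the two file names are just stand-ins for what was in /init. The pattern .[!.]* insists on a second character that is not a dot, so it cannot match . or .. (it would also skip a file named something like ..foo, which is an acceptable trade-off here).

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Stand-ins for the init files, deliberately left executable.
touch .cshrc .bashrc
chmod a+x .cshrc .bashrc

# Look before you chmod: see exactly what the pattern expands to.
# ".[!.]*" needs a non-dot second character, so "." and ".." never match.
matched=$(echo .[!.]*)
echo "pattern expands to: $matched"

# Only now change the permissions, using the same pattern.
chmod a-x .[!.]*

# Confirm the dot files changed and the directory itself did not.
ls -la
```

Running echo .* in the same directory is a cheap way to see what your own shell does with the dangerous pattern; the result depends on the shell and its version, so it is worth checking rather than assuming.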

I guess Unices have evolved since those days, and such things don’t happen any longer. But I’m sure there are other ways in which you can mess up your Linux VPS or dedicated server in such a way that the second or even third tier support at your hosting provider will be left clueless, much like the Digital engineers twenty-five years ago. Be careful!