I Accidentally Deleted /usr

2/03/2021

Friday night, I accidentally deleted the /usr directory on my Arch Linux server. Here's how I recovered the system.

Deleting /usr

Friday night, I was ssh'd into my NAS cleaning out unused files in my home folder. At some point I had extracted something containing a usr directory. It was nearly midnight, and without stopping to think, I ran a very bad command:

rm -rf /usr

I stared for five or ten seconds while the command ran before I suddenly realized: I had not deleted the directory /home/thor/usr, but rather the top-level directory /usr. I hit Ctrl+C as fast as I could, but at this point there was little left.

On most Linux distributions, /usr holds most of the executables for the system. In the case of Arch Linux, the /bin, /lib, /lib64, and /sbin directories are all symlinks (aliases) to locations under /usr as well. This means that almost all commands, programs, and libraries are installed under /usr. The only commands I could run were the Bash built-ins - which notably don't even include ls.
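To illustrate what is still possible in that state, here is a minimal sketch of a directory listing that relies only on shell built-ins (globbing, printf, and the [ test). The function name ls_builtin is made up for this example, and it skips hidden files because a plain * glob does not match them.

```shell
# List a directory using only shell built-ins (no ls, no external programs).
# ls_builtin is a hypothetical helper name, not a standard command.
ls_builtin() {
    for entry in "${1:-.}"/*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        # Strip the leading directory part and print just the entry name.
        printf '%s\n' "${entry##*/}"
    done
}
```

Calling ls_builtin / would print the names of the top-level directories. For a quick look, echo /* does much the same in a single line.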

A Plan

I run ZFS on my NAS, but the OS directories are stored on a regular ext4 drive. This means there were no snapshots to roll back to. I knew I would have to boot from a flash drive to try repairing the system, so I powered off my NAS and began combing the internet for suggestions. On an Arch Linux forum, I found a snippet of wisdom:

1) mount your partitions
2) pacman -r /mnt -S base
3) reboot into multi-user target on your system
4) pacman -Qk to find remaining broken packages
5) pacman -Syu

While I wouldn't end up exactly following these steps, this gave me a great starting-off point to do my repair. In some ways, it could be said that /usr is the most critical directory in a Linux installation, since it holds the executables - but at the same time, it is the most replaceable, since there are no configuration files or pieces of personal data stored there. Hence, if I could reinstall all the packages, the system would theoretically be completely back to normal.

What Worked

My actual recovery process took several (failed) attempts - I'll describe the steps that actually worked, ordered and arranged as if I got them right the first time (for easier reading).

After booting from an external flash drive, I mounted my OS disk under /mnt, then mounted my EFI partition under /mnt/boot. Reading pacman's documentation revealed that the -r option can be used to specify a "new root" from which pacman should operate. In this case, with my real OS root mounted under /mnt, passing -r /mnt to pacman would tell it to operate on my real OS rather than the booted flash drive.

Before I could start reinstalling any packages, I would need the /usr directory to exist and have an intact structure - I had no idea what subdirectories may have been destroyed already. This was easy enough to do by making a new temp directory /mnt/root2, then using pacstrap -i /mnt/root2 base. This would populate /mnt/root2 with a fresh base installation of Arch. Once completed, I replaced my OS's /usr (/mnt/usr) with /mnt/root2/usr. This, I hoped, would allow pacman to successfully reinstall packages.
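The steps so far could be sketched roughly as follows. This is a dry-run outline rather than a record of the exact commands: the device names /dev/sdX1 and /dev/sdX2 are placeholders, the run and prepare_usr helpers are made up for the example, and moving the damaged directory aside to /mnt/usr.broken (instead of deleting it) is an assumption on my part.

```shell
# DRY_RUN=1 (the default) prints each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

ROOT_DEV=${ROOT_DEV:-/dev/sdX2}   # placeholder: the OS root partition
EFI_DEV=${EFI_DEV:-/dev/sdX1}     # placeholder: the EFI partition

prepare_usr() {
    run mount "$ROOT_DEV" /mnt            # real OS root
    run mount "$EFI_DEV" /mnt/boot        # EFI partition
    run mkdir /mnt/root2                  # scratch root for a fresh install
    run pacstrap -i /mnt/root2 base       # fresh base system into the scratch root
    run mv /mnt/usr /mnt/usr.broken       # assumption: keep the damaged /usr around
    run mv /mnt/root2/usr /mnt/usr        # drop the fresh /usr into place
}
```

Running prepare_usr as-is only prints the commands, which makes it easy to review them before setting DRY_RUN=0 on a live rescue system.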

Ideally I would want to reinstall the same versions of all the packages that were previously installed on the system - I figured this would minimize the chance of conflicting libraries. I knew most or all of my system's packages would be stored in the cache, /mnt/var/cache/pacman/pkg. The cache has several versions of most packages in it, but I wanted to get a list of just the most recent of each one. After several minutes of examining, trial, and error, I came up with the following:

ls /mnt/var/cache/pacman/pkg/ | sed 's/-/ /' | sort -r | rev | uniq -f 1 | rev | sed 's/ /-/' | sort

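This translates roughly to "list all the packages, replace the first dash in each name with a space, sort them in reverse-alphabetical order, flip each name to be backwards (package 3.1 -> 1.3 egakcap), keep only one occurrence of each name while ignoring the first field (the reversed version), reverse the names back, put the dashes back, sort alphabetically". Technically, this command is flawed because it replaces the first dash in the file name, but it should replace the dash that separates the package name from the version. Cache file names have the form name-version-release-arch.pkg.tar.*, so that is the third-from-last dash, and any package with a dash in its own name gets split in the wrong place. The command also compares versions alphabetically, so 1.9 sorts above 1.10. In my case, I lucked out and this didn't cause any problems, but I should fix it if I ever use the command again.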
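For comparison, here is one way the same idea could be written so that it splits on the right dash. It is a sketch, not what I ran: the function name latest_packages is invented, it assumes cache file names of the form name-version-release-arch.pkg.tar.*, and it relies on GNU sort's -V (version sort), which can disagree with pacman's own version comparison in edge cases such as epochs.

```shell
# Print the newest cached file for each package name.
# Assumed file name layout: name-pkgver-pkgrel-arch.pkg.tar.zst
latest_packages() {
    cache_dir=$1
    tab=$(printf '\t')
    for path in "$cache_dir"/*.pkg.tar.*; do
        [ -e "$path" ] || continue              # empty cache: glob did not match
        case $path in *.sig) continue ;; esac   # skip detached signature files
        file=${path##*/}
        # Drop the last three dash-separated fields (pkgver, pkgrel, arch.ext).
        # What remains is the package name, even if it contains dashes itself.
        name=${file%-*-*-*}
        printf '%s\t%s\n' "$name" "$file"
    done |
        sort -t "$tab" -k2,2V |                 # version sort: 2.9 before 2.10
        awk -F '\t' '{ newest[$1] = $2 } END { for (n in newest) print newest[n] }' |
        sort
}
```

Running latest_packages /mnt/var/cache/pacman/pkg prints one file name per package. Because the rows are version-sorted before awk sees them, the last row stored for each name is the highest version found.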

Once I was satisfied with the results, I saved the list to a variable:

PACKAGES="$(ls /mnt/var/cache/pacman/pkg/ | sed 's/-/ /' | sort -r | rev | uniq -f 1 | rev | sed 's/ /-/' | sort)"

Then I prepended the full path to each name in the list:

for i in $PACKAGES ; do PACKAGES2="/mnt/var/cache/pacman/pkg/$i $PACKAGES2"; done

At this point, the variable $PACKAGES2 held the full path to every package I wanted installed on the system, so I passed it to pacman:

pacman -r /mnt -U $PACKAGES2

I also had to use the --assume-installed option for one particular Python package that pacman tried (and failed) to download. I figured whatever the package was, it probably wasn't critical for the system to operate, and this option tells pacman to treat the package as already installed rather than trying to install it. Additionally, I had to use the --overwrite option to tell pacman to overwrite certain files that already existed (because I had prepopulated the /usr directory with a base installation). It took several tries before I had a full list of all the files pacman needed permission to overwrite.
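Put together, the final reinstall command might be assembled something like this. Treat it as a sketch: the helper name reinstall_from_cache is invented, and the package=version pair and the overwrite glob in the usage line are placeholders, not the values I actually used (I no longer have them). By default it only prints the command.

```shell
# Assemble the pacman -U command. DRY_RUN=1 (the default) only prints it.
DRY_RUN=${DRY_RUN:-1}

# Arguments: root, "name=version" to assume installed, overwrite glob,
# then one or more package file names from the cache.
reinstall_from_cache() (
    root=$1; assume=$2; overwrite=$3
    shift 3
    cache=$root/var/cache/pacman/pkg

    # Replace each file name argument with its full path in the cache.
    count=$#
    for pkg in "$@"; do
        set -- "$@" "$cache/$pkg"
    done
    shift "$count"

    set -- pacman -r "$root" -U \
        --assume-installed "$assume" --overwrite "$overwrite" "$@"
    if [ "$DRY_RUN" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
)
```

For example, reinstall_from_cache /mnt 'example-pkg=1.0-1' 'usr/*' $PACKAGES would print the full command line for review. As far as I understand it, --overwrite takes a glob that is matched against file paths inside the target root, so it can be narrowed to just the conflicting files.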

Once I successfully reinstalled all the packages in the cache, I decided I would try to chroot into the system so I could fully repair the installation from the inside (I couldn't do this before because there would have been no executables for me to run from inside the chroot).

arch-chroot /mnt bash

Inside the chroot, I was able to use pacman to get a list of which files the system had concerns about:

pacman -Qkk

There was quite a lot wrong with the system. I decided I would first try running a full update with pacman -Syu, then try reinstalling all packages on the system. Both of these likely required use of the --overwrite flag again (although I admittedly cannot remember).

Running pacman -Qqe > /packagelist.txt gave me a file listing all the explicitly installed packages by name (the q flag leaves out the version numbers). Trying to install using pacman -S $(cat /packagelist.txt) told me which couldn't be found in the repositories (due to being installed from other sources, such as the AUR). At that point I just had to go through /packagelist.txt and delete the 4 or 5 offending packages. Running pacman -S $(cat /packagelist.txt) again successfully reinstalled all the packages, and pacman -Qkk confirmed that there were no malignant files to be found!
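The hand-editing step could also be scripted. The sketch below assumes two plain text files with one package name per line: the full package list, and a list of the names pacman reported as not found. The function name filter_packages is made up for the example.

```shell
# Print the package list in $1 minus every name listed in $2.
# Both files are assumed to hold one package name per line.
filter_packages() {
    # -x: match whole lines, -F: fixed strings, -f: read patterns from a file,
    # -v: keep only the lines that match none of them.
    grep -v -x -F -f "$2" "$1"
}
```

Something like pacman -S $(filter_packages /packagelist.txt /notfound.txt) would then skip the foreign packages, where /notfound.txt is a hypothetical file holding the rejected names. I believe pacman -Qqen (explicitly installed, names only, native to the sync repositories) would produce a similar list directly, though I did not try it at the time.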

At this point, I also reinstalled grub and rebuilt the initramfs. These steps were probably not needed, but I didn't want to reboot into disappointment again, and I wasn't sure if all my "repair" work may have corrupted libraries needed by the ZFS module.

Rebooting was a success!

Aftermath

A lot of initial wisdom I found on the internet said recovering from a deleted /usr was not worth the time, and it would probably not result in a stable system even if successful. I found the optimism and DIY attitude of Arch Linux resources really shined in helping me get this problem repaired.

Now, I'm backing up my OS installation nightly. In the future, recovering should be as easy as booting from an external flash drive and copying whatever directories I need (or even the entire OS) back into place from the backup.
