Looks like I am the one in every three who will experience something like this?
So I learned yesterday that the Debian “sarge” release is now final – as in it’s now the stable release. As I was mostly already running on the sarge packages I didn’t think it would be a big deal to do an official upgrade to bring everything up to the new stable. Well….
I was being a good little user and followed the upgrade instructions to a “T” but when I went to do the final
aptitude -f --with-recommends dist-upgrade
something went terribly wrong (since I was logging with the “script” command, as the upgrade instructions suggested, I can show you exactly what I did here…):
aptitude -f --with-recommends dist-upgrade
.... (downloading progress) ...
(Reading database ... 84587 files and directories currently installed.)
Preparing to replace fileutils 4.1-10 (using .../fileutils_5.2.1-2_all.deb) ...
Unpacking replacement fileutils ...
dpkg (subprocess): failed to exec rm for cleanup: No such file or directory
dpkg: error while cleaning up:
 subprocess rm cleanup returned error exit status 2
Setting up fileutils (5.2.1-2) ...
dpkg (subprocess): failed to exec rm for cleanup: No such file or directory
dpkg: error processing /var/cache/apt/archives/shellutils_5.2.1-2_all.deb (--unpack):
 subprocess rm cleanup returned error exit status 2
Errors were encountered while processing:
 /var/cache/apt/archives/shellutils_5.2.1-2_all.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
Hmmm, okay, something went wrong in the install. In the past when I had issues with the fileutils package I would just manually force the install by doing a
dpkg -i --force-overwrite /var/cache/apt/archives/fileutils-xxx.deb
and all would be well. Well, I started navigating around to do just that and discovered that I couldn’t run the “ls” or “rm” commands. What? Crap, now that error makes sense: “failed to exec rm for cleanup: No such file or directory” – it couldn’t find the actual rm command. Uh-oh.
Well, disturbing as that was, I still thought I could manually install the fileutils (and shellutils?) package using dpkg, but when I went to do that it failed at the same point. I was unable to install fileutils because the install itself depended on the very commands I was missing!
Here’s where I think I made my life difficult. Thinking I was still in Windows, I did a reboot… but in doing so my eth0 and wlan0 devices couldn’t be brought up due to various errors (all relating, I believe, to the lack of the “rm” command), and thus I couldn’t do any form of apt-get or even scp to or from my machine. Big mistake – I think what I should have done is simply scp the “rm” command from another machine to my foobared laptop and then retry the installation. Then, once fileutils had been successfully installed, I would have been free to resume the upgrade (although still unaware of what originally caused the issue).
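In hindsight, a quick check of which core commands had actually gone missing would have been a better first step than a reboot. Here is a minimal sketch of that check; the function name `missing_tools` is made up for illustration, and it uses only shell builtins so it should still run when the fileutils binaries are gone.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command that is missing
# (or not executable) in the given directory. It relies only on shell
# builtins, so it does not need ls, rm or any other external binary.
missing_tools() {
    dir=$1
    shift
    for tool in "$@"; do
        { [ -f "$dir/$tool" ] && [ -x "$dir/$tool" ]; } || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools /bin rm ls` while the network was still up would have listed what to fetch. Each missing binary could then be pulled over with something like `scp user@otherhost:/bin/rm /bin/rm`, where `user@otherhost` is a placeholder for a second machine, which I assume runs the same architecture and a compatible libc.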
So now that I was basically stranded on a machine without internet access, I thought I could use the Debian install CD to boot into some sort of rescue mode and just copy the rm command from there to my /bin. Well, if at the Debian install boot prompt you type
boot: expert26
you can choose to boot directly into a shell. All right, perfect, now all I had to do was mount my drives and copy “rm” over. Well, I couldn’t mount my drives, I believe because the “reiserfs” filesystem type wasn’t supported by the minimal shell and environment that you boot into with the Debian install CD (I also tried it with a Red Hat workstation install CD I found – same thing). Jeez, it’s just one thing after another…
At this point it was sure looking like I was going to have to install a minimal Debian installation over my existing partition just to get to a point where I could mount drives etc… I did just that by booting into “expert26” and choosing “install minimal installation components” or something like that (after going in and editing the partition table to set the filesystem types without formatting any existing partitions). I selected only the components I thought would be needed to get to where the installation would mount the drives for me – I forget what those were, sorry. When that step concluded I tried the “execute a shell” option one more time, and this time all my drives had been mounted for me at ”/target”. Perfect. Then I just copied ”/bin/rm” to ”/target/bin/” (and did the same for “ls”) and rebooted.
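The copy step from the installer shell can be sketched roughly as follows. The helper name `restore_tools` is invented for this example, and it assumes the rescue environment provides a working `cp` and that the damaged system is mounted under a known root such as /target.

```shell
#!/bin/sh
# Hypothetical helper: copy the named binaries from <src_root>/bin into
# <target_root>/bin, but only where the target copy is absent, so that
# binaries which survived the failed upgrade are left alone.
restore_tools() {
    src_root=$1
    target_root=$2
    shift 2
    for tool in "$@"; do
        if [ -x "$src_root/bin/$tool" ] && [ ! -e "$target_root/bin/$tool" ]; then
            # -p preserves the mode and timestamps of the source binary
            cp -p "$src_root/bin/$tool" "$target_root/bin/$tool"
        fi
    done
}
```

From the installer shell this would be called as `restore_tools "" /target rm ls`, where the empty first argument means "copy from the running environment's own /bin". Unlike what I did by hand, this version skips a tool that already exists on the target, which seemed the safer default.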
It started booting but was throwing all sorts of weird “xxx_MODULE” errors, basically indicating that the install had overwritten some of the lib directories needed by the kernel I was booting into. Luckily I still had the .deb files from my custom compiled kernel (see here and here), so all I did was reboot into the stock 2.4 kernel I still had listed in my grub config (don’t know why I had to go into 2.4 instead of 2.6..?) and then reinstall my custom kernel and headers. Now a reboot brought me back to my fully working environment. Whew – that was 2 hours of unnecessary stress.
At this point I bravely redid the system upgrade and all went well – I hate not knowing what went wrong originally but am thankful to have my upgraded system back from its very disturbing state. For those who want to see everything I did you can take a look at my terminal log here.
So the lessons learned?
- Never reboot Linux and think that it will solve your problems. It’s not Windows.
- Have a rescue CD like the PLD Linux rescue CD or SystemRescueCD around that contains the tools needed to mount your drives so you can boot to a shell and fix stuff. Apparently, the install CD doesn’t count when you have ReiserFS or other non-extX filesystems.
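
To act on that second lesson before the next big upgrade, it helps to know which filesystem types your rescue media must be able to mount. A small sketch, assuming a Linux system that exposes /proc/mounts; the function name `fs_type_of` is invented here, and the optional second argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type mounted at a given
# mount point, read from /proc/mounts (fields: device, mount point, type, ...).
# If a mount point is listed more than once, the last entry wins.
fs_type_of() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mount_point" \
        '$2 == mp { type = $3 } END { if (type != "") print type }' \
        "$mounts_file"
}
```

For example, `fs_type_of /` on my laptop would presumably have printed `reiserfs`, which is exactly what the install CD's shell could not mount.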