SDB:Booting with the initial ramdisk

Version: 6.3 -

Problem

Once the Linux kernel has been loaded and has mounted the root filesystem ('/'), programs can be executed and further kernel modules can be loaded to provide additional functionality.

But before the root filesystem can be mounted, several conditions have to be met: the kernel needs the appropriate drivers to access the device containing the root filesystem (especially SCSI drivers). Additionally, the kernel needs the code that enables it to read the filesystem (ext2, reiserfs, romfs etc.). Furthermore, the root filesystem could be encrypted, in which case the user has to be prompted for a key or password.

Taking the SCSI drivers as an example, various solutions come to mind. The kernel could contain all possible drivers. This is problematic, as some drivers do not work well alongside others; furthermore, the kernel would get rather large. Another possibility would be to provide different kernels that each contain only one or very few SCSI drivers. This path also has its problems, as it would sharply increase the number of different kernels needed, a problem made even more acute by the need to supply differently optimized kernels (Pentium optimized, SMP).

Loading the SCSI drivers as modules leads to the general problem that the initial ramdisk is designed to solve: providing a way to run userspace programs even before the root filesystem is mounted.

Concept of the initial ramdisk

The initial ramdisk (also called initdisk or initrd) solves the problems detailed above. The Linux kernel offers the possibility of loading a (small) filesystem into a ramdisk and executing programs from there before the actual root filesystem gets mounted. The loading of the initrd is done by the bootloader (LILO, loadlin etc.); all these bootloaders only need BIOS routines to load data from the boot media. If the bootloader can load the kernel, it can load the initial ramdisk just as well. So special drivers are not necessary.

Flow of operation when booting with initrd

The bootloader loads the kernel and the initrd into memory and starts the kernel, telling it that an initrd is present and where in memory the kernel will find it.

If the initrd was compressed (which is typically the case), the kernel decompresses and mounts it as a temporary root filesystem. After having done so, a special program named linuxrc, contained in the initrd, is executed. This program is able to perform all tasks necessary to enable the kernel to mount the real root filesystem. When linuxrc terminates, the (temporary) initrd is unmounted and booting continues by mounting the real root filesystem. The mounting of the initrd and the execution of linuxrc may thus be viewed as a short intermezzo during the normal boot procedure.

More precisely: after switching to the real root device, the kernel tries to remount the initial ramdisk at /initrd. Only if this does not work (e.g. because the mount point /initrd does not exist) does the kernel try to unmount the initrd.

linuxrc

The program linuxrc only has to meet the following requirements: it must be named 'linuxrc' and it must reside in the root directory of the initrd. Besides that, the kernel has to be able to execute it. This means that linuxrc may well be linked dynamically, but then the shared libraries it needs have to be available under /lib in the initrd. linuxrc may also be a shell script, in which case a shell has to be available under /bin. In short, the initrd must contain a minimal Linux system that allows the program linuxrc to be executed. The installation of SuSE Linux uses a statically linked linuxrc in order to keep the initrd as small as possible (space on the boot floppies is scarce). linuxrc is run with root privileges.
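These requirements can be checked on an unpacked initrd tree before the image is built. The following is only a minimal sketch: the function name check_linuxrc and its messages are made up for illustration, and it assumes the initrd contents are available as a directory. It does not check the shared libraries of a dynamically linked linuxrc.

```shell
# Hypothetical helper: verify that an unpacked initrd tree ($1) contains
# a usable /linuxrc. Not part of mk_initrd or any SuSE tool.
check_linuxrc() {
    tree=$1
    [ -f "$tree/linuxrc" ] || { echo "missing /linuxrc"; return 1; }
    [ -x "$tree/linuxrc" ] || { echo "/linuxrc is not executable"; return 1; }

    # Read the first line; a "#!" prefix marks linuxrc as a script.
    first_line=
    IFS= read -r first_line < "$tree/linuxrc" || true
    case $first_line in
        '#!'*)
            interp=${first_line#??}    # drop the leading "#!"
            interp=${interp#" "}       # drop one optional space
            interp=${interp%% *}       # drop interpreter arguments
            # The interpreter must exist inside the tree, not on the host.
            [ -x "$tree$interp" ] || {
                echo "interpreter $interp missing from tree"
                return 1
            }
            ;;
    esac
    echo "ok"
}
```

Called as check_linuxrc /path/to/tree, it prints "ok" only if /linuxrc exists, is executable and, for a script, its interpreter is present inside the tree.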

The real root filesystem

As soon as linuxrc terminates, the initrd is unmounted and discarded, and booting continues normally with the kernel mounting the real root filesystem. linuxrc can change what gets mounted as the root filesystem. To do so, linuxrc only has to mount the /proc filesystem and write the device number of the real root filesystem in numerical form to /proc/sys/kernel/real-root-dev.
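The "numerical form" is the classic device number, major * 256 + minor. The helpers below are a sketch of how a shell linuxrc might compute and write it. The function names are hypothetical, and the hexadecimal notation follows the example in the kernel's initrd.txt.

```shell
# Illustrative helpers for a linuxrc; the function names are hypothetical.

# Print the classic 16-bit device number (major * 256 + minor) in hex.
root_dev_number() {
    printf '0x%x\n' "$(( $1 * 256 + $2 ))"
}

# Write the device number to the kernel's real-root-dev file.
# The optional third argument overrides the target path (for dry runs).
set_real_root() {
    root_dev_number "$1" "$2" > "${3:-/proc/sys/kernel/real-root-dev}"
}
```

For /dev/hda1 (major 3, minor 1), root_dev_number 3 1 prints 0x301. Once /proc is mounted, set_real_root 8 1 would select /dev/sda1 (major 8, minor 1) as the real root device.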

Bootloader

Most bootloaders (especially LILO, loadlin and syslinux) are able to deal with an initrd. They are instructed to use an initrd as follows:

1. LILO

By adding the following line to /etc/lilo.conf

initrd=/boot/initdisk.gz

The file "/boot/initdisk.gz" is the initial ramdisk. It may (but needn't) be compressed.

2. loadlin.exe

Called as:

loadlin <kernel image> initrd=C:\linux\initdisk.gz <parameter>

3. syslinux

By adding the following line to syslinux.cfg

append initrd=initdisk.gz <additional parameters>

Use of initrd at SuSE

System installation

The initrd has been used for the installation for quite some time now: The user is able to load modules in linuxrc and enter the information (such as the source medium, for example) needed for the installation. Linuxrc then starts YaST, which does the installation. When YaST has done its work, it tells linuxrc where the root filesystem of the newly installed system lies. Linuxrc writes this value to /proc, exits, and the kernel boots the freshly installed system.

So during an installation of SuSE Linux you boot directly into the system that has just been installed. A real reboot only happens if the currently running kernel is incompatible with the modules installed in the system. As SuSE Linux uses a uni-processor kernel for installation, this currently only happens if an SMP kernel and matching modules were installed. In order to be able to use all modules, the freshly installed SMP kernel has to be booted.

Booting the installed system

In the past, YaST offered more than 40 kernels for installation, where those kernels mostly differed in the specific SCSI driver they contained. This was necessary in order to be able to mount the root filesystem after booting. Additional drivers could later be loaded as modules.

As we now also offer optimized kernels, this approach is no longer sustainable (it would require far more than 100 kernel images).

Therefore an initrd is now also used to start the normal system. The mechanism is similar to the one used during an installation, but the linuxrc used here is only a simple shell script whose only task is to load certain modules. Typically, this is just one module, namely the SCSI driver needed to access the root filesystem.

Creating an initrd

The initrd is created with the script mk_initrd. In SuSE Linux, the modules to load are specified through the variable INITRD_MODULES in /etc/rc.config. After an installation this variable is preset to the correct values (as linuxrc knows which modules got loaded). Note that the modules are loaded in exactly the order in which they appear in INITRD_MODULES. This is especially important if more than one SCSI driver is used, as otherwise the disk names would change. Strictly speaking, it would suffice to load only the SCSI driver needed for accessing the root filesystem. But as the automatic loading of additional SCSI drivers is not without pitfalls (how should it be triggered when disks are also attached to the second SCSI controller?), we load all SCSI drivers used during the installation by means of the initrd.
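The ordering rule can be illustrated with a small loop. This is only a sketch of what a module-loading linuxrc might look like, not the script that mk_initrd actually generates. The variables INSMOD and MODULE_DIR, the path /lib/modules/<name>.o and the function name are assumptions made for this example.

```shell
# Hypothetical sketch; INSMOD and MODULE_DIR are illustrative assumptions,
# not settings read by mk_initrd.
INSMOD=${INSMOD:-insmod}
MODULE_DIR=${MODULE_DIR:-/lib/modules}

load_modules() {
    # Word splitting on $1 is intentional: the list is space-separated,
    # and the loop keeps the order in which the names appear.
    for module in $1; do
        "$INSMOD" "$MODULE_DIR/$module.o" ||
            echo "linuxrc: could not load $module" >&2
    done
}
```

With INITRD_MODULES="aic7xxx ncr53c8xx" (example driver names), load_modules "$INITRD_MODULES" would load aic7xxx first, so the disks on that controller get the first device names. A failed load is reported but does not abort the loop.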

The current mk_initrd checks whether a SCSI driver is actually needed to access the root filesystem. If you call mk_initrd on a system where / is on an EIDE disk, it won't create an initrd, as none is necessary: SuSE kernels always contain the EIDE drivers. But as the number of special EIDE controllers on offer is increasing, it's foreseeable that in the future an initrd will also be used for booting the installed system in these cases.

Important:

As the bootloader loads the initrd in exactly the same way as the kernel (LILO notes the location of the file in its map file), LILO must be reinstalled after every change to the initrd! After running mk_initrd, it is therefore also necessary to run lilo!
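One way to avoid forgetting the second step is to chain the two commands. The wrapper below is a hypothetical convenience function, not something shipped with SuSE Linux. The variables MK_INITRD and LILO exist only so that the commands can be replaced for a dry run.

```shell
# Hypothetical wrapper; MK_INITRD and LILO are illustrative overrides,
# not variables read by the real tools.
MK_INITRD=${MK_INITRD:-mk_initrd}
LILO=${LILO:-lilo}

rebuild_initrd() {
    # Run lilo only if mk_initrd succeeded, so the map file is never
    # refreshed after a failed initrd build.
    "$MK_INITRD" && "$LILO"
}
```

Running rebuild_initrd as root then performs both steps, and returns a non-zero status if either of them fails.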

Bugs and problems in SuSE 6.3

Self-compiled kernels

If you compile your own kernel, the following problem frequently occurs: out of habit, the SCSI driver is compiled into the kernel, but the existing initrd is left unmodified. When booting, the following happens: the kernel already contains the SCSI driver, and the hardware is detected correctly. Now the initrd tries to load the same driver again, this time as a module; with some SCSI drivers (most notably aic7xxx) this leads to a system lockup. Strictly speaking, this is a kernel bug (it should be impossible to load an already existing driver a second time as a module), but the problem is already known from another context (the serial driver).

There are several solutions to this problem: either the driver is configured as a module (then it will get loaded by the initrd correctly), or the entry for the initrd is removed from /etc/lilo.conf. Equivalent to the latter solution would be to remove the driver from INITRD_MODULES and then call mk_initrd, which notices that no initrd is needed.

Module parameters

If parameters have to be passed to a given module in order to get it loaded, these get lost after the installation; they are not entered into the initrd used for booting the system. Luckily, only very few SCSI drivers require parameters. Whoever is confronted by this problem has to boot the system with the installation boot floppy and build a kernel containing the SCSI drivers.

The problem has been solved in the meantime. In SuSE 6.4, module parameters will also be used when starting a system from an initrd.

Network drivers in the initrd

If network drivers got loaded during the installation, these will also be entered in INITRD_MODULES and thus loaded in the initrd when booting the system. While this isn't an error in itself, it may lead to problems in some cases:

1. NE2000 compatible cards: The driver for these cards needs an additional module (8390.o). Module dependencies are currently ignored when loading from the initrd, so the loading of ne.o fails. Later, when the network is configured, kerneld triggers the loading of the network driver again, and this time dependencies are respected because "modprobe" is used. This is therefore a purely cosmetic problem: the first load fails with an error message, but the second load succeeds, as the alias eth0=ne is set correctly in /etc/modules.conf.

2. Forcing certain configurations: If it's required to force a network card into full-duplex mode or force the use of a certain media type (e.g. by passing options=xxx to the tulip driver), this will fail, because - as stated above - the parameters used during installation are not used later on.

This problem has also been solved. First, module parameters will be reused correctly in the future and secondly, the installation linuxrc will only enter SCSI drivers in INITRD_MODULES.

Further information

/usr/src/linux/Documentation/ramdisk.txt
/usr/src/linux/Documentation/initrd.txt

fake_initrd on the AXP platform

AXP still has problems with the bootloader. After enjoying two glasses of beer, our Rudi came up with a brilliant workaround:

On all architectures, the kernel is able to load its (real) root filesystem from floppy. Therefore Rudi developed a small kernel patch that makes it possible to tell the kernel to use this (real) root filesystem as an initial ramdisk. To do so, one simply has to pass the additional parameter fake_initrd to the kernel. The kernel then requests a disk with the root filesystem during booting, but mounts it only as an initrd, thus providing all the features one normally has in an installation via linuxrc.

While this feature is only mentioned here for the sake of completeness, it may become invaluable when porting to other architectures, namely in those cases where no boot loader offering the initrd feature exists for a given architecture.

Outlook

In the future, an initrd could be used for considerably more (and more demanding) tasks than just loading the modules needed to access / (the root filesystem), for example:

High end IDE drivers
Root filesystem on software RAID (linuxrc sets up the md devices)
Root filesystem on LVM (is this needed?)
Root filesystem is encrypted (linuxrc asks for the password)
Root filesystem on SCSI disk attached to a PCMCIA adaptor
