Boot time

The SLED desktop team held its semi-annual meeting in July 2006 in the Cambridge office. User:Federico-mena led the sessions to start a project to reduce the time it takes to boot a SUSE Linux system.

Status

2006/Aug/02 - The project has not started yet; we just have notes on what we need to do.

Bugs

  • 192998: earlygdm is not installed in runlevel 5, so we don't do preloading appropriately.

The plan

  • THE GOAL IS TO BOOT IN 5 SECONDS. This means, from the grub screen until gdm loads there should be no more than 5 seconds. Stating a goal lets us know when to stop optimizing, and it also lets us see regressions in performance.
  • We need an easy way to get reproducible results. Rebooting the machine every time you want to get timings is just too painful. An initial idea is to have a script roughly like
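   telinit 1                                  # drop to single-user mode
   sync; echo 3 > /proc/sys/vm/drop_caches    # wipe the buffer cache (as root, kernel 2.6.16 or later)
   bootchartd start                           # start logging
   telinit 5                                  # go back up to runlevel 5
   wait-for-gdm-to-open                       # placeholder: block until the gdm greeter is up
   bootchartd stop                            # stop logging and write out the data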
  • The basic tool is Bootchart. It gives a high-level overview of the boot process: which processes run, when they use the CPU and disk, and which processes wait for things to happen before they can continue. Bootchart will not give us fine-grained analyses of individual processes, but it lets us find hotspots easily.
  • This is the kind of optimization problem where there is no single big hotspot: we'll have to kill 5% here, 2% there, 3% elsewhere, etc. until we reach the goal.
  • Suspicious processes: zmd, X startup, apparmor, sshd, nscd.
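A stated goal and reproducible timings go together: once the measurement script can be run several times, the results need to be compared against the 5-second target and against earlier runs. A minimal sketch in Python, assuming each run yields a single number (seconds from the grub screen until gdm is up). The names `summarize_runs` and `is_regression`, the 5% tolerance and the sample figures are illustrative assumptions, not part of Bootchart or any existing tool.

```python
import statistics

GOAL_SECONDS = 5.0  # the goal stated in the plan


def summarize_runs(timings, goal=GOAL_SECONDS):
    """Summarize repeated boot timings (in seconds) against the goal.

    Hypothetical helper: returns the median, the spread (max - min),
    whether the median meets the goal, and how far over the goal it is.
    """
    if not timings:
        raise ValueError("need at least one timing")
    median = statistics.median(timings)
    return {
        "runs": len(timings),
        "median": median,
        "spread": max(timings) - min(timings),
        "goal_met": median <= goal,
        "over_goal": max(0.0, median - goal),
    }


def is_regression(baseline_median, current_median, tolerance=0.05):
    """True if the current median is more than `tolerance` (a fraction) slower."""
    return current_median > baseline_median * (1.0 + tolerance)


if __name__ == "__main__":
    # Made-up numbers, only to show the calls.
    before = summarize_runs([41.0, 39.0, 40.0])
    after = summarize_runs([43.5, 42.5, 43.0])
    print(before)
    print("regression:", is_regression(before["median"], after["median"]))
```

Using the median rather than the mean keeps one unusually slow run from hiding a small gain of the 2% or 3% kind.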

Ideas and open questions

Do we really need the network during boot? Maybe only if we have NFS home directories, or a NIS setup, or Active Directory. We can probably delay launching sshd/nscd/ntp until later. Could we benefit from a systemwide D-Bus service to start the network on demand?

Zmd definitely doesn't need to start running at boot time.

Preload a bit -> boot a bit -> preload a bit -> boot a bit, etc?

Joe Shaw says parallelizing things at boot doesn't really help (Fedora tested this), since everything waits on I/O anyway.

There is overhead to loading kernel modules. Can we delay loading them?

Once we add instrumentation to individual processes to monitor them, we can simply use syslog to centralize all this logging.
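As a rough illustration of what such instrumentation could look like, here is a minimal Python sketch that sends timing markers to syslog and reads them back out of a log line. The `BOOTMARK` prefix, the `boot-timing` ident and the function names are assumptions made for this example; nothing like this exists in the processes listed above.

```python
import syslog
import time

MARKER = "BOOTMARK"  # hypothetical prefix that makes the lines easy to find


def format_marker(component, event, timestamp):
    """Build the message text, e.g. 'BOOTMARK gdm start 12.500000'."""
    return "%s %s %s %.6f" % (MARKER, component, event, timestamp)


def log_marker(component, event):
    """Send a timing marker for `component` to syslog."""
    syslog.openlog("boot-timing", syslog.LOG_PID, syslog.LOG_DAEMON)
    syslog.syslog(syslog.LOG_INFO, format_marker(component, event, time.time()))


def parse_marker(line):
    """Extract (component, event, timestamp) from a log line, or None."""
    _, found, rest = line.partition(MARKER + " ")
    fields = rest.split()
    if not found or len(fields) < 3:
        return None
    return fields[0], fields[1], float(fields[2])
```

A process would call `log_marker("gdm", "start")` at the points of interest, and a later pass over the system log could pair up the markers with `parse_marker`.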

Make the kernel print timestamps before init(1) gets run (i.e. a poor man's kernel profiler).
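If the kernel log carries timestamps, finding where the time goes is mostly a matter of looking for the biggest gaps between consecutive messages. A minimal Python sketch, assuming each line starts with a `[seconds.microseconds]` prefix of the kind printk timestamps produce; the function names and the sample lines in the usage note are made up for illustration.

```python
import re

# Matches a timestamp prefix such as "[    1.234567] text".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_kernel_log(lines):
    """Return (seconds, message) pairs for the lines that carry a timestamp."""
    entries = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def largest_gaps(entries, count=3):
    """Return the `count` largest gaps as (gap_seconds, message_before_gap)."""
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:count]
```

Feeding it the saved output of `dmesg` and printing `largest_gaps(parse_kernel_log(lines))` lists the messages that were followed by the longest silences, which is where to start looking.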

There's a lot of waiting for udev before "/" gets mounted.

The USB driver is slow when booting. Can we delay it?

How long does Windows take to boot? XP SP2 takes a long time, but the desktop shows quite early, long before the keyboard is available. Of course we could load a png snapshot of the desktop :-).

We need to track how boot performance degrades over time: what happens after you upgrade (not reinstall) your distro, or after you update certain packages and the disk becomes fragmented?

Have an opt-in program for openSUSE contributors: automatically mail their performance results back to Novell so that we can see what their machines are doing:

  • CPU, RAM, disk from their machines
  • Date of installation or upgrade (get this from the YaST logs? stat /etc/SuSE-release?)
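The data listed above could be gathered with very little code. A minimal Python sketch, assuming a Linux system with `/proc/meminfo`; it takes the `stat /etc/SuSE-release` idea literally and uses the file's modification time as a guess at the install or upgrade date, which may well be wrong on some systems. The function names and report keys are hypothetical, and mailing the report is left out.

```python
import os
import time


def parse_meminfo_total_kb(meminfo_text):
    """Return MemTotal in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    return None


def install_date(path="/etc/SuSE-release"):
    """Guess the install/upgrade date (YYYY-MM-DD) from the file's mtime."""
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        return None
    return time.strftime("%Y-%m-%d", time.gmtime(mtime))


def collect_report(meminfo_path="/proc/meminfo", root="/"):
    """Collect CPU count, RAM, root filesystem size and install date."""
    try:
        with open(meminfo_path) as handle:
            ram_kb = parse_meminfo_total_kb(handle.read())
    except OSError:
        ram_kb = None
    try:
        fs = os.statvfs(root)
        root_fs_bytes = fs.f_frsize * fs.f_blocks
    except OSError:
        root_fs_bytes = None
    return {
        "cpus": os.cpu_count(),
        "ram_kb": ram_kb,
        "root_fs_bytes": root_fs_bytes,
        "installed": install_date(),
    }
```

Every field falls back to `None` when the information is unavailable, so a partial report can still be sent.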

Pending patches

Sun has a patch to GDM to make it preload the gnome-session stuff while GDM is waiting for you to type your username/password.
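The idea behind this kind of preloading is simply to read the files the session will need while the machine is otherwise idle, so that they are already in the page cache when the user logs in. The following Python sketch shows that general technique only; it is not Sun's patch, and the function name and the file list in the usage note are assumptions.

```python
def preload(paths, chunk_size=1 << 20):
    """Read each file once so that its pages end up in the page cache.

    Returns the total number of bytes read. Files that are missing or
    unreadable are skipped rather than treated as errors.
    """
    total = 0
    for path in paths:
        try:
            with open(path, "rb") as handle:
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
        except OSError:
            continue
    return total
```

A login manager could call something like `preload(["/usr/bin/gnome-session"])` from a background thread while it waits for the password; which files are actually worth preloading would have to be measured.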

Discussion

CUPS needs USB to be loaded, to probe for printers. Is CUPS aware of HAL/hotplug?

Quotes from Robert Love and Greg KH:

> > I also have a kernel patch that makes all of the device discovery work 
> > in a multi-threaded way.  It speeds up the kernel boot time a _lot_ but
> > causes some "interesting" issues with some hardware and drivers.  You
> > can very easily blow up SCSI power supplies if you aren't careful, but
> > for my tiny laptop, I shave a full second off of the time the kernel
> > takes to starting up init, which is very respectable.
> 
> I did multi-threaded drive discovery way back when, for MontaVista.  I
> think it was against 2.4.  I don't recall the gains on small machines;
> our goal at the time was very large ones.  On my workstation, scanning
> the entire SCSI LUN space is still the largest consumer of boot time.
> But that might just be the way it is.  Interesting to see what gains (a)
> multi-threading everything on (b) small machines brings.

On my 4-way workstation, with SATA (not scsi, sorry), the speedups are
around 4-5 seconds.  And I need to tweak that to start running init when
only the root disk is found, not to wait around till all devices are
found.

ZMD

AutoYaST may need ZMD in some situations after the initial installation. Starting ZMD on demand there may not be a problem.

Quote from Ingo Tambet:

zmd is also the ZLM remote management daemon so it can't be started on
demand.

There's actually two different daemons in zmd. There's one, very light
weight watcher daemon that only opens communication sockets and listens
for incoming requests. When a request comes in, it'll exec the real
(heavy-weight) daemon and exit. When there's certain time of inactivity,
the real daemon will shut down and exec the watcher daemon. 

Currently, at startup, we always launch the real daemon. What we can do,
(in addition to optimizing the real daemon) is to start the watcher
daemon only, but I'm not sure what it would buy us - the zen-updater
would trigger the start of the real daemon anyway.
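The watcher arrangement described in the quote above can be pictured with a short Python sketch: a light process that only listens, and replaces itself with the heavy daemon when the first request arrives. This is an illustration of the pattern, not zmd's actual code. The TCP socket, the `WATCHER_FD` environment variable and the function names are assumptions, and the reverse step (the real daemon exec'ing the watcher after a period of inactivity) is not shown.

```python
import os
import socket


def make_listener(host="127.0.0.1", port=0):
    """Open the lightweight listening socket (port 0 lets the OS pick one)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def wait_for_request(listener):
    """Block until the first client connects and return that connection."""
    connection, _address = listener.accept()
    return connection


def run_watcher(listener, real_daemon_argv):
    """Wait for one request, then replace this process with the real daemon.

    The accepted connection is handed over as an inherited file descriptor
    whose number is passed in the (made-up) WATCHER_FD environment variable.
    """
    connection = wait_for_request(listener)
    os.set_inheritable(connection.fileno(), True)
    os.environ["WATCHER_FD"] = str(connection.fileno())
    os.execvp(real_daemon_argv[0], real_daemon_argv)  # does not return
```

Until a request comes in, the only cost is one idle process blocked in `accept()`, which is the point of starting the watcher instead of the real daemon at boot.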