Thursday, October 14, 2010

Linux boot process in Red Hat Enterprise Linux (RHEL 5)


This post explains the Linux boot process and the important scripts that take part in it. It is based on the boot sequence of Red Hat Enterprise Linux machines, version 5.4.

Power Cycle -> CPU Reset -> BIOS -> Boot Device -> MBR -> IPL -> /boot/grub/grub.conf -> /etc/inittab -> /etc/rc.d/rcX.d -> /etc/rc.local -> /etc/issue -> /etc/motd

Power Cycle
When we power on the system, the SMPS (switched-mode power supply) brings up the supply voltages and the CPU is then reset, so that it starts executing from a known state.

Power On Self Test (POST)
The power-on self-test (POST) is run by the firmware. It checks that the computer meets the necessary system requirements and that all hardware is working properly before the rest of the boot process starts. If the computer passes the POST, it may emit a single beep and then continues to start normally, with the BIOS carrying on to the next steps.

BIOS [Basic Input Output System]
The BIOS is the first code run by a PC when powered on. The primary function of the BIOS is to load and start an operating system. When the PC starts up, the first job for the BIOS is to initialize and identify system devices such as the video display card, keyboard and mouse, hard disk, CD/DVD drive and other hardware. The BIOS then locates a boot device, such as a hard disk or a CD, and loads and executes the boot code stored on it, giving that code control of the PC.

Boot Device
The boot device is the device that contains the bootable code. It can be a hard disk, CD-ROM, USB device or floppy disk. When the BIOS has finished, it passes control to the boot code on the boot device, that is, to the boot loader.

Boot Loader
A computer's central processor can only execute program code found in Read Only Memory (ROM), Random Access Memory (RAM) or an operator's console. Modern operating systems and application program code and data are stored on nonvolatile data storage devices, such as hard disk drives, CD, DVD, flash memory cards, USB flash drive, and floppy disk. When a computer is first powered on, it does not have an operating system in ROM or RAM. The computer must initially execute a small program stored in ROM along with the minimum of data needed to access the nonvolatile devices from which the operating system programs and data are loaded into RAM.

The small program that starts this sequence of loading into RAM is known as a bootstrap loader, bootstrap or boot loader. This small boot loader program's only job is to load other data and programs which are then executed from RAM. Often, multiple-stage boot loaders are used, during which several programs of increasing complexity sequentially load one after the other in a process of chain loading. The first stage of the boot loader resides in the MBR.

MBR [Master Boot Record]
The MBR is the first sector, that is the first 512 bytes, of a storage device. It is divided as follows.

Executable code [Initial Program Loader or GRUB Stage 1]: 440 bytes [maximum 446].
Disk signature and reserved bytes: 4 + 2 = 6 bytes.
Partition table: 16 * 4 = 64 bytes.
MBR signature: 2 bytes.
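
The layout listed above can be checked with a short Python sketch. It is only an illustration: the sector is built in memory rather than read from a disk, the function names parse_mbr and build_sample_mbr are made up for this example, and the 16-byte partition entry format (status byte, CHS bytes, type byte, start LBA, sector count) is the conventional MBR layout rather than something stated in this post.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 440            # executable code (GRUB stage 1)
PARTITION_TABLE_OFFSET = 446    # 440 bytes code + 4 bytes signature + 2 reserved
ENTRY_SIZE = 16
MBR_SIGNATURE = b"\x55\xaa"


def parse_mbr(sector):
    """Split a 512-byte MBR into the regions listed above."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    (disk_signature,) = struct.unpack_from("<I", sector, BOOT_CODE_SIZE)
    partitions = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        # status, 3 CHS bytes, type, 3 CHS bytes, start LBA, sector count
        status, ptype, start, size = struct.unpack_from("<B3xB3xII", sector, offset)
        partitions.append(
            {"bootable": status == 0x80, "id": ptype, "start": start, "size": size}
        )
    return {
        "boot_code": sector[:BOOT_CODE_SIZE],
        "disk_signature": disk_signature,
        "partitions": partitions,
        "valid": sector[510:512] == MBR_SIGNATURE,
    }


def build_sample_mbr():
    """Build a synthetic MBR in memory (sample data, no real disk is read)."""
    sector = bytearray(SECTOR_SIZE)
    # one bootable Linux partition: start=63, size=208782, Id=0x83
    struct.pack_into("<B3xB3xII", sector, PARTITION_TABLE_OFFSET, 0x80, 0x83, 63, 208782)
    sector[510:512] = MBR_SIGNATURE
    return bytes(sector)


mbr = parse_mbr(build_sample_mbr())
print(mbr["valid"], mbr["partitions"][0])
```

On a real machine the same function could be given the first 512 bytes of the disk, read as root, instead of the synthetic sector.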

A clearer picture of the MBR is shown below.
Linux boot process - MBR

GRUB [GRand Unified Bootloader]

GNU GRUB is a boot loader package from the GNU Project. The MBR usually contains GRUB stage1, but can contain another bootloader which can chain boot GRUB stage1 from another boot sector such as a partition's Volume boot record. Given the small size of a boot sector, Stage1 does little more than load the next stage of GRUB (which may reside physically elsewhere on the disk). Stage1 can load Stage2 directly, or it can load stage1.5. GRUB Stage1.5 is located in the first 30 kilobytes of hard disk immediately following the MBR. Stage1.5 loads Stage2.
Once boot options have been selected, GRUB loads the selected kernel into memory and passes control to the kernel. Alternatively, GRUB can pass control of the boot process to another loader, using chain loading. This is the method used to load operating systems such as Windows, that do not support the Multiboot standard. In this case, copies of the other system's boot programs have been saved by GRUB. Instead of a kernel, the other system is loaded as though it had been started from the MBR. This could be another boot manager, such as the Microsoft boot menu, allowing further selection of non-Multiboot operating systems.
GRUB configuration file: "/boot/grub/grub.conf"

GRUB stage 2 starts by reading the "/boot/grub/grub.conf" file.
The following is an example of grub.conf, with the important lines explained.

# grub.conf generated by anaconda
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/sda2
# initrd /initrd-version.img

default=0
#default=0 means GRUB boots the first "title" entry in this file by default, which is usually Linux. Entries are numbered from zero (0, 1, 2, 3 and so on). If you have two operating systems and Windows is the second entry, set default=1 to make Windows the default operating system.

timeout=5
#timeout is the time in seconds that the GRUB menu (shown by stage 2) waits for you to switch between operating system options. Here, after waiting 5 seconds, it boots the default OS.

#The splashimage line sets the background image for the graphical GRUB menu. If you don't want it, you can comment that line out.
#The GRUB menu with the splash image is shown below.
Linux boot process - grub stage 1.5

After commenting out the splashimage line, the GRUB menu looks as shown below.

Linux boot process - grub stage 1.5

hiddenmenu
#The hiddenmenu option hides the menu of operating systems and the selection options. You can still get the menu by pressing any key before the timeout expires.
#If you don't give this option, GRUB shows the list of operating systems so that you can select one.

Linux boot process - grub stage 1.5

The picture above shows the boot screen when grub.conf has no hiddenmenu line. The operating systems are listed.

Linux boot process - grub stage 1.5

The picture above shows the boot screen when grub.conf contains hiddenmenu. The operating systems are not listed.

title Red Hat Enterprise Linux Server (2.6.18-164.el5)
#This is the title of the operating system entry as shown in the menu. You can edit it if you want.
root (hd0,0)
#Location of the /boot partition: (hd0,0) is the first partition on the first disk.
kernel /vmlinuz-2.6.18-164.el5 ro root=LABEL=/ rhgb quiet
#Loads the kernel image vmlinuz-version. "ro" mounts the root filesystem read-only at first, and root=LABEL=/ selects the root filesystem by its label "/".
#rhgb is Red Hat graphical boot. It gives a graphical screen that shows the progress of booting.
#The quiet option hides most of the boot messages before rhgb starts.

initrd /initrd-2.6.18-164.el5.img
#Loads the initrd image, which gives the kernel a basic filesystem from which to run basic commands and load drivers.
GRUB stage 2 then loads the kernel into RAM and passes control to the kernel.
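
To make the structure of grub.conf concrete, here is a minimal Python sketch that reads a config like the one above and picks the entry selected by "default". It is not how GRUB itself parses the file. The function names are invented for this example, the global lines default=0, timeout=5 and hiddenmenu mirror the values discussed above, and the second "Windows" entry is purely hypothetical.

```python
import re

# "key=value", "key value", or a bare keyword such as hiddenmenu
GRUB_LINE = re.compile(r"^(\w+)(?:\s*=\s*|\s+)(.*)$")

SAMPLE_GRUB_CONF = """\
default=0
timeout=5
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-164.el5)
    root (hd0,0)
    kernel /vmlinuz-2.6.18-164.el5 ro root=LABEL=/ rhgb quiet
    initrd /initrd-2.6.18-164.el5.img
title Windows (hypothetical second entry)
    rootnoverify (hd0,1)
    chainloader +1
"""


def parse_grub_conf(text):
    """Return (global options, list of title entries) from grub.conf text."""
    options, entries = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = GRUB_LINE.match(line)
        key, value = match.groups() if match else (line, "")
        if key == "title":
            entries.append({"title": value})   # a new boot entry starts here
        elif entries:
            entries[-1][key] = value           # line belongs to the current entry
        else:
            options[key] = value               # global option before any title
    return options, entries


def default_entry(options, entries):
    """Pick the entry selected by default=N (entries are counted from zero)."""
    return entries[int(options.get("default", 0))]


options, entries = parse_grub_conf(SAMPLE_GRUB_CONF)
print(default_entry(options, entries)["title"])
```

Changing default=0 to default=1 in the sample text would make default_entry() return the second entry, which matches the numbering rule described for the default line.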
Kernel phase

The kernel in Linux handles all operating system processes, such as memory management, task scheduling, I/O, interprocess communication, and overall system control. It is loaded in two stages. In the first stage the kernel (as a compressed image file) is loaded into memory and decompressed, and a few fundamental functions such as basic memory management are set up. Control is then switched one final time to the main kernel start process. Once the kernel is fully operational, it looks for an init process to run as part of its startup. Init (separately) sets up the user space and the processes needed for a user environment and ultimate login. The kernel itself is then allowed to go idle, subject to calls from other processes.
Kernel loading stage

The kernel as loaded is typically an image file, compressed into either zImage or bzImage formats with zlib. It contains a header program which does a minimal amount of hardware setup, decompresses the image fully into high memory, taking note of any RAM disk if configured. It then executes kernel startup via ./arch/i386/boot/head and the startup_32() (for x86 based processors) process.
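
As a rough illustration of the header mentioned above, the following Python sketch checks a buffer for two markers found at the start of an x86 kernel image. The offsets 0x1FE and 0x202 and the magic string "HdrS" are taken from the Linux x86 boot protocol, not from this post, and the buffer used here is synthetic. The function names are invented for this example.

```python
import struct

BOOT_FLAG_OFFSET = 0x1FE     # 2-byte boot flag, expected to be 0xAA55
HEADER_MAGIC_OFFSET = 0x202  # 4-byte magic, expected to be b"HdrS"


def has_x86_boot_header(image):
    """Return True if the buffer carries the x86 boot-protocol markers."""
    if len(image) < HEADER_MAGIC_OFFSET + 4:
        return False
    (boot_flag,) = struct.unpack_from("<H", image, BOOT_FLAG_OFFSET)
    magic = image[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4]
    return boot_flag == 0xAA55 and magic == b"HdrS"


def build_fake_header():
    """Synthetic buffer for illustration only; it is not a bootable kernel."""
    buf = bytearray(0x400)
    struct.pack_into("<H", buf, BOOT_FLAG_OFFSET, 0xAA55)
    buf[HEADER_MAGIC_OFFSET:HEADER_MAGIC_OFFSET + 4] = b"HdrS"
    return bytes(buf)


print(has_x86_boot_header(build_fake_header()))
```

On a real system the first kilobyte of the kernel file named in grub.conf could be read in binary mode and passed to has_x86_boot_header() instead of the synthetic buffer.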

Kernel startup stage

The startup function for the kernel (also called the swapper or process 0) establishes memory management (paging tables and memory paging), detects the type of CPU and any additional functionality such as floating point capabilities, and then switches to non-architecture specific Linux kernel functionality via a call to start_kernel().

start_kernel() executes a wide range of initialization functions. It sets up interrupt handling (IRQs), further configures memory, starts the init process (the first user-space process), and then starts the idle task via cpu_idle(). Notably, the kernel startup process also mounts the initial RAM disk ("initrd") that was loaded previously as the temporary root file system during the boot phase. This allows driver modules to be loaded without reliance upon other physical devices and drivers, and keeps the kernel smaller. The root file system is later switched via a call to pivot_root(), which unmounts the temporary root file system and replaces it with the real one once the latter is accessible. The memory used by the temporary root file system is then reclaimed.

Thus, the kernel initializes devices, mounts the root filesystem specified by the boot loader as read only, and runs Init (/sbin/init) which is designated as the first process run by the system (PID = 1). A message is printed by the kernel upon mounting the file system, and by Init upon starting the Init process. It may also optionally run Initrd to allow setup and device related matters (RAM disk or similar) to be handled before the root file system is mounted.
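
The options that the boot loader hands to the kernel, such as "ro root=LABEL=/ rhgb quiet" in the grub.conf example, are a simple space-separated list. The Python sketch below splits such a command line into bare flags and key=value parameters. It is only a model of the idea, not the kernel's own parser, and parse_cmdline is a name invented for this example.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into bare flags and key=value parameters."""
    flags, params = [], {}
    for token in cmdline.split():
        # split on the first "=" only, so root=LABEL=/ keeps "LABEL=/" as its value
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
        else:
            flags.append(key)
    return flags, params


flags, params = parse_cmdline("ro root=LABEL=/ rhgb quiet")
print(flags, params)
```

On a running system the command line that was actually used can be read from /proc/cmdline and passed to the same function.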

According to Red Hat, the detailed kernel process at this stage is therefore summarized as follows:

"When the kernel is loaded, it immediately initializes and configures the computer's memory and configures the various hardware attached to the system, including all processors, I/O subsystems, and storage devices. It then looks for the compressed initrd image in a predetermined location in memory, decompresses it, mounts it, and loads all necessary drivers. Next, it initializes virtual devices related to the file system, such as LVM or software RAID before unmounting the initrd disk image and freeing up all the memory the disk image once occupied. The kernel then creates a root device, mounts the root partition read-only, and frees any unused memory. At this point, the kernel is loaded into memory and operational. However, since there are no user applications that allow meaningful input to the system, not much can be done with it."

At this point, with interrupts enabled, the scheduler can take control of the overall management of the system, to provide pre-emptive multi-tasking, and the init process is left to continue booting the user environment in user space.

Init process (SysV init style only)

"Init is the father of all processes. Its primary role is to create processes from a script stored in the file /etc/inittab. This file usually has entries which cause init to spawn gettys on each line that users can log in. It also controls autonomous processes required by any particular system. A run level is a software configuration of the system which allows only a selected group of processes to exist. The processes spawned by init for each of these run levels are defined in the /etc/inittab file."

During startup, init proceeds as follows:

1. It looks at the sysinit script, which "sets the environment path, starts swap, checks the file systems, and takes care of everything the system needs to have done at system initialization." This includes the system and hardware clock, special serial port processes, and the like.
2. Init then looks at the runlevel to enter, as specified by the default runlevel in /etc/inittab.
3. Init then sources the function library for the system. This spells out how to start or kill a program and how to determine the PID of a program.
4. It then searches for and starts each applicable process, and it creates a login session for the user.

After it has spawned all of the processes specified, init goes inactive, and waits for one of three events to happen:

1. processes it started to end or die,
2. a power failure signal, or
3. a request via /sbin/telinit to further change the runlevel.

"/etc/inittab" file
The following is an example of an inittab file.

[root@server ~]# cat /etc/inittab
# inittab This file describes how the INIT process should set up
# the system in a certain run-level.
# Default runlevel. The runlevels used by RHS are:
# 0 - halt (Do NOT set initdefault to this)
# 1 - Single user mode
# 2 - Multiuser, without NFS (The same as 3, if you do not have networking)
# 3 - Full multiuser mode

# 4 - unused
# 5 - X11
# 6 - reboot (Do NOT set initdefault to this)
id:3:initdefault:
#The above line sets the default runlevel. Here it is 3, that is, full multiuser mode without graphics.

# System initialization.
si::sysinit:/etc/rc.d/rc.sysinit
#This line runs the system initialization script in all runlevels.

l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6
#The above lines run the /etc/rc.d/rc script for each runlevel. If we comment out a line, the rc scripts of the corresponding runlevel will not be run.

ca::ctrlaltdel:/sbin/shutdown -t3 -r now
#The above line enables restarting of system by pressing ctrl+alt+del

# When our UPS tells us power has failed, assume we have a few minutes
# of power left. Schedule a shutdown for 2 minutes from now.
# This does, of course, assume you have powerd installed and your
# UPS connected and working correctly.
pf::powerfail:/sbin/shutdown -f -h +2 "Power Failure; System Shutting Down"

# If power was restored before the shutdown kicked in, cancel it.
pr:12345:powerokwait:/sbin/shutdown -c "Power Restored; Shutdown Cancelled"

# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
#The above lines are for the terminal consoles (tty1 to tty6). The action here is respawn, which means that if the process stops, it is started again without any delay.

# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
#The above line makes the GUI login available in runlevel 5.
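
Each active inittab line has four colon-separated fields: id, runlevels, action and process. The Python sketch below parses a shortened sample and lists which "wait" and "respawn" entries apply in a given runlevel. It is a simplified model, not init's real logic. The function names are invented for this example, and the id:3:initdefault: and si::sysinit: lines in the sample follow the standard SysV field layout and are included as assumptions about the example system.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "id runlevels action process")

SAMPLE_INITTAB = """\
# shortened sample, not a complete inittab
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
l5:5:wait:/etc/rc.d/rc 5
pf::powerfail:/sbin/shutdown -f -h +2 "Power Failure; System Shutting Down"
1:2345:respawn:/sbin/mingetty tty1
x:5:respawn:/etc/X11/prefdm -nodaemon
"""


def parse_inittab(text):
    """Parse id:runlevels:action:process lines, skipping comments and blanks."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        ident, runlevels, action, process = line.split(":", 3)
        entries.append(Entry(ident, runlevels, action, process))
    return entries


def default_runlevel(entries):
    """Return the runlevel named by the initdefault entry, if there is one."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


def started_in(entries, runlevel):
    """Entries with action wait or respawn that list the given runlevel."""
    return [
        e for e in entries
        if e.action in ("wait", "respawn") and runlevel in e.runlevels
    ]


entries = parse_inittab(SAMPLE_INITTAB)
level = default_runlevel(entries)
print(level, [e.id for e in started_in(entries, level)])
```

With the sample above, runlevel 3 selects the l3 line and the tty1 getty, while runlevel 5 would additionally select the prefdm line.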

There is also an "emergency" mode. It won't run any init scripts; it runs only one program, sulogin. See the image below.

Linux boot process - single user mode

The "emergency" option is given by editing the kernel line at the beginning of the system boot.

Linux boot process - emergency

It'll ask for the root user's password by running "sulogin". It won't run any of the other init scripts.

In a normal boot, init passes control to the rc.sysinit script, as configured in the inittab file.

"/etc/rc.d/rc.sysinit" file:

The "Red Hat" text shown in red on the welcome line marks the starting point of the rc.sysinit script. We can make the boot interactive by pressing the "i" key at this point. Normally we use this when the system hangs on a service or when one service takes a long time to start.

Linux boot process - interactive boot up

From here onwards almost everything is handled by rc.sysinit and the rc scripts, based on the inittab configuration.
All the startup scripts reside in /etc/init.d.

"/etc/rc.d/rcX.d" scripts:

According to the runlevel (X), init goes to rcX.d and executes all the scripts there. In our example the default runlevel is 3, so it executes the scripts in "/etc/rc.d/rc3.d".

[root@server rc3.d]# ls
K01dnsmasq K24irda K87multipathd S00microcode_ctl S19rpcgssd S85gpm
K01setroubleshoot K30sendmail K88wpa_supplicant S04readahead_early S22messagebus S85httpd
K01smartd K35vncserver K89dund S05kudzu S25netfs S90crond
K02avahi-dnsconfd K35winbind K89netplugd S06cpuspeed S25pcscd S90xfs
***Output truncated***

You can see a lot of scripts starting with S and K. The scripts starting with K are kill scripts. They stop the running processes that are not set to run in runlevel X (here 3). The scripts starting with S are start scripts. They start all the processes that are set to run in runlevel X.

In the above example the smartd service is to be killed and the kudzu service is to be started in runlevel 3. We can check this by listing the chkconfig entries.
[root@server rc3.d]# chkconfig --list smartd
smartd 0:off 1:off 2:on 3:off 4:on 5:on 6:off
[root@server rc3.d]# chkconfig --list kudzu
kudzu 0:off 1:off 2:off 3:on 4:on 5:on 6:off
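
The chkconfig lines above can be read mechanically. The small Python sketch below turns one line of "chkconfig --list" output into a per-runlevel table and derives whether an rcX.d link would be expected to start with K or S. The function names are invented for this example, and the K/S rule is the one described in this post.

```python
def parse_chkconfig_line(line):
    """Turn 'name 0:off 1:off ...' into (name, {runlevel: True/False})."""
    name, *levels = line.split()
    states = {}
    for item in levels:
        level, _, state = item.partition(":")
        states[int(level)] = (state == "on")
    return name, states


def link_prefix(states, runlevel):
    """S if the service is on in that runlevel, K if it is off."""
    return "S" if states[runlevel] else "K"


name, states = parse_chkconfig_line("smartd 0:off 1:off 2:on 3:off 4:on 5:on 6:off")
print(name, link_prefix(states, 3))
```

For smartd this gives K in runlevel 3, and for kudzu it gives S, which agrees with the K01smartd and S05kudzu entries in the rc3.d listing.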

All the scripts here are symbolic links to scripts in /etc/init.d. See the long listing below.
[root@server rc3.d]# ll K01smartd
lrwxrwxrwx 1 root root 16 May 28 13:31 K01smartd -> ../init.d/smartd
[root@server rc3.d]# ll S05kudzu
lrwxrwxrwx 1 root root 15 May 28 13:31 S05kudzu -> ../init.d/kudzu
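
The K and S naming convention described above can be sketched in a few lines of Python. This is a model, not the real /etc/rc.d/rc script. It assumes that the K links are handled before the S links, that each group is processed in sorted name order, and that every link name has a one-letter prefix followed by a two-digit number. The function name rc_plan is invented for this example.

```python
def rc_plan(names):
    """Return (service, action) pairs in the order implied by the link names."""
    kills = sorted(n for n in names if n.startswith("K"))
    starts = sorted(n for n in names if n.startswith("S"))
    # n[3:] drops the prefix letter and the two-digit sequence number
    plan = [(n[3:], "stop") for n in kills]
    plan += [(n[3:], "start") for n in starts]
    return plan


sample = ["S85httpd", "K01smartd", "S05kudzu", "K24irda", "S00microcode_ctl"]
for service, action in rc_plan(sample):
    print(action, service)
```

The two-digit number therefore controls ordering within each group: a lower number means the script is handled earlier.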

"/etc/rc.local" file
In this file we can add custom commands that will run at startup, after the init scripts in rcX.d.

[root@server ~]# cat /etc/rc.local
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local
[root@server ~]#

"/etc/issue" file:
According to its man page, "/etc/issue" is the pre-login message and identification file. It is a text file which contains a message or system identification to be printed before the login prompt. Normally it shows the Red Hat release and version, followed by the kernel version and the architecture of the machine. In the file below, \r is replaced by the kernel release and \m by the machine architecture when the message is displayed.

[root@server ~]# cat /etc/issue
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
Kernel \r on an \m
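
The two escapes in the /etc/issue file above can be imitated with a short Python sketch. It only models the \r and \m substitutions, not everything a getty supports. The function name expand_issue is invented for this example, and the x86_64 value in the usage line is a hypothetical architecture, since the post does not state one.

```python
import platform

ISSUE_TEXT = "Red Hat Enterprise Linux Server release 5.4 (Tikanga)\nKernel \\r on an \\m\n"


def expand_issue(text, release=None, machine=None):
    """Replace the kernel-release and machine escapes in issue text."""
    uname = platform.uname()
    release = release or uname.release   # same value as `uname -r`
    machine = machine or uname.machine   # same value as `uname -m`
    return text.replace("\\r", release).replace("\\m", machine)


# Explicit values are passed so the result does not depend on the local machine.
print(expand_issue(ISSUE_TEXT, release="2.6.18-164.el5", machine="x86_64"))
```

Calling expand_issue(ISSUE_TEXT) without the extra arguments fills in the values of the machine the sketch runs on.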

"/etc/motd" file:
The motd file holds the Message Of The Day. The contents of /etc/motd are displayed after a successful login, just before the login shell is executed.

Taking a backup of the MBR partition table:
[root@server ~]# sfdisk -d /dev/sda >sda.out
[root@server ~]# cat sda.out
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start= 63, size= 208782, Id=83, bootable
/dev/sda2 : start= 208845, size= 12289725, Id=83
/dev/sda3 : start= 12498570, size= 1044225, Id=82
/dev/sda4 : start= 0, size= 0, Id= 0
[root@server ~]#

To restore it:
[root@server ~]# sfdisk /dev/sda < sda.out
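
The dump written by sfdisk -d is plain text, so it is easy to inspect before restoring it. The Python sketch below parses the dump shown above and converts sizes to bytes. It assumes 512-byte sectors and the old dump format shown in this post, and the function name parse_sfdisk_dump is invented for this example.

```python
import re

SFDISK_LINE = re.compile(
    r"^(\S+)\s*:\s*start=\s*(\d+),\s*size=\s*(\d+),\s*Id=\s*([0-9a-fA-F]+)(,\s*bootable)?"
)

SAMPLE_DUMP = """\
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start= 63, size= 208782, Id=83, bootable
/dev/sda2 : start= 208845, size= 12289725, Id=83
/dev/sda3 : start= 12498570, size= 1044225, Id=82
/dev/sda4 : start= 0, size= 0, Id= 0
"""


def parse_sfdisk_dump(text):
    """Return one dict per partition line in an sfdisk -d dump."""
    partitions = []
    for line in text.splitlines():
        match = SFDISK_LINE.match(line.strip())
        if not match:
            continue  # header, unit line or blank line
        device, start, size, ptype, boot = match.groups()
        partitions.append({
            "device": device,
            "start": int(start),
            "size": int(size),
            "id": int(ptype, 16),       # the Id is written in hexadecimal
            "bootable": boot is not None,
        })
    return partitions


for part in parse_sfdisk_dump(SAMPLE_DUMP):
    if part["size"]:
        print(part["device"], part["size"] * 512, "bytes")
```

In this dump each used partition starts exactly where the previous one ends, which is a quick sanity check worth doing before feeding a dump back to sfdisk.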

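Or you can back up and restore the whole MBR, including GRUB stage 1, using the dd command. Note that this 512-byte copy also contains the partition table, so restoring it overwrites the current partition table as well.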

#dd if=/dev/sda of=grub.bkp bs=512 count=1

#dd if=grub.bkp of=/dev/sda bs=512 count=1
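
If only the boot code should come from the backup while the current partition table is kept, the first 446 bytes can be restored on their own (with dd this corresponds to bs=446 count=1). The Python sketch below shows the idea on in-memory byte strings only. It does not touch any disk, and the function name restore_boot_code is invented for this example.

```python
SECTOR_SIZE = 512
BOOT_AREA_SIZE = 446   # boot code area; the partition table starts right after it


def restore_boot_code(current, backup):
    """Combine the boot code of the backup with the partition table of the current MBR."""
    if len(current) != SECTOR_SIZE or len(backup) != SECTOR_SIZE:
        raise ValueError("both sectors must be exactly 512 bytes")
    return backup[:BOOT_AREA_SIZE] + current[BOOT_AREA_SIZE:]


# Synthetic sectors: "G" stands for saved GRUB code, "P" and "Q" for two partition tables.
backup = b"G" * 446 + b"P" * 64 + b"\x55\xaa"
current = b"\x00" * 446 + b"Q" * 64 + b"\x55\xaa"
merged = restore_boot_code(current, backup)
print(merged[:4], merged[446:450])
```

The merged sector carries the saved boot code but still has the partition table that was on the disk at restore time.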


1 comment:

  1. Hi,

    This is very useful post and good one. But can you do explain clearly about stage1 , stage1.5 and stage2? Also please explain how the stage1 can be called after MBR?


