Building the Linux Kernel Image

Building your own kernel for your machine can be a daunting task. This guide gives step-by-step instructions for compiling a kernel for ARM machines on the target platform. Notes:
• "bash$" and "bash#" are shell prompts, not commands to be typed.
• "host" means the machine you are building the ARM kernel on.
• "target" means the machine you are building the ARM kernel for.

Decide where to build your kernel
Firstly, you need to decide where you are going to build your ARM Linux kernel. A good place to build the kernel is /usr/src/arm. If you wish to build it in a different place, replace /usr/src/arm with your preferred location. Please note that, in general, building a kernel in /usr/src/linux is not recommended. You will need to create the /usr/src/arm directory before you download the source and patches, if it doesn't exist already. This directory should be owned by your non-root user ID. To do this, become root:
    bash$ su
    Password:
    bash# cd /usr/src
    bash# mkdir arm
    bash# chown myuser:mygroup arm
    bash# exit
    bash$

Deciding on a kernel version
Next, you need to decide which version of the Linux kernel you wish to compile. Most people will want the latest stable kernel release. To help you identify a particular release, a system of version numbers is used by the kernel developers. For any kernel version x.y.z:
• x - This is the major revision number
• y - This is the minor revision number, where:
    o Even numbers indicate "stable" kernel releases
    o Odd numbers indicate "development" or "beta" kernel releases
• z - This is the patch level of the kernel
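The even/odd rule above can be checked mechanically. The following is a minimal sketch, not part of the original guide; the function name kernel_series is hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Classify a kernel version x.y.z by the parity of its minor number y,
# following the even/odd rule described above.
# kernel_series is a hypothetical helper name used only in this sketch.
kernel_series() {
    # Strip the major number, then everything after the minor number.
    minor=${1#*.}
    minor=${minor%%.*}
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "development"
    fi
}

kernel_series 2.4.3    # even minor number
kernel_series 2.5.1    # odd minor number
```

Any maintainer suffix after the patch level (such as -rmk1) is ignored by the sketch, since only the minor number decides the series.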

At the time of writing this document, the latest stable kernel on ARM is in the 2.4 series.

Under the ARM kernel tree, you will find a suffix to the version number: '-rmkn', where 'n' is the ARM release number. This is the main ARM patch, which should always be applied. Other maintainers, like Nicolas Pitre, will produce additional patches, and these will add an additional suffix to denote their version. Nicolas Pitre's patches add a '-np' suffix.

When choosing a kernel version, always work from the individual maintainer towards the main tree. As a general rule, maintainers forward parts of their patches into the -rmk tree as and when they are happy with the change.

Downloading an ARM patch
You will need to download a kernel patch, which contains all the ARM specific updates for a particular kernel version. These can be found under ftp://ftp.arm.linux.org.uk/, where the kernel releases are separated out into directories corresponding to the major/minor version numbers of the kernel. The individual files are named patch-x.y.z-rmkn.gz, where 'x', 'y', 'z' and 'n' are the version numbers mentioned above.

You should select and download the latest patch for the kernel into your /usr/src/arm directory. This is the one which will have either the most features or the most bug fixes in. Don't look at the latest tree available from ftp.kernel.org: we may not have the patches available! You will need the version of the patch later when downloading the main kernel source.

Note: Some files may be named pre-patch-x.y.z-rmkn.gz. These are alpha or beta patches, which are probably unstable. You should not use these unless you are sure that you know what you are doing, and you don't mind losing the odd hard disk drive's worth of data.

Note2: Some kernels are based on the Alan Cox series of kernels. These have names similar to patch-x.y.z-acm-rmkn.gz, where x.y.z is Linus' version number and m is Alan's version number. In this case, you will need to obtain Alan Cox's patch from ftp.kernel.org, in the directory /pub/linux/kernel/people/alan/linux-2.4/.

Downloading the maintainer-specific patch
In some circumstances, you will need to patch the kernel with a maintainer specific patch. This adds greater support for the machine's devices, but at the expense of being less stable. Please refer to the machine list for information concerning extra patches.

Downloading the main kernel source
A patch file on its own usually does not contain any compilable code, but a machine-readable description of changes to make to a set of text files. You therefore need to obtain the main kernel source.

The kernel source can be found on one of the kernel.org FTP sites. There are many sites scattered around the world, and they are named according to a unified naming scheme. All sites start with 'ftp.' and end in '.kernel.org'. In the middle is placed a country identifier. You can find out more information on these sites by looking at the main www.kernel.org site.

Once you have selected a site, you need to find the kernel release that corresponds to the patch you downloaded. The releases are stored in subdirectories of /pub/linux/kernel. Each kernel release is accompanied by several files:
o linux-x.y.z.tar.gz
o linux-x.y.z.tar.bz2
o patch-x.y.z.gz
o patch-x.y.z.bz2

You will want to download the linux-x.y.z.tar.gz file, again into your /usr/src/arm directory. These files are large (about 10MB or more), so if you're on a modem, be prepared for it to take a long time.

Unpacking the ARM kernel source
Unpack the tar archive you downloaded above using:
    bash$ tar zxvf linux-x.y.z.tar.gz
Now, change into the linux directory, and apply the patch files:
    bash$ cd linux
    bash$ zcat ../patch-x.y.z-rmkn.gz | patch -p1
The patches are hierarchical, so you need to apply them in the correct order. The patch files with more extensions depend on the ones with fewer extensions, so you need to apply, for example, the -rmk patch before the -np patch.
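The ordering rule for patches (fewer suffixes first) can be automated. The sketch below is an illustration under assumptions: the helper names order_patches and apply_patches are hypothetical, the example file names are made up to follow the naming scheme described above, and counting the '-' separated parts of a file name is only one simple way to approximate "more extensions".

```shell
#!/bin/sh
# Print patch file names, one per line, ordered so that a patch with fewer
# "-suffix" parts comes before the patches that build on it
# (for example, -rmk before -rmk-np).
order_patches() {
    for p in "$@"; do
        # Count the '-' separated fields in the file name.
        n=$(printf '%s\n' "$p" | awk -F- '{ print NF }')
        printf '%s %s\n' "$n" "$p"
    done | sort -n | cut -d' ' -f2-
}

# Apply the ordered patches from inside the unpacked linux directory.
# The patches are assumed to sit one directory up, as in the text.
apply_patches() {
    order_patches "$@" | while read -r p; do
        zcat "../$p" | patch -p1 || return 1
    done
}

# Hypothetical file names, used only to show the ordering.
order_patches patch-2.4.3-rmk1-np1.gz patch-2.4.3-rmk1.gz
```

Only order_patches is called here, so the sketch can be run anywhere; apply_patches is meant to be run from inside the linux directory once the files have been downloaded.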

The kernel source tree is now ready to be configured.

Configuration of the kernel source tree
Firstly, if you are not compiling the kernel natively, then you need to modify the top level kernel makefile. You will need to change the definition of "ARCH" and "CROSS_COMPILE". Replace the line starting with ARCH := with ARCH := arm. Set the line starting with CROSS_COMPILE = to be the path and prefix to your compiler tools (eg, CROSS_COMPILE = /usr/local/bin/arm-linux-).

You may like to read linux/Documentation/README and linux/Documentation/arm/README before proceeding.

There are a range of 'make' targets which allow a set of defaults to be selected for the particular machine you are compiling the source for. Some examples are given below:
o a5k_config
o ebsa110_config
o footbridge_config
o rpc_config
o brutus_config
o victor_config
o empeg_config

You should select one of these as the "basic" configuration as follows:
    bash$ make footbridge_config
Following this, you should use make config or make menuconfig to further refine the configuration. Most configuration options have help associated with them, so if you're not sure what it's asking, please read the help.

Note: If you want to change the configuration via make xxx_config, please remove the file linux/.config immediately prior to executing this command.

Compiling the kernel source
If you are only installing the kernel source tree for other programs, then you have finished. If you want to compile up a new kernel, type the following commands:
    bash$ make clean
    bash$ make dep
    bash$ make zImage
    bash$ make modules
The final two commands will actually compile the kernel and the kernel modules.
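As an alternative to editing the top level makefile, GNU make lets variables given on the command line override assignments inside a makefile. The sketch below only prints the resulting commands. It assumes a GNU make and a kernel makefile that does not force its own ARCH and CROSS_COMPILE values, and build_commands is a hypothetical helper name.

```shell
#!/bin/sh
# Print the build commands for a cross-compiled ARM kernel. ARCH and
# CROSS_COMPILE are passed on the make command line, which overrides the
# assignments in the top level makefile without editing it.
# The default prefix is the example value from the text.
CROSS_COMPILE=${CROSS_COMPILE:-/usr/local/bin/arm-linux-}

build_commands() {
    for target in "$@"; do
        echo "make ARCH=arm CROSS_COMPILE=$CROSS_COMPILE $target"
    done
}

# Machine default configuration first, then the same sequence as in the text.
build_commands footbridge_config clean dep zImage modules
```

Printing the commands first makes it easy to check them before running anything; from inside the linux directory the output could then be piped to sh -e so that the build stops at the first failing step.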

Installing the kernel
After the kernel has successfully compiled, you should have the kernel image, arch/arm/boot/zImage. What you do next depends on whether you are cross compiling or not. If you are cross compiling, go to the section "Installing a cross compiled kernel". If you are building natively (ie, for the target on the target), continue.

Installing a native kernel
Since you are about to upgrade system files, you need to become 'root'. To do this, type:
    bash$ su
    Password:
    bash#
It is highly advisable to keep a backup of your current kernel and modules. What you need to do is machine dependent. Note that it is a good idea to always keep a known good previous version of the kernel and modules in case you need to back down to a previous version. The following is given as an example (for a 2.4.3-rmk1 kernel):
    bash# cd /lib/modules
    bash# ls
    bash# mv 2.4.3-rmk1 2.4.3-rmk1.old
    bash# cd /boot
    bash# mv vmlinuz vmlinuz.bak
    bash# mv System.map System.map.bak
    bash#
Now, install the new kernel modules:
    bash# cd /usr/src/arm/linux
    bash# make modules_install
    bash#
This will copy the modules into the /lib/modules/x.y.z directory. Next, install the kernel image (normally into /boot):
    bash# cd /boot
    bash# cat /usr/src/arm/linux/arch/arm/boot/zImage >vmlinuz
    bash# cp /usr/src/arm/linux/System.map .
    bash#

Note that the command to copy the new kernel image is cat and is not the usual cp. Unix traditionally will not allocate space on the filesystem to sections of files containing zero data, but instead creates "holes" in the file. Some kernel loaders do not understand files with holes in, and therefore using cat in this way ensures that this does not happen.

Running loadmap
Loadmap is part of the Linux loader on Acorn machines, as well as EBSA285 machines using EBSA285BIOS with an IDE disk. For other machines, please refer to the documentation for your machine.

Edit the loader configuration file /etc/boot.conf so that you can boot either the vmlinuz.bak or vmlinuz images. If you place the vmlinuz kernel first, then this will be the default kernel which the kernel loader will use. More information can be found by typing man boot.conf.

Run the boot loader map utility:
    bash# loadmap -v
    bash#
to update the maps.

You have finished, and are now ready to reboot your machine and try out your new kernel! If you experience problems, please go to the "Problems" step below.

Installing a cross compiled kernel
Install the modules into /usr/src/arm/ as follows:
    bash$ make modules_install INSTALL_MOD_PATH=/usr/src/arm/
    bash$
This will place the modules into the /usr/src/arm/lib/modules/x.y.z directory on the host, which can then be placed into a suitable filesystem, or transferred to the target machine (note that /usr/src/arm/lib/modules/x.y.z should become /lib/modules/x.y.z on the target machine). Please also note that you should not install these kernel modules into the host's root filesystem, since they are incompatible with your host kernel.

The kernel will be available in /usr/src/arm/linux/arch/arm/boot/zImage and the kernel symbol information in /usr/src/arm/linux/System.map. Exactly how to install this is outside the scope of this document.

It is important that you keep the System.map file safe: it contains the symbol information for this kernel, which will be required if you need to report a problem.

1.1 Building the Linux Kernel Image
This section explains the steps taken during compilation of the Linux kernel and the output produced at each stage. The build process depends on the architecture so I would like to emphasize that we only consider building a Linux/x86 kernel.

When the user types 'make zImage' or 'make bzImage' the resulting bootable kernel image is stored as arch/i386/boot/zImage or arch/i386/boot/bzImage respectively. Here is how the image is built:
1. C and assembly source files are compiled into ELF relocatable object format (.o) and some of them are grouped logically into archives (.a) using ar(1).
2. Using ld(1), the above .o and .a are linked into vmlinux which is a statically linked, non-stripped ELF 32-bit LSB 80386 executable file.
3. System.map is produced by nm vmlinux; irrelevant or uninteresting symbols are grepped out.
4. Enter directory arch/i386/boot.
5. Bootsector asm code bootsect.S is preprocessed either with or without -D__BIG_KERNEL__, depending on whether the target is bzImage or zImage, into bbootsect.s or bootsect.s respectively.
6. bbootsect.s is assembled and then converted into 'raw binary' form called bbootsect (or bootsect.s assembled and raw-converted into bootsect for zImage).
7. Setup code setup.S (setup.S includes video.S) is preprocessed into bsetup.s for bzImage or setup.s for zImage. In the same way as the bootsector code, the difference is marked by -D__BIG_KERNEL__ present for bzImage. The result is then converted into 'raw binary' form called bsetup.
8. Enter directory arch/i386/boot/compressed and convert /usr/src/linux/vmlinux to $tmppiggy (tmp filename) in raw binary format, removing .note and .comment ELF sections.
9. gzip -9 < $tmppiggy > $tmppiggy.gz
10. Link $tmppiggy.gz into ELF relocatable (ld -r) piggy.o.
11. Compile compression routines head.S and misc.c (still in arch/i386/boot/compressed directory) into ELF objects head.o and misc.o.
12. Link together head.o, misc.o and piggy.o into bvmlinux (or vmlinux for zImage, don't mistake this for /usr/src/linux/vmlinux!). Note the difference between -Ttext 0x1000 used for vmlinux and -Ttext 0x100000 for bvmlinux, i.e. for bzImage the compression loader is high-loaded.
13. Convert bvmlinux to 'raw binary' bvmlinux.out, removing .note and .comment ELF sections.
14. Go back to arch/i386/boot directory and, using the program tools/build, cat together bbootsect, bsetup and compressed/bvmlinux.out into bzImage (delete extra 'b' above for zImage). This writes important variables like setup_sects and root_dev at the end of the bootsector.

The size of the bootsector is always 512 bytes. The size of the setup must be greater than 4 sectors but is limited above by about 12K. The rule is:
    0x4000 bytes >= 512 + setup_sects * 512 + room for stack while running bootsector/setup
We will see later where this limitation comes from.

The upper limit on the bzImage size produced at this step is about 2.5M for booting with LILO and 0xFFFF paragraphs (0xFFFF0 = 1048560 bytes) for booting raw image, e.g. from floppy disk or CD-ROM (El-Torito emulation mode).

Note that while tools/build does validate the size of boot sector, kernel image and lower bound of setup size, it does not check the *upper* bound of said setup size. Therefore it is easy to build a broken kernel by just adding some large ".space" at the end of setup.S.

1.2 Booting: Overview
The boot process details are architecture-specific, so we shall focus our attention on the IBM PC/IA32 architecture. Due to old design and backward compatibility, the PC firmware boots the operating system in an old-fashioned manner. This process can be separated into the following six logical stages:
1. BIOS selects the boot device.
2. BIOS loads the bootsector from the boot device.
3. Bootsector loads setup, decompression routines and compressed kernel image.
4. The kernel is uncompressed in protected mode.
5. Low-level initialisation is performed by asm code.
6. High-level C initialisation.

1.3 Booting: BIOS POST
1. The power supply starts the clock generator and asserts #POWERGOOD signal on the bus.
2. CPU #RESET line is asserted (CPU now in real 8086 mode).

3. %ds=%es=%fs=%gs=%ss=0, %cs=0xFFFF0000, %eip = 0x0000FFF0 (ROM BIOS POST code).
4. All POST checks are performed with interrupts disabled.
5. IVT (Interrupt Vector Table) initialised at address 0.
6. The BIOS Bootstrap Loader function is invoked via int 0x19, with %dl containing the boot device 'drive number'. This loads track 0, sector 1 at physical address 0x7C00 (0x07C0:0000).

1.4 Booting: bootsector and setup
The bootsector used to boot Linux kernel could be either:
• Linux bootsector (arch/i386/boot/bootsect.S),
• LILO (or other bootloader's) bootsector, or
• no bootsector (loadlin etc)

We consider here the Linux bootsector in detail. The first few lines initialise the convenience macros to be used for segment values:
    29  SETUPSECS = 4               /* default nr of setup-sectors */
    30  BOOTSEG   = 0x07C0          /* original address of boot-sector */
    31  INITSEG   = DEF_INITSEG     /* we move boot here - out of the way */
    32  SETUPSEG  = DEF_SETUPSEG    /* setup starts here */
    33  SYSSEG    = DEF_SYSSEG      /* system loaded at 0x10000 (65536) */
    34  SYSSIZE   = DEF_SYSSIZE     /* system size: # of 16-byte clicks */
(the numbers on the left are the line numbers of bootsect.S file)

The values of DEF_INITSEG, DEF_SETUPSEG, DEF_SYSSEG and DEF_SYSSIZE are taken from include/asm/boot.h:
    /* Don't touch these, unless you really know what you're doing. */
    #define DEF_INITSEG     0x9000
    #define DEF_SYSSEG      0x1000
    #define DEF_SETUPSEG    0x9020
    #define DEF_SYSSIZE     0x7F00

Now, let us consider the actual code of bootsect.S:
    54          movw    $BOOTSEG, %ax
    55          movw    %ax, %ds
    56          movw    $INITSEG, %ax
    57          movw    %ax, %es
    58          movw    $256, %cx
    59          subw    %si, %si
    60          subw    %di, %di
    61          cld
    62          rep
    63          movsw
    64          ljmp    $INITSEG, $go

Lines 54-63 move the bootsector code from address 0x7C00 to 0x90000. This is achieved by:
1. set %ds:%si to $BOOTSEG:0 (0x7C0:0 = 0x7C00)
2. set %es:%di to $INITSEG:0 (0x9000:0 = 0x90000)
3. set the number of 16bit words in %cx (256 words = 512 bytes = 1 sector)
4. clear DF (direction) flag in EFLAGS to auto-increment addresses (cld)
5. go ahead and copy 512 bytes (rep movsw)

The reason this code does not use rep movsd is intentional (hint - .code16).

Line 64 jumps to label go: in the newly made copy of the bootsector, i.e. in segment 0x9000. This and the following three instructions (lines 64-76) prepare the stack at $INITSEG:0x4000-0xC, i.e. %ss = $INITSEG (0x9000) and %sp = 0x3FF4 (0x4000-0xC). This is where the limit on setup size comes from that we mentioned earlier (see Building the Linux Kernel Image).
    65  # bde - changed 0xff00 to 0x4000 to use debugger at 0x6400 up (bde). We
    66  # wouldn't have to worry about this if we checked the top of memory. Also
    67  # my BIOS can be configured to put the wini drive tables in high memory
    68  # instead of in the vector table. The old stack might have clobbered the
    69  # drive table.
    70  go:     movw    $0x4000-12, %di         # 0x4000 is an arbitrary value >=
    71                                          # length of bootsect + length of
    72                                          # setup + room for stack;
    73                                          # 12 is disk parm size.
    74          movw    %ax, %ds                # ax and es already contain INITSEG
    75          movw    %ax, %ss
    76          movw    %di, %sp                # put stack at INITSEG:0x4000-12.

Lines 77-103 patch the disk parameter table for the first disk to allow multi-sector reads:
    77  # Many BIOS's default disk parameter tables will not recognise
    78  # multi-sector reads beyond the maximum sector number specified
    79  # in the default diskette parameter tables - this may mean 7
    80  # sectors in some cases.
    81  #
    82  # Since single sector reads are slow and out of the question,
    83  # we must take care of this by creating new parameter tables
    84  # (for the first disk) in RAM. We will set the maximum sector
    85  # count to 36 - the most we will encounter on an ED 2.88.
    86  #
    87  # High doesn't hurt. Low does.
    88  #
    89  # Segments are as follows: ds = es = ss = cs - INITSEG, fs = 0,
    90  # and gs is unused.
    91          movw    %cx, %fs                # set fs to 0
    92          movw    $0x78, %bx              # fs:bx is parameter table address
    93          pushw   %ds
    94          ldsw    %fs:(%bx), %si          # ds:si is source
    95          movb    $6, %cl                 # copy 12 bytes
    96          pushw   %di                     # di = 0x4000-12.
    97          rep                             # don't need cld -> done on line 66
    98          movsw
    99          popw    %di
    100         popw    %ds
    101         movb    $36, 0x4(%di)           # patch sector count
    102         movw    %di, %fs:(%bx)
    103         movw    %es, %fs:2(%bx)

The floppy disk controller is reset using BIOS service int 0x13 function 0 (reset FDC) and setup sectors are loaded immediately after the bootsector, i.e. at physical address 0x90200 ($INITSEG:0x200), again using BIOS service int 0x13, function 2 (read sector(s)). This happens during lines 107-124:
    107 load_setup:
    108         xorb    %ah, %ah                # reset FDC
    109         xorb    %dl, %dl
    110         int     $0x13
    111         xorw    %dx, %dx                # drive 0, head 0
    112         movb    $0x02, %cl              # sector 2, track 0
    113         movw    $0x0200, %bx            # address = 512, in INITSEG
    114         movb    $0x02, %ah              # service 2, "read sector(s)"
    115         movb    setup_sects, %al        # (assume all on head 0, track 0)
    116         int     $0x13                   # read it
    117         jnc     ok_load_setup           # ok - continue
    118         pushw   %ax                     # dump error code
    119         call    print_nl
    120         movw    %sp, %bp
    121         call    print_hex
    122         popw    %ax
    123         jmp     load_setup
    124 ok_load_setup:

If loading failed for some reason (bad floppy or someone pulled the diskette out during the operation), we dump error code and retry in an endless loop. The only way to get out of it is to reboot the machine, unless retry succeeds but usually it doesn't (if something is wrong it will only get worse).
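The physical addresses quoted in this section all follow from the real-mode rule physical = segment * 16 + offset, and the setup size rule follows from the stack being placed near offset 0x4000 of the same segment. The sketch below recomputes them. The function names phys and max_setup_sects are hypothetical, and the 512 bytes of stack room passed in the last call is an arbitrary illustrative value, not a figure from the kernel source.

```shell
#!/bin/sh
# Real-mode address arithmetic for the values quoted in this section.
# physical = segment * 16 + offset

phys() {
    # $1 = segment, $2 = offset; prints the physical address in hex.
    printf '0x%X\n' $(( ($1 << 4) + $2 ))
}

# Largest setup_sects that satisfies
#   0x4000 >= 512 + setup_sects * 512 + room for stack
# $1 = bytes reserved for the stack (a value chosen by the caller).
max_setup_sects() {
    echo $(( (0x4000 - 512 - $1) / 512 ))
}

phys 0x07C0 0x0000                  # where the BIOS loads the bootsector
phys 0x9000 0x0000                  # INITSEG: where the bootsector moves itself
phys 0x9000 0x0200                  # where the setup sectors are loaded
phys 0x9000 $(( 0x4000 - 12 ))      # initial top of stack (INITSEG:0x3FF4)
max_setup_sects 512                 # assumed 512 bytes kept for the stack
```

Because bootsector, setup and stack all live in the 0x4000 bytes starting at INITSEG:0, every extra setup sector comes directly out of the space left for the stack.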

If loading setup_sects sectors of setup code succeeded we jump to label ok_load_setup:. Then we proceed to load the compressed kernel image at physical address 0x10000. This is done to preserve the firmware data areas in low memory (0-64K). After the kernel is loaded, we jump to $SETUPSEG:0 (arch/i386/boot/setup.S). Once the data is no longer needed (e.g. no more calls to BIOS) it is overwritten by moving the entire (compressed) kernel image from 0x10000 to 0x1000 (physical addresses, of course). This is done by setup.S which sets things up for protected mode and jumps to 0x1000 which is the head of the compressed kernel, i.e. arch/i386/boot/compressed/{head.S,misc.c}. This sets up stack and calls decompress_kernel() which uncompresses the kernel to address 0x100000 and jumps to it.

Note that old bootloaders (old versions of LILO) could only load the first 4 sectors of setup, which is why there is code in setup to load the rest of itself if needed. Also, the code in setup has to take care of various combinations of loader type/version vs zImage/bzImage and is therefore highly complex.

Let us examine the kludge in the bootsector code that allows loading a big kernel, known also as "bzImage". The setup sectors are loaded as usual at 0x90200, but the kernel is loaded one 64K chunk at a time using a special helper routine that calls BIOS to move data from low to high memory. This helper routine is referred to by bootsect_kludge in bootsect.S and is defined as bootsect_helper in setup.S. The bootsect_kludge label in setup.S contains the value of setup segment and the offset of bootsect_helper code in it so that bootsector can use the lcall instruction to jump to it (inter-segment jump). The reason why it is in setup.S is simply because there is no more space left in bootsect.S (which is strictly not true - there are approximately 4 spare bytes and at least 1 spare byte in bootsect.S but that is not enough, obviously). This routine uses BIOS service int 0x15 (ax=0x8700) to move to high memory and resets %es to always point to 0x10000. This ensures that the code in bootsect.S doesn't run out of low memory when copying data from disk.

1.5 Using LILO as a bootloader
There are several advantages in using a specialised bootloader (LILO) over a bare bones Linux bootsector:
1. Ability to choose between multiple Linux kernels or even multiple OSes.
2. Ability to pass kernel command line parameters (there is a patch called BCP that adds this ability to bare-bones bootsector+setup).
3. Ability to load much larger bzImage kernels - up to 2.5M vs 1M.

Old versions of LILO (v17 and earlier) could not load bzImage kernels. The newer versions (as of a couple of years ago or earlier) use the same technique as bootsect+setup of moving data from low into high memory by means of BIOS services. Some people (Peter Anvin notably) argue that zImage support should be removed. The main reason (according to Alan Cox) it stays is that there are apparently some broken BIOSes that make it impossible to boot bzImage kernels while loading zImage ones fine.

The last thing LILO does is to jump to setup.S and things proceed as normal.

1.6 High level initialisation
By "high-level initialisation" we consider anything which is not directly related to bootstrap, even though parts of the code to perform this are written in asm, namely arch/i386/kernel/head.S which is the head of the uncompressed kernel. The following steps are performed:
1. Initialise segment values (%ds = %es = %fs = %gs = __KERNEL_DS = 0x18).
2. Initialise page tables.
3. Enable paging by setting PG bit in %cr0.
4. Zero-clean BSS (on SMP, only first CPU does this).
5. Copy the first 2k of bootup parameters (kernel commandline).
6. Check CPU type using EFLAGS and, if possible, cpuid, able to detect 386 and higher.
7. The first CPU calls start_kernel(), all others call arch/i386/kernel/smpboot.c:initialize_secondary() if ready=1, which just reloads esp/eip and doesn't return.

The init/main.c:start_kernel() is written in C and does the following:
1. Take a global kernel lock (it is needed so that only one CPU goes through initialisation).
2. Perform arch-specific setup (memory layout analysis, copying boot command line again, etc.).
3. Print Linux kernel "banner" containing the version, compiler used to build it etc. to the kernel ring buffer for messages. This is taken from the variable linux_banner defined in init/version.c and is the same string as displayed by cat /proc/version.
4. Initialise traps.
5. Initialise irqs.
6. Initialise data required for scheduler.
7. Initialise time keeping data.
8. Initialise softirq subsystem.
9. Parse boot commandline options.
10. Initialise console.
11. If module support was compiled into the kernel, initialise dynamical module loading facility.
12. If "profile=" command line was supplied, initialise profiling buffers.
13. kmem_cache_init(), initialise most of slab allocator.
14. Enable interrupts.
15. Calculate BogoMips value for this CPU.

16. Call mem_init() which calculates max_mapnr, totalram_pages and high_memory and prints out the "Memory: ..." line.
17. kmem_cache_sizes_init(), finish slab allocator initialisation.
18. Initialise data structures used by procfs.
19. fork_init(), create uid_cache, initialise max_threads based on the amount of memory available and configure RLIMIT_NPROC for init_task to be max_threads/2.
20. Create various slab caches needed for VFS, VM, buffer cache, etc.
21. If System V IPC support is compiled in, initialise the IPC subsystem. Note that for System V shm, this includes mounting an internal (in-kernel) instance of shmfs filesystem.
22. If quota support is compiled into the kernel, create and initialise a special slab cache for it.
23. Perform arch-specific "check for bugs" and, whenever possible, activate workaround for processor/bus/etc bugs. Comparing various architectures reveals that "ia64 has no bugs" and "ia32 has quite a few bugs"; a good example is the "f00f bug" which is only checked if kernel is compiled for less than 686 and worked around accordingly.
24. Set a flag to indicate that a schedule should be invoked at "next opportunity" and create a kernel thread init() which execs execute_command if supplied via "init=" boot parameter, or tries to exec /sbin/init, /etc/init, /bin/init, /bin/sh in this order; if all these fail, panic with "suggestion" to use "init=" parameter.
25. Go into the idle loop; this is an idle thread with pid=0.

An important thing to note here is that the init() kernel thread calls do_basic_setup() which in turn calls do_initcalls() which goes through the list of functions registered by means of __initcall or module_init() macros and invokes them. These functions either do not depend on each other or their dependencies have been manually fixed by the link order in the Makefiles. This means that, depending on the position of directories in the trees and the structure of the Makefiles, the order in which initialisation functions are invoked can change. Sometimes, this is important because you can imagine two subsystems A and B with B depending on some initialisation done by A. If A is compiled statically and B is a module then B's entry point is guaranteed to be invoked after A prepared all the necessary environment. If A is a module, then B is also necessarily a module so there are no problems. But what if both A and B are statically linked into the kernel? The order in which they are invoked depends on the relative entry point offsets in the .initcall.init ELF section of the kernel image. Rogier Wolff proposed to introduce a hierarchical "priority" infrastructure whereby modules could let the linker know in what (relative) order they should be linked, but so far there are no patches available that implement this in a sufficiently elegant manner to be acceptable into the kernel. Therefore, make sure your link order is correct. If, in the example above, A and B work fine when compiled statically once, they will always work, provided they are listed sequentially in the same Makefile. If they don't work, change the order in which their object files are listed.
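The dependence on listing order can be illustrated with a toy model. The sketch below is a userspace shell analogy only, not kernel code: run_initcalls stands in for do_initcalls(), and init_A and init_B are hypothetical stand-ins for the two subsystems A and B discussed above.

```shell
#!/bin/sh
# Toy model of do_initcalls(): the functions run in the order in which
# they are listed, just as statically linked initcalls run in link order.
# init_A and init_B are hypothetical stand-ins for two subsystems.
A_READY=no

init_A() { A_READY=yes; }

init_B() {
    # B depends on state prepared by A.
    [ "$A_READY" = yes ] || { echo "init_B: A not initialised"; return 1; }
    echo "init_B: ok"
}

run_initcalls() {
    A_READY=no
    for fn in "$@"; do
        "$fn" || return 1
    done
}

run_initcalls init_A init_B                          # A listed first: B works
run_initcalls init_B init_A || echo "wrong order"    # B listed first: B fails
```

Swapping the two names in the argument list plays the same role as swapping the two object files in a Makefile.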

Another thing worth noting is Linux's ability to execute an "alternative init program" by means of passing "init=" boot commandline. This is useful for recovering from accidentally overwritten /sbin/init or debugging the initialisation (rc) scripts and /etc/inittab by hand, executing them one at a time.

1.7 SMP Bootup on x86
On SMP, the BP goes through the normal sequence of bootsector, setup etc until it reaches the start_kernel(), and then on to smp_init() and especially arch/i386/kernel/smpboot.c:smp_boot_cpus(). The smp_boot_cpus() goes in a loop for each apicid (until NR_CPUS) and calls do_boot_cpu() on it. What do_boot_cpu() does is create (i.e. fork_by_hand) an idle task for the target cpu and write in well-known locations defined by the Intel MP spec (0x467/0x469) the EIP of trampoline code found in trampoline.S. Then it generates STARTUP IPI to the target cpu which makes this AP execute the code in trampoline.S.

The boot CPU creates a copy of trampoline code for each CPU in low memory. The AP code writes a magic number in its own code which is verified by the BP to make sure that AP is executing the trampoline code. The requirement that trampoline code must be in low memory is enforced by the Intel MP specification.

The trampoline code simply sets %bx register to 1, enters protected mode and jumps to startup_32 which is the main entry to arch/i386/kernel/head.S.

Now, the AP starts executing head.S and discovering that it is not a BP, it skips the code that clears BSS and then enters initialize_secondary() which just enters the idle task for this CPU - recall that init_tasks[cpu] was already initialised by BP executing do_boot_cpu(cpu).

Note that init_task can be shared but each idle thread must have its own TSS. This is why init_tss[NR_CPUS] is an array.

1.8 Freeing initialisation data and code
When the operating system initialises itself, most of the code and data structures are never needed again. Most operating systems (BSD, FreeBSD etc.) cannot dispose of this unneeded information, thus wasting precious physical kernel memory. The excuse they use (see McKusick's 4.4BSD book) is that "the relevant code is spread around various subsystems and so it is not feasible to free it". Linux, of course, cannot use such excuses because under Linux "if something is possible in principle, then it is already implemented or somebody is working on it".

So, as I said earlier, Linux kernel can only be compiled as an ELF binary, and now we find out the reason (or one of the reasons) for that. The reason related to throwing away initialisation code/data is that Linux provides two macros to be used:

• __init - for initialisation code
• __initdata - for data

These evaluate to gcc attribute specificators (also known as "gcc magic") as defined in include/linux/init.h:
    #ifndef MODULE
    #define __init        __attribute__ ((__section__ (".text.init")))
    #define __initdata    __attribute__ ((__section__ (".data.init")))
    #else
    #define __init
    #define __initdata
    #endif

What this means is that if the code is compiled statically into the kernel (i.e. MODULE is not defined) then it is placed in the special ELF section .text.init, which is declared in the linker map in arch/i386/vmlinux.lds. Otherwise (i.e. if it is a module) the macros evaluate to nothing.

What happens during boot is that the "init" kernel thread (function init/main.c:init()) calls the arch-specific function free_initmem() which frees all the pages between addresses __init_begin and __init_end. On a typical system (my workstation), this results in freeing about 260K of memory.

The functions registered via module_init() are placed in .initcall.init which is also freed in the static case. The current trend in Linux, when designing a subsystem (not necessarily a module), is to provide init/exit entry points from the early stages of design so that in the future, the subsystem in question can be modularised if needed. An example of this is pipefs, see fs/pipe.c. Even if a given subsystem will never become a module, e.g. bdflush (see fs/buffer.c), it is still nice and tidy to use the module_init() macro against its initialisation function, provided it does not matter when exactly the function is called.

There are two more macros which work in a similar manner, called __exit and __exitdata, but they are more directly connected to the module support and therefore will be explained in a later section.

1.9 Processing kernel command line
Let us recall what happens to the commandline passed to kernel during boot:
1. LILO (or BCP) accepts the commandline using BIOS keyboard services and stores it at a well-known location in physical memory, as well as a signature saying that there is a valid commandline there.
2. arch/i386/kernel/head.S copies the first 2k of it out to the zeropage.

3. arch/i386/kernel/setup.c:parse_mem_cmdline() (called by setup_arch(), itself called by start_kernel()) copies 256 bytes from zeropage into saved_command_line which is displayed by /proc/cmdline. This same routine processes the "mem=" option if present and makes appropriate adjustments to VM parameters.
4. We return to commandline in parse_options() (called by start_kernel()) which processes some "in-kernel" parameters (currently "init=" and environment/arguments for init) and passes each word to checksetup().
5. checksetup() goes through the code in ELF section .setup.init and invokes each function, passing it the word if it matches. Note that using the return value of 0 from the function registered via __setup(), it is possible to pass the same "variable=value" to more than one function with "value" invalid to one and valid to another. Jeff Garzik commented: "hackers who do that get spanked :)" Why? Because this is clearly ld-order specific, i.e. kernel linked in one order will have functionA invoked before functionB and another will have it in reversed order, with the result depending on the order.

So, how do we write code that processes boot commandline? We use the __setup() macro defined in include/linux/init.h:
    /*
     * Used for kernel command line parameter setup
     */
    struct kernel_param {
            const char *str;
            int (*setup_func)(char *);
    };

    extern struct kernel_param __setup_start, __setup_end;

    #ifndef MODULE
    #define __setup(str, fn) \
            static char __setup_str_##fn[] __initdata = str; \
            static struct kernel_param __setup_##fn __initsetup = \
            { __setup_str_##fn, fn }
    #else
    #define __setup(str,func) /* nothing */
    #endif

So, you would typically use it in your code like this (taken from code of real driver, BusLogic HBA drivers/scsi/BusLogic.c):
    static int __init BusLogic_Setup(char *str)
    {
            int ints[3];

            (void)get_options(str, ARRAY_SIZE(ints), ints);

            if (ints[0] != 0) {
                    BusLogic_Error("BusLogic: Obsolete Command Line Entry "
                                   "Format Ignored\n", NULL);
                    return 0;
            }
            if (str == NULL || *str == '\0')
                    return 0;
            return BusLogic_ParseDriverOptions(str);
    }

    __setup("BusLogic=", BusLogic_Setup);

Note that __setup() does nothing for modules, so the code that wishes to process boot commandline and can be either a module or statically linked must invoke its parsing function manually in the module initialisation routine. This also means that it is possible to write code that processes parameters when compiled as a module but not when it is static or vice versa.
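The word-by-word prefix matching described above can be mimicked from userspace, for example to check what a driver would have been handed at boot. The sketch below is a shell analogy under assumptions, not the kernel's implementation: boot_param is a hypothetical helper name and the sample command line is made up.

```shell
#!/bin/sh
# Userspace sketch of the matching done for each word of the command line:
# print the value of the first word that starts with the given prefix.
# $1 = prefix such as "BusLogic=", $2 = command line string.
boot_param() {
    prefix=$1
    for word in $2; do
        case "$word" in
            "$prefix"*)
                echo "${word#"$prefix"}"
                return 0
                ;;
        esac
    done
    return 1
}

# Hypothetical command line; on a running system use "$(cat /proc/cmdline)".
cmdline="root=/dev/hda1 BusLogic=NoProbe init=/bin/sh"
boot_param "BusLogic=" "$cmdline"
boot_param "init=" "$cmdline"
```

Like the kernel's matching, the sketch compares only the leading prefix of each word and hands the remainder of the word to the caller.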
