
Software for Embedded Systems (Embedded Linux) CS424

Tathagata Ray

Embedded Linux
Embedded Linux is essentially the same Linux that runs on millions of desktops and servers, but it is adapted to a specific use case. There are other embedded operating systems such as VxWorks, Integrity, and Symbian. The biggest difference between a traditional embedded OS and Linux is the clear separation between the kernel and the applications. Embedded Linux is steadily gaining popularity.

Standards Based
Embedded Linux adheres to industry standards. For example, the Portable Operating System Interface for Unix (POSIX) is a standard specified by the IEEE for handling threads and interprocess communication. Most Linux distributions are POSIX compliant.

Process Isolation and control


Linux is a multitasking OS: it manages tasks and isolates them from the kernel and from each other. Each process runs in its own address space and cannot access the memory of another process. Processes cannot access arbitrary kernel memory either; they must use system calls. Linux uses a virtual memory-management system, although many embedded systems do not page to disk (swap) because there is no disk. Linux also gives a uniform interface to resources: for example, the malloc() function allocates memory from the heap and works the same way regardless of the processor. The same is true for files.

Other reasons
Wide peripheral support.
Security: being open source is an advantage; Linux provides privilege levels and Pluggable Authentication Modules (PAM).
PAM is a framework that provides a uniform way for authentication-related activities to take place.

http://docs.oracle.com/cd/E19082-01/819-2145/ch3pam-01/index.html

Commercial reasons
Apart from the kernel, Linux comes with a large collection of software that works together. You can use a desktop machine to build and test software for the embedded system (cross compilation). There are no royalties, and you have complete control over the source code of the software.

Basic Steps of embedded Linux development process


Target hardware: vendors create development boards containing the chip and a collection of peripherals and connectors. They are large, bulky, and designed to be easily handled.
Obtaining Linux: Linux is nearly always included with a development board and supports the peripherals provided by the chip or the board.
Booting Linux: configure the software services Linux needs to boot and ensure the cabling is correct and attached.
Development environment: set up cross-compilation.
System design: the Linux distribution needs to be optimized according to the required design.

Anatomy of Embedded Linux


Software components of an embedded Linux system
Boot Loader
This is the software that runs first on the system. Its primary responsibility is to initialize the processor and get it ready to run the operating system. The kernel is typically stored in flash memory; the boot loader copies it to RAM, sets the instruction pointer to that memory location, and tells the processor to start executing at the IP's current location.

Anatomy of Embedded Linux


Kernel
The Linux kernel was created by Linus Torvalds, then a Finnish computer science student, in 1991. Linux has been ported to nearly every major processor, so little kernel development work is usually required. The kernel can be thinned down to reduce boot time, and no low-level programming is needed just to run it.

Anatomy of Embedded Linux


Root File system
A file system is a way of representing a hierarchical collection of directories, where each directory can contain other subdirectories and files. The top of the tree is called the root. On booting, Linux must be able to mount the root file system.

Anatomy of Embedded Linux


Your Application
When Linux starts, it looks for a program to execute by default, or you can supply it with the name of something to run. init (short for initialization) is a daemon process that is the direct or indirect ancestor of all other processes. It automatically adopts all orphaned processes. Init is the first process started during booting, and is typically assigned PID number 1. It is started by the kernel using a hardcoded filename, and if the kernel is unable to start it, a kernel panic will result. Init continues running until the system is shut down.

Tool Chain
In order to build the operating system and applications, we need a tool chain. A tool chain provides the assembler, compiler, and linker, along with a number of utilities needed to develop a Linux-based system. The most common tool chain is the GNU tools. The command sudo apt-get install gcc will install the native IA-32 GCC tool chain on a Debian/Ubuntu-based system.
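As a sketch, a Debian/Ubuntu host can also pull a prebuilt cross tool chain from the distribution's packages (package names vary slightly by release; gcc-arm-linux-gnueabihf is the usual name for the ARM hard-float compiler):

    # native compiler for the host
    sudo apt-get install gcc

    # prebuilt ARM hard-float cross compiler on recent Debian/Ubuntu releases
    sudo apt-get install gcc-arm-linux-gnueabihf

    # verify the cross compiler is installed and targets ARM
    arm-linux-gnueabihf-gcc --version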

Tool Chain
The host machine is the machine you develop your applications on. In most cases today that will be an IA-32-based machine running a desktop distribution of Linux such as Fedora/CentOS, Ubuntu, or SUSE/openSUSE. The target device is the actual embedded device that you are developing the software for. The device could be based on IA-32, ARM, MIPS, PowerPC, or any of the other CPU architectures supported by Linux.

Cross Compilation
When the host and target architectures are the same, you are said to be doing native development. When the host and target platforms differ, you are said to be doing cross development. In this case the tool chain that you download and run on your host must be a specialized version capable of executing on the host CPU architecture but building for the target CPU architecture. All of the GNU tools support both native host development and cross-development configurations.
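A minimal sketch of the difference, assuming the gcc-arm-linux-gnueabihf cross tool chain from the earlier example and a hypothetical hello.c:

    # native development: build and run on the host
    gcc -o hello hello.c
    ./hello

    # cross development: build on the host, run on the ARM target
    arm-linux-gnueabihf-gcc -o hello-arm hello.c
    file hello-arm     # reports an ARM ELF executable; it will not run on the x86 host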

Getting the tools


The compiler is provided by the GNU Compiler Collection (GCC) (http://gcc.gnu.org/), and the assembler, linker, library, and object manipulation tools are provided by the GNU binutils project. There are a number of scripts that greatly simplify the generation of the tool chain, for example crosstool-NG, which cross-builds tool chains for all architectures supported by GCC.
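As an illustration (assuming crosstool-NG is already installed and ct-ng is on the PATH; sample names vary between versions), a tool chain is typically generated from one of its sample configurations:

    ct-ng list-samples               # show the available sample target configurations
    ct-ng arm-unknown-linux-gnueabi  # select a sample (illustrative choice)
    ct-ng menuconfig                 # optionally tweak the configuration
    ct-ng build                      # build the cross tool chain (this takes a while)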

Getting the Tools


There are four key attributes of the target tool chain that must be selected:
The target CPU architecture (for example, IA-32 or ARM).
The application binary interface (ABI)/calling convention (for example, fastcall, x32, System V, the AMD64 ABI, or EABI).
The object format (for example, ELF).
Executable and Linkable Format (ELF, formerly called Extensible Linking Format) is a common standard file format for executables, object code, shared libraries, and core dumps.

The target operating system (for example, bare metal or Linux). This is important for the calling conventions used to interact with the system.
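These attributes are commonly encoded in the GNU target triplet that prefixes the tool names. A rough sketch (the exact triplets depend on how the tool chains were built):

    gcc -dumpmachine                       # e.g. x86_64-linux-gnu, the native host triplet
    arm-linux-gnueabihf-gcc -dumpmachine   # ARM, Linux, EABI hard-float cross compiler
    # general pattern: <arch>-<vendor>-<os>-<abi>, e.g. arm-none-eabi for bare-metal ARM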

Tools Overview
ar: Creates, modifies, and extracts from archives. An archive is a collection of objects or libraries and provides a simple way to distribute a large number of objects/libraries in a single file.
as: The GNU assembler. It can be used directly and is used to assemble the output of the GCC compiler.
gcc: The GNU C and C++ compiler. The default invocation of gcc performs preprocessing, compilation, assembly, and linking to generate an executable image.

Tools overview
ld: The GNU linker. The linker combines objects and archives, relocates their data, and resolves symbol references. A map file of the resulting output can be generated and is very useful in understanding the layout of the application.
objdump: Displays information from object files. objdump -S <file> disassembles the object and, where possible, interleaves the disassembly with the source code; objdump -S vmlinux generates a useful listing for debugging kernel oopses (at least for the statically linked kernel elements).
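A short illustrative sequence showing how these tools fit together (the file names foo.c, bar.c, main.c, and app are hypothetical):

    gcc -c foo.c bar.c              # compile to object files foo.o and bar.o
    ar rcs libfoo.a foo.o bar.o     # bundle the objects into a static archive
    gcc -o app main.c libfoo.a -Wl,-Map=app.map   # link, asking ld to emit a map file
    objdump -S app | less           # disassemble, interleaved with source where possible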

Kernel
The source code of the kernel is available at http://www.kernel.org. Pick the latest stable kernel to start your work. The kernel source tree contains the core kernel code, the device drivers, and the platform support code. Know the kernel features; you may not need all of them. Choose the right kernel version.
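For example (the release number below is only illustrative; pick whatever stable version you actually need):

    # fetch and unpack a stable release from kernel.org
    wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.tar.xz
    tar xf linux-6.6.tar.xz
    cd linux-6.6

    # or track the stable tree with git
    git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git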

Device driver support


If the device has been on the market for some time, the driver is most likely already present in the kernel source tree. Otherwise, get it from the manufacturer. You can also develop your own driver and upstream it.

Building a Kernel

Kernel source tree


arch: Contains all of the architecture-specific kernel code. It has further subdirectories, one per supported architecture, for example i386 and alpha.
include: Contains most of the include files needed to build the kernel code. It too has further subdirectories, including one for every supported architecture. The include/asm subdirectory is a soft link to the real include directory needed for this architecture, for example include/asm-i386. To change architectures you need to edit the kernel makefile and rerun the Linux kernel configuration program.
init: Contains the initialization code for the kernel; it is a very good place to start looking at how the kernel works.

Kernel Source Tree


mm: All of the memory-management code. The architecture-specific memory-management code lives down in arch/*/mm/, for example arch/i386/mm/fault.c.
drivers: All of the system's device drivers. They are further subdivided into classes of device driver, for example block.
ipc: The kernel's inter-process communication code.
modules: Simply a directory used to hold built modules.
fs: All of the file system code. This is further subdivided into directories, one per supported file system, for example vfat and ext2.
kernel: The main kernel code. Again, the architecture-specific kernel code is in arch/*/kernel.
net: The kernel's networking code.
lib: The kernel's library code. The architecture-specific library code can be found in arch/*/lib/.
scripts: The scripts (for example awk and tk scripts) that are used when the kernel is configured.

Kernel options
The .config file controls the wide range of kernel configuration items.

You can open and view .config, but do not make changes to it manually.

menuconfig

.config file
CONFIG_FEATURE_XX=y: the feature is built into the kernel at build time.
# CONFIG_FEATURE_XX is not set: the feature is not included.
CONFIG_FEATURE_XX=m: the feature is built as a dynamically loaded module rather than statically compiled into the kernel. The modules are stored on the root file system and loaded automatically as required during the boot sequence. In general, dynamically loaded modules are used for device drivers.
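A sketch of how the configuration is usually driven, assuming an ARM target and the cross tool chain prefix used in the earlier examples (the specific options shown are just illustrations):

    # start from a default configuration, then adjust interactively
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- defconfig
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- menuconfig

    # the result is written to .config; typical entries look like
    #   CONFIG_EXT4_FS=y
    #   CONFIG_USB_STORAGE=m
    #   # CONFIG_DEBUG_KERNEL is not set
    grep EXT4 .config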

Building the Kernel


The bzImage file consists of a compressed kernel and startup code that decompresses the kernel image. A map file is a list of addresses and their associated program symbols. The Linux kernel map file is System.map.
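For illustration, the usual build invocations from the configured kernel source tree (targets differ per architecture: bzImage is the x86 target, zImage is common on ARM; the rootfs path is hypothetical):

    # native x86 build
    make -j$(nproc) bzImage modules
    sudo make modules_install install

    # cross build for ARM
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j$(nproc) zImage modules
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- INSTALL_MOD_PATH=/path/to/rootfs modules_install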

Why compressed?
The first reason is to save on the storage required for the kernel. In embedded systems the kernel is usually stored on a flash device (although general mass storage is also an option). The second reason to use compression is boot speed. Some embedded systems, however, keep the image uncompressed and run the kernel from flash without first copying it to memory; this is known as execute in place (XIP).

Root File System Build


The file system layout in most cases follows the layout defined by the Filesystem Hierarchy Standard (FHS). In many embedded systems the FHS is not followed exactly, as the root file system is trimmed significantly. An FHS-based file system uses the directory layout below:
/bin The programs normally accessible by all users are stored here, for example a shell such as sh or busybox.
/dev This contains special block/character device and named pipe files.
/etc This is the general configuration storage directory, holding items such as the password file and DHCP/network configuration information.
/lib The shared libraries for other programs are stored here.
/lib/modules This holds the loadable device drivers in the form of kernel modules.

Root File System


/root This is the home directory for the root user.
/sbin Programs that are normally run by the system, or as root, are stored here. The init program is stored here.
/tmp Temporary files are created and stored here; they are not guaranteed to survive a system restart.
/usr This is where a lot of miscellaneous packages and configuration items are stored.
/var This is for variable files; it stores files that change constantly during the operation of the system. Items such as logs are stored here.
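A minimal sketch of creating such a trimmed skeleton for a target root file system (the rootfs path is illustrative; the brace expansion assumes a bash shell):

    mkdir -p rootfs/{bin,dev,etc,lib,lib/modules,proc,root,sbin,sys,tmp,usr,var}
    chmod 1777 rootfs/tmp    # /tmp is world-writable with the sticky bit
    # device nodes, busybox, libraries, and /etc contents are then populated
    # by the build system (or by hand on a very small system)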

Initial RAM disk


The initial RAM disk (initrd) is an initial root file system that is mounted prior to when the real root file system is available. The initrd is bound to the kernel and loaded as part of the kernel boot procedure. The kernel then mounts this initrd as part of the two-stage boot process to load the modules to make the real file systems available and get at the real root file system.

Initial RAM disk


The initrd contains a minimal set of directories and executables to achieve this, such as the insmod tool to install kernel modules into the kernel. In the case of desktop or server Linux systems, the initrd is a transient file system. Its lifetime is short, only serving as a bridge to the real root file system. In embedded systems with no mutable storage, the initrd is the permanent root file system.
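As an illustration, a compressed initrd/initramfs image is commonly produced from a staged directory with cpio (the directory name rootfs is hypothetical):

    cd rootfs
    find . | cpio -o -H newc | gzip > ../initrd.img.gz   # newc is the initramfs archive format
    # the image is then either handed to the boot loader alongside the kernel
    # or built into the kernel via CONFIG_INITRAMFS_SOURCE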

Busybox
BusyBox was first written by Bruce Perens in 1996 for the Debian GNU/Linux setup disk. The goal was to create a bootable GNU/Linux system on a single floppy disk that could be used as an install and rescue disk. A single floppy disk can hold around 1.4 to 1.7 MB, so there's not much room available for the Linux kernel and associated user applications.

Busybox
BusyBox exploits the fact that the standard Linux utilities share many common elements. For example, many file-based utilities (such as grep and find) require code to recurse a directory in search of files. When the utilities are combined into a single executable, they can share these common elements, which results in a smaller executable. In fact, BusyBox can pack almost 3.5 MB of utilities into around 200 KB. This provides greater functionality to bootable floppy disks and embedded devices that use Linux. BusyBox is a single, usually statically linked, executable.

Busybox Linking
Each BusyBox command is created by making a link to the BusyBox executable.

Create softlinks

Busybox Linking
You can also invoke BusyBox by issuing a command as an argument on the command line. For example, entering /bin/busybox ls

will also cause BusyBox to behave as 'ls'. Of course, adding '/bin/busybox' to every command would be painful, so most people invoke BusyBox through links to the BusyBox binary. For example, ln -s /bin/busybox ls creates a soft link to the BusyBox executable, and different commands can be created with further soft links. Running ./ls will then cause BusyBox to behave as 'ls' (if the 'ls' command has been compiled into BusyBox). Generally speaking, you should never need to make all these links yourself, as the BusyBox build system will do this for you when you run the 'make install' command.
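A rough sketch of the usual BusyBox build flow, where CONFIG_PREFIX points at the target root file system being assembled (path is hypothetical):

    make defconfig                               # start from the default applet selection
    make menuconfig                              # optionally enable/disable applets
    make
    make CONFIG_PREFIX=/path/to/rootfs install   # installs busybox plus the applet symlinks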

C Library
Applications need a C library; the most commonly used implementation is the GNU C Library (glibc). It provides all the functions defined in the standard; in fact, it complies with the ISO C99, POSIX.1c, POSIX.1j, POSIX.1d, UNIX 98, and Single UNIX Specification standards. Because glibc is so comprehensive, it can be considered too large for use in an embedded system. Lighter-weight alternatives include:
Embedded GLIBC (EGLIBC)
uClibc
Bionic

Boot sequence
When a system is first booted, the processor executes code at a well-known location (the BIOS, in flash memory). When a boot device is found, the first-stage boot loader is loaded into RAM and executed. Its job is to load the second-stage boot loader. When the second-stage boot loader is in RAM and executing, Linux and an optional initial RAM disk (temporary root file system) are loaded into memory. The second-stage boot loader passes control to the kernel image, and the kernel is decompressed and initialized. The kernel checks the system hardware, enumerates the attached hardware devices, mounts the root device, and then loads the necessary kernel modules. When complete, the first user-space program (init) starts, and high-level system initialization is performed.

Two stages of booting system

System startup
In a PC, booting Linux begins in the BIOS at address 0xFFFF0. This address is a physical address, as the MMU has not yet been enabled. The first step of the BIOS is the power-on self test (POST). The job of the POST is to perform a check of the hardware. The second step of the BIOS is local device enumeration and initialization.

Flash memory

System startup
The BIOS is made up of two parts:
the POST code and the runtime services.

After the POST is complete, it is flushed from memory, but the BIOS runtime services remain and are available to the target operating system.

System startup
To boot an operating system, the BIOS runtime searches for devices that are both active and bootable, in order of preference. A boot device can be a:
floppy disk, CD-ROM, partition on a hard disk, device on the network, or USB flash memory stick.

System startup
Commonly, Linux is booted from a hard disk, where the Master Boot Record (MBR) contains the primary boot loader. The MBR is a 512-byte sector, located in the first sector on the disk (sector 1 of cylinder 0, head 0). After the MBR is loaded into RAM, the BIOS yields control to it.

Stage 1 Boot Loader


The primary boot loader that resides in the MBR is a 512-byte image containing both program code and a small partition table. The first 446 bytes are the primary boot loader, which contains both executable code and error message text. The next sixty-four bytes are the partition table, which contains a record for each of four partitions (sixteen bytes each). The MBR ends with two bytes that are defined as the magic number (0xAA55). The magic number serves as a validation check of the MBR.
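As a quick illustration (assuming the boot disk is /dev/sda; requires root), the MBR and its magic number can be inspected directly:

    sudo dd if=/dev/sda of=mbr.bin bs=512 count=1   # copy the first 512-byte sector
    xxd mbr.bin | tail -n 1                         # the last two bytes are 55 aa, the magic number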

Stage 1 Boot Loader


The job of the primary boot loader is to find and load the secondary boot loader (stage 2). It does this by looking through the partition table for an active partition. When it finds an active partition, it scans the remaining partitions in the table to ensure that they're all inactive. When this is verified, the active partition's boot record is read from the device into RAM and executed.

Stage 2 Boot Loader


The secondary, or second-stage, boot loader could be more aptly called the kernel loader. The task at this stage is to
load the Linux kernel and optional initial RAM disk.

The first- and second-stage boot loaders combined are called Linux Loader (LILO) or GRand Unified Bootloader (GRUB) in the x86 PC environment.

GRUB
GNU GRUB is a bootloader (can also be spelled boot loader) capable of loading a variety of free and proprietary operating systems.
Linux, DOS, Windows, or BSD.

GRUB can be run from or installed to any device (floppy disk, hard disk, CD-ROM, USB drive, network drive) and can load operating systems from just as many locations, including network drives. It can also decompress operating system images before booting them.
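For illustration, a GRUB 2 menu entry names the kernel, its command line, and the initrd; the file names and root device below are hypothetical, and the file is normally generated by grub-mkconfig rather than written by hand:

    # fragment of /boot/grub/grub.cfg
    menuentry 'Linux' {
            linux  /boot/vmlinuz root=/dev/sda1 ro quiet
            initrd /boot/initrd.img
    }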

Stage 2 Boot Loader


With the second-stage boot loader in memory, the file system is consulted, and the default kernel image and initrd image are loaded into memory. With the images ready, the stage 2 boot loader invokes the kernel image.

Kernel
The kernel image is a compressed one. Typically this is a zImage (compressed image, less than 512KB) or a bzImage (big compressed image, greater than 512KB), that has been previously compressed with zlib. At the head of this kernel image is a routine that does some minimal amount of hardware setup and then decompresses the kernel contained within the kernel image and places it into high memory.

Kernel
If an initial RAM disk image is present, this routine moves it into memory and notes it for later use. The routine then calls the kernel and the kernel boot begins.

Kernel
When the bzImage (for an i386 image) is invoked, you begin at ./arch/i386/boot/head.S in the start assembly routine. This routine does some basic hardware setup and invokes the startup_32 routine in ./arch/i386/boot/compressed/head.S. This routine sets up a basic environment (stack, etc.) and clears the Block Started by Symbol (BSS) section. The kernel is then decompressed through a call to a C function called decompress_kernel (located in ./arch/i386/boot/compressed/misc.c).

BSS: the section of an executable that holds statically allocated variables with no explicit initial value; it records how much space must be reserved for this uninitialized data, and it is zeroed at startup.

Kernel
When the kernel is decompressed into memory, it is called. This is yet another startup_32 function, but this function is in ./arch/i386/kernel/head.S. In the new startup_32 function (also called the swapper or process 0), the page tables are initialized and memory paging is enabled. The type of CPU is detected along with any optional floating-point unit (FPU) and stored away for later use. The start_kernel function is then invoked (init/main.c), which takes you to the non-architecture specific Linux kernel.

Kernel
With the call to start_kernel, a long list of initialization functions is called to set up interrupts, perform further memory configuration, and load the initial RAM disk. In the end, a call is made to kernel_thread (in arch/i386/kernel/process.c) to start the init function, which is the first user-space process. Finally, the idle task is started and the scheduler can now take control (after the call to cpu_idle).

Debugging
Debugging applications is relatively straightforward; several user-space debuggers are available:
gdb, ddd, KDbg, and IDEs such as Eclipse.

Debugging
One advantage of using Linux on your embedded target platform is that you can start to develop and debug your application on a host-based Linux system, such as a Linux desktop. Once you have completed debugging on the host, you can then migrate your application to the target.

Debugging
If the embedded target system is not high performance, you can debug the target from the host platform; this is known as cross-debugging. The debugger runs on the host system and communicates with the target either via a direct connection to the processor through JTAG/BDM (background debug mode) or over a communications port such as Ethernet or a serial port.
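A common sketch of remote application debugging over the network uses gdbserver (the IP address, port, and binary name myapp are hypothetical; the host-side gdb must be the cross version):

    # on the target: run the application under gdbserver, listening on TCP port 1234
    gdbserver :1234 ./myapp

    # on the host: start the cross gdb against the unstripped binary, then attach
    arm-linux-gnueabihf-gdb myapp
    (gdb) target remote 192.168.1.50:1234
    (gdb) break main
    (gdb) continue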

Kernel Debugging
If the kernel crashes, you will be presented with a kernel oops.
This is a detailed traceback of the kernel stack that usually provides sufficient information to identify why the kernel crashed.

A core dump file contains a complete snapshot of system memory, and all relevant processor state such as register contents. In many embedded systems there is not sufficient local storage to save the core dump file.

Kernel Debugging
Native debugging
Turn on kernel debugging information (enable CONFIG_DEBUG_INFO), then run gdb vmlinux /proc/kcore. The symbols for the kernel are loaded from the vmlinux file; /proc/kcore is a special interface through which the user-space gdb examines the running kernel's memory.

Cross-target kernel debugging


Use JTAG/BDM, or use a software-based target debug agent; remote kernel debugging is readily available now. You can also use an emulator such as QEMU.
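A minimal sketch of kernel debugging under QEMU (the kernel and initrd paths are hypothetical, and the kernel should be built with CONFIG_DEBUG_INFO; -s opens a gdb stub on TCP port 1234 and -S halts the CPU until the debugger connects):

    # start an emulated ARM board, halted, with a gdb stub
    qemu-system-arm -M virt -kernel zImage -initrd initrd.img.gz -nographic -s -S

    # in another terminal, attach the cross gdb to the stub
    arm-linux-gnueabihf-gdb vmlinux
    (gdb) target remote :1234
    (gdb) break start_kernel
    (gdb) continue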

Reference
Pro Linux Embedded Systems by Gene Sally.
Modern Embedded Computing by Peter Barry and Patrick Crowley.
http://www.ibm.com/developerworks/linux/library/l-linuxboot/
