Application eXecute-In-Place (XIP) with Linux and AXFS

Published by Sören Wellhöfer
This article explains how application eXecute-In-Place (XIP) works under Linux and complements this with a practical guide to the innovative Advanced XIP File System (AXFS). It also presents the results of execution-speed benchmarks that have been conducted with AXFS.

In its first part, this article gives a broader view on the concepts of eXecute-In-Place (XIP) for user applications with specific references to Linux. The second part focuses on AXFS and begins by explaining its basic ideas, followed by concrete instructions on how to set up a Linux box with it. Finally, this article is complemented by presenting and analyzing the results of performance tests that have been conducted on two embedded systems to compare AXFS and JFFS2 in terms of execution speed.
Published by: Sören Wellhöfer on Sep 18, 2009
Copyright:Attribution

Application eXecute-In-Place (XIP) with Linux and AXFS
Sören Wellhöfer
soeren.wellhoefer@gmx.net
September 17, 2009
Abstract
XIP, or eXecute-In-Place when written out, is a technique of directly accessing application code and data in non-volatile flash memory rather than transferring it to physical RAM first in order for execution to proceed. It is frequently used in the context of embedded computing.

The sheer notion, however, of using more cost-intensive flash chips to run programs from has been looked upon rather suspiciously by many a system engineer in the past. Moreover, in the world of Linux, running applications in-place is far less common than doing the same for the kernel. [1] For that matter, kernel XIP has already been successfully used to improve boot-up time in a number of cases. Application XIP, on the other hand, is not yet as widely taken advantage of by open source system designers.

This article is to demonstrate, in concept as well as practice, how application eXecute-In-Place basically works and to shed some light on the argument that XIP might not be such an aberrant idea after all.

Alas, free, comprehensive and practical resources on the subject matter are relatively scarce; it is hoped that this short article can at least somewhat remedy this condition.

In its first part, this article gives a broader view on the concepts of eXecute-In-Place (XIP) for user applications with specific references to Linux. The second part focuses on AXFS (Advanced XIP File System) and begins by explaining its basic ideas, followed by concrete instructions on how to set up a Linux box with it.

Finally, this article is complemented by presenting and analyzing the results of performance tests that have been conducted on two embedded systems to compare AXFS and JFFS2 in terms of execution speed.
1 Regular program execution
A program, when stored on disk or flash, is in essence nothing but binary data, that is, code that gets executed and program data that will be used by the program during execution. Linux uses a specific and very flexible format for this purpose called ELF (Executable and Linkable Format). It is widely accepted and has become the de facto standard in the Unix world.

When a user decides to execute a program, the shell in use invokes the Linux system call fork(). By doing so, a new process with a unique PID (process identifier) is created. This process is initially an exact clone of the shell process itself; all the executable code as well as the process data is merely copied. [2]

The next thing that happens is that a system call of the exec() family of functions is invoked, which receives the path name of the file that the user wishes to execute as one of its arguments. The exec()-like call now recognizes the ELF format and attempts to replace sections of the currently running process with those found in the file. While doing so, most of the program code as well as the program data is replaced with that of the program that is now to be executed.

1. Here, application XIP denotes the fact that user-space programs are to be executed from flash memory.
2. Actually, process data is not blindly copied; parent and child process share the same region of memory unless writes occur, and only then are the needed sections really copied. This technique is usually called copy-on-write.
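The fork()/exec() sequence described in this section can be sketched in a few lines. The following is a minimal illustration in Python, whose os module wraps the underlying system calls; the function name run_program and the exit code 127 for a failed exec (borrowed from common shell convention) are assumptions made for this example, not part of the article.

```python
import os
import sys


def run_program(argv):
    """Run argv[0] roughly the way a shell does: fork(), then exec() in the child.

    Returns the child's exit status, or the negated signal number if the
    child was terminated by a signal.
    """
    pid = os.fork()  # create a clone of the calling process
    if pid == 0:
        # Child: replace the cloned process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed (assumed convention: status 127)
    # Parent: wait for the child and decode its termination status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    # Hypothetical usage: run a second interpreter that exits with status 3.
    code = run_program([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exit status:", code)
```

After fork() returns, both processes run the same code; only the child calls exec(), which is the point at which the new program's code and data take the place of the clone's.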
2 Application XIP
What has been described in the previous section is the conventional way of doing things. In-place execution, abbreviated XIP, takes a slightly different approach in that the actual program code as well as the program’s static data (the .text and .data sections in an ELF file, respectively) are never copied to RAM, and neither will they be copied when a process forks itself by calling fork().

When a program is to be executed in-place, the only things actually given space in RAM are the .bss section [3], meaning the uninitialized data that a program will be using, and the program’s stack, which grows dynamically as execution proceeds [4]. The .text and .data sections of an ELF file do not get transferred to RAM since execution proceeds directly from non-volatile flash memory by setting the instruction pointer to a memory location on the flash itself.

This is potentially useful if RAM is tightly limited, as is often the case for embedded systems and smaller devices; with XIP, no fetching of pages from flash and copying them to RAM is necessary whatsoever.

Because of its nature, XIP can be perceived as a form of shared memory access in that multiple processes executing the same code all share it in flash memory without each maintaining its own copy somewhere in RAM.

A method commonly found in the Linux world is to use XIP in connection with compressed read-only file systems such as cramfs or SquashFS. There are XIP extensions (patches) available for those file systems that aim to minimize flash storage utilization through compression whilst providing XIP capabilities. This is usually achieved by storing application binary data unchanged in flash memory while still applying compression to everything else on the file system image.

Within the virtual memory address space of a process, the executable code and static data (.text and .data) are directly mapped to flash memory so that paging [5] of these sections becomes unnecessary.

Note that XIP data cannot be compressed, as this would thwart the very purpose of in-place execution; the data would have to be inflated and moved to RAM first in order to be executed, which is precisely what XIP tries to avoid.

Again, not making use of XIP means that program data (the ELF .data and .text sections) is stored in compressed form in flash memory just like everything else on a compressed file system. It is then decompressed and loaded into RAM when needed during the execution flow, that is, when misses in the page cache lookup performed by the kernel occur. [6]
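The idea of mapping file contents directly into a process's address space, instead of copying them into a buffer first, can be illustrated with an ordinary read-only memory mapping. This is only an analogy: on a regular file system the mapped pages are still brought in through the page cache, whereas an XIP-capable file system would point such a mapping at the flash addresses themselves. The helper name read_mapped and the temporary file used below are assumptions made for this sketch.

```python
import mmap
import os
import tempfile


def read_mapped(path, offset, length):
    """Return `length` bytes at `offset` by mapping the file read-only.

    No explicit read() into a user buffer takes place; the bytes are
    accessed through the mapping like ordinary memory.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            return mapping[offset:offset + length]


if __name__ == "__main__":
    # Hypothetical stand-in for a binary: 4 magic bytes plus zero padding.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x7fELF" + bytes(4092))
    try:
        print(read_mapped(tmp.name, 0, 4))
    finally:
        os.unlink(tmp.name)
```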
Pulling pages into RAM only when they are truly needed is often called demand paging, because loading occurs only at the moment a page is actually wanted.

One disadvantage of XIP is that read access to flash memory is relatively slow when compared to RAM. Whereas a random read access to regular SRAM takes about 25 ns, this figure increases to about 100 ns for an average NOR flash chip. Although recent developments have reduced this number to as little as 70 ns for some rather expensive NOR chips, flash memory access can never be as fast as SRAM due to its intrinsic technical make-up.

It is important to note that eXecute-In-Place is only truly possible with NOR flash memory unless some sort of emulation layer is applied. Unlike NOR, NAND flash memory is not directly memory-mappable and can only be accessed on a per-page basis, with pages of usually 512 bytes or more. This generally yields a higher data rate; however, it renders NAND unsuitable for XIP, where it must be possible to access and read individual bytes, as program execution requires.

Another disadvantage is posed by the fact that RAM is generally available at a lower cost than flash memory. For XIP to be used beneficially, one must consider whether the advantages gained in RAM savings outweigh the disadvantages of higher flash memory utilization and slower memory access and render the trade-off worthwhile.

In a test case constructed by engineers at Intel, it has been shown that when running an application and reducing the available RAM in fixed successive steps, system performance drops rapidly at a certain point where memory saturation has been reached. As the system runs out of memory, the operating system is excessively burdened with the task of swapping [7] out pages to an external storage medium; this, of course, degrades system performance tremendously. [8]

By contrast, when the same test is conducted with the application being XIP, this drop in performance is much more gradual and not as abrupt. At very low levels of free memory (less than 5%), where regular execution from RAM would result in a complete system freeze, XIP is still capable of maintaining fairly good system performance and overall responsiveness.

Moreover, the differences in speed mentioned above between executing from NOR flash and RAM can be alleviated by employing larger CPU instruction caches that continually pre-fetch soon-to-be-executed chunks of code from flash memory.

For instance, the Alchemy Au1100 MIPS32-based processor, which is part of one of the systems used for the benchmarks described in the last part of this article, is endowed with a 16 kB instruction cache. As the percentage of cache hits begins to increase at a steady rate (relative to the number of cache flushes), the differences between RAM and NOR access speed do indeed begin to smooth out, especially for higher system loads. Performance is, of course, not the same, nor can it be; however, instruction caches can have a noticeable impact on the speed of execution from flash memory.

In summary, if a major point of emphasis when making design choices is placed on minimizing RAM utilization, XIP can definitely be said to fulfill this requirement. As having large amounts of RAM available also always means putting up with higher power consumption, XIP can bring about an improvement because

3. The .bss (Block Started by Symbol) section of an ELF file describes all the uninitialized data of a program. It does not occupy any actual space on disk but gets allocated in RAM upon execution and is initially filled with zeros. By contrast, the .data section does occupy space on disk; it contains all the pre-initialized data which is hard-coded into the application. An example for this would be a statically initialized string in a C program.
4. Note that all this is only true for read-only file systems. If data is also to be modified, the .data section does get (partially) copied to RAM so that it can be changed there before being committed back to the disk or flash.
5. Paging, as performed by the operating system and in the context used here, is the process of retrieving data segments called pages (usually 4 kB in size) from an external medium, such as a hard disk or flash, and loading them into RAM. This allows the memory space of a process to be built up when execution begins. Because the memory as seen by the process is not truly contiguous, but rather made up of many fragmented portions residing at various locations in physical RAM, this method is often called virtual memory, or VM. With most file systems that apply compression, it becomes necessary not merely to copy pages but also to decompress them whilst paging; this naturally takes up time as well.
6. The kernel is, on most architectures, aided in this task by an MMU (Memory Management Unit), a hardware component to speed up page table look-ups.
7. If physical memory becomes exhausted, it is common to apply swapping. The principle is to temporarily store data from RAM on disk (“swap out”) so as to immediately create space for other, currently more urgent data.
8. The condition of high “memory pressure”, keeping the system from doing anything useful, has generally come to be known as thrashing.
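The way an instruction cache narrows the gap between NOR flash and RAM can be estimated with a simple weighted average of cache and memory access times. The sketch below is a deliberately simplistic model that ignores pre-fetching and burst reads; the 25 ns and 100 ns figures are the ones quoted in the text, while the 5 ns cache hit time and the function name effective_access_ns are assumptions chosen purely for illustration.

```python
def effective_access_ns(hit_rate, cache_ns, memory_ns):
    """Average fetch time when a fraction `hit_rate` of fetches hits the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


SRAM_NS = 25.0   # random access time quoted in the text
NOR_NS = 100.0   # average NOR flash access time quoted in the text
CACHE_NS = 5.0   # assumed cache hit time, for illustration only

if __name__ == "__main__":
    for hit_rate in (0.0, 0.5, 0.9, 0.99):
        ram = effective_access_ns(hit_rate, CACHE_NS, SRAM_NS)
        nor = effective_access_ns(hit_rate, CACHE_NS, NOR_NS)
        print(f"hit rate {hit_rate:.2f}: RAM {ram:6.2f} ns, "
              f"NOR {nor:6.2f} ns, gap {nor - ram:5.2f} ns")
```

In this model the gap between the two equals (1 - hit rate) x 75 ns, so it shrinks from 75 ns without any cache hits to 7.5 ns at a 90% hit rate. It never reaches zero, which is in line with the remark that performance cannot be the same.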
