2011 International Conference on Circuits, System and Simulation

IPCSIT vol.7 (2011) © (2011) IACSIT Press, Singapore

Development of Heterogeneous Multi-core Embedded Platform for Automotive Applications
Ting-Ying Wei, Zhi-Liang Qiu, Chung-Ping Young+ and Da-Wei Chang
Department of Computer Science and Information Engineering
National Cheng Kung University, Tainan, Taiwan

Abstract. Car electronics dominates the functionality of a modern automobile. To meet the requirements of
low cost, high performance, compact size and versatile operation, multi-core system-on-chip (SoC)
embedded systems play an important role in system architecture and development. A heterogeneous
multi-core processor can accommodate the variety of control and computation requests of embedded
systems, because its cores suit different types of computational tasks. However, such an architecture makes
software development more complex and challenging. Dual-kernel embedded software was developed for
PAC Duo, which consists of one ARM processor and two PAC DSPs developed by the Industrial Technology
Research Institute (ITRI), Taiwan, with Linux on the ARM processor and µC/OS-II on the PAC DSPs. An
inter-processor communication (IPC) mechanism, which takes advantage of hardware features, was
developed to provide the heterogeneous multi-core interconnection. Real-time process migration between
the DSPs was realized to improve load balancing. The heterogeneous multi-core system software is
therefore suitable for automotive applications.

Keywords: Automotive, heterogeneous multi-kernel, system-on-chip, PAC

1. Introduction
A modern automobile is designed to be safer, more convenient and more comfortable than ever, and
car electronics plays an important role in automobile development. With advances in integrated circuit
technology, an electronic control unit (ECU) not only shrinks in size and consumes less power, but also
provides more complicated functionality and data processing capability. To meet the increasing demands
for telematics and infotainment in a vehicle, the processor of an ECU needs more computing power for
computation-intensive applications. Moreover, all the ECUs in an automobile are connected through
in-vehicle communications, such as the CAN, LIN, or FlexRay bus, to form a distributed system. Since there
are as many as 70 ECUs embedded in a vehicle [1], [2], this conventional approach implies higher cost for
using more components, occupying more space, and maintaining robust interconnection among ECUs.
Multi-core system-on-chip (SoC) is a popular solution for the development of advanced computers and
embedded systems. It features parallel processing, compact chip size and lower power consumption.
When a multi-core SoC is employed for automotive applications, the system is transformed from a
federated architecture into an integrated architecture [3]. Performance is improved, the number of
processors is reduced, and the on-chip communication is more reliable.
Multi-core processors can be categorized into two types: homogeneous and heterogeneous. Symmetric
multiprocessor (SMP) operating systems are usually implemented on homogeneous multi-core processors for
high-performance cluster computing. On the other hand, heterogeneous multi-core processors, consisting
of diverse cores dedicated to specific applications, are better suited to embedded systems. Texas Instruments
OMAP [4] and DaVinci [5], and the Industrial Technology Research Institute (ITRI) Parallel Architecture
Core (PAC) [6] are examples; each comprises an ARM-based general purpose processor (GPP) and at least
one digital signal processor (DSP). However, the complexity of embedded software development grows
because of the diverse hardware abstractions, system software, and inter-processor communication (IPC).
A heterogeneous multi-core aware embedded software platform was customized and implemented on the
ITRI PAC for a variety of automotive applications, with Linux running on the ARM processor and µC/OS-II
on the PAC DSPs. The IPC mechanism, which takes advantage of hardware features, was designed to
provide the heterogeneous multi-core interconnection between the ARM and the DSPs [7]. Real-time process
migration between the DSPs was realized to enhance load balancing and fault tolerance.

+ Corresponding author. Tel.: +886-6-2757575 x62533; fax: +886-6-2747076.

2. System Architecture
To develop the heterogeneous multi-core embedded software and applications, the ITRI PAC Duo
platform, connected to other ECUs via a CAN bus, is used for the implementation.

2.1. PAC Duo SoC

PAC Duo SoC is a chip-level heterogeneous multiprocessor SoC composed of an ARM926EJ-S and two
PAC DSP cores of the same architecture [8]. The ARM926EJ-S serves as the GPP, while the two DSPs can be
treated as special purpose processors (SPP) that cooperate with the GPP. The PAC DSP is a five-way VLIW
DSP core that includes a scalar unit, two load/store units, and two arithmetic units. It uses a distributed
register file with low access latency and power consumption, and it uses variable-length operation encoding
techniques to increase the code density [9], [10].
Each DSP core in PAC Duo has a 64 KB local memory and resides on the 32-bit AXI bus.
Communication with the ARM processor can be achieved through an AXI-AHB bridge, since the ARM
processor resides on the 32-bit AHB bus. PAC Duo supports inter-processor communication at the hardware
level through hardware mailbox mechanisms or shared memory [8]. The former is interrupt-driven, allowing
a processor to send interrupts to another processor for event notification. The latter allows the processors to
share data or states. There are four banks of shared memory on the platform. Two banks of shared memory
(128 KB SRAM and 128 MB DDR2 DRAM) reside on the AXI bus while the other two (256 KB SRAM and
128 MB SDRAM) reside on the AHB bus [8].
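
The division of labour between the two hardware IPC paths described above (shared memory carries the data, the mailbox delivers the notification) can be illustrated with a small host-side model. This is a sketch in Python, not PAC Duo code: the class names `SharedMemory` and `Mailbox`, the `send` helper, and the (offset, length) message layout are all assumptions made for illustration, and the interrupt is modelled as a plain function call.

```python
class SharedMemory:
    """Models one bank of shared memory as a flat byte array (hypothetical)."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class Mailbox:
    """Models an interrupt-driven mailbox: posting a message runs the
    receiver's handler, standing in for a hardware interrupt (hypothetical)."""

    def __init__(self):
        self.handlers = {}

    def register(self, core, handler):
        self.handlers[core] = handler

    def post(self, target, message):
        self.handlers[target](message)


def send(shm, mbox, target, offset, payload):
    # Place the bulk data in shared memory first, then notify the target
    # core with a small message that says where the data is.
    shm.write(offset, payload)
    mbox.post(target, (offset, len(payload)))


received = []
shm = SharedMemory(1024)
mbox = Mailbox()
# The DSP1 "interrupt handler" reads the payload back out of shared memory.
mbox.register("DSP1", lambda msg: received.append(shm.read(*msg)))
send(shm, mbox, "DSP1", 64, b"sensor frame")
```

Keeping the mailbox message small and leaving the payload in shared memory is one plausible way to combine the two mechanisms; the actual message format used on PAC Duo is not given in this paper.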

2.2. Software Implementation

Fig. 1 depicts the software architecture of the heterogeneous multi-kernel embedded software on the PAC
Duo platform. A Linux kernel, which runs on the ARM core as the master processor of the system, handles
I/O control and system management. The real-time kernel µC/OS-II, which runs on DSP1, DSP2, or both
DSPs, is mainly for real-time signal processing computation. This multi-kernel architecture allows an
application to be programmed as a real-time task on µC/OS-II or a non-real-time task on Linux.
Communication among the kernels is achieved through an IPC mechanism, which is intended to support
efficient communication and data sharing among the software running on these cores. The IPC
implementation is split into two parts, one integrated into each kernel.

Fig. 1: Software architecture of the multi-kernel embedded system on the PAC multi-core platform.

Once the three major modules (Linux kernel, µC/OS-II kernel and IPC) are installed, several aspects
must be considered to enhance system performance and stability.

1) Real-time scheduling

µC/OS-II is a preemptive kernel, so the processor is always given to the highest priority task that is
ready to run [11]. The tasks to be executed on µC/OS-II are compiled along with the kernel off-line, so
each task's execution time, resources, dependencies and time constraints must be known before priorities
are assigned. However, this static task assignment may not work well for automotive applications, where
tasks are assigned dynamically as the driving environment changes.
A dynamic global scheduling policy is required, so that the dispatcher on the ARM can dispatch processes
to the DSPs according to a predefined algorithm or a user's policy. This global multi-core scheduling selects
the highest priority task from the global ready queue and assigns the task to a free processor. The state
diagram in global scheduling involves five states: NEW, READY, RUNNING, WAITING and TERMINATED.
When a task is initiated, it is in the READY state and is placed in the ready queue. The scheduler picks a
task from the ready queue and its state changes to RUNNING, at which point the task is assigned to a core.
If a running task is interrupted, the task is put back in the ready queue. Since there are two identical DSP
cores in the chip, the task can be freely dispatched to whichever DSP can run it more efficiently. The
system software thus provides more flexible process management and thereby achieves higher throughput.
This is done by migrating a waiting process on an overloaded core to an idle core.
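
The global scheduling policy described above can be sketched as a small Python model. It is an illustration under stated assumptions, not the dispatcher implemented on the ARM: the names `Task`, `GlobalScheduler`, `admit`, `dispatch`, `interrupt` and `terminate` are hypothetical, a smaller number is assumed to mean a higher priority, and the WAITING state is declared but not exercised.

```python
import heapq
from enum import Enum


class State(Enum):
    NEW = "NEW"
    READY = "READY"
    RUNNING = "RUNNING"
    WAITING = "WAITING"
    TERMINATED = "TERMINATED"


class Task:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority  # assumption: smaller number = higher priority
        self.state = State.NEW
        self.core = None


class GlobalScheduler:
    def __init__(self, cores):
        self.ready = []  # global ready queue: heap of (priority, seq, task)
        self.running = {core: None for core in cores}
        self.seq = 0  # tie-breaker: equal priorities are served in arrival order

    def admit(self, task):
        """Move a task to READY and place it in the global ready queue."""
        task.state, task.core = State.READY, None
        heapq.heappush(self.ready, (task.priority, self.seq, task))
        self.seq += 1

    def dispatch(self):
        """Assign the highest priority ready tasks to any free cores."""
        for core in self.running:
            if self.running[core] is None and self.ready:
                _, _, task = heapq.heappop(self.ready)
                task.state, task.core = State.RUNNING, core
                self.running[core] = task

    def interrupt(self, core):
        """An interrupted running task goes back to the ready queue."""
        task = self.running[core]
        if task is not None:
            self.running[core] = None
            self.admit(task)

    def terminate(self, core):
        task = self.running[core]
        if task is not None:
            self.running[core] = None
            task.state, task.core = State.TERMINATED, None


sched = GlobalScheduler(["DSP1", "DSP2"])
lane, radar, audio = Task("lane", 3), Task("radar", 1), Task("audio", 5)
for task in (lane, radar, audio):
    sched.admit(task)
sched.dispatch()  # radar and lane start running; audio stays in the ready queue
```

Because the two DSP cores are identical, the model treats them as interchangeable and simply fills the first free core; a real dispatcher could instead weigh the load or local memory contents of each core.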
2) Load balancing
Load balancing is managed by the task manager on the ARM, which detects when a DSP core is overloaded
and initiates a process migration. In the PAC Duo architecture, µC/OS-II runs on each PAC DSP, and the
shared memory, both the DDR2 memory and the DSP local memory, can be accessed from either DSP.
Several µC/OS-II system calls were developed to support process migration between the DSPs.
When the process migration system calls of µC/OS-II are invoked, the mailbox is used to transfer the
migration information. If a migration is requested, the source core freezes the migrating process, saves its
migration information, and sends this data structure to the target core as a migration request. If the
migration is allowed, the target core starts executing the migrated process immediately. If not, the target
core returns a fail message to the source core. Figure 5 presents the common code and data sections in
shared memory that both cores access during process migration.
The following procedure describes the process migration, which is shown in Fig. 2.
A. The ARM core sends a migration request (migrate process A to DSP2) to DSP1.
B. After DSP1 receives the migration request, it suspends process A, packs the migration information
and sends the migration packet to DSP2.
C. When DSP2 receives the migration information, it examines its own state and reports the result to
DSP1. If the migration request is permitted, DSP2 sends a success message to DSP1. Otherwise, DSP2
sends a fail message to DSP1.
D. DSP1 receives the migration result. If the migration failed, DSP1 resumes the suspended process and
sends a fail message to the ARM. If the migration succeeded, DSP1 sends a success message to the ARM.
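
Steps A to D above can be traced with a short Python model of the handshake. This is a sketch under assumptions, not the µC/OS-II system calls developed in this work: the `DSP` class, its `capacity` admission rule, the packet fields, and the function `arm_request_migration` are hypothetical, and mailbox messages are modelled as direct method calls.

```python
class DSP:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumption: a simple limit on hosted processes
        self.running = set()
        self.suspended = set()

    def accept(self, packet):
        # Step C: the target examines its own state and answers the source.
        if len(self.running) < self.capacity:
            self.running.add(packet["process"])
            return "SUCCESS"
        return "FAIL"

    def migrate(self, process, target):
        # Step B: suspend the process, pack the migration information,
        # and send the packet to the target core.
        self.running.remove(process)
        self.suspended.add(process)
        packet = {"process": process, "source": self.name}
        result = target.accept(packet)
        # Step D: on failure the suspended process is resumed locally;
        # on success it no longer belongs to this core.
        self.suspended.remove(process)
        if result == "FAIL":
            self.running.add(process)
        return result  # reported back to the ARM


def arm_request_migration(process, source, target):
    # Step A: the ARM core asks the source DSP to migrate a process.
    return source.migrate(process, target)


dsp1 = DSP("DSP1", capacity=2)
dsp2 = DSP("DSP2", capacity=1)
dsp1.running = {"A", "B"}

first = arm_request_migration("A", dsp1, dsp2)   # DSP2 has room for A
second = arm_request_migration("B", dsp1, dsp2)  # DSP2 is now full, B stays put
```

The model makes the failure path explicit: a rejected process is never lost, because the source core keeps it suspended until the target has answered.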

Fig. 2: The procedure of task migration.


3) Fault tolerance
For automotive applications, robustness is one of the most important design metrics. If one component
fails, it can bring down the whole system. The DSP is one of the core components and processes all
incoming digital signals, so dual DSPs not only enhance the computation capability, but also provide
redundancy and a fast recovery mechanism. A watchdog timer is started, and its service routine inspects
the hardware status and software flow. When one DSP is diagnosed as failed or its program hangs, all
processes assigned to that DSP are migrated to the other one. A task manager on the ARM side is
responsible for task dispatch and process migration between the DSPs. This mechanism records the process
states at each checkpoint and, before the migration, resets the states to the designated checkpoint. This
avoids a system reset and data loss, and shortens the recovery time.
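
The recovery flow described above (watchdog detection, checkpoint restore, migration to the surviving DSP) can be modelled as follows. This Python sketch is illustrative only: the `TaskManager` class, its method names, the timeout value, and the checkpoint contents are assumptions, and time is passed in explicitly instead of being read from a hardware timer.

```python
import copy


class TaskManager:
    """Hypothetical model of the ARM-side task manager with a watchdog."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_kick = {}    # core -> time of its last watchdog kick
        self.checkpoints = {}  # process -> state saved at the last checkpoint
        self.assignment = {}   # process -> core it currently runs on

    def kick(self, core, now):
        self.last_kick[core] = now

    def assign(self, process, core):
        self.assignment[process] = core

    def checkpoint(self, process, state):
        self.checkpoints[process] = copy.deepcopy(state)

    def check(self, now, spare):
        """Migrate every process on a timed-out core to the spare core and
        return the checkpointed state each one restarts from."""
        failed = [core for core, kicked in self.last_kick.items()
                  if core != spare and now - kicked > self.timeout]
        restored = {}
        for process, core in self.assignment.items():
            if core in failed:
                self.assignment[process] = spare
                restored[process] = copy.deepcopy(self.checkpoints.get(process))
        return restored


mgr = TaskManager(timeout=100)
mgr.kick("DSP1", now=0)
mgr.kick("DSP2", now=0)
mgr.assign("fusion", "DSP1")
mgr.checkpoint("fusion", {"frame": 41})
mgr.kick("DSP2", now=90)                  # DSP2 is alive, DSP1 has gone silent
restored = mgr.check(now=150, spare="DSP2")
```

Restarting from the last checkpoint instead of resetting the system is what shortens the recovery time; the work done since that checkpoint is repeated on the surviving core.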

3. Automotive Applications and System Implementation

Vehicle safety is the primary concern in automotive applications. Active safety is a proactive approach
that takes precautions to protect the driver, passengers, and vehicle from a possible accident. Collision
avoidance is one of the vital features in some luxury cars. On the other hand, infotainment is not a
safety-critical feature, but it provides convenience and comfort to the passengers and is a value-added
factor for a vehicle.
A collision avoidance function determines the safe approaching speed toward the vehicle ahead by
detecting the relative speed and distance through one or several types of sensors, such as video, laser, or
ultrasound, and converting these parameters into a single collision avoidance index.
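
This paper does not define the collision avoidance index, so the following Python sketch only shows one plausible way to fold relative speed and distance into a single number. It assumes an index based on time to collision (distance divided by closing speed) scaled against a hypothetical 5-second horizon; the function names, the horizon, and the 0 to 1 scale are illustrative choices, not the authors' formula.

```python
def time_to_collision(distance_m, closing_speed_mps):
    """Seconds until contact if the closing speed stays constant.
    A non-positive closing speed means the gap is not shrinking."""
    if closing_speed_mps <= 0:
        return float("inf")
    return distance_m / closing_speed_mps


def collision_index(distance_m, closing_speed_mps, horizon_s=5.0):
    """Hypothetical index in [0, 1]: 0 means no risk within the horizon,
    1 means contact is imminent."""
    ttc = time_to_collision(distance_m, closing_speed_mps)
    return max(0.0, min(1.0, 1.0 - ttc / horizon_s))


# 25 m gap closing at 10 m/s gives 2.5 s to collision, half of the horizon.
index = collision_index(25.0, 10.0)
```

In a multi-sensor setup such as the one described here, the distance and closing speed would come from the fused sensor data computed on the DSPs.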
Fig. 3 shows the block diagram of a multi-function telematics application developed on the PAC Duo
platform. While the DSPs handle the multi-sensor data fusion, the ARM manages the user interface and
communication. The Linux kernel on the ARM, which includes an Internet protocol stack and a 3G modem
device driver, provides pervasive networking capability to applications. Fig. 3 also illustrates a telematics
application that transmits dynamic vehicle driving information to a server in the service center through
3G communication and the Internet.

Fig. 3: A telematics application integrating sensors and 3G modem.

Fig. 4 shows the realization of the PAC Duo platform, connected to a variety of sensors through a
CAN bus, for automotive applications.


Fig. 4: System realization of PAC Duo platform for automotive applications.

4. Conclusion
Heterogeneous multi-core SoCs have become popular for a variety of embedded systems, although
multi-kernel embedded software development is complicated. This work presents the system software
implemented on a heterogeneous multi-core SoC platform, PAC Duo, and describes its relevant automotive
applications. To improve performance and reliability, the system software is enhanced for load balancing,
fault tolerance and power management. The PAC Duo platform should be well suited to future automotive
application development.

5. Acknowledgements
This work was supported in part by the National Science Council of Taiwan under Grants NSC-98-2220-E-006-019 and NSC-99-2220-E-006-009.

6. References
[1] N. Navet and F. Simonot-Lion, Automotive Embedded Systems Handbook, Taylor & Francis, 2009.
[2] N. Navet, A. Monot, B. Bavoux, and F. Simonot-Lion, Multi-source and multicore automotive ECUs - OS
protection mechanisms and scheduling, 2010.
[3] H. Kopetz, R. Obermaisser, C. El Salloum, and B. Huber, Automotive Software Development for a Multi-Core
System-on-a-Chip, Fourth International Workshop on Software Engineering for Automotive Systems (SEAS'07),
[4] Texas Instruments, OMAP5910 Dual-Core Processor (Rev. D), August 2004.
[5] Spectrum Digital Inc., DaVinci EVM Reference Manual, 2007.
[6] Industrial Technology Research Institute, PACDSP3S0001-Processor Architecture, 2008.
[7] D.-W. Chang, et al., Building Multi-Kernel Embedded System on PAC Multi-Core Platform, Proc. WESQA,
[8] Industrial Technology Research Institute (ITRI), PAC DUO Programmings Reference, 2009.
[9] T.-J. Lin, C.-N. Liu, S.-Y. Tseng, Y.-H. Chu and A.-Y. Wu, Overview of ITRI PAC project - from VLIW DSP
processor to multicore computing platform, Proc. IEEE International Symposium on VLSI Design, Automation
and Test (VLSI-DAT 2008), pp. 188-191, 2008.
[10] C.-W. Chang, I.-T. Liao, S.-Y. Tseng and C.-W. Jen, PAC DSP Core and Its Applications, Proc. 2006 IEEE
Asian Solid-State Circuits Conference (ASSCC 2006), pp. 19-22, Nov. 2006.
[11] J. Labrosse, MicroC/OS-II: The Real-Time Kernel, 1999.