Hardware/Software Co-Design For JPEG Encoder Test Bench
Xiaoying Liang
Abstract
This paper presents a hardware/software (HW/SW) co-design approach that uses the System on a
Programmable Chip (SOPC) technique to implement the Joint Photographic Experts Group (JPEG)
algorithm. It first introduces JPEG image compression technology and the system architecture, and
then describes the hardware/software design process of the JPEG encoder test bench. The design
exploits the structure of the Field-Programmable Gate Array (FPGA) to implement the JPEG algorithm,
including an improved Discrete Cosine Transform (DCT), and the customizable nature of the Nios II
embedded processor: image acquisition, JPEG image compression and the Thin Film Transistor Liquid
Crystal Display (TFT-LCD) controller are implemented as user-defined modules that conform to the
Altera Avalon bus requirements and are added to the system with SOPC Builder under the control of
the Nios II soft-core processor. Finally, the whole system is verified on a single FPGA chip. The
experimental results show that the advantages of implementing the JPEG algorithm as an FPGA
hardware module include low power consumption, high image quality, low production cost and
stable performance. The approach is of practical significance for reducing cost and improving image
processing speed.
4) Compared with a traditional image processing system that uses only software or only hardware,
the software and hardware of this system work closely together, so the system achieves a better
balance of flexibility and performance.
3. The architecture
FPGA-based hardware/software co-design is becoming increasingly popular for the
implementation of digital circuits. Part of a system can be developed in software for flexibility and
ease of upgrading, and completed with hardware IP blocks for cost reduction and performance. Altera
provides the SOPC Builder tool for the quick creation and easy evaluation of embedded systems. Using
SOPC Builder, the system proposed in this paper has been built around the Nios II processor and the
peripherals that support its operation. These peripherals are the program and data memories (DDR
SDRAM, SRAM and FLASH), two UARTs that communicate with the PC, provide debug information and
program the processor, input and output ports that read data from the camera and deliver the output
signal to the LCD, and ports for timing and synchronization. All these peripherals are connected to
the Avalon bus in a single master/slave configuration, where the bus masters are the Nios II
processor and the DMA controller. In addition, the Nios II configuration chosen is the Nios II/fast,
which provides the best performance for the processing unit. The diagram of the system structure is
shown in Figure 2.
TFT display. 5) The serial configuration device stores the configuration data of the FPGA. When
the FPGA powers up, the serial configuration device sends this data to the FPGA. 6) The JTAG port is
a dedicated port that uses the IEEE Std 1149.1 JTAG interface pins and supports the JAM STAPL
standard. 7) The UART serial port is used as the debug port for the Nios II and for image data
output. 8) The clock module produces the system clock from a 50 MHz external clock. 9) The Altera
Daughter Card connector is a port that meets the Altera development board expansion standard and is
used to connect the image camera module and the TFT-LCD interface module. 10) The keys and LEDs
provide program control and result display.
the top file, which includes not only the Avalon Streaming Interface but also instances of the timing
generator and the pixel FIFO.
4.2.3. JPEG encoder
\[ Z = C^{t} X C, \tag{1} \]
where $X$ is the data matrix, $C$ is the matrix of DCT coefficients, and $C^{t}$ is the transpose of $C$.
Denoting the 1-D DCT of an $N \times N$ data matrix $X$ by $Y = XC$ and letting the elements of the
data matrix $X$ be represented in two's complement code, the $(k, l)$th element of $Y$ is
\[ y_{k,l} = \sum_{m=1}^{N} x_{k,m} c_{m,l}
           = -2^{n-1} \sum_{m=1}^{N} x_{k,m}^{(n-1)} c_{m,l}
             + \sum_{j=0}^{n-2} 2^{j} \sum_{m=1}^{N} x_{k,m}^{(j)} c_{m,l}, \tag{2} \]
where $c_{m,l}$ is the $m$th row and $l$th column element of $C$, $x_{k,m}^{(j)}$ is the $j$th bit of
$x_{k,m}$, which is the $k$th row and $m$th column element of $X$ and has a value of either 0 or 1,
$n$ is the number of bits $x_{k,m}$ carries, and $x_{k,m}^{(n-1)}$ is the sign bit.
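The bit-level expansion in Eq. (2) can be illustrated with a minimal Python sketch. It is not the paper's hardware design: the function names `to_bits` and `da_inner_product` are hypothetical, and the coefficients are plain numbers rather than fixed-point ROM contents. The sketch only shows that summing the coefficients selected by each bit plane, weighting plane j by 2^j and the sign plane by -2^(n-1), reproduces the ordinary inner product.

```python
def to_bits(x, n):
    """Return the n two's-complement bits of integer x, least significant first."""
    assert -(1 << (n - 1)) <= x < (1 << (n - 1)), "x does not fit in n bits"
    # Python's >> on negative integers is an arithmetic shift, so (x >> j) & 1
    # yields the two's-complement bit j.
    return [(x >> j) & 1 for j in range(n)]


def da_inner_product(xs, cs, n):
    """Evaluate sum_m xs[m] * cs[m] one bit plane at a time, as in Eq. (2)."""
    bits = [to_bits(x, n) for x in xs]
    total = 0
    for j in range(n):
        # Sum of the coefficients whose input has bit j set. In a hardware
        # realization this partial sum would typically be read from a lookup table.
        partial = sum(c for b, c in zip(bits, cs) if b[j])
        # The sign bit (j = n - 1) carries weight -2^(n-1); all others carry +2^j.
        weight = -(1 << j) if j == n - 1 else (1 << j)
        total += weight * partial
    return total


# Example with 4-bit inputs and illustrative integer coefficients.
print(da_inner_product([5, -3, 7, -8], [2, -1, 4, 3], 4))  # same value as the direct sum
```

Because each bit plane needs only additions of preselected coefficients, no general multiplier is required, which is the usual motivation for this form on an FPGA.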
By considering characteristics of the DCT matrix, it can be shown that
\[ y_{k,l} = \sum_{m=1}^{N/2} u_{k,m} c_{m,l}. \tag{3} \]
\[ y_{k,l} = \sum_{m=1}^{N/2} v_{k,m} c_{m,l}. \tag{4} \]
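The halving of the sum in Eqs. (3) and (4) can be checked numerically with a short Python sketch. The definitions of u and v are not given in the text above, so the sketch assumes the usual butterfly, u = x[m] + x[N-1-m] and v = x[m] - x[N-1-m] (0-based), and an orthonormal DCT-II coefficient matrix; both are assumptions, and the function names are hypothetical. Under those assumptions the mirrored rows of C differ only by the sign (-1)^l, so columns with even 0-based l use u and columns with odd l use v.

```python
import math


def dct_matrix(N):
    """C[m][l] of an orthonormal DCT-II (m: sample index, l: frequency index, 0-based)."""
    C = [[0.0] * N for _ in range(N)]
    for m in range(N):
        for l in range(N):
            scale = math.sqrt(1.0 / N) if l == 0 else math.sqrt(2.0 / N)
            C[m][l] = scale * math.cos((2 * m + 1) * l * math.pi / (2 * N))
    return C


def dct_row_direct(x, C):
    """One row of Y = XC computed with the full N-term sum."""
    N = len(x)
    return [sum(x[m] * C[m][l] for m in range(N)) for l in range(N)]


def dct_row_folded(x, C):
    """Same row computed with N/2-term sums over the folded inputs u and v."""
    N = len(x)
    half = N // 2
    u = [x[m] + x[N - 1 - m] for m in range(half)]  # assumed definition of u
    v = [x[m] - x[N - 1 - m] for m in range(half)]  # assumed definition of v
    y = []
    for l in range(N):
        folded = u if l % 2 == 0 else v  # C[N-1-m][l] == (-1)**l * C[m][l]
        y.append(sum(folded[m] * C[m][l] for m in range(half)))
    return y


row = [52, 55, 61, 66, 70, 61, 64, 73]  # illustrative pixel values
C = dct_matrix(8)
direct = dct_row_direct(row, C)
folded = dct_row_folded(row, C)
print(max(abs(a - b) for a, b in zip(direct, folded)))  # difference at rounding level
```

Applying the same row transform to the transpose of Y yields the transpose of Z in Eq. (1), so the folded form halves the multiplications in both passes of the 2-D transform.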
5. Experimental results
After the design of the system's software and hardware is finished, the SOPC system must be tested
to verify the correctness of the design and the performance of the system.
Figure 9. Experiment. Left: JPEG Encoder Development Board. Right: JPEG Compression Data
\[ \mathrm{PSNR} = 10 \log_{10} \frac{A^{2}}{\frac{1}{NM} \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} [f(n,m) - f_{n}(n,m)]^{2}} \ (\mathrm{dB}), \tag{5} \]
where $f(n,m)$ is the original image, $f_{n}(n,m)$ is the grayscale image, the image size is $N \times M$,
and $A$ is the maximum of $f(n,m)$. The results are shown in Table 1.
Design                                PSNR (dB)
Proposed encoder    Lenna    1.597    37.574
ACDSee              Lenna    1.597    39.255
As can be seen from the table, there is not much difference between the proposed encoder and the
pure software encoder in compression quality.
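For readers who want to reproduce this kind of measurement, Eq. (5) can be written as a small Python function. This is a sketch rather than the authors' measurement code: images are assumed to be equal-sized lists of rows of grayscale values, the function name `psnr` and its `peak` parameter are hypothetical, and A defaults to the maximum of the original image as stated for Eq. (5).

```python
import math


def psnr(original, other, peak=None):
    """PSNR in dB between two equal-sized grayscale images, following Eq. (5)."""
    rows, cols = len(original), len(original[0])
    if peak is None:
        peak = max(max(row) for row in original)  # A = maximum of f(n, m)
    # Mean squared error over all N x M pixels.
    mse = sum((original[n][m] - other[n][m]) ** 2
              for n in range(rows) for m in range(cols)) / (rows * cols)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Tiny illustrative example: every pixel differs by one gray level.
a = [[255, 0], [0, 255]]
b = [[254, 1], [1, 254]]
print(round(psnr(a, b), 2))  # about 48.13 dB, i.e. 20*log10(255)
```

Many tools fix A at 255 for 8-bit images instead of using the image maximum; passing `peak=255` gives that variant.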
5.2.2. Subjective evaluation
The subjective evaluation of images means judging image quality by the naked eye. The
experiment shows that the JPEG files compressed by the proposed encoder can be correctly decoded
and displayed by third-party software. Compared with images produced by software encoding and
decoding, no difference can be distinguished by a human observer. In particular, when the compression
quality is 50%, the two images are essentially the same. The reason is that the internal arithmetic
in the FPGA uses at most 12 bits. When the quality factor is lower, the quantization step is larger,
and the difference in quantization error between the proposed DCT and ACDSee is also smaller.
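The relation between quality factor and quantization step can be made concrete with a short Python sketch. The text does not say how the quality factor is mapped to quantization steps, so the sketch assumes the scaling rule used by the IJG libjpeg software, which may differ from what the proposed encoder or ACDSee uses. `BASE_LUMA_ROW` is the first row of the example luminance table in Annex K of the JPEG standard, and the function name `scale_quant` is hypothetical.

```python
# First row of the example luminance quantization table (JPEG standard, Annex K).
BASE_LUMA_ROW = [16, 11, 10, 16, 24, 40, 51, 61]


def scale_quant(base, quality):
    """Scale quantization steps by a quality factor (IJG libjpeg rule, assumed here)."""
    quality = max(1, min(100, quality))
    # Quality 50 leaves the table unchanged; lower quality enlarges the steps.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Round to nearest and clamp to the 8-bit baseline range 1..255.
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base]


for quality in (75, 50, 25):
    print(quality, scale_quant(BASE_LUMA_ROW, quality))
```

Under this rule the steps at quality 25 are twice those at quality 50. Coarser steps mean that small differences in DCT precision are more likely to round to the same quantized value, which is consistent with the observation above.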
Developer                                                          Clock frequency
Proposed encoder           -8    6606 LEs                          107 MHz
Proposed encoder           -7    6682 LEs                          119 MHz
Proposed encoder           -6    6608 LEs                          150 MHz
JPEG_Fast_E (CAST, Inc)    -7    6355 LEs                          93 MHz
JPEG_E (CAST, Inc)         -6    5,337 LEs, 9 M4Ks, 19 DSP-9bit    154 MHz
6. Conclusion
The new generation of FPGA technologies enables a commercial soft-core processor and an
application IP to be integrated in an SOPC development environment. The benefit of a soft-core
processor is that it adds micro-programmed logic, which introduces more flexibility. In this paper,
we therefore present an efficient HW/SW co-design architecture for a JPEG encoder and its FPGA
implementation. It is based on a Nios II CPU and a set of specialized processors and interfaces that
implement a baseline JPEG encoder. The whole design has been tested on a Nios II development board,
and experimental results are presented. The results show that the proposed system is more flexible
and stable, and can be used in a wide range of video system applications, particularly in consumer
products such as smartphones.