
White Paper

Optimization Technology for the Intel® PXA27x Processor Family
Performance and Power Savings for Wireless System and Application Development

Table of Contents

1. Introduction
2. Optimizing the Intel® PXA27x Processor's Performance via the BSP
   2.1 Optimizing for On-chip SRAM
   2.2 Optimizing for the Enhanced Memory Subsystem
   2.3 Optimizing for the Bus Transaction Arbiter
3. Intel® Wireless MMX™ Technology
   3.1 Enabling Intel® Wireless MMX™ Technology
   3.2 Writing Intel® Wireless MMX™ Technology Code
4. Intel® Quick Capture Technology
5. Wireless Intel SpeedStep® Technology
6. Intel® Integrated Performance Primitives (Intel® IPP)
7. Intel® Software Development Tools
   7.1 Compilers
   7.2 Debuggers
8. Intel® VTune™ Performance Analyzer
9. Intel® PCA Developer Network
10. Summary
Appendix A. Overview of Key System Optimizations

1. Introduction

The Intel® Personal Internet Client Architecture (Intel® PCA) PXA27x processor family offers developers a new generation of ultra-low power and industry-leading multimedia performance on silicon. Intel has integrated a host of new features in the Intel PXA27x processor family to enable this level of power and performance. The key features this paper addresses are:

■ Intel® Wireless MMX™ technology
■ Wireless Intel SpeedStep® technology
■ Intel® Quick Capture technology
■ Up to 624MHz core speed
■ Enhanced memory subsystem
■ Intelligent bus transaction arbiter
■ 256K of on-chip SRAM

Intel's complete suite of development components is designed to help customers take full advantage of cutting-edge technologies and get the best power and performance from mobile devices, including:

■ Operating System Board Support Packages (BSPs): Intel provides BSPs for a variety of operating systems including Linux*, Palm* OS, Microsoft Windows* Mobile for PocketPCs and Smartphones, Microsoft Windows* CE .NET, and Symbian* OS. The BSPs include the latest optimizations and drivers for the Intel PXA27x processor family and make it easy for customers to create a BSP customized for their own mobile device.
■ Intel® Wireless MMX™ Technology: an advanced set of multimedia instructions that brings desktop-like multimedia performance to Intel PXA27x processor-based clients.
■ Wireless Intel SpeedStep® Technology: allows customers to dynamically adjust the power and performance of the processor based on CPU demand. This can significantly decrease power consumption in wireless handheld devices.
■ Intel® Quick Capture Technology: provides the ability to get live video and high-quality still images from a wide range of camera sensors in current and future camera-enabled mobile handsets and PDAs.
■ Intel® Integrated Performance Primitives (Intel® IPP): a cross-platform software library that allows users to write optimized applications that utilize Intel Wireless MMX technology to maximize performance on the Intel PXA27x processor.
■ Intel® Software Development Tools (Intel® SDT): provides both an optimizing compiler and a set of sophisticated, high-level language debuggers to help software run at top speed.
■ Intel® VTune™ Analyzer: this tool lets users profile applications for hotspots of activity. A tuning assistant provides support to optimize C/C++ code and/or assembler sequences.
■ Intel® PCA Developer Network: provides information on third-party software applications that are already optimized, as well as optimization labs and support to answer questions. With over 1,000 companies and over 3,000 different software and hardware solutions, the Intel PCA Developer Network can help customers find value-add solutions for mobile devices.

These technologies also allow Independent Software Vendors (ISVs) to fully tune their applications. Applications that take advantage of these optimizations will run faster and more efficiently on Intel PXA27x processor-based devices.

This paper introduces Intel® optimization technologies and describes how each fits into a typical development cycle consisting of iterations of coding, optimizing, and profiling. Pointers to additional resources that provide more detailed information on each technology are provided at the end of this paper.

2. Optimizing the Intel® PXA27x Processor's Performance via the BSP

To ensure that designs and applications take full advantage of the technology in the Intel PXA27x processor, OEMs and ODMs should make sure that the device BSP supports these features. Intel provides BSPs for a variety of operating systems including Linux, Palm OS, Microsoft Windows Mobile for PocketPCs and Smartphones, Microsoft Windows CE .NET, and Symbian OS, the latest versions of which can be obtained from Intel field sales representatives. The BSPs include the latest optimizations and drivers for the Intel PXA27x processor and make it easy for customers to create a customized BSP. Devices that take advantage of these optimizations achieve significant performance improvements and power savings over those that do not.

The BSPs for the Intel PXA27x processor contain an extensive number of optimizations. While a full list of the available optimized drivers is beyond the scope of this paper, several key optimizations for the Intel PXA27x processor are described here, including:

■ Optimizing for 256K of on-chip SRAM
■ Enabling and utilizing the enhanced memory subsystem
■ Taking advantage of the bus-transaction arbiter

This is only an overview of the key features that should be in your BSP or in your application. Other features are listed in the other sections of this paper.

2.1 Optimizing for On-chip SRAM

The internal SRAM can be used for frame buffers as well as storage of variables or data to be processed. The SRAM has a fast access time, offering both lower power and higher performance than using external memory.

Example: a 320x240 frame buffer at 16 bits per pixel consumes 320 x 240 x 2 = 153,600 bytes (about 154K) of memory. Of the 262,144 bytes in the 256K SRAM, that leaves 108,544 bytes for temporary storage of MPEG-4* video buffers, incoming data from the Intel® Quick Capture camera interface, streamed data, a Java* virtual machine heap, executable code, or other variables that need to be accessed quickly.

The SRAM comprises four independently controllable 64K banks and is powered from the VCC_SRAM domain. When entering sleep or deep-sleep mode, one or more banks can remain powered on. This retains the OS state so context can be restored quickly upon wakeup from those modes.

2.2 Optimizing for the Enhanced Memory Subsystem

The Intel PXA27x processor family enhances and adds flexibility to the bus settings of the Intel® PXA255 processor family, which supported a 200-MHz system bus at a core speed of 400MHz. The internal system bus in the Intel PXA27x processor can run at up to 208MHz using fast bus mode at many other product points, including 312 and 208MHz, by setting the CLKCFG[B] bit to one. The memory controller offers flexibility to run at greater speeds than before by setting the CCCR[A] bit to one. This helps reduce latency and increases bandwidth to memory, offering better system performance. As a result, applications on the Intel PXA27x processor can offer better performance at the lower frequency settings.

2.3 Optimizing for the Bus Transaction Arbiter

The bus arbiter in the Intel PXA27x processor performs the arbitration for internal-bus-access transactions. The Intel PXA27x processor system bus supports six clients: the core, the DMA controller, the LCD controller, the USB host controller, and both an internal and an external memory controller. Customers can program priority weights for each of these clients via the arbiter-control register, which enables fine-tuning of device performance based on the typical usage model for that device. Customers are encouraged to test the performance benefits of applying different priority weights to these clients based on the supported usage model.

Example: a device is designed to encode MPEG-4 video using the Intel Quick Capture interface in the Intel PXA27x processor and stream it over USB Host to an attached USB Client. Assigning higher priority weights to the core and the USB host controller improves the performance of processing the MPEG-4 video stream (CPU intensive) and of transmitting it via USB Host.

Further performance can be gained by speculatively 'parking' a specific client on the arbiter, which is programmable through the ARB_CNTRL register. This means that the arbiter will always start with that client when internal-bus-access transactions are performed. Setting the arbiter-control register to park on the core often results in the best performance.

For more information, consult the following documentation:

■ The Intel® PXA27x Processor Optimization Guide
■ The Intel® PXA27x Processor Developer Manual Volume I of III
■ The Intel® PXA27x Processor Developer Manual Volume II of III
■ The Intel® PXA27x Processor Developer Manual Volume III of III

3. Intel® Wireless MMX™ Technology

Introduced in 2003, Intel Wireless MMX technology is an advanced set of multimedia instructions that help bring desktop-like multimedia performance to Intel PXA27x processor-based clients, while minimizing the power needed to run rich applications.
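Much of that multimedia performance comes from operating on several packed data elements at once. As a rough scalar reference for what packed-byte operations of this kind compute, the sketch below treats a 64-bit value as eight byte lanes and implements a sum of absolute differences and a rounded byte average in plain C. The function names and the round-half-up choice are assumptions made for illustration; this is not the instruction set itself, and it makes no claim about the exact semantics of any particular Intel Wireless MMX instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Return byte lane i (0 = least significant) of a 64-bit packed value. */
static uint8_t byte_lane(uint64_t packed, unsigned i)
{
    assert(i < 8u);
    return (uint8_t)(packed >> (8u * i));
}

/* Sum of absolute differences over the eight byte lanes of a and b. */
uint32_t sad8(uint64_t a, uint64_t b)
{
    uint32_t sum = 0;
    for (unsigned i = 0; i < 8u; ++i) {
        int diff = (int)byte_lane(a, i) - (int)byte_lane(b, i);
        sum += (uint32_t)(diff < 0 ? -diff : diff);
    }
    return sum;
}

/* Per-lane average of a and b, rounded half up (an assumption of this
 * sketch), repacked into a 64-bit value. */
uint64_t avg8(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (unsigned i = 0; i < 8u; ++i) {
        uint32_t avg = ((uint32_t)byte_lane(a, i) + (uint32_t)byte_lane(b, i) + 1u) >> 1;
        out |= (uint64_t)avg << (8u * i);
    }
    return out;
}
```

A SIMD unit performs the eight lane operations of each loop in a single instruction, which is where the speedup over scalar code of this shape comes from.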

Figure 1: Intel® Wireless MMX™ technology architectural diagram
(Blocks shown: Coprocessor Interface Unit (CIU), register file (RF), control registers, Shift and Permute Unit (SPU), Multiply Accumulate Unit (MAU), Execution Unit (EXU), load buffer and store buffer.)

Intel Wireless MMX technology builds on the Intel® MMX™ technology originally introduced in the Pentium® processor family. PC software developers who have already utilized Intel MMX technology and Intel® SSE will find a familiar programming environment in Intel Wireless MMX technology. This helps speed the porting of existing code bases from the Intel® Architecture to Intel® PCA-based mobile devices. This technology allows software developers to make applications such as 2D and 3D gaming, streaming MPEG-4 video, wireless encryption/decryption, and voice recognition available quickly for Intel-based cell phones and PDAs.

Intel Wireless MMX technology utilizes the data parallelism present in a large number of multimedia algorithms by executing the same operation on different data elements in parallel. This is accomplished by packing data elements into a single register and introducing new types of instructions to operate on packed data. All Intel Wireless MMX technology data types are 64 bits wide; for example, eight bytes of video data can be processed in parallel. Intel Wireless MMX technology also provides extensive support for byte SIMD. Special instructions such as the byte average and the sum-of-absolute difference accelerate motion estimation and compensation algorithms used in video compression.

Designed to be simple, Intel Wireless MMX technology is general enough to address the needs of a large domain of mobile software applications built from current and future algorithms. Unlike fixed hardware accelerators, Intel Wireless MMX technology offers flexibility for algorithm customization and future standards, such as H.264. This provides an opportunity for product differentiation and software upgradability.

3.1 Enabling Intel® Wireless MMX™ Technology

To take advantage of Intel Wireless MMX technology, systems must do the following:

1. Detect the presence of the Intel PXA27x processor.
2. Enable the Intel Wireless MMX technology coprocessor.
3. Make sure the registers specific to Intel Wireless MMX technology are preserved across context switches and changes in power state.

To enable the coprocessor, set bits 0 and 1 of the "Coprocessor Access Register" located in Register 15 of Coprocessor 15. If the coprocessor is not enabled, all Intel Wireless MMX technology instructions will trap on an undefined instruction and applications optimized for Intel Wireless MMX technology will not run correctly.

Note that preserving context can be implemented using "lazy switch" support, where the Intel Wireless MMX technology registers are saved only if a process uses the registers. Because the specific techniques of detecting processor type and preserving context state vary in each operating system, consult the latest BSPs from Intel for specific examples.
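The enable step can be sketched as a read-modify-write of the Coprocessor Access Register. The sketch below is a minimal illustration, not BSP code: the helper names are hypothetical, and the CP15 encoding used in the inline assembly (CRn = c15, CRm = c1, both opcodes 0) is an assumption that should be checked against the processor documentation. The hardware access compiles only for 32-bit ARM targets with a GNU-style compiler and must run in a privileged mode.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 0 and 1 of the Coprocessor Access Register, as described above. */
#define CPAR_WMMX_BITS 0x3u

/* Pure helper: the value to write back so that bits 0 and 1 are set and
 * every other access bit keeps its current state. */
uint32_t cpar_with_wmmx_enabled(uint32_t cpar)
{
    uint32_t updated = cpar | CPAR_WMMX_BITS;
    assert((updated & ~CPAR_WMMX_BITS) == (cpar & ~CPAR_WMMX_BITS));
    return updated;
}

/* Nonzero when both enable bits are set in the given register value. */
int cpar_wmmx_is_enabled(uint32_t cpar)
{
    return (cpar & CPAR_WMMX_BITS) == CPAR_WMMX_BITS;
}

#if defined(__arm__) && defined(__GNUC__)
/* Privileged code only. The register encoding below is an assumption of
 * this sketch. Any synchronization sequence the core requires after a
 * CP15 write is omitted here. */
static inline void wmmx_enable(void)
{
    uint32_t cpar;
    __asm__ volatile("mrc p15, 0, %0, c15, c1, 0" : "=r"(cpar));
    cpar = cpar_with_wmmx_enabled(cpar);
    __asm__ volatile("mcr p15, 0, %0, c15, c1, 0" : : "r"(cpar) : "memory");
}
#endif
```

Keeping the bit manipulation in a pure function makes it testable on a host machine, while the privileged register access stays isolated in one small routine.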

3.2 Writing Intel® Wireless MMX™ Technology Code

After Intel Wireless MMX technology is enabled, there are several usage options to consider:

■ Write directly in assembly language, or use inline assembler: this offers the most flexibility, but requires more effort and maintenance than other options.
■ Use the Intel C/C++ Compiler to use intrinsics that support Intel Wireless MMX technology, and to use the vectorizer feature.
■ Link to Intel Integrated Performance Primitives, which are pre-optimized libraries that provide a high level of abstraction to jump-start multimedia and signal processing-based applications.

Figure 2: Intel® Wireless MMX™ benefits over scalar code
(Bar chart of frame rate in fps, higher is better, by core frequency, comparing Intel Wireless MMX code with ARM* v5TE instructions only.)

The figure above shows the relative performance benefits for Intel Wireless MMX technology over scalar code. This test decodes an MPEG-4 CIF resolution video clip and an MP3 file simultaneously. Actual benchmarks were run on a Mainstone I system (main board rev 1.1 ECO B, Rev 2 daughtercard, ECO D with 2.5 volt VCC_MEM) with the Intel PXA27x processor A1 stepping running at the speeds indicated in the graph. This platform represents a "bare metal" system with no operating system installed. The system bus was 104MHz for the 208, 312, 416 and 520MHz core frequencies. The 208-MHz measurements were made in the processor Run mode, and measurements at all other frequencies were made in Turbo mode. The MPEG-4 decoder is implemented with the Intel IPP library optimized for Wireless MMX, and the MPEG-4 content is the CIF resolution video clip "Coastguard" in portrait mode.

For more details on optimizing your application for Intel Wireless MMX technology, reference the following documentation:

■ The Intel® PXA27x Processor Optimization Guide
■ The book Programming with Intel® Wireless MMX™ Technology, available from Amazon.com (www.amazon.com), is a comprehensive programming guide for Intel Wireless MMX technology and an invaluable resource for this exciting technology. Pre-order at http://www.amazon.com/exec/obidos/tg/detail/-/0974364916

Figure 3: Programming with Intel® Wireless MMX™ technology

4. Intel® Quick Capture Technology

With a growing number of PDAs and cell phones that include digital cameras, the Intel PXA27x processor introduces Intel Quick Capture technology. This provides high-performance and low-power solutions for still and video image capture and playback. Intel Quick Capture Technology includes the following key components:

■ Highly flexible Intel Quick Capture Interface
■ Hardware color-space conversion

The Intel® Quick Capture interface eases the connection between the Intel PXA27x processor and CMOS and some CCD sensors. It is a 4-, 5-, 8-, 9-, or 10-bit wide bus with control and clock lines that can be used in master and slave modes. The maximum programmable resolution supported is 2048x2048 pixels, or 4.1 megapixels.

The LCD controller in the Intel PXA27x processor supports a hardware cursor and three image planes: one base plane and two overlays. The software interface that controls camera applications typically resides on the base plane, and a preview window and/or a decompressed image is displayed on the overlays.
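One of the two Intel Quick Capture components listed above is hardware color-space conversion. To show what such a conversion computes per pixel, here is a minimal software sketch that turns one YCbCr sample into a packed RGB 5:6:5 value. The coefficients are the common full-range BT.601 (JFIF) ones in 8-bit fixed point, which is an assumption; the paper does not state which conversion matrix the hardware applies, and the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate result to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one YCbCr sample to packed RGB 5:6:5.
 * Assumed matrix: full-range BT.601 (JFIF), scaled by 256:
 *   R = Y + 1.402 Cr', G = Y - 0.344 Cb' - 0.714 Cr', B = Y + 1.772 Cb'
 * where Cb' and Cr' are the chroma values minus 128. */
uint16_t ycbcr_to_rgb565(uint8_t y, uint8_t cb, uint8_t cr)
{
    int c_b = (int)cb - 128;
    int c_r = (int)cr - 128;
    uint8_t r = clamp_u8((int)y + (359 * c_r) / 256);
    uint8_t g = clamp_u8((int)y - (88 * c_b + 183 * c_r) / 256);
    uint8_t b = clamp_u8((int)y + (454 * c_b) / 256);
    /* Keep the top 5, 6 and 5 bits of red, green and blue. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert a row of n samples that have one chroma pair per pixel. */
void ycbcr_row_to_rgb565(const uint8_t *y, const uint8_t *cb,
                         const uint8_t *cr, uint16_t *out, size_t n)
{
    assert(y != NULL && cb != NULL && cr != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = ycbcr_to_rgb565(y[i], cb[i], cr[i]);
}
```

Doing this in hardware on the overlay path removes a per-pixel multiply-and-clamp loop of this kind from the core's workload.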

Figure 4: Example of Intel® Quick Capture technology data streams
(Three paths are shown. Self-preview video stream: CMOS sensor, YCbCr 4:2:2, 2:1 to 4:1 downscale, color space conversion, RGB565, Overlay 1. Outgoing video encode stream: format convert 4:2:2 to 4:2:0, MPEG-4 video encode, baseband 64kbps bitstream or network interface. Decode of incoming baseband video stream: MPEG-4 video decode, YCbCr 4:2:0, Overlay 2. The Intel® Wireless MMX™ technology routines shown are tentative and subject to change without notice.)

Overlay 2 has built-in hardware color-conversion from various luminance-chrominance (YCbCr) formats to red/green/blue (RGB) output. Here are two examples of where this can be used:

■ When performing camera preview for still-image or video capture, the output of the camera sensor can be sent directly to Overlay 2, which converts the YCbCr to RGB for viewing on the LCD panel.
■ The output of an MPEG-4 decoder in YCbCr 4:2:0 format can be sent directly to Overlay 2, allowing the video sequence to be displayed on the LCD panel.

These sample scenarios can be used in other advanced applications such as MPEG-4 video conferencing. The application is best understood in Figure 4, which shows all key features of the Intel Quick Capture technology, including three main data paths:

■ Self-preview video stream: data from the sensor in YCbCr 4:2:2 format is down sampled and converted to RGB 5:6:5 format using Intel Wireless MMX technology. The live video preview in RGB format is displayed directly to the LCD in Overlay 1.
■ Outgoing video encode stream: camera sensors often output data in YCbCr 4:2:2 formats. The sensor data is also converted using Intel IPP to YCbCr 4:2:0 format, which is accepted as input by the MPEG-4 video encoder. These functions are already in the Intel IPP. This generates a compressed bit stream that is sent to a remote recipient over the baseband or network (802.11) interface.
■ Incoming video stream: occurring simultaneously with the other streams, the encoded stream from the remote recipient is received and decoded by an MPEG-4 video decoder. The output, in YCbCr 4:2:0 format, is sent directly to the LCD in Overlay 2, which performs color conversion and displays the received video.

More information on utilizing Intel Quick Capture technology is available in the application note titled "Intel® Quick Capture Technology for the Intel® PXA27x Processor Application Note."

5. Wireless Intel SpeedStep® Technology

To help maintain system battery life with increased performance, the Intel PXA27x processor includes Wireless Intel SpeedStep Technology, which dynamically optimizes application performance and power usage to extend battery life for phones and PDAs. This technology includes:

■ Five low-power states
■ Ability to change voltage and frequency dynamically
■ Wireless Intel SpeedStep power manager software

When running code, the processor operates in "normal mode." Additional low-power modes and their uses are summarized in the following table:

POWER STATE   USAGE
Idle          Processor clocks stopped with near-instantaneous resumption. LCD may continue to be refreshed via DMA.
Deep Idle     Core and (optionally) peripheral PLLs disabled. LCD may continue to be refreshed via DMA.
Standby       Processor and peripheral state retained. SRAM contents may be retained.
Sleep         No state retained except for general-purpose IO (GPIO). SRAM contents may be retained.
Deep Sleep    No state retained.

The processor supports dynamic runtime scaling of internal core and bus frequencies, as well as core voltage. Commands to change the voltage to an external power-management IC (PMIC) chip are sent via a dedicated I2C interface. Other low-power features of the processor include internal SRAM powered at 1.1V, and support for SDRAM down to 1.8V.

Intel provides the Wireless Intel SpeedStep power manager in all BSPs. This software solution optimizes the usage of the low-power capabilities listed above to help maximize battery life for phone standby time, talk time, and when running applications. To include this solution, OEMs/ODMs should make sure hardware systems support the power manager, include the power manager in BSPs, and modify the platform-specific layer for phones and PDAs. OS vendors are not required to modify operating systems, and ISVs do not need to modify applications. The power manager will adapt the power policy to workloads and usage scenarios automatically, without user intervention. However, the power manager provides an Applications API so that ISVs can enhance or fine-tune applications to achieve power savings. The power manager also provides a User API so end users can control the power policy.

The Wireless Intel SpeedStep Power Manager consists of five software components or modules as shown in Figure 5.

■ The policy manager determines the system power policy using several inputs, then uses OS services or its own services to dynamically scale power and performance. The policy includes defining the new operating system states and desired frequency and voltages for the processor based on the workload. The policy manager also determines the operating system power state. If this state is not supported by the specific operating system then the power manager will create the state and use the driver interface to transition the operating system into the new state.
■ The idle profiler monitors the OS idle activity within the operating system for a given workload and provides input to the policy manager.
■ The performance profiler monitors the CPU percent utilization and memory usage through the performance-monitoring unit (PMU), a feature in Intel XScale® microarchitecture. This profiler determines if the workload is CPU-bound and/or memory-bound, then provides input to the policy maker in the policy manager.
■ The user interface allows users to tune the parameters used to determine the power policy.
■ The operating system mapping layer allows the power manager to be ported across multiple operating systems.

Figure 5: Wireless Intel SpeedStep® Power Manager architecture
(Blocks shown: applications and user settings; the power manager's policy manager, idle profiler and performance profiler; OS mapping to the operating system's power management, services and scheduler; device drivers for keypad, audio, display, communications, USB and battery; PMU hardware.)
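To make the division of labor between the profilers and the policy manager more concrete, the following is a purely hypothetical sketch of a threshold policy: a profiler sample goes in, and the index of the next core operating point comes out. The struct, function names, thresholds (80% and 30%), and the idea of stepping one operating point at a time are all assumptions made for illustration. This is not the Wireless Intel SpeedStep Power Manager's actual algorithm or API; the only values taken from this paper are the core frequencies.

```c
#include <assert.h>
#include <stddef.h>

/* Core frequencies (MHz) named in this paper. Treating them as a simple
 * ordered table of operating points is an assumption of this sketch. */
static const unsigned k_core_mhz[] = { 208u, 312u, 416u, 520u, 624u };
#define NUM_POINTS (sizeof k_core_mhz / sizeof k_core_mhz[0])

/* Hypothetical profiler output for one sampling interval. */
typedef struct {
    unsigned cpu_util_pct; /* 0..100, CPU utilization */
    int memory_bound;      /* nonzero if the workload mostly waits on memory */
} workload_sample;

/* Pick the next operating point index from the current one.
 * - Busy and CPU-bound: step up one point if possible.
 * - Mostly idle: step down one point if possible.
 * - Memory-bound workloads are not stepped up, since a faster core
 *   clock would mostly wait on memory (illustrative heuristic). */
size_t next_operating_point(size_t current, workload_sample s)
{
    assert(current < NUM_POINTS);
    assert(s.cpu_util_pct <= 100u);
    if (s.cpu_util_pct > 80u && !s.memory_bound && current + 1 < NUM_POINTS)
        return current + 1;
    if (s.cpu_util_pct < 30u && current > 0)
        return current - 1;
    return current;
}

/* Core frequency in MHz for an operating point index. */
unsigned operating_point_mhz(size_t index)
{
    assert(index < NUM_POINTS);
    return k_core_mhz[index];
}
```

A real policy would also weigh idle-profiler input, user settings, battery state, and the cost of each voltage and frequency transition.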

Figure 6: Communications subsystem
(Blocks shown: applications, radio and OS services; kernel OS power management; communications, USB and audio drivers; PMIC; the MSL driver linking the applications subsystem to the communications subsystem; and, on the communications side, the AT/APEX interface, power management, protocol stack L1 and L2/L3, and communications firmware.)

Wireless Intel SpeedStep Technology Power Manager provides a device driver interface and an application interface so that the power policy is optimized, comprehensive and robust. All the device drivers must register with the power manager through the device driver APIs to get notification on all of the power management events, such as state transitions, frequency change and voltage change. If the state is supported by the operating system, then the power manager uses the operating system interface to notify the device drivers. Otherwise, the device driver interface created by the power manager is used. Upon receiving a callback for a power management state transition or event, the device driver needs to transition to the new state and prepare the device for the next state transition. Device drivers have the flexibility to request a state change. For example, the battery device driver provides input into the policy manager for state changes based on thresholds for the battery status.

The software components of the communications subsystem are shown in Figure 6, illustrating the link between the Application Subsystem and the Communications Subsystem, which uses the serial or SSP Intel® Mobile Scalable Link (Intel® MSL) in the Intel PXA27x processor. The communications device driver running on the applications subsystem is a client of Wireless Intel SpeedStep Technology Power Manager and the OS power management, and receives notifications from each on appropriate state transitions.

Example: When the OS goes into the standby state, the communications driver is notified of this state change and signals the communications power management component about the state change. The communications subsystem is then put into the low-power standby state and prepares to wake up when a new state requires processing on the Applications Subsystem. Similarly, for dynamic performance and power scaling, the communications device driver is notified about a frequency or voltage change, which causes the appropriate communications software to be notified. The power management component of the communications software (including the L1, L2 and L3 protocol layers) is responsible for managing the power for the communications subsystem in each state for GSM and GPRS.

Details about the Intel PXA27x processor power states can be found in the Intel PXA27x Processor Developer Manuals. Detailed information about the power manager can be found in a Wireless Intel SpeedStep Technology Power Manager application note, as well as in Intel® BSPs.

6. Intel® Integrated Performance Primitives (Intel® IPP)

Intel IPP is a performance library of optimized algorithms created to ease development of optimized applications. Intel IPP includes general signal and image processing primitives, as well as primitives that can be used to construct internationally standardized audio, video, image, and speech encoder/decoders (codecs) optimized for the Intel PXA27x processor and Intel Wireless MMX technology.

Intel IPP also supports application porting across certain Intel platforms. Intel IPP supports a consistent Application Programming Interface (API), so applications can be ported easily across Intel® server, desktop, and handheld processors without sacrificing performance.

The following primitives are available for general one-dimensional (1D) signal processing:

■ Vector initialization, arithmetic, statistics, thresholding, and measure
■ Deterministic and random signal generation
■ Convolution, filtering, windowing, and transforms

Primitives for general two-dimensional (2D) image processing include the following:

■ Vector initialization, arithmetic, statistics, thresholding, and measure
■ Color space conversions
■ Morphological operations
■ Convolution, filtering, windowing, and transforms

Additional primitives are available that allow construction of the following multimedia codecs:

■ Video: ITU H.263 decoder, ISO/IEC 14496-2 MPEG-4 decoder
■ Audio: ISO/IEC 11172-3 and 13818-3 (MPEG-1, -2) Layer 3 ("MP3") decoder
■ Speech: ITU-T G.723.1 codec and ETSI GSM-AMR codec
■ Image: ISO/IEC JPEG codec
■ Cryptography

A companion library, Intel® Graphics Performance Primitives (Intel® GPP), includes primitives for 2D and 3D graphics. Intel GPP includes the following optimized primitives:

■ Data-type conversion
■ Fixed-point arithmetic, including trigonometric functions
■ Vector and matrix operations
■ Rasterization

Primitives for image capture:

■ Color format conversions: YCbCr 4:2:2 to 4:2:0
■ Gamma correction
■ Image scaling
■ Frame stabilization

For more information about Intel IPP, visit http://www.intel.com/software/products/ipp/.

Figure 7: Intel® Software Development tools
(Blocks shown: a compiler system with C/C++ compiler, intrinsic functions, assembler, linker, object filters, C library, C++ library, FPU library and an interface to specific hardware; and debuggers, with the XDB debugger working against a platform simulator, the Intel XScale® simulator, JTAG, and a ROM monitor, including OS-aware task debugging.)
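Among the image-capture primitives listed for Section 6 is the YCbCr 4:2:2 to 4:2:0 color format conversion, the same step used on the outgoing video encode stream in Section 4. As a plain C reference for what that conversion does to a chroma plane, the sketch below halves the vertical chroma resolution by averaging each pair of rows. It is an illustration only: the function name, the planar layout, and the averaging filter are assumptions, and it does not use or reproduce any Intel IPP function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one planar chroma plane (Cb or Cr) from 4:2:2 to 4:2:0.
 *
 * In 4:2:2 the chroma plane has chroma_width x height samples; in 4:2:0
 * it has chroma_width x (height / 2). Each output row is the rounded
 * average of two vertically adjacent input rows (one possible filter;
 * simply dropping every other row is another). The luma plane is
 * unchanged by this conversion and is not handled here.
 *
 * src holds chroma_width * height bytes, dst chroma_width * height / 2. */
void chroma_422_to_420(const uint8_t *src, uint8_t *dst,
                       size_t chroma_width, size_t height)
{
    assert(src != NULL && dst != NULL);
    assert(height % 2 == 0);
    for (size_t row = 0; row < height / 2; ++row) {
        const uint8_t *top = src + (2 * row) * chroma_width;
        const uint8_t *bottom = top + chroma_width;
        uint8_t *out = dst + row * chroma_width;
        for (size_t col = 0; col < chroma_width; ++col)
            out[col] = (uint8_t)((top[col] + bottom[col] + 1) / 2);
    }
}
```

The function would be called once for the Cb plane and once for the Cr plane of each frame before handing the result to an encoder that expects 4:2:0 input.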

A JTAG hardware connector on the target system is not required. JTAG and OS-aware debuggers ■ ■ ■ 7. including breakpoints. ■ Compiler system. and other OS activities. These macro-like functions help the compiler keep track of registers for optimization and eases code maintenance. Intel Wireless MMX technology and other coprocessor registers. ■ ■ System and application developers who use the Windows CE . including compiler. and on-chip peripherals. and a vectorizer Linker with optimized C-runtime and floating-point libraries First class C/C++ debugging support. Microsoft Windows Mobile 2003 software for Pocket PCs and Smartphones.NET and Windows Mobile 2003 can use Microsoft eMbedded Visual ■ 7.NET and Windows Mobile 2003 typically use Microsoft Platform Builder* to build and debug their target platform.2 Debuggers The Intel Software Development Tools also provide a comprehensive set of debuggers to support all phases of development.x Support for Intel Wireless MMX technology instructions includes assembly language and inline assembly of those instructions. The ROM Monitor Debugger is a software solution and is mainly used for ISV application debugging. linker. which are seamlessly integrated into those tools with an additional debugger window. Direct access to registers that control Intel XScale technology. assembler. Palm OS. A JTAG debugger version allows access to the hardware through JTAG so that hardware prototypes can be fully tested without having to debug client software running on the target. and communicate directly to the target via JTAG or TCP/IP. Another way to use Intel Wireless MMX technology is to link in code developed using the Intel IPP. ■ ■ ■ ■ Large set of optimization switches PNO (Pace Native Objects or “ARMlets”) support for Palm OS v5.NET environments can use the Intel debugging extensions. threads. 
Important features of the compiler systems include: ■ Appropriate OS-awareness plug-ins can be loaded either into the JTAG debugger or into the ROM Monitor Debugger. including the following features: ■ Support for Intel Wireless MMX technology using assembly language and inline assembly. so that developers can follow the kernel with its task switches. ■ ■ Intrinsics allow Intel Wireless MMX technology instructions to be used in C/C++ code without the use of inline assembly language. A bit field editor and description of all bit fields for each register is particularly useful for low-level driver development. The Intel debuggers also provide eXDI debugger extensions for access with Macraigor Raven* and EPI JTAG tools (available through EPI*). ISVs developing applications for Microsoft Windows CE . and other OS constructs. as well as intrinsics and vectorization. Vectorization is a feature that uses Intel Wireless MMX technology to vectorize code that is normally scalar. The tools include: ■ ■ A simulator debugger contains the silicon model with simulated peripheral and device registers. The XDB debuggers provide full Flash memory support to burn an OS image into board memory. The vectorizer identifies code that can be run as SIMD operations and attempts to use Intel Wireless MMX technology instructions to implement those areas of code. and stepping through code by using script language. and optimized libraries Debuggers for all stages of system software and application development including simulators.1 Compilers Intel® compilers are highly optimized for Intel PCA processors. Execution Trace Support displays code execution history that occurred before breakpoint encountered.Optimization Technology for the Intel® PXA27x Processor White Paper 7. The compiler systems generate code for Microsoft Windows CE. intrinsics. Support for scripting for batch files and for automated validation allows “overnight” debugging. 
Intel® Software Development Tools Intel Software Development Tools are comprised of compiler and debugger tools for building and debugging system platforms and applications. 11 . All debuggers share a common GUI and have the same basic functionalities. OEMs developing on Microsoft Windows CE . OS awareness plug-ins allow inspection of tasks. Nucleus* OS and for OSindependent systems. queue and semaphore tables. eliminating the need for real hardware in this phase of the development cycle. local variable and memory display.NET.

… C++* to build and debug their applications on an OEM or ODM's device. eMbedded Visual C++ provides debugger extensions to allow direct access of the Intel XScale technology peripheral registers on that device. The extensions only work, however, if the OEM/ODM builds three specific ioctl functions into their system ROM or Flash that are part of the Intel BSPs for CE .NET. System developers need to keep these functions in their device BSPs to enable ISVs to develop and showcase applications onto those devices.

The Intel XDB Debugger also supports communication over a ROM monitor for ISVs developing on off-the-shelf target devices. This is currently available for Palm OS only. For Palm OS, system developers need to implement the ROM monitor to ensure that ISVs can debug applications on these devices.

Intel Software Development Tools are available in the following packages:

■ Intel® C++ Compiler, For Platform Builder For Microsoft Windows CE .NET and for Windows Mobile 2003
■ Intel C++ Compiler, For Microsoft eMbedded Visual C++
■ Intel® C++ Software Development Tool Suite, For Palm OS, Symbian OS, Nucleus OS, and OS independent systems

For more information on the Intel Software Development Tools, visit http://www.intel.com/software/products/compilers.

8. Intel® VTune™ Performance Analyzer

The Intel® VTune™ Performance Analyzer helps developers to optimize software on Intel processors, including Intel XScale technology-based processors like the Intel PXA27x processor. This tool works in a host/target development environment, with performance data gathered by a data collector running on a target development system with Intel XScale technology. Software running on Intel applications processors can be profiled remotely using the Intel VTune performance analyzer GUI on the host system.

The Intel VTune performance analyzer provides performance data from the system level down to the source level, and suggests techniques for improving code performance. Using the Performance Monitoring Unit (PMU) in the processor, all program activity can be profiled at the microarchitecture level. Sampling provides developers with the most accurate representation of actual software performance, with negligible overhead. With intimate knowledge of the processor, the Intel VTune performance analyzer provides additional analysis of potential stalls and latency issues. The Intel VTune analyzer remote collector component identifies potential performance issues and provides recommendations for improving software utilization of the processor hardware and design.

The Intel VTune performance analyzer is available in beta for Microsoft Windows 2003 Mobile Software and Linux targets using the Intel PXA27x processor. To provide Intel VTune support within handheld and cell phone devices to ISVs, OEMs need to keep Intel VTune support in specific BSPs by following the steps:

Windows 2003 Mobile Software-based devices:
■ Keep the Intel VTune (PMUDLL.dll) feature in Intel's BSPs based on Microsoft Windows 2003 Mobile Software (Pocket PC and Smartphone). Intel provides BSPs for Windows 2003 Mobile Software (Smartphone and Pocket PC) for Intel PCA processors, and the Intel VTune feature is part of the BSPs. No modifications on the BSP are required.

Linux-based devices:
■ Rebuild the Intel VTune Device Driver Sources once and provide binary as loadable module to ISVs.

9. Intel® PCA Developer Network

With more than 6,500 companies, 3,000 individual members, and over 750 available solutions, the Intel PCA Developer Network is a global community of hardware and software developers working to accelerate the delivery of next-generation wireless Internet applications and client devices. The Network supports wireless device and equipment manufacturers, application developers and service providers in these key areas:

■ Software optimization for processors based on Intel XScale technology
■ Development support
■ Tools and technical support for Intel PCA building blocks
■ Marketing support and co-marketing opportunities
■ Solutions for wireless carriers and operators
■ Solutions for the wireless enterprise

Visit http://developer.intel.com/pca/developernetwork/index.htm and see what additional resources and support await you today.
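As a rough illustration of the sampling approach described for the Intel VTune performance analyzer in section 8, the sketch below shows how periodically sampled program-counter values could be attributed to functions in order to find hotspots. It is not the analyzer's implementation and does not touch the PMU. The address ranges, the sample values, and the names `func_range`, `attribute_samples`, and `hottest` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range occupied by one function in the target image. */
struct func_range {
    const char *name;
    uint32_t start;  /* first address of the function (inclusive) */
    uint32_t end;    /* one past the last address (exclusive) */
};

/*
 * Attribute each sampled program-counter value to the function whose
 * address range contains it. hits[i] receives the sample count for
 * funcs[i]. Returns the number of samples that matched no range.
 */
size_t attribute_samples(const uint32_t *samples, size_t n_samples,
                         const struct func_range *funcs, size_t n_funcs,
                         size_t *hits)
{
    size_t unmatched = 0;
    assert(samples != NULL && funcs != NULL && hits != NULL);
    for (size_t f = 0; f < n_funcs; ++f)
        hits[f] = 0;
    for (size_t s = 0; s < n_samples; ++s) {
        size_t f;
        for (f = 0; f < n_funcs; ++f) {
            if (samples[s] >= funcs[f].start && samples[s] < funcs[f].end) {
                ++hits[f];
                break;
            }
        }
        if (f == n_funcs)
            ++unmatched;
    }
    return unmatched;
}

/* Index of the function with the most samples, i.e. the hotspot. */
size_t hottest(const size_t *hits, size_t n_funcs)
{
    size_t best = 0;
    for (size_t f = 1; f < n_funcs; ++f)
        if (hits[f] > hits[best])
            best = f;
    return best;
}
```

Because only a small record is stored per sample, the cost on the target stays low, and the share of samples landing in each function approximates the share of execution time spent there. That is why a function with a large sample count is a natural first candidate for optimization.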

10. Summary

This paper provided a framework for how to use the tools and features available to optimize systems and applications for the Intel PXA27x processor. For more details on how to get maximum battery-life and performance from devices and applications based on the Intel PXA27x processor, read the Intel® PXA27x Processor Optimization Guide and be sure the following technologies are used and enabled:

■ Wireless Intel SpeedStep® Technology: allows customers to dynamically adjust the power and performance of the processor based on CPU demand. This can result in a significant decrease in power consumption for wireless handheld devices.
■ Intel® Wireless MMX™ Technology: an advanced set of multimedia instructions helps bring desktop-like multimedia performance to Intel PXA27x processor-based clients, while minimizing the power needed to run rich applications.
■ Intel® Quick Capture Technology: provides the ability to get live video and high-quality still images from a wide range of camera sensors in current and future camera-enabled mobile handsets and PDAs.
■ Operating System Board Support Packages (BSPs): Intel provides BSPs for a variety of operating systems including Linux, Palm OS, Microsoft Windows Mobile for PocketPCs and Smartphones, Microsoft Windows CE .NET, and Symbian OS. The BSPs include the latest optimizations and drivers for the Intel PXA27x processor and make it easy for customers to create a BSP customized for their own mobile device.
■ Intel® Integrated Performance Primitives (Intel® IPP): a cross-platform software library that allows users to write optimized applications that utilize Intel Wireless MMX technology to maximize performance on the Intel PXA27x processor. More information on Intel IPP is available at http://www.intel.com/software/products/ipp
■ Intel® Software Development Tools (Intel® SDT): provides both an optimizing compiler and a set of sophisticated, high-level language debuggers to help software run at top speed. More information is available at http://www.intel.com/software/products/compilers
■ Intel® VTune™ Analyzer: enables profiling of applications for activity hotspots. A tuning assistant provides support to optimize C/C++ code and/or assembler sequences. More information is available at http://www.intel.com/software/product/vtune
■ Intel PCA Developer Network: provides information on third-party software applications that are already optimized, as well as optimization labs and support to answer questions. With over 1,000 companies and over 3,000 different software and hardware solutions, the Intel PCA Developer Network can help customers find the value-add solution they need for their mobile device. http://www.intel.com/pca

Refer to the checklist in Appendix A for a detailed checklist of optimization opportunities. For other questions, please contact your local Field Applications Engineer or sales office.

Appendix A. Overview of Key System Optimizations

■ Make sure the device BSP and OS allow the presence of the Intel PXA27x processor to be detected.
■ Make sure the Intel Wireless MMX technology coprocessor is enabled in the device BSP.
■ Make sure the registers specific to Intel Wireless MMX technology are preserved across context switches and changes in power state.
■ Use internal SRAM to help reduce power and increase performance.
■ Set CCCR[A] and CLKCFG[B]=1 to maximize memory performance.
■ Make sure to set the bus arbiter and parking settings properly.
■ Use Overlay 2 to perform hardware color conversion.
■ Use the Intel Quick Capture Interface to ease interfacing with cameras.
■ Port the Wireless Intel SpeedStep Power Manager software into the device BSP.
■ Use the Wireless Intel SpeedStep Power Manager application API in applications to tune power consumption.
■ Use the Intel Integrated Performance Primitives to develop codecs with optimized performance.
■ Use the Intel Graphics Performance Primitives to develop 3D pipelines with optimized performance.
■ Use the Intel Software Development Tools (Intel C/C++ compiler) to build applications enabled with Intel Wireless MMX technology.
■ Use the Intel XDB Debugger to debug applications.
■ For OEMs building devices based on Microsoft Windows CE .NET, Microsoft Windows Mobile 2003* software for Pocket PC and for Smartphone: build in the specific ioctl() functions compatible with Intel XDB Debugger into the device system to allow direct debugging using Microsoft eMbedded Visual C++.
■ For OEMs building devices based on Palm OS: build in the ROM monitor compatible with the Intel XDB Debugger into your device's system to allow debugging.
■ Use the Intel VTune Performance Analyzer to help profile and optimize applications.
■ For OEMs building devices based on Microsoft Windows CE .NET, Microsoft Windows Mobile 2003 software for Pocket PC and for Smartphone: keep the Intel VTune library (PMUDLL.dll) in your device to enable ISVs to use VTune to tune applications for your device. No modifications on BSPs are required.
■ For OEMs building devices based on Linux: rebuild the Intel VTune device driver sources once and provide binary as loadable module to ISVs.
■ Refer to the Intel PXA27x Processor Developer Manual and the Intel PXA27x Processor Optimization Guide for detailed tips on optimizations.

For more information, visit the Intel Web site at: developer.intel.com

Performance tests and ratings contained within this document are measured using specific computer systems and/or components and reflect the approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference www.intel.com/procs/perf/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.

Copyright © 2004 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Centrino, Intel SpeedStep, Intel XScale, MMX, Pentium, Personal Internet Client Architecture, VTune, and Wireless MMX are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

300869-001 0304/MS/MD/PDF