
In this series, we bring you a number of articles on how you can use power debugging to minimize power consumption in an embedded system.

Part 5: A trip down memory lane to save power


One of the most costly operations in terms of power consumption in an embedded system is accessing memory. If we can optimize memory accesses, we will save power. There are two general principles for doing this: minimize the total number of memory accesses, and perform the ones we do need from a memory as close to the CPU as possible. Every system has a memory hierarchy, and the further away from the core a memory is, the more expensive it is to access, regardless of whether the cost is measured in time or in energy. In a microcontroller, the physical memory hierarchy typically consists of registers, RAM and Flash. More advanced processors also have a cache hierarchy, but it is usually transparent to the application programmer and we will not discuss it in this article.

Starting from the bottom of the memory hierarchy, the first thing we should do is move items from Flash into RAM, so that every access to them consumes less power. Given the often small RAM sizes available on microcontrollers, we seldom have the luxury of running the entire program from RAM; instead, we run it from Flash. But there is a lot of power to be saved if we arrange for at least the most frequently executed code to run from RAM instead of from Flash. This requires that the code is decomposed into functions. Using the function profiler, we can find out which functions execute most frequently and spend the most time executing, if this is not already known to us. These are the functions we should run from RAM, at least if they have a small enough memory footprint. If they are too big to fit in RAM, we should consider splitting large functions into several smaller ones. We can easily see the results using the power profiling functionality in the function profiler.
Alongside execution profiling, this window displays the power consumption for each function, including sample average, maximum and minimum. If changing the location of a function from Flash to RAM results in a change in power consumption, we can verify it instantly using the power profiling functionality.

Power profiling in IAR Embedded Workbench

Once we have put the critical functions into RAM, we can go one step further by making sure our functions handle data stored in memory as efficiently as possible. There are many practices and techniques for this, but the first thing we need to ask ourselves is: "Can we restructure the code to make fewer data accesses?" Rewriting algorithms is very specific to the application, but there are also some generic low-level practices to consider, and we will look at a few examples.

When a variable (or constant) is needed by the program, it is read into a register. While registers are the fastest and least expensive memories to access, they are also very limited in number, so not all variables will necessarily fit into the registers available at any given time. However, if the accesses to a specific variable are grouped together, the chances increase that it remains in a register between accesses, instead of being spilled to the stack while other data occupies the registers. It is also possible to use a #pragma directive to place a variable in a specific register and keep it there. This can be useful in some cases, but it also limits the compiler's ability to optimize the code, since it has fewer registers available.

Using global variables is often considered bad coding practice, but it is still common and, in some circumstances, for good reasons. If we access a global variable more than once during a function call, it can be a good idea to make a local copy of it, since the copy is normally kept in a register, which makes subsequent accesses to the variable much less costly.

Another example involving variables in registers concerns how they are handled when we make a function call. Normally, a function call means that registers are saved to memory, and when the function returns, the registers are restored again. By grouping function calls together, fewer saves and restores of the caller context need to be made, and accesses to main memory are avoided.

The effect of manipulating how variables are accessed and stored can be difficult to measure for a single read or write, and the effect on overall power consumption can be negligible, but when this is done systematically, the aggregated effect of all the saved memory accesses is well worth the effort. Optimizing memory access is one of the most fruitful modifications we can make from a software perspective to reduce power consumption in an embedded system. Memory accesses in themselves consume a lot of energy, so reducing their number will save power.
It also takes time to read and write data, so with fewer memory accesses we speed up the execution of the program and can spend more time in low-power mode. As with any optimization work, it becomes much easier if we can immediately see the result of our efforts instead of groping in the dark, guessing at the effects of the changes we make. The power debugging technology in IAR Embedded Workbench measures the effects of any changes we make and gives us immediate feedback, without us having to take the system into the lab.

Part 1: Introduction to Power Debugging
Part 2: Turn off your peripherals!
Part 3: Got some time to spare? Take a nap!
Part 4: Haste makes waste
Part 5: A trip down memory lane to save power
