
Code_Generation_Tools_FAQ

Contents
• 1 Q: I get SPC offset link errors when I try to use trampolines with -ml0. Why?
• 2 Q: I want to ensure priority ordering of sections across chips with different internal memory sizes with 1 command file
• 3 Q: Why do I need -w link flag?
• 4 Q: Can I use CCS macros, such as $(Install_dir), inside my *.opt (and *.cmd) files?
• 5 Q: What's the benefit of using near calls + linker trampolines .v. all-far-calls?
• 6 Q: How to implement right shift for 64 bit numbers? Is there any intrinsic to be used?
• 7 Q: Difference between long and long long in C6000 compilers?
• 8 Q: I installed the latest version of ActivePerl (5.10.xx) but some of TI's scripts don't work (i.e., <genAIS.pl>). What happened?
• 9 Q: What's the difference between HEAP and STACK?
• 10 Q: Restrictions on location of .const and .bss
• 11 Q: When using inline assembly instructions on the TI C compiler, why the code breaks at run-time?
• 12 Q: What standards does the TI compiler adhere to?
• 13 Q: Fast RTS library
• 14 Q: What does the _eh mean at the end of some of the run time support libraries? For example, what is the difference between rts64plus.lib
and rts64plus_eh.lib?
• 15 Q:How can I use C to Access Data Stored in Program Space Memory on the TMS320C24x DSP?
• 16 Q: What is the use of the `interrupt` keyword while declaring an ISR? Is it mandatory to declare all the ISR`s using this keyword?
• 17 Q:How do I reduce the size of my DSP executable?
• 18 Q: Share C headers between C and assembly code
• 19 Q: Simple C Tuning
• 20 Q: Accessing structure in assembly that is defined in C
• 21 Q: How to resolve error "remark #562-D: "typeid" is reserved for future use as a keyword , error #20: identifier "typeid" is undefined"
• 22 Q: What happens when the MEMCPY function is called, or what is the importance of calling _strasg?
• 23 Q: When compiling my project I get the following error message, "undefined symbol: _strasgi. Error: symbol referencing errors- 'user.out' not
built". How can I fix it?
• 24 Q: How to reduce code size with -ms option?
• 25 Q: I'm using an older compiler to link in a library built with a newer compiler and I get an error message from the linker similar to: ">> error:
illegal relocation type 050002 found in section .debug_something, file filename". What's wrong, and what can I do about it?
• 26 Q:Can I split my .bss section into two different memories while building a library so that I can keep the non-time critical code on external
memory?
• 27 Q: what is the sequence order of memory loads / stores when the compiler schedules them in parallel in the same execute packet?
• 28 Q: does the TI compiler support variadic macros?
• 29 Q: I am porting my C project to C++ and the compiler does not recognize the _nassert keyword

Q: I get SPC offset link errors when I try to use trampolines with -ml0. Why?

You may be keen to use the new C6000 Codegen 5.x trampolines feature to gain potential code and cycle savings. However some users have reported
getting Section Program Counter errors when changing from -ml3 to -ml0 and --trampolines.

Mixing near and far data references is the essence of the problem. A variable declared as far can't be accessed as near (DP-relative offset). If all of your
sources were in 1 project you'd rebuild with -ml0 (far aggregate data, and near code), add the new C6000 Codegen 5.x --trampolines switch to your link
options, and your code would link correctly. However, virtually everyone uses code from pre-built libraries. Having near scalar data references via -ml0 in
your application won't work if you use data from a library built with -ml3 (far aggregate and scalar data, far code).

This FAQ shows an example of how this can manifest itself, and some potential workarounds.

Let's say we elect to update only the application to the new Codegen 5.0 and leave the modules (ALGRF, UTL, etc.) built with Codegen v4.36 (from CCS 2.21).

Changing the options to -ml0 (instead of -ml3) for Build, and adding --trampolines to the Linker options yields the following link error:

[Linking...] "C:\CCStudio\C6000\cgtools\bin\cl6x" -@"Debug.lkf"


<Linking>
>> warning: Detected a near (DP-relative) data access to a far
(non-.bss-relative) symbol. The 'far' qualifier is required to
access this symbol from C/C++ source. Located in
C:\my_c_drive_stuff\localwork\cgt_cases\trampolines_faq\rf3_appcgt501_libscgt436\referenceframeworks\apps\rf3\dsk6416\Debug

To fix this we need to find the offending symbol. To do so, we add -k and -al to the build options to produce .asm and .lst files. Looking at <appMain.lst>, we
observe at the shown SPC offset 00000054:

143 00000054 0200006C! LDW .D2T1 *+DP(_UTL_logDebugHandle),A4 ; |45|

UTL_logDebugHandle is accessed as a near reference (DP offset). However, it is declared in src/utl/utl_defaultlog.c as:

LOG_Handle UTL_logDebugHandle = NULL;

LOG_Handle data type is defined in <log.h> as :

typedef struct LOG_Obj *LOG_Handle;

LOG_Handle is a pointer, and a pointer is a scalar data type. The UTL module was built with -ml3, hence we are mixing a far declaration of
UTL_logDebugHandle with a near reference to it. This does not work.

To address this we can choose any of the options indicated below:

1. match up the data declaration with its reference i.e. add the 'far' qualifier.

extern far LOG_Handle UTL_logDebugHandle;

References from the application are now far, and its declaration is far in <utl.h>

Downsides of this approach are:

• It modifies the global header file <utl.h>. If you get a new version of RF you'd lose your change.
• Dependencies between code (<utl.h>) and build options are not good. It is better to make the declaration of the variable itself far as well, thus
eliminating the -ml3 <-> code dependency.
• If you are trying to use the same header files for products on both C5000 and C6000, this will be a problem. The 'far' keyword yields a build
error on C55x since it has no meaning on this architecture. This can be worked around via a #define of 'far' to be empty, but it's still a downside.
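
The #define workaround mentioned in the last bullet could look roughly like the sketch below. It assumes the C6000 compiler's predefined macro _TMS320C6X is what distinguishes the targets; the helper utl_log_is_unconfigured() is hypothetical and exists only so the fragment has something to call. In a real project the extern line would live in the shared header and the definition in a single source file.

```c
#include <stddef.h>

/* Assumption: _TMS320C6X is predefined only by the C6000 compiler.
 * Everywhere else the 'far' keyword is defined away to nothing. */
#if !defined(_TMS320C6X)
#define far
#endif

struct LOG_Obj;                      /* opaque here; the real type comes from <log.h> */
typedef struct LOG_Obj *LOG_Handle;

/* Shared header: the reference is explicitly far. */
extern far LOG_Handle UTL_logDebugHandle;

/* Defining source file: the definition is explicitly far as well, so it
 * no longer depends on the library being built with -ml3. */
far LOG_Handle UTL_logDebugHandle = NULL;

/* Hypothetical helper, not part of UTL. */
int utl_log_is_unconfigured(void)
{
    return UTL_logDebugHandle == NULL;
}
```

With both the reference and the definition marked far, the access no longer depends on which -ml[N] setting each side was built with.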

2. Rebuild UTL (and subsequent) libraries with -ml0.

In this approach we make all scalars declared and referenced as near.

However after rebuilding UTL 64x lib with -ml0, the application still yields the following link error:

[Linking...] "C:\CCStudio\C6000\cgtools\bin\cl6x" -@"Debug.lkf"


<Linking>
>> warning: Detected a near (DP-relative) data access to a far
(non-.bss-relative) symbol. The 'far' qualifier is required to
access this symbol from C/C++ source. Located in
../../../lib/utl.l64, section .text, SPC offset 00000014

The problem now is that the linker error message doesn't tell you which member (object file) of <utl.l64> is the culprit. You don't know where to look!

With CGT 5.0 in CCS 3.0 the following options are available to you:

• rebuild all libraries with the -ml0 switch instead of -ml3. This will ensure all data is declared and referenced as near. It may seem that this
imposes some constraints however as spraa46 points out "Under -ml0, only scalars are declared near. And the total size of all near data is limited
to 32Kbytes. The largest scalar types use up 8 bytes: long long, double, and long double. Even if all the scalars were one of those 8-byte types, it
would take 4,097 differently-named scalars to exceed the 32K byte limit. Exceeding such a large limit is highly unlikely". Note that you don't
need to rebuild each library with Codegen 5.0; you could keep Codegen v4.36 and just change -ml3 to -ml0. This might be perceived as less risk
for some customers.

• Dig deeper into the error message emitted. We know that the problem occurs at SPC offset 00000014. Hence we can search for '00000014'
in the .lst files of each object file that contributes to the library (via e.g. grep, perl, DOS find batch utilities). We're looking for global variable
references via the DP. In the above case we find in <utl_algmem.lst>:

96 00000014 0200006E!

| LDW .D2T2 *+DP(_ALGRF),B4 ; |48|

'ALGRF' is a pointer to a configuration structure ALGRF_Config in <algrf.h>. ALGRF is therefore a scalar hence again we are mixing a far declaration
(since ALGRF is built with -ml3) with a near access, this time coming from the UTL component.

Rebuilding ALGRF with -ml0 instead of -ml3 fixes the problem.
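
The search through the .lst files described above can be scripted. The following is a minimal sketch in C of the per-line check, assuming the listing layout seen in the excerpts (listing line number, then SPC offset, then opcode). Both function names are hypothetical, and a real script would also loop over the files of each library member.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the second whitespace-separated field of a listing line
 * (assumed to be the SPC column) equals `spc`, otherwise 0. */
int listing_line_has_spc(const char *line, const char *spc)
{
    char line_no[32];
    char field[32];

    if (sscanf(line, "%31s %31s", line_no, field) != 2)
        return 0;
    return strcmp(field, spc) == 0;
}

/* Returns 1 if the line contains a DP-relative operand such as *+DP(_sym). */
int listing_line_uses_dp(const char *line)
{
    return strstr(line, "*+DP(") != NULL;
}
```

A line that matches the reported offset and also uses a DP-relative operand is a candidate for the near access the linker is complaining about.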

3. Put .far within 32Kb of .bss.

This is unlikely to work. Since you built the modules and algorithm libraries with -ml3 you did it for a reason i.e. aggregate data typically needs to be
accessed as far because there's more than 32Kb of total aggregate and scalar data.

A G723 algorithm for example has approximately 19Kb of constant data tables on C6000.

4. Rebuild the application with Codegen 5.10 using new mem_model switch.

A new build option --mem_model:data=far is available in CGT 5.10. This deprecates -ml0 (although -ml0 is still available for legacy support). This switch
makes both aggregates and scalars far.

Note that trampolines are on by default in CGT 5.10, so you don't need the linker --trampolines switch. One other note regarding defaults: the 5.10
compiler defaults to --mem_model:data=far_aggregates, i.e. if you put no -ml[N] or --mem_model switches on the build command line, the data model will
be far aggregates (arrays, structures, etc.). Having options for both all data far and the lesser set far_aggregates gives users more options than existed in
previous C6000 compiler releases.

Q: I want to ensure priority ordering of sections across chips with different internal memory sizes with 1
command file

You may have a situation as follows:

DSP 1 has 64kb of internal memory and DSP 2 has 128kb. You then have 4 key code sections:

• .text:_a : 20kb
• .text:_b : 30kb
• .text:_c : 20kb
• .text:_d : 10kb

Assume that .text:_a is the most important to place in IRAM (e.g. it is the most MIPS intensive), followed by .text:_b, and so on. You want to have 1 .cmd file which works
across the 2 devices. On Device 1, you want .text:_a and .text:_b to be placed in IRAM but can accept .text:_c and .text:_d being allocated in SDRAM. On Device 2 you
want all .text subsections to be allocated in IRAM.

This can be achieved in C6000 Codegen 5.x (CGT 3.x on C55x) by doing:

GROUP: {
.text:_a
.text:_b
.text:_c
.text:_d
} >> IRAM | SDRAM

This tells the linker:

• Group the sections in the order listed. The GROUP directive is the only directive in the linker that forces ordering. Simply listing sections
does not guarantee that the linker will allocate them in the order you desire. In fact the linker typically allocates the largest sections first in an
effort to minimize potential memory map holes.
• Automatically split the output sections between the non-contiguous memory ranges IRAM and SDRAM. The >> operator indicates that the
contents of the GROUP can be split between IRAM and SDRAM. For example, when the linker encounters .text:_c, a 20kb section, it realizes
it cannot fit this section in IRAM on Device 1. Hence it skips it and tries to find a smaller section that does fit, such as .text:_d. On Device 2,
there's adequate room for subsections a through d, hence they all get placed sequentially as per GROUP rules. On Device 1, .text:_c will be placed
in SDRAM by the linker since it can fit there.

Individually both the GROUP and section splitting operators existed before Codegen 5.x. However starting from CGT 5.x you can use them together. This
achieves the dual effect of ordering and splitting across memory ranges.

Note that if you simply do:

.text:_a >> IRAM | SDRAM
.text:_b >> IRAM | SDRAM
.text:_c >> IRAM | SDRAM
.text:_d >> IRAM | SDRAM

then this handles the section splitting but does not guarantee the ordering. Hence some of the most MIPS-intensive functions may get allocated to SDRAM.

And, if you simply do:

GROUP: {
.text:_a
.text:_b
.text:_c
.text:_d
} >> IRAM

then you can't have 1 command file for the 2 devices. You would need a different one for Device 1 since all of these sections won't fit in the 64k IRAM.
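
The placement described for the combined GROUP and >> case can be checked with a toy model. The sketch below is not the linker's actual algorithm. It simply walks the sections in GROUP order and puts each one in IRAM if it still fits, otherwise in SDRAM, which is assumed to be large enough. The names place_in_order and enum region are hypothetical, and sizes are in kb.

```c
#include <stddef.h>

enum region { IRAM = 0, SDRAM = 1 };

/* Toy model only: visit sections in the listed order and place each one
 * in IRAM while it fits, otherwise fall back to SDRAM. */
void place_in_order(const unsigned *size_kb, size_t count,
                    unsigned iram_kb, enum region *placement)
{
    unsigned used = 0;   /* kb of IRAM consumed so far, never exceeds iram_kb */

    for (size_t i = 0; i < count; ++i) {
        if (size_kb[i] <= iram_kb - used) {
            placement[i] = IRAM;
            used += size_kb[i];
        } else {
            placement[i] = SDRAM;
        }
    }
}
```

With sizes {20, 30, 20, 10}, a 64kb IRAM yields IRAM, IRAM, SDRAM, IRAM, matching the Device 1 description, while a 128kb IRAM keeps all four sections in IRAM.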

Q: Why do I need -w link flag?

The linker has a flag -w which is defined to "Warn if an unspecified output section is created". Hence, if you do e.g. in assembly

.sect ".mysect"

or in C

#pragma CODE_SECTION(myFunc, ".mySect")

then such a section is not one of the standard sections (e.g. .text, .bss etc) that the linker expects.

If you do not explicitly place such a section in your .cmd file, the linker will arbitrarily place it for you. This can cause nasty runtime problems. For example,
cases have been observed where such user-defined sections have been placed in FLASH sections (but naturally they weren't actually FLASH'ed). As a result,
the code breaks.

The solution is simple : always use the -w linker flag. It will show a warning when such sections have not been placed.

An alternative solution is to always use subsections e.g.

#pragma CODE_SECTION(myFunc, ".text:mySect")

however, this is not practical if you are consuming libraries written by other parties, which may have explicit user-defined sections.

Best is to always link with -w.

Q: Can I use CCS macros, such as $(Install_dir), inside my *.opt (and *.cmd) files?

No. CCS macros are only supported in *.pjt files. CCS reads the *.pjt file and expands any CCS macros it recognizes before
executing a command such as a build request. For example, on a build command, if macros are used in the *.pjt file to specify include search paths:

-i"$(Install_dir)\c6000\csl\include"

CCS will read and expand this to (Assuming the CCS in use is installed in C:\CCStudio_v31):

-i"C:\CCStudio_v31\c6000\csl\include"

And pass this include search path to the compiler.

However using macros in *.opt files will fail since *.opt files are never read by CCS (and never expanded) but simply passed to the compiler as a valid
compiler option (-@"C:\myfile.opt"). Since the compiler does not understand the macro, the search path will not be valid.

The same limitation applies for *.cmd files.
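
To make the difference concrete, the expansion step that CCS applies to *.pjt text can be modeled as a plain string substitution. The sketch below is only an illustration of that idea, not CCS's implementation, and expand_macro is a hypothetical name. Text from an *.opt or *.cmd file never passes through such a step, so "$(Install_dir)" reaches the compiler or linker literally.

```c
#include <stdio.h>
#include <string.h>

/* Replaces every "$(name)" in `in` with `value`, writing the result to `out`.
 * Returns 0 on success, -1 if the macro name is too long or `out` is too small. */
int expand_macro(const char *in, const char *name, const char *value,
                 char *out, size_t out_size)
{
    char token[64];
    int token_len = snprintf(token, sizeof token, "$(%s)", name);
    size_t value_len = strlen(value);
    size_t used = 0;

    if (out_size == 0 || token_len < 0 || (size_t)token_len >= sizeof token)
        return -1;

    while (*in != '\0') {
        if (strncmp(in, token, (size_t)token_len) == 0) {
            if (used + value_len >= out_size)   /* keep room for the terminator */
                return -1;
            memcpy(out + used, value, value_len);
            used += value_len;
            in += token_len;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *in++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Applied to the include path example above with Install_dir set to C:\CCStudio_v31, the function produces the expanded -i option that CCS would pass on.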

Q: What's the benefit of using near calls + linker trampolines .v. all-far-calls?

First of all, let's be clear on the difference between near and far calls. As shown in this c64, c64Plus Compiler Overview (slides 27, 28), near-code with
linker trampolines (where necessary) is the recommended model.

spraa46 Chapter 6 gives a detailed explanation of near .v. far - we repeat the key points here.

NOTE - this FAQ focuses on code. Information on near .v. far data can be found in the C6000 memory models topic.

Let's see what near .v. far calls look like in C6000 assembly: -

; near call. Destination must be within +/- 1M words of the current PC

CALL _func1

; far call. Destination can be anywhere

MVKL _func1,A3
MVKH _func1,A3
CALL A3

Whenever the linker encounters a near call that cannot reach the destination, it redirects the call to a linker generated trampoline. The trampoline uses a large
branch instruction that can always reach the destination. Because the trampoline uses a branch instead of a call, the called function returns not to the
trampoline, but to the original call sequence. Multiple calls to the same function may reuse the same trampoline.
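
The reach test that decides whether a trampoline is needed can be sketched as follows. This is a simplified model, assuming the near call encodes a signed 21-bit word offset (consistent with the +/- 1M words stated above) measured from the call address itself. The real encoding is relative to the fetch packet, which this sketch ignores. The function name and the addresses used with it are illustrative only.

```c
#include <stdint.h>

/* Simplified model: returns 1 if `target_addr` is within the assumed
 * signed 21-bit word offset (about +/- 4 MB) of `call_addr`, else 0. */
int near_call_reaches(uint32_t call_addr, uint32_t target_addr)
{
    int64_t delta_bytes = (int64_t)target_addr - (int64_t)call_addr;
    int64_t min_bytes = -(INT64_C(1) << 20) * 4;        /* -1M words     */
    int64_t max_bytes = ((INT64_C(1) << 20) - 1) * 4;   /* 1M - 1 words  */

    return delta_bytes >= min_bytes && delta_bytes <= max_bytes;
}
```

For instance, a call at 0x00006fe8 cannot reach a function at 0x80000260 directly, but it can reach a trampoline placed at 0x00008480, which then branches to the function.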

So why use near calls with trampolines instead of far calls? Here is a long list of reasons...

1. Performance - comparing -ml3 (far calls and far data) .v. no -ml3 on TI's h264 base profile dm6446 decoder showed a 4% performance
improvement when removing -ml3. The performance gain comes because the majority of the time a trampoline is not needed.
2. Footprint - The near call is 8 bytes smaller than the far call. That rate of savings becomes a noticeable code size reduction when nearly all the
calls change from far to near. The same h264 bp dec exhibited a 3.5kb footprint reduction without -ml3. This can in turn lead to performance
improvements since smaller code equals better cache performance.
3. Stack depth analysis - it is very useful to find out the function call tree and stack depth. This can be done by object file parsing scripts such as TI's
cg_xml package. Code built with C6000 CGT 6.0.x does not distinguish between near .v. far .v. indirect calls in the generated Dwarf information.
Hence if you build your library with -ml3 then the call_graph script will treat far calls like indirect calls. The resulting stack size estimate may then be too small
because we don't see the true call tree.
4. Testing, support - neither the -ml3 option nor the option to explicitly make function calls far is documented from CGT 6.0.x onwards.

• Trampoline debugging.

The map file is a big help e.g. it shows: -

FAR CALL TRAMPOLINES

callee    addr      tramp     addr      call addr  call info
--------  --------  --------  --------  ---------  ----------------
_func4    80000260  .T$0001   00008480  00006fe8   demo.obj (.text)
_func5    80000200  .T$0002   000084a0  00006ff0   demo.obj (.text)

The column contents are as follows:

• callee - function called
• addr - address of callee
• tramp - automatically generated name for the trampoline
• addr - where the trampoline resides in memory
• call addr - list of addresses from which a call using the trampoline originates
• call info - the object file and input section that contain the originating call. If the object file is from a library, the name of the library is
shown as well.

• What can you do if you absolutely can't tolerate trampolines?

This is rare, but perhaps you have a custom loader that doesn't comprehend trampolines. What then? Any near call that needs more than a 4Mb address reach will fail
at the link stage!

One option, which we use in the DaVinci DVSDK, is to ensure all code lives within a 4Mb memory range. Typically this would be in external memory (e.g.
DDR2). In a BIOS application this might look as follows in a TCF file: -

var mem_ext = [
...
{
comment: "DDRcode: off-chip memory for code",
name: "DDRcode",
base: 0x8FA00000,
len: 0x00400000, // 4MB
space: "code"
},
];

var params = {
clockRate: 594,
catalogName: "ti.catalog.c6000",
deviceName: "DM6446",
regs: device_regs,
mem: mem_ext
};

utils.loadPlatform("ti.platforms.generic", params);

NOTE - it is important that you avoid having big globs of data (heap, stacks etc) linked in between the code sections! Otherwise your 4Mb will over-run
pretty quickly! One way to achieve this, again in TCF syntax is: -

bios.setMemCodeSections (prog, bios.DDRcode) ;

This ensures that all code sections are allocated in DDRcode. Put your data in a different memory section.

Assuming your total DSP code size is < 4Mb this should work. You can verify this by linking then looking at the map file - if you can't find the word
'trampoline' in there, then you're in good shape - no trampolines occurred.

Q: How to implement right shift for 64 bit numbers? Is there any intrinsic to be used?

Here are ways to access 64-bit values:

• To get the upper 32 bits of a long long in C code, use >> 32 or the _hill() intrinsic.

• To get the upper 32 bits of a double (interpreted as an int), use _hi().

• To create a long long value, use the _itoll(int high32bits, int low32bits) intrinsic.

For example, ((temp64) >> 40) can be written as (_hill(temp64) >> 8).
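
The equivalence used in that example can be checked on any host with portable stand-ins for the intrinsics. In the sketch below, high32() and make64() are hypothetical helpers that mimic what _hill() and _itoll() are described as doing. They are not the TI intrinsics themselves, and the example deliberately uses unsigned values, since a signed negative value would additionally need an arithmetic shift of the high word.

```c
#include <stdint.h>

/* Hypothetical stand-in for _hill(): upper 32 bits of a 64-bit value. */
uint32_t high32(uint64_t value)
{
    return (uint32_t)(value >> 32);
}

/* Hypothetical stand-in for _itoll(): build a 64-bit value from two halves. */
uint64_t make64(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Right shift by 40 expressed as "take the high word, then shift by 8". */
uint64_t shift_right_40_via_high(uint64_t value)
{
    return (uint64_t)(high32(value) >> 8);
}
```

For shift counts of 32 or more, only the high word contributes to the result, which is why the 64-bit shift reduces to a single 32-bit shift of the upper half.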

Q: Difference between long and long long in C6000 compilers?

On C6000, long (signed and unsigned) uses 40-bit data storage, while long long (signed and unsigned) uses 64-bit data storage.

Q: I installed the latest version of ActivePerl (5.10.xx) but some of TI's scripts don't work (i.e., <genAIS.pl>).
What happened?

Please see this forum entry

Q: What's the difference between HEAP and STACK?

Heap:

The heap is a large pool of memory that can be allocated at runtime by application programs.

Memory is allocated from a global pool, or heap, that is defined in the .sysmem section. You can set the size of the .sysmem section by using the
--heap_size=size option with the linker command. The linker also creates a global symbol __SYSMEM_SIZE, and assigns it a value equal to the size
of the heap in bytes. The default size is 1K bytes.

Dynamically allocated objects are not addressed directly; they are always accessed through pointers. Because the memory pool is in a separate section
(.sysmem), the dynamic memory pool can have a size limited only by the amount of available memory in your system. To conserve space in the
.bss section, you can allocate large arrays from the heap instead of defining them as global or static.
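
As a minimal sketch of that last point, the fragment below obtains a large table from the heap at run time instead of declaring it as a global array. The function names are hypothetical. On a TI target the allocation would come out of .sysmem, so the heap size would have to be set large enough (for example with --heap_size) for the request to succeed.

```c
#include <stdlib.h>

/* Allocates a table of `count` ints from the heap and fills it with
 * 0, 1, 2, ... Returns NULL if the heap cannot satisfy the request. */
int *create_table(size_t count)
{
    int *table = calloc(count, sizeof *table);

    if (table == NULL)
        return NULL;
    for (size_t i = 0; i < count; ++i)
        table[i] = (int)i;
    return table;
}

/* Returns the table's memory to the heap. */
void destroy_table(int *table)
{
    free(table);
}
```

Compared with a global `int table[1024];`, this trades a fixed .bss footprint for a run-time allocation that must be checked for failure.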

Stack:

The stack is where memory is allocated for automatic variables within functions.

The stack memory is used to:

• Allocate local variables


• Pass arguments to functions
• Save temporary results
• Save function return addresses

The runtime stack is allocated in a single continuous block of memory and grows down from the high addresses to the low addresses. The compiler uses the
hardware stack pointer (SP), to manage this stack. In C6000 the stack pointer is usually register B15.

Additional details can be found in several tool documents:

• TMS320C6000 Assembly Language Tools User's Guide (SPRU186)
• TMS320C6000 Optimizing Compiler User's Guide (SPRU187)
• DSP/BIOS User's Guide (SPRU423)

Q: Restrictions on location of .const and .bss

The .const and .bss sections generated by the compiler are accessed using "direct addressing" that uses the processor's DP register. For this reason, there are
special restrictions regarding their locations:

-Restrictions under Floating-Point compiler:

When using the small memory model, sections .const and .bss MUST be linked in the same data page (i.e. the 16 MSBits of their memory addresses must be the
same) because the compiler uses direct addressing to access them. Otherwise, the program may execute incorrectly. This doesn't apply to the
large memory model.

-Restrictions under Fixed-Point compiler:

No restriction exists in the location of .bss and .const because the Fixed-point compiler always initializes the DP before accessing each memory location.

Q: When using inline assembly instructions on the TI C compiler, why the code breaks at run-time?

The TI C compiler supports an asm() statement, which allows developers to insert assembly code within C functions. However, the compiler does not
analyze how the inserted assembly affects the surrounding C code. If the asm() code modifies registers required by the surrounding C code, data corruption will
occur and the program will fail at run-time.

The workaround for this issue is to save all registers modified by the inline assembly onto the system stack and restore them before returning to
the C code. This applies to all DSP targets.

Q: What standards does the TI compiler adhere to?

All TI compilers support:

• C Standard: ANSI X3.159-1989 (C89), which is the same as ISO/IEC 9899:1990.


• C++ Standard: ISO/IEC 14882:1998

We do not support: C95, C99, C++ 2003, C++ TR1.

Q: Fast RTS library

The Fast RTS Library is available as a separate installable. It is not bundled with Code Composer Studio as a standard library and can be downloaded from
TI website at the link given below.

C67x Fast RTS - [SPRC060]

In addition, the DSP library can be downloaded from the following link.

C67x DSP Library - [SPRC121]

Q: What does the _eh mean at the end of some of the run time support libraries? For example, what is the
difference between rts64plus.lib and rts64plus_eh.lib?

The _eh suffix means "exception handling". The compiler now supports C++ exceptions. Because they are costly in cycles and size, even if an exception is never
thrown, they are disabled by default. They can be enabled with the --exceptions compiler option. If the code is built with exceptions, then one of the exception
handling RTS libraries with _eh in its name has to be linked in.

Q:How can I use C to Access Data Stored in Program Space Memory on the TMS320C24x DSP?

On TMS320C24x devices, it is sometimes desirable to place data in program space memory rather than in data space memory. In particular, the on-chip
flash (or ROM) in the program space provides a large nonvolatile memory for storing constant arrays, look-up tables, and string tables. When working in the
C programming language, however, it is not sufficient to simply link the data into the program space, as the C-compiler expects all constants (and variables)
to be in data space memory. No mechanism exists in the C-compiler for accessing program space memory, other than at code-initialization time.

One method for overcoming this problem is to copy the data from the flash (or ROM) into data space RAM as part of the code-initialization process. This
can be done using a custom assembly code routine, or with the built-in capability the C-compiler provides for initializing global and static variables and
constants. The C-code can then access the copies of the data in the data space during code execution. The downside of this approach is that each word of
data now consumes two words of memory: the original data in the flash and a copy of the data in data space RAM. Large arrays of constants or string tables
can quickly use up the valuable on-chip RAM available in C24x generation DSPs.

A better approach is to temporarily copy the values of interest from the flash to data RAM only when they are needed at run time. C-code can then access the
temporary copy (e.g., as a local variable located on the software stack), and dispatch the value as required. This approach avoids the double memory usage
problem at the expense of using some CPU clock cycles to temporarily copy the data from program memory each time the value is accessed at run time.

Click here to download the PFUNC library.

Q: What is the use of the `interrupt` keyword while declaring an ISR? Is it mandatory to declare all the ISR`s
using this keyword?

ISA: C6000; All CCS:

The 'interrupt' keyword is not mandatory for declaring an ISR. The user can still branch to an ISR that was not declared using this keyword. Specifying
this keyword declares the ISR as a maskable interrupt function, which means that the interrupt can be disabled using a software instruction. For such
ISRs, there is no need to include the assembly instruction `B IRP` in the interrupt vector table. For functions that were not declared using the
'interrupt' keyword, the user has to explicitly use the `B IRP` instruction at the end of the assembly routine in the interrupt vector table.

Q:How do I reduce the size of my DSP executable?

Check out this Kbase article.

Note that there is no "selective size reduction" support in Code Gen Tools. FYI the --hide / --unhide support in CGT 6.1 does not reduce .out size.

Q: Share C headers between C and assembly code

The .cdecls directive allows programmers in mixed assembly and C/C++ environments to share C headers containing declarations and prototypes between
the C and assembly code.

The syntax is:

Single line:

.cdecls [options,] "filename"[, "filename2"[,...]]

Multiple Lines:

.cdecls [options]
%{
/*---------------------------------------------------------------------------------*/
/* C/C++ code - Typically a list of #includes and a few defines */
/*---------------------------------------------------------------------------------*/

%}

The Assembly Language Tools User's Guide (SPRU186) gives more information on .cdecls (assembler directives).

Note: The .cdecls directive is not supported in linear assembly.

Q: Simple C Tuning

The C6000 compiler delivers the industry's best "out of the box" C performance. In addition to performing many common DSP optimizations, the C6000
compiler also performs software pipelining on various MIPS intensive loops. This feature is important for any pipelined VLIW machine to perform well.

Each .asm file contains software pipelining information. The compiler provides some feedback by default. Additional feedback is generated with the -mw
option. In order to view the feedback, you must enable the -k option, which retains the .asm output from the compiler.

The feedback is located in the .asm file that the compiler generates. The application report SPRU198 provides a quick reference to techniques for optimizing
loops, including an overview of feedback and guidelines for responding to specific feedback messages. By understanding feedback, you can quickly tune
your C code to obtain the highest possible performance.

Q: Accessing structure in assembly that is defined in C

The structure defined in C can be accessed in an .asm file by using the .struct and .endstruct directives. Here the .struct directive assigns
symbolic offsets to the elements of a data structure definition. The .struct and .endstruct directives do not allocate memory. They simply create a
symbolic template that can be used repeatedly.

Alternatively, pass the base address of the structure in a register to the assembly function. Members of the structure can then be accessed by manually specifying
the offsets from the base address of the structure. For more information, refer to the Assembly Language Tools User's Guide (SPRU186).
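
When the offsets are specified manually, they have to match the layout the C compiler actually chose. One hedged way to obtain them is to let the compiler report them with the standard offsetof macro, as in the sketch below. The FilterParams structure and the accessor functions are hypothetical and serve only to illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure shared between C and assembly code. */
typedef struct {
    int32_t  gain;     /* expected at byte offset 0 */
    int16_t  shift;    /* expected at byte offset 4 */
    int16_t  flags;    /* expected at byte offset 6 */
    int32_t *buffer;   /* expected at byte offset 8 */
} FilterParams;

/* Byte offsets as seen by the C compiler; the assembly code would add
 * these to the base address passed in a register. */
size_t filter_gain_offset(void)   { return offsetof(FilterParams, gain); }
size_t filter_shift_offset(void)  { return offsetof(FilterParams, shift); }
size_t filter_flags_offset(void)  { return offsetof(FilterParams, flags); }
size_t filter_buffer_offset(void) { return offsetof(FilterParams, buffer); }
```

Checking these values once, for example in a unit test, guards against the assembly offsets silently drifting when the C structure changes.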

Q: How to resolve error "remark #562-D: "typeid" is reserved for future use as a keyword , error #20: identifier
"typeid" is undefined"

The TI compiler supports the typeid keyword, but the -rtti flag must be enabled. In CCS, check Project->Build Options->Compiler->Parser->Enable Support C++
Run-Time Type Information (-rtti).

Q: What happens when the MEMCPY function is called, or what is the importance of calling _strasg?

A call to MEMCPY in the code may call either of two underlying functions in the library:

_strasg - This function disables interrupts by clearing the GIE bit in the CSR.

_memcpy - This function does not disable interrupts.

strasg() is just a version of memcpy() specialized and optimized for certain conditions. The choice of strasg() vs. memcpy() depends on the
alignment and length of the object to be copied. If the alignment is exactly 32 bits and the length is greater than 28, strasg() is called.

Note: From compiler release 6.0.1, the internal RTS function __strasg is replaced with __strasgi. The new __strasgi can be interrupted.

Q: When compiling my project I get the following error message, "undefined symbol: _strasgi. Error: symbol
referencing errors- 'user.out' not built". How can I fix it?

You must be linking an object module built with compiler version 6.0.X (or higher) with a runtime support (RTS) library from compiler version 5.1.X (or
lower).

The function __strasgi is called by the compiler to perform structure assignment, and other similar memory copy operations. This is termed an "internal"
RTS function because it is not called directly by the user. Floating point operations on fixed point devices are similarly performed with internal RTS
functions. Starting with compiler release 6.0.1, the internal RTS function __strasg is replaced with __strasgi. Note the "i" at the end of the new
function name; it stands for interruptible. The new __strasgi can be interrupted. The older __strasg cannot be interrupted.
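
As an illustration of the kind of source construct involved, the sketch below copies a structure once by plain assignment and once with memcpy(). The Block type and function names are hypothetical. Whether a particular compiler version turns the assignment into inline code, a memcpy() call, or a call to an internal RTS helper such as __strasgi depends on the compiler and on the object's size and alignment, so this only shows where such a call can originate.

```c
#include <string.h>

/* Hypothetical structure large enough that a compiler may use a helper
 * routine rather than a few inline loads and stores. */
typedef struct {
    int coeff[16];
    int length;
} Block;

/* Structure assignment: the compiler chooses how the copy is performed. */
void copy_by_assignment(Block *dst, const Block *src)
{
    *dst = *src;
}

/* Explicit library call with the same observable result. */
void copy_by_memcpy(Block *dst, const Block *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Both functions leave dst with the same contents, so no call to the helper is visible in the user's own source code.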

The call to __strasgi is most likely brought into the build environment by an object file or library obtained elsewhere (say, vendor.obj). This means it was built elsewhere as well, and
the build of vendor.obj used compiler version 6.0.X (or higher). Such a build may place a call to __strasgi in the code. If that is the case,
then that code has to be linked with an RTS library that has a definition of the __strasgi function. Libraries from compiler version 5.1.X (or older) do not
have this function defined.

Two solutions exist. One: Upgrade from compiler version 5.1.X to 6.0.X (or later). Two: Change to using an older vendor library that was built with
compiler version 5.1.X (or earlier).

Q: How to reduce code size with -ms option?

When using the -O or -on option, you are telling the compiler to optimize your code. The higher the value of n, the more effort the compiler invests in
optimizing your code. However, you might still need to tell the compiler what your optimization priorities are. By default, when -O2 or -O3 is specified, the
compiler optimizes primarily for performance. (Under lower optimization levels, the priorities are compilation time and debugging ease.) You can adjust the
priorities between performance and code size by using the code size flag -msn. The -ms0, -ms1, -ms2, and -ms3 options increasingly favor code size over
performance.

It is recommended that a code size flag not be used with the most performance-critical code. Using -ms0 or -ms1 is recommended for all but the most
performance-critical code. Using -ms2 or -ms3 is recommended for seldom-executed code. Either -ms2 or -ms3 should be used also if you need the
minimum code size. In all cases, it is generally recommended that the code size flags be combined with -O2 or -O3. At lower levels of optimization, you
may be trading more performance than necessary. See SPRU187 for more information on optimization levels.

Note:

• If you reduce optimization and/or do not use code size flags, you are disabling code-size optimizations and sacrificing performance.
• If you use -ms with no code size level number specified, the option level defaults to -ms0.

Q: I'm using an older compiler to link in a library built with a newer compiler and I get an error message from
the linker similar to: ">> error: illegal relocation type 050002 found in section .debug_something, file
filename". What's wrong, and what can I do about it?

The object code in the library (or file) includes Dwarf debug information that the older linker cannot read. Dwarf is a new and improved way to encode
debug information. Debug information is read by a debugger like Code Composer Studio (CCS). The format of the debug information changed between the
compilers released with CCS 2.X and CCS 3.X. If you are seeing this message, you are doing the final build with a CCS 2.X compiler while using a library
(or other object module) built with a CCS 3.X compiler.

The rest of this FAQ presumes a library is the source of the problem. If you have this problem with a single object file instead, the problem is conceptually the same. Only the details of how you address it change, and the adjustments are straightforward.

Presuming you cannot update to the newer compiler, you have two options. One is to rebuild the library to not use Dwarf debug information. The other
option is to strip the debug information out of the library.

Rebuild the Library:

These directions presume you are using a CCS 3.X compiler to rebuild the library. Build the library the same as before, but use an additional switch to
change the format of the debug information. The compiler switch --symdebug:none turns off all debug information. The switch --symdebug:coff enables the
old-style Stabs (sometimes called COFF) debug information that the old compiler can read. Note that using --symdebug:coff interferes with optimization.

Strip the Library:

This process removes all the debug information from the library. These directions use executables from the C5500 compiler toolset. If you are using a
different toolset, replace the "55" with the abbreviation for your toolset. For instance, C6000 users should replace ar55 with ar6x.

The name of the example library is: mylib.lib.

1. Copy the library to an empty directory
2. Extract all the library members: ar55 -x mylib.lib
3. Strip the debug information by applying strip55 to each .obj file in the directory: strip55 *.obj
4. Delete the old library: del mylib.lib
5. Rebuild the library: ar55 -r mylib.lib *.obj

Note you have to perform these steps each time you receive an update of the library.

Q: Can I split my .bss section into two different memories while building a library so that I can keep the
non-time critical code on external memory?

You can use linker tricks to segregate critical code from libraries into separate sections, so that you can link them to internal memory instead of external memory.
Swapping code back and forth between the two memories is also doable.

If you intend to use overlays, the linker supports the UNION directive, which allows two different pieces of code to execute from the same
address (at different times) while being loaded to distinct addresses. You can also use the table feature of the linker to generate copy tables. A C program can
then use these tables to get the start addresses and sizes of the different sections and swap them in and out of internal memory. The latest run time support library
even has a copy table routine that supports this.
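
The copy-table idea can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the RTS routine: CopyRecord, CopyTable and copy_in_sketch are hypothetical names, and the record layout shown here is a stand-in for whatever the linker actually generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copy record: where a section is loaded,
 * where it must run, and how many bytes it occupies. */
typedef struct {
    const void *load_addr;
    void       *run_addr;
    size_t      size;
} CopyRecord;

/* Hypothetical copy table: a counted list of records. */
typedef struct {
    size_t            num_recs;
    const CopyRecord *recs;
} CopyTable;

/* Copy every section described by the table from its load
 * address (e.g. external memory) to its run address. */
void copy_in_sketch(const CopyTable *table)
{
    assert(table != NULL);
    for (size_t i = 0; i < table->num_recs; i++) {
        const CopyRecord *r = &table->recs[i];
        memcpy(r->run_addr, r->load_addr, r->size);
    }
}
```

With two overlays that share one run address, the application would call such a routine with the table of the overlay it needs before branching into that code.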

There is a secondary bootloading application note on C6000 devices that has example code and discussion of many of these topics.

Q: What is the order of memory loads / stores when the compiler schedules them in parallel in the
same execute packet?

This information is especially important when CPU writes / reads to external memory must happen in a specific sequence because the memory interfaces to real hardware
(for example, an FPGA or an external controller).

The load and store order rule is as follows:

-- For LD||ST, the LD goes before the ST.
-- For LD||LD or ST||ST, the "T1" goes before the "T2."

T1/T2 is determined by which register file provides the data. (These are sometimes called "DA1" and "DA2".) The register file that
provides the address is ignored, as is the position within the execute packet. This ordering applies to all C6000 family members.

Example 1:

   [ A1] LDB .D2T2 *B4++(2),B5 ; Load Byte 2
|| [ A1] LDB .D1T1 *+A9[A8],A9 ; Load Byte 3

The LDB into A9 goes before the LDB into B5.

Example 2:

   [ A1] STB .D1T2 B5,*A5++(2) ; Store Byte 2
|| [ A1] STB .D2T1 A9,*+B2[B1] ; Store Byte 3

The STB from A9 goes before the STB from B5.

This is in SPRU609 and SPRU610 (the cache documentation for C621x/C671x and C64x).

See http://focus.ti.com/general/docs/techdocsabstract.tsp?abstractName=spru610c

In SPRU610C, it's under "memory system policies," under "memory access ordering."

This information should also be added in the future release of SPRU732 and SPRU871.

In general, it is best to use the volatile keyword for pointers to memory-mapped hardware. The volatile qualifier instructs the compiler not to remove or reorder
the accesses made through such pointers when it compiles C code that performs loads / stores to those memory locations.
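
A minimal sketch of this advice, assuming a hypothetical device with three 32-bit registers. DeviceRegs and device_send are illustrative names, and a real register block would sit at a fixed address from the device's memory map.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block of an external device (e.g. an FPGA).
 * Every member is volatile, so each access below is really performed. */
typedef struct {
    volatile uint32_t ctrl;
    volatile uint32_t data;
    volatile uint32_t status;
} DeviceRegs;

/* Write the data word, then the trigger bit, then read the status.
 * The compiler keeps these three volatile accesses in source order. */
uint32_t device_send(DeviceRegs *regs, uint32_t value)
{
    assert(regs != 0);
    regs->data = value;   /* first access  */
    regs->ctrl = 1u;      /* second access */
    return regs->status;  /* third access  */
}
```

Without the volatile qualifiers, an optimizing compiler would be free to combine, reorder, or drop accesses that look redundant to it.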

Q: Does the TI compiler support variadic macros?

The TI compiler does not support variadic macros; it conforms to the ISO/ANSI C standard (C89), and variadic macros came with C99.

Still, depending on the target compiler and its version, variadic macros should be supported through the tools' GCC language extensions.

The GCC language extensions are enabled with the --gcc compile switch.
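
For reference, a variadic macro looks like the sketch below. FORMAT_MSG and format_status are hypothetical names, and the example assumes that snprintf is available in the C library in use. It shows the C99 form with __VA_ARGS__; whether a given TI compiler version accepts it is not confirmed here.

```c
#include <assert.h>
#include <stdio.h>

/* Variadic macro: everything after 'size' is forwarded to snprintf(),
 * starting with the format string. */
#define FORMAT_MSG(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Format a channel/level pair into buf; returns the snprintf() result. */
int format_status(char *buf, size_t size, int channel, int level)
{
    assert(buf != NULL && size > 0);
    return FORMAT_MSG(buf, size, "ch%d=%d", channel, level);
}
```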

Q: I am porting my C project to C++ and the compiler does not recognize the _nassert keyword

This is expected, since _nassert in C++ is part of the standard namespace. The correct syntax is std::_nassert().

Alternatively, you could write your own wrapper functions or macros so that it is defined in a compatible way in both languages.
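
One possible shape for such a wrapper is sketched below. NASSERT and sum_pairs are hypothetical names. The sketch assumes the TI tools predefine __TI_COMPILER_VERSION__, and on any other compiler it falls back to a regular assert() so that the same source still builds.

```c
#include <assert.h>

/* Hypothetical wrapper: one spelling for C and C++ sources. */
#if defined(__TI_COMPILER_VERSION__)
  #if defined(__cplusplus)
    #define NASSERT(cond) std::_nassert(cond)  /* C++ spelling */
  #else
    #define NASSERT(cond) _nassert(cond)       /* C spelling */
  #endif
#else
  /* Non-TI build: check the condition instead of assuming it. */
  #define NASSERT(cond) assert(cond)
#endif

/* Sum n values; the caller promises that n is even. */
int sum_pairs(const short *x, int n)
{
    int i, sum = 0;
    NASSERT(n % 2 == 0);
    for (i = 0; i < n; i++)
        sum += x[i];
    return sum;
}
```

Code shared between the C and C++ parts of the project would then use NASSERT() everywhere instead of calling _nassert directly.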
