
TMS320C54x Family of Digital Signal Processors
▪ Advanced version of TMS320C5x

▪ Built on a modified Harvard architecture with more internal buses and on-chip peripherals, a larger ALU and a very rich instruction set

▪ Can execute 40 to 120 million instructions per second (MIPS)
▪ Features of TMS320C54x Family of Digital Signal
Processors
➢ 16 bit CPU
➢ 25 to 8.3 ns single cycle instruction execution time
➢ Single cycle 17x17-bit MAC (Multiply and Accumulate) Unit
➢ 8M x 16-bit Virtual program memory address space
➢ 64k x 16-bit physical program memory address space
➢ 64k x 16-bit external data memory address space
➢ 64k x 16-bit external IO address space
➢ 5k to 32k x 16-bit single access On chip RAM
➢ 2k to 48k x 16-bit On chip program / data ROM
➢ Synchronous, TDM and buffered serial ports
➢ Programmable timer and PLL
➢ IEEE standard JTAG ports
➢ 5 V / 3.3 V operation with low power dissipation and power-down modes
➢ DMA interfaces
➢ 100/128/144 pins in plastic TQFP and BGA package
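The cycle-time and MIPS figures in the feature list are two views of the same number, since each instruction takes one cycle. A minimal sketch of the conversion (the function name is illustrative, not from the source):

```python
def mips_from_cycle_ns(cycle_ns: float) -> float:
    """Instruction rate in MIPS for a single-cycle instruction time in ns."""
    # 1 second = 1e9 ns, and 1 MIPS = 1e6 instructions per second,
    # so MIPS = 1e9 / cycle_ns / 1e6 = 1000 / cycle_ns.
    return 1000.0 / cycle_ns


slow = mips_from_cycle_ns(25.0)   # slowest family member
fast = mips_from_cycle_ns(8.3)    # fastest family member
```

A 25 ns cycle gives 40 MIPS and an 8.3 ns cycle gives roughly 120 MIPS, matching the range quoted above.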
Comparison of TMS320C5x and TMS320C54x
Architecture of TMS320C54x
▪ TMS320C54x has an advanced modified version of Harvard architecture
▪ Four separate bus pairs: three for data memory and one for program memory
➢ PB, PAB: program memory bus, used to read opcodes and immediate operands
➢ CB, CAB and DB, DAB: two independent data memory buses, used to read two data operands simultaneously from memory
➢ EB, EAB: data memory bus, used to write data to data memory
▪ The architecture of the TMS320C54x has three major areas: CPU, memory and peripherals
▪ Functional units of the CPU: arithmetic logic unit (ALU), two 40-bit accumulators (ACCA, ACCB), barrel shifter, 17x17-bit multiplier, 40-bit adder, CSSU (Compare, Select and Store Unit), exponent encoder, status registers, data address generation unit, program address generation unit, system control interface
▪ On-chip memory (internal): 16-bit program/data ROM (2k to 48k words), 16-bit data/program RAM (5k to 32k words), DMA controller and external memory interface
▪ On-chip peripherals (internal): clock generator, hardware timer, software-programmable wait-state generators, general-purpose I/O pins, programmable bank-switching logic, Host Port Interface (HPI), serial port, Buffered Serial Port (BSP), Multichannel Buffered Serial Port (McBSP)
▪ The TMS320C54x has a total memory address space of 192k words (including on-chip memory), with a memory word size (addressability) of 16 bits.
▪ The address space is divided into 3 selectable address spaces:
➢ 64k program memory address space
➢ 64k data memory address space
➢ 64k I/O port address space


Simplified Architecture of TMS320C54x
Functional Units of CPU of TMS320C54x
▪ Arithmetic logic unit (ALU)
➢ 40-bit ALU
➢ Performs arithmetic and logic operations in a single cycle
➢ The result is stored in an accumulator or in memory
➢ For operations on two operands, one operand comes from the barrel shifter or memory and the other from an accumulator, memory or the T register
➢ The barrel shifter and the accumulators supply 40-bit data to the ALU
➢ Two 16-bit data words are loaded into bits 0 to 15 and bits 16 to 31, with bits 32 to 39 filled with zeros or sign-extended
➢ Can function as two 16-bit ALUs and perform two 16-bit operations simultaneously when the C16 bit in status register 1 (ST1) is set
▪ ACCUMULATOR
➢ The CPU has two 40-bit accumulators, A and B
➢ They act as source or destination for the ALU and the multiplier/adder
➢ Either accumulator can be used as temporary storage for the other
➢ Each accumulator is divided into 3 parts:
• Guard bits (bits 32 to 39): used as headroom for computation, preventing overflow in iterative computations such as convolution and correlation
• A high-order word (bits 16 to 31)
• A low-order word (bits 0 to 15)
➢ The instruction set of the TMS320C54x has instructions for storing the guard bits, high-order word and low-order word in data memory, and for moving 32-bit accumulator words into or out of data memory
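The three-part accumulator layout can be illustrated with a small model. This is a sketch of the bit fields only (the helper name and the example values are assumptions, not taken from the source); it shows how a sum that no longer fits in 32 bits spills into the guard bits instead of overflowing.

```python
MASK40 = (1 << 40) - 1  # a 40-bit accumulator holds bits 0..39


def split_accumulator(value: int):
    """Return (guard bits 39-32, high word 31-16, low word 15-0)."""
    raw = value & MASK40
    guard = (raw >> 32) & 0xFF
    high = (raw >> 16) & 0xFFFF
    low = raw & 0xFFFF
    return guard, high, low


# Accumulate the largest positive 32-bit value four times.
acc = 0
for _ in range(4):
    acc = (acc + 0x7FFFFFFF) & MASK40

guard, high, low = split_accumulator(acc)  # (0x01, 0xFFFF, 0xFFFC)
```

The running total 0x1FFFFFFFC needs 33 bits; the extra bit lands in the guard field, which is the headroom the guard bits provide for iterative sums.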
▪ Barrel shifter
➢ The CPU has a 40-bit barrel shifter
➢ It can perform a left shift of 0 to 31 bits or a right shift of 0 to 16 bits and, together with the exponent encoder, can normalize the accumulator content
➢ The shift count is specified in the shift count field of the instruction, in the accumulator shift mode (ASM) field of status register 1, or in the T register
➢ The shift and normalize operations of the barrel shifter can be used for:
• Prescaling a memory or accumulator operand before an ALU operation
• Logical or arithmetic shifting of the accumulator value
• Postscaling the accumulator before storing it in memory
• Normalizing the accumulator
➢ The 40-bit shifter can handle 16-, 32- and 40-bit operands, which come from the data buses (DB and CB) or from the accumulators. The output of the shifter can be loaded into the ALU or placed on the EB bus
▪ Multiplier/Adder
➢ Consists of a 17 x 17-bit multiplier, a 40-bit adder, signed/unsigned input control logic, fractional control logic, a zero detector, a rounder, overflow/saturation logic and the T register
➢ One input of the multiplier can be supplied from the T register, data memory or an accumulator, and the other input from data memory, program memory or an accumulator
➢ The multiplier/adder unit can perform a 17 x 17-bit two's-complement multiplication and a 40-bit addition in parallel in a single instruction cycle
➢ In addition, the multiplier and the ALU together can perform a MAC operation and an ALU operation in parallel in a single instruction cycle
➢ These parallel operations allow efficient implementation of DSP computations such as convolution, correlation and filtering
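The multiply-and-accumulate pattern behind convolution and filtering can be sketched as a loop in which each iteration stands for one single-cycle MAC. The function names and the 16-bit operand convention below are illustrative assumptions; this models the arithmetic, not the pipeline.

```python
def to_signed16(value: int) -> int:
    """Interpret the low 16 bits as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def to_signed40(value: int) -> int:
    """Wrap a result to the 40-bit two's-complement accumulator range."""
    value &= (1 << 40) - 1
    return value - (1 << 40) if value & (1 << 39) else value


def fir_output(coeffs, samples) -> int:
    """One FIR output sample: sum of coeff * sample products in a 40-bit accumulator."""
    acc = 0
    for h, x in zip(coeffs, samples):
        # One MAC step: 16 x 16 signed multiply, then 40-bit accumulate.
        acc = to_signed40(acc + to_signed16(h) * to_signed16(x))
    return acc


y = fir_output([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6 = 32
```

With full-scale inputs, for example four products of 0x7FFF by 0x7FFF, the sum exceeds the 32-bit range and is still held exactly because of the accumulator's guard bits.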
▪ Compare , select and Store unit (CSSU)
➢ The CSSU is an application-specific hardware unit dedicated to add/compare/select operations, in order to support the various Viterbi algorithms used in equalizers and decoders.
➢ The inputs to the CSSU for comparison come from an accumulator, and the output is stored in data memory.
➢ The result of the comparison is also stored in the LSB of the TRN register and in the TC bit of status register 0.
➢ The instruction "CMPS src, Smem" uses the CSSU to compare the low and high words of the specified source accumulator, select the larger of the two words and store it in the specified data memory location. If the high word is greater, 0 is stored in the LSB of TRN and in TC; if the low word is greater, 1 is stored in the LSB of TRN and in TC.
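The compare-select step described for CMPS can be modelled in a few lines. This is a behavioural sketch only: the signed 16-bit comparison and the left shift of TRN before the new bit is written are assumptions not stated in the text above, and the function name is hypothetical.

```python
def to_signed16(value: int) -> int:
    """Interpret the low 16 bits as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def cmps(acc: int, trn: int):
    """Model of CMPS: returns (word to store, new TRN, TC bit)."""
    high = to_signed16(acc >> 16)
    low = to_signed16(acc)
    if high > low:
        stored, bit = high, 0   # high word wins: 0 into TRN LSB and TC
    else:
        stored, bit = low, 1    # low word wins: 1 into TRN LSB and TC
    # Assumption: TRN is shifted left by one so it records a decision history.
    new_trn = ((trn << 1) | bit) & 0xFFFF
    return stored, new_trn, bit


stored, trn, tc = cmps(0x00050003, trn=0)   # high = 5, low = 3
stored2, trn2, tc2 = cmps(0x00020007, trn)  # high = 2, low = 7
```

In a Viterbi decoder the two halves would hold competing path metrics, and the bits collected in TRN would record which path survived at each step.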
▪ Exponent Encoder (EC)
➢ Implementing floating-point arithmetic on a fixed-point processor such as the TMS320C54x requires the exponent and the mantissa of the floating-point number to be handled separately
➢ The exponent encoder is an application-specific hardware unit dedicated to extracting the exponent value from the number held in an accumulator and storing it in the T register
➢ "EXP src": extracts the exponent and saves it in the T register
➢ "NORM src, dst": normalizes the accumulator, using the exponent in the T register as the shift count
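The EXP/NORM pair can be sketched as follows. The exponent definition used here (number of redundant leading sign bits of the 40-bit value minus 8, with 0 returned for a zero accumulator) is an assumption based on the commonly documented behaviour, and the function names are illustrative.

```python
MASK40 = (1 << 40) - 1


def to_signed40(raw: int) -> int:
    """Interpret the low 40 bits as a two's-complement number."""
    raw &= MASK40
    return raw - (1 << 40) if raw & (1 << 39) else raw


def exp_value(acc: int) -> int:
    """Model of EXP: shift count that would normalize the accumulator."""
    v = to_signed40(acc)
    if v == 0:
        return 0
    magnitude = v if v >= 0 else ~v
    leading = 39 - magnitude.bit_length()  # redundant sign bits
    return leading - 8                     # assumed range: -8 to 31


def norm(acc: int, t: int) -> int:
    """Model of NORM: shift left for positive t, arithmetic right for negative t."""
    v = to_signed40(acc)
    shifted = v << t if t >= 0 else v >> -t
    return shifted & MASK40


t = exp_value(0x1234)          # 18
mantissa = norm(0x1234, t)     # 0x48D00000, bit 30 set and bit 31 clear
```

After normalization the most significant data bit sits just below the sign bit of the 32-bit word, so the pair (mantissa, t) forms a simple floating-point representation.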
▪ Data address generation unit
➢ Consists of two auxiliary register arithmetic units (ARAU0 and ARAU1)
➢ Eight auxiliary registers (AR0 to AR7)
➢ A 16-bit circular buffer size register (BK)
➢ A 16-bit stack pointer (SP)
➢ The ARs hold the data memory address in indirect addressing mode
➢ The 3-bit ARP field of ST0 indicates the AR currently used for indirect addressing
➢ AR0 is used as an index register for modifying the content of the other auxiliary registers
➢ The ARAUs perform the arithmetic operations related to address generation in indirect addressing mode, such as increment, decrement, indexing, bit-reversed address generation and circular address generation
➢ The two independent ARAUs can operate on two ARs at the same time to generate two data memory addresses simultaneously
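Bit-reversed address generation, which the ARAUs provide for FFT data reordering, amounts to adding the index with the carry propagating from left to right. The sketch below models this over a limited bit width; the function names, the width parameter and the choice of AR0 as half the FFT size are assumptions for illustration.

```python
def reverse_bits(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(addr: int, index: int, width: int) -> int:
    """Add index to addr with reversed carry propagation over `width` bits."""
    mask = (1 << width) - 1
    total = (reverse_bits(addr & mask, width) + reverse_bits(index & mask, width)) & mask
    return (addr & ~mask) | reverse_bits(total, width)


# 8-point FFT: index register holds half the FFT size (4), buffer starts at offset 0.
addr, order = 0, []
for _ in range(8):
    order.append(addr)
    addr = reverse_carry_add(addr, 4, width=3)
# order == [0, 4, 2, 6, 1, 5, 3, 7]
```

The generated sequence is the bit-reversed ordering an in-place radix-2 FFT needs, obtained without any explicit reordering loop in software.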
➢ The 9-bit DP (data-page pointer) field of status register 0 is used as the upper 9 bits of the data memory address (the page address) in direct addressing.
➢ The circular buffer size register is loaded with the circular buffer size, which is used together with the AR specified in the instruction to generate the start and end addresses of the circular buffer.
➢ The stack pointer is used to implement the LIFO stack for memory operands that use stack addressing.
➢ The stack pointer always holds the address of the top of the stack.
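How the buffer size in BK and an auxiliary register together define a circular buffer can be sketched as below. The alignment rule used here (the buffer starts where the low N bits of the address are zero, with N the smallest value such that 2^N exceeds the buffer size) is an assumption based on the usual description of this scheme; the function name is hypothetical.

```python
def circular_step(ar: int, step: int, bk: int) -> int:
    """Advance ar by step inside a circular buffer of bk words (|step| <= bk)."""
    n = bk.bit_length()              # smallest N with 2**N > bk
    mask = (1 << n) - 1
    base = ar & ~mask & 0xFFFF       # start address: low N bits cleared
    index = ar & mask                # current position inside the buffer
    index = (index + step) % bk      # wrap at the end (or start) of the buffer
    return base | index


# A 5-word buffer starting at 0x0100, walked with a post-increment of 1.
ar, visited = 0x0100, []
for _ in range(6):
    visited.append(ar)
    ar = circular_step(ar, 1, bk=5)
# visited == [0x100, 0x101, 0x102, 0x103, 0x104, 0x100]
```

The start address is the base and the end address is base + BK - 1; the register wraps back to the start without any compare-and-branch in the program.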
▪ Program address generation unit
➢ The program address generation unit consists of five registers: Program Counter (PC), Repeat Counter (RC), Block-Repeat Counter (BRC), Block-Repeat Start Address register (RSA) and Block-Repeat End Address register (REA)
➢ Some versions of the TMS320C54x processors have an additional register, the program counter extension register (XPC), to support addressing of virtual memory.
➢ The program counter (PC) is a 16-bit register that holds the address of the program code. An instruction is fetched from program memory by placing the content of the PC (the address) on the program address bus (PAB) and then reading the code from the program bus (PB). When the memory is read, the PC is incremented for the next fetch, so that once an instruction word has been read, the PC holds the address of the next word of the same instruction or of the next instruction.
➢ The XPC is a 7-bit register that selects the extended page of program memory in processors that support virtual addressing.
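The 7-bit XPC and the 16-bit PC together explain the 8M-word virtual program space quoted in the feature list. A minimal sketch, assuming the XPC simply supplies the address bits above the PC (the function name is illustrative):

```python
def extended_program_address(xpc: int, pc: int) -> int:
    """Form a 23-bit program address from a 7-bit page and a 16-bit offset."""
    return ((xpc & 0x7F) << 16) | (pc & 0xFFFF)


top = extended_program_address(0x7F, 0xFFFF)  # highest address, 0x7FFFFF
space_words = 1 << (7 + 16)                   # 128 pages of 64k words
```

128 pages of 64k words give 8M words, so the highest address is 0x7FFFFF.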
➢ When a single instruction has to be repeated, the RC holds the repeat count. When a block of instructions has to be repeated, the BRC holds the count, and the registers RSA and REA hold the start and end addresses of the block, respectively.
▪ Status Register
➢ Two 16-bit status registers, ST0 and ST1, hold the status of the ALU, pointers for addressing, various bits for interrupt control, hold mode and arithmetic mode, and the accumulator shift value
➢ The status registers can be stored into data memory and loaded from data memory
➢ Individual bits of ST0 and ST1 can be set or cleared using the SSBX and RSBX instructions
▪ CPU Memory Mapped Register

➢ The TMS320C54x has 32 16-bit memory-mapped registers, mapped into page 0 of the data memory space
➢ They include registers for data and program memory address generation, various status and control registers for the CPU, and the accumulators
On-Chip Memory in TMS320C54x

▪ Mask Programmable ROM

▪ Single access RAM

▪ Dual access RAM


▪ On-chip ROM
➢ The internal mask-programmable ROM is 2k to 48k words in size.
➢ It is mapped to program memory space, and in some processors a part of the ROM can be mapped to data memory space.
➢ The processor has an option for including or excluding the on-chip ROM addresses in the program memory address space.
➢ The purpose of the ROM is to permanently store the program code and data for a specific application during manufacturing of the chip itself.
➢ It has an option of boot-loading the content of the on-chip ROM to internal or external RAM during power-on reset.
➢ The content of the on-chip ROM is protected so that no external device can access the program code. This feature provides security for proprietary algorithms.
▪ On-chip DARAM

➢ The TMS320C54x family of processors has 5k to 10k words of on-chip DARAM, organized into blocks as follows:
• TMS320C541: 5k words organized as 5 blocks of 1k words each
• TMS320C542/543: 10k words organized as 5 blocks of 2k words each
• TMS320C545/546: 6k words organized as 3 blocks of 2k words each
• TMS320C548/549: 8k words organized as 4 blocks of 2k words each
➢ The DARAM blocks can be accessed twice per machine cycle.
➢ Upon reset, the DARAM is mapped to data memory address space; after reset, the processor can map the DARAM into program memory space.
▪ On-chip SARAM

➢ The TMS320C548/549 has 24k words of on-chip SARAM, organized as three blocks of 8k words.
➢ Upon reset, the SARAM is mapped to data memory space; after reset, the processor can map the SARAM into program memory space.
On-chip peripherals of TMS320C54x processors
➢ The on-chip peripherals available in the TMS320C54x family of processors are:
• Software-programmable wait-state generator
• Programmable bank switching
• Parallel I/O ports
• DMA controller
• Host port interface (HPI)
• Serial ports (standard, TDM, BSP and McBSP)
• General-purpose I/O pins
• Timer
• Clock generator and phase-locked loop (PLL)
▪ Software-programmable wait-state generator
➢ It inserts wait states in external bus cycles for interfacing with slow external memory and I/O devices.
➢ It can extend the external bus cycles by up to seven machine cycles.
➢ When all external accesses are configured for zero wait states, the internal clock to the wait-state generator is shut off to reduce power consumption.
▪ Programmable bank switching

➢ It automatically inserts one cycle when a memory access switches from data memory space to program memory space or vice versa.
➢ This extra cycle lets one memory release the bus before the other memory starts driving it, thereby avoiding bus contention.
▪ Parallel IO ports

➢ The processor has a 64k I/O address space, which can be used as 64k I/O ports.
➢ The I/O ports are addressed by the PORTR and PORTW instructions for data transfer between ports and data memory.
➢ The processor is easily interfaced to external I/O devices through the I/O ports with minimal external address-decoding circuitry.
▪ DMA (direct memory access) controller
➢ It transfers data between the various internal and external memory spaces without the intervention of the CPU.
➢ It has six independent programmable channels, allowing six different contexts for DMA operation.
➢ It has higher priority than the CPU for both internal and external accesses.
➢ It performs single-word or double-word transfers.
➢ A DMA transfer between external and internal memory, in either direction, requires 5 cycles.
▪ Host port interface (HPI)
➢ It is an 8-bit parallel port that provides an interface to a host processor for information exchange between the digital signal processor and the host processor.
➢ The information exchange takes place via on-chip memory that is accessible to both the DSP and the host.
➢ The internal DARAM mapped at data memory addresses 1000h to 17FFh serves as the HPI memory.
▪ Serial ports

There are four types

➢ Synchronous serial port

➢ Time division multiplexed serial port

➢ Buffered serial port

➢ Multichannel buffered serial port(McBSP)


▪ Synchronous serial port

➢ It is a high-speed, full-duplex serial port that provides direct communication with serial devices such as codecs and serial ADCs.
➢ It can operate at up to one-fourth of the machine cycle rate.
➢ The transmitter and receiver are double-buffered, and data is framed either as bytes or as words.
▪ Time division multiplexed serial port

➢ This port is used for serial communication with multiple devices that have TDM ports.
➢ TDM is the process of dividing a time interval into a number of subintervals, with each subinterval representing a communication channel.
➢ The processor can communicate with up to seven devices or processors with TDM serial ports via a pair of data lines and a pair of address lines.
➢ Like the synchronous serial port, the TDM port is double-buffered for both transmit and receive data.
▪ Buffered serial port

➢ It consists of a full-duplex, double-buffered serial port interface and an auto-buffering unit.
➢ The internal memory is connected to the auto-buffering unit by a dedicated bus, so that the buffered serial port can read and write the processor's internal memory directly, without the intervention of the CPU.
➢ This results in minimal overhead for serial port transactions and in faster data rates.
▪ Multichannel buffered serial port (McBSP)

➢ It is an enhanced buffered serial port that supports multichannel transmit and receive of up to 128 channels.
➢ Its features include a wide range of data sizes from 8 bits to 32 bits, μ-law and A-law companding, and programmable internal clock and frame synchronization.
▪ General purpose IO pins

➢ There are two general-purpose I/O pins:
• the branch control input pin, BIO, and
• the external flag output pin, XF.
➢ BIO is used to monitor the status of peripheral devices. A branch instruction can be executed conditionally depending on the state of the BIO input. It is an alternative to an interrupt when the interrupts are dedicated to time-critical applications.
➢ XF can be used to signal external devices and is controlled by software. At reset the XF pin is set high. The SSBX instruction is used to set the XF pin and the RSBX instruction is used to reset it.
▪ Timer

➢ The on-chip timer in TMS320C54x processors is a 16-bit timer with a 4-bit prescaler.
➢ The timer can be used to initiate any time-based event through an interrupt.
➢ The timer has a count register, which is loaded with a count value. At every (prescaled) clock cycle the timer count is decremented by 1, and at the end of the count an interrupt is generated.
➢ The timer has a control register to control operations such as start, stop, restart and disable.
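The interval between timer interrupts follows from the count value and the prescaler. The sketch below assumes the usual register model, which the text above does not spell out: a 16-bit period value `prd` reloaded into the count register and a 4-bit prescaler reload value `tddr`, each contributing its value plus one. Both parameter names and the function name are assumptions for illustration.

```python
def timer_interrupt_period_ns(clock_period_ns: int, prd: int, tddr: int) -> int:
    """Interval between timer interrupts in nanoseconds (assumed model)."""
    if not 0 <= prd <= 0xFFFF:
        raise ValueError("prd must fit in 16 bits")
    if not 0 <= tddr <= 0xF:
        raise ValueError("tddr must fit in 4 bits")
    # The prescaler divides the clock by (tddr + 1); the main counter then
    # counts (prd + 1) prescaled ticks before the interrupt fires.
    return clock_period_ns * (tddr + 1) * (prd + 1)


longest = timer_interrupt_period_ns(25, prd=0xFFFF, tddr=0xF)  # 26_214_400 ns
one_ms = timer_interrupt_period_ns(25, prd=39_999, tddr=0)     # 1_000_000 ns
```

With a 25 ns clock, the longest interval under this model is about 26.2 ms, and a 1 ms tick needs a period value of 39 999 with the prescaler unused.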


▪ Clock generator and PLL

Two methods of clock generation:
➢ The internal oscillator, connected to an external crystal, generates a clock at the crystal frequency, which is then divided by 1, 2 or 4 to produce the CPU clock.
➢ A low-frequency external clock is supplied to an internal PLL circuit, which generates the CPU clock at a multiple of the external clock frequency. This method reduces system power consumption and clock-generated EMI, and allows the use of a low-cost crystal.
Addressing Modes of TMS320C54x Processors
▪ The addressing mode is the method of specifying the operand, that is, the data to be operated on by the instruction.
The TMS320C54x processors support the following seven addressing modes.
➢ Immediate addressing
➢ Absolute addressing
➢ Accumulator addressing
➢ Direct addressing
➢ Indirect addressing
➢ Memory-mapped register addressing
➢ Stack addressing
▪ Immediate Addressing
➢ In immediate addressing, the data is specified as a part of the instruction.
➢ The instruction carries a 3-bit, 5-bit, 8-bit, 9-bit or 16-bit constant, which is the data to be operated on by the instruction.
➢ The immediate constant is specified with the # symbol.
➢ The syntax used for immediate addressing is #k3, #k5, #k9, #k and #lk
➢ Example
• LD #1Ch, ASM : Load the immediate 5-bit constant (1Ch) into the ASM field of status register 1
• LD #12Ah, DP : Load the immediate 9-bit constant (12Ah) into the DP field of status register 0
• LD #37A5h, 16, A : Shift the long immediate (16-bit) constant left by 16 bits and load it into accumulator A
▪ Absolute Addressing
➢ In absolute addressing, the 16-bit address of the operand is specified directly as a part of the instruction.
➢ This addressing mode can be used to address an operand in all three address spaces of the processor
➢ The syntax used for absolute addressing is pmad, dmad and PA
➢ Example
▪ Direct Addressing
➢ In the direct addressing mode, the lower 7 bits of the data memory address are specified in the instruction itself. The 16-bit data memory address is formed using either the 9-bit DP (data-page pointer) field of status register 0 or the 16-bit SP (stack pointer)
➢ When DP is used, the 9 bits of DP form the upper 9 bits of the 16-bit address and the lower 7 bits are the address specified directly by the instruction
➢ When SP is used, the 16-bit content of SP is added to the 7 bits specified in the instruction to form the 16-bit address
➢ In instruction listings, the syntax used to represent direct addressing is Smem. In assembly language programs, the 7-bit address is written as a 7-bit constant without the # symbol.
➢ Example:
• ADD 6Ch, A : Add the content of the memory location directly addressed by the instruction (address = 6Ch) to accumulator A
• SUB 57h, B : Subtract the content of the memory location directly addressed by the instruction (address = 57h) from accumulator B
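The two ways of forming a direct address can be written out as a short sketch. The function names and the sample DP and SP values are illustrative assumptions; only the bit arithmetic comes from the description above.

```python
def direct_address_dp(dp: int, offset7: int) -> int:
    """DP-relative: 9-bit page number above the 7-bit offset."""
    return ((dp & 0x1FF) << 7) | (offset7 & 0x7F)


def direct_address_sp(sp: int, offset7: int) -> int:
    """SP-relative: 7-bit offset added to the 16-bit stack pointer."""
    return (sp + (offset7 & 0x7F)) & 0xFFFF


# ADD 6Ch, A with DP = 12Ah reads data memory address 956Ch.
addr_dp = direct_address_dp(0x12A, 0x6C)
# SUB 57h, B with SP = 2000h (SP-relative mode) reads address 2057h.
addr_sp = direct_address_sp(0x2000, 0x57)
```

Each data page therefore covers 128 words, and the same 7-bit operand in an instruction reaches a different location whenever DP or SP changes.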
▪ Indirect Addressing
➢ In the indirect addressing mode, the data memory address is specified by the content of one of the eight auxiliary registers, AR0 to AR7.
➢ The AR (auxiliary register) currently used for accessing the data is indicated by the 3-bit ARP (auxiliary register pointer) field of status register 0
➢ In this addressing mode, the content of the AR can be updated automatically either after or before the operand is fetched.
➢ In instruction listings, the syntax used to represent indirect addressing is Smem/Xmem/Ymem. In assembly language programs, the operand syntax determines how the AR is modified.
Indirect addressing syntax and the resulting modification of AR

SYNTAX        MODIFICATION OF AR
*ARx          AR unaltered
*ARx-         AR decremented by 1 after data access
*ARx+         AR incremented by 1 after data access
*+ARx         AR incremented by 1 before data access
*ARx-0        AR decremented by the content of the index register (AR0)
*ARx+0        AR incremented by the content of the index register (AR0)
*ARx-0B       AR decremented for bit-reversed addressing using the index register (AR0)
*ARx+0B       AR incremented for bit-reversed addressing using the index register (AR0)
*ARx-%        AR decremented for circular addressing
*ARx+%        AR incremented for circular addressing
*ARx-0%       AR decremented for circular addressing using the index register (AR0)
*ARx(lk)      ARx = base, lk = offset, data address = base + offset; ARx is not altered
*+ARx(lk)     Same as above, but ARx is modified by the long immediate
*+ARx(lk)%    Same as above, but the address is modified for circular addressing
*(lk)         Absolute addressing


➢ Example:
• LD *AR3, A : Load the content of the memory location addressed by AR3 into accumulator A
• LD *AR3-, A : Same as above, but decrement AR3 after loading
• LD *AR3+, A : Same as above, but increment AR3 after loading
• LD *AR3-0, A : Same as above, but decrement AR3 by the content of AR0 after loading
• LD *AR3+0, A : Same as above, but increment AR3 by the content of AR0 after loading
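The post-modify behaviour in these examples can be traced with a small model. The memory contents, register values and function name below are made up for illustration; only the five modifiers used in the examples are covered.

```python
def load_indirect(memory: dict, ar: list, x: int, modifier: str = "") -> int:
    """Read memory at ARx, then post-modify ARx according to the modifier."""
    address = ar[x]
    value = memory.get(address, 0)
    # Post-modification amounts for *ARx, *ARx-, *ARx+, *ARx-0 and *ARx+0.
    step = {"": 0, "-": -1, "+": 1, "-0": -ar[0], "+0": ar[0]}[modifier]
    ar[x] = (address + step) & 0xFFFF
    return value


memory = {0x0200: 11, 0x0201: 22, 0x0203: 44}
ar = [2, 0, 0, 0x0200, 0, 0, 0, 0]   # AR0 = 2 (index), AR3 = 0200h

a1 = load_indirect(memory, ar, 3, "+")    # reads 0200h, AR3 becomes 0201h
a2 = load_indirect(memory, ar, 3, "+0")   # reads 0201h, AR3 becomes 0203h
a3 = load_indirect(memory, ar, 3, "-")    # reads 0203h, AR3 becomes 0202h
```

In every case the data access uses the old value of AR3; the update only affects the next access.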

▪ Memory-Mapped Register Addressing
➢ In memory-mapped register addressing, the address of the memory-mapped register (MMR) is specified as a direct or indirect address in the instruction
➢ The memory-mapped registers are mapped to page 0 of the data memory address space and so can be accessed using only a 7-bit address. In direct addressing, the 7 bits are specified directly in the instruction as a 7-bit constant without the # symbol
➢ In indirect addressing, the lower 7 bits of the auxiliary register are the address of the memory-mapped register
➢ In this addressing mode, the memory-mapped registers are accessed without affecting the content of DP (data-page pointer) or SP (stack pointer)
➢ Example
• LDM 06h, A : Load the content of the MMR directly addressed by the instruction (address = 06h) into accumulator A
• STLM A, 1Eh : Store the content of accumulator A in the MMR directly addressed by the instruction (address = 1Eh)

▪ Stack Addressing
➢ In stack addressing mode, the data memory address is the content of the stack pointer (SP)
➢ The push and pop instructions access the stack memory using the stack addressing mode.
➢ The call, interrupt and return instructions also use the stack pointer address for automatic storage and retrieval of information to and from the stack
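The LIFO behaviour of stack addressing can be sketched with a small model. It assumes the stack grows toward lower addresses (SP is decremented before a push and incremented after a pop), so that SP always points at the top of the stack; the class name and the sample values are hypothetical.

```python
class StackModel:
    """Toy model of a stack in data memory addressed through SP."""

    def __init__(self, sp: int):
        self.sp = sp & 0xFFFF
        self.memory = {}

    def push(self, word: int) -> None:
        # Assumption: pre-decrement, then store at the new top of stack.
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = word & 0xFFFF

    def pop(self) -> int:
        # Read the top of stack, then post-increment SP.
        word = self.memory[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return word


stack = StackModel(sp=0x0100)
stack.push(0xAAAA)        # SP = 00FFh
stack.push(0xBBBB)        # SP = 00FEh
first = stack.pop()       # 0xBBBB, last in is first out
second = stack.pop()      # 0xAAAA, SP back at 0100h
```

A call or interrupt would push the return address in the same way, and the matching return would pop it.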
