Design and Implementation of 8-bit RISC MCU

Yanfen Chen*, Wuchen Wu, Ligang Hou, Jie Hu
VLSI & Integrated System Lab, Beijing University of Technology, Beijing, China
Abstract: An 8-bit Reduced Instruction Set Computer (RISC) Micro Controller Unit (MCU) is implemented in this paper, including the design of the pipeline and the critical modules. The whole design uses a two-stage pipeline, which enables the instruction-fetching modules and the instruction-executing modules to work simultaneously. Its instruction set is compatible with the PIC16F87XA instruction set, and every instruction executes in a single cycle except for the program transfer instructions. The design is described in Verilog HDL, simulated with Modelsim and verified on an FPGA. The whole system works normally and reaches a frequency of 40 MHz.



Table 2 The space of data address
0x00-0x1F    SFR (Special Function Registers)
0xA0-0xFF    On-chip RAM (used as data memory)

I. INTRODUCTION
With the rapid development of deep submicron manufacturing and design technology, integrated circuits have entered the era of System on Chip (SoC), and Intellectual Property (IP) has become the most important part of SoC technology[1]. The MCU is particularly significant in IP design, because it is the controlling core in almost all applications. It has made great progress in application processing structure and in integration with ASIC, SoC and other semiconductor products[2].

II. THE SYSTEM STRUCTURE OF MCU
The RISC MCU core comprises the following modules: Ifetch, Idecoder, ALU, ALU Mux, PC, REG_CTRL, CLK_div, INT_CTRL and REG_FILE. The peripheral circuits include ROM, RAM, General-Purpose IO ports (GPIO), TIMER0, TIMER1 and a UART (Universal Asynchronous Receiver/Transmitter). One part of the on-chip RAM is used as data memory, while the on-chip ROM and the other part of the RAM are both used as program memory. Table 1 and Table 2 show the program address space and the data address space respectively, and Figure 1 shows the system structure of the MCU.

Figure 1 The system structure of MCU


On-chip RAM used as program memory

Ifetch fetches instructions from memory and adjusts the program counter. Idecoder is one of the critical modules of the MCU. It generates a series of micro-control signals by decoding instructions; these signals then control the related modules of the MCU to execute calculation and control operations, such as fetching instructions, selecting operands and so on[3]. The Arithmetic Logical Unit (ALU) is the core of the RISC MCU. It operates on the operands sent over the data bus according to the control signals exported by Idecoder, and it performs addition, subtraction, logical operations and shifting[4]. Figure 2 shows the operations of the ALU and the W register, which is an 8-bit register used in ALU calculations.
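The path from Idecoder to the ALU can be illustrated with a minimal Python sketch. It is not the Verilog of this design: the function names `decode_literal` and `alu` are hypothetical, only the immediate-data instructions are modelled, and the opcode bit patterns are those listed in the PIC16F87XA data sheet, on the assumption that a compatible decoder uses the same encodings.

```python
def decode_literal(word):
    """Split a 14-bit immediate-data instruction word into (mnemonic, k)."""
    if not 0 <= word < (1 << 14):
        raise ValueError("instruction words are 14 bits wide")
    op6 = word >> 8      # bits 13..8: opcode field
    k = word & 0xFF      # bits 7..0: 8-bit literal operand
    # Encodings assumed from the PIC16F87XA data sheet; 'x' bits are don't-care.
    if op6 >> 2 == 0b1100:       # 11 00xx
        return "MOVLW", k
    if op6 >> 1 == 0b11111:      # 11 111x
        return "ADDLW", k
    if op6 >> 1 == 0b11110:      # 11 110x
        return "SUBLW", k
    table = {0b111000: "IORLW", 0b111001: "ANDLW", 0b111010: "XORLW"}
    if op6 in table:
        return table[op6], k
    raise ValueError("not one of the immediate-data instructions modelled here")


def alu(mnemonic, w, k):
    """Return (new_w, carry, zero) for one operation on the 8-bit W register.

    A flag is None when the operation leaves it unchanged.
    """
    carry = None
    if mnemonic == "MOVLW":
        return k, None, None                 # plain load, no flags affected
    if mnemonic == "ADDLW":
        total = w + k
        result, carry = total & 0xFF, int(total > 0xFF)
    elif mnemonic == "SUBLW":
        diff = k - w                         # SUBLW computes k - W
        result, carry = diff & 0xFF, int(diff >= 0)  # PIC convention: 1 = no borrow
    elif mnemonic == "IORLW":
        result = w | k
    elif mnemonic == "ANDLW":
        result = w & k
    elif mnemonic == "XORLW":
        result = w ^ k
    else:
        raise ValueError("unsupported operation: " + mnemonic)
    return result, carry, int(result == 0)
```

For example, `alu(*[decode_literal(0x3E05)[0], 0xFE, 5])` adds the literal 5 to a W value of 0xFE, wraps to 0x03 and sets the carry flag. Byte, bit and control instructions would need the remaining field formats of Figure 4 and are left out of this sketch.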

Figure 2 The operations of ALU and W register

The Program Counter (PC) is usually called the instruction counter. It provides the address of the next instruction to ensure the continuous execution of the program. Before the program starts, the starting address of the PC, that is, the address of the first instruction, must be sent to it. At that time, the content of PC is the address of the instruction which is extracted from memory. The CPU changes the content of PC when an instruction is executed, so that the content of PC is always the address of the next instruction waiting to be extracted. Because instructions are usually executed sequentially, the modification simply adds 1 to the content of PC. However, this is not the case when the CALL, GOTO and RETLW instructions are executed.

REG_CTRL is responsible for managing the register file and the internal data memory, including special function registers (SFR) and general purpose registers. The special function registers are registers used by the CPU and peripheral functions to control the operation of the device. The register file is the core of the RISC MCU, and it is the center in which the MCU deals with internal data[5]. It offers all data (mainly the source operands required by the executing unit) except the immediate data required by internal computation, and it is responsible for storing all computed results. All internal and external data have a direct reference to it, and its performance is closely related to the performance of the whole MCU. All operations of the register file are synchronous with the rising edge of the system clock.

INT_CTRL controls the generation of interrupts and confirms the interrupt vector. It is mainly composed of interrupt flag registers, interrupt mask registers and global interrupt enable/inhibit registers. There are three interrupt sources in the MCU: the timer interrupt, the UART interrupt and the external interrupt. The interrupt flag signals are connected to INT_CTRL. Interrupt priority is defined by software, and the interrupt vector is 0x004. When an interrupt occurs, the content of PC is pushed onto the stack and the program jumps to the interrupt address.

CLK_div divides the frequency of the external clock source or the internal clock to generate a four-phase nonoverlapping clock.

TIMER is used to accurately count trigger signals which are input by port pins and generated by external events, and to complete corresponding operations according to the count results. It is mainly composed of timer and watchdog. TIMER has two modes: timer mode and counter mode. The main difference between the two modes is the data source: the timer takes its data from the internal system clock, and the counter takes its data from the pins. There are two TIMER modules configured in this design: TIMER0 and TIMER1. They share the same prescaler, but they cannot use it at the same time; they can work simultaneously when the prescaler is used by only one of them. TIMER0 is an 8-bit wide circulatory accumulating counter triggered by the rising edge of the clock. It has the function of overflow interrupt. TIMER1 is a 16-bit wide circulatory accumulating counter pair composed of two 8-bit registers, TMR1L and TMR1H, triggered by the rising edge of the clock. It also has the function of overflow interrupt. It can work in either timer mode or counter mode, and counter mode can be divided into synchronous and asynchronous counter mode depending on the setting of the control signals, and the external clock is synchronized with the internal clock.

The UART consists of a sampling module, a baud rate generating module, a receiver module and a sending module. Its structure is shown in Figure 3. It uses the general RS232-C serial interface standard, and a frame of transferred serial data includes a start bit, eight data bits and a stop bit. Sampling and receiving of data are implemented by sampling the data at the middle of each bit time in the receiver module and the sending module. The UART overcomes the shortcomings of common UARTs. Only 24 ports are used in the UART, and a sampling module is added to the system. It has simple ports, high accuracy and reliability.

Figure 3 Structure of UART

The input and output port units consist of two bidirectional 8-bit I/O port registers (PORTA and PORTB) and two I/O direction registers (TRISA and TRISB). They provide the unique access for the MCU to exchange data with external memory. Reading and writing of the port registers is controlled by the direction registers.
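The UART frame format described above (a start bit, eight data bits and a stop bit, with each bit sampled at its middle) can be sketched in Python as follows. This is only an illustration under stated assumptions: the oversampling ratio of 16, the LSB-first bit order and the names `frame_byte`, `to_samples` and `receive` are not given in the paper and are chosen here for the example.

```python
OVERSAMPLE = 16  # assumed number of samples per bit; the paper does not state the ratio


def frame_byte(value):
    """Return the 10 line levels of one frame: start bit, 8 data bits, stop bit."""
    if not 0 <= value <= 0xFF:
        raise ValueError("a frame carries exactly eight data bits")
    data = [(value >> i) & 1 for i in range(8)]  # LSB first (assumed bit order)
    return [0] + data + [1]                      # start bit = 0, stop bit = 1


def to_samples(bits, n=OVERSAMPLE):
    """Hold every bit for n sample periods, as the sampling module would see it."""
    return [b for b in bits for _ in range(n)]


def receive(samples, n=OVERSAMPLE):
    """Recover one byte by reading each bit at the middle of its bit time."""
    start = samples.index(0)          # first low sample: edge of the start bit
    mid = start + n // 2              # middle of the start bit
    if samples[mid] != 0:
        raise ValueError("false start bit")
    bits = [samples[mid + n * i] for i in range(1, 10)]  # 8 data bits + stop bit
    if bits[8] != 1:
        raise ValueError("framing error: stop bit missing")
    return sum(b << i for i, b in enumerate(bits[:8]))
```

Sampling at the middle of the bit keeps the receiver tolerant of small baud-rate mismatches, because the sample point stays away from the bit edges. A line built as `[1] * 5 + to_samples(frame_byte(0x5A)) + [1] * 5` is decoded back to `0x5A` by `receive`.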

GPIO allows I/O pins and functional modules to be reused. The processor reads the output register of this module when it is checking the port, while data is sent directly to the ports when written. When the external interrupt is enabled and the edge of the interrupt signal arrives, GPIO notifies INT_CTRL; the interrupt should be cleared by software.

III. INSTRUCTION SET AND INSTRUCTION FORMAT
The instruction set used by the MCU has 35 instructions, and each instruction word is 14 bits long. It is compatible with the PIC16F87XA instruction set. The instructions include byte operating instructions, bit operating instructions, immediate data operating instructions and controlling instructions. Figure 4 shows the general format for instructions. All instructions require only one cycle except for the program transfer instructions, which require two cycles[6].

Figure 4 General format for instructions

IV. DESIGN OF PIPELINE
Pipeline technology can make a processor several times faster by adding some hardware, or even without adding hardware. It is a generally used parallel processing mode. The MCU adopts a two-stage pipeline structure to fetch and execute instructions. The first stage is the instruction-fetching stage and the second stage is the instruction-executing stage. An instruction is executed while the next instruction is fetched. Figure 5 shows the two-stage pipeline. After the clock frequency is divided by four, the clock generates four nonoverlapping clocks named Q1, Q2, Q3 and Q4. The two-stage pipeline works as follows:

A. At the instruction-fetching stage
When the rising edge of Q1 arrives, the content of PC is updated to the destination address, which is the present address plus 1 in normal circumstances. The destination address is given by the control unit when the program jumps, and it is the corresponding interrupt vector when an interrupt occurs. When the rising edge of Q2 arrives, instructions are fetched from program memory and latched in the instruction register.

B. At the instruction-executing stage
The stage includes decoding, fetching operands, executing the operation and memory access. All of the above operations are implemented between the next Q1 and Q4.

In summary, the eight-bit MCU greatly increases its speed by adopting the two-stage pipeline structure.

Figure 5 The structure of the two-stage pipeline

V. SIMULATION AND HARDWARE IMPLEMENTATION
The hardware structure of the MCU is described in Verilog HDL, and simulation and verification are carried out with the Modelsim software and an FPGA board respectively. Figure 6 shows the results of simulation. The changes of the PC pointer and the execution process of instructions can be seen in the simulation waveform. The MCU works normally and the frequency is 40 MHz.
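As a rough illustration of what the two-stage pipeline means for execution time, the following Python sketch counts instruction cycles for an executed instruction trace. It rests on several assumptions that the paper does not state: the set of program transfer mnemonics, the idea that one instruction cycle spans the four phases Q1 to Q4 of the source clock, and the reading of 40 MHz as the frequency of that source clock. The names `instruction_cycles` and `execution_time_ns` are hypothetical.

```python
# Assumed set of program transfer instructions (two cycles each).
TRANSFER = {"GOTO", "CALL", "RETURN", "RETLW", "RETFIE"}


def instruction_cycles(trace):
    """Count instruction cycles for a list of executed mnemonics."""
    cycles = 1                  # pipeline fill: the first cycle only fetches
    for mnemonic in trace:
        cycles += 1             # execute, overlapped with the next fetch
        if mnemonic in TRANSFER:
            cycles += 1         # prefetched instruction is discarded, target is fetched
    return cycles


def execution_time_ns(cycles, clock_hz=40_000_000, phases=4):
    """Convert instruction cycles to nanoseconds.

    Assumes one instruction cycle lasts `phases` periods of a `clock_hz` source clock.
    """
    return cycles * phases * 1_000_000_000 / clock_hz
```

Under these assumptions, a trace of three single-cycle instructions and one GOTO takes six instruction cycles including the pipeline fill, that is 600 ns. Without the overlap of fetch and execute, every instruction would need a separate fetch cycle, which is the saving that the two-stage structure provides.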

The chip used on the FPGA board is ALTERA's EP1C6T144C8N, as shown in Figure 7. From the results of tapeout, the area of the chip is 24 mm x 24 mm and the power consumption is 34.4727 mW.

Figure 6 The results of simulation

Figure 7 The FPGA board used for verification

VI. CONCLUSION
From the results of simulation and verification, it can be seen that the MCU has reached the design target. The key to the design is the two-stage pipeline. The design is described in Verilog HDL, which has good readability, so it is easy to modify the functions of resources. The design can be easily used in embedded systems.

REFERENCES
[1] [2] [3] [4] [5] [6]
LI Yong. 2005. A Design of 8-bit RISC MCU IP Core. Li Xin-hui. 6. PIC 16C5X Data Sheet (EPROM/ROM-Based 8 bit CMOS Microcontroller. Design of a 32-Bit RISC Microprocessor. 2005. Design and Research of MCU Core Based in RISC Technique. YAN Tianxin. WANG Yan-fang. 2001. JIANG Yan. MINI-MICRO SYSTEMS. Structure Design of A 64-BIT RISC Microprocessor. FENG Hai-tao. Design and Implementation of 32-bit Integral Microprocessor Based on FPGA. QI Jia-yue. Science Technology and Engineering. ELECTRONICS & PACKAGING. Liu Du-ren. 2005. YANG Guang. WANG Yong-gang. SHI Jiang-tao. LIAN Dian-bin. MICROELECTRONICS & COMPUTER. Microelectronics. LIU Tao. Li Jia-rong. YING Jihong.