
The IEEE International Conference on

Industrial Informatics (INDIN 2008)


DCC, Daejeon, Korea July 13-16, 2008

Efficient Implementation of CNC Position Controller using FPGA
Yaodong Tao1,2, Hu Lin2, Yi Hu2, Xiaohui Zhang2, Zhicheng Wang2
1) University of Science and Technology of China, Hefei, Anhui, 230027, P. R. China
2) National Engineering Research Centre for High-end CNC, Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, Liaoning, P. R. China, 110004
taoyd@mail.ustc.edu.cn

Abstract: In this paper, an efficient design scheme for implementing the high-speed CNC Position Controller (PC) using Field Programmable Gate Array (FPGA) technology is presented. The algorithm is implemented using a Distributed Arithmetic (DA)-based scheme in which the Look-Up Table (LUT) mechanism inside the FPGA is utilized. Two novel DA-based CNC Position Controllers are proposed for FPGA implementation. The implementation results show that the two DA-based PCs use 0.8% and 1.5% of the logic resources of the FPGA device respectively, whereas the multiplier-based design uses 51.7%. Using a 32 MHz input clock, the two DA-based designs ensure that the servo loop update frequency reaches 1 MHz, which satisfies the high-speed CNC requirement.

Keywords: distributed arithmetic, position controller, servo loop update time, FPGA design

I. INTRODUCTION

High-speed computer numerical control (CNC) machine motion control involves feed rates above 40 m/min as well as acceleration dynamics above 1G and in some cases 2G [1]. Because of this, a motion controller for high-speed CNC machines places higher demands on the electronics design than a conventional-speed CNC: it requires a higher sampling rate of the Position Controller (a shorter servo loop update time), which reduces the time available for on-line processing. To conduct closed-loop servo motor control, a CNC generally includes three loops: the current loop, the velocity loop and the position loop. In the CNC architecture, the computations of the current loop and the velocity loop are completed by a servo driver, while the computation of the position loop is completed by the position controller of the CNC. For high-speed control, the position loop update time should be as short as possible.

In a CNC, the position controller, the other logical functions of the CNC, the interpolator and the acceleration/deceleration module implement the movement control of machine tools. Recently, the Field Programmable Gate Array (FPGA) has become an alternative solution for the realization of digital control systems, previously dominated by general-purpose microprocessors and application-specific integrated circuits (ASIC).

An FPGA is an array of basic logic blocks whose interconnectivity the user can define, making it programmable in a fully open architecture. An FPGA therefore provides the advantages of both a general-purpose processor and a specialized circuit, and it can be reconfigured as many times as necessary until the required functionality is achieved. The speed and size of an FPGA are comparable with those of an ASIC, but the FPGA is more flexible and its design cycle is much shorter because of its reconfigurability. FPGA applications go beyond the simple implementation of digital logic; FPGAs can be used to implement specific architectures in order to speed up an algorithm. The FPGA has become popular with hardware designers for high-performance digital signal processing, providing solutions that are often 10X to 100X faster than can be accomplished with PC or Single Board Computer (SBC) processors. The FPGA has a naturally parallel architecture for high-speed computation.

The FPGA-based implementation of control algorithms offers advantages such as high-speed computation, complex functionality and real-time processing capabilities. Recently, there has been much research on FPGA-based implementation of such algorithms. For example, the system developed by Takahashi and Goetz [2] runs a current control algorithm on a Xilinx FPGA to increase the bandwidth of the current loop control. Tzou and Kuo [3] successfully performed the vector and velocity control of a PMAC servo motor using FPGA technology. Other works on FPGA-based motion control include Paramasivam [4], Bielewicz [5] and Dubey [6]. Yau [7] implemented real-time NURBS interpolation using an Altera FPGA for high-speed motion control. K.D. Oldknow and I. Yellowley [8] completed three-dimensional dynamic interpolation using stateline-based control architectures in an FPGA device. R.A. Osornio-Rios [9] showed the development of a PID controller that improves the servo loop update frequency by using a multiplier-based design method on an FPGA. In all of these works, an FPGA was used to realize control algorithms with the conventional multiplier-based method, which requires a large number of multipliers and adders and does not efficiently utilize the memory-rich characteristics of the FPGA. An FPGA chip contains many memory blocks, referred to as Look-Up Tables (LUT), which can be utilized to implement efficient designs. Exploiting these memory blocks, Y.F. Chan and M. Moallem [10] achieved a highly efficient implementation of a general PID controller using the Distributed Arithmetic (DA) scheme, an efficient LUT design method that is very promising for the FPGA implementation of control algorithms.

978-1-4244-2171-8/08/$25.00 © 2008 IEEE 1177


The main objective of this paper is to show the advantages of the DA scheme over the multiplier-based scheme in the design of a position controller for high-speed CNC on an FPGA device. The paper is organized as follows. Section II introduces the background: the architecture of the CNC and the principle of the DA scheme are briefly explained, and the reason for shortening the servo loop update cycle for high-speed control is discussed. Section III considers a Position Controller and discusses its implementation using the DA scheme. Section IV compares the proposed schemes with the design based on the multiplier-based method.
II. BACKGROUND

A. The Architecture of CNC and Position Controller

The research object of this work is the Position Controller of the CNC. The architecture of the CNC is shown in Fig.1. The CNC consists of the Display module (GUI), the Task module, the Motion module and the I/O module. The Task module interprets commands and sends instructions to the Motion module. The GUI displays the state of the system. The I/O module processes the digital input and output points of the system. The Motion module, which consists of an Interpolator and a Position Controller, executes the commands from the Task module. The Interpolator refines each large step command from the Task module into small steps and sends the next position to the Position Controller. The Position Controller moves the axis to the commanded position rapidly and accurately.

Figure 1. The Architecture of CNC

B. Distributed Arithmetic (DA) scheme

Distributed arithmetic is a bit-level rearrangement of a multiply-accumulate operation that hides the multiplications. It calculates the sum of products by lookup operations on a Look-Up Table (LUT). It is a powerful technique for reducing the size of a parallel hardware multiply-accumulate and is well suited to FPGA designs [11]. The basic principles of DA are introduced as follows.

As an example of direct DA inner-product generation, consider the calculation of the following sum of products:

$$y = \sum_{k=1}^{K} A_k x_k \tag{1}$$

The A_k are fixed coefficients, and the x_k are the input data words. If each x_k is a 2's-complement binary number scaled (for convenience, not as necessity) such that |x_k| < 1, then we may express each x_k as

$$x_k = -b_{k0} + \sum_{n=1}^{N-1} b_{kn} 2^{-n} \tag{2}$$

where the b_kn are the bits, 0 or 1, b_k0 is the sign bit, and b_k,N-1 is the least significant bit (LSB). Combining (1) and (2) gives

$$y = \sum_{n=1}^{N-1} \left[ \sum_{k=1}^{K} A_k b_{kn} \right] 2^{-n} + \sum_{k=1}^{K} A_k (-b_{k0}) \tag{3}$$

Equation (3) defines a distributed arithmetic computation. Consider the bracketed term in (3):

$$\sum_{k=1}^{K} A_k b_{kn} \tag{4}$$

Because each b_kn may take on values of 0 and 1 only, expression (4) may have only 2^K possible values. Rather than compute these values on line, we may precompute them and store them in a ROM. The input data can then be used to directly address the memory and retrieve the result.

For example, let K = 3, A1 = 0.72, A2 = -0.30, A3 = 0.95. The memory must contain all possible combinations (2^3 = 8) and their negatives in order to accommodate the term which occurs at the sign-bit time. As a consequence, we need a 2 × 2^K word ROM. Fig.2 shows a simple structure (with a 2 × 2^3 = 16-word ROM) that can be used to mechanize these equations; Table 1 shows the contents of the memory. The Ts signal is the sign-bit timing signal. We assume that the data on the x1, x2 and x3 lines (which with Ts comprise the ROM address words) are serial 2's-complement numbers. Each is delivered in a one-bit-at-a-time (1BAAT) fashion, with the LSBs {b_k,N-1} first. The sign bits {b_k0} are the last bits to arrive. The clock period in which the sign bits all arrive simultaneously is called the "sign-bit time". During the sign-bit time the control signal Ts = 1; otherwise Ts = 0. For the moment we assume essentially zero delay between the arrival of the address pattern at the ROM and the availability of its output. The delay around the accumulator loop is assumed to be one clock cycle and is concentrated in the summer. Switch SWA remains in Position 1 except during the clock cycle that follows the sign-bit time, when it toggles for one clock cycle to Position 2, and the fully formed result Y is output.

Figure 2. The DA Structure using Adder and Fully memory (inputs x1, x2, x3 and Ts; 16-word ROM of Table 1 with A1 = 0.72, A2 = -0.30, A3 = 0.95; adder with 2^-1 feedback; switch SWA; parallel output Y)

Table 1. The Content of ROM

Input code             16-word memory contents
Ts   b1n  b2n  b3n
(Ts = 0: 1 ≤ n ≤ N-1)
0    0    0    0       0
0    0    0    1       A3 = 0.95
0    0    1    0       A2 = -0.30
0    0    1    1       A2+A3 = 0.65
0    1    0    0       A1 = 0.72
0    1    0    1       A1+A3 = 1.67
0    1    1    0       A1+A2 = 0.42
0    1    1    1       A1+A2+A3 = 1.37
(Ts = 1: n = 0)
1    0    0    0       0
1    0    0    1       -A3 = -0.95
1    0    1    0       -A2 = 0.30
1    0    1    1       -(A2+A3) = -0.65
1    1    0    0       -A1 = -0.72
1    1    0    1       -(A1+A3) = -1.67
1    1    1    0       -(A1+A2) = -0.42
1    1    1    1       -(A1+A2+A3) = -1.37

C. High-speed machining and servo loop update cycle

High-speed machining refers to chip removal in the CNC and offers substantial economic benefits due to increased metal cutting productivity [12]. When the feed rate is constant, the control resolution of the CNC is the distance that the axis moves in one position control cycle (a servo control cycle). In the high-speed machining process, the CNC must maintain control resolution while ensuring a high feed rate. Table 2 shows the relevance of fast servo cycle times to working quickly and accurately [13]. When the feed rate is 30 m/min and the position control cycle is 1 ms (a frequency of 1 kHz), the control resolution of the CNC can only attain 0.5 mm. If the feed rate is kept unchanged and the servo loop update frequency is raised to 10 kHz (a servo loop update cycle of 0.1 ms), the control resolution attains 0.05 mm.

Table 2. Relationship between the control cycle of position controller with speed and machining precision

Position control    Cycles/s    Cycle distance (mm)   Cycle distance (mm)   Cycle distance (mm)
cycle tcs (ms)      (1/tcs)     with F = 3 m/min      with F = 10 m/min     with F = 30 m/min
20                  50          1                     3.33                  10
10                  100         0.5                   1.66                  5
3                   333         0.15                  0.5                   1.5
1                   1000        0.05                  0.16                  0.5
0.4                 2500        0.02                  0.06                  0.2
0.1                 10,000      0.005                 0.016                 0.05

As mentioned above, a CNC for high-speed control has higher design demands than a conventional CNC. The main design requirement is the position control frequency, which in conventional machining is 1 kHz (one thousand samples per second), while in high-speed machining it is 10 kHz or even higher.

III. HIGH SPEED CNC POSITION CONTROLLER

Generally speaking, the CNC Position Controller is a PID-based controller as shown in Fig.3, where the input variable u is a position command from the interpolator, the output variable v is a speed command to the servo, and the feedback variable y is the signal from the servo encoder. The CNC Position Controller receives the position command calculated by the interpolator and accurately controls the rotation of the servos, which drive the axis to the target position. In order to reduce the response time of the CNC position controller, we append three-grade position feedforward to the PID control as shown in Fig.4, where the zero-order term is position feedforward, the first-order term is velocity feedforward and the second-order term is acceleration feedforward. K0, K1 and K2 are the zero-order, first-order and second-order coefficients respectively. The zero-order coefficient multiplied by the position value is the zero-order feedforward (FF0). The first-order coefficient multiplied by the first-order derivative of u is the first-order feedforward (FF1). The second-order coefficient multiplied by the second-order derivative of u is the second-order feedforward (FF2).

Figure 3. The DA Structure using Adder and Fully memory

The form of the PID control algorithm of the CNC PC is given by

$$v(t) = K_P \left( e(t) + \frac{1}{T_I} \int e(t)\,dt + T_D \frac{de(t)}{dt} \right) \tag{5}$$

In equation (5), v(t) is the output of the CNC PC, which is a velocity variable. e(t) is the difference between u(t) and y(t); it is the input of the CNC PC and the error between the commanded position and the actual position. KP is the proportional coefficient, TI is the integration time constant and TD is the derivative time constant. Discretizing equation (5), we have

$$v(k) = K_P e(k) + K_I \sum_{j=0}^{k} e(j) + K_D [e(k) - e(k-1)] \tag{6}$$

Here e(k), u(k) and y(k) are respectively the values of e(t), u(t) and y(t) after discretizing, and k denotes the kth sampling instant. Using P(k) to denote the proportional term, I(k) the integral term and D(k) the derivative term, we have

$$v(k) = P(k) + I(k) + D(k) \tag{7}$$

$$P(k) = K_P e(k) = K_P (u(k) - y(k)) \tag{8}$$

$$I(k) = K_I \sum_{j=0}^{k} e(j) \tag{9}$$

$$D(k) = K_D (u(k) - y(k) - u(k-1) + y(k-1)) \tag{10}$$

Equation (9) can be written as

$$I(k) = I(k-1) + K_I u(k) + (-K_I) y(k) \tag{11}$$

The three-grade feedforward of the CNC PC can be expressed respectively as follows.

$$FF_0(t) = K_{FF_0} u(t) \tag{12}$$

$$FF_1(t) = K_{FF_1} \frac{du(t)}{dt} \tag{13}$$

$$FF_2(t) = K_{FF_2} \frac{d(du(t)/dt)}{dt} \tag{14}$$

Discretizing (12), (13) and (14), we have

$$FF_0(k) = K_0 u(k) \tag{15}$$

$$FF_1(k) = K_1 [u(k) - u(k-1)] \tag{16}$$

$$FF_2(k) = K_2 [(u(k) - u(k-1)) - (u(k-1) - u(k-2))] \tag{17}$$
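As a point of reference for the discrete equations above, the following is a minimal floating-point sketch of one servo update, assuming the controller output is the sum of the PID terms (8), (10), (11) and the feedforward terms (15)-(17). The names `Gains`, `PositionController` and `step`, and any gain values passed in, are hypothetical and chosen only for illustration; this is a software model, not the FPGA design described in the paper.

```python
from dataclasses import dataclass


@dataclass
class Gains:
    """Controller coefficients (hypothetical container, values are up to the user)."""
    kp: float  # K_P
    ki: float  # K_I
    kd: float  # K_D
    k0: float  # K_0, zero-order feedforward
    k1: float  # K_1, first-order feedforward
    k2: float  # K_2, second-order feedforward


class PositionController:
    """Floating-point model of equations (8), (10), (11) and (15)-(17)."""

    def __init__(self, gains: Gains) -> None:
        self.g = gains
        self.u1 = 0.0  # u(k-1)
        self.u2 = 0.0  # u(k-2)
        self.y1 = 0.0  # y(k-1)
        self.i1 = 0.0  # I(k-1)

    def step(self, u: float, y: float) -> float:
        """Return the velocity command for position command u and feedback y."""
        g = self.g
        p = g.kp * (u - y)                                   # (8)
        i = self.i1 + g.ki * u - g.ki * y                    # (11)
        d = g.kd * (u - y - self.u1 + self.y1)               # (10)
        ff0 = g.k0 * u                                       # (15)
        ff1 = g.k1 * (u - self.u1)                           # (16)
        ff2 = g.k2 * ((u - self.u1) - (self.u1 - self.u2))   # (17)
        # Shift the delay elements for the next sampling instant.
        self.u2, self.u1, self.y1, self.i1 = self.u1, u, y, i
        return p + i + d + ff0 + ff1 + ff2
```

Calling `step(u, y)` once per sampling instant yields the velocity command for that instant. The four stored values play the role of delay elements for u(k-1), u(k-2), y(k-1) and I(k-1).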

Figure 4. Architecture of CNC Position Controller

Summing the three feedforward terms and denoting the result by F, we have

$$F(k) = (K_0 + K_1 + K_2) u(k) + (-(K_1 + 2K_2)) u(k-1) + K_2 u(k-2) \tag{18}$$

As a result, the discrete expression v(k) of the output of the CNC PC can be written as

$$v(k) = P(k) + I(k) + D(k) + FF_0 + FF_1 + FF_2 \tag{19}$$

Summing the terms of v(k) other than I(k), we have

$$P + D + FF_0 + FF_1 + FF_2 = (K_P + K_D + K_0 + K_1 + K_2) u(k) + (-(K_D + K_1 + 2K_2)) u(k-1) + K_2 u(k-2) + (-(K_P + K_D)) y(k) + K_D y(k-1) \tag{20}$$

Having obtained the discretized expression (19) of the control algorithm of the CNC PC, the focus is now on its efficient implementation. The direct implementation of the above algorithm on an FPGA requires a total of 7 multipliers, 7 adders/subtractors and 1 delay block. Equation (20) requires 5 multipliers and 4 adders/subtractors; the term I(k) requires 2 multipliers, 2 adders/subtractors and 1 delay block. Summing (20) and I(k) requires one more adder/subtractor, so a total of 7 adders/subtractors are required.

Since the FPGA has a limited number of configurable logic blocks for the above calculations, the multiplier-based implementation is not efficient for the FPGA. In order to reduce the required FPGA resources, we use the DA method to replace the multiplication operations by simple shift and addition operations, as described in Section II-B. This is discussed as follows.

A. Direct DA Implementation for CNC PC (PC-I)

Let us consider the controller terms given in (19). Assuming that all terms are m-bit numbers and that [j] denotes the jth bit of a number, we have

$$I(k) = \sum_{j=0}^{m-1} \big( I(k-1)[j] + K_I\,u(k)[j] + (-K_I)\,y(k)[j] \big) \times 2^{j} \tag{21}$$

$$P(k) + D(k) = \sum_{j=0}^{m-1} \big( (K_P + K_D)\,u(k)[j] + (-K_D)\,u(k-1)[j] + (-(K_P + K_D))\,y(k)[j] + K_D\,y(k-1)[j] \big) \times 2^{j} \tag{22}$$

$$F(k) = \sum_{j=0}^{m-1} \big( (K_0 + K_1 + K_2)\,u(k)[j] + (-(K_1 + 2K_2))\,u(k-1)[j] + K_2\,u(k-2)[j] \big) \times 2^{j} \tag{23}$$

Because each bit [j] may take on values of 0 and 1 only, and KP, KI, KD, K0, K1, K2 are constants, the bracketed terms in (21), (22) and (23), such as (I(k-1)[j] + KI u(k)[j] + (-KI) y(k)[j]), can be precomputed and stored in LUTs. The LUTs for (21), (22) and (23) are named LUTI, LUTPD and LUTF respectively, as shown in Table 3, Table 4 and Table 5. The I, PD and F terms can be obtained in m clock cycles. The main advantage of the DA expressions given by (21), (22) and (23) lies in their capability to compute the CNC PC function utilizing the LUT-rich FPGA, where the LUTs can be built from the RAM blocks of the FPGA.

Table 3. The Content of LUTI

I(k-1)[j]   u(k)[j]   y(k)[j]   LUTI
0           0         0         0
0           0         1         -KI
0           1         0         KI
0           1         1         0
1           0         0         1
1           0         1         1-KI
1           1         0         1+KI
1           1         1         1

Table 4. The Content of LUTPD

u(k)[j]   u(k-1)[j]   y(k)[j]   y(k-1)[j]   LUTPD
0         0           0         0           0
0         0           0         1           KD
0         0           1         0           -(KP+KD)
0         0           1         1           -KP
…         …           …         …           …
1         1           0         0           KP
1         1           0         1           KP+KD
1         1           1         0           -KD
1         1           1         1           0

Table 5. The Content of LUTF

u(k)[j]   u(k-1)[j]   u(k-2)[j]   LUTF
0         0           0           0
0         0           1           K2
0         1           0           -(K1+2K2)
0         1           1           -(K1+K2)
1         0           0           K0+K1+K2
1         0           1           K0+K1+2K2
1         1           0           K0-K2
1         1           1           K0

Based on the above equations, the direct DA implementation of the CNC PC, namely PC-I, is shown in Fig.5. It consists of four delay blocks, three LUTs and three ACCs. The four delay blocks are used to obtain u(k-1), y(k-1), u(k-2) and I(k-1). The LUTs and ACCs are used to obtain PD, I and F, where each ACC consists of a shift register and an adder/subtractor. Finally, two adders produce the sum of PD(k), I(k) and F(k). The throughput (speed) of this PC implementation is (m+1) clock cycles, i.e., m clock cycles to generate the result and one more clock cycle to update I(k-1). The total latency is (m+1) clock cycles.

Figure 5. Architecture of the proposed PC-I

B. Enhanced DA Implementation for CNC PC (PC-II)

In order to improve the efficiency of the design, we apply a pipeline scheme to the direct DA implementation as follows.

For the I(k) term, equation (21) can be revised as

$$I(k) = I(k-1) + \sum_{j=0}^{m-1} \big( K_I\,u(k)[j] + (-K_I)\,y(k)[j] \big) \times 2^{j} \tag{24}$$

The term I(k-1) can be incorporated into the ACC as follows. In the direct implementation above, the shift register of the ACC is cleared to 0 after m clock cycles. By removing the clear function of the shift register, the ACC keeps the previous value I(k-1) to perform the addition. The contents of the new LUT, LUTI', are shown in Table 6.

Table 6. The Content of new LUTI'

u(k)[j]   y(k)[j]   LUTI'
0         0         0
0         1         -KI
1         0         KI
1         1         0

Using the pipeline scheme, we can combine (22) with (23), together with the bits of I(k), as follows.

$$v(k) = I(k) + P(k) + D(k) + F(k) = \sum_{j=0}^{m-1} \big( (K_0 + K_1 + K_2 + K_P + K_D)\,u(k)[j] + I(k)[j] + (-(K_P + K_D))\,y(k)[j] + (-(K_D + K_1 + 2K_2))\,u(k-1)[j] + K_D\,y(k-1)[j] + K_2\,u(k-2)[j] \big) \times 2^{j} \tag{25}$$

Thus, we can create a new table LUTPID+FF, shown in Table 7, for calculating the terms with [j] in (25).

Table 7. The Content of LUTPID+FF

I(k)[j]   u(k)[j]   u(k-1)[j]   u(k-2)[j]   y(k)[j]   y(k-1)[j]   LUTPID+FF
0         0         0           0           0         0           0
0         0         0           0           0         1           KD
0         0         0           0           1         0           -(KP+KD)
0         0         0           0           1         1           -KP
…         …         …           …           …         …           …
1         1         1           1           0         0           1+K0+KP
1         1         1           1           0         1           1+K0+KP+KD
1         1         1           1           1         0           1+K0-KD
1         1         1           1           1         1           1+K0

The LUTPID+FF and the corresponding ACC generate the P+D+F term, together with I(k), in m clock cycles in the second pipeline stage. This two-stage implementation of the DA-based CNC PC, namely PC-II, is shown in Fig.6. It requires two ACCs, three delay blocks and two LUTs, while PC-I requires three ACCs, four delay blocks and three LUTs.

Figure 6. Architecture of the proposed PC-II

PC-II needs two stages to accomplish one servo update cycle of the CNC PC. The first stage consists of one LUT and one ACC to calculate I(k). The second stage consists of one LUT and one ACC to calculate the summation of the whole PC function using the result I(k) available from the first stage. These stages are pipelined, so that while the second stage is performing the first calculation the first stage is performing the next calculation. Therefore, the throughput is only m clock cycles. The two stages, each requiring m clock cycles, introduce a latency of 2m clock cycles.

The performance, in terms of complexity and speed, of the proposed designs and the multiplier-based design is listed in Table 8 for an input clock frequency of 32 MHz. Compared with the multiplier-based CNC PC, the two DA-based designs, PC-I and PC-II, utilize the memory-rich characteristic of the FPGA. The proposed design PC-II requires fewer adders/subtractors and fewer delay blocks than the direct DA implementation, PC-I. The speed (throughput) of the PC-II design is slightly higher than that of PC-I, but its latency is greater. Since the latency only occurs once, during power-up or system reset, it is not of much concern in our CNC consideration. Thus the PC-II design has improved characteristics compared with PC-I and is the preferred design amongst the three.

Table 8. Comparison of Complexity and Speed amongst the three designs

CNC PC             Complexity                     Throughput   Latency     Total
PC-II              160 LEs, 2 M4K Ram Block       32 cycles    64 cycles   0.8% LEs, 3% Ram Block
PC-I               290 LEs, 3 M4K Ram Block       33 cycles    33 cycles   1.5% LEs, 5% Ram Block
Multiplier-based   9706 LEs, 0 M4K Ram Block      1 cycle      1 cycle     51.7% LEs

IV. RESULTS

The CNC Position Controller (PC-II) using the DA-based scheme proposed above was implemented in an Altera FPGA. The FPGA design flow was as follows. First, the PC-II function was described in VHDL in Altera's QuartusII 6.1, an Integrated Development Environment (IDE) for FPGA design. At the same time, functional simulation of PC-II was performed and the validity of the design was verified. Then logic synthesis was carried out. Finally, the .pof file created by QuartusII was downloaded to a PCI AXIS Control Card, which contains a Cyclone 1C12Q240C8 FPGA device with 12,060 LEs and 52 M4K Ram Blocks. The CNC AXIS Control Card is shown in Fig.7.

Figure 7. CNC Axis Control Card

All parameters and input variables of the position controller in this work, whether negative or positive, are 32-bit 2's-complement binary numbers.

During debugging, the position controller designed in VHDL was implemented on an Altera DE1 kit, as shown in Fig.8. The kernel device of the Altera DE1 Development kit is a Cyclone II 2C20F484C6, an FPGA device which contains 18752 LEs and 52 M4K Ram Blocks. SignalTapII, an embedded logic analyzer from Altera, was used to debug the design with the HIL (Hardware In Loop) method.

Figure 8. The Altera DE1 Development Kit

PC-II offers an improvement over PC-I. In particular, the PC-II design needs 32 clock cycles for one servo position update, while the PC-I design needs 33 clock cycles. Furthermore, the PC-II design requires fewer resources. The latency of PC-II is greater than that of PC-I since PC-II needs two pipeline stages, but the latency only occurs once, during power-up or system reset. Thus, the computing speed of PC-II is faster than that of the PC-I design. When the input clock frequency is 32 MHz, the servo loop update frequency of the PC-II scheme can attain 1 MHz, which meets the requirement of high-speed CNC control.

It is seen from Table 8 that the PC-II design requires about 0.8% of the LEs and 3% of the M4K Ram Blocks of the Cyclone II 2C20F484C6. The PC-I design needs 1.5% of the LEs and 5% of the M4K Ram Blocks of the device, while the multiplier-based implementation uses about 51.7% of the LEs. In other words, if the position controller of a high-speed CNC is implemented with the multiplier-based method, a Cyclone II 2C20F484C6 can only contain a single position controller channel, which can only control one axis. By contrast, the position controller implemented with the DA scheme, whether PC-II or PC-I, can control more than one axis, which achieves much higher efficiency than the multiplier-based implementation.

V. CONCLUSION

In this paper, two FPGA implementations of a high-speed CNC Position Controller (PC) using a DA-based scheme are shown. By using the DA-based LUT scheme, the memory inside the FPGA has been utilized to provide an efficient design for the CNC PC. The servo update frequency of the PC can attain 1 MHz, which meets the requirement of high-speed CNC. Because the DA-based FPGA design methodology is used, more than one PC can be implemented in a Cyclone II 2C20F484 FPGA device, while only one PC can be implemented using the multiplier-based design method. In future, we will efficiently implement the other functions of the CNC Motion Controller using the DA-based scheme in an FPGA.

ACKNOWLEDGMENT

This research is supported by the key project on the revitalization of the northeast China by CAS (the Chinese Academy of Sciences): Oriented servo driven synchronous control bus and device industrialization.

REFERENCES

[1] Wang Z, et al., “Case representation and similarity in high-speed machining,” International Journal of Machine Tools and Manufacture, vol. 43, 2003, pp. 1347-1353.
[2] T. Takahashi and J. Goetz, “Implementation of complete AC servo control in a low cost FPGA and subsequent ASSP conversion,” Applied Power Electronics Conference and Exposition, 2004. APEC'04. Nineteenth Annual IEEE, vol. 1, 2004.
[3] Y.Y. Tzou and T.S. Kuo, “Design and implementation of an FPGA-based motor control IC for permanent magnet AC servo motors,” 23rd International Conference on Industrial Electronics, Control and Instrumentation, vol. 2, 1997, pp. 943-947.
[4] S. Paramasivam, et al., “Ingenious digital controller for switched reluctance motor using Verilog (HDL),” TENCON 2003. Conference on Convergent Technologies for Asia-Pacific Region, vol. 3, 2003.
[5] Z. Bielewicz, et al., “A DSP and FPGA based integrated controller development solutions for high performance electric drives,” Proceedings of the IEEE International Symposium on, vol. 2, June 1996, pp. 679-684.
[6] R. Dubey, et al., “FPGA based PMAC motor control for system-on-chip applications,” Proceedings of International Conference on Power Electronics Systems and Applications, 2004, pp. 194-200.
[7] H.-T. Yau, et al., “Real-time NURBS interpolation using FPGA for high speed motion control,” Computer-Aided Design, vol. 38, 2006, pp. 1123-1133.
[8] K.D. Oldknow and I. Yellowley, “Three-dimensional dynamic interpolation using stateline based control architectures,” International Journal of Machine Tools and Manufacture, vol. 42, no. 15, 2002, pp. 1627-1641.
[9] R.A. Osornio-Rios, et al., “The application of reconfigurable logic to high speed CNC milling machines controllers,” Control Engineering Practice, in press, corrected proof, 2007.
[10] Y.F. Chan, et al., “Efficient implementation of PID control algorithm using FPGA technology,” Decision and Control, 2004. CDC. 43rd IEEE Conference on, vol. 5, 2004.
[11] S.A. White, et al., “Applications of distributed arithmetic to digital signal processing: a tutorial review,” ASSP Magazine, IEEE [see also IEEE Signal Processing Magazine], vol. 6, no. 3, 1989, pp. 4-19.
[12] C.R. Knospe, “Active magnetic bearings for machining applications,” Control Engineering Practice, vol. 15, no. 3, 2007, pp. 307-313.
[13] S. Gordon and M.T. Hillery, “Development of a high-speed CNC cutting machine using linear motors,” Journal of Materials Processing Tech., vol. 166, no. 3, 2005, pp. 321-329.
