Rui Xu, Dongxu Li and Jianping Jiang

This paper deals with the active vibration control of smart truss structures. First, the electro-mechanical coupled dynamic
model of the smart structure is constructed. Then, the first-order ordinary differential equation of the control system is
presented. After that, an online learning fuzzy control (OLFC) algorithm is proposed to control the structural vibrations.
The OLFC algorithm is composed of a reward function, a Q-learning algorithm, a rule base generator and a conventional
fuzzy controller. It learns the rule base by interacting with the plant, and adjusts its rule-base generation policy
according to an evaluative reward signal in order to reach the learning goal. The algorithm needs only a little information
about the plant to design the reward function. To demonstrate the effectiveness of the proposed control algorithm, control
responses are presented and compared with those of a conventional fuzzy control method.

Fuzzy control, vibration control, smart structure, online learning

Date received: 10 July 2015; accepted: 24 February 2016

College of Aerospace Science and Engineering, National University of Defense Technology, China

Corresponding author:
Rui Xu, College of Aerospace Science and Engineering, National University of Defense Technology, Deya Road 109, Changsha, Hunan 410073, China.
Email:

Large flexible truss structures are widely used in aerospace. Due to limited launch capabilities, such structures need to be lighter and more flexible.1 This type of structure has some special dynamical properties, such as very low stiffness, low damping ratios, and low modal frequencies.2,3 Due to these properties, these structures are extremely susceptible to vibration, and the vibrational energy is hard to dissipate in the vacuum of space.4,5 The vibration of the truss structure will significantly degrade the pointing stability of the spacecraft. Hence, vibration control schemes are urgently needed.

To control the vibration of the truss structure, one promising method is to use the technology of smart structures, which employs embedded actuators and sensors.6 A common way is to use piezoelectric stacks as actuators. Since piezoelectric materials have the advantages of being lightweight, easy to implement, and having high bandwidths,7 there are various control methodologies that can be used to solve this problem, such as linear-quadratic-Gaussian (LQG) control,8,9 positive space control,10 robust control,11 sliding mode control,12 and so on. The above-mentioned control algorithms can provide adequate control performance for a broad class of problems, but they are sensitive to parameter uncertainty or unmodeled dynamics of the plant. An alternative is to use fuzzy control (FC) algorithms, which do not require the mathematical model of the plant. Compared to non-fuzzy control methods, the advantages of fuzzy control methods are: (1) human experience can be easily used in the design of the FC controller; (2) FC is a natural way of expressing uncertain information, i.e. it can easily deal with system uncertainty; (3) FC allows integrated control schemes such as learning algorithms; and (4) FC has universal approximation capabilities, and it can easily deal with system nonlinearities.

The application of fuzzy theory to vibration control was first proposed by Tsoukkas.13 Nowadays, fuzzy control schemes are widely used in vibration control problems. Lin et al.14 presented neuro-fuzzy control of a smart piezoelectric rotating truss structure. Nasser et al.15 proposed a nonlinear fuzzy controller for active vibration control of composite structures. Nikdel et al.16 designed a Takagi-Sugeno (T-S) fuzzy approach for control of a flexible joint robot. Qiu et al.17 proposed a T-S fuzzy controller to control a two-connected piezoelectric flexible plate.

A particle-swarm method has also been used to optimize the fuzzy controller for active vibration control of a smart composite beam. In this paper, an online learning fuzzy control (OLFC) algorithm is proposed. This algorithm needs only a little information about the plant to design the reward function.

Electro-mechanical coupled model of the actuator

The piezoelectric stack actuator is considered in this paper. The actuator is connected to the truss and forms one rod element connecting two nodes of the structure. The simplified model of the actuator is shown in Figure 2(b). It consists of a connecting rod, a preloading spring, a piezoelectric stack, a spherical hinge, and a sleeve. The actuator is modeled as a super element with three nodes (i, j, k), and the super element contains a spring element. Assume that all the piezoelectric patches of the piezoelectric stack have the same physical parameters. Figure 2(b) shows the potential distribution of the piezoelectric stack when a voltage $\varphi$ is loaded on it. As described in previous papers, the dynamic equations of the actuator can be written as

$$M^{e}_{uu}\ddot{\delta}^{e} + K^{e}_{uu}\delta^{e} + K^{e}_{u\varphi}\varphi^{e} = F^{e} \tag{1}$$

$$K^{eT}_{u\varphi}\delta^{e} + K^{e}_{\varphi\varphi}\varphi^{e} = Q^{e} \tag{2}$$

where $\delta^{e}$ and $F^{e}$ are the displacement vector and force vector. $M^{e}_{uu}$, $K^{e}_{uu}$, $K^{e}_{u\varphi}$, and $K^{e}_{\varphi\varphi}$ are the mass matrix, elastic stiffness matrix, piezoelectric coupling matrix, and dielectric stiffness matrix of the actuator, respectively. $\varphi^{e}$ and $Q^{e}$ are the voltage and applied charge.

Take the truss structure as an example. On the tip of the structure, a displacement transducer is placed to measure the tip displacement. The tip velocity can also be obtained by differentiating the displacement. By using the finite element method, the dynamic equation of the smart truss structure can be written as

$$M\ddot{\delta} + C\dot{\delta} + K\delta + K_{u\varphi}\varphi = F_q \tag{3}$$

$$K_{\varphi u}\delta + K_{\varphi\varphi}\varphi = Q \tag{4}$$

where $M$, $C$, and $K$ are the mass, structural damping, and stiffness matrix, respectively. $K_{u\varphi}$ and $K_{\varphi\varphi}$ are the piezoelectric-mechanical coupling matrix and dielectric stiffness matrix, respectively, and $K_{\varphi u} = K_{u\varphi}^{T}$. $\delta$, $\dot{\delta}$, and $\ddot{\delta}$ are the displacement, velocity, and acceleration vector, respectively. $F_q$ is the mechanical force, $Q$ is the applied charge, and $\varphi$ is the control voltage. The damping matrix is calculated by the Rayleigh damping model as follows

$$C = \alpha M + \beta K \tag{5}$$

where $\alpha$ and $\beta$ are the constant damping coefficients. According to equation (4), we get

$$\varphi = K_{\varphi\varphi}^{-1}\left(Q - K_{\varphi u}\delta\right) \tag{6}$$

Substituting equation (6) into (3), we get

$$M\ddot{\delta} + C\dot{\delta} + \bar{K}\delta = F_q + F_\varphi \tag{7}$$

where $\bar{K} = K - K_{u\varphi}K_{\varphi\varphi}^{-1}K_{\varphi u}$ and $F_\varphi = -K_{u\varphi}K_{\varphi\varphi}^{-1}Q = U\varphi$. Here $U$ is the input matrix that maps the control voltage to the equivalent actuation force; it depends on $K_{u\varphi}$, $K_{\varphi\varphi}^{-1}$, the stack length $l_{pzt}$, the patch number $n_{pzt}$, and the dielectric constant $\varepsilon_{33}$.

Control system description

For control synthesis, the system must be written as a first-order ordinary differential equation. We introduce a state vector $x = [\delta^{T}\ \dot{\delta}^{T}]^{T}$ and an output vector $y = [u\ \dot{u}]^{T}$, where $u$ and $\dot{u}$ are the tip displacement and velocity of the truss, and then we can obtain

$$\dot{x} = Ax + B\varphi, \qquad y = Dx \tag{8}$$

where

$$A = \begin{bmatrix} 0 & I \\ -M^{-1}\bar{K} & -M^{-1}C \end{bmatrix} \tag{9}$$

$$B = \begin{bmatrix} 0 \\ M^{-1}U \end{bmatrix} \tag{10}$$

$$D = \begin{bmatrix} 0 & \cdots & 1 & \cdots & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & \cdots & 0 & \cdots & 0 & 0 & \cdots & 1 & \cdots & 0 \end{bmatrix} \tag{11}$$

with the unit entries of $D$ located at the positions of the tip displacement and the tip velocity in $x$.

Online learning fuzzy control algorithm

The schematic diagram of the proposed OLFC control system is shown in Figure 3. The OLFC is composed of two parts: the fuzzy inference system and the reinforcement learning algorithm. The reinforcement learning algorithm is composed of a reward function, a Q table, a P table, and a rule base generator.

Figure 3. Schematic diagram of the control system.

Fuzzy inference system

The fuzzy controller is a Mamdani-type one. It receives the tip displacement ($u$) and tip velocity ($\dot{u}$) as input, and gives the control voltage ($V$) as output. For convenience of design and realization, we use constant coefficients to scale the values

$$E = k_E \cdot u \tag{12}$$

$$EC = k_{EC} \cdot \dot{u} \tag{13}$$

$$V = k_U \cdot U \tag{14}$$

where $k_E$, $k_{EC}$, and $k_U$ are constant coefficients. In this paper, the fuzzy domains of the input ($E$ and $EC$) and output ($U$) variables all belong to $[-1, 1]$.

The fuzzy inference system is composed of four principal elements: fuzzifier, rule base, fuzzy inference engine and defuzzifier. The input and output variables are usually quantized into fuzzy sets defined by linguistic labels. Hence, both the input $x_i = \{u, \dot{u}\}$ and output $y = \{V\}$ are quantized into five fuzzy sets, namely: NL – negative large, NS – negative small, ZO – zero, PS – positive small, PL – positive large. Sigmoidal membership functions and symmetric Gaussian membership functions are used for the input fuzzy sets $A_i^l$, displayed in Figure 4(a). Triangular membership functions are used for the output fuzzy sets $B^l$, presented in Figure 4(b). For example, for the negative-large set (NL, labelled NB in Figure 4), the membership function is defined as

$$\mu_{A_i^l}(x_i) = \begin{cases} 1, & x_i \le -0.8 \\ 1 - 12.5\,(x_i + 0.8)^2, & -0.8 \le x_i \le -0.6 \\ 12.5\,(x_i + 0.4)^2, & -0.6 \le x_i \le -0.4 \\ 0, & x_i \ge -0.4 \end{cases} \tag{15}$$
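As a numerical illustration of the control system description above, the following minimal sketch assembles the matrices of equations (8) to (11) from given finite element matrices, using the condensed stiffness of equation (7) and the Rayleigh damping of equation (5). The function name `condensed_state_space`, the 2-DOF toy matrices, and the values of `alpha`, `beta`, and `U` are hypothetical and serve only to make the example runnable; they are not taken from the truss model of this paper.

```python
import numpy as np

def condensed_state_space(M, K, K_uphi, K_phiphi, U, alpha, beta, tip_dof):
    """Assemble A, B, D of equations (8)-(11) (illustrative sketch)."""
    n = M.shape[0]
    # Condensed stiffness of equation (7): K_bar = K - K_uphi K_phiphi^-1 K_phiu
    K_bar = K - K_uphi @ np.linalg.solve(K_phiphi, K_uphi.T)
    # Rayleigh damping, equation (5)
    C = alpha * M + beta * K
    # State matrix, equation (9)
    A = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-np.linalg.solve(M, K_bar), -np.linalg.solve(M, C)],
    ])
    # Input matrix, equation (10)
    B = np.vstack([np.zeros((n, U.shape[1])), np.linalg.solve(M, U)])
    # Output matrix, equation (11): select tip displacement and tip velocity
    D = np.zeros((2, 2 * n))
    D[0, tip_dof] = 1.0
    D[1, n + tip_dof] = 1.0
    return A, B, D

# Hypothetical 2-DOF system with one electrical degree of freedom
M = np.diag([1.0, 2.0])
K = np.array([[4.0, -2.0], [-2.0, 2.0]])
K_uphi = np.array([[0.5], [0.0]])
K_phiphi = np.array([[2.0]])
U = np.array([[1.0], [0.0]])

A, B, D = condensed_state_space(M, K, K_uphi, K_phiphi, U,
                                alpha=0.1, beta=0.01, tip_dof=1)
poles = np.linalg.eigvals(A)
```

With positive Rayleigh coefficients and a positive definite condensed stiffness, all entries of `poles` have negative real parts, which is a quick sanity check on the assembled model.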

Proc IMechE Part G: J Aerospace Engineering 0(0)

Figure 4. Membership function plots of the input and output power set. (a) Membership function of input power set (E and EC). (b) Membership function of output power set (V).

Figure 5. Initial displacement field applied to the truss structure.

Then there are 25 rules that can be used in the fuzzy system, which are described as follows:

R(1): IF $u$ is NL and $\dot{u}$ is NL THEN $y$ is $B^{1}$
R(2): IF $u$ is NL and $\dot{u}$ is NS THEN $y$ is $B^{2}$
...
R(25): IF $u$ is PL and $\dot{u}$ is PL THEN $y$ is $B^{25}$

Then the fuzzy system can be represented as a piecewise interpolation function

$$y = f(x) = \frac{\sum_{l=1}^{25} \bar{y}^{l} \left( \prod_{i=1}^{2} \mu_{A_i^l}(x_i) \right)}{\sum_{l=1}^{25} \left( \prod_{i=1}^{2} \mu_{A_i^l}(x_i) \right)} \tag{16}$$

where $\mu_{A_i^l}(x_i)$ is the membership of the resulting conclusion set, $\mu^{l}(x) = \prod_{i=1}^{n} \mu_{A_i^l}(x_i)$ is the degree of membership of the $l$th rule, and $\bar{y}^{l}$ is the center of the output membership function $B^{l}$ of the $l$th rule. For the membership functions defined in this paper, the value of $\bar{y}^{l}$ can be calculated as follows

$$\bar{y}^{l} = \begin{cases} -0.9 & \text{if } B^{l} = \text{NL} \\ -0.4 & \text{if } B^{l} = \text{NS} \\ 0 & \text{if } B^{l} = \text{ZO} \\ 0.4 & \text{if } B^{l} = \text{PS} \\ 0.9 & \text{if } B^{l} = \text{PL} \end{cases} \tag{17}$$

As $\sum_{l=1}^{25} \prod_{i=1}^{2} \mu_{A_i^l}(x_i) = 1$, let $\theta = [\bar{y}^{1}, \ldots, \bar{y}^{M}]^{T}$ and $\mu(x) = [\mu^{1}(x), \ldots, \mu^{M}(x)]^{T}$; the fuzzy system can then be simplified as

$$y = \sum_{l=1}^{25} \bar{y}^{l} \mu^{l}(x) = \theta^{T}\mu(x) \tag{18}$$

In order to determine the output action value, we use the reinforcement learning method to learn it online. Each rule defines a state–action pair, and there are five choices (NL, NS, ZO, PS, PL) for the output action value $B^{l}$.

Reinforcement learning

The idea of reinforcement learning comes from the animal psychology of learning. It reinforces an action which has a positive reward, and weakens an action with a negative reward. Reinforcement learning attempts to learn the optimal strategy through trial-and-error interactions with an environment. Q learning is a model-free reinforcement learning algorithm which has been widely used.22 It is based on Finite Markov Decision Processes (FMDPs), and utilizes the perceived state $s$, the taken action $a$ and the received reward $R$ to update the values of a table denoted as $Q(s, a)$ (Q Table). It is updated as

$$Q_t(s, a) = (1 - \eta)\,Q_{t-1}(s, a) + \mu^{l}(x)\,R_t \tag{19}$$

where $\eta$ is the learning rate and $R_t$ is the reward function.

The purpose of vibration control is to dissipate the vibration energy of the system. The vibrational energy of the system can be approximately expressed as

$$E_t = K u^{2} + \frac{1}{2} M \dot{u}^{2} \tag{20}$$

where $K$ and $M$ are the equivalent stiffness and mass of the structure, respectively.
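The fuzzy inference of equations (16) to (18) and the Q Table update of equation (19) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: only the negative-large membership function is given explicitly in the text, so the Gaussian centers (−0.4, 0, 0.4), the width `sigma`, the mirrored positive-large set, and the choice to update all 25 rules at every step are assumptions. The function names are hypothetical.

```python
import numpy as np

# Output centers of equation (17), ordered NL, NS, ZO, PS, PL
Y_CENTER = np.array([-0.9, -0.4, 0.0, 0.4, 0.9])

def zmf(x, a, b):
    """Z-shaped membership: 1 below a, 0 above b (a=-0.8, b=-0.4 for NL)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    mid = 0.5 * (a + b)
    if x <= mid:
        return 1.0 - 2.0 * ((x - a) / (b - a)) ** 2
    return 2.0 * ((x - b) / (b - a)) ** 2

def memberships(x, sigma=0.17):
    """Memberships of a scaled input in [-1, 1] to NL, NS, ZO, PS, PL.
    Gaussian centers and sigma are assumed values."""
    centers = np.array([-0.4, 0.0, 0.4])
    gauss = np.exp(-0.5 * ((x - centers) / sigma) ** 2)
    return np.array([zmf(x, -0.8, -0.4), *gauss, zmf(-x, -0.8, -0.4)])

def firing_strengths(e, ec):
    """Normalized degree of membership of the 25 rules.
    Rule index l = 5 * i + j for input set i of E and set j of EC."""
    mu = np.outer(memberships(e), memberships(ec)).ravel()
    return mu / mu.sum()

def fuzzy_output(e, ec, rule_actions):
    """Equation (18): weighted sum of the output centers chosen by each rule."""
    return float(firing_strengths(e, ec) @ Y_CENTER[rule_actions])

def q_update(Q, mu, rule_actions, reward, lr=0.1):
    """Equation (19), applied to the (rule, action) pair of every rule."""
    idx = np.arange(Q.shape[0])
    Q[idx, rule_actions] = (1.0 - lr) * Q[idx, rule_actions] + mu * reward
    return Q

# Example: a rule base whose 25 rules all select PL
rule_actions = np.full(25, 4)
y = fuzzy_output(0.3, -0.2, rule_actions)
Q = q_update(np.zeros((25, 5)), firing_strengths(0.3, -0.2), rule_actions, reward=1.0)
```

Because the firing strengths are normalized, a rule base that always selects the same label returns exactly the center of that label, which gives a simple check of the interpolation.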

Then the reward function can be defined as the reduction rate of the system energy

$$R_t = -\frac{d}{dt}E_t = -2Ku\dot{u} - M\dot{u}\ddot{u} \tag{21}$$

In each time step, $(u, \dot{u})$ is the state $s_t$, and $V$ is the action $a$. To generate the rule base, we use the Boltzmann distribution model. Denote a table $P(s, a)$ (P Table) as the probability of action $a$ while the state is $s$; it is updated as follows

$$P_t(s, a) = \frac{\exp\left(Q_t(s, a)/T_t\right)}{\sum_{a'} \exp\left(Q_t(s, a')/T_t\right)} \tag{22}$$

where the sum runs over the five candidate actions of state $s$, and $T_t$ is the temperature at time step $t$. The temperature is decreased with time, and is defined as

$$T_t = \frac{T_{t-1}}{1 + \lambda T_{t-1}} \tag{23}$$

where $\lambda$ is a positive constant value. From equation (22), it can be noted that when $T$ is decreasing, the probability of choosing the action with the highest Q value is increasing, so that the more the algorithm learns, the more mature its behavior is.

The OLFC algorithm is formulated as follows:

1. Initialize $Q_0(s, a)$ and rewards $R_0$ for any $s$ and $a$; initialize $\eta$, $K$, $M$ and $T$.
2. Repeat:
   (a) Generate rule base according to P Table.
   (b) Execute the action according to fuzzy inferences.
   (c) Calculate the reward $R_t$ using equation (21).
   (d) Update the Q Table using equation (19).
   (e) Update the temperature $T$ using equation (23).
   (f) Update the P Table using equation (22).

Numeric example

Simulation parameters

To demonstrate the feasibility of the proposed control method, a numerical example concerning the vibration control of the truss structure is presented here. The truss is made of aluminum hollow rod. The parameters of the truss structure are shown in Table 1. The actuators are placed at the heel of the truss. The parameters of the actuator are shown in Table 2.

Table 1. Parameters of truss structure.

Item                        Value
Size (m)                    0.4 × 0.4 × 4
Rod number                  135
Elastic modulus (GPa)       72.7
Density (kg/m3)             2700
Inside diameter (m)         0.004
Outside diameter (m)        0.006
Connector weight (kg)       0.2

Table 2. Parameters of actuator.

Item                                                      Value
Density (kg/m3)                                           8000
Equivalent cross area (m2)                                6.283 × 10^-5
Elastic modulus (Pa)                                      2.1 × 10^11
Piezoelectric stack density (kg/m3)                       7600
Piezoelectric stack equivalent cross area (m2)            7.07 × 10^-4
Piezoelectric stack length (m)                            0.16
Piezoelectric elastic modulus (GPa)                       63
Equivalent piezoelectric constant d33 / sE33 / ε33        4.5 × 10^-10 (C/N) / 2530 / 0.72
Piezoelectric patch number                                140

In order to verify the robustness of the present control algorithm, we change the weight of the tip mass to change the dynamic characteristic of the truss structure. Two simulation cases are studied here: for cases 1 and 2, the weight of the tip mass Mtip is 1 kg and 10 kg, respectively. For these two cases, the first mode frequencies are 0.23 Hz and 0.18 Hz, respectively.

Simulation results

Consider an initial displacement field applied to the truss structure, obtained by a mechanical force applied at the free tip that induces a tip displacement equal to 0.1 m, as shown in Figure 5. The displacement response, power spectrum response, and system energy of the uncontrolled, FC-controlled and OLFC-controlled system for the two cases are compared in Figures 6–8, respectively. The control voltage is compared in Figure 9. It can be seen that the proposed control algorithm can significantly suppress the vibration of the truss structure in both cases.

The rule bases generated by the algorithm at the last step for case 1 and case 2 are shown in Tables 3 and 4, respectively. From Table 3, it can be seen that the action value for states (NL, NL), (NL, PL), (PL, NL), and (PL, PL) is much different from its neighbors, because these states have not appeared in this simulation case. Figure 10 shows the learning process of the action for state (PL, ZO). For the first 3 s, the five actions have almost the same probability of being chosen. As the learning time increases, the probability of action NL increases fast. After 30 s, the probability of action NL is almost equal to 1. That means that the learning algorithm is already mature. The P Table in the last step is visualized in Figure 11, where the size of the sphere stands for the value of $P(s, a)$.
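The reward, P Table, and temperature updates used in steps (c), (e), and (f) of the OLFC algorithm (equations (21) to (23)) can be sketched as below. This is a minimal sketch, assuming a 25 × 5 Q Table with one row per rule; the function names, the constant `lam`, and the initial temperature are hypothetical values chosen for illustration. Subtracting the row maximum before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged.

```python
import numpy as np

def reward(u, du, ddu, K, M):
    """Equation (21): negative time derivative of E_t = K*u**2 + 0.5*M*du**2."""
    return -(2.0 * K * u * du + M * du * ddu)

def boltzmann_table(Q, T):
    """Equation (22): Boltzmann probabilities over the actions of each state."""
    z = (Q - Q.max(axis=1, keepdims=True)) / T  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def cool(T, lam):
    """Equation (23): temperature schedule."""
    return T / (1.0 + lam * T)

def generate_rule_base(P, rng):
    """Step (a): sample one output label index per rule from the P Table."""
    return np.array([rng.choice(P.shape[1], p=row) for row in P])

rng = np.random.default_rng(0)
Q = np.zeros((25, 5))
T = 1.0
P_initial = boltzmann_table(Q, T)   # all actions equally likely

Q[0, 0] = 1.0                       # suppose action NL was rewarded in rule 0
for _ in range(200):
    T = cool(T, lam=0.5)
P_late = boltzmann_table(Q, T)      # low temperature: nearly greedy
rules = generate_rule_base(P_late, rng)
```

At the initial temperature every action has probability 0.2, while after cooling the rewarded action dominates its row, which mirrors the behavior described for Figure 10.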

Figure 6. Displacement response of the controlled and uncontrolled system (tip deflection in m versus time in s; uncontrolled, FC, OLFC). (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

Figure 7. Amplitude spectrum response of the controlled and uncontrolled system (|Y(f)| versus frequency in Hz; uncontrolled, FC, OLFC). (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

Figure 8. System energy of the controlled and uncontrolled system (system vibrational energy versus t in s; uncontrolled, FC, OLFC). (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

Figure 9. Control voltage comparison of the FC and OLFC (control voltage in V versus time in s). (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

It can also be seen in Figure 11 that the probability of each action choice for the states that have not appeared is the same, which means that the actions for these states are taken at random.

Figure 10. Probability variation of different actions for state (PL, ZO) (probability versus time in s; actions NL, NS, ZO, PS, PL). (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

Figure 11. Visualization of P Table. (a) Case 1: Mtip = 1 kg (b) Case 2: Mtip = 10 kg.

Table 3. Fuzzy IF-THEN rule base (Case 1: Mtip = 1 kg). Entries are the output V; rows are u, columns are u̇.

u \ u̇    NL    NS    ZO    PS    PL
NL        NL    PL    PL    PL    NL
NS        PL    PL    PL    PL    PL
ZO        ZO    PL    ZO    NS    PL
PS        NL    NL    NL    NL    NS
PL        PL    NS    NL    NS    PL

Table 4. Fuzzy IF-THEN rule base (Case 2: Mtip = 10 kg). Entries are the output V; rows are u, columns are u̇.

u \ u̇    NL    NS    ZO    PS    PL
NL        NS    PS    NL    ZO    PS
NS        PL    PL    PL    PL    PL
ZO        NS    NS    ZO    PS    PL
PS        NL    NL    NL    NL    NS
PL        PL    NL    NL    NL    NL

Experiment validation

Introduction of experiment setup

In order to verify the effectiveness of the proposed control algorithm, experimental research was conducted.

Table 5. Parameters of the P-216.90 actuator.

Item                              Value
Mass (with cable)                 480 g
Operating voltage                 0–1000 V
Closed-loop travel                180 µm
Static large-signal stiffness     32 N/µm
Push/pull force limit             4500/500 N
Electrical capacitance            1500 nF

The schematic diagram and photograph of the experiment system are shown in Figures 12 and 13, respectively. The experiment system consists of four subsystems: the truss structure subsystem, the actuator and amplifier subsystem, the sensor and decoder subsystem, and the control and monitor subsystem. The truss structure subsystem is made of aluminum hollow rod. The inside and outside diameters of the rod are 0.008 m and 0.012 m, respectively. The density and elastic modulus of the truss structure are 2700 kg/m3 and 72.7 GPa, respectively. The actuator and amplifier subsystem is made of two active rods and a PI E-507 voltage amplifier. The two active rods are placed at the heel of the truss, and they are built from P-216.90 power piezo actuators. The parameters are shown in Table 5. The sensor and decoder subsystem is made of an LK400 laser displacement sensor and its decoder. The control and monitor subsystem consists of a dSPACE DS1005 controller and a computer.

Figure 12. Schematic diagram of the experiment system (laser displacement sensor, decoder, 24 V power supply, I/O panel, dSPACE DS1005, voltage amplifier, active rods, monitor).

Figure 13. Photograph of the experiment system.

Vibration control results

Since the first-order frequency of the truss structure is 6.37 Hz, an excitation voltage $\varphi = 500\sin(3 \times 2\pi t) + 500$ was first loaded on the PZT patches to excite the first bending mode of the truss structure. When the vibration of the truss structure reached the maximum amplitude after 6.5 s of excitation, the excitation voltage was deactivated. The tip displacement response and control voltage are displayed in Figure 14(a) and (b), respectively.

Figure 15 shows the comparison of the tip displacement response of the uncontrolled and controlled system. It can be seen that the 5% settling time is 1.56 s, 0.96 s, and 0.62 s for the uncontrolled, FC and OLFC system, respectively. The settling time reduction rates of the FC and OLFC systems relative to the uncontrolled system are 38.4% and 60.3%, respectively.
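The 5% settling time and the reduction rates quoted above can be evaluated from a measured tip displacement record along the following lines. This is a hedged sketch: the function `settling_time`, the definition of the band relative to the peak amplitude, and the synthetic decaying signal (with an assumed decay rate of 3 s^-1) are illustrative assumptions, since the text does not state how the settling time was extracted. Only the three reported settling times and the 6.37 Hz frequency are taken from the text.

```python
import numpy as np

def settling_time(t, y, band=0.05):
    """First sample time after which |y| stays within band * max|y|."""
    mag = np.abs(np.asarray(y, dtype=float))
    outside = np.nonzero(mag > band * mag.max())[0]
    if outside.size == 0:
        return float(t[0])
    return float(t[min(outside[-1] + 1, len(t) - 1)])

def reduction_rate(t_uncontrolled, t_controlled):
    """Relative decrease of the settling time with respect to the uncontrolled case."""
    return (t_uncontrolled - t_controlled) / t_uncontrolled

# Synthetic free decay at 6.37 Hz with an assumed decay rate of 3 1/s
t = np.arange(0.0, 3.0, 0.001)
y = np.exp(-3.0 * t) * np.cos(2.0 * np.pi * 6.37 * t)
ts = settling_time(t, y)

# Reduction rates from the settling times reported in the text
rate_fc = reduction_rate(1.56, 0.96)
rate_olfc = reduction_rate(1.56, 0.62)
```

With the reported settling times, `rate_fc` and `rate_olfc` evaluate to roughly 0.38 and 0.60, in line with the percentages given above.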

The comparison of the system vibrational energy of the uncontrolled, FC and OLFC systems is shown in Figure 16. Table 6 compares the damping ratio of the uncontrolled, FC and OLFC system. It can be seen that the average damping ratio of the uncontrolled, FC and OLFC system is 3.39%, 8.06% and 13.74%, respectively.

Figure 14. Tip displacement response and control voltage. (a) Tip displacement and (b) control voltage.

Figure 15. Displacement response of the controlled and uncontrolled experiment system.

Figure 16. System energy of the controlled and uncontrolled experiment system.

Table 6. Damping ratio of the controlled and uncontrolled system (columns: number of periods, uncontrolled, FC, OLFC; rows: periods 1 to 5 and the average).

Conclusion

This paper addresses the active vibration control of smart truss structures. The electro-mechanical coupled dynamic model of the smart structure is constructed. An OLFC algorithm is designed to control the structure vibrations. Numerical simulations and experimental verification are conducted to verify the proposed method. The conclusions are as follows:

1. The OLFC algorithm does not need a mathematical model or an expert's experience of the plant to design the rule base. It only needs a little information about the plant to design the reward function.
2. The algorithm can learn the rule base online by interaction with the plant. In the beginning, the action of OLFC is chosen randomly. After learning, the algorithm becomes more and more mature.
3. Compared to the non-fuzzy control methods, OLFC can easily deal with system nonlinearities and uncertainty.
4. The numerical and experimental results of truss vibration control demonstrate that OLFC has a better performance than FC.

