adaptive control

© All Rights Reserved


disturbance

Nguyen Thanh Long, Nguyen Van Huong, Dao Phuong Nam, Mai Xuan Sinh, Nguyen Thu Ha

Dao Phuong Nam, Nguyen Thu Ha, Nguyen Van Huong and Mai Xuan Sinh are with Hanoi University of Science and Technology (e-mail: nam.daophuong@hust.edu.vn). Nguyen Thanh Long is with Hung Yen University of Technology and Education.

Abstract: This work presents an adaptive optimal control algorithm based on an integral sliding mode control law for a class of continuous-time systems with input disturbance or uncertain and unknown parameters. The main objective is to find a general form of integral sliding mode control law which assures that the system states are forced to reach a sliding surface in finite time from the initial instant. Then, an adaptive optimal control based on the approximate dynamic programming method is responsible for the robust stability of the closed-loop system. Finally, the theoretical analysis and simulation results demonstrate the performance of the proposed algorithm for a wheel inverted pendulum system.

Keywords: approximate/adaptive dynamic programming (ADP), wheel inverted pendulum (WIP) system, integral sliding mode control (ISMC), robust stability.

I. INTRODUCTION

Sliding mode control (SMC) has been studied since the 1950s (Utkin, 1977, 1992; Pisano and Usai, 2011), and many studies of the sliding mode control method have been carried out in recent years (Man and Yu, 1997; Drakunov, 1992; Ting et al., 2012). The main advantage of sliding mode control is the complete compensation of the so-called matched disturbances (i.e., disturbances acting on the control input channel) when the system is in the sliding phase and a sliding mode is enforced. The latter arises when the state is on a suitable subspace of the state space, called the sliding manifold (or sliding surface).

The integral sliding mode (ISM) technique was first proposed in [4], [8] as a solution to the reaching phase problem for systems with matched disturbances only. In order to avoid the reaching phase and to obtain robustness from the initial time, the concept of integral sliding mode has been introduced [3], [9]. ISM has found many applications in industrial processes such as robots and electromechanical systems [10], [11].

Furthermore, reinforcement learning (Sutton & Barto, 1998) [16] and approximate/adaptive dynamic programming (ADP) (Werbos, 1974) [17] theories have been broadly applied to solving optimal control problems for uncertain systems in recent years. Because disturbances are inevitable in most practical systems, it is necessary to find an optimal control scheme in the presence of disturbances and uncertainties. Based on the redefined infinite horizon cost function and the nominal system, Wang et al. [1] and Liu et al. [2] presented a novel strategy to design the robust controller for a class of nonlinear systems with perturbations, which are bounded by a known state-dependent function. Combining two-player zero-sum differential game theory and ADP, the nonlinear $H_\infty$ control problems were approximately solved in [5]–[7].

In this paper, we propose the combined idea of approximate dynamic programming and integral sliding mode control for the purpose of presenting controllers for unknown systems with disturbance input. Unlike general ISM control, the controller contains a part that is learnt online and guarantees the stability and the nearly optimal performance of the sliding-mode dynamics.

II. PROBLEM STATEMENT

We study a class of continuous-time systems described by:

$$\dot{x} = Ax + B\big(u + f(x,u,t)\big) \tag{1}$$

where $x \in \mathbb{R}^n$ is the measured component of the state available for feedback control and $u \in \mathbb{R}^m$ ($m \le n$) is the input. Suppose that $A \in \mathbb{R}^{n \times n}$ is an unknown constant matrix and $f(x,u,t) \in \mathbb{R}^m$ is the disturbance and/or uncertainty of the system.

Assumption 1: The matrix $B$ has linearly independent columns, i.e. $\operatorname{rank} B = m$.

Assumption 2: There exist a constant $\rho > 0$, a continuous function $\mu(\cdot)$ and a continuous function $\lambda(t)$ such that $0 \le \lambda(t) \le \lambda_{\max} < 1$ for all $t$, and the disturbance and uncertainty of the system satisfy:

$$\|f(x,u,t)\| \le \rho + \mu\big(x(t)\big) + \lambda(t)\,\|u(t)\|$$

Remark 1: The term $\lambda(t)\|u(t)\|$ in the bound on $f(x,u,t)$ is a component that has not been mentioned in previous articles using ADP algorithms. Because of this component, the creation of an ISM controller is needed to solve the problem.

III. PROPOSED CONTROL LAW

We define $B^+$ as the Moore-Penrose pseudoinverse of the matrix $B$. By Assumption 1, $B^+$ can be computed as:

$$B^+ = \left(B^T B\right)^{-1} B^T$$

Assumption 3: There exists a number $\sigma > 0$ such that $\|MA\| \le \sigma$, where $M = B^+ = \left(B^T B\right)^{-1} B^T$.

Define the sliding surface as follows:

$$\left\{ x \in \mathbb{R}^n : s(t) = 0 \right\} \tag{2}$$

$$s(t) = M\,(x - x_0) - \int_0^t v(\tau)\,d\tau \tag{3}$$

with $v$ designed later.

Theorem 1: The control signal $u = v - k(t)\,\dfrac{s}{\|s\|}$, with

$$k(t) = \frac{1}{1 - \lambda(t)}\Big[\rho + \mu(x) + \sigma\|x\| + \lambda(t)\,\|v(t)\| + c\Big] \tag{4}$$

and $c$ a positive constant, guarantees that the system states are forced onto the sliding surface from the initial time.

Proof:
The time derivative of (3) is given by:

$$\dot{s} = \left(B^T B\right)^{-1} B^T \big(Ax + B(u + f)\big) - v(t) = MAx + u + f - v \tag{5}$$

From Assumption 2 and (4), since $\|u\| \le \|v\| + k(t)$, we have:

$$k(t) \ge \|f(x,u,t)\| + \|MA\|\,\|x(t)\| + c \tag{6}$$

We consider a candidate Lyapunov function:

$$V = \frac{1}{2}\, s^T s \tag{7}$$

The derivative of $V$ is computed as:

$$\dot{V} = s^T \dot{s} = s^T\big[MAx + u + f - v\big] \le \|s\|\big(\|MA\|\,\|x\| + \|f\|\big) - k(t)\,\|s\| \tag{8}$$

$$\dot{V} \le -c\,\|s\| \le -c\,V^{1/2}$$

Integrating (8) over the time interval $0 \le \tau \le t$ we obtain:

$$V^{1/2}(t) - V^{1/2}(0) \le -\frac{1}{2}\,c\,t$$

Consequently, $V(t)$ can reach zero in a finite time $t_s$ that is bounded by:

$$t_s \le \frac{2\,V^{1/2}(0)}{c}$$

In view of (3), it is known that $V(t) = 0$ when $t = 0$. Based on the above proof, we derive that for all $t \ge 0$ we have $s(t) = 0$.

Remark 2: It is necessary to ensure that the time of convergence to the sliding surface is finite.

Remark 3: In practice, the discontinuous function $\dfrac{s}{\|s\|}$ may cause the undesirable chattering phenomenon in the SMC action. In order to attenuate this chattering, the discontinuous function can be replaced by a continuous approximation such as $\dfrac{s}{\|s\| + \eta}$, where $\eta$ denotes a positive constant. As $\eta$ gets close to zero, there is almost no performance difference between the approximated control law and the original control law [12].

When $s(t) = \dot{s}(t) = 0$, from (5) we have:

$$u_{eq} + f(x, u_{eq}, t) = -MAx + v$$

The system can be rewritten:

$$\dot{x} = Ax + B\big(-MAx + v\big) = (A - BMA)\,x + Bv$$

$$\dot{x} = \bar{A}x + \bar{B}v \tag{9}$$

where $\bar{A} = A - BMA$; $\bar{B} = B$.

According to Theorem 1, the control $u = v - k(t)\,\dfrac{s}{\|s\|}$ makes the solution $x(t)$ of system (1) equivalent to the solution of (9). The original dynamics are equivalent to the sliding-mode dynamics (9) when the sliding-mode controller $u$ is designed to guarantee the reachability of the sliding manifold. The control objective is to

find an approximate optimal control $v$ such that the sliding-mode dynamics (9) are robustly stable and have nearly optimal performance. The cost function is defined as:

$$J = \int_0^\infty \left(x^T Q x + v^T R v\right) d\tau$$

with $Q$, $R$ symmetric positive definite matrices. We present an ADP method to learn a nearly optimal control $v$ based on the state information of the original system and the equivalent sliding-mode dynamics (9).

Theorem 2: [13] Let $K_0$ be any matrix such that $\bar{A} - \bar{B}K_0$ is Hurwitz, and repeat the following steps for $k = 0, 1, \ldots$

Step 1: Solve for the real symmetric positive definite solution $P_k$ of the Lyapunov equation:

$$A_k^T P_k + P_k A_k + Q + K_k^T R K_k = 0 \tag{10}$$

where $A_k = \bar{A} - \bar{B}K_k$.

Step 2: Update the matrix by:

$$K_{k+1} = R^{-1} \bar{B}^T P_k \tag{11}$$

Then, the following properties hold:

(a) $\bar{A} - \bar{B}K_k$ is Hurwitz
(b) $P^* \le P_{k+1} \le P_k$
(c) $\lim_{k \to \infty} K_k = K^*$; $\lim_{k \to \infty} P_k = P^*$

with $K^* = R^{-1}\bar{B}^T P^*$, where $P^*$ is the unique symmetric positive definite matrix such that:

$$\bar{A}^T P^* + P^* \bar{A} + Q - P^* \bar{B} R^{-1} \bar{B}^T P^* = 0 \tag{12}$$

Then, we find an approximate optimal control policy by using the online measurements of the closed-loop system (9). We consider:

$$V(x) = x^T P_k x \tag{13}$$

So, we have:

$$x(t+T)^T P_k\, x(t+T) - x(t)^T P_k\, x(t) = \int_t^{t+T} \dot{V}\big(x(\tau)\big)\, d\tau \tag{14}$$

where $\dot{V}$ is evaluated along the trajectories of (9). Applying the Kronecker product representation, we obtain:

$$\Phi_k \operatorname{vec}(P_k) = \Psi_k \tag{15}$$

Assumption 4: There exists a number $\delta > 0$ such that $\Phi_k$ has full column rank for all $k \in \mathbb{Z}$, $k \ge \delta$.

By using Assumption 4, $P_k$ can be uniquely determined by:

$$\operatorname{vec}(P_k) = \left(\Phi_k^T \Phi_k\right)^{-1} \Phi_k^T \Psi_k \tag{16}$$

This leads to the following algorithm.

Algorithm 1:
Select $K_0$ such that $\bar{A} - \bar{B}K_0$ is Hurwitz and a threshold $\xi > 0$. Let $k \leftarrow 0$.
Repeat
1. Apply $v = -K_k x + e$ and solve $P_k$ from (16).
2. Update $K_{k+1}$ by using (11).
3. $k \leftarrow k + 1$
Until $\|P_k - P_{k+1}\| < \xi$
$k^* \leftarrow k$
We obtain the approximated optimal control policy:

$$v = -K_{k^*}\, x$$

Remark 4: Choosing the exploration noise $e$ is not a trivial task for general reinforcement learning problems and other related machine learning problems, especially for high-dimensional systems. In solving practical problems, several types of exploration noise have been adopted, such as random noise [14] and exponentially decreasing probing noise [18].

Lemma 1: Under Assumption 4, by using Algorithm 1, we have $\lim_{k \to \infty} K_k = K^*$; $\lim_{k \to \infty} P_k = P^*$.

From (13), (14) one sees that the $P_k$, $K_{k+1}$ obtained from (10), (11) must satisfy the conditions (14), (15). In addition, by Assumption 4, they are uniquely determined by (16). Therefore, $\lim_{k \to \infty} K_k = K^*$ and $\lim_{k \to \infty} P_k = P^*$.

Lemma 2: There exists a sufficiently small constant $\varepsilon > 0$ such that for all symmetric matrices $P > 0$ satisfying $\|P - P^*\| < \varepsilon$, the system (9) can be stabilized by $v = -R^{-1}\bar{B}^T P x$.

Proof:
Because $Q + P^* \bar{B} R^{-1} \bar{B}^T P^* > 0$, there exists $\alpha > 0$ such that $Q + P^* \bar{B} R^{-1} \bar{B}^T P^* > \alpha I$. For any symmetric matrix $P > 0$, let $\tilde{Q}$ be defined by:

$$\bar{A}^T P + P \bar{A} + \tilde{Q} - P \bar{B} R^{-1} \bar{B}^T P = 0$$

Using (12), we have:

$$\tilde{Q} + P \bar{B} R^{-1} \bar{B}^T P = Q + P^* \bar{B} R^{-1} \bar{B}^T P^* + (P^* - P)\bar{A} + \bar{A}^T (P^* - P) + 2\left(P \bar{B} R^{-1} \bar{B}^T P - P^* \bar{B} R^{-1} \bar{B}^T P^*\right)$$

Because of continuity, there exists a sufficiently small constant $\varepsilon > 0$ such that for all symmetric matrices $P > 0$ satisfying $\|P - P^*\| < \varepsilon$ we have:

$$\tilde{Q} + P \bar{B} R^{-1} \bar{B}^T P > Q + P^* \bar{B} R^{-1} \bar{B}^T P^* - \alpha I > 0$$

We consider the Lyapunov function $V = \frac{1}{2} x^T P x$. Along the closed-loop trajectories, $\dot{V} = -\frac{1}{2} x^T\big(\tilde{Q} + P \bar{B} R^{-1} \bar{B}^T P\big) x < 0$ for $x \ne 0$, so the system (9) is globally asymptotically stable.

Remark 5: From Lemma 1 and Lemma 2, it is clear that choosing the threshold $\xi$ in Algorithm 1 small enough guarantees the robust stability of system (9).

Remark 6: Algorithm 1 requires knowledge of the full state $x$ of system (9). When it is not possible to measure the state, since any exploration noise $e$ satisfying the persistence of excitation condition ensures the convergence of the matrices $K$, $P$, we can design an observer and use output-feedback control according to the following algorithm:

Algorithm 2:
Step 1: The control of system (1) is chosen as:

$$u = v_C - k(t)\,\frac{s}{\|s\|}; \qquad s(t) = M_C\,(y - y_0) - \int_0^t v_C(\tau)\, d\tau \tag{17}$$

where $y = Cx$ is the output of (1) and:

$$M_C = \left((CB)^T CB\right)^{-1} (CB)^T$$

Assumption 5: The matrix $CB$ has linearly independent columns.

Similarly to Theorem 1, we must find an approximate optimal control $v_C$ of the system:

$$\dot{x} = A_C x + B_C v_C \tag{18}$$

with $A_C = A - B M_C C A$; $B_C = B$.

Step 2: Choose a Hurwitz matrix $A_L$, with the pair $(A_L, C)$ […], and the observer:

$$\dot{\hat{x}} = A_L \hat{x} + \hat{A}\,\delta(\hat{x}) + B_C v_C + L\,(y - C\hat{x})$$

where $L$ is the observer gain, selected such that $A_C - LC$ is a Hurwitz matrix, $\delta(\hat{x})$ is a vector of activation functions and $\hat{A}$ is updated by the law:

$$\dot{\hat{A}} = -\lambda_1 \left(A_L^{-1}\right)^T C^T (y - C\hat{x})\,\delta(\hat{x})^T - \lambda_2\,\|y - C\hat{x}\|\,\hat{A}$$

We can choose $\lambda_1, \lambda_2 > 0$ to guarantee that the state estimate of the observer converges to the true state (this was shown in [15]).

Step 3: Apply a procedure similar to Algorithm 1:
Select $K_0$ such that $A_C - B_C K_0$ is Hurwitz and a threshold $\xi > 0$. Let $k \leftarrow 0$.
Repeat
1. Apply $v_C = -K_k \hat{x} + e$ and solve $P_k$ from (16).
2. Update $K_{k+1}$ by using (11).
3. $k \leftarrow k + 1$
Until $\|P_k - P_{k+1}\| < \xi$
$k^* \leftarrow k$
We obtain the approximated optimal control policy:

$$v_C = -K_{k^*}\, \hat{x}$$

It is clear that in Algorithm 2 we only use the input and output of system (1) and the state estimate of the observer, without requiring any knowledge of the state of system (1).

IV. SIMULATION RESULTS

In this section, we apply the proposed approximate optimal control based integral sliding mode control law to a wheel inverted pendulum described by (19) and Table 1.

$$
\begin{bmatrix} \dot{x} \\ \ddot{x} \\ \dot{\psi} \\ \ddot{\psi} \\ \dot{\varphi} \\ \ddot{\varphi} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \zeta_1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & \zeta_2 & 0
\end{bmatrix}
\begin{bmatrix} x \\ \dot{x} \\ \psi \\ \dot{\psi} \\ \varphi \\ \dot{\varphi} \end{bmatrix}
+
\begin{bmatrix}
0 & 0 \\
\zeta_3 & \zeta_3 \\
0 & 0 \\
\zeta_4 & -\zeta_4 \\
0 & 0 \\
\zeta_5 & \zeta_5
\end{bmatrix}
\left( \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + f \right)
\tag{19}
$$

$$y = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} x + \begin{bmatrix} 0 & 0 \end{bmatrix} u$$

where the state and input vectors are $\begin{bmatrix} x & \dot{x} & \psi & \dot{\psi} & \varphi & \dot{\varphi} \end{bmatrix}^T$ and $u = \begin{bmatrix} u_1 & u_2 \end{bmatrix}^T$, and:

$$\zeta_1 = \frac{M_b^2 d^2 g R^2}{\left(M_b d^2 + I_x\right)\left(M_b R^2 + 2M_w R^2 + 2I_a\right) - \left(M_b d R\right)^2}$$

$$\zeta_2 = \frac{\left(M_b R^2 + 2M_w R^2 + 2I_a\right) M_b g d}{\left[\left(M_b + 2M_w\right)R^2 + 2I_a\right] I_x + 2M_b d^2\left(M_w R^2 + I_a\right)}$$

$$\zeta_3 = \frac{R\left(M_b d^2 + I_x + M_b d R\right)}{\left(M_b d^2 + I_x\right)\left(M_b R^2 + 2M_w R^2 + 2I_a\right) - \left(M_b d R\right)^2}$$

$$\zeta_4 = \frac{L}{R\left[2\left(M_w + \dfrac{I_a}{R^2}\right)L^2 + I_z\right]}$$

$$\zeta_5 = \frac{M_b R^2 + 2M_w R^2 + 2I_a + M_b d R}{\left[\left(M_b + 2M_w\right)R^2 + 2I_a\right] I_x + 2M_b d^2\left(M_w R^2 + I_a\right)}$$

The disturbance is:

$$f = \begin{bmatrix} 0.01 & 0.01 & 0.01 & 0.15 & 0.1 & 0 \\ 0.02 & 0.01 & 0.03 & 0.01 & 0.02 & 0.01 \end{bmatrix} x \sin t + \frac{u \cos 2t}{5}$$
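As a hedged illustration of how the switching gain (4) of Theorem 1 and the smoothed control of Remark 3 could be evaluated at one time instant, the following minimal Python sketch may help. The function names `switching_gain` and `ism_control`, the argument names, and the default value of `eta` are assumptions made for this example and do not come from the paper; the bound values $\rho$, $\mu(x)$, $\sigma$ and $\lambda(t)$ are supplied by the caller.

```python
import math


def norm(vec):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in vec))


def switching_gain(x, v, rho, mu_x, sigma, lam, c):
    """Gain k(t) from equation (4).

    rho, mu_x, sigma and lam are the bounds of Assumptions 2 and 3
    evaluated at the current time; c is the positive design constant.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lambda(t) must lie in [0, 1)")
    return (rho + mu_x + sigma * norm(x) + lam * norm(v) + c) / (1.0 - lam)


def ism_control(s, v, k, eta=1e-3):
    """Smoothed control u = v - k * s / (||s|| + eta) from Remark 3.

    eta > 0 is the boundary-layer constant (illustrative default);
    eta = 0 recovers the discontinuous law of Theorem 1 when s != 0.
    """
    scale = k / (norm(s) + eta)
    return [vi - scale * si for vi, si in zip(v, s)]
```

For instance, `switching_gain([3.0, 4.0], [0.0, 0.0], 1.0, 0.0, 2.0, 0.5, 1.0)` evaluates (1 + 0 + 2·5 + 0 + 1) / 0.5. A smaller `eta` brings the smoothed law closer to the discontinuous one at the price of more chattering.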

Figures 1 and 2 show the control and state signals of the system when Algorithm 1 is not used. Figures 3 and 4 show the control and state signals of the system when Theorem 1 and Algorithm 1 are used. Figures 5 and 6 show the control and state signals of the system when Algorithm 2 is used. Figures 7 and 8 show the convergence of the matrices K and P of the proposed Algorithm 1 and Algorithm 2, and the tracking errors converge to zero.

Figure 2. The control signal (Algorithm 1 not used)

Figure 3. The state of the system (using Algorithm 1)

Figure 4. The control signal (using Algorithm 1)

Table 1. The parameters and variables of the wheeled inverted pendulum

Parameter                                                  Symbol   Value
Mass of main body                                          M_b      13.3 kg
Diameter of wheel                                          R        0.13 m
Distance between the wheels                                L        0.325 m
Moment of inertia of body with respect to x-axis           I_x      0.1935 kg·m^2
Moment of inertia of body with respect to z-axis           I_z
Moment of inertia of wheel about the centre                I_a      0.1229 kg·m^2
Acceleration due to gravity                                g        9.81 m/s^2
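The model-based policy iteration of Theorem 2, which Algorithm 1 approximates from measured data, can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: it assumes the system matrices are known, uses NumPy, solves the Lyapunov equation (10) through a Kronecker-product vectorisation, and runs on a double integrator chosen purely for illustration rather than on the WIP model. The names `lyap_solve` and `kleinman` are hypothetical.

```python
import numpy as np


def lyap_solve(Ak, Qk):
    """Solve Ak^T P + P Ak + Qk = 0 for P by vectorisation.

    With column-major vec, vec(Ak^T P) = (I kron Ak^T) vec(P) and
    vec(P Ak) = (Ak^T kron I) vec(P).
    """
    n = Ak.shape[0]
    eye = np.eye(n)
    lhs = np.kron(eye, Ak.T) + np.kron(Ak.T, eye)
    p = np.linalg.solve(lhs, -Qk.reshape(-1, order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrise against round-off


def kleinman(A, B, Q, R, K0, tol=1e-10, max_iter=50):
    """Policy iteration of Theorem 2: equations (10) and (11)."""
    K = K0
    P_prev = None
    for _ in range(max_iter):
        Ak = A - B @ K                           # closed-loop matrix A_k
        P = lyap_solve(Ak, Q + K.T @ R @ K)      # step 1, equation (10)
        K = np.linalg.solve(R, B.T @ P)          # step 2, equation (11)
        if P_prev is not None and np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K


# Illustrative system (an assumption, not the WIP model): double integrator.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # A - B K0 has eigenvalues with negative real part

P, K = kleinman(A, B, Q, R, K0)
```

For this example the Riccati equation (12) can be solved by hand, giving $P^* = \begin{bmatrix} \sqrt{3} & 1 \\ 1 & \sqrt{3} \end{bmatrix}$ and $K^* = \begin{bmatrix} 1 & \sqrt{3} \end{bmatrix}$, which the iteration should approach. In Algorithm 1 the Lyapunov solve is replaced by the least-squares estimate (16) built from trajectory data, so that $A$ need not be known.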

Figure 7. The convergence of the matrices in Algorithm 1

Figure 8. The convergence of the matrices in Algorithm 2

V. CONCLUSION

This paper presented an adaptive optimal control algorithm based on an integral sliding mode control law for continuous-time systems with unknown system dynamics and external disturbance. The proposed algorithm establishes the robust stability of the system and a bound on the cost function. The theoretical analysis and simulation results illustrate the effectiveness of the proposed algorithm.

REFERENCES

[1] D. Wang, D. Liu, and H. Li, “Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems,” IEEE Trans. Autom. Sci. Eng., vol. 11, no. 2, pp. 627–632, Apr. 2014.
[2] D. Liu, D. Wang, F.-Y. Wang, H. Li, and X. Yang, “Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems,” IEEE Trans. Cybern., vol. 44, no. 12, pp. 2834–2847, Dec. 2014.
[3] M. Taleb, “High order integral sliding mode control with gain adaptation,” European Control Conference (ECC), July 17-19, 2013.
[4] G. P. Matthews and R. A. DeCarlo, “Decentralized tracking for a class of interconnected nonlinear systems using variable structure control,” Automatica, vol. 24, pp. 187–193, 1988.
[5] H.-N. Wu and B. Luo, “Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control,” IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 12, pp. 1884–1895, Dec. 2012.
[6] H. Zhang, C. Qin, B. Jiang, and Y. Luo, “Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems,” IEEE Trans. Cybern., vol. 44, no. 12, pp. 2706–2718, Dec. 2014.
[7] M. Abu-Khalaf, F. L. Lewis, and J. Huang, “Neuro dynamic programming and zero-sum games for constrained control systems,” IEEE Trans. Neural Netw., vol. 19, no. 7, pp. 1243–1252, Jul. 2008.
[8] V. Utkin and J. Shi, “Integral sliding mode in systems operating under uncertainty conditions,” in Proc. 35th IEEE Conf. Decision Control, Kobe, Japan, pp. 4591–4596, Dec. 1996.
[9] Chieh-Chuan Feng, “Integral Sliding-Based Robust Control,” Recent Advances in Robust Control – Novel Approaches and Design Methods, ISBN: 978-953-307-339-2, InTech, 2011.
[10] J. K. Lin et al., “Integral sliding mode control and its application on Active suspension System,” 4th International Conference on Power Electronics Systems and Applications, The Hong Kong Polytechnic University, Hong Kong, 2011.
[11] V. I. Utkin et al., Sliding Mode Control in Electro Mechanical Systems (second edition), CRC Press, Boca Raton, 2009.
[12] R. A. De Carlo, S. H. Zak, and G. P. Matthews, “Variable structure control of nonlinear multivariable systems: A tutorial,” Proc. IEEE, vol. 76, no. 3, pp. 212–232, Mar. 1988.
[13] D. Kleinman, “On an iterative technique for Riccati equation computations,” IEEE Transactions on Automatic Control, vol. 13, no. 1, pp. 114–115, 1968.
[14] H. Xu, S. Jagannathan, and F. L. Lewis, “Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses,” Automatica, vol. 48, no. 6, pp. 1017–1030, 2012.
[15] F. Abdollahi, H. A. Talebi, and R. V. Patel, “A stable neural network observer with application to flexible-joint manipulators,” in Proc. 9th Int. Conf. Neural Inform. Process., 2006, pp. 1910–1914.
[16] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
[17] P. J. Werbos, “Beyond regression: New tools for prediction and analysis in the behavioural sciences,” Ph.D. thesis, Harvard University, 1974.
[18] K. G. Vamvoudakis and F. L. Lewis, “Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton–Jacobi equations,” Automatica, vol. 47, no. 8, pp. 1556–1569, 2011.
