You are on page 1of 63

| 

   
| 

 
 
  

"#

"

j


  

|



   
 
 



 


|


 

 
  


 
 
 

  !


  





 

 

"#

"

$
 




 


%

&


 

 

'

&

 

  

&
(




 

#
) *



" 
 

 +
, 








-

' 


,

 



 
  
XOR (exclusive OR) problem

-- -
##  -



#- #
-# #
Ô
     
|
.
  
#//
 
 

012




   

 


 





#

#
'


#
· 
·

·  

· 

j


 
  
3
, 

%

X     



 
  




 
 




j 
 
"#
"

  1  

"

°   
Ô 
 


 
 




4 



   


*
   
* #
Ô 
 


 
 



 
  
 

 

  
 


4 



   


*
   
* #
Ô 
 


 
 



 
  
 

 

  
 
 
 

 


4 



   


*
   
* #
Ô 
 


 
 



 
  
 

 

  
 
 
 

 
 1
 

'
 
  

  



) 
 

 

  







 




 

  


4 



   

*
   
* #

1
  



" 

(

 


 
+

' 


 

#

 



   
 "


   
     
'






 


' 




(

  
  




'




 "

  

   
  *
  


5  
6 "




  



  
 


   
 
 


 

 


" 





!


 





 
  


  

 



  





   7

!


 
 



 
  



  
* "#
8  8
` 8

8  8
"
(


  7 
 
 










 



  
8
8

 


 


  


 


 

 
  
8

"#
" +

þ          


 
   

      



     
       
 
  
'redit assignment problem

 


9 -

9-

  


  

 
  
 


 




 
 
 !



 
 


 

 !


 


 
  &

6 

 

 




 


  


  

 

 

(
  



 


8  

 



*

ð   
ð   
þackpropagation learning algorithm µþ 

, 





|

È  °      

þ has two phases*

   


9  
-!
  

 

 
 

  

 
þackpropagation learning algorithm µþ 

, 





| &
È  °  
     
  
 
 


:




  

þ has two phases*

   


9  
-!
  

 

 
 

  
 


   
9 
-!


 

 
    
 
 

  

 


 


  

  

 
  
 

j
 
"# 1  

#

;

"
#

 
"
1  
8






8

"
 
8


#


8

 
8


K  
  

 
 
  


Õ           





   

       Õ   





    

7


  !


  
  




" 


orward pass

(
 
"
 
  

  




 

¢ 'ompute values for hidden units

 ?@ *   ? @  ? @ 8


 *   ? @@ ;8

2 compute values for output units 8

 ?@ *  ? @  
 "
 *   ? @@
þackward ass
(


 

)  
 
 &


 
 
 


 *
#
   *          

  *#
 
 

 
 


&
(









 

 
4&
Use gradient descent 

ð   
   #       
ð    

both for hidden units and output units


j
 
   


 

 


 





 
 

ð    ð    ð   
* 
ð    ð    ð   

both for hidden units and output units

j 
6 
 

 



  




 

 






 


8



  


j 
$ 



j 
$
 *

ð   ð   
*    * Õ   
ð    ð    

j 
6



ð   ð  
   *  !   * 
ð   ð   
 
 &
'
  



*

  


  
 *

ð    ð   
`   * * <     
ð    ð  
`   * <            



 



*

ð   ð   ð    
   * 
ð  
* 
 ð     ð  
   *         

$  


 8
` `8

(
 




 


 

9 -

9-





-
8
8
`8
'
6$
 

ð   
*      
ð   
ð   
* `   Õ   
ð   
,

  
 
 

4
 




   „    

   „ `  Õ 

( 
„ 

 

 
-
=
„ = #
(ummary

(

 
 

  #     *      
   #      *     Õ   
  

   #       *      Õ    
*                Õ    



  #      *      
*                


| 
   

: 

|

jopic

,  

$
 
 
 
: 

$
 
2  ;
6 
) 

#&
6 

 
 

 

  !



&
4  
`


  

*
    *              

 

   
 
 
'&
$   

`


 
 
 

 
*

   *     

     

%&
4  

*
  # *         
   # *         Õ   
1 


 
 


!

 







  


 &
6
" *

"# ##
#
##
# #
#
- #
#
#
-
#
-
" 
# 

#
#-
#
-
#

5

  
  




6



#&
(

 


 &

 

„
-&#

x"¢# 0 ##
#
##
# #
#
- #
#
#
-
#
-
x"2 ¢ 
# 

#

 
 
>-
#?

 
>#
-?&

  
&
' 
# 
  *


# u¢  ¢
"# ## ##
# #
#
- #
#
#
-
#
-
" 
# 

#
u2  2

#

#"-

-"#
#

#



-"-

#"#
#


' 
 

  


  

  

  

Õ¢  ¢
"# ##
#
##
# #
#
- #
#
#
-
#
-
" 
# 

#
Õ2  2

;#
 #

#
;
 



' 
 
  

 

  
  *

"# ##
#
##
# y¢ 2
#
- #
#
#
-
#
-
" 
# y2 2

#

#
#

#"#

-"
#





#"#

#"
#


$  
*
   #       *      Õ    
*                Õ    

"# ##
#
##
#
`¢ -¢
#
- #
#
#
-
#
-
" 
# `2 -2

#

j 
>#!
-?

#

#


-
,*
#
#
 # 
#
@ 

#


  
-
@ 


' 



# 
 
   
 *


# Õ¢  ¢
"# ## ##
# `¢ Õ¢ -¢

-
# #
# `¢ Õ2 -2
#
-
#
-
"
# `2 Õ¢ -2


# `2 Õ2 -4
Õ2  2

   #       *      Õ    
(


*

   #       *      Õ    

"# ##
#
w¢¢ 0 
#
- w2¢ -¢ 2
#
-
w¢2 -0 2
" 
#
w22 0 
$ 
 
 
 
°-*

   * <    

`    


#
"# ## `¢ w¢¢ -¢ `¢ -¢
#
-
`2 w2¢ 2

-
# `¢ w¢2 0
" 
# `2 -2
`2 w22 -2
`-
 
 *

   * <    
`    

##
# ° ¢ ¢
"#
`¢ -¢
#
-
#
-
" 
# `2 -2

°2  -2

°#
 #



#
°
-
@ 


6
 
  

 *

  #     * „     

x¢ 0 ##
# °¢ x ¢  0
`¢ -¢
#
- ¢ x 2  ¢
#
-
2 x ¢  0
x 2 ¢ 
# `2 -2

°2 x2  -2


*

  #     * „     

"#
- v¢¢ -¢
w¢¢ 0 
v2¢ 0 w2¢ -¢ 2
v¢2 0 ¢
w¢2 -0 2
"
# v22 0 
w22 0 





  


; 
 
 

 




  


 
(
 







  

 
 



 
  *

  #     * „     


# Õ¢  ¢ 2
"#
- ## ##
-&/
#
- #
#&
#
-&#
#
-&
"
# 
-&A

-&
Õ2  ¢ 


  

 
 



 
  *

  #     * „     

"#
- ##
# y¢  ¢ 
##
-&/
#
- #
#&
#
-&#
#
-&
"
# 
-&A

-&
y2  0 2

1  



 
 
>#!
-?
Uctivation unctions



  
  
 

+
`   *            
   *            


   
( *      *

 


 

   

  
  

 

   

  
  

 



 
,
 
   


|
# #
     * *
# "       #    

 



 

&
j


  
 

 




-

#&

6  






 



 
 


@#

#&

†
 

  

    
 
 
*




-!



-&
:   


  


 "     
<     * 
*     >#    ?
>#  "     ?
  *

  *    


  *

<     *   #  

:   


  

"



-&!

 
 





; 


  
" 
 
, 
 




  

   


  
  !

    *          <     
   * <      `     






 



  
 
  


-

 


" &
j



saturating 
 



  
 




 


 &
'



 
 
  
(ummary of (sequential) þ learning algorithm

,
 



,


 
 &
*

!



 
 
*

 

  
  


   
    


   
  
  


   
   
  

   
  
  

   
    

       
        
   
 

áetwork training:

j 


 

 
 
 

4 
 



 

9  -
5 

;
 

 
 


 

  

 

 
 

  
 

 

 
 
 

j
 

 
 *

 ,) 

!
  !

   
(


 
 



 $ 




  &
' 


   7


 
 


 
&

' 



 
  

Udvantages and disadvantages of different modes

(equential mode
 
 

 

 
 2
 





 


 


  

    


 


 6


 


  

 

&&&


 
 
 

 

 
!
 &

 

 
 

 , 

 

þatch mode:
 
 

) 

 4
 
  
 
 4

 
§ynamics of þ learning

U      


    
     
  Ô

2 !

)

 
 

  


#
4

  *#
         




 
4




 


  
  !


 
  

 !
 



 
  




 

$ 
|


  
  
 
 "
 

  
&&
 !


 &




 


(electing initial weight values

 ' 



 

  


 
 




  &
j
!



 

 

 6


 

 
 
 
 
  



 , 

 

 
 

  

  

 

 


 


 
 
  

RegulariÕation ± a way of reducing variance (taking less
notice of data)

,
 

 
 

 



  

 


 
 
  

Ô È

 
Ô 
  ;
 


 *

)  


 
 

!

 &

4

È   *   
 
  ;


  ;
|omentum

|

 




 




   

6
 



) 
 
  

" 


 

  



|


) 



  #    * „      
 >      #?
ù 
 


 

 
 




 
 

4 

 
 

 




 


 
 
 

 
 
 

   


 
 




 
 

 
 
 

 
 



 
 
;

 '

  

  

 

(topping criteria
'

 
   




* 
*#  *#
>          ? 

 
 

 
 !

|  

  

' 

 





4

!
 

   

However, aim is for new patterns to be


classified correctly
  j 
 
B 

 
j 

j  !
 
 

 


 

 

 
 
 
 






 

 
 

 "
 
j  

 
 "
 
 

'ross-validation
|

  
 
   


 


 

 
 




 


°   


, 





  

: 
 




 j 



 





 
 
 
 
 

 


  



 

 ;

C
 

9-
 



 

 j


4  

 
 

 
   
4 
 

 

;

 
 



 
Universal unction Upproximation





| +


 


| +

Universal Upproximation jheorem



 

† 
  
  
   
 

"

 

|


  


D
     °    †

 
°