
Machine Learning (ML)

The study of algorithms which improve automatically through experience. - Mitchell [M97]

General description

Data driven: extract some information from data.
Mathematically based: probability, statistics, information theory, computational learning theory, optimization.

Text

Document labeling, part-of-speech tagging, summarization

Vision

Object recognition, handwriting recognition, emotion labeling, surveillance

Sound

Speech recognition, music genre classification

Finance

Algorithmic trading

A few types of ML

Supervised

Given: labeled data
Usual goal: learn a function
Ex: SVM, neural networks, boosting, etc.
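
The supervised setting above (labeled data in, learned function out) can be sketched with a perceptron, a simple linear classifier in the same family as the SVMs listed; the toy 2-D data set here is invented for illustration.

```python
# Minimal supervised learning sketch: a perceptron trained on labeled 2-D points.
# The data set and hyperparameters are illustrative, not from the slides.

def train_perceptron(data, epochs=20, lr=0.1):
    """data: list of ((x1, x2), label) with label in {-1, +1}."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified: nudge the boundary
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

def predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

# Linearly separable toy set: label is the sign of x1 + x2.
data = [((1, 1), 1), ((2, 1), 1), ((-1, -1), -1), ((-2, -1), -1)]
model = train_perceptron(data)
print([predict(model, x) for x, _ in data])  # [1, 1, -1, -1]
```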

Unsupervised

Given: unlabeled data
Usual goal: cluster data, learn conditional probabilities
Ex: k-means clustering, mixture models

Semi-Supervised

Given: labeled and unlabeled data
Usual goal: use the unlabeled data to augment the labeled data
Ex: cluster all the data, then label the unlabeled points from the clusters they fall in
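
The cluster-then-label recipe can be sketched as follows; the 1-D data, the two-cluster assumption, and the plain k-means clustering step are all illustrative choices, not prescribed by the slides.

```python
# Semi-supervised sketch: cluster labeled + unlabeled points together,
# then propagate each cluster's label to its unlabeled members.

def nearest(point, centers):
    """Index of the center closest to a 1-D point."""
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))

def cluster_then_label(labeled, unlabeled, iters=10):
    """labeled: list of (x, label); unlabeled: list of x. Assumes two clusters."""
    points = [x for x, _ in labeled] + unlabeled
    centers = [min(points), max(points)]          # simple 1-D initialization
    for _ in range(iters):                        # plain k-means with k = 2
        groups = [[], []]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    # Each cluster takes the label of the labeled example it contains.
    cluster_label = {}
    for x, y in labeled:
        cluster_label[nearest(x, centers)] = y
    return {x: cluster_label[nearest(x, centers)] for x in unlabeled}

labels = cluster_then_label([(0.0, "A"), (10.0, "B")], [1.0, 2.0, 9.0])
print(labels)  # {1.0: 'A', 2.0: 'A', 9.0: 'B'}
```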

Reinforcement

Given: a reward function and a set of actions
Goal: learn a function which optimizes the reward function
Ex: Q-learning, Ant-Q
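
Tabular Q-learning, the first example listed, fits in a few lines; the 5-state corridor environment and all hyperparameters below are invented for illustration.

```python
import random

# Q-learning sketch on a 5-state corridor: actions are -1 (left) / +1 (right),
# reward 1 only on reaching state 4. Environment and parameters are illustrative.

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(5) for a in (-1, +1)}
    for _ in range(episodes):
        s = 0
        while s != 4:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice((-1, +1))
            else:
                a = max((-1, +1), key=lambda act: Q[(s, act)])
            s2 = min(4, max(0, s + a))            # deterministic transition
            r = 1.0 if s2 == 4 else 0.0
            # standard Q-learning update
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in (-1, +1)) - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
policy = [max((-1, +1), key=lambda a: Q[(s, a)]) for s in range(4)]
print(policy)  # the greedy policy moves right in every non-terminal state
```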

Markov Decision Process (MDP)

State space (fully or partially observable)
Action space (static or time dependent)
Transition function produces the next state (based on the present state only, not the past)
Reward function (based on the action)
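
The four components above can be made concrete with a tiny value-iteration sketch; the 3-state world, its transitions, and its rewards are invented for illustration.

```python
# A minimal MDP: explicit state space, action space, transition function,
# and reward function, solved by value iteration. All values are illustrative.

states = ["S0", "T1", "T2"]
actions = {"S0": ["left", "right"], "T1": [], "T2": []}   # T1, T2 are terminal

def transition(s, a):
    """Next state depends only on the present state and action (Markov property)."""
    return {"left": "T1", "right": "T2"}[a]

def reward(s, a):
    """Reward attached to taking action a in state s."""
    return {"left": 1.0, "right": 2.0}[a]

def value_iteration(gamma=0.9, sweeps=50):
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            if actions[s]:                                # terminals keep value 0
                V[s] = max(reward(s, a) + gamma * V[transition(s, a)]
                           for a in actions[s])
    return V

print(value_iteration()["S0"])  # 2.0: "right" is the optimal action
```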

Learning flocking behavior

N agents, discrete time steps. Each agent i has a partner j. Define the Q-value Q(st, at), where st is the state and at the action.

State of i

[R] = floor(|i - j|), the integer part of the distance between agents i and j

Actions for i

a1 - attract to j
a2 - parallel positive orientation to j
a3 - parallel negative orientation to j
a4 - repulsion from j

Distances R1, R2, R3 s.t. R1 < R2 < R3

st               r
0 < [R] <= R1    1 for the band's appropriate action, -1 for the rest
R1 < [R] <= R2   1 for the band's appropriate action, -1 for the rest
R2 < [R] <= R3   1 for the band's appropriate action, -1 for the rest
R3 < [R]         0 for all actions a1, a2, a3, a4

Distances R1, R2, R3 s.t. R1 < R2 < R3

st               at               r
0 < [R] <= R3    a1, a2, a3       1
0 < [R] <= R3    a4               -1
R3 < [R]         a1, a2, a3, a4   0
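
Reading the second scheme off the table: within sight (distance at most R3) the non-repulsion actions a1-a3 earn +1 and repulsion a4 earns -1, and beyond R3 every action earns 0. A sketch, with R3 = 10 as an illustrative value:

```python
import math

# Distance-based reward for agent i with respect to partner j, following the
# second scheme above. R3 and the 2-D positions are illustrative assumptions.

R3 = 10.0

def reward(pos_i, pos_j, action):
    """pos_*: (x, y) tuples; action: one of 'a1'..'a4'."""
    dist = math.floor(math.hypot(pos_i[0] - pos_j[0], pos_i[1] - pos_j[1]))  # [R]
    if dist > R3:
        return 0.0                                    # partner out of sight
    return 1.0 if action in ("a1", "a2", "a3") else -1.0

print(reward((0, 0), (3, 4), "a1"))   # 1.0: partner in sight, attraction rewarded
print(reward((0, 0), (30, 40), "a4")) # 0.0: partner out of range
```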

Basic planners work from scratch. Ex: in path planning for parking there is no difference between the first time and the hundredth time. Ideally we would learn some general, higher-level strategies that can be reused: general solution patterns in the problem space.

The agent can see: perceptual information

Range-finder-like virtual sensors

From its own experimentation or external source

Search based on what has previously been successful in similar situations.

X - set of agent states; E - set of environment states.
Define X+ = {x+ = (x, e) | x ∈ X, e ∈ E}.
Sensor functions σi(x+): X+ → R.
At a specific combined state x+ ∈ X+, define the sensor state s = (σ1(x+), ..., σn(x+)), and the sensor space S, s ∈ S, where S is the set of all sensor state values.

Finally

Define the locally situated state of the agent ξ = (s, x) ∈ Ξ, where x is some state information independent of the sensors. Now we want to collect data to train a function f(ξ): Ξ → {viable, nonviable}. Note that errors in f(ξ) could cause problems.

function IS_NONVIABLE(x+):
    if is_collision(x+) then return True
    s := (σ1(x+), ..., σn(x+))
    x := extract_internal_state(x+)
    ξ := (s, x)
    return f(ξ) = nonviable
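
A runnable version of the pseudocode above; the 1-D world, the distance sensors, the internal-state extractor, and the stub classifier f are all stand-ins (in [KP07], f would be learned from collected data).

```python
# Viability check sketch. x+ is a pair (agent state, environment state);
# every helper below is an illustrative stub, not the paper's implementation.

def is_collision(x_plus):
    agent, env = x_plus
    return agent["pos"] in env["obstacles"]

def sensors(x_plus):                     # s = (sigma_1(x+), ..., sigma_n(x+))
    agent, env = x_plus
    return tuple(abs(agent["pos"] - o) for o in sorted(env["obstacles"]))

def extract_internal_state(x_plus):      # x: state independent of the sensors
    return x_plus[0]["speed"]

def f(xi):                               # stub stand-in for the trained classifier
    s, speed = xi
    return "viable" if min(s) > speed else "nonviable"   # can we stop in time?

def is_nonviable(x_plus):
    if is_collision(x_plus):
        return True
    s = sensors(x_plus)
    x = extract_internal_state(x_plus)
    xi = (s, x)
    return f(xi) == "nonviable"

env = {"obstacles": {5, 9}}
print(is_nonviable(({"pos": 0, "speed": 2}, env)))  # False: nearest obstacle is far
print(is_nonviable(({"pos": 4, "speed": 3}, env)))  # True: too fast, too close
```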

MIT Media Lab - a system where the user interactively trains a virtual dog using click training [B02]. Uses acoustic patterns as cues for actions. Can be taught cues on different acoustic patterns. Can create new actions from state-space search. Simplified Q-learning based on animal-training techniques.

Predictable regularities

Animals tend toward previously successful states within a small time window.

Limit the state space by looking only at states that matter; ex: if utterance u followed by action a produces a reward, then utterance u is important.

Easy to train

Credit accumulation, and allowing a state-action pair to delegate credit to another state-action pair.

Alternatives to Q-Learning

Q-decomp [RZ03]

A complex agent is decomposed into a set of simpler subagents. Each subagent has its own reward function. An arbitrator decides the best action based on advice from the subagents.
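
The arbitration step can be sketched as follows: each subagent reports its own Q-values as advice, and the arbitrator picks the action maximizing their sum. The two hand-set Q-tables (a "dollars" agent and a "euros" agent, echoing the example world below) are illustrative.

```python
# Q-decomposition arbitrator sketch in the spirit of [RZ03]: choose
# argmax_a of sum_i Q_i(s, a). The Q-tables for one state are invented.

q_dollars = {"go_left": 0.0, "go_up": 3.0, "go_right": 1.0}
q_euros   = {"go_left": 2.0, "go_up": 0.0, "go_right": 1.0}

def arbitrate(subagent_qs):
    """Pick the action maximizing the summed advice of all subagents."""
    actions = subagent_qs[0].keys()
    return max(actions, key=lambda a: sum(q[a] for q in subagent_qs))

print(arbitrate([q_dollars, q_euros]))  # "go_up" wins with a total of 3.0
```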

A simple world with initial state S0 and three terminal states SL, SU, SR, each with an associated reward of dollars and/or euros. The discount factor is γ ∈ (0, 1).

Q-decomposition as the learning technique; the reward function is obtained by Inverse Reinforcement Learning (IRL) [NR00].

Mimicking behavior from an expert

An idea that uses the SVM algorithm to generate a smooth path [M06]. Not really machine learning, but a neat application of an ML algorithm. Here is the idea: treat the obstacles on either side of a corridor as two classes; the maximum-margin boundary between them is a smooth, obstacle-avoiding path.

Videos

Robot learning to pick up objects

http://www.cs.ou.edu/~fagg/movies/index.html#torso_2004

Training a Dog

http://characters.media.mit.edu/projects/dobie.html

References

[NR00] A. Y. Ng and S. Russell. Algorithms for inverse reinforcement learning. In Proc. 17th International Conf. on Machine Learning, pages 663-670. Morgan Kaufmann, San Francisco, CA, 2000.
[B02] B. Blumberg et al. Integrated learning for interactive synthetic characters. In SIGGRAPH '02: Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, pages 417-426, New York, NY, USA, 2002. ACM Press.
[RZ03] S. J. Russell and A. Zimdars. Q-decomposition for reinforcement learning agents. In ICML, pages 656-663, 2003.
[MI06] K. Morihiro, T. Isokawa, H. Nishimura, and N. Matsui. Emergence of flocking behavior based on reinforcement learning. In Knowledge-Based Intelligent Information and Engineering Systems, pages 699-706, 2006.
[CT06] T. Conde and D. Thalmann. Learnable behavioural model for autonomous virtual agents: low-level learning. In AAMAS '06: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 89-96, New York, NY, USA, 2006. ACM Press.
[M06] J. Miura. Support vector path planning. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2894-2899, 2006.
[KP07] M. Kalisiak and M. van de Panne. Faster motion planning using learned local viability models. In ICRA, pages 2700-2705, 2007.
[M07] Mehryar Mohri. Foundations of Machine Learning course notes. http://www.cs.nyu.edu/~mohri/ml07.html
[M97] Tom M. Mitchell. Machine Learning. McGraw-Hill, 1997.
[RN05] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. Englewood Cliffs, New Jersey, 1995.
[CV95] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine Learning, 20, 1995.
[V98] Vladimir N. Vapnik. Statistical Learning Theory. Wiley, 1998.
[KV94] Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.
