Deep Deterministic Policy Gradient (DDPG): History

ML methods

Supervised
+ We have a training set for which we have the right answer for every training example
+ Each training example contains all the right answers
+ Job: replicate the right answers

Unsupervised
+ We have a set of unlabeled data and a learning algorithm
+ Job: find structure in the data with algorithms

Reinforcement
+ We do not have a target variable given
+ We have reward signals
+ The agent needs to plan the path on its own to reach the goal where the reward exists

Reinforcement Learning
+ Robotics
+ Playing games
+ Self-driving cars

Supervised vs Unsupervised
Unsupervised
+ Clustering: K-means
+ Dimensionality reduction: Principal Component Analysis, SVD

Supervised process
+ Training: X1, X2 -> Build Model -> Y, so that f(X1, X2) = Y
+ Prediction: new data X1, X2 -> Use Model -> Predict

Supervised uses
+ Credit card fraud: logistic regression (classification example)
+ Car insurance fraud: regression example, AmtFraud = intercept + coef * ClaimedAmt

Unsupervised
+ The data contains patterns, the algorithm finds patterns, the model recognizes patterns
+ Training: customer purchase data -> Train Algorithm -> Build Model -> Customer Groups
+ Use: new customer purchase data -> Use Model -> Similar Customer Group

[Figure: unsupervised example; only the label "politics" is legible]
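
The unsupervised flow described above (customer purchase data, train algorithm, customer groups, then matching a new customer to a similar group) can be sketched with K-means, the clustering method the slides name. This is a minimal illustration, not code from the deck: the two purchase features, the sample values, and the helper names `kmeans` and `nearest` are all hypothetical, and seeding the centroids with the first k points is a simplification chosen to keep the run deterministic.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain K-means. Assumption: the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: put every point in the group of its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group.
        new_centroids = []
        for i, group in enumerate(groups):
            if group:
                new_centroids.append(tuple(sum(c) / len(group) for c in zip(*group)))
            else:
                new_centroids.append(centroids[i])  # keep an empty group's centroid
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids


# Hypothetical purchase data: (visits per month, average basket value)
purchases = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]

# "Train Algorithm -> Build Model -> Customer Groups"
centroids = kmeans(purchases, k=2)

# "New customer purchase data -> Use Model -> Similar Customer Group"
new_customer = (9, 82)
group = nearest(new_customer, centroids)
print("new customer belongs to group", group)
```

No labels are supplied anywhere: the groups come only from the structure of the purchase data, which is the contrast with the supervised process, where a known Y is needed to build the model.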
