DEEP LEARNING INTERVIEWS
Real-World Deep Learning Interview Problems & Solutions
Second Edition
Shlomo Kashani

The second edition of Deep Learning Interviews is home to hundreds of fully-solved problems from a wide range of key topics in AI. It is designed both to rehearse interview- or exam-specific topics and to provide machine learning MSc/PhD students, and those awaiting an interview, a well-organized overview of the field. The problems it poses are tough enough to cut your teeth on and to dramatically improve your skills, but they're framed within thought-provoking questions and engaging stories.

That is what makes the volume so specifically valuable to students and job seekers: it provides them with the ability to speak confidently and quickly on any relevant topic, to answer technical questions clearly and correctly, and to fully understand the purpose and meaning of interview questions and answers. These are powerful, indispensable advantages to have when walking into the interview room.

The book's contents are a large inventory of numerous topics relevant to DL job interviews and graduate-level exams. That places this work at the forefront of the growing trend in science to teach a core set of practical mathematical and computational skills. It is widely accepted that the training of every computer scientist must include the fundamental theorems of ML, and AI appears in the curriculum of nearly every university. This volume is designed as an excellent reference for graduates of such programs.

Shlomo Kashani, Author; Amir Ivry, Chief Editor

Topics include: Information Theory, …

www.interviews.ai
ISBN 9781916243569

Contents

I Rusty Nail

HOW-TO USE THIS BOOK
    Introduction
    What makes this book so valuable
    What will I learn
    How to Work Problems
    Types of Problems

II Kindergarten

LOGISTIC REGRESSION
    Introduction
    Problems
        General Concepts
        Odds, Log-odds
        The Sigmoid
        Truly Understanding Logistic Regression
        The Logit Function and Entropy
        Python/PyTorch/CPP
    Solutions
        General Concepts
        Odds, Log-odds
        The Sigmoid
        Truly Understanding Logistic Regression
        The Logit Function and Entropy
        Python, PyTorch, CPP

PROBABILISTIC PROGRAMMING & BAYESIAN DL
    Introduction
    Problems
        Expectation and Variance
        Conditional Probability
        Bayes Rule
        Maximum Likelihood Estimation
        Fisher Information
        Posterior & prior predictive distributions
        Conjugate priors
        Bayesian Deep Learning
    Solutions
        Expectation and Variance
        Conditional Probability
        Bayes Rule
        Maximum Likelihood Estimation
        Fisher Information
        Posterior & prior predictive distributions
        Conjugate priors
        Bayesian Deep Learning

III High School

INFORMATION THEORY
    Introduction
    Problems
        Logarithms in Information Theory
        Shannon's Entropy
        Kullback-Leibler Divergence (KLD)
        Classification and Information Gain
        Mutual Information
        Mechanical Statistics
        Jensen's inequality
    Solutions
        Logarithms in Information Theory
        Shannon's Entropy
        Kullback-Leibler Divergence
        Classification and Information Gain
        Mutual Information
        Mechanical Statistics
        Jensen's inequality

DEEP LEARNING: CALCULUS, ALGORITHMIC DIFFERENTIATION
    Introduction
    Problems
        AD, Gradient descent & Backpropagation
        Numerical differentiation
        Directed Acyclic Graphs
        The chain rule
        Taylor series expansion
        Limits and continuity
        Partial derivatives
        Optimization
        The Gradient descent algorithm
        The Backpropagation algorithm
        Feed forward neural networks
        Activation functions, Autograd/JAX
        Dual numbers in AD
        Forward mode AD
        Forward mode AD table construction
        Symbolic differentiation
        Simple differentiation
        The Beta-Binomial model
    Solutions
        Algorithmic differentiation, Gradient descent
        Numerical differentiation
        Directed Acyclic Graphs
        The chain rule
        Taylor series expansion
        Limits and continuity
        Partial derivatives
        Optimization
        The Gradient descent algorithm
        The Backpropagation algorithm
        Feed forward neural networks
        Activation functions, Autograd/JAX
        Dual numbers in AD
        Forward mode AD
        Forward mode AD table construction
        Symbolic differentiation
        Simple differentiation
        The Beta-Binomial model

IV Bachelors

DEEP LEARNING: NN ENSEMBLES
    Introduction
    Problems
        Bagging, Boosting and Stacking
        Approaches for Combining Predictors
        Monolithic and Heterogeneous Ensembling
        Ensemble Learning
        Snapshot Ensembling
        Multi-model Ensembling
        Learning-rate Schedules in Ensembling
    Solutions
        Bagging, Boosting and Stacking
        Approaches for Combining Predictors
        Monolithic and Heterogeneous Ensembling
        Ensemble Learning
        Snapshot Ensembling
        Multi-model Ensembling
        Learning-rate Schedules in Ensembling

DEEP LEARNING: CNN FEATURE EXTRACTION
    Introduction
    Problems
        CNN as Fixed Feature Extractor
        Finetuning CNNs
