The document provides steps for planning an AI and machine learning project:
1. Set clear objectives for the project.
2. Audit the available data: check whether it is clean, complete, and contains the right variables. More data may need to be collected.
3. Preprocess the data: clean it, transform it, and balance the samples, using techniques such as imputation, normalization, and dimensionality reduction.
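As a rough illustration of the preprocessing step, the sketch below applies mean substitution for missing values, then Min-Max normalization and Z-score standardization. It uses only the Python standard library. The function names and the sample `ages` column are hypothetical, and missing values are assumed to be encoded as `None`.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Mean substitution: replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Min-Max normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: every value maps to 0.0 (an arbitrary convention).
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def z_score(values):
    """Z-score standardization: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical column with one missing entry.
ages = [20.0, None, 40.0, 60.0]
filled = impute_mean(ages)
scaled = min_max(filled)
standardized = z_score(filled)
```

Mean substitution is the simplest of the imputation options and shrinks the variance of the column, so it is best treated as a baseline rather than a default.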
Original Title: AI Strategy Flow Chart share by WorldLine Technology
AI & Machine learning: brainstorming project planning
Legend: ☺ = favorites or must try. Other icons mark human perception, human preferences, human knowledge, human motivation, human communication, and "GPU may be required".

Step 1: Objectives
Objective: do you have one? If not, set objectives.
Set Objectives: Financial & Social KPI's; "The Equation" ☺.
Methods: 2 x 2 matrix* (ease vs impact) ☺; Killer Ideas Workshop ☺; Co-creation workshop ☺; Problem finding; Options Framing; Ideation; Creative thinking.

Step 2: Audit
Audit: Are your data clean or complete? Need more data or variables? Need to transform your data?
Exploration/Cleaning: Explore ☺; Anova ☺; Cross tabs ☺; Plots ☺; Sort; Groupby; Count; Queries; Remove Illegal Characters; Visualizations; Logical exclusions/inclusions ☺.
Collection + Cleaning: Scrape (python, VBA); Surveys; Download (Check licences); Collect more data; Change y or x.
Missing Values?: Imputation & Instrumental Variables ☺; Mean substitution; Delete incomplete records; Copy adjacent observations; Maximum likelihood estimates; Expectation-Maximization imputation; Multiple imputation.
Sample balanced? (Sample Balancing): Under sampling (random); Over sampling (random); SMOTE (synthetic minority); Weight balancing; Change prediction thresholds.
Transformation: Divide by largest value (0 < x < 1) ☺; Convert to Binary ☺; Create Unique Keys ☺; Log Transformation ☺; Square Root Transformation; Arcsine transformation; Ratios; Box-Cox (Square Root + log); Normalize (Min-Max); Standardize (Z-score); RobustScaler; Create Percentiles; Create Scales.
Too many variables with high correlations? (Reduction): Principal components analysis (PCA) ☺; Factor Analysis ☺; Hierarchical Factor Analysis (HFA) ☺; Multidimensional scaling (MDS) ☺; Singular value decomposition (SVD); Correspondence Analysis.
All clean, balanced, complete?
Identify Data Type: Data only? Geographic? Images? Text? Time Series; Cross-Sectional; Time Series Cross-Sectional Data (y, x); Text (x); Univariate/Multivariate*.
Identify Task.
Geostatistics: Kriging & Variogramming ☺; Simulation; Cholesky decomposition; Markov Chain Geostats; Supervised Machine Learning; Support Vector Machine; Cellular automata; etc. Spatial autocorrelations: Moran's I; Geary's C.
Need to group the data into classes, segments, etc? (Cluster Analysis): K-means ☺; Hierarchical k-means ☺; Hierarchical Cluster Analysis ☺; CLARA ☺; K-medoids ☺; DBSCAN ☺; Fuzzy ☺; HCPC ☺. Also listed: Sort (APRIORI); CART (regressor/classifier); Random Forest (regressor).
Images?: Rescale; Clean. Do images have captions? Translation Tools (Captions). Tools (images): Convolutional Neural Networks (CNN); Neural Networks. Want to generate new images? If yes: Generative Adversarial Networks (GAN).
Machine Translation (MT): Rule-based (e.g. Apertium); Statistical (e.g. Moses); Neural (e.g. OpenNMT, PyTorch, TensorFlow).
Text?: Can the text be transformed into data (binary, cardinal)? If yes, see Transformation. If no, see Text Tools.
Text Tools: Semantic Graph Analysis ☺ etc.

Step 3: Modelling
Do you have a model? Do you want to predict, classify, or explain something? A "Skip Step 3" branch bypasses this step.
Modelling ( y = f (x) ). Modelling Workshop: List all y's; List all x's for each y.
Time series (Statistical: Econometrics, Time series Analysis, Structural Models): ETS (automatic) ☺; ARIMA (automatic) ☺; VAR ☺ (Vector Autoregressive); Econometrics ☺ (custom models); Eyeballing, etc. ☺; Damped (exponential smoothing); Theta model; SES (simple exponential smoothing); Holt-Winters; Non-linear regression; OLS; Bass model; Stepwise Regression; Diffusion Curves; Combinations or averages.
Time series (Machine Learning): BNN ☺ (Bayesian Neural Network); RNN ☺ (Recurrent Neural Networks); MLP (Multi-layer perceptron); SVR (Support Vector Regression); GP (Gaussian Processes); GRNN (Generalised Regression Neural Networks); RBF (Radial Basis Functions); KNN (K-Nearest Neighbour Regression); Extreme Learning Machines; LSTM (long short-term memory); CART (regression trees); Random Forest (regression); Neural Networks.
What type of y variable?
Binary (y = 0,1), Statistical: Logistic regression (LR) ☺; Probit model; Stepwise logistic regression; Linear Discriminant Analysis; Log-binomial regression.
Ranked (y = 1,2,3, etc.), Statistical: Ordinal regression ☺; Ordinal logistic regression ☺; Ordered probit; Cumulative-logit model; Continuation-ratio model; Partial Proportional Odds model; Adjacent-category logit model; Polytomous logistic model; Stereotype logistic model. See categorical for few ranks.
Categorical/Nominal (y = classes), Statistical: Logistic regression; Multinomial Logistic (Logit) Regression ☺; Multinomial Probit; Categorical regression; Conjoint Analysis.
Machine Learning for binary, ranked, and categorical y: Decision Trees (CART) ☺; Random Forest ☺; Neural Nets (deep learning); Bayesian Networks; Naïve Bayes; Support Vector Machines (SVM); K-Nearest Neighbour.
Metrics: ROC (receiver operating characteristic); AUC (area under the curve); Confusion matrix / Contingency Table etc.
Continuous (bounded, unbounded): Statistical-Bounded (Sigmoid). Statistical-Unbounded: Poisson regression; OLS regression, GLM; Ridge regression; Robust regression; Lasso regression; Elastic Net regression; Stepwise regression; Tobit regression; Econometrics (custom). Machine Learning: Decision Trees (CART) ☺; Random Trees ☺; Neural Networks etc.
Survival/hazard (y = Time): Cox regression.
Tools (Descriptive, other), 1 Variable: Discrete Scales; Percentiles, sorts, filter; Counts/histograms; Proportions; Mean, Median, Mode; Standard Deviations; Variance; Skew, Kurtosis; Ranges, quartiles; Coefficient of variation; Box plots; Control Charts.
Tests: Kolmogorov-Smirnov; Anderson-Darling; Shapiro-Wilk; Lilliefors; t-test; etc.

Step 4: Cognitive/Symbolic Computing
Tools: Funnel Analysis ☺; Customer Decision Journey; Markov Chain ☺; Bayesian Belief Networks; Conjoint Analysis ☺; Boxes and Arrows (etc.) ☺.
Inspirations: Physics, mathematics, chemistry, biology, botany, zoology, etc.; Psychology, sociology, economics, linguistics, geography, etc.; Finance, marketing, operations, organizational behaviour, etc.
Sources: arXiv.org; PubMed; Agricola; eric.ed.gov; researchgate.net; worldcat.org; stack exchange/overflow; GitHub; google/youtube; experts.
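The "What type of y variable?" branch of the flowchart can be read as a lookup from outcome type to candidate statistical models. The sketch below encodes a small subset of that lookup. The dictionary keys, the function names, and the `guess_y_type` heuristic are all hypothetical and are not part of the flowchart; ranked and survival outcomes cannot be detected from raw values, so they must be declared explicitly.

```python
# Candidate statistical models per outcome type: a subset of the flowchart's
# lists. The keys ("binary", "ranked", ...) are hypothetical labels.
CANDIDATES = {
    "binary": ["Logistic regression", "Probit model"],
    "ranked": ["Ordinal regression", "Ordered probit"],
    "categorical": ["Multinomial logistic regression", "Multinomial probit"],
    "continuous": ["OLS regression", "Ridge regression", "Lasso regression"],
    "survival": ["Cox regression"],
}


def guess_y_type(y):
    """Rough heuristic (an assumption, not from the flowchart).

    Only distinguishes binary, categorical, and continuous outcomes.
    """
    values = set(y)
    if not values:
        raise ValueError("y is empty")
    if values <= {0, 1}:
        return "binary"
    if all(isinstance(v, str) for v in values):
        return "categorical"
    return "continuous"


def suggest_models(y_type):
    """Return the candidate models listed for a declared outcome type."""
    try:
        return CANDIDATES[y_type]
    except KeyError:
        raise ValueError(f"unknown y type: {y_type!r}") from None


# Hypothetical outcome column: 1 = churned, 0 = stayed.
churned = [0, 1, 1, 0, 1]
shortlist = suggest_models(guess_y_type(churned))
```

The shortlist is only a starting point for the modelling workshop: the flowchart also lists machine learning alternatives for most outcome types.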