Professional Documents
Culture Documents
A value passed to a function (or method) when calling the function. There are
two kinds of arguments: keyword arguments, and positional arguments.
Arguments are assigned to the named local variables in a function body.
Syntactically, any expression can be used to represent an argument. The
Argument evaluated value is assigned to the local variable.
A non-human program or model that can solve sophisticated tasks. For example,
a program or model that translates text or a program or model that identifies
diseases from radiologic images both exhibit artificial intelligence. Formall,
machine learning is a sub-field of artificial intelligence. However, in recent years,
Artificial some organisations have begun using the terms artificial intelligence and
intelligence machine learning interchangeably.
A type of machine learning model for distinguishing among two or more discrete
Classification classes. For example, a natural language processing classification model could
model determine whether an input sentence was in French, Spanish, or Italian.
1
Sources: Python 3 documentation glossary (https://docs.python.org/3/glossary.html); Google Developers
Machine Learning Glossary (https://developers.google.com/machine-learning/glossary).
ACCA – Machine learning with Python for finance professionals – Version 1.0
A mechanism for estimating how well a model will generalise to new data by
testing the model against one or more non-overlapping data subsets withheld
Cross-validation from the training set.
Dictionary (dict) An associative array, where arbitrary keys are mapped to values.
Linear regression Using the raw output of a linear model as the actual prediction in a regression
model. The goal of a regression problem is to make a real-valued prediction.
A built-in Python sequence. Despite its name, it is more akin to an array in other
List languages than to a linked list.
A program or system that builds (trains) a predictive model from input data. The
system uses the learned model to make useful predictions from new
(never-before-seen) data drawn from the same distribution as the one used to
train the model. Machine learning also refers to the field of study concerned with
Machine learning these programs or systems.
Creating a model that matches the training data so closely that the model fails to
Overfitting make correct predictions on new data.
A named entity in a function (or method) definition that specifies an argument (or
in some cases, arguments) that the function can accept. Parameters can specify
both optional and required arguments, as well as default values for some
Parameter optional arguments.
An ensemble approach to finding the decision tree that best fits the training data
Random forest by creating many decision trees and then determining the “average” one. The
“random” part of the term refers to building each of the decision trees from a
random selection of features; the “forest” refers to the set of decision trees.
A system that selects, for each user, a relatively small set of desirable items from
a large corpus. For example, a video recommendation system might recommend
certain videos from a corpus, based on factors such as:
Recommendation ● Films that similar users rated or watched; and
system ● Genre, directors, actors, target demographic, etc.
Regression
model A type of model that outputs continuous values (typically floats).
Training a model from input data and its corresponding labels. Supervised
learning is analogous to a student learning a subject by studying a set of
questions and their corresponding answers. After mastering the mapping
Supervised between questions and answers, the student can then provide answers to new
machine learning (never-before-seen) questions on the same topic.
The subset of the dataset used to test the model, after the model has gone
Test set through initial vetting by the validation set.
Producing a model with poor predictive ability, because the model hasn’t
captured the complexity of the training data. Many problems can cause
underfitting, including:
● Training on the wrong set of features;
● Training for too few epochs or at too low a learning rate;
● Training with too high a regularisation rate; and
Underfitting ● Providing too few hidden layers in a deep neural network.
The most common use of unsupervised machine learning is to cluster data into
groups of similar examples. For example, an unsupervised machine learning
algorithm can cluster songs together based on various properties of music. The
resulting clusters can become an input to other machine learning algorithms
Unsupervised (e.g. to a music recommendation service). Clustering can be helpful in domains
machine learning where true labels are hard to obtain (e.g. anti-abuse and fraud).
Principal Component Analysis (PCA) is another example of unsupervised
machine learning (e.g. applying PCA on a dataset containing the contents of
millions of shopping carts may reveal that shopping carts containing lemons
frequently also contain antacids).
Validation set A subset of the dataset - disjointed from the training set - used in validation.