
Ang, Von Errol L.
2014106179 COE132/E01
Tech Topic 4
September 7, 2019

Data Science

Database
A database is an organized collection of related information. It is an organized collection,
because in a database, all data is described and associated with other data. All information in
a database should be related as well; separate databases should be created to manage
unrelated information. For example, a database that contains information about students
should not also hold information about company stock prices. Databases are not always digital
– a filing cabinet, for instance, might be considered a form of database. For the purposes of
this text, we will only consider digital databases.

A database, also called an electronic database, is any collection of data or information that is
specially organized for rapid search and retrieval by a computer. Databases are structured
to facilitate the storage, retrieval, modification, and deletion of data in conjunction with various
data-processing operations. A database management system (DBMS) extracts information
from the database in response to queries.

Database Management System (DBMS)


We often mistakenly say our database is Oracle, MySQL, SQL Server, or MongoDB. But they
aren't databases; they are database management systems (DBMS).
The database has your actual data and the rules about that data, while the DBMS is the
program that surrounds and manages your actual data, and it enforces the rules you specified
on your data. The rules for example could be the type of the data, like integer or string, or the
relationship between them. iTunes can read its database to give you a listing of its songs (and
play the songs); your mobile-phone software can interact with your list of contacts.
The DBMS is the software that would be installed on your personal computer or on a server;
you would then use it to manage one or more databases. DBMS packages generally provide an
interface to view and change the design of the database, create queries, and develop reports.
Most of these packages are designed to work with a specific type of database, but generally
are compatible with a wide range of databases.
For example, Apache OpenOffice.org Base can be used to create, modify, and analyse
databases in open-database (ODB) format. Microsoft’s Access DBMS is used to work with
databases in its own Microsoft Access Database format. Both Access and Base can read and
write to other database formats as well.
Microsoft Access and Open Office Base are examples of personal database-management
systems. These systems are primarily used to develop and analyse single-user databases.
These databases are not meant to be shared across a network or the Internet but are instead
installed on a device and work with a single user at a time.
There are different kinds of DBMS, and they are categorized as follows:

 Hierarchical DBMS – in the hierarchical database model, data is organized in a tree-
like structure and stored in a hierarchical (top-down or bottom-up) format. Data is
represented using a parent-child relationship: a parent may have many children, but
each child has only one parent.
 Network Model – the network database model allows each child to have multiple
parents. It helps you address the need to model more complex relationships, such as
the orders/parts many-to-many relationship. In this model, entities are organized in a
graph which can be accessed through several paths.
 Relational Model – the relational DBMS is the most widely used model because it is
one of the easiest. This model is based on normalizing data into the rows and
columns of tables. Data in the relational model is stored in fixed structures and
manipulated using SQL. Examples of RDBMSs include Oracle, MySQL, SQL Server,
SQLite, and DB2.
 Object-Oriented Model – in the object-oriented model, data is stored in the form of
objects. The structures, called classes, display the data within them. This model
defines a database as a collection of objects that store both data member values and
operations.
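
To make the contrast between these models concrete, here is a small, purely illustrative Python sketch (the club and member names are invented for this example) showing the same data organized hierarchically as a nested structure and relationally as flat rows linked by keys.

# Illustrative only: the same club/member data in two of the models described above.

# Hierarchical model: each parent node owns its children (one parent per child),
# so a member who joins a second club must be repeated under that second parent.
hierarchical = {
    "Chess Club": {"members": ["Ana", "Ben"]},
    "Film Society": {"members": ["Ben"]},
}

# Relational model: flat tables (rows of tuples) linked through shared key values.
students = [(1, "Ana"), (2, "Ben")]                 # (student_id, name)
clubs = [(10, "Chess Club"), (11, "Film Society")]  # (club_id, club_name)
memberships = [(1, 10), (2, 10), (2, 11)]           # (student_id, club_id), no duplication

# Answering "which clubs is Ben in?" by following the keys across the tables:
ben_id = next(sid for sid, name in students if name == "Ben")
ben_clubs = [name for cid, name in clubs
             if any(s == ben_id and c == cid for s, c in memberships)]
print(ben_clubs)  # ['Chess Club', 'Film Society']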

Relational Databases

Databases can be organized in many different ways, and thus take many forms. The most
popular form of database today is the relational database. Popular examples of relational
databases are Microsoft Access, MySQL, and Oracle. A relational database is one in which
data is organized into one or more tables. Each table has a set of fields, which define the
nature of the data stored in the table. A record is one instance of a set of fields in a table. To
visualize this, think of the records as the rows of the table and the fields as the columns of the
table. In the example below, we have a table of student information, with each row
representing a student and each column representing one piece of information about the
student.

In a relational database, all the tables are related by one or more fields, so that it is possible
to connect all the tables in the database through the field(s) they have in common. For each
table, one of the fields is identified as a primary key. This key is the unique identifier for each
record in the table. To help you understand these terms further, let’s walk through the process
of designing a database.

Designing a Database

Suppose a university wants to create an information system to track participation in student
clubs. After interviewing several people, the design team learns that the goal of implementing
the system is to give better insight into how the university funds clubs. This will be
accomplished by tracking how many members each club has and how active the clubs are.
From this, the team decides that the system must keep track of the clubs, their members, and
their events. Using this information, the design team determines that the following tables need
to be created:

 Clubs: this will track the club name, the club president, and a short description of the
club.
 Students: student name, e-mail, and year of birth.
 Memberships: this table will correlate students with clubs, allowing us to have any
given student join multiple clubs.
 Events: this table will track when the clubs meet and how many students showed up.

Now that the design team has determined which tables to create, they need to define the
specific information that each table will hold. This requires identifying the fields that will be in
each table. For example, Club Name would be one of the fields in the Clubs table. First Name
and Last Name would be fields in the Students table. Finally, since this will be a relational
database, every table should have a field in common with at least one other table (in other
words: they should have a relationship with each other).

In order to properly create this relationship, a primary key must be selected for each table.
This key is a unique identifier for each record in the table. For example, in the Students table,
it might be possible to use students’ last name as a way to uniquely identify them. However,
it is more than likely that some students will share a last name (like Rodriguez, Smith, or Lee),
so a different field should be selected. A student’s e-mail address might be a good choice for
a primary key, since e-mail addresses are unique. However, a primary key cannot change, so
this would mean that if students changed their e-mail address we would have to remove them
from the database and then re-insert them – not an attractive proposition. Our solution is to
create a value for each student — a user ID — that will act as a primary key. We will also do
this for each of the student clubs. This solution is quite common and is the reason you have
so many user IDs.

You can see the final database design in the figure below:

With this design, not only do we have a way to organize all of the information we need to meet
the requirements, but we have also successfully related all the tables together. Here’s what
the database tables might look like with some sample data.
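
As a rough sketch of how these four tables and their keys could be declared, the following uses Python's built-in sqlite3 module; the column names (student_id, club_id, and so on) are simplified stand-ins for the fields described above rather than the exact design in the figure.

import sqlite3

# A minimal sketch of the Student Clubs schema, assuming simplified column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (
        student_id INTEGER PRIMARY KEY,          -- surrogate key instead of name or e-mail
        first_name TEXT, last_name TEXT, email TEXT, birth_year INTEGER
    );
    CREATE TABLE Clubs (
        club_id INTEGER PRIMARY KEY,
        club_name TEXT, description TEXT,
        president_id INTEGER REFERENCES Students(student_id)
    );
    CREATE TABLE Memberships (                   -- relates students to clubs (many-to-many)
        student_id INTEGER REFERENCES Students(student_id),
        club_id INTEGER REFERENCES Clubs(club_id),
        PRIMARY KEY (student_id, club_id)
    );
    CREATE TABLE Events (
        event_id INTEGER PRIMARY KEY,
        club_id INTEGER REFERENCES Clubs(club_id),
        event_date TEXT, attendance INTEGER
    );
""")
conn.commit()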

Structured Query Language

Once you have a database designed and loaded with data, how will you do something useful
with it? The primary way to work with a relational database is to use Structured Query
Language, SQL (pronounced “sequel,” or simply stated as S-Q-L). Almost all applications that
work with databases (such as database management systems, discussed below) make use
of SQL as a way to analyze and manipulate relational data. As its name implies, SQL is a
language that can be used to work with a relational database. From a simple request for data
to a complex update operation, SQL is a mainstay of programmers and database
administrators. To give you a taste of what SQL might look like, here are a couple of examples
using our Student Clubs database.

The following query will retrieve a list of the first and last names of the club presidents:

SELECT "First Name", "Last Name" FROM "Students" WHERE "Students.ID" =


"Clubs.President"

The following query will create a list of the number of students in each club, listing the club
name and then the number of members:

SELECT "Clubs.Club Name", COUNT("Memberships.Student ID") FROM "Clubs" LEFT JOIN


"Memberships" ON "Clubs.Club ID" = "Memberships.Club ID"

Many database packages, such as Microsoft Access, allow you to visually create the query
you want to construct and then generate the SQL query for you.
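
To give a rough idea of how such a query runs in practice, here is a hedged sketch using Python's built-in sqlite3 module. It mirrors the second query above against the simplified schema sketched earlier; the sample rows are invented, and a GROUP BY clause is added so that each club gets its own count.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clubs (club_id INTEGER PRIMARY KEY, club_name TEXT);
    CREATE TABLE Memberships (student_id INTEGER, club_id INTEGER);
    INSERT INTO Clubs VALUES (10, 'Chess Club'), (11, 'Film Society');
    INSERT INTO Memberships VALUES (1, 10), (2, 10), (2, 11);
""")

# Club name and number of members, mirroring the LEFT JOIN / COUNT query above.
rows = conn.execute("""
    SELECT Clubs.club_name, COUNT(Memberships.student_id)
    FROM Clubs LEFT JOIN Memberships ON Clubs.club_id = Memberships.club_id
    GROUP BY Clubs.club_name
""").fetchall()
print(rows)  # e.g. [('Chess Club', 2), ('Film Society', 1)]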

Data Analytics

Data analytics (DA) is the process of examining data sets in order to draw conclusions about
the information they contain, increasingly with the aid of specialized systems and software.
Data analytics technologies and techniques are widely used in commercial industries to
enable organizations to make more-informed business decisions and by scientists and
researchers to verify or disprove scientific models, theories and hypotheses.

As a term, data analytics predominantly refers to an assortment of applications, from
basic business intelligence (BI), reporting and online analytical processing (OLAP) to various
forms of advanced analytics. In that sense, it's similar in nature to business analytics,
another umbrella term for approaches to analyzing data -- with the difference that the latter is
oriented to business uses, while data analytics has a broader focus. The expansive view of
the term isn't universal, though: In some cases, people use data analytics specifically to
mean advanced analytics, treating BI as a separate category.

Data analytics initiatives can help businesses increase revenues, improve operational
efficiency, optimize marketing campaigns and customer service efforts, respond more
quickly to emerging market trends and gain a competitive edge over rivals -- all with the
ultimate goal of boosting business performance. Depending on the particular application, the
data that's analyzed can consist of either historical records or new information that has been
processed for real-time analytics uses. In addition, it can come from a mix of internal
systems and external data sources.

Types of Data Analytics


Data analytics is broken down into four basic types.
 Descriptive analytics describes what has happened over a given period of time. Has
the number of views gone up? Are sales stronger this month than last?
 Diagnostic analytics focuses more on why something happened. This involves more
diverse data inputs and a bit of hypothesizing. Did the weather affect beer sales? Did
that latest marketing campaign impact sales?
 Predictive analytics moves to what is likely going to happen in the near term. What
happened to sales last time we had a hot summer? How many weather models predict
a hot summer this year?
 Prescriptive analytics moves into the territory of suggesting a course of action. If the
likelihood of a hot summer, as measured by an average of these five weather models, is
above 58%, then we should add an evening shift to the brewery and rent an additional
tank to increase output.
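
At its simplest, a prescriptive rule like the one just described is a threshold check over model outputs. Here is a tiny hedged Python sketch; the five probabilities are invented stand-ins for weather-model forecasts, and the 58% cutoff is the figure quoted above.

# Illustrative prescriptive-analytics rule: average five (made-up) model outputs
# for the probability of a hot summer, then suggest an action above a threshold.
model_probs = [0.61, 0.55, 0.64, 0.57, 0.60]   # hypothetical weather-model outputs
avg_prob = sum(model_probs) / len(model_probs)

if avg_prob > 0.58:
    print("Add an evening shift and rent an additional tank.")
else:
    print("Keep current capacity.")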

Data analytics underpins many quality control systems in the financial world, including the
ever-popular Six Sigma program. If you aren’t properly measuring something — whether it's
your weight or the number of defects per million in a production line — it is nearly impossible
to optimize it.

Data Mining

Data mining is the process of analyzing data to find previously unknown trends, patterns,
and associations in order to make decisions. Generally, data mining is accomplished through
automated means against extremely large data sets, such as a data warehouse. Some
examples of data mining include:
 An analysis of sales from a large grocery chain might determine that milk is purchased
more frequently the day after it rains in cities with a population of less than 50,000.
 A bank may find that loan applicants whose bank accounts show particular deposit and
withdrawal patterns are not good credit risks.
 A baseball team may find that collegiate baseball players with specific statistics in
hitting, pitching, and fielding make for more successful major league players.

Data mining is the exploration and analysis of large data sets to discover meaningful patterns
and rules. It is considered a discipline under the data science field of study and differs from
predictive analytics in that data mining describes historical data, while predictive analytics
aims to predict future outcomes. Additionally, data mining techniques are used to build machine
learning (ML) models that power modern artificial intelligence (AI) applications such as search
engine algorithms and recommendation systems.

Data mining is looking for hidden, valid, and potentially useful patterns in huge data sets.
Data mining is all about discovering unsuspected, previously unknown relationships amongst
the data.

Data mining is also called knowledge discovery, knowledge extraction, data/pattern analysis,
information harvesting, etc.

Data Mining Implementation Process

Business understanding: In this phase, business and data-mining goals are established.

 First, you need to understand the business and client objectives. You need to define
what your client wants (which many times even they do not know themselves).
 Take stock of the current data mining scenario. Factor resources, assumptions,
constraints, and other significant factors into your assessment.
 Using the business objectives and the current scenario, define your data mining goals.
 A good data mining plan is very detailed and should be developed to accomplish
both business and data mining goals.
Data understanding: In this phase, a sanity check on the data is performed to check whether it
is appropriate for the data mining goals.

 First, data is collected from the multiple data sources available in the organization.
 These data sources may include multiple databases, flat files, or data cubes. Issues
like object matching and schema integration can arise during the data integration
process. It is a quite complex and tricky process, as data from various sources are
unlikely to match easily. For example, table A contains an entity named cust_no
whereas another table B contains an entity named cust-id (see the sketch after this list).
 Therefore, it is quite difficult to ensure whether both of these given objects refer to the
same value. Here, metadata should be used to reduce errors in the data integration
process.
 Next, the step is to search for the properties of the acquired data. A good way to explore
the data is to answer the data mining questions (decided in the business phase) using
query, reporting, and visualization tools.
 Based on the results of the queries, the data quality should be ascertained. Missing data,
if any, should be acquired.
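
As a small hedged illustration of the object-matching issue above, the following pandas sketch (the table contents, and all column names other than cust_no and cust-id, are invented) renames cust-id to cust_no so the two tables can be integrated on a common key.

import pandas as pd

# Hypothetical tables A and B that refer to the same customers under different names.
table_a = pd.DataFrame({"cust_no": [101, 102], "city": ["Manila", "Cebu"]})
table_b = pd.DataFrame({"cust-id": [101, 102], "total_spend": [2500.0, 900.0]})

# Use metadata (here, a simple column mapping) to reconcile the names before joining.
table_b = table_b.rename(columns={"cust-id": "cust_no"})
integrated = table_a.merge(table_b, on="cust_no")
print(integrated)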

Data preparation: In this phase, data is made production ready.

 The data preparation process consumes about 90% of the time of the project.
 The data from different sources should be selected, cleaned, transformed, formatted,
anonymized, and constructed (if required).
 Data cleaning is a process to "clean" the data by smoothing noisy data and filling in
missing values.
 For example, in a customer demographics profile, age data may be missing. The data is
incomplete and should be filled. In some cases, there could be data outliers. For
instance, age has a value of 300. Data could also be inconsistent. For instance, the name
of the customer is different in different tables.
 Data transformation operations change the data to make it useful in data mining. The
following transformations can be applied.

Data transformation: Data transformation operations would contribute toward the success
of the mining process.

 Smoothing: it helps to remove noise from the data.
 Aggregation: summary or aggregation operations are applied to the data, e.g., weekly
sales data is aggregated to calculate monthly and yearly totals.
 Generalization: in this step, low-level data is replaced by higher-level concepts with
the help of concept hierarchies. For example, the city is replaced by the county.
 Normalization: normalization is performed when the attribute data are scaled up or
scaled down. Example: data should fall in the range -2.0 to 2.0 post-normalization (see
the sketch below).
 Attribute construction: new attributes are constructed from the given set of attributes
to make them helpful for data mining.

The result of this process is a final data set that can be used in modelling.
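
To make the cleaning and normalization steps concrete, here is a small hedged pandas sketch (the customer ages are invented): it fills a missing age, caps an obviously wrong value like 300, and then min-max scales the column into the -2.0 to 2.0 range mentioned above.

import pandas as pd

ages = pd.Series([23, 35, None, 300, 41], dtype="float64")  # made-up customer ages

ages = ages.fillna(ages.median())   # data cleaning: fill the missing value
ages = ages.clip(upper=100)         # treat an impossible age such as 300 as an outlier

# Normalization: min-max scale the cleaned values into the range [-2.0, 2.0].
lo, hi = ages.min(), ages.max()
normalized = (ages - lo) / (hi - lo) * 4.0 - 2.0
print(normalized.round(2).tolist())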

Modelling: In this phase, mathematical models are used to determine data patterns.

 Based on the business objectives, suitable modelling techniques should be selected
for the prepared dataset.
 Create a scenario to test and check the quality and validity of the model.
 Run the model on the prepared dataset.
 Results should be assessed by all stakeholders to make sure that the model can meet
the data mining objectives.

Evaluation: In this phase, patterns identified are evaluated against the business objectives.

 Results generated by the data mining model should be evaluated against the
business objectives.
 Gaining business understanding is an iterative process. In fact, while understanding,
new business requirements may be raised because of data mining.
 A go or no-go decision is taken to move the model into the deployment phase.

Deployment: In the deployment phase, you ship your data mining discoveries to everyday
business operations.

 The knowledge or information discovered during the data mining process should be
made easy to understand for non-technical stakeholders.
 A detailed deployment plan for shipping, maintenance, and monitoring of data
mining discoveries is created.
 A final project report is created with lessons learned and key experiences during the
project. This helps to improve the organization's business policy.

Types of Data Mining

 Supervised Learning
The goal of supervised learning is prediction or classification. The easiest way to
conceptualize this process is to look for a single output variable. A process is
considered supervised learning if the goal of the model is to predict the value of an
observation. One example is spam filters, which use supervised learning to classify
incoming emails as unwanted content and automatically remove these messages from
your inbox.

Common analytical models used in supervised data mining approaches are:

 Linear Regressions – predict the value of a continuous variable using one or more
independent inputs. Realtors use linear regressions to predict the value of a house
based on square footage, bed-to-bath ratio, year built, and zip code.

 Logistic Regressions – predict the probability of a categorical variable using one or
more independent inputs. Banks use logistic regressions to predict the probability that a
loan applicant will default based on credit score, household income, age, and other
personal factors.

 Time Series – are forecasting tools which use time as the primary independent variable.
Retailers, such as Macy’s, deploy time series models to predict the demand for products
as a function of time and use the forecast to accurately plan and stock stores with the
required level of inventory.

 Classification or Regression Trees – are a predictive modeling technique that can be
used to predict the value of both categorical and continuous target variables. Based on
the data, the model will create sets of binary rules to split and group the highest
proportion of similar target variables together. Following those rules, the group that a
new observation falls into will become its predicted value.

 Neural Networks – are analytical models inspired by the structure of the brain, its
neurons, and their connections. These models were originally created in the 1940s but have
just recently gained popularity with statisticians and data scientists. Neural networks take
inputs and, based on their magnitude, each node will "fire" or "not fire" according to its
threshold requirement. This signal, or lack thereof, is then combined with the other "fired"
signals in the hidden layers of the network, where the process repeats itself until an
output is created. Since one of the benefits of neural networks is a near-instant output,
self-driving cars are deploying these models to accurately and efficiently process data
to autonomously make critical decisions.

 K-Nearest Neighbor – this method is used to categorize a new observation based on past
observations. Unlike the previous methods, k-nearest neighbor is data-driven, not
model-driven. This method makes no underlying assumptions about the data nor does
it employ complex processes to interpret its inputs. The basic idea of the k-nearest
neighbor model is that it classifies new observations by identifying its closest K
neighbors and assigning it the majority’s value. Many recommender systems nest this
method to identify and classify similar content which will later be pulled by the greater
algorithm.
 Unsupervised Learning
Unsupervised tasks focus on understanding and describing data to reveal underlying
patterns within it. Recommendation systems employ unsupervised learning to track user
patterns and provide them with personalized recommendations to enhance their customer
experience.

Common analytical models used in unsupervised data mining approaches are:

 Clustering – groups similar data together. Clustering is best employed with complex data
sets describing a single entity. One example is lookalike modeling, used to group similarities
between segments, identify clusters, and target new groups who look like an existing
group.
 Association Analysis – is also known as market basket analysis and is used to identify
items that frequently occur together. Supermarkets commonly use this tool to identify
paired products and spread them out in the store to encourage customers to pass by
more merchandise and increase their purchases.
 Principal Component Analysis – is used to illustrate hidden correlations between input
variables and create new variables, called principal components, which capture the
same information contained in the original data, but with fewer variables. By reducing the
number of variables used to convey the same level of information, analysts can increase
the utility and accuracy of supervised data mining models.

 Supervised and Unsupervised Approaches in Practice


While you can use each approach independently, it is quite common to use both during
an analysis. Each approach has unique advantages, and they combine to increase the
robustness, stability, and overall utility of data mining models. Supervised models can
benefit from nesting variables derived from unsupervised methods. For example, a cluster
variable within a regression model allows analysts to eliminate redundant variables from
the model and improve its accuracy. Because unsupervised approaches reveal the
underlying relationships within data, analysts should use the insights from unsupervised
learning to springboard their supervised analysis.
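
The idea of nesting an unsupervised result inside a supervised model can be sketched briefly. The following is a hedged scikit-learn example on synthetic data (none of it comes from the text) in which K-means cluster labels are added as an extra feature for a linear regression.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                              # synthetic inputs
y = X[:, 0] * 3.0 + (X[:, 1] > 0) * 5.0 + rng.normal(scale=0.5, size=200)

# Unsupervised step: derive a cluster label for each observation.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised step: use the cluster label as an additional input feature.
X_with_cluster = np.column_stack([X, clusters])
model = LinearRegression().fit(X_with_cluster, y)
print(round(model.score(X_with_cluster, y), 3))            # R^2 on the training data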

Benefits of Data Mining:

 Data mining techniques help companies to get knowledge-based information.
 Data mining helps organizations to make profitable adjustments in operation and
production.
 Data mining is a cost-effective and efficient solution compared to other statistical
data applications.
 Data mining helps with the decision-making process.
 It facilitates automated prediction of trends and behaviors as well as automated
discovery of hidden patterns.
 It can be implemented in new systems as well as existing platforms.
 It is a speedy process which makes it easy for users to analyze huge amounts of
data in less time.

Disadvantages of Data Mining

 There is a chance that companies may sell useful information about their customers to
other companies for money. For example, American Express has sold credit card
purchases of their customers to other companies.
 Much data mining analytics software is difficult to operate and requires advanced
training to work on.
 Different data mining tools work in different manners due to the different algorithms
employed in their design. Therefore, the selection of the correct data mining tool is a very
difficult task.
 Data mining techniques are not always accurate, and so they can cause serious
consequences in certain conditions.

Data Mining Applications

Applications and their usage:

 Communications – Data mining techniques are used in the communication sector to predict
customer behavior and offer highly targeted and relevant campaigns.
 Insurance – Data mining helps insurance companies to price their products profitably and to
promote new offers to their new or existing customers.
 Education – Data mining helps educators to access student data, predict achievement levels,
and find students or groups of students who need extra attention, for example students who
are weak in mathematics.
 Manufacturing – With the help of data mining, manufacturers can predict wear and tear of
production assets. They can anticipate maintenance, which helps them minimize downtime.
 Banking – Data mining helps the finance sector to get a view of market risks and manage
regulatory compliance. It helps banks to identify probable defaulters and decide whether to
issue credit cards, loans, etc.
 Retail – Data mining techniques help retail malls and grocery stores identify and arrange the
most sellable items in the most attention-getting positions. It helps store owners to come up
with offers which encourage customers to increase their spending.
 Service Providers – Service providers like mobile phone and utility companies use data mining
to predict the reasons why a customer leaves. They analyze billing details, customer service
interactions, and complaints made to the company to assign each customer a probability score
and offer incentives.
 E-Commerce – E-commerce websites use data mining to offer cross-sells and up-sells through
their websites. One of the most famous names is Amazon, which uses data mining techniques to
get more customers into its e-commerce store.
 Supermarkets – Data mining allows supermarkets to develop rules to predict whether their
shoppers are likely to be expecting. By evaluating their buying patterns, they can find
customers who are most likely pregnant and start targeting products like baby powder,
diapers, and other baby items.
 Crime Investigation – Data mining helps crime investigation agencies to deploy the police
workforce (where is a crime most likely to happen, and when?), decide whom to search at a
border crossing, etc.
 Bioinformatics – Data mining helps to mine biological data from massive datasets gathered in
biology and medicine.

Big Data
Big Data is also data, but of a huge size. Big Data is a term used to describe a collection
of data that is huge in size and yet growing exponentially with time. In short, such data is so
large and complex that none of the traditional data management tools are able to store it or
process it efficiently.

Analyzing Twitter posts, Facebook feeds, eBay searches, GPS trackers, and ATM machines
are some big data examples. Studying security videos, traffic data, weather patterns, flight
arrivals, cell phone tower logs, and heart rate trackers are other forms. Big data is a messy
new science that changes weekly, and only a few experts understand it all.

But today, new technologies make it possible to realize value from Big Data. For example,
retailers can track user web clicks to identify behavioural trends that improve campaigns,
pricing, and stocking. Utilities can capture household energy usage levels to predict outages
and to encourage more efficient energy consumption. Governments and even Google can detect
and track the emergence of disease outbreaks via social media signals. Oil and gas
companies can take the output of sensors in their drilling equipment to make more efficient
and safer drilling decisions.

Types Of Big Data

1. Structured

Any data that can be stored, accessed, and processed in a fixed format is termed
'structured' data. Over the period of time, talent in computer science has achieved
greater success in developing techniques for working with such kinds of data (where the
format is well known in advance) and also in deriving value out of them. However, nowadays we
are foreseeing issues when the size of such data grows to a huge extent; typical sizes are
in the range of multiple zettabytes.

Looking at these figures one can easily understand why the name Big Data is given and
imagine the challenges involved in its storage and processing.

Examples of Structured Data: an 'Employee' table in a database is an example of
structured data.
2. Unstructured

Any data with an unknown form or structure is classified as unstructured data. In addition to
its huge size, unstructured data poses multiple challenges in terms of processing it to derive
value out of it. A typical example of unstructured data is a heterogeneous data source
containing a combination of simple text files, images, videos, etc. Nowadays organizations
have a wealth of data available with them but, unfortunately, they don't know how to derive
value out of it since this data is in its raw form or unstructured format.

Example of Un-structured Data: The output returned by 'Google Search'

3. Semi-structured

Semi-structured data can contain both forms of data. We can see semi-structured data as
structured in form, but it is actually not defined with, for example, a table definition as in a
relational DBMS. An example of semi-structured data is data represented in an XML file.

Examples of Semi-structured Data: personal data stored in an XML file.

The “Three Vs” of Big Data


In 2001, industry analyst Doug Laney defined the “Three Vs” of big data:

 Volume – The unprecedented explosion of data means that the digital universe will reach
180 zettabytes (180 followed by 21 zeroes) by 2025. Today, the challenge with data volume
is not so much storage as it is how to identify relevant data within gigantic data sets and
make good use of it. A typical PC might have had 10 gigabytes of storage in 2000. Today,
Facebook ingests 500 terabytes of new data every day; a Boeing 737 will generate 240
terabytes of flight data during a single flight across the US; and the proliferation of smart
phones, the data they create and consume, and sensors embedded into everyday objects will
soon result in billions of new, constantly updated data feeds containing environmental,
location, and other information, including video.
 Velocity – Data is generated at an ever-accelerating pace. Every minute, Google receives
3.8 million search queries. Email users send 156 million messages. Facebook users upload
243,000 photos. The challenge for data scientists is to find ways to collect, process, and
make use of huge amounts of data as it comes in.
 Variety – Data comes in different forms. Structured data is that which can be organized
neatly within the columns of a database. This type of data is relatively easy to enter, store,
query, and analyze. Unstructured data is more difficult to sort and extract value from.
Examples of unstructured data include emails, social media posts, word-processing
documents; audio, video and photo files; web pages, and more. Big Data isn't just
numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and
unstructured text, including log files and social media. Traditional database systems were
designed to address smaller volumes of structured data, fewer updates or a predictable,
consistent data structure. Traditional database systems are also designed to operate on a
single server, making increased capacity expensive and finite. As applications have evolved
to serve large volumes of users, and as application development practices have become
agile, the traditional use of the relational database has become a liability for many companies
rather than an enabling factor in their business. Big Data databases, such as MongoDB,
solve these problems and provide companies with the means to create tremendous business
value.
Beyond the Big Three Vs
More recently, big-data practitioners and thought leaders have proposed additional Vs:

 Veracity – This refers to the quality of the collected data. If source data is not correct,
analyses will be worthless. As the world moves toward automated decision-making,
where computers make choices instead of humans, it becomes imperative that
organizations be able to trust the quality of the data.
 Variability – Data’s meaning is constantly changing. For example, language processing
by computers is exceedingly difficult because words often have several meanings. Data
scientists must account for this variability by creating sophisticated programs that
understand context and meaning.
 Visualization – Data must be understandable to nontechnical stakeholders and decision
makers. Visualization is the creation of complex graphs that tell the data scientist’s
story, transforming the data into information, information into insight, insight into
knowledge, and knowledge into advantage.
 Value – How can organizations make use of big data to improve decision-making?
A McKinsey article about the potential impact of big data on health care in the U.S.
suggested that big-data initiatives “could account for $300 billion to $450 billion in
reduced health-care spending, or 12 to 17 percent of the $2.6 trillion baseline in US
health-care costs.” The secrets hidden within big data can be a goldmine of opportunity
and savings.
What Are Some Examples of Big Data in Regular Life?

While most big data projects are very obscure, there are successful examples of big data
affecting the everyday life of individuals, companies, and governments:

 Predicting virus outbreaks: by studying socio-political data, weather and climate
data, and hospital/clinical data, scientists are now predicting dengue fever
outbreaks with four weeks' advance notice.
 Homicide Watch: this big data project profiles murder victims, suspects, and
criminals in Washington, DC. Both as a way to honor the deceased and as an
awareness resource for people, this big data project is fascinating.
 Transit Travel Planning, NYC: WNYC radio programmer Steve Melendez combined
the online subway schedule with travel itinerary software. His creation lets New
Yorkers click their location on the map, and a prediction of travel time for trains and
subway will appear.
 Xerox reduced their workforce loss: call center work is emotionally exhausting.
Xerox has studied reams of data with the help of professional analysts, and now they
can predict which call center hires are likely to stay with the company the longest.
 Supporting counter-terrorism: by studying social media, financial records, flight
reservations, and security data, law enforcement can predict and locate terrorist
suspects before they do their wicked deeds.
 Adjusting brand marketing based on social media reviews: people bluntly and
quickly share their online thoughts on a pub, restaurant, or fitness club. It is possible
to study these millions of social media posts and provide feedback to the company
on what people think of their services.

Who Uses Big Data? What Do They Do With It?

 Police response to the Boston Marathon bombing: by using big data to study
video and surveillance images, the police were able to quickly narrow down their
search for the suspects.
 Morton's Steakhouse uses Twitter to pull off marketing stunts, including the famous
New Jersey airport delivery of a porterhouse steak and shrimp dinner.
 Visa uses big data to identify and catch fraudsters. Single transactions here and
there can easily conceal a dishonest credit card user, but by watching millions of
transactions carefully, patterns of fraud can be detected.
 Facebook uses big data to tailor advertising. By carefully studying your FB likes
and browsing habits, the social media giant has eerie insight into your tastes. Those
sidebar ads you see on your Facebook feed are chosen by very deliberate and
complex algorithms that have been watching your Facebook habits.

4 things make big data significant:

1. The data is massive. It won't fit on a single hard drive, much less a USB stick. The
volume of data far exceeds what the human mind can perceive (think of a billion billion
megabytes, and then multiply that by more billions).

2. The data is messy and unstructured. 50% to 80% of big data work is converting and
cleaning the information so that it is searchable and sortable. Only a few thousand experts on
our planet fully know how to do this data cleanup. These experts also need very specialized
tools, like HPE and Hadoop, to do their craft. Perhaps in 10 years, big data experts will
become a dime a dozen, but for now, they are a very rare species of analyst and their work
is still very obscure and tedious.

3. Data has become a commodity** that can be sold and bought. Data marketplaces
exist where companies and individuals can buy terabytes of social media and other data.
Most of the data is cloud-based, as it is too large to fit onto any single hard disk. Buying data
commonly involves a subscription fee where you plug into a cloud server farm.

**The leaders of big data tools and ideas are Amazon, Google, Facebook, and
Yahoo. Because these companies serve so many millions of people with their online
services, it makes sense that they would be the collection point and the visionaries behind
big data analytics.

4. The possibilities of big data are endless. Perhaps doctors will one day predict heart
attacks and strokes for individuals weeks before they happen. Airplane and automobile
crashes might be reduced by predictive analyses of their mechanical data and traffic and
weather patterns. Online dating might be improved by having big data predictors of who are
compatible personalities for you. Musicians might get insight into what music composition is
the most pleasing to the changing tastes of target audiences. Nutritionists might be able to
predict which combination of store-bought foods will aggravate or help a person's medical
conditions. The surface has only been scratched, and discoveries in big data happen every
week.

Advantages and Disadvantages of Big Data


The increase in the amount of data available presents both opportunities and problems.
In general, having more data on one’s customers (and potential customers) should allow
companies to better tailor their products and marketing efforts in order to create the highest
level of satisfaction and repeat business. Companies that are able to collect a large amount
of data are provided with the opportunity to conduct deeper and richer analysis.

While better analysis is a positive, big data can also create overload and noise. Companies
have to be able to handle larger volumes of data, all the while determining which data
represents signals compared to noise. Determining what makes the data relevant becomes
a key factor.

Furthermore, the nature and format of the data can require special handling before it is acted
upon. Structured data, consisting of numeric values, can be easily stored and sorted.
Unstructured data, such as emails, videos, and text documents, may require more
sophisticated techniques to be applied before it becomes useful.

Neural Network

The basic idea behind a neural network is to simulate (copy in a simplified but reasonably
faithful way) lots of densely interconnected brain cells inside a computer so you can get it to
learn things, recognize patterns, and make decisions in a humanlike way. The amazing thing
about a neural network is that you don't have to program it to learn explicitly: it learns all by
itself, just like a brain.

A neural network is a type of machine learning which models itself after the human brain. This
creates an artificial neural network that via an algorithm allows the computer to learn by
incorporating new data.

While there are plenty of artificial intelligence algorithms these days, neural networks are able
to perform what has been termed deep learning. While the basic unit of the brain is the neuron,
the essential building block of an artificial neural network is a perceptron which accomplishes
simple signal processing, and these are then connected into a large mesh network.

The computer with the neural network is taught to do a task by having it analyse training
examples, which have been labelled in advance. A common example of a task for
a neural network using deep learning is an object recognition task, where the neural network
is presented with a large number of objects of a certain type, such as a cat, or a street sign,
and the computer, by analysing the recurring patterns in the presented images, learns to
categorize new images.

What does a neural network consist of?

A typical neural network has anything from a few dozen to hundreds, thousands, or even
millions of artificial neurons called units arranged in a series of layers, each of which connects
to the layers on either side. Some of them, known as input units, are designed to receive
various forms of information from the outside world that the network will attempt to learn about,
recognize, or otherwise process. Other units sit on the opposite side of the network and signal
how it responds to the information it's learned; those are known as output units. In between
the input units and output units are one or more layers of hidden units, which, together, form
the majority of the artificial brain. Most neural networks are fully connected, which means
each hidden unit and each output unit is connected to every unit in the layers on either side. The
connections between one unit and another are represented by a number called a weight,
which can be either positive (if one unit excites another) or negative (if one unit suppresses or
inhibits another). The higher the weight, the more influence one unit has on another. (This
corresponds to the way actual brain cells trigger one another across tiny gaps called
synapses.)
How artificial neural networks work

A neural network usually involves a large number of processors operating in parallel and
arranged in tiers. The first tier receives the raw input information -- analogous to optic nerves
in human visual processing. Each successive tier receives the output from the tier preceding
it, rather than from the raw input -- in the same way neurons further from the optic nerve
receive signals from those closer to it. The last tier produces the output of the system.

Each processing node has its own small sphere of knowledge, including what it has seen and
any rules it was originally programmed with or developed for itself. The tiers are highly
interconnected, which means each node in tier n will be connected to many nodes in tier n-1
-- its inputs -- and in tier n+1, which provides input for those nodes. There may be one or
multiple nodes in the output layer, from which the answer it produces can be read.

Neural networks are notable for being adaptive, which means they modify themselves as they
learn from initial training and subsequent runs provide more information about the world. The
most basic learning model is centred on weighting the input streams, which is how each node
weights the importance of input from each of its predecessors. Inputs that contribute to getting
right answers are weighted higher.

How neural networks learn

Unlike other algorithms, neural networks with their deep learning cannot be programmed
directly for the task. Rather, much like a child's developing brain, they need to learn the
information. The learning strategies go by three methods:

 Supervised learning: This learning strategy is the simplest, as there is a labelled
dataset, which the computer goes through, and the algorithm gets modified until it can
process the dataset to get the desired result.
 Unsupervised learning: This strategy is used in cases where there is no labelled
dataset available to learn from. The neural network analyses the dataset, and a cost
function then tells the neural network how far off target it was. The neural network
then adjusts to increase the accuracy of the algorithm.
 Reinforced learning: In this algorithm, the neural network is reinforced for positive
results, and punished for a negative result, forcing the neural network to learn over time.

Information flows through a neural network in two ways. When it's learning (being trained) or
operating normally (after being trained), patterns of information are fed into the network via
the input units, which trigger the layers of hidden units, and these in turn arrive at the output
units. This common design is called a feedforward network. Not all units "fire" all the time.
Each unit receives inputs from the units to its left, and the inputs are multiplied by the weights
of the connections they travel along. Every unit adds up all the inputs it receives in this way
and (in the simplest type of network) if the sum is more than a certain threshold value, the
unit "fires" and triggers the units it's connected to (those on its right).

For a neural network to learn, there has to be an element of feedback involved—just as
children learn by being told what they're doing right or wrong. In fact, we all use feedback, all
the time. Think back to when you first learned to play a game like ten-pin bowling. As you
picked up the heavy ball and rolled it down the alley, your brain watched how quickly the ball
moved and the line it followed and noted how close you came to knocking down the skittles. Next
time it was your turn, you remembered what you'd done wrong before, modified your
movements accordingly, and hopefully threw the ball a bit better. So you used feedback to
compare the outcome you wanted with what actually happened, figured out the difference
between the two, and used that to change what you did next time ("I need to throw it harder,"
"I need to roll slightly more to the left," "I need to let go later," and so on). The bigger the
difference between the intended and actual outcome, the more radically you would have
altered your moves.

Neural networks learn things in exactly the same way, typically by a feedback process
called backpropagation (sometimes abbreviated as "backprop"). This involves comparing
the output a network produces with the output it was meant to produce and using
the difference between them to modify the weights of the connections between the units in the
network, working from the output units through the hidden units to the input units—going
backward, in other words. In time, backpropagation causes the network to learn, reducing the
difference between actual and intended output to the point where the two exactly coincide, so
the network figures things out exactly as it should.
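
As a compact, hedged illustration of that feedback loop, the following NumPy sketch trains a tiny one-hidden-layer network with gradient-descent backpropagation on the XOR problem. The layer size, learning rate, and iteration count are arbitrary choices for the example, not anything prescribed by the text, and the exact result depends on the random initialization.

import numpy as np

# A tiny feedforward network trained by backpropagation on XOR (illustrative only).
rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # intended outputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 8)), np.zeros((1, 8))     # input -> hidden weights
W2, b2 = rng.normal(size=(8, 1)), np.zeros((1, 1))     # hidden -> output weights

for _ in range(5000):
    hidden = sigmoid(X @ W1 + b1)                      # forward pass
    output = sigmoid(hidden @ W2 + b2)
    # Backward pass: the difference between actual and intended output is pushed
    # back through the network to adjust every weight a little.
    grad_out = output - y
    grad_hid = (grad_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * (hidden.T @ grad_out); b2 -= 0.5 * grad_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * (X.T @ grad_hid);      b1 -= 0.5 * grad_hid.sum(axis=0, keepdims=True)

print(output.round(2).ravel())                         # typically approaches [0, 1, 1, 0]
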
How many types of neural network are there?
There are multiple types of neural network, each of which comes with its own specific use
cases and levels of complexity. The most basic type of neural net is something called
a feedforward neural network, in which information travels in only one direction from input to
output.

A more widely used type of network is the recurrent neural network, in which data can flow in
multiple directions. These neural networks possess greater learning abilities and are widely
employed for more complex tasks such as learning handwriting or language recognition.

There are also convolutional neural networks, Boltzmann machine networks, Hopfield
networks, and a variety of others. Picking the right network for your task depends on the data
you have to train it with, and the specific application you have in mind. In some cases, it may
be desirable to use multiple approaches, such as would be the case with a challenging task
like voice recognition.

Real world uses for neural networks

 Handwriting recognition is an example of a real-world problem that can be approached via
an artificial neural network. Humans can recognize handwriting with simple intuition, but the
challenge for computers is that each person's handwriting is unique, with different styles,
and even different spacing between letters, making it difficult to recognize consistently.

For example, the first letter, a capital A, can be described as three straight lines, where two
meet at a peak at the top and the third crosses the other two halfway down. This makes sense
to humans, but it is a challenge to express in a computer algorithm.

Taking the artificial neural network approach, the computer is fed training examples of
known handwritten characters that have been labelled in advance as to which letter or
number they correspond to, and via the algorithm the computer then learns to recognize
each character; as the data set of characters is increased, so does the accuracy (a short
sketch of this approach appears after this list). Handwriting recognition has applications as
varied as automated address reading on letters at the postal service, reducing bank fraud on
checks, and character input for pen-based computing.

 Another type of problem for an artificial neural network is the forecasting of the financial
markets. This also goes by the term ‘algorithmic trading,’ and has been applied to all types
of financial markets, from stock markets, commodities, interest rates and various
currencies. In the case of the stock market, traders use neural network algorithms to find
undervalued stocks, improve existing stock models, and to use the deep learning aspects
to optimize their algorithm as the market changes. There are now companies that specialize
in neural network stock trading algorithms, for example, MJ Trading Systems.
Artificial neural network algorithms, with their inherent flexibility, continue to be applied for
complex pattern recognition, and prediction problems. In addition to the examples above,
this includes such varied applications as facial recognition on social media images, cancer
detection for medical imaging, and business forecasting.

 On the basis of this example, you can probably see lots of different applications for neural
networks that involve recognizing patterns and making simple decisions about them.
In airplanes, you might use a neural network as a basic autopilot, with input units reading
signals from the various cockpit instruments and output units modifying the plane's controls
appropriately to keep it safely on course. Inside a factory, you could use a neural network
for quality control. Let's say you're producing clothes washing detergent in some giant,
convoluted chemical process. You could measure the final detergent in various ways (its
colour, acidity, thickness, or whatever), feed those measurements into your neural network
as inputs, and then have the network decide whether to accept or reject the batch.
 There are lots of applications for neural networks in security, too. Suppose you're running
a bank with many thousands of credit-card transactions passing through your computer
system every single minute. You need a quick automated way of identifying any
transactions that might be fraudulent—and that's something for which a neural network is
perfectly suited. Your inputs would be things like 1) Is the cardholder actually present? 2)
Has a valid PIN number been used? 3) Have five or more transactions been presented with
this card in the last 10 minutes? 4) Is the card being used in a different country from which
it's registered? —and so on. With enough clues, a neural network can flag up any
transactions that look suspicious, allowing a human operator to investigate them more
closely. In a very similar way, a bank could use a neural network to help it decide whether
to give loans to people on the basis of their past credit history, current earnings, and
employment record.

 Many of the things we all do every day involve recognizing patterns and using them to make
decisions, so neural networks can help us out in zillions of different ways. They can help
us forecast the stock market or the weather, operate radar scanning systems that
automatically identify enemy aircraft or ships, and even help doctors to diagnose complex
diseases on the basis of their symptoms. There might be neural networks ticking away
inside your computer or your cell phone right this minute. If you use cell phone apps
that recognize your handwriting on a touchscreen, they might be using a simple neural
network to figure out which characters you're writing by looking out for distinct features in
the marks you make with your fingers (and the order in which you make them). Some kinds
of voice recognition software also use neural networks. And so do some of the email
programs that automatically differentiate between genuine emails and spam. Neural
networks have even proved effective in translating text from one language to another.
Google's automatic translation, for example, has made increasing use of this technology
over the last few years to convert words in one language (the network's input) into the
equivalent words in another language (the network's output). In 2016, Google announced it
was using something it called Neural Machine Translation (NMT) to convert entire
sentences, instantly, with a 55–85 percent reduction in errors.
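
As a rough, hedged sketch of the handwriting-recognition idea from the first bullet in this list, here is a scikit-learn example that trains a small neural network classifier on the library's bundled 8x8 digit images; the hidden-layer size, iteration count, and train/test split are arbitrary choices for illustration.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Labelled training examples: 8x8 images of handwritten digits and their known labels.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# A small feedforward neural network; accuracy generally improves with more examples.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)
print(round(clf.score(X_test, y_test), 3))   # fraction of unseen digits classified correctly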

Algorithm

An algorithm is a procedure or formula for solving a problem, based on conducting a sequence
of specified actions. A computer program can be viewed as an elaborate algorithm. In
mathematics and computer science, an algorithm usually means a small procedure that solves
a recurrent problem.

List of Common Machine Learning Algorithms


a. Naïve Bayes Classifier Algorithm
A classifier is a function that assigns an element of a population to one of the
available categories. For instance, spam filtering is a popular application of the Naïve
Bayes algorithm. The spam filter here is a classifier that assigns a label "Spam" or "Not Spam"
to each of the emails. The Naïve Bayes Classifier is amongst the most popular learning methods
grouped by similarities; it works on the popular Bayes theorem of probability to build
machine learning models, particularly for disease prediction and document classification.
It is a simple classification of words based on Bayes' probability theorem for subjective
analysis of content.
b. K Means Clustering Algorithm
K-means is a popularly used unsupervised machine learning algorithm for cluster
analysis. K-Means is a non-deterministic and iterative method. The algorithm operates on
a given data set through a pre-defined number of clusters, k. The output of the K-Means
algorithm is k clusters with the input data partitioned among the clusters. For instance, let's
consider K-Means clustering for Wikipedia search results. The search term "Jaguar" on
Wikipedia will return all pages containing the word Jaguar, which can refer to Jaguar as a
car, Jaguar as a Mac OS version, and Jaguar as an animal.
c. Support Vector Machine Algorithm
Support Vector Machine is a supervised machine learning algorithm for
classification or regression problems where the dataset teaches SVM about the classes
so that SVM can classify any new data. It works by classifying the data into different
classes by finding a line (hyperplane) which separates the training data set into classes.
As there are many such linear hyperplanes, the SVM algorithm tries to maximize the distance
between the various classes that are involved, and this is referred to as margin maximization.
If the line that maximizes the distance between the classes is identified, the probability to
generalize well to unseen data is increased.
d. Apriori Algorithm
The Apriori algorithm is an unsupervised machine learning algorithm that generates
association rules from a given data set. An association rule implies that if an item A occurs,
then item B also occurs with a certain probability. Most of the association rules generated
are in the IF-THEN format. For example, IF people buy an iPad THEN they also buy an
iPad case to protect it. For the algorithm to derive such conclusions, it first observes the
number of people who bought an iPad case while purchasing an iPad. In this way a ratio
is derived, such as: out of 100 people who purchased an iPad, 85 also purchased an iPad
case.
e. Linear Regression
The Linear Regression algorithm shows the relationship between two variables and how
a change in one variable impacts the other. The algorithm shows the impact on the
dependent variable of changing the independent variable. The independent variables are
referred to as explanatory variables or predictors, as they explain the factors that impact
the dependent variable; the dependent variable is often referred to as the factor of interest
or response.
f. Logistic Regression
This algorithm applies a logistic function to a linear combination of features to
predict the outcome of a categorical dependent variable based on predictor variables. The
odds or probabilities that describe the outcome of a single trial are modelled as a function
of explanatory variables. Logistic regression helps estimate the probability of
falling into a specific level of the categorical dependent variable based on the given
predictor variables. Suppose, for example, that you want to predict whether there will be
snowfall tomorrow in New York. Here the outcome of the prediction is not a continuous
number, because there will either be snowfall or no snowfall, so linear regression cannot
be applied. The outcome variable is instead one of several categories, and logistic
regression is the appropriate tool.
g. Artificial Neural Networks
An artificial neural network is a computing system designed to simulate the way the human
brain analyzes and processes information. Artificial Neural Networks (ANNs) are a
foundation of Artificial Intelligence (AI) and solve problems that would prove impossible or
difficult by human or statistical standards. An ANN has self-learning capabilities that enable
it to produce better results as more data becomes available.
h. Random Forests
Random Forest is a go-to machine learning algorithm that uses a bagging
approach to create a collection of decision trees, each built on a random subset of the data.
The model is trained several times on random samples of the dataset to achieve good
prediction performance. In this ensemble learning method, the output of all the decision
trees in the random forest is combined to make the final prediction, either by polling the
results of each decision tree or by going with the prediction that appears most often among
the trees. For instance, if 5 friends predict that you will like restaurant R but only 2 predict
that you will not, the final prediction is that you will like restaurant R, because the majority
wins.
i. Decision Trees
A decision tree is a graphical representation that makes use of a branching
methodology to exemplify all possible outcomes of a decision, based on certain
conditions. In a decision tree, each internal node represents a test on an attribute, each
branch of the tree represents an outcome of the test, and each leaf node represents a
particular class label, i.e. the decision made after evaluating all of the attributes. The
classification rules are represented by the paths from the root to the leaf nodes.
Types of Decision Trees:
 Classification Trees - These are considered the default kind of decision
trees, used to separate a dataset into different classes based on the response
variable. They are generally used when the response variable is categorical
in nature.
 Regression Trees - When the response or target variable is continuous or
numerical, regression trees are used. They are generally used for predictive
problems rather than classification.
j. Nearest Neighbors
The principle behind nearest neighbor methods is to find a predefined number of
training samples closest in distance to the new point, and predict the label from these.
The number of samples can be a user-defined constant, or vary based on the local density
of points. The distance can, in general, be any metric measure: standard Euclidean
distance is the most common choice. Neighbors-based methods are known as non-
generalizing machine learning methods, since they simply “remember” all of their training
data.

Despite its simplicity, nearest neighbors has been successful in a large number of
classification and regression problems, including handwritten digits and satellite image
scenes. Being a non-parametric method, it is often successful in classification situations
where the decision boundary is very irregular.
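
As referenced in item (a), here is a brief sketch of several of the algorithms above, assuming
the scikit-learn library is available; the bundled Iris dataset and the parameter choices are
illustrative assumptions, not part of the original text.

# Supervised classifiers and one unsupervised clusterer from the list above.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Supervised models: fit on labelled training data, then score on held-out data.
classifiers = {
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="linear"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    print(name, "accuracy:", model.score(X_test, y_test))

# Unsupervised clustering: K-Means ignores the labels and partitions
# the data into k clusters on its own.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
print("K-Means cluster assignments:", kmeans.fit_predict(X)[:10])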
Computer Vision

Computer vision comes from modelling image processing using the techniques of machine
learning. Computer vision applies machine learning to recognise patterns for the interpretation
of images, much like the process of visual reasoning in human vision: we can distinguish
between objects, classify them, sort them according to their size, and so forth. Computer
vision, like image processing, takes images as input and gives output in the form of
information on size, colour intensity, etc.

The image shows how computer vision works in comparison to how humans process visual
input.

Computer Vision is the process of using machines to understand and analyse imagery (both
photos and videos). While these types of algorithms have been around in various forms since
the 1960’s, recent advances in Machine Learning, as well as leaps forward in data storage,
computing capabilities, and cheap high-quality input devices, have driven major improvements
in how well our software can explore this kind of content.
Computer Vision is the broad parent name for any computations involving visual content – that
means images, videos, icons, and anything else with pixels involved. But within this parent
idea, there are a few specific tasks that are core building blocks:

 In object classification, you train a model on a dataset of specific objects, and the
model classifies new objects as belonging to one or more of your training categories.
 For object identification, your model will recognize a specific instance of an object –
for example, parsing two faces in an image and tagging one as Tom Cruise and one
as Katie Holmes.

A classical application of computer vision is handwriting recognition for digitizing handwritten
content (we’ll explore more use cases below). Outside of just recognition, other methods of
analysis include:

 Video motion analysis uses computer vision to estimate the velocity of objects in a
video, or the camera itself.
 In image segmentation, algorithms partition images into multiple sets of views.
 Scene reconstruction creates a 3D model of a scene inputted through images or
video (check out Selva).
 In image restoration, noise such as blurring is removed from photos using Machine
Learning based filters.

Any other application that involves understanding pixels through software can safely be
labeled as computer vision.

How Computer Vision Works


One of the major open questions in both Neuroscience and Machine Learning is: how exactly
do our brains work, and how can we approximate that with our own algorithms? The reality is
that there are very few working and comprehensive theories of brain computation; so despite
the fact that Neural Nets are supposed to “mimic the way the brain works,” nobody is quite
sure if that’s actually true. Jeff Hawkins has an entire book on this topic called On Intelligence.
The same paradox holds true for computer vision – since we’re not decided on how the brain
and eyes process images, it’s difficult to say how well the algorithms used in production
approximate our own internal mental processes. For example, studies have shown that some
functions that we thought happen in the brain of frogs actually take place in the eyes. We’re a
far cry from amphibians, but similar uncertainty exists in human cognition.

Machines interpret images very simply: as a series of pixels, each with their own set of color
values. Consider the simplified image below, and how grayscale values are converted into a
simple array of numbers:

Think of an image as a giant grid of different squares, or pixels (this image is a very simplified
version of what looks like either Abraham Lincoln or a Dementor). Each pixel in an image can
be represented by a number, usually from 0 – 255. The series of numbers on the right is what
software sees when you input an image. For our image, there are 12 columns and 16 rows,
which means there are 192 input values for this image.

When we start to add in color, things get more complicated. Computers usually read color as
a series of 3 values – red, green, and blue (RGB) – on that same 0 – 255 scale. Now, each
pixel actually has 3 values for the computer to store in addition to its position. If we were to
colorize President Lincoln (or Harry Potter’s worst fear), that would lead to 12 x 16 x 3 values,
or 576 numbers.
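
To make this concrete, here is a small sketch of how an image becomes an array of numbers,
assuming the Pillow and NumPy libraries; "lincoln.png" is a hypothetical file name used only
for illustration.

# How software "sees" an image: each pixel becomes one or three 0-255 values.
from PIL import Image
import numpy as np

img = Image.open("lincoln.png")

gray = np.array(img.convert("L"))    # grayscale: one 0-255 value per pixel
print(gray.shape)                    # e.g. (16, 12) -> 16 rows x 12 columns = 192 values

rgb = np.array(img.convert("RGB"))   # colour: three 0-255 values (R, G, B) per pixel
print(rgb.shape)                     # e.g. (16, 12, 3) -> 576 values in total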

What are the practical uses of computer vision?


 The computer vision and hardware market is expected to reach $48.6 billion by 2022, so
the sector is growing. And this is where we get to the really good stuff: there is almost no
end of uses for computer vision. Think of any futuristic situation, and there’s likely a
computer vision-related solution that can or will someday be applied. Take those fancy
Tesla cars you’ve heard so much about: they rely on a host of cameras as well as sonar
that not only prevent your car from drifting out of a lane, but are able to see what other
objects and vehicles are around you and also read signs and traffic signals. In fact, Tesla’s
cars actually look under the car in front of you to the car ahead of it to take traffic patterns
into account. Similarly, as reliant on technology as today’s healthcare already is, computer
vision will enable new ways of doing diagnostics that are closer to Star Trek, analyzing X-
rays, MRI, CAT, mammography, and other scans. (After all, some 90 percent of all
medical data is image based.)

 And computer vision will also help make robots and drones an ordinary part of everyday
life. Imagine fleets of firefighting drones and robots sent into wildfires to cut down trees
and guide water delivery. Or fleets of drones sent to search for lost hikers, or earthquake
survivors, or shipwrecked sailors. In fact, drones are being used to help farmers keep tabs
on crops, but satellites as well can help farmers manage their fields, look for signs of
drought or infestation, perhaps even analyze soil types and weather conditions to optimize
fertilization and planting schedules.

 In sports, computer vision is being applied to such tasks as play and strategy analysis
and on-field movement in games, ball and puck tracking for automated camera work,
and comprehensive evaluation of brand sponsorship visibility in sports broadcasts, online
streaming, and social media. No surprise here, considering the sports and entertainment
market is expected to grow to $1.37 billion by 2019.

 And finally, new forms of personal technology will appear, and even new types of media,
similar to the way movies and TV were inventions of the last century. Immersive
technology that makes the viewer feel physically transported is already arriving in the
form of virtual and augmented reality, which is familiar to anyone who has witnessed
frenzied Pokémon Go players searching for imaginary monsters in the real world using
their phones. That’s rudimentary tech, but it shows how convincing and satisfying it can
be – wait until we all own VR goggles.

Speech recognition

Speech recognition is the ability of a machine or program to identify words and phrases in
spoken language and convert them to a machine-readable format. Rudimentary speech
recognition software has a limited vocabulary of words and phrases and may only identify
these if they are spoken very clearly. More sophisticated software has the ability to accept
natural speech. Speech recognition applications include call routing, speech-to-text, voice
dialing and voice search.

The terms "speech recognition" and "voice recognition" are sometimes used interchangeably.
However, the two terms mean different things. Speech recognition is used to identify words in
spoken language. Voice recognition is a biometric technology used to identify a particular
individual's voice.

How It Works

Today, when we call most large companies, a person doesn't usually answer the phone.
Instead, an automated voice recording answers and instructs you to press buttons to move
through option menus. Many companies have moved beyond requiring you to press buttons,
though. Often you can just speak certain words (again, as instructed by a recording) to get
what you need. The system that makes this possible is a type of speech recognition program -
- an automated phone system.
You can also use speech recognition software in homes and businesses. A range of software
products allows users to dictate to their computer and have their words converted to text in a
word processing or e-mail document. You can access function commands, such as opening
files and accessing menus, with voice instructions. Some programs are for specific business
settings, such as medical or legal transcription.

People with disabilities that prevent them from typing have also adopted speech-recognition
systems. If a user has lost the use of his hands, or for visually impaired users when it is not
possible or convenient to use a Braille keyboard, the systems allow personal expression
through dictation as well as control of many computer tasks. Some programs save users'
speech data after every session, allowing people with progressive speech deterioration to
continue to dictate to their computers. Current programs fall into two categories:
Small-vocabulary/many-users – these systems are ideal for automated telephone
answering. The users can speak with a great deal of variation in accent and speech patterns,
and the system will still understand them most of the time. However, usage is limited to a small
number of predetermined commands and inputs, such as basic menu options or numbers.

Large-vocabulary/limited-users – These systems work best in a business environment
where a small number of users will work with the program. While these systems work with a
good degree of accuracy (85 percent or higher with an expert user) and have vocabularies in
the tens of thousands, you must train them to work best with a small number of primary users.
The accuracy rate will fall drastically with any other user.

Speech recognition systems made more than 10 years ago also faced a choice
between discrete and continuous speech. It is much easier for the program to understand
words when we speak them separately, with a distinct pause between each one. However,
most users prefer to speak in a normal, conversational speed. Almost all modern systems are
capable of understanding continuous speech.

Types of voice recognition systems


Automatic speech recognition is just one example of voice recognition, below are other
examples of voice recognition systems.
 Speaker dependent system - The voice recognition requires training before it can
be used, which requires you to read a series of words and phrases.
 Speaker independent system - The voice recognition software recognizes most
users’ voices with no training.
 Discrete speech recognition - The user must pause between each word so that the
speech recognition can identify each separate word.
 Continuous speech recognition - The voice recognition can understand a normal
rate of speaking.
 Natural language - The speech recognition not only can understand the voice but
also return answers to questions or other queries that are being asked.
Applications of Speech Recognition
The technology is gaining popularity in many areas and has been successful in the
following:
 Device control. Just saying "OK Google" to an Android phone fires up a system that
is all ears to your voice commands.
 Car Bluetooth systems. Many cars are equipped with a system that connects its radio
mechanism to your smartphone through Bluetooth. You can then make and receive
calls without touching your smartphone and can even dial numbers by just saying
them.
 Voice transcription. In areas where people have to type a lot, intelligent
software captures their spoken words and transcribes them into text. This feature is
available in certain word processing software, and voice transcription also works with
visual voicemail (a brief speech-to-text sketch appears after this list).
 Automated phone systems. Many companies today use phone systems that help
direct the caller to the correct department. If you have ever been asked something
like "Say or press number 2 for support" and you say "2," you used voice recognition.
 Siri. Apple's Siri is another good example of voice recognition that helps answer
questions on Apple devices.
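
As referenced above, here is a minimal speech-to-text sketch, assuming the third-party Python
SpeechRecognition package is installed; "meeting.wav" is a hypothetical file name.

# Transcribe a recorded audio file with the SpeechRecognition package.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("meeting.wav") as source:
    audio = recognizer.record(source)        # read the whole file into memory

try:
    # Send the audio to Google's free web recognizer and print the transcript.
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("The speech could not be understood.")
except sr.RequestError:
    print("The recognition service could not be reached.")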
Problems with Speech Recognition
 Speech recognition, in its version known as Speech to Text (STT), has also been
used for a long time to translate spoken words into text. “You talk, it types”, as
ViaVoice would say on its box. But there is one problem with STT as we know it.
More than 10 years back, I tried ViaVoice and it did not last a week on my computer.
Why? It was grossly inaccurate and I ended up spending more time and energy
speaking and correcting than typing everything.
 ViaVoice is one of the best in the industry, so imagine the rest. The technology has
matured and improved, but speech-to-text still raises questions. One of its main
difficulties is the immense variation among people in pronouncing words.
 Not all languages are supported by speech recognition, and those that are supported
are often not handled as well as English. As a result, most devices that run speech
recognition software perform reasonably well only in English.
 A set of hardware requirements makes speech recognition difficult to deploy in
certain cases. You need a microphone that is intelligent enough to filter out
background noise but at the same time powerful enough to capture voice naturally.
 Speaking of background noise, it can cause a whole system to fail. As a result,
speech recognition fails in many cases due to noises that are out of the user's
control.
 Speech recognition is proving to be better off as an input method for new phones and
communication technologies like VoIP, than as a productivity tool for mass text input.
Machine Learning
Machine learning is a method of data analysis that automates analytical model building. Using
algorithms that iteratively learn from data, machine learning allows computers to find hidden
insights without being explicitly programmed where to look.
The iterative aspect of machine learning is important because as models are exposed to new
data, they are able to independently adapt. They learn from previous computations to produce
reliable, repeatable decisions and results. It’s a science that’s not new – but one that’s gaining
fresh momentum. Because of new computing technologies, machine learning today is not like
machine learning of the past. While many machine learning algorithms have been around for
a long time, the ability to automatically apply complex mathematical calculations to big data –
over and over, faster and faster – is a recent development. Here are a few widely publicized
examples of machine learning applications that you may be familiar with:
 The heavily hyped, self-driving Google car.
 Online recommendation offers like those from Amazon and Netflix.
 Knowing what customers are saying about you on Twitter.
 Fraud detection.
Why the increased interest in machine learning?
Resurging interest in machine learning is due to the same factors that have made data mining
and Bayesian analysis more popular than ever. Things like growing volumes and varieties of
available data, computational processing that is cheaper and more powerful, and affordable
data storage. All of these things mean it’s possible to quickly and automatically produce
models that can analyse bigger, more complex data and deliver faster, more accurate results
– even on a very large scale. The result? High-value predictions that can guide better
decisions and smart actions in real time without human intervention.
One key to producing smart actions in real time is automated model building. Analytics thought
leader Thomas H. Davenport wrote in The Wall Street Journal that with rapidly changing,
growing volumes of data, “you need fast-moving modelling streams to keep up,” and you can
do that with machine learning. He says, “Humans can typically create one or two good models
a week; machine learning can create thousands of models a week.”
How is machine learning used today?
Ever wonder how an online retailer provides nearly instantaneous offers for other products
that may interest you? Or how lenders can provide near-real-time answers to your loan
requests? Many of our day-to-day activities are powered by machine learning algorithms,
including:
 Fraud detection.
 Web search results.
 Real-time ads on web pages and mobile devices.
 Text-based sentiment analysis.
 Credit scoring and next-best offers.
 Prediction of equipment failures.
 New pricing models.
 Network intrusion detection.
 Pattern and image recognition.
 Email spam filtering.
What are some popular machine learning methods?
Two of the most widely adopted machine learning methods are supervised learning and
unsupervised learning. Most machine learning – about 70 percent – is supervised learning.
Unsupervised learning accounts for 10 to 20 percent. Semi-supervised and reinforcement
learning are two other technologies that are sometimes used.
Supervised learning algorithms are trained using labelled examples, such as an input where
the desired output is known. For example, a piece of equipment could have data points
labelled either “F” (failed) or “R” (runs). The learning algorithm receives a set of inputs along
with the corresponding correct outputs, and the algorithm learns by comparing its actual output
with correct outputs to find errors. It then modifies the model accordingly. Through methods
like classification, regression, prediction and gradient boosting, supervised learning uses
patterns to predict the values of the label on additional unlabelled data. Supervised learning
is commonly used in applications where historical data predicts likely future events. For
example, it can anticipate when credit card transactions are likely to be fraudulent or which
insurance customer is likely to file a claim.
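For illustration only, here is a minimal sketch of training on such labelled data, assuming
scikit-learn; the sensor readings, the “F”/“R” labels and the choice of a decision tree are
assumptions made for the sketch.

# Supervised learning on labelled equipment data (invented for illustration).
from sklearn.tree import DecisionTreeClassifier

# Each row: [temperature, vibration]; label "F" = failed, "R" = runs.
X = [[95, 0.9], [92, 0.8], [40, 0.1], [38, 0.2], [90, 0.7], [42, 0.15]]
y = ["F", "F", "R", "R", "F", "R"]

model = DecisionTreeClassifier().fit(X, y)
print(model.predict([[88, 0.75], [41, 0.12]]))   # likely ['F' 'R'] for this toy data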
Unsupervised learning is used against data that has no historical labels. The system is not
told the “right answer”. The algorithm must figure out what is being shown. The goal is to
explore the data and find some structure within. Unsupervised learning works well on
transactional data. For example, it can identify segments of customers with similar attributes
who can then be treated similarly in marketing campaigns. Or it can find the main attributes
that separate customer segments from each other. Popular techniques include self-organizing
maps, nearest-neighbor mapping, k-means clustering and singular value decomposition.
These algorithms are also used to segment text topics, recommend items and identify data
outliers.
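A small sketch of the customer-segmentation idea, again assuming scikit-learn; the spending
and visit figures are invented purely for illustration.

# Unsupervised segmentation: no labels, K-Means groups similar customers itself.
from sklearn.cluster import KMeans

# Each row: [annual spend in dollars, store visits per year].
customers = [[200, 4], [250, 6], [2200, 45], [2500, 50], [1200, 20], [1100, 24]]

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(customers)
print(segments)   # customers with similar attributes end up in the same segment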
Semi-supervised learning is used for the same applications as supervised learning. But it
uses both labelled and unlabelled data for training – typically a small amount of labelled data
with a large amount of unlabelled data (because unlabelled data is less expensive and takes
less effort to acquire). This type of learning can be used with methods such as classification,
regression and prediction. Semi-supervised learning is useful when the cost associated with
labelling is too high to allow for a fully labelled training process. Early examples of this include
identifying a person’s face on a web cam.
Reinforcement learning is often used for robotics, gaming and navigation. With
reinforcement learning, the algorithm discovers through trial and error which actions yield the
greatest rewards. This type of learning has three primary components: the agent (the learner
or decision maker), the environment (everything the agent interacts with) and actions (what
the agent can do). The objective is for the agent to choose actions that maximize the expected
reward over a given amount of time. The agent will reach the goal much faster by following a
good policy. So the goal in reinforcement learning is to learn the best policy.
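
To make the agent-environment-action loop concrete, here is a rough tabular Q-learning sketch
on an invented five-state corridor; every number and name in it is an illustrative assumption,
not something taken from the text.

# Tabular Q-learning on a tiny corridor: move right to reach the reward at state 4.
import random

n_states, n_actions = 5, 2          # states 0..4, actions: 0 = left, 1 = right
q_table = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    # Move left/right within bounds; reaching the last state yields a reward of 1.
    next_state = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy selection: explore sometimes, otherwise exploit current estimates.
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q_table[state][a])
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        best_next = max(q_table[next_state])
        q_table[state][action] += alpha * (reward + gamma * best_next - q_table[state][action])
        state = next_state

print(q_table)   # learned values favour "right" (action 1) in states 0 through 3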

Support Vector Machines (SVMs)


In machine learning, support vector machines (SVMs) are supervised learning models with
associated learning algorithms that analyze data used for classification and regression
analysis. Given a set of training examples, each marked as belonging to one or the other of
two categories, an SVM training algorithm builds a model that assigns new examples to one
category or the other, making it a non-probabilistic binary linear classifier (although methods
such as Platt scaling exist to use SVM in a probabilistic classification setting). An SVM model
is a representation of the examples as points in space, mapped so that the examples of the
separate categories are divided by a clear gap that is as wide as possible. New examples are
then mapped into that same space and predicted to belong to a category based on which side
of the gap they fall. In addition to performing linear classification, SVMs can efficiently perform
a non-linear classification using what is called the kernel trick, implicitly mapping their inputs
into high-dimensional feature spaces. When data are not labelled, supervised learning is not
possible, and an unsupervised learning approach is required, which attempts to find
natural clustering of the data to groups, and then map new data to these formed groups.
The support vector clustering algorithm, created by Hava Siegelmann and Vladimir Vapnik,
applies the statistics of support vectors, developed in the support vector machines algorithm,
to categorize unlabelled data, and is one of the most widely used clustering algorithms in
industrial applications.

SVMs can be used to solve various real-world problems:

 SVMs are helpful in text and hypertext categorization as their application can significantly
reduce the need for labelled training instances in both the standard inductive
and transductive settings.

 Classification of images can also be performed using SVMs. Experimental results show
that SVMs achieve significantly higher search accuracy than traditional query refinement
schemes after just three to four rounds of relevance feedback. This is also true of image
segmentation systems, including those using a modified version of SVM that uses the
privileged approach as suggested by Vapnik.

 Hand-written characters can be recognized using SVM.

 The SVM algorithm has been widely applied in the biological and other sciences. They
have been used to classify proteins with up to 90% of the compounds classified
correctly. Permutation tests based on SVM weights have been suggested as a
mechanism for interpretation of SVM models. Support vector machine weights have also
been used to interpret SVM models in the past. Posthoc interpretation of support vector
machine models in order to identify features used by the model to make predictions is a
relatively new area of research with special significance in the biological sciences.

 More formally, a support vector machine constructs a hyperplane or set of hyperplanes in
a high- or infinite-dimensional space, which can be used for classification, regression, or
other tasks such as outlier detection. Intuitively, a good separation is achieved by the
hyperplane that has the largest distance to the nearest training-data point of any class
(the so-called functional margin), since in general the larger the margin, the lower
the generalization error of the classifier.
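
As a rough illustration of margin maximization and the kernel trick, assuming scikit-learn;
the two-moons toy dataset stands in for real data and is not from the original text.

# Linear vs. kernelized SVM on a dataset that is not linearly separable.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)  # kernel trick

print("Linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:", rbf_svm.score(X_test, y_test))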
Deep Learning
Deep Learning is a subfield of machine learning concerned with algorithms inspired by the
structure and function of the brain called artificial neural networks.
Deep learning trains a computer to perform human-like tasks, such as recognizing speech,
identifying images or making predictions. Instead of organizing data to run through predefined
equations, deep learning sets up basic parameters about the data and trains the computer to
learn on its own by recognizing patterns using many layers of processing.
Why is deep learning important today?
Deep learning is one of the foundations of cognitive computing. The current interest in deep
learning is due in part to the buzz surrounding cognitive computing, software applications that
understand human input and can respond in humanlike form or output. Deep learning
techniques have greatly improved our ability to classify, recognize, detect and describe – in
one word, understand. Many applications are in fields where these tasks apply to non-
numerical data, for example, classification of images, recognizing speech, detecting objects
and describing content. Systems such as Siri and Cortana are powered in part by cognitive
computing and driven by deep machine learning.
Several developments are now advancing deep learning. Algorithmic improvements have
boosted the performance of deep learning methods, and improving the accuracy of machine
learning approaches creates massive business value. New classes of neural networks have
been developed that fit particularly well for applications like text translation and image
classification. Neural networks have existed for almost 50 years. They had fallen out of favor
by the late 1990s due to difficulties in getting good accuracy in the absence of large training
data. However, two changes within the last decade have revolutionized their use:
1. We have a lot more data available to build neural networks with many deep layers,
including streaming data from the Internet of Things, textual data from social media,
physician’s notes and investigative transcripts.
2. Computational advances of distributed cloud computing and graphics processing units
have put incredible computing power at our disposal. This level of computing power is
necessary to train deep algorithms.
At the same time, human-to-machine interfaces have evolved greatly as well. The mouse and
the keyboard are being replaced with gesture, swipe, touch and natural language, ushering in
a renewed interest in cognitive computing.
Applications of Deep Learning
 Due to the iterative nature of deep learning algorithms, their growing complexity as the
number of layers increases, and the large volumes of data needed to train the networks, a
lot of computational power is needed to solve deep learning problems.
 Traditional modeling methods are well understood, and their predictive methods and
business rules can be explained. Deep learning methods have been characterized as
more of a black-box approach. You can prove that they perform well by testing them on
new data. However, it is difficult to explain to decision makers why they produce a
particular outcome, due to their nonlinear nature. This can create some resistance to
adoption of these techniques, especially in highly regulated industries.
 On the other hand, the dynamic nature of learning methods – their ability to continuously
improve and adapt to changes in the underlying information pattern – presents a great
opportunity to introduce less deterministic, more dynamic behavior into analytics. Greater
personalization of customer analytics is one possibility.
 Another great opportunity is to improve accuracy and performance in applications where
neural networks have been used for a long time. Through better algorithms and more
computing power, we can add greater depth.
 While the current market focus of deep learning techniques is in applications of cognitive
computing, SAS sees great potential in more traditional analytics applications, for
example, time series analysis.
 Another opportunity is to simply be more efficient and streamlined in existing analytical
operations. Recently, SAS experimented with deep neural networks in speech-to-text
transcription problems. Compared to the standard techniques, the word-error-rate (WER)
decreased by more than 10 percent when deep neural networks were applied. They also
eliminated about 10 steps of data preprocessing, feature engineering and modeling.
Computationally, it might take longer to train the deep network compared to the traditional
modeling flow, but the impressive performance gains and the time savings when
compared to feature engineering signify a paradigm shift.
How is deep learning being used?
To the outside eye, deep learning may appear to be in a research phase as computer science
researchers and data scientists continue to test its capabilities. However, deep learning has
many practical applications that businesses are using today, and many more that will be used
as research continues. Popular uses today include:
 Speech Recognition: Both the business and academic worlds have embraced deep
learning for speech recognition. Xbox, Skype, Google Now and Apple’s Siri®, to name
a few, are already employing deep learning technologies in their systems to recognize
human speech and voice patterns.
 Image Recognition: One practical application of image recognition is automatic image
captioning and scene description. This could be crucial in law enforcement
investigations for identifying criminal activity in thousands of photos submitted by
bystanders in a crowded area where a crime has occurred. Self-driving cars will also
benefit from image recognition through the use of 360-degree camera technology.
 Natural Language Processing: Neural networks, a central component of deep
learning, have been used to process and analyze written text for many years. A
specialization of text mining, this technique can be used to discover patterns in
customer complaints, physician notes or news reports, to name a few.
 Recommendation Systems: Amazon and Netflix have popularized the notion of a
recommendation system with a good chance of knowing what you might be interested
in next, based on past behavior. Deep learning can be used to enhance
recommendations in complex environments such as music interests or clothing
preferences across multiple platforms.
Artificial Intelligence (AI)

Artificial intelligence (AI) is an area of computer science that emphasizes the creation of
intelligent machines that work and react like humans. It has become an essential part of
the technology industry. Some of the activities computers with artificial intelligence are
designed for include:

 Speech recognition
 Learning
 Planning
 Problem solving

Research associated with artificial intelligence is highly technical and specialized. The core
problems of artificial intelligence include programming computers for certain traits such as:

 Knowledge
 Reasoning
 Problem solving
 Perception
 Learning
 Planning
 Ability to manipulate and move objects

Knowledge engineering is a core part of AI research. Machines can often act and react like
humans only if they have abundant information relating to the world. Artificial intelligence must
have access to objects, categories, properties and relations between all of them to implement
knowledge engineering. Instilling common sense, reasoning and problem-solving power in
machines is a difficult and tedious task.

Machine learning is another core part of AI. Learning without any kind of supervision requires
an ability to identify patterns in streams of inputs, whereas learning with adequate supervision
involves classification and numerical regressions. Classification determines the category an
object belongs to and regression deals with obtaining a set of numerical input or output
examples, thereby discovering functions enabling the generation of suitable outputs from
respective inputs. Mathematical analysis of machine learning algorithms and their
performance is a well-defined branch of theoretical computer science often referred to as
computational learning theory.

Machine perception deals with the capability to use sensory inputs to deduce the different
aspects of the world, while computer vision is the power to analyze visual inputs with a few
sub-problems such as facial, object and gesture recognition.

Robotics is also a major field related to AI. Robots require intelligence to handle tasks such
as object manipulation and navigation, along with sub-problems of localization, motion
planning and mapping.

Types of artificial intelligence

AI can be categorized in any number of ways, but here are two examples.

The first classifies AI systems as either weak AI or strong AI. Weak AI, also known as narrow
AI, is an AI system that is designed and trained for a particular task. Virtual personal assistants,
such as Apple's Siri, are a form of weak AI.

Strong AI, also known as artificial general intelligence, is an AI system with generalized
human cognitive abilities so that when presented with an unfamiliar task, it has enough
intelligence to find a solution. The Turing Test, developed by mathematician Alan Turing in
1950, is a method used to determine if a computer can actually think like a human, although
the method is controversial.

The second example is from Arend Hintze, an assistant professor of integrative biology and
computer science and engineering at Michigan State University. He categorizes AI into four
types, from the kind of AI systems that exist today to sentient systems, which do not yet exist.
His categories are as follows:

 Type 1: Reactive machines. An example is Deep Blue, the IBM chess program that beat
Garry Kasparov in the 1990s. Deep Blue can identify pieces on the chess board and make
predictions, but it has no memory and cannot use past experiences to inform future ones.
It analyzes possible moves -- its own and its opponent's -- and chooses the most strategic
move. Deep Blue and Google's AlphaGO were designed for narrow purposes and cannot
easily be applied to another situation.
 Type 2: Limited memory. These AI systems can use past experiences to inform future
decisions. Some of the decision-making functions in autonomous vehicles have been
designed this way. Observations are used to inform actions happening in the not-so-distant
future, such as a car that has changed lanes. These observations are not stored
permanently.
 Type 3: Theory of mind. This is a psychology term. It refers to the understanding that
others have their own beliefs, desires and intentions that impact the decisions they make.
This kind of AI does not yet exist.
 Type 4: Self-awareness. In this category, AI systems have a sense of self, have
consciousness. Machines with self-awareness understand their current state and can use
the information to infer what others are feeling. This type of AI does not yet exist.

Artificial Superintelligence

Artificial superintelligence is a term referring to the time when the capability of computers will
surpass humans. "Artificial intelligence," which has been much used since the 1970s, refers
to the ability of computers to mimic human thought. Artificial superintelligence goes a step
beyond and posits a world in which a computer’s cognitive ability is superior to humans.

Most experts would agree that societies have not yet reached the point of artificial
superintelligence. In fact, engineers and scientists are still trying to reach a point that would
be considered full artificial intelligence, where a computer could be said to have the same
cognitive capacity as a human. Although there have been developments like IBM's Watson
supercomputer beating human players at Jeopardy, and assistive devices like Siri engaging
in primitive conversation with people, there is still no computer that can really simulate the
breadth of knowledge and cognitive ability that a fully developed adult human has. The Turing
test, developed decades ago, is still used to talk about whether computers can come close to
simulating human conversation and thought, or whether they can trick other people into
thinking that a communicating computer is actually a human.

However, there is a lot of theory that anticipates artificial superintelligence coming sooner
rather than later. Using examples like Moore's law, which predicts an ever-increasing density
of transistors, experts talk about singularity and the exponential growth of technology, in which
full artificial intelligence could manifest within a number of years, and artificial superintelligence
could exist in the 21st century.

What’s Next for AI?


The advances made by researchers at DeepMind, Google Brain, OpenAI and various
universities are accelerating. AI is capable of solving harder and harder problems better than
humans can.
This means that AI is changing faster than its history can be written, so predictions about its
future quickly become obsolete as well. Are we chasing a breakthrough like nuclear fission
(possible), or are attempts to wring intelligence from silicon more like trying to turn lead into
gold?
There are four main schools of thought, or churches of belief if you will, that group together
how people talk about AI.
Those who believe that AI progress will continue apace tend to think a lot about strong AI, and
whether or not it is good for humanity. Among those who forecast continued progress, one
camp emphasizes the benefits of more intelligent software, which may save humanity from its
current stupidities; the other camp worries about the existential risk of a superintelligence.
Given that the power of AI progresses hand in hand with the power of computational hardware,
advances in computational capacity, such as better chips or quantum computing, will set the
stage for advances in AI. On a purely algorithmic level, most of the astonishing results
produced by labs such as DeepMind come from combining different approaches to AI, much
as AlphaGo combines deep learning and reinforcement learning. Combining deep learning
with symbolic reasoning, analogical reasoning, Bayesian and evolutionary methods all show
promise.
Those who do not believe that AI is making that much progress relative to human intelligence
are forecasting another AI winter, during which funding will dry up due to generally
disappointing results, as has happened in the past. Many of those people have a pet algorithm
or approach that competes with deep learning.
Finally, there are the pragmatists, plugging along at the math, struggling with messy data,
scarce AI talent and user acceptance. They are the least religious of the groups making
prophecies about AI – they just know that it’s hard.
Artificial Intelligence (AI) vs. Machine Learning vs. Deep Learning

AI means getting a computer to mimic human behavior in some way.


Machine learning is a subset of AI, and it consists of the techniques that enable computers
to figure things out from the data and deliver AI applications.
Deep learning, meanwhile, is a subset of machine learning that enables computers to solve
more complex problems.

What is AI?
John McCarthy, widely recognized as one of the godfathers of AI, defined it as “the science
and engineering of making intelligent machines.”
Artificial intelligence as an academic discipline was founded in 1956. The goal then, as now,
was to get computers to perform tasks regarded as uniquely human: things that required
intelligence. Initially, researchers worked on problems like playing checkers and solving logic
problems.

If you looked at the output of one of those checkers playing programs you could see some
form of “artificial intelligence” behind those moves, particularly when the computer beat you.
Early successes caused the first researchers to exhibit almost boundless enthusiasm for the
possibilities of AI, matched only by the extent to which they misjudged just how hard some
problems were.

Artificial intelligence, then, refers to the output of a computer. The computer is doing
something intelligent, so it’s exhibiting intelligence that is artificial.

The term AI doesn’t say anything about how those problems are solved. There are many
different techniques including rule-based or expert systems. And one category of techniques
started becoming more widely used in the 1980s: machine learning.

What is Machine Learning?

Machine learning is the best tool so far to analyze, understand and identify a pattern in the
data. One of the main ideas behind machine learning is that the computer can be trained to
automate tasks that would be exhausting or impossible for a human being. The clear break
from traditional analysis is that machine learning can make decisions with minimal human
intervention.

Machine learning uses data to feed an algorithm that can understand the relationship
between the input and the output. When the machine has finished learning, it can predict the
value or the class of a new data point.
What is Deep Learning?

Deep learning is computer software that mimics the network of neurons in a brain. It is a
subset of machine learning and is called deep learning because it makes use of deep neural
networks. The machine uses different layers to learn from the data, and the depth of the model
is represented by the number of layers in the model. Deep learning is the new state of the art
in terms of AI. In deep learning, the learning phase is done through a neural network. A neural
network is an architecture in which the layers are stacked on top of each other.

Machine Learning Process

Imagine you are meant to build a program that recognizes objects. To train the model, you
will use a classifier. A classifier uses the features of an object to try to identify the class it
belongs to.

In this example, the classifier will be trained to detect whether an image shows a bicycle, a
boat, a car or a plane.

These four objects are the classes the classifier has to recognize. To construct a classifier,
you need to have some data as input and assign a label to it. The algorithm will take these
data, find a pattern and then classify a new example into the corresponding class.

This task is called supervised learning. In supervised learning, the training data you feed to
the algorithm includes a label.

Training an algorithm requires following a few standard steps:

 Collect the data
 Train the classifier
 Make predictions

The first step is essential: choosing the right data will make the algorithm a success or a
failure. The data you choose to train the model are called features. In the object example, the
features are the pixels of the images.

Each image is a row in the data while each pixel is a column. If your images are 28x28 pixels
in size, the dataset contains 784 feature columns (28 x 28). In the picture below, each picture
has been transformed into a feature vector. The label tells the computer what object is in the
image.

The objective is to use these training data to classify the type of object. The first step
consists of creating the feature columns. Then, the second step involves choosing an
algorithm to train the model. When the training is done, the model will predict what picture
corresponds to what object.
After that, it is easy to use the model to classify new images. For each new image fed into
the model, the machine will predict the class it belongs to. For example, suppose an entirely
new image without a label goes through the model. For a human being, it is trivial to
recognize the image as a car; the machine uses its previous knowledge to predict that the
image is a car as well.
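
A rough end-to-end sketch of these steps, assuming scikit-learn; the bundled digits dataset
(8x8 images flattened to 64 feature columns) stands in for the 28x28 example above.

# Collect the data, train the classifier, make predictions - on flattened pixel features.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)      # each row = one image, each column = one pixel
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Step 1: collect the data (done above). Step 2: train the classifier.
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Step 3: make predictions on images the model has never seen.
print("Predicted labels:", clf.predict(X_test[:5]))
print("True labels:     ", list(y_test[:5]))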

Deep Learning Process

In deep learning, the learning phase is done through a neural network. A neural network is
an architecture where the layers are stacked on top of each other.

Consider the same image example above. The training set would be fed to a neural network.

Each input goes into a neuron and is multiplied by a weight. The result of the multiplication
flows to the next layer and becomes its input. This process is repeated for each layer of the
network. The final layer is named the output layer; it provides an actual value for a
regression task and a probability for each class for a classification task. The neural network
uses a mathematical algorithm to update the weights of all the neurons, and it is fully trained
when the values of the weights give an output close to reality. For instance, a well-trained
deep neural network can recognize the object in a picture with higher accuracy than a
traditional, shallow neural net.
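
A bare-bones sketch of the forward pass just described, using plain NumPy; the layer sizes,
random weights and activation choices are illustrative assumptions, not a production model.

# One input flows through two stacked layers: multiply by weights, apply a non-linearity.
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(4)                        # one input example with 4 features

W1, b1 = rng.random((4, 8)), np.zeros(8)   # first layer: 4 inputs -> 8 hidden neurons
W2, b2 = rng.random((8, 3)), np.zeros(3)   # output layer: 8 hidden -> 3 classes

hidden = np.maximum(0, x @ W1 + b1)      # ReLU activation in the hidden layer

# Softmax turns the raw output scores into a probability per class,
# matching the "probability of each class" role of the output layer.
scores = hidden @ W2 + b2
probs = np.exp(scores - scores.max())
probs /= probs.sum()
print(probs)                             # three class probabilities summing to 1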

When to use ML or DL?

                         Machine learning      Deep learning
Training dataset         Small                 Large
Choose features          Yes                   No
Number of algorithms     Many                  Few
Training time            Short                 Long

With machine learning, you need less data to train the algorithm than with deep learning. Deep
learning requires an extensive and diverse data set to identify the underlying structure.
Besides, machine learning provides a faster-trained model; the most advanced deep learning
architectures can take days to a week to train. The advantage of deep learning over machine
learning is that it can be highly accurate, and you do not need to work out which features best
represent the data, because the neural network learns how to select the critical features. In
machine learning, you need to choose for yourself which features to include in the model.

References
https://bus206.pressbooks.com/chapter/chapter-4-data-and-databases/
https://www.guru99.com/what-is-dbms.html
https://medium.com/omarelgabrys-blog/database-introduction-part-1-4844fada1fb0
https://searchdatamanagement.techtarget.com/definition/data-analytics
https://www.microstrategy.com/us/resources/introductory-guides/data-mining-explained
https://docs.oracle.com/cd/B28359_01/datamine.111/b28129/process.htm#CHDEFGIE
https://www.guru99.com/data-mining-tutorial.html
https://bus206.pressbooks.com/chapter/chapter-4-data-and-databases/
https://www.lifewire.com/what-exactly-is-big-data-4051020
https://www.mongodb.com/big-data-explained
https://www.investopedia.com/terms/b/big-data.asp
https://datasciencedegree.wisconsin.edu/data-science/what-is-big-data/
https://www.guru99.com/what-is-big-data.html
https://www.techradar.com/news/what-is-a-neural-network
https://www.explainthatstuff.com/introduction-to-neural-networks.html
https://searchenterpriseai.techtarget.com/definition/neural-network
https://www.digitaltrends.com/cool-tech/what-is-an-artificial-neural-network/
https://gumgum.com/what-is-computer-vision
https://freecontent.manning.com/mental-model-graphic-grokking-deep-learning-for-
computer-vision/
https://blog.algorithmia.com/introduction-to-computer-vision/
https://www.analyticsindiamag.com/what-is-the-difference-between-computer-vision-and-
image-processing/
https://www.dezyre.com/article/top-10-machine-learning-algorithms/202
https://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition.html
http://searchcrm.techtarget.com/definition/speech-recognition
https://www.lifewire.com/what-is-speech-recognition-3426721
http://www.computerhope.com/jargon/v/voicreco.htm
http://www.sas.com/en_us/insights/analytics/machine-learning.html
http://machinelearningmastery.com/what-is-deep-learning/
http://www.sas.com/en_us/insights/analytics/deep-learning.html
https://www.techopedia.com/definition/190/artificial-intelligence-ai
http://searchcio.techtarget.com/definition/AI
https://www.techopedia.com/definition/31619/artificial-superintelligence-asi
https://www.coresystems.net/blog/the-difference-between-artifical-intelligence-artifical-
general-intelligence-and-artifical-super-intelligence
https://www.guru99.com/machine-learning-vs-deep-learning.html
https://blogs.oracle.com/bigdata/difference-ai-machine-learning-deep-learning
