services. You’ll also get advice for making the best use of your time while enrolled in this
program.
Welcome to the Data Analyst Nanodegree program! In this lesson, you will learn more about the
structure of the program and meet the team.
In this lesson, you'll hear from a few data analysts and data scientists about what it's like to work in data
analytics.
This lesson covers the basics of how to use SQL to extract data from a database. It will prepare you to
complete the Explore Weather Trends project.
In this project, you will analyze local and global temperature data and compare the temperature trends
where you live to overall global temperature trends.
In this lesson you will learn how to use Python's numeric and string data types. You will use built-in
functions and methods to process this data, and store the results in variables.
In this lesson you will install Python on your own computer. You will learn how to define functions, and
how to use conditional statements to write more elaborate programs. We will also practice our software
engineering skills by learning how to break programs down into manageable pieces.
In this lesson you will learn how to use Python's collections: lists, sets and dictionaries. You will learn
how to iterate over these collections with for loops and while loops. You will also learn how to build
compound data structures that combine these collections. We will practice the software engineering
skills of refactoring and self-reliant problem solving.
In this lesson you will extend your knowledge of functions by learning how to specify default arguments
and how to return multiple values from a function. You will learn how to read from files. You will also
learn how to import modules from Python's standard library, and how to install third-party libraries. We
will also learn more about self-reliant problem solving.
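As a rough illustration of the two function features named in this lesson description, the sketch below defines a function with a default argument that returns several values at once. The function name `summarize` and its `precision` parameter are hypothetical and are not taken from the course material.

```python
def summarize(values, precision=2):
    """Return the minimum, maximum, and rounded mean of a list of numbers.

    `precision` has a default value, so callers may omit it.
    (Hypothetical helper, used only for illustration.)
    """
    mean = sum(values) / len(values)
    # Returning several comma-separated values packs them into one tuple.
    return min(values), max(values), round(mean, precision)


# The default precision (2) is used here, and the tuple is unpacked
# into three separate variables.
low, high, mean = summarize([1, 2, 2])

# The default can be overridden with a keyword argument.
low_0, high_0, mean_0 = summarize([1, 2, 2], precision=0)
```

Unpacking works because Python returns a single tuple; the caller could equally keep the result as one value and index into it.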
In this lesson we will apply the skills learned previously by writing a web crawler that explores
Wikipedia.
In this lesson, you will be getting a quick glimpse at the Anaconda environment - one of the most
popular environments for doing data analysis in Python.
Jupyter Notebooks are a great tool for getting started with writing Python code. Although in production
you will often write code in scripts, notebooks are wonderful for sharing insights and data visualizations!
Learn about the data analysis process and practice investigating different datasets using Python and its
powerful packages for data analysis.
Investigate a dataset on chemical properties and quality ratings of wine samples by going through the
entire data analysis process and building more skill with Python for data analysis.
Additional content to expose you to a different workflow for your analysis in Python: IPython's
command line interface, writing scripts in text editors, running scripts in the terminal.
In this section, you will gain knowledge about SQL basics for working with a single table. You will learn
the key commands to filter a table in many different ways.
In this lesson, you will learn how to combine data from multiple tables together.
In this lesson, you will learn how to aggregate data using SQL functions like SUM, AVG, and COUNT.
Additionally, CASE, HAVING, and DATE functions provide you an incredible problem solving toolkit.
In this lesson, you will be learning to answer much more complex business questions using nested
querying methods - also known as subqueries.
Cleaning data is an important part of the data analysis process. You will be learning how to perform data
cleaning using SQL in this lesson.
Compare one row to another without doing any joins using one of the most powerful concepts in SQL
data analysis: window functions.
Lesson 07: [Advanced] SQL Advanced JOINs & Performance Tuning
Learn advanced joins and how to make queries that run quickly across giant datasets. Most of the
examples in the lesson involve edge cases, some of which come up in interviews.
Choose one of Udacity's curated datasets, perform an investigation, and share your findings.
In this lesson, you will learn about data types, measures of center, and the basics of statistical notation.
In this lesson, you will learn about measures of spread, shape, and outliers as associated with
quantitative data. You will also get a first look at inferential statistics.
Learn to ask the right questions, as you learn about Simpson's Paradox.
Learn about one of the most popular distributions in probability - the Binomial Distribution.
Not all events are independent. Learn the probability rules for dependent events.
Learn one of the most popular rules in all of statistics - Bayes' rule.
Take what you have learned in the last lessons and put it to practice in Python.
Learn the mathematics behind moving from a coin flip to a normal distribution.
Learn all about the underpinning of confidence intervals and hypothesis testing - sampling distributions.
Learn how to use sampling distributions and bootstrapping to create a confidence interval for any
parameter of interest.
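A minimal sketch of the idea, assuming the percentile bootstrap (one common variant; the lesson may present it differently): resample the data with replacement many times, compute the statistic on each resample, and take the middle share of the resulting values as the interval. The function name `bootstrap_ci`, its parameters, and the sample values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, stat, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for stat(sample).

    Hypothetical helper for illustration; a fixed seed keeps runs repeatable.
    """
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n observations from the sample with replacement.
    estimates = sorted(
        stat([rng.choice(sample) for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Cut off an equal share of the sorted estimates in each tail.
    tail = (1 - confidence) / 2
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Made-up observations, used only to show the call.
data = [2, 4, 4, 5, 7, 9, 10, 12]
lower, upper = bootstrap_ci(data, statistics.mean)
```

Because `stat` is passed in as a function, the same sketch works for a median or any other statistic, which is the sense in which bootstrapping applies to "any parameter of interest".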
Learn the necessary skills to create and analyze the results in hypothesis testing.
Work through a case study of how A/B testing works for an online education company called Audacity.
Lesson 14: Regression
Use Python to fit linear regression models, and learn how to interpret the results of linear
models.
Learn to apply multiple linear regression models in Python. Learn to interpret the results and
understand whether your model fits well.
Learn to apply logistic regression models in Python. Learn to interpret the results and understand whether your
model fits well.
You will be working to understand the results of an A/B test run by an e-commerce website. Your goal is
to work through the results to help the company decide whether they should implement the new page design.
program.
Welcome to term 2 of the Data Analyst Nanodegree program! In this lesson, you will learn more about
the structure of the program and meet the team.
This lesson contains several exercises to assess your statistics and programming knowledge. It will serve
both as a warm up and as an indication to you of how ready you are to take this program.
Compute descriptive statistics and perform a statistical test on a data set based on a psychological
phenomenon, the Stroop Effect.
Learn about what exploratory data analysis (EDA) is and why it is important.
Lesson 02: R Basics
Install RStudio and packages, learn the layout and basic commands of R, practice writing basic R scripts,
and inspect data sets.
Learn how to quantify and visualize individual variables within a data set using histograms, boxplots, and
transforms.
Practice using R functions and univariate visualizations to explore and understand individual variables.
Learn techniques for exploring the relationship between any two variables in a data set, including
scatter plots, line plots, and correlations.
Learn powerful methods for examining relationships among multiple variables, and find out how to
reshape your data.
Practice using multivariate exploration techniques to look at more complicated relationships between
multiple variables.
Put it all together in this case study where we investigate the diamonds data set alongside Facebook
Data Scientist, Solomon Messing.
Module 02: Project
Choose one of Udacity's curated datasets or find one of your own and perform a complete exploratory
data analysis on the data using R.
Identify each step of the data wrangling process (gathering, assessing, cleaning) through a brief
walkthrough of the process. The dataset for this lesson is an online job postings dataset from Kaggle.
Gather data from various sources and a variety of file formats using Python. Rotten Tomatoes ratings,
Roger Ebert reviews, and Wikipedia movie poster images make up the dataset for this lesson.
Assess data visually and programmatically for quality and tidiness issues using pandas. The dataset for
this lesson is mock Phase II clinical trial data for a new oral insulin called Auralin.
Module 04: Cleaning Data
Lesson 01: Cleaning Data
Using pandas, clean the quality and tidiness issues you identified in the "Assessing Data" lesson. The
dataset is the same: mock Phase II clinical trial data for a new oral insulin called Auralin.
Gather data from a variety of sources and in a variety of formats, assess its quality and tidiness, then
clean it. Showcase your wrangling efforts through analyses and visualizations.
In this lesson, you will learn the main data visualizations used for univariate and bivariate analyses, as
well as the visualizations that are used when you want to compare more variables.
In this lesson, you will learn about visual encodings, and best practices for building data visualizations.
In this lesson, you will learn from a Tableau expert, and start putting together your own dashboards and
stories.
Create a Tableau data visualization from a data set that tells a story or highlights trends or patterns in
the data. Your work should be a reflection of the theory and practice of data visualization.
Version control is an incredibly important part of a professional programmer's life. In this lesson, you'll
learn about the benefits of version control and install the version control tool Git!
Lesson 02: Create A Git Repo
Now that you've learned the benefits of Version Control and gotten Git installed, it's time you learn how
to create a repository.
Knowing how to review an existing Git repository's history of commits is extremely important. You'll
learn how to do just that in this lesson.
A repository is nothing without commits. In this lesson, you'll learn how to make commits, write
descriptive commit messages, and verify the changes you're about to save to the repository.
Being able to work on your project in isolation from other changes will multiply your productivity. You'll
learn how to do this isolated development with Git's branches.
Help! Disaster has struck! You don't have to worry, though, because your project is tracked in version
control! You'll learn how to undo and modify changes that have been saved to the repository.
Review how your GitHub profile, projects, and code represent you as a potential job candidate. Learn to
assess your GitHub profile through the eyes of a recruiter or hiring manager.
Project Description - Optimize Your GitHub Profile
In this lesson, learn how to tell your unique story in a succinct and professional way. Communicate to
employers that you know how to solve problems, overcome challenges, and achieve results.
Optimize your LinkedIn profile to show up in recruiter searches, build your network, and attract
employers. Learn to read your LinkedIn profile through the lens of a recruiter or hiring manager.
Update and personalize your Udacity Professional Profile as you complete your Nanodegree program,
and make your Profile visible to Udacity hiring partners when you’re ready to start your job search.
Learn how to search for jobs effectively by researching the industry and targeting your application to a
specific role.
Receive a personalized review of your resume. This resume review is best suited for applicants who have
0-3 years of work experience in any industry.
Receive a personalized review of your resume. This resume review is best suited for applicants who have
3+ years of work experience in an unrelated field.
Get a personalized review of your cover letter. A successful cover letter will convey your enthusiasm,
specific technical qualifications, and communication skills applicable to the position.
You've practiced a lot for the interview by now. Continue practicing, and you'll ace the interview!
This lesson introduces you to common types of questions you’ll encounter during an in-person
interview. Develop a healthier, more confident mindset around your qualifications as a candidate.
Begin the section on data structures and algorithms, including Python and efficiency practice.
Learn the definition of a list in computer science, and see definitions and examples of list-based data
structures, arrays, linked lists, stacks, and queues.
Explore how to search and sort with list-based data structures, including binary search and bubble,
merge, and quick sort. Learn how to use recursion.
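To make two of the topics in this description concrete, here is a small sketch of binary search written recursively. It assumes the input list is already sorted in ascending order; the function name and the convention of returning -1 for a missing value are choices made for this illustration, not taken from the course.

```python
def binary_search(items, target, low=0, high=None):
    """Return the index of target in the sorted list items, or -1 if absent.

    Illustrative sketch; assumes items is sorted in ascending order.
    """
    if high is None:
        high = len(items) - 1
    # Base case: the search range is empty, so the target is not present.
    if low > high:
        return -1
    mid = (low + high) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        # The target can only be in the upper half.
        return binary_search(items, target, mid + 1, high)
    # Otherwise the target can only be in the lower half.
    return binary_search(items, target, low, mid - 1)


found = binary_search([1, 3, 5, 7, 9, 11], 7)    # index 3
missing = binary_search([1, 3, 5, 7, 9, 11], 4)  # -1
```

Each recursive call halves the remaining range, which is why binary search needs only about log2(n) comparisons on a list of n items.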
Understand the concepts of sets, maps (dictionaries), and hashing. Examine common problems and
approaches to hashing, and practice with examples.
Examine the theoretical concept of a graph and understand common graph terms, coded
representations, properties, traversals, and paths.
Explore famous computer science problems, specifically the Shortest Path Problem, the Knapsack
Problem, and the Traveling Salesman Problem.
Practice with five technical interviewing questions on topics discussed in the data structures and
algorithms course and get a personalized review on both your code and solutions.
Learn about classification, training and testing, and run a naive Bayes classifier using scikit-learn.
Learn about how the decision tree algorithm works, including the concepts of entropy and information
gain.
In this mini project, you will extend your toolbox of algorithms by choosing your own algorithm to
classify terrain data, including k-nearest neighbors, AdaBoost, and random forests.
Find out about the Enron data set used in the next lessons and mini-projects.
Learn about what unsupervised learning is and find out how to use scikit-learn's k-means algorithm.
Learn about feature rescaling and find out which algorithms require feature rescaling before use.
Find out how to use text data in your machine learning algorithm.
Learn about data dimensionality and reducing the number of dimensions with principal component
analysis (PCA).
Learn more about testing, training, cross validation, and parameter grid searches in this lesson.
How do we know if our classifier is performing well? Katie discusses different evaluation metrics for
classifiers in this lesson.
Spend some time reflecting on the course material with Sebastian and Katie!
In this lesson, you'll review the matrix math you'll need to understand to build your neural networks.
You'll also explore NumPy, the library you'll use to efficiently deal with matrices in Python.
Build logic into your code with control flow tools! Learn about conditional statements, repeating code
with loops and useful built-in functions, and list comprehensions.
Learn how to use functions to improve and reuse your code! Learn about functions, variable scope,
documentation, lambda expressions, iterators, and generators.
Set up your own programming environment to write and run Python scripts locally! Learn good scripting
practices, interact with different inputs, and discover awesome tools.
In this lesson you will learn how to use Python's numeric and string data types. You will use built-in
functions and methods to process this data, and store the results in variables.
In this lesson you will install Python on your own computer. You will learn how to define functions, and
how to use conditional statements to write more elaborate programs. We will also practice our software
engineering skills by learning how to break programs down into manageable pieces.
In this lesson you will extend your knowledge of functions by learning how to specify default arguments
and how to return multiple values from a function. You will learn how to read from files. You will also
learn how to import modules from Python's standard library, and how to install third-party libraries. We
will also learn more about self-reliant problem solving.
In this lesson we will apply the skills learned previously by writing a web crawler that explores
Wikipedia.