
What we’re going to do today:

1 Welcome
2 Lecture takeaways
3 Project walk-through
4 Next steps
Welcome!

Project Kickoff objectives
● 20 min - Reinforce key lecture material
● 30 min - Build bridge from lecture material to project implementation
● 10 min - Leave with clarity on your next actions for the project
Housekeeping
● Office Hours Friday 9am PT (5pm GMT) + Project Walkthru Sunday 9am PT (5pm GMT)

● Volunteer to lead a study group - it can be anytime that works for you, 1hr each week. See thread in #py-for-ds-announcements

● Projects are due end of day SUNDAY on the platform

○ Fill out the week 1 survey when you submit!

● Code review of your partners' assignments in Slack due MONDAY

○ If you haven’t been tagged by Monday, review any 2 projects that don’t have comments yet - sharing is caring 🤗

Community ❤️🤗

● Join #coffee-chats to be randomly paired with a classmate!


● Sign up to present a community talk - DM Barbara!
● Book a 1:1 with Barbara :)
Lecture: Key Takeaways

Numpy Foundations
- Differences between Numpy and Python Lists
- Numpy Basics (initialization, indexing, slicing)
- Math operations and computation on arrays (broadcasting)
- File Input / Output (genfromtxt)
- Aggregations on arrays (concatenation, stack)
Why Numpy?

- Fixed Type Storage
- “dtype”: int16, int32, float, etc.
- Much less space used than python lists
- Optimized for numeric data
- Contiguous Memory: data elements stored contiguously in computer memory, allowing parallel processing

[Figure: Lists vs. NumPy storage]
NumPy: a 3 x 4 array (8 12 2 3 / 7 5 11 9 / 18 10 4 6) stores each value directly, e.g. Int16: 00000000 00001001
Lists: each element carries Size (Int16), Reference Count (Int32), Object Type (Int32), Object Value (Int64)
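
The storage difference can be measured directly. This is a minimal sketch, assuming CPython (the exact per-object overhead of a list varies by Python version and platform); the variable names are illustrative only.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype="int16")

# A list holds pointers to separate Python int objects, so count both
# the list container and every int object it references.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A NumPy array stores raw fixed-size values back to back.
array_bytes = arr.nbytes  # 1000 elements * 2 bytes each

print(f"list: {list_bytes} bytes, int16 array: {array_bytes} bytes")
```

The array figure is exact (element count times `itemsize`); the list figure is an estimate that should come out many times larger.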
Numpy Basics

- arr = np.array([1, 2, 3], dtype='int16')


- arr_2d[row, col] # access element at (row, col)
- arr_2d[:, col] # all elements in col
- arr_2d[row, :] # all elements in row

- arr_3d[row, :, :] ?
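
The indexing patterns above can be tried on small arrays. This is a sketch with made-up data; `arr_2d`, `arr_3d`, `row` and `col` are just the placeholder names from the slide, filled in with `np.arange` so the results are easy to check by eye.

```python
import numpy as np

# 3 rows x 4 columns: [0..3], [4..7], [8..11]
arr_2d = np.arange(12).reshape(3, 4)
row, col = 1, 2

element = arr_2d[row, col]   # single value at (row, col)
column = arr_2d[:, col]      # every row, one column
full_row = arr_2d[row, :]    # one row, every column

# For a 3D array, fixing the first index leaves a 2D slice.
arr_3d = np.arange(24).reshape(2, 3, 4)
block = arr_3d[row, :, :]    # shape (3, 4)

print(element, column, full_row, block.shape)
```

So `arr_3d[row, :, :]` selects everything along the last two axes for one position on the first axis.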
Examples

shape: (17,000, 9)

shape: (batch_size, h, w, c)
4D Matrix - batch of images
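
To make the two shapes concrete, here is a small sketch using zero-filled placeholder arrays. The values of `batch_size`, `h`, `w` and `c` are assumptions chosen for illustration, not numbers from the lecture. Note that in code the row count is written without a thousands separator.

```python
import numpy as np

# A table-like dataset: 17,000 rows, 9 columns.
table = np.zeros((17_000, 9))

# A batch of images: (batch_size, height, width, channels).
batch_size, h, w, c = 32, 64, 64, 3   # hypothetical sizes
images = np.zeros((batch_size, h, w, c), dtype="uint8")

first_image = images[0]               # one image: (h, w, c)
first_channel = images[:, :, :, 0]    # one channel of every image

print(table.shape, images.shape, first_image.shape, first_channel.shape)
```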
Math operations on Numpy Arrays

Numpy is optimized to perform all kinds of math operations between arrays and matrices.

Sum of matrices?

Matrix multiplication?

Apply a function to all elements of a matrix?

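Element-wise operations also work between arrays of different shapes (for example, a matrix and a single number): NumPy stretches the smaller operand to fit. This concept is called broadcasting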
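
The three questions above, plus a broadcasting case, can be sketched on 2 x 2 matrices. The arrays are made-up examples; all of the operators and functions used are standard NumPy.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

matrix_sum = a + b        # element-wise sum
matrix_product = a @ b    # matrix multiplication (same as np.matmul)
roots = np.sqrt(a)        # function applied to every element

# Broadcasting: the shapes differ, so NumPy stretches the smaller operand.
doubled = a * 2                       # scalar stretched to shape (2, 2)
shifted = a + np.array([100, 200])    # row of shape (2,) applied to each row

print(matrix_sum, matrix_product, roots, doubled, shifted, sep="\n")
```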


Merging Datasets with Numpy

What if we want to combine datasets from 2 sources?

- Data may be of different shapes
- We might only need a portion of data from each source
- This is where reshaping and stacking numpy matrices come in handy
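
As a quick illustration of the tools named above, the following sketch reshapes a flat array, takes a portion of it, and stacks extra rows and columns onto it. All data and variable names here are made up for the example.

```python
import numpy as np

flat = np.arange(6)             # shape (6,)
grid = flat.reshape(2, 3)       # same data viewed as 2 rows x 3 columns

# Keep only a portion: the first two columns.
first_two_cols = grid[:, :2]    # shape (2, 2)

# vstack adds rows; the column counts must match.
more_rows = np.array([[6, 7, 8]])
taller = np.vstack((grid, more_rows))    # shape (3, 3)

# hstack adds columns; the row counts must match.
extra_col = np.array([[10], [20]])
wider = np.hstack((grid, extra_col))     # shape (2, 4)

print(first_two_cols.shape, taller.shape, wider.shape)
```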
Merging Datasets with Numpy

dataset1: shape (4, 3)

ID     Name (remove)     Location
abc
xyz
hjk

dataset2: shape (3, 4)

ID         abc     xyz     hjk
Income     $$$     $$$     $$
Sq. Ft.    1200    1500    800

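import numpy as np

# dtype=str keeps IDs and text fields as strings; the comma delimiter is an assumption about the files
dataset1 = np.genfromtxt('path/to/dataset1', skip_header=1, dtype=str, delimiter=',')

dataset2 = np.genfromtxt('path/to/dataset2', dtype=str, delimiter=',')

# select the ID and Location columns (drops Name); "+" would combine the two columns element-wise instead
data1_name_removed = dataset1[:, [0, 2]] # shape: (3, 2)

# transpose so each row is one ID, then drop the label row
data2_reformatted = dataset2.transpose()[1:, :] # shape: (3, 3)

# the ID column is already in dataset1, so append only Income and Sq. Ft.
new_data = np.hstack((data1_name_removed, data2_reformatted[:, 1:])) # shape: (3, 4)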
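
The merge can be run end to end without any files by feeding `np.genfromtxt` in-memory text. This is only a sketch: the names, locations and income figures below are invented placeholders, the comma delimiter is an assumption, and it relies on both sources listing the IDs in the same order.

```python
import io
import numpy as np

# Hypothetical stand-ins for the two files (contents are made up).
file1 = io.StringIO(
    "ID,Name,Location\n"
    "abc,Ann,Austin\n"
    "xyz,Xia,Boston\n"
    "hjk,Hal,Denver\n"
)
file2 = io.StringIO(
    "ID,abc,xyz,hjk\n"
    "Income,90000,120000,60000\n"
    "SqFt,1200,1500,800\n"
)

dataset1 = np.genfromtxt(file1, skip_header=1, dtype=str, delimiter=",")  # (3, 3)
dataset2 = np.genfromtxt(file2, dtype=str, delimiter=",")                 # (3, 4)

data1_name_removed = dataset1[:, [0, 2]]          # ID, Location
data2_reformatted = dataset2.transpose()[1:, :]   # ID, Income, SqFt per row

# hstack pairs rows by position, so check that the IDs line up first.
ids_match = bool((data1_name_removed[:, 0] == data2_reformatted[:, 0]).all())

new_data = np.hstack((data1_name_removed, data2_reformatted[:, 1:]))
print(ids_match)
print(new_data)
```

If the IDs were in a different order in the two sources, one of the arrays would need to be reordered before stacking.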


Questions?
👀 Reminders
● Office Hours on FRIDAY, Project Walkthru on SUNDAY

● Join Study Groups!

○ Volunteer to lead a study group - it can be anytime that works for you, 1hr each week.
See thread in #announcements

● Projects are due at end of day SUNDAY

● Peer review by end of day MONDAY


We’ll stick around in case anyone has questions
[Fin]
