
Spreadsheet or database?

1) A spreadsheet is the best tool to gather this information first.


2) The data can be easily corrupted.
It cannot store a very large amount of data.
3) A very large amount of data can be stored.
Databases have various methods to ensure security of data.
4) SQL

Are you a database architect?


1) by connected tables of records
2) It is a unique identifier of a record in a table.
It is the id of a record
3) It is a reference to a record from another table
4) id is a primary key of the table directors.
A director has many movies.
A movie belongs to a director
5) An inhabitant belongs to one city and a city has many inhabitants.
6) The tables are connected together through a primary key - foreign key system.
A primary key is basically represented by the field id of a table.
A foreign key is a reference, in a given table, to a record from another table.
A relational database is composed of tables, with records as rows and fields as
columns.
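
The primary key / foreign key answers above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the directors and movies tables come from the answers, but the title column, the sample rows, and the variable names are assumptions made up for the example.

```python
import sqlite3

# In-memory database. Table names follow the answers above;
# the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

# id is the primary key of directors.
conn.execute("CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT)")

# movies.director_id is a foreign key: a reference to a record in directors.
conn.execute(
    """
    CREATE TABLE movies (
        id INTEGER PRIMARY KEY,
        title TEXT,
        director_id INTEGER REFERENCES directors(id)
    )
    """
)

conn.execute("INSERT INTO directors (id, name) VALUES (1, 'Director A')")
conn.executemany(
    "INSERT INTO movies (title, director_id) VALUES (?, ?)",
    [("Movie One", 1), ("Movie Two", 1)],
)

# A director has many movies: two rows in movies point at directors.id = 1.
movie_count = conn.execute(
    "SELECT COUNT(*) FROM movies WHERE director_id = 1"
).fetchone()[0]

# A movie belongs to a director: a reference to a missing director is rejected.
try:
    conn.execute("INSERT INTO movies (title, director_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(movie_count, orphan_rejected)
```

The same pattern would apply to the inhabitants and cities example: the foreign key lives in the table on the "belongs to" side.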

Can you extract this Data?


1) Structured Query Language
2) SELECT genres.name FROM genres
3) SELECT COUNT(*) FROM directors
4) SELECT * FROM movies WHERE LIKE 1999
5) SELECT column FROM table;
6) COUNT()
7) *
8) WHERE
9) This query returns all doctors having a specialty ending with “surgery” (note
the %).
10) SELECT COUNT(*), age FROM students GROUP BY age;
11) SELECT COUNT(*) AS c, specialty FROM doctors GROUP BY specialty ORDER BY c
DESC;
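
As a rough illustration of COUNT(), WHERE with LIKE, and GROUP BY with ORDER BY from the answers above, the queries can be run against a small in-memory SQLite database. The doctors table and its specialty column come from the answers; the name column and the sample rows are assumptions for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctors (id INTEGER PRIMARY KEY, name TEXT, specialty TEXT)"
)
# Invented sample rows, only here to make the queries return something.
conn.executemany(
    "INSERT INTO doctors (name, specialty) VALUES (?, ?)",
    [
        ("Dr. A", "Cardiac surgery"),
        ("Dr. B", "Cardiac surgery"),
        ("Dr. C", "Plastic surgery"),
        ("Dr. D", "Dermatology"),
    ],
)

# COUNT(*) counts all records in the table.
total = conn.execute("SELECT COUNT(*) FROM doctors").fetchone()[0]

# WHERE ... LIKE '%surgery' keeps specialties ending with "surgery"
# (% matches any sequence of characters).
surgeons = conn.execute(
    "SELECT name FROM doctors WHERE specialty LIKE '%surgery' ORDER BY name"
).fetchall()

# GROUP BY counts doctors per specialty; ORDER BY c DESC puts the largest group first.
by_specialty = conn.execute(
    "SELECT COUNT(*) AS c, specialty FROM doctors "
    "GROUP BY specialty ORDER BY c DESC"
).fetchall()

print(total, surgeons, by_specialty[0])
```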

Are you a SQL Pro?


1) SELECT * FROM movies WHERE director_id = 'Angelina Jolie'
2) JOIN
3) SELECT * FROM first_table JOIN second_table ON second_table.id =
first_table.second_table_id;
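
The JOIN pattern in answer 3 can be sketched on the movies and directors tables. The table names and the director name are taken from the answers; the columns title and name, and the sample rows, are assumptions. Note that director_id holds a number, so looking up movies by a director's name goes through a JOIN on the directors table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE directors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director_id INTEGER)"
)
# Invented sample rows for illustration.
conn.executemany(
    "INSERT INTO directors (id, name) VALUES (?, ?)",
    [(1, "Angelina Jolie"), (2, "Director B")],
)
conn.executemany(
    "INSERT INTO movies (title, director_id) VALUES (?, ?)",
    [("Movie One", 1), ("Movie Two", 1), ("Movie Three", 2)],
)

# Same shape as: first_table JOIN second_table
#                ON second_table.id = first_table.second_table_id
rows = conn.execute(
    """
    SELECT movies.title, directors.name
    FROM movies
    JOIN directors ON directors.id = movies.director_id
    WHERE directors.name = ?
    ORDER BY movies.title
    """,
    ("Angelina Jolie",),
).fetchall()

print(rows)
```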

Excel or Python?
1) Python will let you go further in the analysis and show your manager that
there is a collection of reasons you can evaluate: delivery time, missing
items in an order, payment difficulties.
2) Python's computational power may seem overkill, and exploring such financial
files with non-structured data can be a dead end.

Do you know Python Basics?


1) To display the type of object I am handling.
2) int
3) float
4) string
5) It’s a way to store and re-use values in memory.
6) Calling a built-in method (or function) on the variable name.
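
A short sketch of the ideas in these answers: variables, the basic types, type(), and calling a built-in method or function. The variable names and values are made up for the example; note that Python itself names the string type str.

```python
# Variables store values in memory so they can be re-used later.
count = 3        # an int
price = 2.5      # a float
name = "Ada"     # a string (Python calls this type str)

# type() displays the type of the object being handled.
type_names = [type(value).__name__ for value in (count, price, name)]

# Calling a built-in method on the variable name...
shouted = name.upper()
# ...or passing the variable to a built-in function.
length = len(name)

print(type_names, shouted, length)
```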

Do you know Pandas Basics?


1) A type of file where we can store data.
2) A way to access data online.
3) A Python library to help analyse large sets of data.
4) A built-in function to calculate the average from a list of numbers.
5) df[1:3]
6) Indexed tables, with several dimensions of elements of several types
7) df.plot(kind='scatter',x='col_n1',y='col_n2')
8) A way to select elements of a DataFrame using a logical condition
9) False
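
The slicing and logical-condition answers above can be tried on a tiny DataFrame. This is a minimal sketch assuming pandas is installed; the column names col_n1 and col_n2 are borrowed from answer 7, and the values are invented.

```python
import pandas as pd

# A small DataFrame: an indexed table whose columns can hold different types.
df = pd.DataFrame(
    {"col_n1": [1, 2, 3, 4], "col_n2": [10.0, 20.0, 30.0, 40.0]}
)

# df[1:3] slices rows by position: it keeps the rows at positions 1 and 2.
rows_1_to_2 = df[1:3]

# Selecting rows with a logical condition: the mask is a Series of True/False.
mask = df["col_n1"] > 2
filtered = df[mask]

# Average of a column.
mean_col_n2 = df["col_n2"].mean()

print(rows_1_to_2)
print(filtered)
print(mean_col_n2)
```

With matplotlib installed as well, df.plot(kind='scatter', x='col_n1', y='col_n2') from answer 7 would draw these two columns against each other.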
