
TEAM-OOPS OPS

TEAM MEMBERS
AISHWARYA SINGH-01
PRANOY MUKHERJEE-35
PRIYANKA GAIKWAD-38
TARUNI RAJAN-56
LEGO DATABASE

• LEGO is a popular brand of toy building bricks. The bricks are often sold in sets
intended for building a specific object. Each set contains a number of parts in
different shapes, sizes and colors.
• This database contains information on which parts are included in different
LEGO sets. It was originally compiled to help people who owned some LEGO
sets already figure out what other sets they could build with the pieces they
had.
DATASETS USED IN LEGO DATABASE

• Colors.csv: 135 rows and 4 columns
• Inventories.csv: 11681 rows and 3 columns
• Inventory_sets.csv: 2845 rows and 3 columns
• Part_categories.csv: 57 rows and 2 columns
• Themes.csv: 615 rows and 3 columns
• Inventory_parts.csv: 580251 rows and 5 columns
• Parts.csv: 25993 rows and 3 columns
• Sets.csv: 11673 rows and 5 columns
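The row and column counts listed for these files can be re-checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `csv_shape` is hypothetical, and it assumes each file starts with a single header line that is not counted as a data row.

```python
import csv

def csv_shape(path):
    """Return (data_rows, columns) for a CSV file with one header line."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)            # first line holds the column names
        rows = sum(1 for _ in reader)    # every remaining line is a data row
    return rows, len(header)
```

For example, calling `csv_shape` on a local copy of Colors.csv should give a pair comparable to the 135 rows and 4 columns listed here, provided the copy is the same version of the file.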
EXPLANATION ABOUT DATA SET

• The LEGO database is split across multiple datasets. Datasets that share a common
variable are merged so that the correlation between their variables can be found.

• The correlation can be positive, negative or zero.

• The dependent variable varies with changes in the independent variable.
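The merge-then-correlate idea can be sketched in plain Python. The rows below are made up for illustration and are not taken from the real files; the helper names `merge_on` and `pearson` are hypothetical, and the choice of `year` and `num_parts` as the two variables is only an assumption for the example. The coefficient is the Pearson correlation, which ranges from -1 (negative) through 0 (none) to +1 (positive).

```python
from math import sqrt

# Illustrative rows only; the real tables are far larger.
sets = [
    {"set_num": "A-1", "year": 1990, "num_parts": 50},
    {"set_num": "B-1", "year": 2000, "num_parts": 120},
    {"set_num": "C-1", "year": 2010, "num_parts": 200},
    {"set_num": "D-1", "year": 2015, "num_parts": 180},
]
inventories = [
    {"id": 1, "version": 1, "set_num": "A-1"},
    {"id": 2, "version": 1, "set_num": "B-1"},
    {"id": 3, "version": 1, "set_num": "C-1"},
    {"id": 4, "version": 2, "set_num": "D-1"},
    {"id": 5, "version": 1, "set_num": "Z-9"},  # no matching set: dropped by the join
]

def merge_on(left, right, key):
    """Inner join two lists of dicts on a shared column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**r, **l} for l in left for r in index.get(l[key], [])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)

merged = merge_on(inventories, sets, "set_num")
r = pearson([m["year"] for m in merged], [m["num_parts"] for m in merged])
print(f"rows after merge: {len(merged)}, correlation: {r:.4f}")
```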


RELATIONSHIP BETWEEN VARIABLES - SCHEMA MODEL
OBJECTIVE

This is a very rich dataset that offers a lot of room for exploration, especially since the
“sets” file includes the year in which each set was first released.
1. Find which colors are associated with which themes, and predict the values.
2. Understand the distribution of products over the years.
3. Find the correlation between selected measures.
4. Determine whether the colors of LEGO parts included in sets have changed over time.
5. Find the relationships between variables and how they affect the overall process.
6. Plot and analyze the variables.
EXPLORATION AND EXPLANATION
TABLE 1

This slide uses the colors data. Three categories (is_trans, rgb and name) are
measured by id, with a filter that shows only the top 15 colors.
The color shown in the largest font is called no name, with id 9999.
TABLE 2

Using the colors data with the same categories and measure, this time with a
filter for the top 10 colors, so only the top 10 colors are shown.
TABLE 3

This table uses the rgb values, which are basically a code or id for each of the
given colors. For the no name color, the code is 05131D.
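A code such as 05131D can be read as three hexadecimal byte values for red, green and blue. The sketch below assumes the rgb column stores six hex digits without a leading `#`; the helper name `hex_to_rgb` is hypothetical.

```python
def hex_to_rgb(code):
    """Split a 6-digit hex color code into (red, green, blue) integers."""
    code = code.strip().lstrip("#")
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    # Each pair of hex digits is one channel in the range 0-255.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(hex_to_rgb("05131D"))  # (5, 19, 29)
```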
TABLE 4

Using multiple categories (color, name, set_num and theme) measured by id,
inventory_id, theme_id and year, we show the top 100 words by id. Color is
determined by year, size by id, and the words are arranged by the category theme.
TABLE 5

Using the categories color, name, set_num and theme, measured by id,
inventory_id, inventory_quantity, theme_id and year.
A filter is applied to the colors, and the visualization is limited to ids from 16 to 57.
TABLE 6

Plotting the distribution of year from 1950 to 2018.
The distribution peaks in the period 2014-2016.
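A distribution like this can be built by counting how many sets fall in each release year. The sketch below uses a short, made-up list of years rather than the real data, and prints a simple text histogram together with the peak year.

```python
from collections import Counter

# Illustrative release years only; the real column has thousands of entries.
years = [1950, 1999, 2014, 2014, 2015, 2015, 2015, 2016, 2016, 2018]

sets_per_year = Counter(years)
peak_year, peak_count = sets_per_year.most_common(1)[0]

# One line per year, with one '#' per set released that year.
for year in sorted(sets_per_year):
    print(year, "#" * sets_per_year[year])
print("peak year:", peak_year, "with", peak_count, "sets")
```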
TABLE 7

Measuring the version of LEGO against the id number.
The correlation is about 0.4322, so the relationship is moderate.
TABLE 8

Comparing box-and-whisker plots for the two spare part groups, named f and t respectively.
TABLE 9
Comparing the quantity for the two spare part groups, f and t:
Quantity for f: 1893129
Quantity for t: 36049
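Totals of this kind come from summing the quantity column separately for each spare-part flag. The sketch below uses made-up rows, and it assumes the flag is stored in a column named `is_spare` with the values f and t; both the rows and the column name are illustrative assumptions.

```python
from collections import defaultdict

# Illustrative rows only; quantities are not taken from the real file.
rows = [
    {"part_num": "p1", "quantity": "4", "is_spare": "f"},
    {"part_num": "p2", "quantity": "2", "is_spare": "f"},
    {"part_num": "p3", "quantity": "1", "is_spare": "t"},
    {"part_num": "p4", "quantity": "10", "is_spare": "f"},
    {"part_num": "p5", "quantity": "1", "is_spare": "t"},
]

totals = defaultdict(int)
for row in rows:
    # CSV readers return strings, so convert the quantity before adding.
    totals[row["is_spare"]] += int(row["quantity"])

for flag in sorted(totals):
    print(f"Quantity for {flag}: {totals[flag]}")
```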
