TeAM-OOPS OPS
TEAM MEMBERS
AISHWARYA SINGH-01
PRANOY MUKHERJEE-35
PRIYANKA GAIKWAD-38
TARUNI RAJAN-56
LEGO DATABASE
• LEGO is a popular brand of toy building bricks. They are often sold in sets
designed to build a specific object. Each set contains a number of parts in
different shapes, sizes and colors.
• This database contains information on which parts are included in different
LEGO sets. It was originally compiled to help people who owned some LEGO
sets already figure out what other sets they could build with the pieces they
had.
DATASETS USED IN LEGO DATABASE
• Multiple datasets from the LEGO database were used. They were merged on the variables
they have in common so that the correlations between them can be examined.
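As a rough illustration of merging on a common variable, the sketch below joins a few set rows to theme rows through a shared theme id. The sample rows, the separate themes table, and the helper name `merge_on_key` are assumptions made for this example; only column names such as `set_num`, `year` and `theme_id` come from the slides.

```python
# Hypothetical sample rows; the real files are much larger.
sets = [
    {"set_num": "A-1", "name": "Sample Castle", "year": 1990, "theme_id": 1},
    {"set_num": "B-1", "name": "Sample Rocket", "year": 2015, "theme_id": 2},
    {"set_num": "C-1", "name": "Sample Orphan", "year": 2016, "theme_id": 9},
]
themes = [
    {"id": 1, "name": "Castle"},
    {"id": 2, "name": "Space"},
]


def merge_on_key(left, right, left_key, right_key, prefix):
    """Inner-join two lists of dicts where left[left_key] == right[right_key]."""
    index = {row[right_key]: row for row in right}
    merged = []
    for row in left:
        match = index.get(row[left_key])
        if match is None:
            continue  # inner join: rows without a partner are dropped
        combined = dict(row)
        for column, value in match.items():
            # Prefix the right-hand columns to avoid name clashes (e.g. "name").
            combined[f"{prefix}_{column}"] = value
        merged.append(combined)
    return merged


sets_with_themes = merge_on_key(sets, themes, "theme_id", "id", "theme")
for row in sets_with_themes:
    print(row["set_num"], row["year"], row["theme_name"])
```

Once the rows share one table, measures such as year and theme can be compared directly.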
This is a very rich dataset that offers lots of room for exploration, especially since the
“sets” file includes the year in which each set was first released.
1. Which colours are associated with which themes, and can those values be predicted?
2. How are products distributed over the years?
3. What is the correlation between selected measures?
4. Have the colors of LEGOs included in sets changed over time?
5. What relationships exist between the variables, and how do they affect the overall process?
6. Plot and analyze various variables.
EXPLORATION AND EXPLANATION
TABLE 1
Using the color data with the same category and measure, this time we apply a
top-10 filter on color, so only the top 10 colors are shown.
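A top-N filter of this kind can be sketched as follows. The sample rows and the function name `top_colors` are hypothetical, and the slides do not say which measure the colors are ranked by; total quantity is assumed here.

```python
from collections import Counter

# Hypothetical (color name, quantity) rows standing in for the real part data.
parts = [("Black", 5), ("White", 3), ("Red", 4), ("Black", 2), ("Blue", 1)]


def top_colors(rows, n=10):
    """Return the n colors with the largest total quantity, largest first."""
    totals = Counter()
    for color, quantity in rows:
        totals[color] += quantity
    return totals.most_common(n)


# n=2 keeps the example small; the slide uses the top 10.
top = top_colors(parts, n=2)
print(top)
```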
TABLE 3
In this table we use the rgb values, which are basically a code or id for
each of the given colors. For example, for the color with no name the code
is 05131D.
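To show how such a code maps to a color, the sketch below splits a six-digit hexadecimal rgb code into its red, green and blue components. The function name `hex_to_rgb` is chosen for this example only.

```python
def hex_to_rgb(code):
    """Convert a six-digit hex color code such as '05131D' to an (r, g, b) tuple."""
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    # Each pair of hex digits is one channel in the range 0-255.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("05131D"))  # (5, 19, 29)
```

All three channels are low here, so the code describes a very dark color.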
TABLE 4
Using multiple categories (color, name, set_num and theme) measured by
id, inventory_id, theme_id and year, we show the top 100 words by id.
Color is measured by year, size by id, and the words are arranged by the
theme category.
TABLE 5
Using the categories colour, name, set_num and theme, measured by
id, inventory_id, inventory_quantity, theme_id and year. A filter is
applied on colours, and the visualization is limited to ids from 16 to 57.
TABLE 6
Plotting the distribution of year from 1950 to 2018.
The distribution peaks in the period 2014-2016.
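A year distribution like this can be tabulated by counting how many sets fall in each release year. The list of years below is made-up sample data, so its peak is illustrative and is not the result reported on the slide.

```python
from collections import Counter

# Hypothetical release years; the real "sets" file covers 1950 to 2018.
release_years = [1950, 1999, 2014, 2014, 2015, 2015, 2015, 2016, 2016, 2018]

sets_per_year = Counter(release_years)
peak_year, peak_count = sets_per_year.most_common(1)[0]

# Print a simple text histogram, one row per year.
for year in sorted(sets_per_year):
    print(year, "#" * sets_per_year[year])
print("peak:", peak_year, "with", peak_count, "sets")
```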
TABLE 7
Measuring the LEGO version against the id number.
The correlation is about 0.4322, so the relationship is moderate.
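For reference, a correlation of this kind can be computed as a Pearson coefficient, sketched below. The slides do not state which coefficient the tool used, and the cut-offs in `strength` (below 0.3 weak, below 0.7 moderate) are a common rule of thumb assumed for this example, not a fixed standard.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def strength(r):
    """Label a coefficient using assumed rule-of-thumb thresholds."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    return "strong"


print(strength(0.4322))  # moderate
```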
TABLE 8