
Project: Game Analysis [Pokemon] & Movie Data Analysis

Submitted by:

Debashree Debalaxmi

3rd Year (6th Semester) B.E. in Computer Science & Engineering

Scholar’s Institute of Technology & Management

Gauhati University, Assam

MTA Centre: Kolkata


Date:-15/07/2018
Ref No:-MTA/KOL/55069
TO WHOMSOEVER IT MAY CONCERN

This is to certify
that both the projects entitled “Game Analysis [Pokemon] &
Movie Data Analysis” in “Machine Learning with Python” technology are original
work carried out by Debashree Debalaxmi from 15th June 2018 to 10th July 2018.

The matter embodied in this project is a genuine work done by the student and
can be submitted to the university or to any other university/institute for the
fulfillment of the requirement of any course of study.

For Morling Global Pvt. Ltd


Mr.Binit Kumar

H-86,1st Floor, Sector-63, Noida(U.P)


Tel: 0120 454 3840 , Mob: 8272813860 Email: info@morlingglobal.in Website: www.morlingglobal.in | www.mtaind.com
Acknowledgement

It is my immense pleasure to express my indebtedness to the various people involved in
the training opportunity given by Scholar’s Institute of Technology
and Management, and in particular to Prof. Puspakshi Sarmah, HOD, CSE,
who directly or indirectly contributed to the development of this training
report by influencing me through good thoughts, communal behaviour and
various acts during the course of the study.
Also, I would like to express my deepest gratitude to Mr. Binit
Kumar, Director, and Mr. Anand Pandey, Training Head of
Morling Global Pvt. Ltd. [MTA Delivery Partner], for their
constant co-operation. They were always there with competent
guidance and valuable suggestions throughout the pursuance of this
project.
Above all, no words can express my feelings towards my parents, who
supported me in this project by helping to collect the necessary information.

Debashree Debalaxmi
Debashree Debalaxmi

Software Development Fundamentals

August 20, 2018
vHxU-XVN9
This Certificate Accredits that "Debashree Debalaxmi" has successfully
completed project based summer training on
"Machine Learning with Python"
technology from 15th June 2018 to 10th July 2018.

Grade: A
Location: KOLKATA
MTA ID: 779900
TABLE OF CONTENTS

SERIAL NO CONTENTS

1 COMPANY PROFILE
2 INTRODUCTION
3 TECHNOLOGY USED
4 GAME ANALYSIS [POKEMON]
5 MOVIE DATA ANALYSIS
6 CONCLUSION
7 BIBLIOGRAPHY

Company Profile

Microsoft Corporation (abbreviated as MS) is an American multinational technology
company with headquarters in Redmond, Washington. It develops, manufactures, licenses,
supports and sells computer software, consumer electronics, personal computers, and related
services. Its best known software products are the Microsoft Windows line of operating
systems, the Microsoft Office suite, and the Internet Explorer and Edge web browsers. Its
flagship hardware products are the Xbox video game consoles and the Microsoft Surface line
up of touchscreen personal computers. As of 2016, it is the world's largest software maker by
revenue, and one of the world's most valuable companies. The word "Microsoft" is a
portmanteau of "microcomputer" and "software".

Microsoft Technology Associate or MTA exams are exams that provide professional
based certifications on Microsoft products and they provide the fundamentals for Databases
(MSSQL Server), Development (Visual Studio) and IT Infrastructure (Windows, Windows
Server). MTA exams and certification are offered as part of the Microsoft Certified
Professional (MCP) program. The Microsoft Technology Associate (MTA) certification is
an entry-level credential that validates fundamental technology skills and knowledge among
students and job-seekers who are pursuing a career in technology.

Morling Global is a partner of Microsoft Technology Associate, Hewlett Packard
Enterprise and Autodesk. It has established itself as a leading Training & Development Company
in India, with a very successful track record in academic training, corporate training,
internships, hardware design, solar panels, medical equipment, and web solutions and
services. Established in 2011, Morling Global is a fast-growing IT Training & Development
Company with an extensive list of awards and recognitions to its credit.

Introduction
Machine Learning
Machine learning is a field of computer science that uses statistical techniques to give
computer systems the ability to "learn" (e.g., progressively improve performance on a
specific task) with data, without being explicitly programmed.

The name machine learning was coined in 1959 by Arthur Samuel. Evolved from the study
of pattern recognition and computational learning theory in artificial intelligence, machine
learning explores the study and construction of algorithms that can learn from and make
predictions on data – such algorithms overcome following strictly static program instructions
by making data-driven predictions or decisions, through building a model from sample
inputs. Machine learning is employed in a range of computing tasks where designing and
programming explicit algorithms with good performance is difficult or infeasible; example
applications include email filtering, detection of network intruders, and computer vision.
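
The idea of "building a model from sample inputs" can be shown on a very small scale. The following is only a minimal sketch, assuming made-up sample data (hours studied against exam score) and a hypothetical helper named fit_line; it fits a straight line by ordinary least squares and then predicts a value that was not in the sample.

```python
# Hypothetical sample inputs (made-up numbers, not project data):
# hours studied -> exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums used by simple linear regression.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Learning": the parameters come from the data, not from hand-written rules.
slope, intercept = fit_line(hours, scores)


def predict(x):
    """Predict a score for an unseen number of hours."""
    return slope * x + intercept


print(slope, intercept, predict(6.0))
```

The program is never told the rule relating hours to scores; it recovers the rule from the examples, which is the sense of "learning" used in this section.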

Machine learning is closely related to (and often overlaps with) computational statistics,
which also focuses on prediction-making through the use of computers. It has strong ties to
mathematical optimization, which delivers methods, theory and application domains to the
field. Machine learning is sometimes conflated with data mining, where the latter subfield
focuses more on exploratory data analysis and is known as unsupervised learning.

Python
Python is an interpreted high-level programming language for general-purpose
programming. Created by Guido van Rossum and first released in 1991, Python has a design
philosophy that emphasizes code readability, notably using significant whitespace. It provides
constructs that enable clear programming on both small and large scales. In July 2018, Van
Rossum stepped down as the leader in the language community after 30 years.

Python features a dynamic type system and automatic memory management. It supports
multiple programming paradigms, including object-oriented, imperative, functional and
procedural, and has a large and comprehensive standard library.
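
As an illustration of the dynamic type system and the multiple paradigms mentioned above, the short sketch below computes the same sum of squares in three styles. The class name SquareSummer and all variable names are hypothetical and chosen only for this example.

```python
from functools import reduce

# Dynamic typing: the same name can be rebound to values of different types.
value = 42
value = "forty-two"

numbers = [1, 2, 3, 4]

# Imperative / procedural style: an explicit loop with a running total.
total_loop = 0
for n in numbers:
    total_loop += n * n

# Functional style: fold the list with reduce and a lambda.
total_functional = reduce(lambda acc, n: acc + n * n, numbers, 0)


# Object-oriented style: data and behaviour bundled in a class.
class SquareSummer:
    def __init__(self, items):
        self.items = list(items)

    def total(self):
        return sum(n * n for n in self.items)


total_oo = SquareSummer(numbers).total()

print(type(value).__name__, total_loop, total_functional, total_oo)
```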

Python interpreters are available for many operating systems. CPython, the reference
implementation of Python, is open source software and has a community-based development
model, as do nearly all of Python's other implementations. Python and CPython are managed
by the non-profit Python Software Foundation.

Python is particularly well suited to web development and scientific computing. With its
extensive libraries, it is also useful for data visualization and data analytics.

Technology Used

Anaconda (Python distribution)


Anaconda is a free and open source distribution of the Python and R programming languages
for data science and machine learning related applications (large-scale data processing, predictive
analytics, scientific computing), that aims to simplify package management and deployment. Package
versions are managed by the package management system conda. The Anaconda distribution is used
by over 6 million users, and it includes more than 250 popular data science packages suitable for
Windows, Linux, and MacOS.

Spyder
Spyder is a powerful scientific environment written in Python, for Python, and designed by
and for scientists, engineers and data analysts. It offers a unique combination of the advanced editing,
analysis, debugging, and profiling functionality of a comprehensive development tool with the data
exploration, interactive execution, deep inspection, and beautiful visualization capabilities of a
scientific package. The core building blocks of this IDE are the Editor, IPython Console, Variable
Explorer, Profiler, Debugger, etc.

Pandas
Pandas is an open-source Python library providing high-performance data manipulation and
analysis tools built on its powerful data structures. The name Pandas is derived from “panel data”,
an econometrics term for multidimensional data. Using Pandas, we can accomplish five typical steps
in the processing and analysis of data, regardless of the origin of the data: load, prepare, manipulate,
model, and analyze.
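
The five steps can be illustrated on a small scale. The sketch below is only an illustration, assuming a four-row inline CSV (values copied from the Pokemon dataset used later in this report) in place of a real file. The "model" step is reduced to a simple mean, which is a stand-in and not a real statistical model.

```python
import io

import pandas as pd

# Inline CSV standing in for a file on disk (assumption for this sketch).
csv_text = """Name,Type 1,HP,Attack
Bulbasaur,Grass,45,49
Charmander,Fire,39,52
Squirtle,Water,44,48
Pikachu,Electric,35,55
"""

# 1. Load: read the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# 2. Prepare: normalise column names (lower case, underscores).
df.columns = [name.strip().lower().replace(" ", "_") for name in df.columns]

# 3. Manipulate: derive a new column from existing ones.
df["hp_plus_attack"] = df["hp"] + df["attack"]

# 4. Model: a deliberately trivial summary, the mean attack value.
mean_attack = df["attack"].mean()

# 5. Analyze: find the row with the highest attack.
strongest = df.loc[df["attack"].idxmax(), "name"]

print(df)
print(mean_attack, strongest)
```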

Python with Pandas is used in a wide range of academic and commercial domains,
including finance, economics, statistics and analytics.

NumPy
NumPy is a Python package. It stands for 'Numerical Python'. It is a library consisting of
multidimensional array objects and a collection of routines for processing those arrays.

Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was
also developed, having some additional functionality. In 2005, Travis Oliphant created the NumPy
package by incorporating the features of Numarray into the Numeric package. There are many
contributors to this open source project. Using NumPy, a developer can perform mathematical and
logical operations on arrays, Fourier transforms, shape manipulation, and operations
related to linear algebra.
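
Each kind of operation listed above can be tried in a line or two. The following is a minimal sketch with made-up arrays; the variable names are illustrative only.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

# Mathematical operation: element-wise addition of two arrays.
elementwise_sum = a + b

# Logical operation: a boolean mask of the elements greater than 2.
mask = a > 2.0

# Shape manipulation: flatten the 2x2 array into a vector of length 4.
flat = a.reshape(4)

# Fourier transform: the FFT of a unit impulse is constant (all ones).
spectrum = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))

# Linear algebra: solve the system a @ x = rhs for x.
rhs = np.array([5.0, 11.0])
solution = np.linalg.solve(a, rhs)

print(elementwise_sum, mask, flat, spectrum, solution, sep="\n")
```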

Matplotlib
Matplotlib is a Python 2D plotting library which produces publication quality figures in a
variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in
Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four
graphical user interface toolkits. Matplotlib tries to make easy things easy and hard things possible.
We can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a
few lines of code.
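
As an example of "a few lines of code", the sketch below draws a histogram. It is only an illustration: the ten values are the HP figures of the first ten rows of the Pokemon dataset shown later in this report, the non-interactive "Agg" backend is chosen so that no window is needed, and the figure is written to an in-memory buffer (an assumption made for this sketch) instead of a file.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display window is required
import matplotlib.pyplot as plt

# HP values of the first ten dataset rows (Bulbasaur to Squirtle).
hp_values = [45, 60, 80, 80, 39, 58, 78, 78, 78, 44]

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(hp_values, bins=5)
ax.set_title("HP distribution (first ten rows)")
ax.set_xlabel("HP")
ax.set_ylabel("Number of Pokemon")
ax.grid(True)

# Save the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(counts, bin_edges)
```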

Game Analysis [Pokemon]


Pokemon Dataset (Number of Records: 800)


(saved as “Pokemon.csv”)
# Name Type 1 Type 2 Total HP Attack Defense Sp. Atk Sp. Def Speed
1 Bulbasaur Grass Poison 318 45 49 49 65 65 45
2 Ivysaur Grass Poison 405 60 62 63 80 80 60
3 Venusaur Grass Poison 525 80 82 83 100 100 80
3 VenusaurMega Venusaur Grass Poison 625 80 100 123 122 120 80
4 Charmander Fire 309 39 52 43 60 50 65
5 Charmeleon Fire 405 58 64 58 80 65 80
6 Charizard Fire Flying 534 78 84 78 109 85 100
6 CharizardMega Charizard X Fire Dragon 634 78 130 111 130 85 100
6 CharizardMega Charizard Y Fire Flying 634 78 104 78 159 115 100
7 Squirtle Water 314 44 48 65 50 64 43
8 Wartortle Water 405 59 63 80 65 80 58
9 Blastoise Water 530 79 83 100 85 105 78
9 BlastoiseMega Blastoise Water 630 79 103 120 135 115 78
10 Caterpie Bug 195 45 30 35 20 20 45
11 Metapod Bug 205 50 20 55 25 25 30
12 Butterfree Bug Flying 395 60 45 50 90 80 70
13 Weedle Bug Poison 195 40 35 30 20 20 50
14 Kakuna Bug Poison 205 45 25 50 25 25 35
15 Beedrill Bug Poison 395 65 90 40 45 80 75
15 BeedrillMega Beedrill Bug Poison 495 65 150 40 15 80 145
16 Pidgey Normal Flying 251 40 45 40 35 35 56
17 Pidgeotto Normal Flying 349 63 60 55 50 50 71
18 Pidgeot Normal Flying 479 83 80 75 70 70 101
18 PidgeotMega Pidgeot Normal Flying 579 83 80 80 135 80 121
19 Rattata Normal 253 30 56 35 25 35 72
20 Raticate Normal 413 55 81 60 50 70 97
21 Spearow Normal Flying 262 40 60 30 31 31 70
22 Fearow Normal Flying 442 65 90 65 61 61 100
23 Ekans Poison 288 35 60 44 40 54 55
24 Arbok Poison 438 60 85 69 65 79 80
25 Pikachu Electric 320 35 55 40 50 50 90
26 Raichu Electric 485 60 90 55 90 80 110
27 Sandshrew Ground 300 50 75 85 20 30 40
28 Sandslash Ground 450 75 100 110 45 55 65
29 Nidoran♀ Poison 275 55 47 52 40 40 41
30 Nidorina Poison 365 70 62 67 55 55 56
31 Nidoqueen Poison Ground 505 90 92 87 75 85 76
32 Nidoran♂ Poison 273 46 57 40 40 40 50
33 Nidorino Poison 365 61 72 57 55 55 65
34 Nidoking Poison Ground 505 81 102 77 85 75 85
35 Clefairy Fairy 323 70 45 48 60 65 35
36 Clefable Fairy 483 95 70 73 95 90 60
37 Vulpix Fire 299 38 41 40 50 65 65
38 Ninetales Fire 505 73 76 75 81 100 100
39 Jigglypuff Normal Fairy 270 115 45 20 45 25 20
40 Wigglytuff Normal Fairy 435 140 70 45 85 50 45
41 Zubat Poison Flying 245 40 45 35 30 40 55
42 Golbat Poison Flying 455 75 80 70 65 75 90
43 Oddish Grass Poison 320 45 50 55 75 65 30
44 Gloom Grass Poison 395 60 65 70 85 75 40
45 Vileplume Grass Poison 490 75 80 85 110 90 50
46 Paras Bug Grass 285 35 70 55 45 55 25
47 Parasect Bug Grass 405 60 95 80 60 80 30
48 Venonat Bug Poison 305 60 55 50 40 55 45
49 Venomoth Bug Poison 450 70 65 60 90 75 90
50 Diglett Ground 265 10 55 25 35 45 95
51 Dugtrio Ground 405 35 80 50 50 70 120
52 Meowth Normal 290 40 45 35 40 40 90
53 Persian Normal 440 65 70 60 65 65 115
54 Psyduck Water 320 50 52 48 65 50 55
55 Golduck Water 500 80 82 78 95 80 85
56 Mankey Fighting 305 40 80 35 35 45 70
57 Primeape Fighting 455 65 105 60 60 70 95
58 Growlithe Fire 350 55 70 45 70 50 60
59 Arcanine Fire 555 90 110 80 100 80 95
60 Poliwag Water 300 40 50 40 40 40 90
61 Poliwhirl Water 385 65 65 65 50 50 90
62 Poliwrath Water Fighting 510 90 95 95 70 90 70
63 Abra Psychic 310 25 20 15 105 55 90
64 Kadabra Psychic 400 40 35 30 120 70 105
65 Alakazam Psychic 500 55 50 45 135 95 120
65 AlakazamMega Alakazam Psychic 590 55 50 65 175 95 150
66 Machop Fighting 305 70 80 50 35 35 35
67 Machoke Fighting 405 80 100 70 50 60 45
68 Machamp Fighting 505 90 130 80 65 85 55
69 Bellsprout Grass Poison 300 50 75 35 70 30 40
70 Weepinbell Grass Poison 390 65 90 50 85 45 55
71 Victreebel Grass Poison 490 80 105 65 100 70 70
72 Tentacool Water Poison 335 40 40 35 50 100 70
73 Tentacruel Water Poison 515 80 70 65 80 120 100
74 Geodude Rock Ground 300 40 80 100 30 30 20
75 Graveler Rock Ground 390 55 95 115 45 45 35
76 Golem Rock Ground 495 80 120 130 55 65 45
77 Ponyta Fire 410 50 85 55 65 65 90
78 Rapidash Fire 500 65 100 70 80 80 105
79 Slowpoke Water Psychic 315 90 65 65 40 40 15
80 Slowbro Water Psychic 490 95 75 110 100 80 30
80 SlowbroMega Slowbro Water Psychic 590 95 75 180 130 80 30
81 Magnemite Electric Steel 325 25 35 70 95 55 45
82 Magneton Electric Steel 465 50 60 95 120 70 70
83 Farfetch'd Normal Flying 352 52 65 55 58 62 60
84 Doduo Normal Flying 310 35 85 45 35 35 75
85 Dodrio Normal Flying 460 60 110 70 60 60 100
86 Seel Water 325 65 45 55 45 70 45
87 Dewgong Water Ice 475 90 70 80 70 95 70
88 Grimer Poison 325 80 80 50 40 50 25
89 Muk Poison 500 105 105 75 65 100 50
90 Shellder Water 305 30 65 100 45 25 40
91 Cloyster Water Ice 525 50 95 180 85 45 70
92 Gastly Ghost Poison 310 30 35 30 100 35 80
93 Haunter Ghost Poison 405 45 50 45 115 55 95
94 Gengar Ghost Poison 500 60 65 60 130 75 110
94 GengarMega Gengar Ghost Poison 600 60 65 80 170 95 130
95 Onix Rock Ground 385 35 45 160 30 45 70
96 Drowzee Psychic 328 60 48 45 43 90 42
97 Hypno Psychic 483 85 73 70 73 115 67
98 Krabby Water 325 30 105 90 25 25 50
99 Kingler Water 475 55 130 115 50 50 75
100 Voltorb Electric 330 40 30 50 55 55 100
101 Electrode Electric 480 60 50 70 80 80 140
102 Exeggcute Grass Psychic 325 60 40 80 60 45 40
103 Exeggutor Grass Psychic 520 95 95 85 125 65 55
104 Cubone Ground 320 50 50 95 40 50 35
105 Marowak Ground 425 60 80 110 50 80 45
106 Hitmonlee Fighting 455 50 120 53 35 110 87
107 Hitmonchan Fighting 455 50 105 79 35 110 76
108 Lickitung Normal 385 90 55 75 60 75 30
109 Koffing Poison 340 40 65 95 60 45 35
110 Weezing Poison 490 65 90 120 85 70 60
111 Rhyhorn Ground Rock 345 80 85 95 30 30 25
112 Rhydon Ground Rock 485 105 130 120 45 45 40
113 Chansey Normal 450 250 5 5 35 105 50
114 Tangela Grass 435 65 55 115 100 40 60
115 Kangaskhan Normal 490 105 95 80 40 80 90
115 KangaskhanMega Kangaskhan Normal 590 105 125 100 60 100 100
116 Horsea Water 295 30 40 70 70 25 60
117 Seadra Water 440 55 65 95 95 45 85
118 Goldeen Water 320 45 67 60 35 50 63
119 Seaking Water 450 80 92 65 65 80 68
120 Staryu Water 340 30 45 55 70 55 85
-- -- -- -- -- -- -- -- -- -- --
628 Braviary Normal Flying 510 100 123 75 57 75 80
629 Vullaby Dark Flying 370 70 55 75 45 65 60
630 Mandibuzz Dark Flying 510 110 65 105 55 95 80
631 Heatmor Fire 484 85 97 66 105 66 65
632 Durant Bug Steel 484 58 109 112 48 48 109
633 Deino Dark Dragon 300 52 65 50 45 50 38
634 Zweilous Dark Dragon 420 72 85 70 65 70 58
635 Hydreigon Dark Dragon 600 92 105 90 125 90 98
636 Larvesta Bug Fire 360 55 85 55 50 55 60
637 Volcarona Bug Fire 550 85 60 65 135 105 100
638 Cobalion Steel Fighting 580 91 90 129 90 72 108
639 Terrakion Rock Fighting 580 91 129 90 72 90 108
640 Virizion Grass Fighting 580 91 90 72 90 129 108
641 TornadusIncarnate Forme Flying 580 79 115 70 125 80 111
641 TornadusTherian Forme Flying 580 79 100 80 110 90 121
642 ThundurusIncarnate Forme Electric Flying 580 79 115 70 125 80 111
642 ThundurusTherian Forme Electric Flying 580 79 105 70 145 80 101
643 Reshiram Dragon Fire 680 100 120 100 150 120 90
644 Zekrom Dragon Electric 680 100 150 120 120 100 90
645 LandorusIncarnate Forme Ground Flying 600 89 125 90 115 80 101
645 LandorusTherian Forme Ground Flying 600 89 145 90 105 80 91
646 Kyurem Dragon Ice 660 125 130 90 130 90 95
646 KyuremBlack Kyurem Dragon Ice 700 125 170 100 120 90 95
646 KyuremWhite Kyurem Dragon Ice 700 125 120 90 170 100 95
647 KeldeoOrdinary Forme Water Fighting 580 91 72 90 129 90 108
647 KeldeoResolute Forme Water Fighting 580 91 72 90 129 90 108
648 MeloettaAria Forme Normal Psychic 600 100 77 77 128 128 90
648 MeloettaPirouette Forme Normal Fighting 600 100 128 90 77 77 128
649 Genesect Bug Steel 600 71 120 95 120 95 99
650 Chespin Grass 313 56 61 65 48 45 38
651 Quilladin Grass 405 61 78 95 56 58 57
652 Chesnaught Grass Fighting 530 88 107 122 74 75 64
653 Fennekin Fire 307 40 45 40 62 60 60
654 Braixen Fire 409 59 59 58 90 70 73
655 Delphox Fire Psychic 534 75 69 72 114 100 104
656 Froakie Water 314 41 56 40 62 44 71
657 Frogadier Water 405 54 63 52 83 56 97
658 Greninja Water Dark 530 72 95 67 103 71 122
659 Bunnelby Normal 237 38 36 38 32 36 57
660 Diggersby Normal Ground 423 85 56 77 50 77 78
661 Fletchling Normal Flying 278 45 50 43 40 38 62
662 Fletchinder Fire Flying 382 62 73 55 56 52 84
663 Talonflame Fire Flying 499 78 81 71 74 69 126
664 Scatterbug Bug 200 38 35 40 27 25 35
665 Spewpa Bug 213 45 22 60 27 30 29
666 Vivillon Bug Flying 411 80 52 50 90 50 89
667 Litleo Fire Normal 369 62 50 58 73 54 72
668 Pyroar Fire Normal 507 86 68 72 109 66 106
669 Flabébé Fairy 303 44 38 39 61 79 42
670 Floette Fairy 371 54 45 47 75 98 52
671 Florges Fairy 552 78 65 68 112 154 75
672 Skiddo Grass 350 66 65 48 62 57 52
673 Gogoat Grass 531 123 100 62 97 81 68
674 Pancham Fighting 348 67 82 62 46 48 43
675 Pangoro Fighting Dark 495 95 124 78 69 71 58
676 Furfrou Normal 472 75 80 60 65 90 102
677 Espurr Psychic 355 62 48 54 63 60 68
678 MeowsticMale Psychic 466 74 48 76 83 81 104
678 MeowsticFemale Psychic 466 74 48 76 83 81 104
679 Honedge Steel Ghost 325 45 80 100 35 37 28
680 Doublade Steel Ghost 448 59 110 150 45 49 35
681 AegislashBlade Forme Steel Ghost 520 60 150 50 150 50 60
681 AegislashShield Forme Steel Ghost 520 60 50 150 50 150 60
682 Spritzee Fairy 341 78 52 60 63 65 23
683 Aromatisse Fairy 462 101 72 72 99 89 29
684 Swirlix Fairy 341 62 48 66 59 57 49
685 Slurpuff Fairy 480 82 80 86 85 75 72
686 Inkay Dark Psychic 288 53 54 53 37 46 45
687 Malamar Dark Psychic 482 86 92 88 68 75 73
688 Binacle Rock Water 306 42 52 67 39 56 50
689 Barbaracle Rock Water 500 72 105 115 54 86 68
690 Skrelp Poison Water 320 50 60 60 60 60 30
691 Dragalge Poison Dragon 494 65 75 90 97 123 44
692 Clauncher Water 330 50 53 62 58 63 44
693 Clawitzer Water 500 71 73 88 120 89 59
694 Helioptile Electric Normal 289 44 38 33 61 43 70
695 Heliolisk Electric Normal 481 62 55 52 109 94 109
696 Tyrunt Rock Dragon 362 58 89 77 45 45 48
697 Tyrantrum Rock Dragon 521 82 121 119 69 59 71
698 Amaura Rock Ice 362 77 59 50 67 63 46
699 Aurorus Rock Ice 521 123 77 72 99 92 58
700 Sylveon Fairy 525 95 65 65 110 130 60
701 Hawlucha Fighting Flying 500 78 92 75 74 63 118
702 Dedenne Electric Fairy 431 67 58 57 81 67 101
703 Carbink Rock Fairy 500 50 50 150 50 150 50
704 Goomy Dragon 300 45 50 35 55 75 40
705 Sliggoo Dragon 452 68 75 53 83 113 60
706 Goodra Dragon 600 90 100 70 110 150 80
707 Klefki Steel Fairy 470 57 80 91 80 87 75
708 Phantump Ghost Grass 309 43 70 48 50 60 38
709 Trevenant Ghost Grass 474 85 110 76 65 82 56
710 PumpkabooAverage Size Ghost Grass 335 49 66 70 44 55 51
710 PumpkabooSmall Size Ghost Grass 335 44 66 70 44 55 56
710 PumpkabooLarge Size Ghost Grass 335 54 66 70 44 55 46
710 PumpkabooSuper Size Ghost Grass 335 59 66 70 44 55 41
711 GourgeistAverage Size Ghost Grass 494 65 90 122 58 75 84
711 GourgeistSmall Size Ghost Grass 494 55 85 122 58 75 99
711 GourgeistLarge Size Ghost Grass 494 75 95 122 58 75 69
711 GourgeistSuper Size Ghost Grass 494 85 100 122 58 75 54
712 Bergmite Ice 304 55 69 85 32 35 28
713 Avalugg Ice 514 95 117 184 44 46 28
714 Noibat Flying Dragon 245 40 30 35 45 40 55
715 Noivern Flying Dragon 535 85 70 80 97 80 123
716 Xerneas Fairy 680 126 131 95 131 98 99
717 Yveltal Dark Flying 680 126 131 95 131 98 99
718 Zygarde50% Forme Dragon Ground 600 108 100 121 81 95 95
719 Diancie Rock Fairy 600 50 100 150 100 150 50
719 DiancieMega Diancie Rock Fairy 700 50 160 110 160 110 110
720 HoopaHoopa Confined Psychic Ghost 600 80 110 60 150 130 70
720 HoopaHoopa Unbound Psychic Dark 680 80 160 60 170 130 80
721 Volcanion Fire Water 600 80 110 120 130 90 70
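
In the rows shown above, the Total column appears to be the sum of the six stat columns (for Bulbasaur, 45 + 49 + 49 + 65 + 65 + 45 = 318). The following is a minimal sketch of how that could be checked with Pandas. It assumes a small hand-copied sample of four rows in place of the full file, and the column name "Computed Total" is introduced only for this example.

```python
import pandas as pd

stat_columns = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

# Four rows copied by hand from the table above (not the full dataset).
sample = pd.DataFrame(
    [
        ("Bulbasaur", 318, 45, 49, 49, 65, 65, 45),
        ("Charmander", 309, 39, 52, 43, 60, 50, 65),
        ("Squirtle", 314, 44, 48, 65, 50, 64, 43),
        ("Volcanion", 600, 80, 110, 120, 130, 90, 70),
    ],
    columns=["Name", "Total"] + stat_columns,
)

# Recompute the total from the six individual stats.
sample["Computed Total"] = sample[stat_columns].sum(axis=1)

# Rows where the recorded and recomputed totals disagree (none expected here).
mismatches = sample[sample["Total"] != sample["Computed Total"]]

print(sample[["Name", "Total", "Computed Total"]])
print("Mismatching rows:", len(mismatches))
```

Run against the full file, the same comparison would flag any row whose Total was mistyped.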

Python Script
(saved as “Pokemon.py”)

# -*- coding: utf-8 -*-


"""
Created on Tue Jul 3 10:40:38 2018

@author: Debashree_Debalaxmi
"""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.read_csv("C:\\Users\\acer\\Downloads\\Pokemon.csv")
print(df)
print(df.shape) #to get the dimensions
print(df.head(2)) #to get the top 2 rows
print(df[4:9]) #to get the rows from index 4 to 8
print(df[36:45]) #to get the rows from index 36 to 44
print(df[101:109]) #to get the rows from index 101 to 108
print(df.columns) #to get the column names
print(df.Name) #to get the values of the column "Name"
print(df.HP) #to get the values of the column "HP"
print(df.head(5)) #to get the top 5 rows
print(df.tail(5)) #to get the last 5 rows
print(df.tail(1)) #to get the last row
plt.plot(df["Type 1"])
plt.title("Graph")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()
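
The script plots the "Type 1" column directly, which draws a line through category labels in row order. A bar chart of how often each type occurs may be easier to read. The following is a minimal sketch of that alternative, not part of the submitted script: it assumes a ten-row stand-in for the "Type 1" column (the types of the first ten dataset rows), uses the off-screen "Agg" backend, and saves to an in-memory buffer instead of calling plt.show().

```python
import io

import matplotlib

matplotlib.use("Agg")  # off-screen rendering; replace with plt.show() interactively
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for df["Type 1"]: the primary types of the first ten dataset rows.
df = pd.DataFrame({"Type 1": ["Grass"] * 4 + ["Fire"] * 5 + ["Water"]})

# Count how many Pokemon have each primary type (sorted, most frequent first).
type_counts = df["Type 1"].value_counts()

fig, ax = plt.subplots()
ax.bar(list(type_counts.index), type_counts.values)
ax.set_title("Pokemon per primary type")
ax.set_xlabel("Type 1")
ax.set_ylabel("Count")
ax.grid(True, axis="y")

buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(type_counts)
```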

Output
(saved as “Pokemon.html”)
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

Type "copyright", "credits" or "license" for more information.

IPython 6.4.0 -- An enhanced Interactive Python.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: import matplotlib.pyplot as plt

In [4]: df=pd.read_csv("C:\\Users\\acer\\Downloads\\Pokemon.csv")

In [5]: print(df)

# Name Type 1 ... Sp. Atk Sp. Def Speed

0 1 Bulbasaur Grass ... 65 65 45

1 2 Ivysaur Grass ... 80 80 60

2 3 Venusaur Grass ... 100 100 80

3 3 VenusaurMega Venusaur Grass ... 122 120 80

4 4 Charmander Fire ... 60 50 65

5 5 Charmeleon Fire ... 80 65 80

6 6 Charizard Fire ... 109 85 100

7 6 CharizardMega Charizard X Fire ... 130 85 100

8 6 CharizardMega Charizard Y Fire ... 159 115 100

9 7 Squirtle Water ... 50 64 43

10 8 Wartortle Water ... 65 80 58

11 9 Blastoise Water ... 85 105 78

12 9 BlastoiseMega Blastoise Water ... 135 115 78

13 10 Caterpie Bug ... 20 20 45

14 11 Metapod Bug ... 25 25 30

15 12 Butterfree Bug ... 90 80 70

16 13 Weedle Bug ... 20 20 50

17 14 Kakuna Bug ... 25 25 35

18 15 Beedrill Bug ... 45 80 75

19 15 BeedrillMega Beedrill Bug ... 15 80 145

20 16 Pidgey Normal ... 35 35 56


21 17 Pidgeotto Normal ... 50 50 71

22 18 Pidgeot Normal ... 70 70 101

23 18 PidgeotMega Pidgeot Normal ... 135 80 121

24 19 Rattata Normal ... 25 35 72

25 20 Raticate Normal ... 50 70 97

26 21 Spearow Normal ... 31 31 70

27 22 Fearow Normal ... 61 61 100

28 23 Ekans Poison ... 40 54 55

29 24 Arbok Poison ... 65 79 80

.. ... ... ... ... ... ... ...

781 710 PumpkabooSmall Size Ghost ... 44 55 56

782 710 PumpkabooLarge Size Ghost ... 44 55 46

783 710 PumpkabooSuper Size Ghost ... 44 55 41

784 711 GourgeistAverage Size Ghost ... 58 75 84

785 711 GourgeistSmall Size Ghost ... 58 75 99

786 711 GourgeistLarge Size Ghost ... 58 75 69

787 711 GourgeistSuper Size Ghost ... 58 75 54

788 712 Bergmite Ice ... 32 35 28

789 713 Avalugg Ice ... 44 46 28

790 714 Noibat Flying ... 45 40 55

791 715 Noivern Flying ... 97 80 123

792 716 Xerneas Fairy ... 131 98 99

793 717 Yveltal Dark ... 131 98 99

794 718 Zygarde50% Forme Dragon ... 81 95 95

795 719 Diancie Rock ... 100 150 50

796 719 DiancieMega Diancie Rock ... 160 110 110

797 720 HoopaHoopa Confined Psychic ... 150 130 70

798 720 HoopaHoopa Unbound Psychic ... 170 130 80

799 721 Volcanion Fire ... 130 90 70

[800 rows x 11 columns]
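
The "..." column in the printout above is Pandas abbreviating a wide table for display; the DataFrame itself still holds all 11 columns. The following is a minimal sketch of how the display limits can be changed temporarily with pd.option_context. It assumes a two-row stand-in for the dataset, and the specific limits (5 columns, width 1000) are arbitrary choices for this example.

```python
import pandas as pd

# Two rows copied from the dataset, with all 11 columns.
df = pd.DataFrame(
    {
        "#": [1, 2],
        "Name": ["Bulbasaur", "Ivysaur"],
        "Type 1": ["Grass", "Grass"],
        "Type 2": ["Poison", "Poison"],
        "Total": [318, 405],
        "HP": [45, 60],
        "Attack": [49, 62],
        "Defense": [49, 63],
        "Sp. Atk": [65, 80],
        "Sp. Def": [65, 80],
        "Speed": [45, 60],
    }
)

# With a low column limit, the middle columns are replaced by "...".
with pd.option_context("display.max_columns", 5, "display.width", 1000):
    truncated_text = repr(df)

# With no column limit and a wide line, every column is printed.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    full_text = repr(df)

print(truncated_text)
print(full_text)
```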

In [6]: print(df.shape)

(800, 11)

In [7]: print(df.head(2))

# Name Type 1 Type 2 ... Defense Sp. Atk Sp. Def Speed

0 1 Bulbasaur Grass Poison ... 49 65 65 45

1 2 Ivysaur Grass Poison ... 63 80 80 60


[2 rows x 11 columns]

In [8]: print(df[4:9])

# Name Type 1 ... Sp. Atk Sp. Def Speed

4 4 Charmander Fire ... 60 50 65

5 5 Charmeleon Fire ... 80 65 80

6 6 Charizard Fire ... 109 85 100

7 6 CharizardMega Charizard X Fire ... 130 85 100

8 6 CharizardMega Charizard Y Fire ... 159 115 100

[5 rows x 11 columns]

In [9]: print(df[36:45])

# Name Type 1 Type 2 ... Defense Sp. Atk Sp. Def Speed

36 31 Nidoqueen Poison Ground ... 87 75 85 76

37 32 Nidoran♂ Poison NaN ... 40 40 40 50

38 33 Nidorino Poison NaN ... 57 55 55 65

39 34 Nidoking Poison Ground ... 77 85 75 85

40 35 Clefairy Fairy NaN ... 48 60 65 35

41 36 Clefable Fairy NaN ... 73 95 90 60

42 37 Vulpix Fire NaN ... 40 50 65 65

43 38 Ninetales Fire NaN ... 75 81 100 100

44 39 Jigglypuff Normal Fairy ... 20 45 25 20

[9 rows x 11 columns]
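
The NaN entries under "Type 2" mark Pokemon with no secondary type; Pandas uses NaN for the empty cells of the CSV file. The following is a minimal sketch of how such gaps can be counted, selected and filled. It assumes a four-row stand-in taken from the output above, and the placeholder string "None" is an arbitrary choice for this example.

```python
import numpy as np
import pandas as pd

# Four rows taken from the output above; NaN marks a missing secondary type.
df = pd.DataFrame(
    {
        "Name": ["Nidoqueen", "Nidorino", "Clefairy", "Jigglypuff"],
        "Type 1": ["Poison", "Poison", "Fairy", "Normal"],
        "Type 2": ["Ground", np.nan, np.nan, "Fairy"],
    }
)

# Count the missing values in the column.
missing_count = int(df["Type 2"].isna().sum())

# Select only the single-type Pokemon.
single_type = df[df["Type 2"].isna()]

# Return a copy in which the gaps are filled with a placeholder label.
df_filled = df.fillna({"Type 2": "None"})

print(missing_count)
print(single_type)
print(df_filled)
```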

In [10]: print(df[101:109])

# Name Type 1 ... Sp. Atk Sp. Def Speed

101 94 Gengar Ghost ... 130 75 110

102 94 GengarMega Gengar Ghost ... 170 95 130

103 95 Onix Rock ... 30 45 70

104 96 Drowzee Psychic ... 43 90 42

105 97 Hypno Psychic ... 73 115 67

106 98 Krabby Water ... 25 25 50

107 99 Kingler Water ... 50 50 75

108 100 Voltorb Electric ... 55 55 100

[8 rows x 11 columns]

In [11]: print(df.columns)

Index(['#', 'Name', 'Type 1', 'Type 2', 'Total', 'HP', 'Attack', 'Defense',

'Sp. Atk', 'Sp. Def', 'Speed'],


dtype='object')

In [12]: print(df.Name)

0 Bulbasaur

1 Ivysaur

2 Venusaur

3 VenusaurMega Venusaur

4 Charmander

5 Charmeleon

6 Charizard

7 CharizardMega Charizard X

8 CharizardMega Charizard Y

9 Squirtle

10 Wartortle

11 Blastoise

12 BlastoiseMega Blastoise

13 Caterpie

14 Metapod

15 Butterfree

16 Weedle

17 Kakuna

18 Beedrill

19 BeedrillMega Beedrill

20 Pidgey

21 Pidgeotto

22 Pidgeot

23 PidgeotMega Pidgeot

24 Rattata

25 Raticate

26 Spearow

27 Fearow

28 Ekans

29 Arbok

-- --

770 Sylveon

771 Hawlucha

772 Dedenne

773 Carbink

774 Goomy

775 Sliggoo

776 Goodra

777 Klefki

778 Phantump

779 Trevenant

780 PumpkabooAverage Size

781 PumpkabooSmall Size

782 PumpkabooLarge Size

783 PumpkabooSuper Size

784 GourgeistAverage Size

785 GourgeistSmall Size

786 GourgeistLarge Size

787 GourgeistSuper Size

788 Bergmite

789 Avalugg

790 Noibat

791 Noivern

792 Xerneas

793 Yveltal

794 Zygarde50% Forme

795 Diancie

796 DiancieMega Diancie

797 HoopaHoopa Confined

798 HoopaHoopa Unbound

799 Volcanion

Name: Name, Length: 800, dtype: object

In [13]: print(df.HP)

0 45

1 60

2 80

3 80

4 39

5 58

6 78

7 78

8 78

9 44

10 59

11 79

12 79

13 45

14 50

15 60

16 40

17 45

18 65

19 65

20 40

21 63

22 83

23 83

24 30

25 55

26 40

27 65

28 35

29 60

- --

770 95

771 78

772 67

773 50

774 45

775 68

776 90

777 57

778 43

779 85

780 49

781 44

782 54

783 59

784 65

785 55

786 75

787 85

788 55

789 95

790 40

791 85

792 126

793 126

794 108

795 50

796 50

797 80

798 80

799 80

Name: HP, Length: 800, dtype: int64

In [14]: print(df.head(5)) #to get the top 5 rows

# Name Type 1 ... Sp. Atk Sp. Def Speed

0 1 Bulbasaur Grass ... 65 65 45

1 2 Ivysaur Grass ... 80 80 60

2 3 Venusaur Grass ... 100 100 80

3 3 VenusaurMega Venusaur Grass ... 122 120 80

4 4 Charmander Fire ... 60 50 65

[5 rows x 11 columns]

In [15]: print(df.tail(5)) #to get the last 5 rows

# Name Type 1 ... Sp. Atk Sp. Def Speed

795 719 Diancie Rock ... 100 150 50

796 719 DiancieMega Diancie Rock ... 160 110 110

797 720 HoopaHoopa Confined Psychic ... 150 130 70

798 720 HoopaHoopa Unbound Psychic ... 170 130 80

799 721 Volcanion Fire ... 130 90 70

[5 rows x 11 columns]

In [16]: print(df.tail(1)) #to get the last rows


# Name Type 1 Type 2 ... Defense Sp. Atk Sp. Def Speed

799 721 Volcanion Fire Water ... 120 130 90 70

[1 rows x 11 columns]

In [17]: plt.plot(df["Type 1"])

plt.title("Graph")

plt.xlabel("X-axis")

plt.ylabel("Y-axis")

plt.grid(True)

plt.show()

Movie Data Analysis

SERIAL NO CONTENTS

1 Movie Dataset
2 Python Script
3 Output

Movie Dataset (Number of Records: 49,590)


(saved as “movies_data.xlsx”)
1 The Nightmare Before Christmas 1993 3.9 4568
2 The Mummy 1932 3.5 4388
3 Orphans of the Storm 1921 3.2 9062
4 The Object of Beauty 1991 2.8 6150
5 Night Tide 1963 2.8 5126
6 One Magic Christmas 1985 3.8 5333
7 Muriel's Wedding 1994 3.5 6323
8 Mother's Boys 1994 3.4 5733
9 Nosferatu: Original Version 1929 3.5 5651
10 Nick of Time 1995 3.4 5333
11 Broken Blossoms 1919 3.3 5367
12 Big Night 1996 3.6 6561
13 The Birth of a Nation 1915 2.9 12118
14 The Boys from Brazil 1978 3.6 7417
15 Big Doll House 1971 2.9 5696
16 The Breakfast Club 1985 4 5823
17 The Bride of Frankenstein 1935 3.7 4485
18 Beautiful Girls 1996 3.5 6755
19 Bustin' Loose 1981 3.7 5598
20 The Beguiled 1971 3.4 6307
21 Born on the Fourth of July 1989 3.4 8646
22 Broadcast News 1987 3.4 7940
23 Swimming with Sharks 1994 3.3 5586
24 Beavis and Butt-head Do America 1996 3.4 4852
25 Brighton Beach Memoirs 1986 3.4 6564
26 The Best of Times 1986 3.4 6247
27 Brassed Off 1996 3.5 6040
28 Last Tango in Paris 1972 3.1 7732
29 Leprechaun 2 1994 3.2 5125
30 Incident at Oglala: The Leonard Peltier Story 1992 3.7 5487
31 Kalifornia 1993 3.4 7095
32 The Lady Vanishes 1938 3.7 5762
33 Jingle All the Way 1996 3.6 5371
34 Killing Zoe 1993 3.4 5773
35 King of Beggars 1992 3.6 6025
36 Into the Woods 1990 4 9077
37 Joe Kidd 1972 3.7 5252
38 In Too Deep 1999 3.9 5823
39 King Kong 1976 3.2 8044
40 Internal Affairs 1990 3.5 6885
41 Jesus Christ Superstar 1973 3.6 6388
42 In the Name of the Father 1993 3.9 7972
43 Easy Money 1987 2.8 5794
44 Do the Right Thing 1989 3.6 7186
45 Days of Heaven 1978 3.4 5628
46 Drop Zone 1994 3.4 6087
47 Escape from L.A. 1996 3.3 6039
48 Emma 1996 3.5 7260
49 Disco Godfather 1979 3 5869
50 The Eiger Sanction 1975 3.5 7726
51 Elvis '56 1987 3.8 3518
52 Double Dragon 1994 3.3 5781
53 Double Jeopardy 1999 3.7 6311
54 Death Becomes Her 1992 3.4 6207
55 The Doors 1991 3.6 8436
56 Evil Dead 2: Dead by Dawn 1987 3.6 5047
57 Eat My Dust! 1976 3 5320
58 Desperado 1995 3.9 6269
59 Darkman II: The Return of Durant 1994 2.8 5557
60 The Doom Generation 1995 2.9 4309
61 The Englishman Who Went Up a Hill but Came Down a Mountain 1995 3.4 5752
62 Dr. Jekyll and Mr. Hyde 1920 3.2 4679
63 48 Hrs. 1982 3.6 5794
64 The Baby 1973 2.9 5101
65 The Addams Family 1991 3.5 5976
66 Bad Lieutenant 1992 3.2 5774
67 The Andromeda Strain 1971 3.6 7861
68 Albino Alligator 1996 3.1 5644
69 101 Dalmatians 1996 3.7 6172
70 Another 48 Hrs. 1990 3.5 5723
71 All About Eve 1950 3.8 8300
72 The Aristocats 1970 3.9 4731
73 Barbarella 1968 3 5882
74 The Asphyx 1973 2.8 5193
75 Apocalypse Now 1979 4 8825
76 Leviathan 1989 3.2 5877
77 Marvin's Room 1996 3.5 5901
78 Love and a .45 1994 3.3 6125
79 The Mark of Zorro 1920 3.1 6433
80 Little Buddha 1993 3.4 8455
81 Like Water for Chocolate 1992 3.8 6320
82 The Mikado 1939 3.1 5448
83 Lionheart 1990 3.5 6301
84 Minnie and Moskowitz 1971 3.2 6928
85 The Hunt for Red October 1990 4 8106
86 The Hindenburg 1975 3.4 7546
87 I Spit on Your Grave 1978 3.1 6084
88 Hellbound: Hellraiser II 1988 3.4 5606
89 Highlander 3: The Final Dimension 1994 3.2 5943
90 Heart and Souls 1993 3.8 6196
91 House of Whipcord 1974 2.5 6144
92 The Great Outdoors 1988 3.7 5424
93 Heaven's Gate 1980 3 13143
94 House on Haunted Hill 1959 3.6 4491
95 The Golden Child 1986 3.5 5609
96 The Hunted 1995 3.4 6605
97 The Great Waldo Pepper 1975 3.5 6467
98 Godzilla: King of the Monsters 1956 3.5 4828
99 Highlander 2: Renegade Version 1991 3.1 6585
100 High Noon 1952 3.9 5087
101 The Hunchback of Notre Dame 1923 3.5 6750
102 Heathers 1989 3.7 6193
103 Godzilla's Revenge 1969 3.3 4165
104 Hard Target 1993 3.4 5771
105 Henry: Portrait of a Serial Killer 1986 3.2 4938
106 Godzilla vs. Mothra 1964 3.5 5285
107 The Grifters 1990 3.4 6620
108 An Ideal Husband 1999 3.5 5879
109 A Fish Called Wanda 1988 3.7 6477
110 Gentleman's Agreement 1947 3.7 7104
111 The Prophecy 1995 3.6 5864
112 Flirting with Disaster 1996 3.2 5565
113 Ghostbusters 1984 3.8 6308
114 Far and Away 1992 3.6 8396
115 Fargo 1996 3.8 5889
116 Georgia 1995 2.9 7050
117 Fascination 1979 2.7 4900
118 Fresh 1994 4 6829
119 The Frighteners 1996 3.6 6607
120 Go West 1925 3.5 4101
121 Ferris Bueller's Day Off 1986 3.9 6177
122 The Ghost and the Darkness 1996 3.7 6594
123 Four Rooms 1995 3.6 5872
124 The First Wives Club 1996 3.6 6139
125 Operation Condor 1991 3.6 5435
126 Gallipoli 1981 3.7 6691
127 Fallen Angels 1995 3.5 5939
128 Girl on a Motorcycle 1968 2.5 5260
129 The Falcon and the Snowman 1985 3.4 7891
130 Red Scorpion 1989 3.3 6313
131 The Pallbearer 1996 2.7 5882
132 Reefer Madness 1936 3.1 3953
133 Reservoir Dogs 1992 4 5963
134 Priest 1994 3.4 5862
135 Pocahontas 1995 3.8 4864
136 The Battleship Potemkin 1925 3.6 4153
137 Platoon 1986 4 7187
138 Re-Animator 1985 3.4 5146
139 Pulp Fiction 1994 4.1 9265
140 Reds 1981 3.5 11710
141 The Quest 1996 3.5 5690
142 Passion Fish 1992 3.6 8079
143 The Relic 1997 3.3 6580
144 The Pebble and the Penguin 1995 3.6 4442
145 The Piano 1993 3.6 7245
146 Restoration 1995 3.4 7057
147 Captured 1998 3.2 5709
148 Conan the Destroyer 1984 3.4 6085
149 Creator 1985 3.3 6454
150 Congo 1995 3.2 6492
151 Child's Play 2: Chucky's Back 1990 3.2 5054
152 Chitty Chitty Bang Bang 1968 3.6 8730
153 Caligula 1979 2.6 6108
154 Camilla 1994 3.6 5486
155 Cop and a Half 1993 3.1 5558
156 The Crow 1994 3.8 6101
157 The Crossing Guard 1995 3.3 6654
158 Clerks 1994 3.7 5505
159 The Castle 1997 3.5 5040
160 The Cat and the Canary 1927 3.1 4954
161 Class of Nuke 'Em High 1986 3 5113
162 The Crow: City of Angels 1996 3.4 5149
163 Coming to America 1988 3.6 7006
164 Clueless 1995 3.6 5833
165 Clockers 1995 3.4 7699
-- -- -- -- --
49546 Bo Burnham: what. 2013 4.1 3614
49547 Life With Boys: Season 1: A Perfect Life with Boys 2011 1368
49548 Life With Boys: Season 1: Wrestling with Boys 2011 1369
49549 Life With Boys: Season 1 2011 4.1
49550 El Fuente: 30 MP 2013 3.3 471
49551 El Fuente: 2997 MP 2013 3.2 471
49552 El Fuente: 25 MP 2013 3.3 465
49553 El Fuente: 24 MP 2013 3.5 484
49554 Max Steel 2013 4.1
49555 Saving Santa 2013 3.8 5038
49556 Lilyhammer: Season 1 (Recap) 2013 4.2 194
49557 Shinobi Girl 2012 2
49558 GLOW: The Story of the Gorgeous Ladies of Wrestling 2012 3.7 4593
49559 Mitt (Trailer) 2013 3 138
49560 My Hope America with Billy Graham: The Cross 2013 1706
49561 The Square (Trailer) 2014 3.6 154
49562 My Hope America with Billy Graham 2013 3.9
49563 My Hope America with Billy Graham: Defining Moments 2013 1791
49564 My Hope America with Billy Graham: Lose to Gain 2013 1400
49565 American Addict 2013 3.5 5377
49566 My Hope America with Billy Graham 2013 3.9
49567 El Fuente: 2997 MP10 2013 2.7 470
49568 El Fuente: 25 MP10 2013 2.9 464
49569 El Fuente: 24 MP10 2013 2.8 484
49570 El Fuente: 23976 MP10 2013 2.9 484
49571 The Short Game (Trailer) 2013 4.1 156
49572 El Fuente: 5994 MP10 2013 2.8 471
49573 El Fuente: 50 MP10 2013 2.9 464
49574 El Fuente: 30 MP10 2013 2.8 470
49575 Greg Fitzsimmons: Life on Stage 2013 3.3 3671
49576 Dave Foley: Relatively Well 2013 3.2 3446
49577 Barbie: Life in the Dreamhouse: Barbie Life in the Dreamhouse: Best of Family 2013 1390
49578 Barbie: Life in the Dreamhouse: Barbie Life in the Dreamhouse: Best of Friends 2013 1458
49579 Transformers Prime Beast Hunters: Predacons Rising 2013 4.2 3950
49580 Underground: The Julian Assange Story 2012 3.7 5665
49581 Curious George: A Very Monkey Christmas 2009 3.8 3438
49582 Mumfie's White Christmas 1996 2.4 1350
49583 Lady Gaga & The Muppets' Holiday Spectacular 2013 3.1 3496
49584 Sunset Strip 2012 3 5770
49585 Silver Bells 2013 3.5 5287
49586 Winter Wonderland 2013 2.8 1812
49587 Top Gear: Series 19: Africa Special 2013 6822
49588 Fireplace For Your Home: Crackling Fireplace with Music 2010 3610
49589 Kate Plus Ei8ht 2010 2.7
49590 Kate Plus Ei8ht: Season 1 2010 2.7

Python Script
(saved as “movies_data.py”)
# -*- coding: utf-8 -*-
"""
Created on Fri Jul 6 11:16:08 2018

@author: Debashree_Debalaxmi
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#default read: the first row of the sheet is used as the column header
df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx")
print(df)
print(df.shape)      #to know the dimensions (rows, columns)
print(df.head(4))    #displaying the first 4 rows
print(df[3:7])       #displaying rows from index 3 to 6
print(df.tail(5))    #displaying the last 5 rows
plt.plot(df[1])      #plotting the column labelled 1 (the id column in this read)

#treating cells that contain "not a value" as missing (NaN)
df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx",na_values=["not a value"])
print(df)
#forward fill: each NaN takes the value from the row above
#(newer pandas versions prefer df.ffill() for the same operation)
print(df.fillna(method='pad'))

#reading again with no header row and supplying the column names
df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx",header=None,
                 names=["id","movies name","release year","rating","downloads"])
print(df['downloads'].isnull())   #checking which 'downloads' values are null
grouped=df.groupby('release year')   #grouping by 'release year'
print(grouped.size())   #number of titles released in each year
print(grouped.get_group(1935))   #details of all titles released in 1935
print(grouped.get_group(1985)['movies name'])   #only the names of titles released in 1985

plt.plot(df["rating"])

print(df[0:5]['movies name'])   #first 5 movie names

print(df.loc[:,['movies name','rating']])   #only the 'movies name' and 'rating' columns
print(df.loc[:,['movies name','rating']][1:10])   #same two columns, rows at index 1 to 9

print(df['rating'].max())   #maximum value of the column 'rating'
print(df['rating'].min())   #minimum value of the column 'rating'

print(grouped.get_group(1994)['rating'])   #ratings of titles released in 1994

print(df.loc[df['rating'].idxmax()])   #row with the maximum rating
print(df.loc[df['rating'].idxmax()]['release year'])   #release year of that row

Output
(saved as “movies_data.html”)

Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

Type "copyright", "credits" or "license" for more information.

IPython 6.4.0 -- An enhanced Interactive Python.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: import matplotlib.pyplot as plt

In [4]: df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx")

In [5]: print(df)

1 ... 4568

0 2 ... 4388.0

1 3 ... 9062.0

2 4 ... 6150.0

3 5 ... 5126.0

4 6 ... 5333.0

5 7 ... 6323.0

6 8 ... 5733.0

7 9 ... 5651.0

8 10 ... 5333.0

9 11 ... 5367.0

10 12 ... 6561.0

11 13 ... 12118.0

12 14 ... 7417.0

13 15 ... 5696.0

14 16 ... 5823.0

15 17 ... 4485.0

16 18 ... 6755.0

17 19 ... 5598.0

... ... ...

49582 49584 ... 5770.0

49583 49585 ... 5287.0

49584 49586 ... 1812.0

49585 49587 ... 6822.0

49586 49588 ... 3610.0

49587 49589 ... NaN

49588 49590 ... NaN

[49589 rows x 5 columns]

In [6]: print(df.shape)

(49589, 5)

In [7]: print(df.head(4))

1 The Nightmare Before Christmas 1993 3.9 4568

0 2 The Mummy 1932 3.5 4388.0

1 3 Orphans of the Storm 1921 3.2 9062.0

2 4 The Object of Beauty 1991 2.8 6150.0

3 5 Night Tide 1963 2.8 5126.0
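
In the output above, the first record (The Nightmare Before Christmas) appears as the column labels instead of as row 0. This happens because `read_excel` uses the first row as a header by default, which is also why the shape is 49589 rows here but 49590 once `header=None` is passed later in the script. The sketch below reproduces the effect on a three-row sample. It uses `read_csv` on an in-memory string as a stand-in for the Excel file (an assumption made so the example runs without the spreadsheet); the `header` and `names` parameters behave the same way in both readers.

```python
import io
import pandas as pd

# Three-row stand-in for movies_data.xlsx; like the real sheet, it has no header row.
raw = io.StringIO(
    "1,The Nightmare Before Christmas,1993,3.9,4568\n"
    "2,The Mummy,1932,3.5,4388\n"
    "3,Orphans of the Storm,1921,3.2,9062\n"
)

# Default behaviour: the first line is consumed as column labels.
with_default = pd.read_csv(raw)

raw.seek(0)  # rewind the in-memory file before reading it again
names = ["id", "movies name", "release year", "rating", "downloads"]
# header=None keeps every line as data; names= supplies the labels.
with_names = pd.read_csv(raw, header=None, names=names)

print(with_default.shape)   # one row fewer than the file contains
print(with_names.shape)     # all three rows kept
print(with_names.head())
```

Reading with `header=None` from the start would avoid losing the first record in the earlier steps.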

In [8]: print(df[3:7])

1 The Nightmare Before Christmas 1993 3.9 4568

3 5 Night Tide 1963 2.8 5126.0

4 6 One Magic Christmas 1985 3.8 5333.0

5 7 Muriel's Wedding 1994 3.5 6323.0

6 8 Mother's Boys 1994 3.4 5733.0

In [9]: print(df.tail(5))

1 ... 4568

49584 49586 ... 1812.0

49585 49587 ... 6822.0


49586 49588 ... 3610.0

49587 49589 ... NaN

49588 49590 ... NaN

[5 rows x 5 columns]

In [10]: plt.plot(df[1])

Out[10]: [<matplotlib.lines.Line2D at 0x18c39fbc9e8>]

In [11]: df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx",na_values=["not a value"])

In [12]: print(df)

1 ... 4568

0 2 ... 4388.0

1 3 ... 9062.0

2 4 ... 6150.0

3 5 ... 5126.0

4 6 ... 5333.0

5 7 ... 6323.0

6 8 ... 5733.0

7 9 ... 5651.0

8 10 ... 5333.0

9 11 ... 5367.0

10 12 ... 6561.0

11 13 ... 12118.0

12 14 ... 7417.0

13 15 ... 5696.0

... ... ...

49579 49581 ... 3438.0

49580 49582 ... 1350.0

49581 49583 ... 3496.0

49582 49584 ... 5770.0

49583 49585 ... 5287.0

49584 49586 ... 1812.0

49585 49587 ... 6822.0

49586 49588 ... 3610.0

49587 49589 ... NaN

49588 49590 ... NaN

[49589 rows x 5 columns]

In [13]: print(df.fillna(method='pad'))

1 ... 4568

0 2 ... 4388.0

1 3 ... 9062.0

2 4 ... 6150.0

3 5 ... 5126.0

4 6 ... 5333.0

5 7 ... 6323.0

6 8 ... 5733.0

7 9 ... 5651.0

8 10 ... 5333.0

9 11 ... 5367.0

10 12 ... 6561.0

11 13 ... 12118.0

12 14 ... 7417.0

13 15 ... 5696.0

... ... ...

49583 49585 ... 5287.0

49584 49586 ... 1812.0

49585 49587 ... 6822.0

49586 49588 ... 3610.0

49587 49589 ... 3610.0

49588 49590 ... 3610.0

[49589 rows x 5 columns]
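
The last two rows above show what forward fill does: the missing download counts for ids 49589 and 49590 were replaced with 3610.0, the value from the row before them. A minimal sketch of the same step on those five values follows. It uses `Series.ffill()`, which performs the same operation as `fillna(method='pad')`.

```python
import numpy as np
import pandas as pd

# Download counts for ids 49586 to 49590 as listed in the report; NaN marks a missing value.
downloads = pd.Series(
    [1812.0, 6822.0, 3610.0, np.nan, np.nan],
    index=[49584, 49585, 49586, 49587, 49588],
)

# Forward fill: each NaN takes the most recent non-missing value above it.
filled = downloads.ffill()
print(filled)
```

Forward fill copies a neighbouring title's figure into the gap, so it is worth checking whether that is appropriate before computing statistics on the filled column.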

In [14]: df=pd.read_excel("C:\\Users\\acer\\Desktop\\movies_data.xlsx",header=None,names=["id","movies name","release year","rating","downloads"])

In [15]: print(df['downloads'].isnull())

0 False

1 False

2 False

3 False

4 False

5 False

6 False

7 False

8 False

9 False

10 False

11 False

12 False

13 False

14 False

--

49580 False

49581 False

49582 False

49583 False

49584 False

49585 False

49586 False

49587 False

49588 True

49589 True

Name: downloads, Length: 49590, dtype: bool
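
Scanning 49590 True/False values by eye is impractical, so the mask is usually summarised. The sketch below is one way to do that, using the ratings and download counts of the last five rows of the listing. The frame name `tail` and the two derived variables are illustrative names, not part of the original script.

```python
import numpy as np
import pandas as pd

# Last five rows of the listing (ids 49586 to 49590), with gaps entered as NaN.
tail = pd.DataFrame({
    "rating":    [2.8, np.nan, np.nan, 2.7, 2.7],
    "downloads": [1812.0, 6822.0, 3610.0, np.nan, np.nan],
})

# isnull() gives a True/False mask; summing it counts the True entries per column.
missing_per_column = tail.isnull().sum()
print(missing_per_column)

# Boolean indexing keeps only the rows where 'downloads' is missing.
rows_without_downloads = tail[tail["downloads"].isnull()]
print(rows_without_downloads)
```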

In [16]: grouped=df.groupby('release year')

In [17]: print(grouped.size())

release year

1913 3

1914 20

1915 1

1916 1

1918 1

1919 3

1920 6

1921 2

1922 2

1923 4

1924 5

1925 5

1926 2

1927 4

1928 2

1929 5

1930 5

1931 3

1932 4

--

2004 1381

2005 1937

2006 2416

2007 2892

2008 3358

2009 4451

2010 5107

2011 5511

2012 4339

2013 981

2014 1

Length: 101, dtype: int64
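
`groupby(...).size()` returns one count per distinct release year, sorted by year. The sketch below shows the same call on six titles taken from the listing, and then uses `idxmax()` on the counts to pick the year with the most titles. The variable names `films`, `counts` and `busiest_year` are illustrative.

```python
import pandas as pd

# Six titles and release years taken from the movie listing.
films = pd.DataFrame({
    "movies name": ["Pulp Fiction", "Clerks", "The Crow", "Fargo", "Emma", "Platoon"],
    "release year": [1994, 1994, 1994, 1996, 1996, 1986],
})

# Number of rows in each release-year group, indexed by year.
counts = films.groupby("release year").size()
print(counts)

# Index label (the year) of the largest count.
busiest_year = counts.idxmax()
print(busiest_year)
```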

In [18]: print(grouped.get_group(1935))

id ... downloads

16 17 ... 4485.0

544 545 ... 3754.0

7290 7291 ... 3891.0

13743 13744 ... 4380.0

13746 13747 ... 4974.0

13748 13749 ... 5538.0

19018 19019 ... 5573.0

44652 44653 ... 5751.0

44725 44726 ... 604.0

44726 44727 ... 468.0

44754 44755 ... 517.0

[11 rows x 5 columns]


In [19]: print(grouped.get_group(1985)['movies name'])

5 One Magic Christmas

15 The Breakfast Club

128 The Falcon and the Snowman

137 Re-Animator

148 Creator

196 The Toxic Avenger

270 The Official Story

442 Clue

538 Santa Claus: The Movie

580 The Stuff

655 Girls Just Want to Have Fun

674 Remo Williams: The Adventure Begins

758 Lust in the Dust

807 Barbarian Queen

976 Silver Bullet

1000 The Care Bears Movie

1027 Transylvania 6-5000

1029 The Boys Next Door

1088 Death of a Salesman

1321 Once Bitten

1517 Flesh + Blood

1695 Tuff Turf

1789 The Man with One Red Shoe

1865 28 Up

1877 Explorers

1925 Cheers: Season 4

1941 MacGyver: Season 1

2116 Creature

2235 Bleak House


2285 Miami Vice: Season 2

35676 Jem and the Holograms: Season 1: Culture Clash

35678 Jem and the Holograms: Season 1: Intrigue at t...

35679 Jem and the Holograms: Season 1: The Jem Jam: ...

35681 Jem and the Holograms: Season 1: Island of Dec...

35683 Jem and the Holograms: Season 1: Old Meets New

35686 Jem and the Holograms: Season 1: Hot Time in H...

35688 Jem and the Holograms: Season 1: The Princess ...

35690 Jem and the Holograms: Season 1: Broadway Magic

35692 Jem and the Holograms: Season 1: In Search of ...

35694 Jem and the Holograms: Season 1: The Music Awa...

35696 Jem and the Holograms: Season 1: The Rock Fash...

35698 Jem and the Holograms: Season 1: In Stitches

35700 Jem and the Holograms: Season 1: The Music Awa...

35701 Jem and the Holograms: Season 1: Last Resorts

35703 Jem and the Holograms: Season 1: Adventures in...

35705 Jem and the Holograms: Season 1: The World Hun...

35706 Jem and the Holograms: Season 1: Starbright: P...

35707 Jem and the Holograms: Season 1: Starbright: P...

35708 Jem and the Holograms: Season 1: Starbright: P...

35709 Jem and the Holograms: Season 1: Battle of the...

35710 Jem and the Holograms: Season 1: Frame Up

35712 Jem and the Holograms: Season 1: Kimber's Rebe...

35851 Jem and the Holograms: Season 1: Disaster

35852 Jem and the Holograms: Season 1: The Beginning

35853 Jem and the Holograms: Season 1

36050 Jem and the Holograms

36057 Miami Vice: Season 2: The Prodigal Son: Part 2

44637 Hot Target


46010 Airwolf: Season 3: Desperate Monday

46017 Airwolf: Season 3: Kingdom Come

Name: movies name, Length: 334, dtype: object

In [20]: plt.plot(df["rating"])

Out[20]: [<matplotlib.lines.Line2D at 0x18c39892e48>]

In [21]: print(df[0:5]['movies name'])

0 The Nightmare Before Christmas

1 The Mummy

2 Orphans of the Storm

3 The Object of Beauty

4 Night Tide

Name: movies name, dtype: object

In [22]: print(df.loc[:,['movies name','rating']])

movies name rating

0 The Nightmare Before Christmas 3.9

1 The Mummy 3.5

2 Orphans of the Storm 3.2

3 The Object of Beauty 2.8

4 Night Tide 2.8

5 One Magic Christmas 3.8

6 Muriel's Wedding 3.5


7 Mother's Boys 3.4

8 Nosferatu: Original Version 3.5

9 Nick of Time 3.4

10 Broken Blossoms 3.3

11 Big Night 3.6

12 The Birth of a Nation 2.9

13 The Boys from Brazil 3.6

... ...

49571 El Fuente: 5994 MP10 2.8

49572 El Fuente: 50 MP10 2.9

49573 El Fuente: 30 MP10 2.8

49574 Greg Fitzsimmons: Life on Stage 3.3

49575 Dave Foley: Relatively Well 3.2

49576 Barbie: Life in the Dreamhouse: Barbie Life in... NaN

49577 Barbie: Life in the Dreamhouse: Barbie Life in... NaN

49578 Transformers Prime Beast Hunters: Predacons Ri... 4.2

49579 Underground: The Julian Assange Story 3.7

49580 Curious George: A Very Monkey Christmas 3.8

49581 Mumfie's White Christmas 2.4

49582 Lady Gaga & The Muppets' Holiday Spectacular 3.1

49583 Sunset Strip 3.0

49584 Silver Bells 3.5

49585 Winter Wonderland 2.8

49586 Top Gear: Series 19: Africa Special NaN

49587 Fireplace For Your Home: Crackling Fireplace w... NaN

49588 Kate Plus Ei8ht 2.7

49589 Kate Plus Ei8ht: Season 1 2.7

[49590 rows x 2 columns]

In [23]: print(df.loc[:,['movies name','rating']][1:10])


movies name rating

1 The Mummy 3.5

2 Orphans of the Storm 3.2

3 The Object of Beauty 2.8

4 Night Tide 2.8

5 One Magic Christmas 3.8

6 Muriel's Wedding 3.5

7 Mother's Boys 3.4

8 Nosferatu: Original Version 3.5

9 Nick of Time 3.4
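
The slice `[1:10]` returned rows 1 to 9, not 1 to 10: plain bracket slicing on a DataFrame is positional and excludes the end. Label-based slicing with `.loc` includes the end label. A small sketch of the difference, using the ratings of the first five titles (the name `df_demo` is illustrative):

```python
import pandas as pd

# Ratings of the first five titles in the listing, with the default 0..4 index.
df_demo = pd.DataFrame({"rating": [3.9, 3.5, 3.2, 2.8, 2.8]})

# Bracket slicing is positional: start included, end excluded.
positional = df_demo[1:3]
print(positional)

# .loc slicing is by label: both start and end are included.
by_label = df_demo.loc[1:3]
print(by_label)
```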

In [24]: print(df['rating'].max())

4.5

In [25]: print(df['rating'].min())

1.4

In [26]: print(grouped.get_group(1994)['rating'])

6 3.5

7 3.4

22 3.3

28 3.2

45 3.4

51 3.3

58 2.8

77 3.3

88 3.2

117 4.0

133 3.4

138 4.1

153 3.6

155 3.8

157 3.7

186 3.7

--

41427 3.9

41428 NaN

41429 NaN

41430 NaN

41431 NaN

41478 NaN

42274 2.5

42317 3.9

Name: rating, Length: 517, dtype: float64

In [27]: print(df.loc[df['rating'].idxmax()])

id 6997

movies name Breaking Bad: Season 1

release year 2008

rating 4.5

downloads NaN

Name: 6996, dtype: object

In [28]: print(df.loc[df['rating'].idxmax()])

id 6997

movies name Breaking Bad: Season 1

release year 2008

rating 4.5

downloads NaN

Name: 6996, dtype: object

In [29]: print(df.loc[df['rating'].idxmax()]['release year'])

2008
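
`idxmax()` ignores missing ratings and returns a single index label, the first one that holds the maximum. If several titles share the top rating, only the first is reported. The output above does not show whether that is the case in this dataset, so the sketch below uses hypothetical ratings to illustrate the behaviour and one way to list every tied row.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings with a missing value and a tie for the maximum.
ratings = pd.Series([4.1, np.nan, 4.5, 4.5, 3.0], index=[10, 11, 12, 13, 14])

# idxmax() skips NaN and returns the first label holding the maximum.
top_label = ratings.idxmax()
print(top_label)

# Comparing against max() keeps every row that shares the top rating.
tied = ratings[ratings == ratings.max()]
print(tied)
```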

Conclusion

Here, we are able to find answers within large datasets by using
Python tools to import data, explore it, analyze it, learn from it and
visualize it. In the movie data analysis, for example, pandas loaded
49,590 records, flagged missing download counts, grouped titles by
release year, and located the highest-rated entry (Breaking Bad:
Season 1, rated 4.5).

By learning these skills, we can join a worldwide community that
builds machine learning tools, explores public datasets, and discusses
evidence-based findings. From this project, anyone can learn the basic
process of data science: an applied understanding of how to manipulate
and analyze uncurated datasets, basic statistical analysis and machine
learning methods, and how to visualize results effectively.
