T DAT 902 - Project

The document provides details for a project to predict movie genres based solely on movie posters. Students are asked to: 1. Predict genres for movies in a dataset of 80,000 movie posters and synopses using only poster images. Additional genres can be added or modified. 2. Create a document with visualizations and statistics summarizing the analysis. An interactive tool is optional. 3. Include the methodology, algorithms used, and feature importance for posters and synopses. Clustering and unsupervised analysis should also be included to extract feature importance. 4. Extract archetypal posters for each genre based on feature importance. The program should also be able to process new images and predict genres along

Original Description:

example project

Original Title

T-DAT-902_project

Uploaded by

Cedric Durayssex

T10 - Big Data

T-DAT-902

Nostradamovies
posters standardisation you say?

1.2.5
delivery method: Github
repository name: $CourseCode-$GroupName.git
language: Python or R is advised

• All of your source files must be included in your delivery, except unnecessary files (binaries, temp files, obj files, ...).

From an 80 000 row dataset containing movie posters, movie synopses and full IMDB website information, you are asked to make movie genre predictions, based on posters only.

You are free to modify the dataset genres, for instance by replacing "comedy horror, action" with just "comedy horror". You can also add your own genres, such as blockbuster, teen movie or Cannes Palme winner.
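A minimal sketch of such a genre remapping is given here. It assumes the raw labels are comma-separated strings, which the brief does not state; the `GENRE_REMAP` table and the function name are hypothetical.

```python
# Hypothetical remapping table: a set of raw genres -> the simplified label set.
GENRE_REMAP = {
    frozenset({"comedy horror", "action"}): ("comedy horror",),
}


def normalise_genres(raw_label, remap=GENRE_REMAP):
    """Split a comma-separated genre string and apply the remapping.

    Assumption: genres are stored as one comma-separated string per movie.
    Labels without an entry in `remap` are returned sorted and lower-cased.
    """
    genres = frozenset(
        part.strip().lower() for part in raw_label.split(",") if part.strip()
    )
    return remap.get(genres, tuple(sorted(genres)))


if __name__ == "__main__":
    print(normalise_genres("Comedy Horror, Action"))
    print(normalise_genres("Drama, Romance"))
```

Keeping the mapping in a single table makes the relabelling choices easy to document in the final report.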

This training set is not exhaustive (it does not contain Bollywood movies for instance); you are expected to complete it.

Neural networks, deep learning and any other algorithm are welcome, but do not spend time re-implementing them.
Use bullet-proof libraries instead of reinventing the wheel, and focus on the data.

A document summarising your visualizations and statistics is also required.

An interactive tool would be appreciated.

It must also include the methodology and algorithms used to make your extractions and predictions, and the feature importance for both posters and synopses (for instance, a black color for horror movies, or a large rounded title font for comedies).
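As an illustration of a poster feature such as "black color", the sketch below assigns each pixel to its nearest reference colour and returns the most frequent one. The palette, the function name and the pixel format (a list of RGB tuples) are assumptions made for the example; a real pipeline would read the pixels with an image library.

```python
# Hypothetical reference palette (RGB); extend it to suit your own analysis.
REFERENCE_COLOURS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "blue": (0, 0, 255),
}


def dominant_colour(pixels, palette=REFERENCE_COLOURS):
    """Return the palette colour that the largest number of pixels is closest to.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    Closeness is the squared Euclidean distance in RGB space.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = {name: 0 for name in palette}
    for pixel in pixels:
        nearest = min(
            palette,
            key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, palette[name])),
        )
        counts[nearest] += 1
    return max(counts, key=counts.get)


if __name__ == "__main__":
    sample = [(10, 10, 10)] * 3 + [(250, 250, 0)]
    print(dominant_colour(sample))
```

Distances in RGB space are only a rough proxy for perceived colour; a perceptual colour space may give more meaningful groupings.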

The relevance of your visualizations is of prime importance; display any data you consider meaningful.

Add clustering and unsupervised analysis to extract feature importance.
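To make the clustering step concrete, here is a small k-means (Lloyd's algorithm) sketch in plain Python. It is only illustrative: in practice a well-tested library implementation is the better choice. The two-dimensional feature vectors (for example darkness and face count, both scaled to 0..1) and the initial centroids are made up for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Run Lloyd's algorithm from the given initial centroids.

    Returns (assignment, centroids), where assignment[i] is the index of the
    centroid that was closest to points[i] in the last assignment step.
    """
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids


if __name__ == "__main__":
    # Made-up poster features: (darkness, scaled number of faces).
    features = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2)]
    labels, centres = kmeans(features, [(0.0, 1.0), (1.0, 0.0)])
    print(labels, centres)
```

Comparing the resulting clusters with the known genres is one way to see which extracted features actually separate them.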

SHAP values would be welcome.
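SHAP values are built on Shapley values. The sketch below computes exact Shapley values by brute force for a toy scoring function, to show what the numbers mean. It does not use the `shap` library, and the enumeration is only feasible for a handful of features. The feature names, the toy model and the all-zero baseline are assumptions made for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`, relative to `baseline`.

    `instance` and `baseline` are dicts with the same keys. A feature that is
    "absent" from a coalition takes its baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        result[name] = total
    return result


if __name__ == "__main__":
    # Toy "horror score": hypothetical weights, not learned from any data.
    def toy_score(f):
        return 2.0 * f["dark"] + 3.0 * f["red"]

    print(shapley_values(toy_score, {"dark": 1, "red": 1}, {"dark": 0, "red": 0}))
```

The values always sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes them convenient for explaining a single prediction.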

Last but not least, you must extract archetypal posters from your feature importance classifications.

You are expected to display the most typical poster from your database for each movie genre, based on the feature importance for that genre.

For example, you might get this kind of feature importance for “blockbusters”:

1. names in the top quarter 88%
2. central face 60%
3. large title 55%
4. title in the lowest half 44%
5. 1 to 3 faces 44%
6. 5 text lines in the bottom 37%
7. black color 34%
8. number in the title 33%
9. ...

In this example, your program should pick a poster containing as many of these elements as possible, by order of importance, for instance the adjacent poster.
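One possible reading of "as many elements as possible, by order of importance" is to score each poster by the summed importance of the features it contains and keep the best one. The sketch below does that with the blockbuster figures listed above; the scoring rule, the poster identifiers and their feature sets are assumptions made for illustration.

```python
# Feature importance for "blockbusters", copied from the example list.
BLOCKBUSTER_IMPORTANCE = {
    "names in the top quarter": 0.88,
    "central face": 0.60,
    "large title": 0.55,
    "title in the lowest half": 0.44,
    "1 to 3 faces": 0.44,
    "5 text lines in the bottom": 0.37,
    "black color": 0.34,
    "number in the title": 0.33,
}


def archetype_score(poster_features, importance):
    """Sum the importance of every listed feature present on the poster."""
    return sum(w for feature, w in importance.items() if feature in poster_features)


def most_typical_poster(posters, importance):
    """Return the id of the poster with the highest archetype score.

    `posters` maps a poster id to the set of features detected on it.
    """
    return max(posters, key=lambda pid: archetype_score(posters[pid], importance))


if __name__ == "__main__":
    # Hypothetical posters and detected features.
    posters = {
        "poster_a": {"central face", "large title"},
        "poster_b": {"names in the top quarter", "black color", "number in the title"},
    }
    print(most_typical_poster(posters, BLOCKBUSTER_IMPORTANCE))
```

A strict lexicographic comparison (most important feature first) is an equally defensible rule; whichever you choose should be stated in the methodology.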

Your final algorithm will be tested on recent movies. The program should contain a function able to process
a PNG or JPEG image and make the prediction along with the features extracted.

e.g. for this independent comedy drama named Little Miss Sunshine, your output could be:

~/T-DAT-902> python genre_prediction.py little_miss_sunshine.jpg
Genre predicted : Comedy Drama
Probability : 0.76
Features extracted ->
    Number of faces : 5
    Colorimetry : Yellow
    Similar poster : xxx
    ...
Feature importance for the genre predicted ->
    ...
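Since the prediction function must accept PNG or JPEG input, a cheap first step is to check the file signature before any image processing. The sketch below relies on the standard PNG and JPEG magic bytes; the function names are hypothetical and the actual decoding is left to an image library.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker plus next marker byte


def detect_image_format(header):
    """Return "png", "jpeg" or None from the first bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def read_poster_format(path):
    """Open `path` in binary mode and detect its format from the header."""
    with open(path, "rb") as handle:
        return detect_image_format(handle.read(len(PNG_SIGNATURE)))


if __name__ == "__main__":
    print(detect_image_format(b"\xff\xd8\xff\xe0" + b"\x00" * 4))
```

Checking the content rather than the file extension avoids failures on renamed files when the program is tested on new posters.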

Rather than giving you a single prediction, algorithms will give a probability for each possible genre. You can, for example, display the top 3 predictions with their associated probabilities.
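Displaying the top 3 predictions can be sketched as follows, assuming the model returns a mapping from genre to probability. Apart from the 0.76 for Comedy Drama taken from the sample output, the probabilities are made up for illustration, and the function names are hypothetical.

```python
def top_predictions(probabilities, k=3):
    """Return the k most probable (genre, probability) pairs, best first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def format_predictions(ranked):
    """Render ranked predictions as one "genre: probability" line each."""
    return "\n".join(f"{genre}: {probability:.2f}" for genre, probability in ranked)


if __name__ == "__main__":
    # Illustrative probabilities only.
    probabilities = {
        "Comedy Drama": 0.76,
        "Drama": 0.12,
        "Comedy": 0.07,
        "Horror": 0.05,
    }
    print(format_predictions(top_predictions(probabilities)))
```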
