Python Project: Social Media Analysis

CITS 1401 Computational Thinking with Python

Project 1, Semester 1, 2024

Submission deadline: 19th April 2024, 6:00 PM

Total Marks: 30 (Value: 15%)

Project description

You should construct a Python 3 program containing your solution to the following
problem and submit your program electronically on Moodle. The name of the file
containing your code should be your student ID e.g., 12345678.py. No other method of
submission is allowed. Please note that this is an individual project. Your program
will be automatically run on Moodle for some sample test cases provided in the project
sheet if you click the “check” link. However, this does not test all required criteria and
your submission will be manually tested thoroughly for grading purposes after the due
date. Remember you need to submit the program as a single file and copy-paste the
same program in the provided text box. You have only one attempt to submit, so do not
submit until you are satisfied with your attempt. All open submissions at the time of the
deadline will be automatically submitted. Once your attempt is submitted, there is no
way in the system to open/reverse/modify it.

You are expected to have read and understood the University's guidelines on academic
conduct. In accordance with this policy, you may discuss with other students the general
principles required to understand this project, but the work you submit must be the result
of your own effort. Plagiarism detection, and other systems for detecting potential
malpractice, will therefore be used. Besides, if what you submit is not your own work
then you will have learned little and will therefore, likely, fail the final exam.

You must submit your project before the deadline mentioned above. Following UWA
policy, a late penalty of 5% will be deducted for each day (i.e., each 24-hour period)
after the deadline that the assignment is late. No submissions will be accepted more
than 7 days after the deadline, except in approved special consideration cases.

Project Overview

In today's digital age, social media has become an integral part of our daily lives, shaping
how we connect, communicate, and consume information. With millions of users engaging
across various platforms, understanding user behaviours and trends on social media has
become increasingly important. The aim of this project is to analyse user demographics to
better understand social media usage.

The dataset for this project comprises several key columns, including age, gender, time
spent on social media (in hours), platform, interests, country, demographics, profession,
income, and debt status.

You are required to write a Python 3 program that will read a CSV file. After reading the
file, your program is required to complete the following tasks:

1) Find the list of student details (ID and income) for students from a specific
country who are in debt (i.e., have debt status True) and spend more than 7 hours
on any social media.
2) Find the list of unique countries for a specific age group.
3) Find the age statistics for a specific age group. The age statistics include average
time spent in hours, standard deviation of income, and which demography (e.g.
rural, sub_urban or urban) spent the lowest average time on social media for the
specific age group.
4) Find the platform that has the highest number of users and calculate the correlation
between the age and the income for that user base.
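
As a rough illustration of the file-reading step that all four tasks depend on, the sketch below splits CSV text into rows without using the csv module. It assumes plain comma-separated values with no quoted commas. The function name `parse_rows` and the sample headings (`ID`, `Age`, `Country`) are hypothetical; the real headings come from the first row of the provided file.

```python
def parse_rows(lines):
    """Turn an iterable of CSV lines into a list of dicts keyed by header.

    Assumes simple comma-separated text with no quoted fields.
    All text is stripped and lower-cased so that later comparisons
    are case-insensitive.
    """
    rows = []
    header = None
    for line in lines:
        cells = [cell.strip().lower() for cell in line.strip().split(",")]
        if cells == [""]:
            continue  # skip blank lines
        if header is None:
            header = cells  # first non-blank row holds the column headings
            continue
        rows.append(dict(zip(header, cells)))
    return rows


# Made-up sample data; a real call would pass an open file object, e.g.
# `with open(csvfile, "r") as handle: rows = parse_rows(handle)`.
sample = ["ID,Age,Country\n", "11,22,Australia\n", "\n", "126,24,Ireland\n"]
rows = parse_rows(sample)
```

Keying each row by its heading, rather than by a fixed position, is one way to avoid depending on a particular column layout.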

Requirements

1) You are not allowed to import any external or internal module in Python.
While using many of these modules, e.g., csv or math, is a perfectly sensible thing
to do in a production setting, it takes away much of the point of different aspects
of the project, which is about getting practice opening text files, processing text
file data, and using basic Python structures, in this case lists and loops.
2) Ensure your program does NOT call the input() function at any time. Calling the
input() function will cause your program to hang, waiting for input that the
automated testing system will not provide (in fact, what will happen is that if the
marking program detects the call(s), it will not test your code at all which may
result in zero grade).
3) Your program should also not call the print() function at any time, except in the
case of graceful termination (if needed). If your program has encountered an error
state and is exiting gracefully, then it needs to print an appropriate message and
return zero for numerical values (such as average time spent or standard
deviation), None for non-numeric output (such as demography), and an empty list
otherwise. At no point should you print the program’s outputs instead of (or in
addition to) returning them, or provide a printout of the program’s progress in
calculating such outputs.
4) Do not assume that the input file names will end in .csv. File name suffixes such
as .csv and .txt are not mandatory in systems other than Microsoft Windows. Do
not enforce within your program that the file name must end with .csv or any
other extension (or try to add an extension onto the provided CSV file argument);
doing so can easily lead to errors and lost marks.
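
One possible shape for the graceful-termination path in requirement 3 is sketched below. It is a skeleton only: it uses the three-argument `main` signature that the project specifies, but the analysis itself is omitted, and the choice of `[0, 0, None]` for the third output is an interpretation of the "zero for numbers, None for non-numeric output" rule rather than a confirmed expected value.

```python
def main(csvfile, age_group, country):
    """Skeleton only: shows the error path, not the analysis itself."""
    try:
        # The file name is used exactly as given; no extension is added.
        with open(csvfile, "r") as handle:
            lines = handle.read().splitlines()
    except OSError:
        # Graceful termination: one message, then "empty" outputs.
        print("Error: could not open the input file.")
        return [], [], [0, 0, None], 0
    # ... the real analysis of `lines` would go here (omitted) ...
    return [], [], [0, 0, None], 0
```

Note that the only print() call sits on the error path, and nothing calls input().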

Input

Your program must define the function main with the following syntax:

def main(csvfile, age_group, country):

The input arguments for this function are:

• csvfile: The name of the CSV file (as string) containing the record of the user
details. The first row of the CSV file will contain the headings of the columns. A
sample CSV file “SocialMedia.csv” is provided with the project sheet on LMS and
Moodle.
• age_group: A list parameter [lower,upper] containing the lower and upper bound
of age.
• country: A string parameter containing the name of a country.

Output

We expect 4 outputs in the order below.

i. OP1 = [list of student details]: A list containing list(s) of student details [ID and
income] for students from the country provided as input who are in debt and spend
more than 7 hours on any social media. The student details are sorted in ascending
order of student ID.
ii. OP2 = [list of unique countries]: a list of unique countries for users whose age falls
within the lower and upper bound of input age_group (inclusive). The list needs to
be sorted in alphabetically ascending order.
iii. OP3 = [average time spent, standard deviation of income, demography]: A list
containing a numeric value for the average time spent by users whose age falls within
the lower and upper bound of input age_group (inclusive), the standard deviation of
their income, and a string for the demography name that represents whether rural,
sub_urban or urban users falling within the age_group (inclusive) spent the lowest
average time on social media compared to other demographics. If two
demographics share the same lowest value, sort the demographics in alphabetical
order and return the first one.
iv. OP4 = Correlation value: A numeric value for the correlation between age and income
for the user base of the social media platform that has the highest number of users.
If multiple social media platforms share the same highest number of users,
sort them in alphabetical order and compute the correlation for the first one.
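
The alphabetical tie-break rules for OP3 and OP4 can be sketched as follows. The helper name `first_with_extreme` and the sample platform names and numbers are made up for illustration. The sketch relies on Python's built-in min() and max() returning the first of several equal candidates, so sorting the names beforehand makes the alphabetically first one win a tie.

```python
def first_with_extreme(scores, pick_highest):
    """Return the key with the highest (or lowest) value in `scores`.

    Ties go to the alphabetically first key. Returns None for an
    empty dict.
    """
    if not scores:
        return None
    names = sorted(scores)  # alphabetical order decides ties
    if pick_highest:
        return max(names, key=lambda name: scores[name])
    return min(names, key=lambda name: scores[name])


# Made-up counts and averages, only to show the tie-breaking.
platform_counts = {"youtube": 40, "facebook": 40, "instagram": 35}
average_time = {"urban": 3.0, "rural": 3.0, "sub_urban": 4.5}
top_platform = first_with_extreme(platform_counts, True)    # 'facebook'
lowest_demography = first_with_extreme(average_time, False)  # 'rural'
```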

All returned numeric outputs (both in lists and individual) must contain values rounded to
four decimal places (if required to be rounded off). Do not round the values during
calculations. Instead, round them only at the time when you save them into the final
output variables.
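
A small made-up example of why the rounding should happen only at the end: rounding each value before summing discards information that the final result still needs.

```python
values = [0.00004, 0.00004, 0.00004]

# Rounding only once, when storing the final output.
total_late = round(sum(values), 4)

# Rounding every value first throws away information too early.
total_early = sum(round(v, 4) for v in values)
```

Here `total_late` is 0.0001, while `total_early` collapses to 0.0 because each individual value already rounds to zero.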

Examples

Download SocialMedia.csv file from the folder of Project 1 on LMS or Moodle. An example
of how you can call your program from the Python shell and examine the results it returns,
is provided below:

>>> OP1, OP2, OP3, OP4 = main('SocialMedia.csv', [18,25], 'australia')

The returned output variables are:

>>> OP1

[['11', 4708.0], ['126', 5785.0], ['184', 9266.0]]

>>> OP2

['australia', 'bangladesh', 'ireland', 'new zealand', 'pakistan', 'yemen']

>>> OP3

[3.5556, 112446.1548, 'rural']

>>> OP4

0.4756

Assumptions

Your program can assume the following:

• Anything that is meant to be string (e.g., header) will be a string, and anything
that is meant to be numeric will be numeric in the CSV file.
• All string data in the CSV file is case-insensitive, which means “Australia” is the
same as “australia”. Your program needs to treat both as the same. Similarly, your
program needs to handle the string input parameter in the same way.
• The order of columns in each row will follow the order of the headings provided in
the first row. However, rows can be in random order except the first row containing
the headings.
• No data will be missing in the CSV file; however, values can be zero and must be
accounted for in mathematical calculations.
[In case any part of the calculation cannot be performed due to zero values or other
boundary conditions, do a graceful termination by printing an error message and
returning a zero value (for numbers), None (for strings) or an empty list, depending
on the expected outcome. Your program must not crash.]
• The number of different social media platforms and countries will vary, so do not
hardcode them.
• The main function will always be provided with valid input parameters.
• The necessary formulae are provided at the end of this document.
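
A minimal sketch of case-insensitive matching, applied to the OP1 filter. The dictionary keys (`id`, `country`, `debt`, `hours`, `income`), the text form of the debt flag, and the sample rows are all hypothetical. Sorting the IDs numerically is also an assumption, since the project sheet only says "ascending".

```python
def students_in_debt(rows, country):
    """Return [ID, income] pairs for heavy users in debt from `country`.

    `rows` is a list of dicts with the hypothetical keys 'id',
    'country', 'debt', 'hours' and 'income' (all values still text).
    """
    wanted = country.strip().lower()
    matches = []
    for row in rows:
        if (row["country"].strip().lower() == wanted
                and row["debt"].strip().lower() == "true"
                and float(row["hours"]) > 7):
            matches.append([row["id"], float(row["income"])])
    # Assumption: IDs are whole numbers stored as text, so sort numerically.
    matches.sort(key=lambda pair: int(pair[0]))
    return matches


# Made-up rows covering each filter condition.
rows = [
    {"id": "184", "country": "Australia", "debt": "True", "hours": "9", "income": "2000"},
    {"id": "11", "country": "AUSTRALIA", "debt": "true", "hours": "8", "income": "1000"},
    {"id": "50", "country": "australia", "debt": "False", "hours": "9", "income": "100"},
    {"id": "60", "country": "Ireland", "debt": "True", "hours": "9", "income": "200"},
    {"id": "70", "country": "australia", "debt": "True", "hours": "7", "income": "300"},
]
result = students_in_debt(rows, "australia")
```

Only IDs 11 and 184 pass all three conditions; ID 70 is excluded because exactly 7 hours is not "more than 7".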

Important grading instruction

Note that you have not been asked to write specific functions. The task has been left to
you. However, it is essential that your program defines the top-level function
main(csvfile, age_group, country), hereafter referred to as “main()” in the project
document to save space. (When main() is written, it still implies that it is defined
with its three input arguments.) The idea is that within main(), the
program calls the other functions. Of course, these functions may then call further
functions. This is important because when your code is tested on Moodle, the testing
program will call your main() function. So if you fail to define main(), the testing program
will not be able to test your code and your submission will be graded zero. Don’t forget
the submission guidelines provided at the start of this document.

Marking rubric

Your program will be marked out of 30 (later scaled to be out of 15% of the final mark).
22 out of 30 marks will be awarded automatically based on how well your program
completes a number of tests, reflecting normal use of the program, and also how the
program handles various states including, but not limited to, different numbers of rows in
the input file and/or any error or corner states/cases. You need to think creatively about
what your program may face. Your submission will be graded using data files other than the
provided data file. Therefore, you need to be creative to investigate corner or worst cases.
I have provided a few guidelines from the ACS Accreditation manual at the end of the project
sheet which will help you to understand the expectations.

8 out of 30 marks will be awarded on style (5/8) “the code is clear to read” and efficiency
(3/8) “your program is well constructed and runs efficiently”. For style, think about use of
proper comments, function docstrings, sensible variable names, your name and student
ID at the top of the program, etc. (Please watch the lectures where this is discussed.)

Style Rubric:

0 Gibberish, impossible to understand.

1-2 Style is really poor or fair.

3-4 Style is good or very good, with small lapses.

5 Excellent style, really easy to read and follow.

Your program will be traversing text files of various sizes (possibly including large csv
files), so you need to minimise the number of times your program looks at the same data
items.
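
One way to avoid revisiting the same data is to gather everything a later step needs during a single loop over the rows. The sketch below is an illustration under assumed names: the function `summarise_platforms` and the keys `platform`, `age` and `income` are hypothetical, and the sample rows are made up.

```python
def summarise_platforms(rows):
    """One pass over `rows`, gathering per-platform counts and value lists."""
    stats = {}
    for row in rows:
        entry = stats.setdefault(
            row["platform"], {"count": 0, "ages": [], "incomes": []}
        )
        entry["count"] += 1
        entry["ages"].append(float(row["age"]))
        entry["incomes"].append(float(row["income"]))
    return stats


# Made-up rows, already parsed into dicts of text values.
rows = [
    {"platform": "youtube", "age": "20", "income": "1000"},
    {"platform": "facebook", "age": "30", "income": "2000"},
    {"platform": "youtube", "age": "40", "income": "3000"},
]
stats = summarise_platforms(rows)
```

With the ages and incomes already grouped by platform, the correlation for the most-used platform can be computed without reading the file or scanning all rows again.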

Efficiency rubric:

0 Code too complicated to judge efficiency or wrong problem tackled.

1 Very poor efficiency, additional loops, inappropriate use of readline(), etc.

2 Acceptable or good efficiency with some lapses.

3 Excellent efficiency, should have no problem on large files, etc.

Automated testing is being used so that all submitted programs are being tested the same
way. Sometimes it happens that there is one mistake in the program that means that no
tests are passed. If the marker can spot the cause and fix it readily, then they are allowed
to do that and your - now fixed - program will score whatever it scores from the tests,
minus 4 marks per intervention, because other students will not have had the benefit of
marker intervention. Still, that's way better than getting zero. On the other hand, if the
bug is hard to fix, the marker needs to move on to other submissions.

Extract from Australian Computer Society Accreditation manual 2019:

As per Seoul Accord section D, a complex computing problem will normally have some or
all of the following criteria:

• involves wide-ranging or conflicting technical, computing, and other issues;

• has no obvious solution, and requires conceptual thinking and innovative analysis
to formulate suitable abstract models;
• a solution requires the use of in-depth computing or domain knowledge and an
analytical approach that is based on well-founded principles;
• involves infrequently encountered issues;
• is outside problems encompassed by standards and standard practice for
professional computing;
• involves diverse groups of stakeholders with widely varying needs;
• has significant consequences in a range of contexts;
• is a high-level problem possibly including many component parts or sub-problems;
• identification of a requirement or the cause of a problem is ill defined or unknown.

Necessary formulas

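i. Correlation coefficient:

The mathematical formula to calculate the correlation is as follows:

\[ r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}} \]

where $x_i$ and $y_i$ are the values of age and income of a specific user base respectively,
$\bar{x}$ is the average age, and $\bar{y}$ is the average income.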
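
A minimal sketch of the Pearson correlation coefficient, which is the standard formula matching the symbols described above (paired values and their two averages). The function name is hypothetical, and returning 0 when the denominator is zero is one reading of the graceful-termination rule; a full solution would also print an error message in that case.

```python
def correlation(xs, ys):
    """Pearson correlation of two equal-length lists; 0 if undefined."""
    n = len(xs)
    if n == 0 or n != len(ys):
        return 0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = 0.0
    var_x = 0.0
    var_y = 0.0
    for x, y in zip(xs, ys):
        dx = x - mean_x
        dy = y - mean_y
        cov += dx * dy
        var_x += dx * dx
        var_y += dy * dy
    denominator = (var_x * var_y) ** 0.5  # square root without the math module
    if denominator == 0:
        return 0  # e.g. all ages identical: the ratio is undefined
    return cov / denominator
```

For example, `correlation([1, 2, 3], [2, 4, 6])` gives 1.0, and `correlation([5, 5, 5], [1, 2, 3])` gives 0 because the ages do not vary.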

ii. Standard deviation:

where $x_1, x_2, x_3, \ldots, x_n$ are the observed values in the sample data, $\bar{x}$ is the
average value of the observations, and $N$ is the number of observations.
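
A hedged sketch of a standard deviation helper. The divisor used by the project's formula is not visible in this copy, so the sketch takes it as an explicit parameter: `ddof=0` divides by N and `ddof=1` divides by N - 1. Both the function name and the `ddof` parameter are assumptions introduced for illustration; check the original project sheet to see which form is required.

```python
def standard_deviation(values, ddof):
    """Standard deviation of `values`.

    ddof=0 divides by N (population form); ddof=1 divides by N - 1
    (sample form). Returns 0 when the divisor would not be positive.
    """
    n = len(values)
    if n - ddof <= 0:
        return 0
    mean = sum(values) / n
    squared = sum((v - mean) ** 2 for v in values)
    return (squared / (n - ddof)) ** 0.5


# Made-up data: the mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
population_sd = standard_deviation(data, 0)
sample_sd = standard_deviation(data, 1)
```

For this data the N form gives 2.0 and the N - 1 form gives about 2.1381, which shows how much the choice of divisor can change the reported value.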
