
Food Violations Report

Chin Wei Lian s5111405

2810ICT Software Technologies

October 4, 2019
Abstract

This report analyses the relationship between inspections of each property in the United States and their violation records. The data documented in this report comprises a list of businesses that have at least one violation record, the number of violations of each type by violation code, and the minimum, maximum and average violation points for every month from 7-2015 to 12-2017. Some serial numbers carry multiple violations, which can lead to a large deduction of points. The findings show that even a business rated grade A may still have violated some of the rules.

Food Violations Report | Chin Wei Lian s5111405 Page 1 of 8


Introduction

The purpose of this project was to process a large volume of food inspection and health violation data and produce a precise summary for each task. The raw data is stored in a database so that users can run SQLite queries to interact with it or perform analysis. For a user who knows SQL query syntax, this can be a time-saving way to get the data they want. Plotting the data with Matplotlib illustrates food violation trends over time.

Database Structure

For this assignment, I created three tables: “inspections”, “violations” and “previous_violations”. For the inspections table, I wrote a function that identifies each attribute's type (INT, DATE, VARCHAR or CHAR) based on its column title.

For the violations table, I used a similar function to assign types to the attributes.



The previous_violations table stores the processed data from the other two tables (inspections and violations).
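The type-assignment function described above for the inspections and violations tables could be sketched as follows. This is only an illustration: the keyword rules, the helper names `infer_sql_type` and `create_table`, and the column titles are assumptions, not the ones used in the project.

```python
import sqlite3

# Hypothetical keyword rules; the real column titles may differ.
INT_HINTS = ("score", "points")
CHAR_HINTS = ("grade",)


def infer_sql_type(column_title):
    """Guess a column type (INT, DATE, CHAR or VARCHAR) from its title."""
    title = column_title.lower()
    if "date" in title:
        return "DATE"
    if any(hint in title for hint in INT_HINTS):
        return "INT"
    if any(hint in title for hint in CHAR_HINTS):
        return "CHAR(1)"
    return "VARCHAR(255)"  # fallback for free-text columns


def create_table(conn, table_name, column_titles):
    """Create a table whose column types are inferred from the titles."""
    columns = ", ".join(f'"{t}" {infer_sql_type(t)}' for t in column_titles)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')


conn = sqlite3.connect(":memory:")  # in-memory database for the example
titles = ["activity_date", "serial_number", "facility_name", "score", "grade"]
create_table(conn, "inspections", titles)
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'inspections'"
).fetchone()[0]
print(schema)
```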



Violation counts

First, I connected the Python file to the database I had created earlier using sqlite3. Then I retrieved every serial_number value and used each retrieved serial number as a condition in a new SQL query. To count the number of violations of each type by violation code, I used a dictionary: the first time a violation code appears, it is stored as a key with a value of 1; if the code has appeared before, its value is incremented.
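The counting approach described above might look like the sketch below. The table layout, column names and sample rows are assumptions made for the example; only the dictionary name `count` and the use of sqlite3 come from the report.

```python
import sqlite3

# In-memory database with made-up rows, standing in for the real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE violations (serial_number VARCHAR, violation_code VARCHAR)"
)
conn.executemany(
    "INSERT INTO violations VALUES (?, ?)",
    [("S001", "F001"), ("S001", "F002"), ("S002", "F001")],
)

# Step 1: retrieve every serial number.
serial_numbers = [
    row[0]
    for row in conn.execute("SELECT DISTINCT serial_number FROM violations")
]

# Step 2: use each serial number as the condition of a second query
# and tally the violation codes in a dictionary.
count = {}
for serial in serial_numbers:
    rows = conn.execute(
        "SELECT violation_code FROM violations WHERE serial_number = ?",
        (serial,),
    )
    for (code,) in rows:
        if code not in count:
            count[code] = 1   # first time this code shows up
        else:
            count[code] += 1  # code seen before: increment

print(count)
```

With this sample data, the same totals could also be obtained from a single `SELECT violation_code, COUNT(*) ... GROUP BY violation_code` query, which avoids issuing one query per serial number.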

The resulting “count” dictionary is shown below (only the first 3000 records of the violations and inspections data were used to test the function):



Violations over time

To get the total points for a specific serial number, I ran an SQL query that returns the total violation points for each serial number and stored the results in a dictionary called “point”. Below are a few screenshots of the SQL query and its result.

The screenshots show that each serial number has a different point total.
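A query of this kind could be written as in the sketch below. The `points` column, the table layout and the sample rows are assumptions; only the dictionary name `point` is taken from the report.

```python
import sqlite3

# In-memory database with made-up rows, standing in for the real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE violations ("
    "serial_number VARCHAR, violation_code VARCHAR, points INT)"
)
conn.executemany(
    "INSERT INTO violations VALUES (?, ?, ?)",
    [("S001", "F001", 2), ("S001", "F002", 1), ("S002", "F001", 4)],
)

# Sum the violation points per serial number and keep them in a dictionary.
point = {}
query = (
    "SELECT serial_number, SUM(points) "
    "FROM violations GROUP BY serial_number"
)
for serial, total in conn.execute(query):
    point[serial] = total

print(point)
```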

Later, I ran another SQL query to get the serial numbers for each zip code at different times.

Result in https://sqliteonline.com/:



Result in the IPython console:



Then I created a nested dictionary with the date as the key and assigned each value according to the serial number.
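One possible reading of this step is sketched below: the outer key is the month, the inner key is the zip code, and the value is the sum of the points of the serial numbers inspected in that zip code. The row layout, the `monthly` name and the sample values are assumptions made for illustration.

```python
# Hypothetical rows as (serial_number, zip_code, activity_date).
inspections = [
    ("S001", "90001", "2015-07-03"),
    ("S002", "90002", "2015-07-15"),
    ("S003", "90001", "2015-08-01"),
]

# Total violation points per serial number (made-up values).
point = {"S001": 3, "S002": 4, "S003": 1}

# Outer key: month (YYYY-MM); inner key: zip code; value: summed points.
monthly = {}
for serial, zip_code, activity_date in inspections:
    month = activity_date[:7]  # "2015-07-03" -> "2015-07"
    zip_totals = monthly.setdefault(month, {})
    zip_totals[zip_code] = zip_totals.get(zip_code, 0) + point.get(serial, 0)

print(monthly)
```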

I used another nested dictionary (zip_code_point) to store an array of data for each month: index 0 holds the zip code with the highest violation points and index 1 holds that highest value; index 2 holds the zip code with the lowest violation points and index 3 holds that lowest value; finally, index 4 holds the average violation points for that month. I also deleted the test key that was used for the initial comparison.
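The five-element layout described above can be illustrated with the sketch below. The input dictionary `monthly` and its values are made up, and the built-in `max` and `min` are used here in place of the placeholder test key, so this is not necessarily how the project computed the values.

```python
# Hypothetical per-month point totals keyed by zip code.
monthly = {"2015-07": {"90001": 3, "90002": 4, "90003": 8}}

zip_code_point = {}
for month, zip_totals in monthly.items():
    max_zip = max(zip_totals, key=zip_totals.get)  # zip with most points
    min_zip = min(zip_totals, key=zip_totals.get)  # zip with fewest points
    average = sum(zip_totals.values()) / len(zip_totals)
    zip_code_point[month] = [
        max_zip,              # index 0: zip code with the highest points
        zip_totals[max_zip],  # index 1: the highest points
        min_zip,              # index 2: zip code with the lowest points
        zip_totals[min_zip],  # index 3: the lowest points
        average,              # index 4: average points for the month
    ]

print(zip_code_point)
```

Because `max` and `min` need no initial comparison value, this version has no placeholder key to delete afterwards.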



Because of a hardware problem, I was unable to import the full dataset from Excel into the database, so I used only the first 3000 records from each Excel file (inspections and violations). These records only cover a single month (7-2015).

