On
LITERACY RATE ANALYSIS
Submitted in partial fulfillment of the requirement of
Bachelors of Computer Applications (BCA)
Guru Gobind Singh Indraprastha University, Delhi
Session 2019-20
I hereby declare that this Major Project Report titled “Literacy Rate Analysis”, submitted by me to JEMTEC, Greater Noida, is a bonafide work undertaken by me during the period from 01-January-2020 to 25-April-2020 and has not been submitted to any other University or Institution for the award of any degree, diploma or certificate, or published at any time before.
_____________________
BONAFIDE CERTIFICATE
This is to certify that, to the best of my belief, the project entitled “Literacy Rate Analysis” is the bonafide research work carried out by AKASH AGGARWAL, student of BCA, JEMTEC, Greater Noida, in partial fulfilment of the requirement for the major project report of the Degree of Bachelor of Computer Applications.
ACKNOWLEDGEMENT
I offer my sincere thanks and humble regards to JEMTEC, Greater Noida for imparting very valuable professional training in BCA to us.
I express my gratitude and sincere regards to Dr. Ruchi Agarwal, my project guide, for sharing the best of her knowledge. I am thankful to her as she has been a constant source of advice, motivation and inspiration. I am also thankful to her for her suggestions and encouragement throughout the project work.
I take this opportunity to express my gratitude and thanks to our computer lab staff and library staff for giving me the opportunity to use their resources for the completion of the project.
I am also thankful to my family and friends for constantly motivating me to complete the project and for providing me with an environment that enhanced my knowledge.
Date: -
Enroll. – 35225502017
_____________________
CONTENTS
Education is the most important tool for changing society and bettering the nation. Literacy and level of education are fundamental indicators of the level of development achieved by a society. The spread of literacy is generally associated with important traits of modern development such as modernization, urbanization, trade and industrialization. Literacy makes a vital contribution to the overall improvement of society, enabling people to understand their social, political and cultural environment better and respond to it appropriately. Better education and literacy lead to greater awareness and also contribute to the improvement of economic and social conditions. The Ministry of Human Resource Development (DISE) releases data on literacy rates each year, which can be very useful in examining the different factors affecting the literacy rate of a state or an area. A well-structured dashboard that presents a proper analysis of the data will give a clear picture of literacy in different regions of India. The data to be analyzed is processed and cleaned to draw out the most important and significant features. The data is then analyzed to give the final result, which is presented on a dashboard, making it easy to understand.
CHAPTER 1
INTRODUCTION
INTRODUCTION TO LITERACY RATE ANALYSIS
Literacy has always been an issue for the world. Every country aims to achieve full literacy. Although the literacy rate has increased to a great extent, there is still a need to identify the areas that are lagging behind.
The following are the steps in approaching the analysis of the data.
COLLECTION OF DATASET
OBJECTIVE
Education is the most important tool for changing society and bettering the nation.
CHAPTER-2:
SOFTWARE REQUIREMENT SPECIFICATIONS
1.2.1. Python:
1.3.1. Anaconda:
plots and rich media, usually ending with the ".ipynb" extension.
1.4.1. Numpy:
1.4.2. Pandas:
1.4.3. Matplotlib:
1.4.4. Seaborn:
FUTURE SCOPE
This project deals with the analysis of the literacy rate in different states of India based on 680 factors. The dataset contains information for the year 2015-16 and was published by the HRD Ministry of India. We focus on finding the top five and the bottom five factors that influence the literacy rate of a given state. The government may use this analysis of literacy rates in future to compare it with past educational growth and to make changes that improve the quality of education and the facilities provided. This analysis may also help people living in rural areas, who lead a very different life compared to people living in urban areas. There is less motivation to go to school in rural areas, as many people tend to take up their parents' profession or business. An analysis like this can help a lot in such cases.
CHAPTER 3:
Python is an increasingly popular tool for data analysis. In recent years, a number of
libraries have reached maturity, allowing R and Stata users to take advantage of the beauty,
flexibility, and performance of Python without sacrificing the functionality these older
programs have accumulated over the years. Data analysis is the process of evaluating data
using analytical and statistical tools to discover useful information and aid in business
decision making. There are several data analysis methods, including data mining, text
analytics, business intelligence and data visualization.
CHAPTER 4:
SOURCE CODE AND OUTPUT SNAPSHOTS
We get this table by using the syntax: print(elementary.head())
Above is the state-wise elementary dataset, which we imported for analysis using pd.read_csv, the pandas method for importing datasets.
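As a minimal sketch of this import step, the snippet below reads a small in-memory CSV instead of the real file. The column names STATNAME, OVERALL_LI, MALE_LIT and FEMALE_LIT and all values are placeholders chosen for illustration, not the actual dataset.

```python
import io

import pandas as pd

# Stand-in for the state-wise elementary CSV file. Column names and values
# are illustrative placeholders, not the real dataset.
csv_text = io.StringIO(
    "STATNAME,OVERALL_LI,MALE_LIT,FEMALE_LIT\n"
    "State A,90.0,94.0,86.0\n"
    "State B,70.0,80.0,60.0\n"
    "State C,65.0,75.0,55.0\n"
)

# pd.read_csv accepts a file path or any file-like object.
elementary = pd.read_csv(csv_text)

# head() is a method, so it is called with parentheses; by default it
# returns the first 5 rows (here the frame only has 3).
print(elementary.head())
```

With the real data, the file path of the downloaded CSV would be passed to pd.read_csv in place of csv_text.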
Here, the second dataset, state-wise meta_elementary, is imported for the data analysis.
Here, this is the state-wise secondary dataset that we have read.
In the above code we check the shape of the dataset and check for null entries in the elementary dataset.
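A check of this kind could look like the following sketch. The tiny data frame and its column names are placeholders for illustration; one value is deliberately left missing so that the null count is non-zero.

```python
import pandas as pd

# Placeholder frame standing in for the elementary dataset; one literacy
# value is missing on purpose.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C"],
    "OVERALL_LI": [90.0, float("nan"), 65.0],
})

# shape is a (rows, columns) tuple.
n_rows, n_cols = elementary.shape

# isnull() marks missing cells; sum() counts them per column.
null_counts = elementary.isnull().sum()

print(n_rows, n_cols)
print(null_counts)
```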
Here, we print 2 rows of the elementary dataset imported above.
Using the built-in describe function we see the summary of the overall literacy rate: the maximum rate is 93.91 and the minimum is 63.82.
This will become clearer from the bar graph plots.
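The following sketch shows how describe exposes the maximum and minimum of a column. The values are placeholders, not the real literacy figures, and OVERALL_LI is assumed to be the column holding the overall literacy rate.

```python
import pandas as pd

# Placeholder values, not the real literacy rates.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C"],
    "OVERALL_LI": [90.0, 70.0, 65.0],
})

# describe() on a numeric column returns count, mean, std, min,
# quartiles and max as a Series indexed by statistic name.
summary = elementary["OVERALL_LI"].describe()

print(summary["max"], summary["min"])
```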
These are the area details of the schools, i.e. how much area is allocated to schools.
Here we look at which state has the lowest growth rate, and find that it is Nagaland.
Looking at the maximum growth rate, we find that Dadra and Nagar Haveli has the highest growth rate.
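One way to look up the states with the lowest and highest growth rate is sketched below. The column name GROWTHRATE and the values are assumptions made for illustration.

```python
import pandas as pd

# Placeholder data; GROWTHRATE is an assumed column name.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C"],
    "GROWTHRATE": [5.0, 25.0, 55.0],
})

# idxmin()/idxmax() return the index label of the smallest/largest value,
# which is then used to look up the state name.
lowest_state = elementary.loc[elementary["GROWTHRATE"].idxmin(), "STATNAME"]
highest_state = elementary.loc[elementary["GROWTHRATE"].idxmax(), "STATNAME"]

print(lowest_state, highest_state)
```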
To support the various analysis operations we have created an attribute DIFF_LIT from the existing attributes MALE_LIT and FEMALE_LIT. Deriving a new attribute in this way falls under data transformation. DIFF_LIT is obtained by subtracting FEMALE_LIT from MALE_LIT.
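A minimal sketch of this derived column follows. It assumes the difference is taken as male minus female, which matches the positive differences reported later in this chapter; the values are placeholders.

```python
import pandas as pd

# Placeholder literacy rates, not the real data.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B"],
    "MALE_LIT": [94.0, 80.0],
    "FEMALE_LIT": [86.0, 60.0],
})

# Assumption: DIFF_LIT = male rate minus female rate, so a positive value
# means male literacy is higher.
elementary["DIFF_LIT"] = elementary["MALE_LIT"] - elementary["FEMALE_LIT"]

print(elementary[["STATNAME", "DIFF_LIT"]])
```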
Now we will be able to see OVERALL_LI, described above, in the plotted graph.
In the above code we call the plot_barh function that we created at the start to plot the bars on the graph.
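The body of the plot_barh function is not reproduced in this report, so the helper below is only a hypothetical stand-in (named plot_barh_sketch to make that clear) showing how a horizontal bar chart of this kind can be drawn with matplotlib. The data and column names are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd


def plot_barh_sketch(df, label_col, value_col, title):
    """Draw one horizontal bar per row, ordered by value_col (hypothetical helper)."""
    ordered = df.sort_values(value_col)
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.barh(ordered[label_col], ordered[value_col])
    ax.set_xlabel(value_col)
    ax.set_title(title)
    fig.tight_layout()
    return ax


# Placeholder values, not the real literacy rates.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C"],
    "OVERALL_LI": [90.0, 70.0, 65.0],
})

ax = plot_barh_sketch(elementary, "STATNAME", "OVERALL_LI",
                      "Overall literacy rate by state")
```

In a notebook the Agg line can be dropped so that the chart is displayed inline.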
Here we compare the overall literacy rate across states.
From the bar graph above it is easy to see that Kerala has the highest literacy rate and Bihar the lowest.
The bar for Kerala is above 80% in the graph, which is consistent with the maximum of 93.91% returned by the describe method for the overall literacy rate. Similarly, the bar for Bihar is just above 60%, which is consistent with the minimum of 63.82%.
HIGHEST: Kerala.
LOWEST: Bihar.
Here we have compared the female literacy rate state-wise. From the above observation it has been seen that:
Highest: Lakshadweep
Lowest: Bihar.
In this bar graph we compare the male and female literacy rates state-wise.
In the above code we print the last few rows using the tail() function.
Next we print the average DIFF_LIT and compare it with the national average.
As seen, the north-east has a smaller average difference between male and female literacy rates than the country as a whole.
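The comparison of a regional average with the national average can be sketched as follows. The list region_states is a hypothetical stand-in for the north-eastern states, and all values are placeholders. Note that this is a simple unweighted mean over states, not a population-weighted one.

```python
import pandas as pd

# Placeholder data, not the real differences.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C", "State D"],
    "DIFF_LIT": [4.0, 6.0, 20.0, 18.0],
})

# Hypothetical list standing in for the north-eastern states.
region_states = ["State A", "State B"]

# isin() selects the rows for the region; mean() averages the column.
in_region = elementary["STATNAME"].isin(region_states)
region_avg = elementary.loc[in_region, "DIFF_LIT"].mean()
national_avg = elementary["DIFF_LIT"].mean()

print(region_avg, national_avg)
```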
Now we have printed the female literacy rate of Meghalaya and the average female literacy rate.
Next we create a new data frame named top_bottom, which contains only the top_3 and bottom_3 states by overall literacy rate.
Here we drop Telangana because it was newly founded, so it may have been difficult to build schools and all the facilities there.
In the next line of code we concatenate top_3_elem and bottom_3_elem into top_bottom with axis=0 and sort=False. We do not need to sort the data again because it was already sorted while making top_3_elem and bottom_3_elem.
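A sketch of how top_3_elem, bottom_3_elem and top_bottom could be built is given below. The data is a placeholder, and the use of sort_values with head and tail is an assumption about how the top and bottom states were selected.

```python
import pandas as pd

# Placeholder values, not the real literacy rates.
elementary = pd.DataFrame({
    "STATNAME": ["State A", "State B", "State C", "State D",
                 "State E", "State F", "State G", "State H"],
    "OVERALL_LI": [93.0, 70.0, 88.0, 64.0, 91.0, 66.0, 75.0, 80.0],
})

# Rank states from highest to lowest overall literacy rate.
ranked = elementary.sort_values("OVERALL_LI", ascending=False)
top_3_elem = ranked.head(3)
bottom_3_elem = ranked.tail(3)

# axis=0 stacks the rows of both frames. sort=False leaves the column
# order untouched; the row order of the inputs is preserved either way.
top_bottom = pd.concat([top_3_elem, bottom_3_elem], axis=0, sort=False)

print(top_bottom)
```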
Here we see the data for the 6 states obtained from the concat operation performed above: Kerala, Lakshadweep, Arunachal Pradesh, Bihar, Mizoram and Rajasthan.
These are the states considered as the top 3 and bottom 3.
In this code, the total population column of the top_bottom data frame is combined with the area in sq km and a scaling factor of 1000 to give the population per unit area.
Here we look at the difference between male and female literacy rates in the top_bottom data frame, i.e. for these 6 states, which gives the above output.
Rajasthan has the highest difference and Kerala has the lowest. In the state-wise male and female analysis above, Kerala showed very little difference between male and female literacy rates; the difference there is only 4.04%. In Rajasthan the difference is larger: the female literacy rate is only just above 40% while the male rate is close to 80%, and the female literacy rate is the lowest compared to all the other states. At 27.85%, Rajasthan has the highest difference between male and female literacy rates.
What we have analysed above through textual output becomes easier to understand through visualization.
In the above code we have plotted the same analysis discussed above: Rajasthan has the highest difference whereas Kerala has the lowest.
For this plot we specify what kind of bars or scatter points to use in displaying the graph. All of this is done using the top_bottom data frame.
Now we check the sex ratio in the top_bottom data frame. The sex ratio is the number of females per 1000 males in a state.
As we can see, Kerala has the highest sex ratio and Bihar has the lowest.
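The definition of the sex ratio can be written as a short worked example. The function name and the population counts are made up for illustration; in the dataset the ratio is read from an existing column rather than computed like this.

```python
def sex_ratio(females, males):
    """Return the number of females per 1000 males."""
    return females * 1000 / males


# Made-up counts for illustration only.
example_ratio = sex_ratio(females=480_000, males=500_000)
print(example_ratio)
```

A value below 1000 means there are fewer females than males, and a value above 1000 means there are more.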
Here, in visual form, we can see that Kerala has the highest sex ratio and Bihar has the lowest.
Here we can see that we have plotted SC_ST_POP, _POP, P_ST_POP.
Here, from the meta_secondary data, we get SCHTOT, in the same way as seen above.
These are some of the overall facts in the above datasets.
Here we look at the number of schools by category in the top_bottom data frame.
These are the schools by category in these 6 states. The highest numbers of schools by category are in Rajasthan and Bihar, followed by Kerala and so on.
Next we print the school kids: top_bottom[SCHKIDS] is created by adding 2 different attributes.
From the above we can see that Bihar has the highest number of school kids per school and Mizoram has the lowest.
Here we print TOTCLS1G from the top_bottom data frame.
From the above we can see that Bihar has the highest number of classes compared to Rajasthan, Kerala and Mizoram. As Mizoram has very few schools, it is likely that it has fewer classes as well.
Here we create KIDSPERCL by dividing SCHKIDS by TOTCLS1G in the top_bottom data frame.
This gives the kids per class, which is highest in Bihar and lowest in Mizoram.
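The division described above can be sketched as a single column operation. The values below are placeholders, not the real counts.

```python
import pandas as pd

# Placeholder counts, not the real data.
top_bottom = pd.DataFrame({
    "STATNAME": ["State A", "State B"],
    "SCHKIDS": [3000, 1000],
    "TOTCLS1G": [100, 50],
})

# Element-wise division gives the number of kids per class for each state.
top_bottom["KIDSPERCL"] = top_bottom["SCHKIDS"] / top_bottom["TOTCLS1G"]

print(top_bottom[["STATNAME", "KIDSPERCL"]])
```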
Here we create the elem data frame with elem[SCHKIDS] and elem[KIDSPERCL]; these values are then visualized with a plot.
Here we compare KIDSPERCL with OVERALL_LI.
Here we compare private schools and government schools, i.e. the types of schools, to see how many schools of each type there are in each state.
This code displays the private schools, government schools and madrasas for the whole analysis. We have prepared several attributes and then concatenate them to plot them in the same graph, so that the results can be compared relative to each other.
Kerala has the highest number of private schools, and it is the only state with such a high number of private schools compared to the other states; its numbers of madrasas and government schools are much lower than the national average.
Bihar and Arunachal Pradesh have far fewer private schools than government schools.
Rajasthan has 35% private schools, which is large compared to its literacy rate.
Here contie is the attribute we use to represent schools under development or expansion, i.e. schools that the government has granted permission for development and expansion.
Here elem is created to compare contie with the overall literacy rate, i.e. to see how many schools are under development at a given overall literacy rate.
These are the data on school development and granted permissions.
Now we have to find the maximum dropout rate from 8th to 9th in different states. To do that, we first sum C9_G and C9_B, and do the same for the class 8 enrolments. We then subtract the class 9 total from the class 8 total and divide by the total enrolment of class 8, which gives the proportion of students who dropped out from 8th to 9th.
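The dropout calculation can be sketched as below. C9_G and C9_B are the columns named in the text; C8_G and C8_B are assumed names for the class 8 enrolments, DROPOUT_8_9 is a hypothetical name for the result, and all values are placeholders. In the real data the class 8 and class 9 columns may come from different data frames that would first need to be aligned by state.

```python
import pandas as pd

# Placeholder enrolments; C8_B and C8_G are assumed column names.
enrol = pd.DataFrame({
    "STATNAME": ["State A", "State B"],
    "C8_B": [500, 400],
    "C8_G": [500, 400],
    "C9_B": [480, 300],
    "C9_G": [470, 300],
})

# Total enrolment (boys + girls) in each class.
class8_total = enrol["C8_B"] + enrol["C8_G"]
class9_total = enrol["C9_B"] + enrol["C9_G"]

# Proportion of class 8 students not enrolled in class 9. The value can be
# negative when class 9 enrolment exceeds class 8 enrolment.
enrol["DROPOUT_8_9"] = (class8_total - class9_total) / class8_total

print(enrol[["STATNAME", "DROPOUT_8_9"]])
```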
Using these we calculate the percentage of children dropping out from 8th to 9th in top_3 and bottom_3.
These are the data that we require to get the maximum dropout rate.
We use the above data to get the dropout rate; pd.concat is used for concatenating, i.e. merging the pieces into one data frame.
Now we look at the dropout rate in the top_bottom data frame.
We can see that the dropout rate in Kerala is very low. Mizoram, Lakshadweep and Kerala are in top_bottom.
For top_bottom we have shown the dropout rate.
Here are the details of the class-wise enrolment columns.
We store the total enrolment of each class in a column and then total all the columns of the data frame.
Here we plot the dropout rate.
CHAPTER-5:
CONCLUSION
CONCLUSION
From the above analysis we get to know the states with the highest and lowest literacy rates: overall (state-wise), between male and female, and for male and female literacy separately.
State Wise:
Highest: Kerala
Lowest: Bihar
Highest: Kerala
Lowest: Rajasthan
Highest: Lakshadweep
Lowest: Bihar
Maximum: 0.08
Minimum: -0.00
Schools by Category:
Government: Lakshadweep
Private: Kerala
Madrasas: Kerala
From this analysis we learned many concepts and figures about where literacy rates are low and where they are high. Many states have low literacy rates because schools are not available as required or because some facilities may not be provided to students. In many states schools exist, but students are unable to attend because the school may be far from home, because they may not be financially able, or because the schools may not have the features they want. Some students also drop out between 8th and 9th because they may not be able to continue. There are many such reasons why literacy rates are low in many of the states.
CHAPTER-6
BIBLIOGRAPHY/REFERENCES
BIBLIOGRAPHY
https://www.educationforallinindia.com/page167.html
https://www.ijstr.org/final-print/aug2019/Literacy-Rate-Analysis-Dashboard.pdf
http://dataworld.org
http://kaggle/.com
http://census.com