Department of I.T.
An
INDUSTRIAL TRAINING REPORT
On
Python using Data Science
Project Internship Title: “Internship Finder Application”
Training Organization: HCL TSS, Lucknow
[Sub code: RIT-753]
Semester 7th
Section: IT-41
Session: 2019-2020
Firstly, I would like to express my deep gratitude to the final-year industrial
training coordinators of BBDNITM, Mr. A.K. Gahalaut and Mr. Anurag Tiwari,
whose suggestions and encouragement helped me bring this report to a good
standard.
I am extremely thankful to HCL TSS, Lucknow, which gave me the golden
opportunity of completing my summer internship under the guidance of its
educational SKILL DEVELOPMENT CENTRE.
I would also like to acknowledge the guidance given by the other supervisors
and the panels, who motivated me and built my confidence in improving my
presentation skills.
And finally, I would like to offer many thanks to all my colleagues for their
valuable suggestions and constructive feedback.
1. Training Objective
3.1.1 History
3.1.2 Features
4.2 Study 1
4.3 Study 2
4.5 Study 4
7. Conclusion
8. References
CHAPTER 1
TRAINING OBJECTIVE
The main aim of the industrial training was to develop and strengthen project
skills. Hard work and perseverance towards learning something new were the basic
focus: we learn, we implement, and finally we produce a useful output. The Skill
Development Centre at HCL TSS builds the training around what the trainee wants
to be trained in.
I chose Python as both the backend and the frontend language for this project. The
major focus is on DATA SCIENCE, a currently trending field that can be used to
extend a project with newer technology.
The major objectives of this internship programme are outlined below:
1. To learn a trending technology, PYTHON 3, which makes coding simpler and
more efficient.
2. To study DATA SCIENCE, with a focus on its major concepts: scraping,
modification, analysis, display of data and more.
3. Learning Python together with data science helped me to complete
my project during the internship sessions.
4. Learning this new technology took me a step ahead in
developing my career.
5. The internship developed not only the drive to learn something new but
also the ability to produce a productive output.
6. It developed my interest in what I have learned and in what I will follow
up in future for better analysis.
7. Python helped me to learn the whole workflow, from installation to the
final binding.
8. It helped with data visualization, data modification, data analysis and the
display of data.
9. A major objective was to get to know the working environment of the corporate
sector.
CHAPTER 2
TRAINING ORGANISATION DETAILS
HCL Training & Staffing Services (HCL TSS) is a subsidiary company
of HCL Technologies Limited, created with a vision to provide a trained and skilled
workforce through its multiple training and hiring programs. Given the
ever-increasing demand for quality talent within HCL, there was a significant need to
create a talent pool that would be equipped with the requisite expertise and would
be technically and professionally prepared to join the highly specialized
workforce at HCL. Thus was conceived the idea of HCL TSS, with the
objective of becoming the largest integrated talent-solutions company in India,
preparing a skilled workforce for the future.
Recruitment experts have noted that the skill gap across industries is primarily
an education issue, one that creates a mismatch between what gets taught to
aspiring students in institutions and the expectations awaiting them in the
real-life job environment. HCL TSS creates the much-needed bridge between
deserving talent across the country and vacant jobs that are difficult to fill due to
a shortage of skills. HCL TSS offers best-in-class, skill-based training programs
for entry-level job roles across HCL. Candidates interested in kick-starting their
IT career with HCL can apply for its fee-based training and hiring programs.
HCL TSS offers training programs for Science graduates and for Engineering
graduates and postgraduates.
HCL Technologies Limited is an Indian multinational information
technology (IT) service and consulting company headquartered in Noida, Uttar
Pradesh. It is a subsidiary of HCL Enterprise. Originally a research and
development division of HCL, it emerged as an independent company in 1991
when HCL entered into the software services business.
The company has offices in 42 countries, including the United Kingdom, the
United States, France, and Germany, with a worldwide network of R&D,
"innovation labs" and "delivery centers". It has 137,000+ employees, and its
customers include 250 of the Fortune 500 and 650 of the Global 2000
companies. It operates across sectors including aerospace and defense,
automotive, banking, capital markets, chemical and process industries, energy
and utilities, healthcare, hi-tech, industrial manufacturing, consumer goods,
insurance, life sciences, manufacturing, media and entertainment, mining and
natural resources, oil and gas, retail, telecom, and travel, transportation, logistics
and hospitality.
HCL Technologies is on the Forbes Global 2000 list. It is among the top 20
largest publicly traded companies in India, with a market capitalisation of
$18.7 billion as of May 2017. As of May 2018, the company, along with its
subsidiaries, had a consolidated revenue of $7.8 billion.
HCL Enterprise was founded in 1976.
The first three subsidiaries of parent HCL Enterprise were:
CHAPTER 3
3.1.1 HISTORY-
Python was conceived in the late 1980s by Guido van Rossum at Centrum
Wiskunde & Informatica (CWI) in the Netherlands as a successor to the ABC
language (itself inspired by SETL), capable of exception handling and
interfacing with the Amoeba operating system.
Python 3.0 was released on 3 December 2008. It was a major revision of the
language that is not completely backward-compatible. Python 3 releases have
shipped the 2to3 utility, which automates (at least partially) the translation of
Python 2 code to Python 3.
3.1.2 FEATURES-
Python is a multi-paradigm programming language. Object-oriented
programming and structured programming are fully supported, and many of its
features support functional programming and aspect-oriented
programming (including by metaprogramming and metaobjects (magic
methods)). Many other paradigms are supported via extensions, including design
by contract and logic programming.
Python uses dynamic typing, and a combination of reference counting and a
cycle-detecting garbage collector for memory management. It also features
dynamic name resolution (late binding), which binds method and variable
names during program execution.
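A minimal sketch of what dynamic typing and late binding look like in practice. The names `value`, `greet` and `helper` are made up for this illustration and do not come from the project.

```python
# Dynamic typing: a name is not tied to one type; it can be rebound freely.
value = 42
type_before = type(value).__name__
value = "forty-two"
type_after = type(value).__name__


def greet():
    # Late binding: `helper` is looked up when greet() runs, not when it is defined.
    return helper()


def helper():
    return "first helper"


first_result = greet()


def helper():  # rebinding the name changes what greet() calls next time
    return "second helper"


second_result = greet()

print(type_before, type_after)      # int str
print(first_result, second_result)  # first helper second helper
```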
Python's design offers some support for functional programming in
the Lisp tradition. It has filter, map, and reduce functions; list
comprehensions, dictionaries, sets and generator expressions. The standard
library has two modules (itertools and functools) that implement functional
tools borrowed from Haskell and Standard ML.
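The functional tools mentioned above can be sketched as follows. The stipend figures are made-up sample values; note that in Python 3 `reduce` is imported from `functools`.

```python
from functools import reduce
from itertools import accumulate

stipends = [5000, 0, 12000, 8000, 0]  # made-up sample values

paid = list(filter(lambda s: s > 0, stipends))       # keep non-zero entries
in_thousands = list(map(lambda s: s / 1000, paid))   # transform each entry
total = reduce(lambda acc, s: acc + s, paid, 0)      # fold the list into one value

# The same filter-and-map step written as a list comprehension.
in_thousands_lc = [s / 1000 for s in stipends if s > 0]

# A generator expression produces values lazily, one at a time.
largest = max(s for s in stipends if s > 0)

# itertools.accumulate yields the running totals.
running_total = list(accumulate(paid))

print(paid, in_thousands, total, largest, running_total)
```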
Web frameworks
Multimedia
Databases
Networking
Test frameworks
Automation
Web scraping
Documentation
System administration
Scientific computing
Text processing
Image processing
3.1.4 APPLICATION and USES:
Python can serve as a scripting language for web applications, e.g.,
via mod_wsgi for the Apache web server. With the Web Server Gateway Interface
(WSGI), a standard API has evolved to facilitate these applications. Web
frameworks like Django, Pylons, Pyramid, TurboGears, web2py, Tornado, Flask,
Bottle and Zope support developers in the design and maintenance of complex
applications.
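To make the WSGI idea concrete, here is a minimal sketch of a WSGI application using only the standard library's `wsgiref` module. The greeting text and the port number are assumptions for illustration; the project itself does not describe a WSGI component.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A WSGI app: receives the request environ and a start_response callable."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]  # the response body is an iterable of byte strings


def serve(port=8000):
    # Development server from the standard library; blocks until interrupted.
    with make_server("", port, application) as httpd:
        httpd.serve_forever()
```

Calling `serve()` starts a local development server; frameworks such as Flask or Django expose the same kind of callable to production servers like mod_wsgi.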
Libraries such as NumPy, SciPy and Matplotlib allow the effective use of
Python in scientific computing, with specialized libraries such
as Biopython and Astropy providing domain-specific functionality.
Data analysis is the process of evaluating data using analytical and statistical
tools to discover useful information and aid in business decision making. There
are several data analysis methods, including data mining, text analytics,
business intelligence and data visualization.
3.2.2 DATA MINING-
Data mining is a method of data analysis for discovering patterns in large data
sets using the methods of statistics, artificial intelligence, machine learning and
databases. The goal is to transform raw data into understandable business
information. These might include identifying groups of data records (also
known as cluster analysis), or identifying anomalies and dependencies between
data groups.
Increasing amounts of data are being generated by a number of sensors in the
environment (referred to as “Internet of Things” or “IOT”). This data (referred
to as “big data”) presents challenges in understanding which can be eased by
using the tools of Data visualization. Data visualization is used in the following
applications.
Extracting summary data from the raw data of IOT.
Using a bar chart to represent sales performance over several quarters.
A histogram shows distribution of a variable such as income by dividing
the range into bins.
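The binning behind a histogram can be sketched in plain Python as follows. The income figures are made-up sample data and the function name `histogram` is chosen for this illustration only.

```python
def histogram(values, bin_count):
    """Count how many values fall into each of `bin_count` equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count or 1  # avoid a zero width when all values match
    counts = [0] * bin_count
    for v in values:
        # The maximum value sits on the top edge, so clamp it into the last bin.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(bin_count + 1)]
    return edges, counts


incomes = [12, 15, 21, 22, 25, 31, 38, 40, 47, 60]  # made-up sample, in thousands
edges, counts = histogram(incomes, 4)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * n}")
```

A plotting library such as Matplotlib performs the same binning internally before drawing the bars.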
If the value in cell A1 is 100 and you change that value, you are
modifying it.
If you create a formula in A2 that uses the value in A1, you are
manipulating it, storing the result in A2.
If you write a macro that picks up the value in A1, runs a calculation
then saves the result back in A1, you are both manipulating and
modifying it.
CHAPTER 4
4.2 STUDY 1:
Python offers multiple options for developing a GUI (Graphical User Interface).
Of all the GUI options, tkinter is the most commonly used. It is the
standard Python interface to the Tk GUI toolkit and is shipped with Python.
Python with tkinter is one of the fastest and easiest ways to create GUI applications.
Creating a GUI using tkinter is an easy task.
To create a tkinter application:
1. Import the tkinter module
2. Create the main window (container)
3. Add any number of widgets to the main window
4. Apply the event trigger on the widgets.
Importing tkinter is the same as importing any other module in Python code.
Note that the name of the module in Python 2.x is ‘Tkinter’ and in Python 3.x is
‘tkinter’.
import tkinter
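The four steps can be sketched as a small click counter. This is an illustrative example, not the project's code: the window title, the widget texts and the function name `build_app` are assumptions, and `pack()` is used here as the layout call.

```python
def build_app():
    # Step 1: import the module. Done inside the function here so that merely
    # loading this file does not require Tk to be installed.
    import tkinter as tk

    # Step 2: create the main window (container).
    root = tk.Tk()
    root.title("Click counter")  # hypothetical title

    # Step 3: add widgets to the main window.
    count = tk.IntVar(master=root, value=0)
    label = tk.Label(root, textvariable=count)
    label.pack()  # pack() is one of tkinter's geometry managers

    # Step 4: attach an event trigger; `command` runs on every button click.
    def on_click():
        count.set(count.get() + 1)

    button = tk.Button(root, text="Click me", command=on_click)
    button.pack()
    return root


# To show the window on a machine with a display:
# build_app().mainloop()
```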
2. grid() method: It organizes the widgets in a grid (table-like structure) before
placing them in the parent widget.
3. place() method: It organizes the widgets by placing them at specific
positions directed by the programmer.
There are a number of widgets which you can put in your tkinter application.
Some of the major widgets are explained below:
1. Button: This widget is used to add a button to your application.
The general syntax is:
w=Button(master, option=value)
There are a number of options which are used to change the format of the
button. Any number of options can be passed as parameters separated by commas.
Some of them are listed below.
activebackground: to set the background color when the button is under the
cursor.
activeforeground: to set the foreground color when the button is under the
cursor.
bg: to set the normal background color.
command: to call a function.
font: to set the font on the button label.
image: to set the image on the button.
width: to set the width of the button.
height: to set the height of the button.
2. Canvas: It is used to draw pictures and other complex layouts such as graphics,
text and widgets.
The general syntax is:
w = Canvas(master, option=value)
master is the parameter used to represent the parent window.
There are a number of options which are used to change the format of the widget.
Any number of options can be passed as parameters separated by commas. Some of
them are listed below.
bd: to set the border width in pixels.
bg: to set the normal background color.
cursor: to set the cursor used in the canvas.
highlightcolor: to set the color shown in the focus highlight.
width: to set the width of the widget.
height: to set the height of the widget.
3. Checkbutton: It displays a number of options to the user as toggle buttons,
from which any number can be selected. The general syntax is:
w = Checkbutton(master, option=value)
There are a number of options which are used to change the format of this widget.
Any number of options can be passed as parameters separated by commas. Some of
them are listed below.
text: to set the label text of the widget.
activebackground: to set the background color when the widget is under the
cursor.
activeforeground: to set the foreground color when the widget is under the
cursor.
bg: to set the normal background color.
command: to call a function.
font: to set the font on the widget label.
image: to set the image on the widget.
4. Entry: It is used to input a single line of text from the user. For
multi-line text input, the Text widget is used.
The general syntax is: w=Entry(master, option=value)
master is the parameter used to represent the parent window.
There are a number of options which are used to change the format of the widget.
Any number of options can be passed as parameters separated by commas. Some of
them are listed below.
bd: to set the border width in pixels.
bg: to set the normal background color.
cursor: to set the cursor used.
highlightcolor: to set the color shown in the focus highlight.
width: to set the width of the entry, in characters.
5. Frame: It acts as a container to hold the widgets. It is used for grouping
and organizing the widgets. The general syntax is:
w = Frame(master, option=value)
master is the parameter used to represent the parent window.
6. Label: It refers to the display box where you can put any text or image,
which can be updated at any time as per the code.
The general syntax is:
w=Label(master, option=value)
There are a number of options which are used to change the format of the
widget. Any number of options can be passed as parameters separated by commas.
7. Menubutton: It is a part of a drop-down menu which stays on the window all
the time. Every menubutton has its own functionality. The general syntax
is:
w = Menubutton(master, option=value)
8. Menu: It is used to create all kinds of menus used by the application.
The general syntax is:
w = Menu(master, option=value)
9. Message: It displays multi-line, non-editable text.
The general syntax is:
w = Message(master, option=value)
10. Radiobutton: It is used to offer a multiple-choice option to the user. It offers
several options and the user has to choose exactly one.
The general syntax is:
w = Radiobutton(master, option=value)
11. Scrollbar: It is a slide controller used to scroll widgets such as
Listbox, Text and Canvas.
The general syntax is:
w = Scrollbar(master, option=value)
master is the parameter used to represent the parent window.
12. Text: It is used to edit multi-line text and format the way it is displayed.
The general syntax is:
w = Text(master, option=value)
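A short sketch that combines several of the widgets above (Frame, Label, Entry, Checkbutton, Button) and lays them out with grid(). The labels, the window title and the function name `build_form` are assumptions for illustration; note that the class names in tkinter are spelled `Checkbutton`, `Menubutton` and `Radiobutton`.

```python
def build_form():
    import tkinter as tk  # imported here so loading this file does not need Tk

    root = tk.Tk()
    root.title("Widget demo")  # hypothetical title

    frame = tk.Frame(root, bd=2)  # Frame groups the other widgets
    frame.grid(row=0, column=0, padx=10, pady=10)

    tk.Label(frame, text="Name").grid(row=0, column=0, sticky="w")
    name_entry = tk.Entry(frame, width=30)  # single-line text input
    name_entry.grid(row=0, column=1)

    remember = tk.BooleanVar(master=root, value=False)
    tk.Checkbutton(frame, text="Remember me", variable=remember).grid(
        row=1, column=1, sticky="w")

    status = tk.Label(frame, text="")
    status.grid(row=3, column=0, columnspan=2)

    def on_submit():
        # Runs when the button is clicked; updates the Label in place.
        status.config(text=f"Hello, {name_entry.get() or 'stranger'}")

    tk.Button(frame, text="Submit", command=on_submit).grid(
        row=2, column=1, sticky="e")
    return root


# On a machine with a display: build_form().mainloop()
```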
4.3 STUDY 2:
Web scraping is used to collect large amounts of information from websites. But why
would someone need to collect so much data from websites? To answer this,
let’s look at the applications of web scraping:
Social Media Scraping: Web scraping is used to collect data from Social
Media websites such as Twitter to find out what’s trending.
Research and Development: Web scraping is used to collect a large set
of data (Statistics, General Information, Temperature, etc.) from websites,
which are analyzed and used to carry out Surveys or for R&D.
Job listings: Details regarding job openings and interviews are collected
from different websites and then listed in one place so that they are easily
accessible to the user.
You have probably heard how good Python is, but so are other languages. Then
why should we choose Python over other languages for web scraping?
Here is a list of Python features that make it well suited to web
scraping.
Ease of Use: Python is simple to code. You do not have to add semicolons
“;” or curly braces “{}” anywhere. This makes it less messy and
easy to use.
Large Collection of Libraries: Python has a huge collection of libraries
such as NumPy, Matplotlib, Pandas etc., which provide methods and
services for various purposes. Hence, it is suitable for web scraping and
for further manipulation of extracted data.
Dynamically typed: In Python, you don’t have to declare datatypes for
variables; you can directly use the variables wherever required. This
saves time and makes your job faster.
Easily Understandable Syntax: Python syntax is easy to understand,
mainly because reading Python code is very similar to reading a
statement in English. It is expressive and readable, and the
indentation used in Python also helps the user to differentiate between
different scopes/blocks in the code.
Small code, large task: Web scraping is used to save time. But what’s
the use if you spend more time writing the code? Well, you don’t have to.
In Python, you can write a small amount of code to do large tasks. Hence, you save
time even while writing the code.
Community: What if you get stuck while writing the code? You don’t
have to worry. Python has one of the biggest and most active
communities, from which you can seek help.
How to scrape?
When you run the code for web scraping, a request is sent to the URL that you
have mentioned. As a response to the request, the server sends the data and
allows you to read the HTML or XML page. The code then parses the HTML
or XML page, finds the data and extracts it.
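The request, parse and extract cycle described above can be sketched with the standard library alone. This is an illustration under stated assumptions: the `<h3 class="title">` markup, the class name `TitleExtractor` and the sample page are hypothetical, and the report does not say which parsing library the project used.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch(url):
    """Send a request to `url` and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class TitleExtractor(HTMLParser):
    """Collect the text of every <h3 class="title"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "h3" and ("class", "title") in attrs:
            self._inside = True
            self.titles.append("")

    def handle_data(self, data):
        if self._inside:
            self.titles[-1] += data

    def handle_endtag(self, tag):
        if tag == "h3":
            self._inside = False


def extract_titles(html):
    parser = TitleExtractor()
    parser.feed(html)
    return [t.strip() for t in parser.titles]


# A small inline page stands in for the server response in this sketch.
sample = ('<div><h3 class="title">Python Intern</h3><p>Remote</p>'
          '<h3 class="title">Data Analyst Intern</h3></div>')
print(extract_titles(sample))
```

With a real site, `extract_titles(fetch(url))` would run the same steps against the downloaded page, subject to that site's terms of use.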
To extract data using web scraping with python, you need to follow these basic
steps:
Step 4: Write the code
Step 5: Run the code and extract the data
Step 6: Store the data in a required format
4.4 STUDY 3:
A CSV is a text file, so it can be created and edited using any text editor. A
CSV file can also be created by exporting (File menu -> Export) a spreadsheet or
database from the program that created it.
In Python, the contents of the CSV file are held as lists and
dictionaries. These hold the data that has been scraped from the online
website; filters are then applied and the records are sorted
by a particular row or column to be displayed in the GUI.
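A minimal sketch of reading CSV data into a list of dictionaries and then filtering and sorting it by a column. The column names and rows are made up for illustration; an in-memory string stands in for the scraped file.

```python
import csv
import io

# Made-up rows standing in for a scraped file; the column names are assumptions.
raw = io.StringIO(
    "title,location,stipend\n"
    "Python Intern,Lucknow,8000\n"
    "Web Intern,Delhi,5000\n"
    "Data Intern,Lucknow,12000\n"
)

rows = list(csv.DictReader(raw))  # a list of dictionaries, one per row
for row in rows:
    row["stipend"] = int(row["stipend"])  # CSV values arrive as strings

lucknow = [r for r in rows if r["location"] == "Lucknow"]  # filter on a column
by_stipend = sorted(lucknow, key=lambda r: r["stipend"], reverse=True)

for r in by_stipend:
    print(r["title"], r["stipend"])
```

To read a file on disk instead, pass `open(path, newline="")` to `csv.DictReader`.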
4.5 STUDY 4:
After the data stored in the CSV file has been visualized and modified, it is
sorted for display. The tkinter GUI then displays the data, which is held in
the form of a dictionary.
CHAPTER 5
Project Internship:
Internship Finder helps the user find the best internship according
to the given requirements.
The information on all internships is collected from online
websites. This scraped data is then sent for analysis: filters are applied
to it, and the matching records are retrieved for further
analysis.
The general process of the Internship Finder is to scrape data from
online internship websites like internshala.com, WorkIndia.com, shine.com,
AllCareerPoint.com and many more. It is basically a web scraper that is used for
collecting data for analysis. Filters are applied to the data,
and the information is fetched accordingly. These filters are based on
specific choices such as the gross value of the company, whether a stipend is given,
the duration in days or months, preferred locations, facilities provided,
employee feedback and the overall working environment.
after the internship is completed. The user can choose the internship based on their
requirements. After the internship is over, the administrator can change the
password and can also delete the accounts of previously registered users. The
administrator maintains records of users who have registered for a particular
internship on a specific website.
User’s Functions :
Can update the details in his/her profile.
Can view the various internship details with the detailed descriptions,
criteria etc.
Can register for an internship.
Can log in with the registered user ID and password.
Can give the suitable location, duration and qualification for the
internship.
Can search for internships according to their needs.
Can choose from the internships scraped with the applied filters.
Receives a message when an incorrect email ID or password is entered, so
that the correct email ID or password can be entered.
Can edit the details accordingly.
Our users are college students and other people who are searching online,
across various websites, for the best internship in order to gain good
experience and learn new things.
The problem faced by users searching for an internship is that they are
unable to find a suitable location or duration, do not know whether the company
provides a stipend, and face various other issues. The information to be displayed
can be gathered by answering the following questions:
4. What are the problems students face while searching for an internship on
the large websites?
CHAPTER 6
Business/Project Objective:
Construct an online web scraper named Internship Finder to scrape data
from different websites like Internshala.com, Shine.com and many other
such websites. The scraped data will be presented in the form of a comma-separated
values (CSV) file. The CSV file will contain five columns on which filters will be
applied, and the top five internships will be displayed to the user based on their needs.
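A minimal sketch of applying filters and keeping the top five results. The five column names, the sample rows and the choice to rank by stipend are all assumptions made for illustration; the report does not name the columns or the ranking rule.

```python
import csv
import io

# Hypothetical five-column layout; the report does not name the columns.
FIELDS = ["title", "company", "location", "duration_months", "stipend"]

rows = [dict(zip(FIELDS, r)) for r in [
    ("Python Intern", "Alpha", "Lucknow", 2, 8000),
    ("Web Intern", "Beta", "Delhi", 1, 5000),
    ("Data Intern", "Gamma", "Lucknow", 3, 12000),
    ("ML Intern", "Delta", "Lucknow", 2, 0),
    ("QA Intern", "Epsilon", "Lucknow", 1, 6000),
    ("UI Intern", "Zeta", "Lucknow", 2, 7000),
    ("DB Intern", "Eta", "Lucknow", 1, 9000),
    ("Ops Intern", "Theta", "Lucknow", 2, 4000),
]]


def top_internships(rows, location=None, max_months=None, paid_only=False, limit=5):
    """Apply the optional filters, then keep the `limit` highest-stipend rows."""
    matches = [
        r for r in rows
        if (location is None or r["location"] == location)
        and (max_months is None or r["duration_months"] <= max_months)
        and (not paid_only or r["stipend"] > 0)
    ]
    return sorted(matches, key=lambda r: r["stipend"], reverse=True)[:limit]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


top = top_internships(rows, location="Lucknow", max_months=2, paid_only=True)
print(to_csv(top))
```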
Hardware Requirements:
A laptop with an Intel Core i5 processor, sufficient to run all the tools
needed to complete the project.
4 gigabytes of RAM or higher.
Software Requirements:
Jupyter Notebook
Python 3.7/ Anaconda 64 bit
Visual Studio Code
MySQL database connectivity
SQLite
Web browser
Scope of the Work (in Brief):
Depending on the scraped data, the top internships will be
listed, and their clickable links will open in the web browser. With one click, a
user searching for an internship can apply directly.
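Fig: 6.2.1 (diagram not recovered; surviving label: Database tier)
Fig: 6.3.1 (Login form)
Fig: 6.3.2 (form labels: Username, password, Name, Email ID, Re-Type Password; button: REGISTER)
Fig: 6.3.3 (search form labels: Field, Location, Duration, 1 month, 2 months, more, Stipend Type, Payed, Unpayed, Full time; button: FIND)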
6.4.1 Table : Staff (for registration)
Field Name   Data Type     Null   Key   References Table   Description
Name         Varchar(50)   No                              Stores the name of the user
User         Varchar(50)   No     Pk                       Stores the user ID
Passw        Varchar(50)   No                              Stores the password
e-mail       Varchar(50)   No                              Stores the email of the user
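The table above can be sketched in SQLite, which is listed among the software requirements. This is an assumption-laden illustration: the column names `user_id` and `email` are adapted from "User" and "e-mail" so that they are plain SQL identifiers, the `register` helper is hypothetical, and the sample user is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for this sketch
conn.execute(
    """
    CREATE TABLE staff (
        name    VARCHAR(50) NOT NULL,
        user_id VARCHAR(50) NOT NULL PRIMARY KEY,  -- "User" in the table above
        passw   VARCHAR(50) NOT NULL,
        email   VARCHAR(50) NOT NULL               -- "e-mail" in the table above
    )
    """
)


def register(conn, name, user_id, passw, email):
    """Insert one user; return False if the user ID is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO staff (name, user_id, passw, email) VALUES (?, ?, ?, ?)",
                (name, user_id, passw, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Note: a real application should store a salted hash, not the password itself.
first = register(conn, "Asha", "asha01", "secret", "asha@example.com")
duplicate = register(conn, "Asha B", "asha01", "other", "ashab@example.com")
print(first, duplicate)  # True False
```

The primary key on the user ID is what makes the second registration fail, which matches the "Pk" marking in the table.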
6.5.1 Registration Page:
The registration page is used to register the details of the user who is applying
for an internship. This page contains the Name, user name, password and
email ID data fields for entering the user’s data. On clicking the ‘REGISTER ME’
button, the user is registered. A dialog box then appears with the
following information. On clicking the OK button, the user goes back to the
login window to log in.
Img: 6.5.1
6.5.2 Login Page :
This page allows the user to log in with the registered user name and password
by clicking the login button.
Img: 6.5.2
Img: 6.5.3
6.7 SNAPSHOTS OF DATABASE TABLES
This is the final list of the top five internships, scraped, analysed and
displayed from the CSV file in table format.
6.8 Project Directory Structure :
Internship Finder
Images
users
others
Desktop INF
myDatabase
Online Internship
The My_InternshipFinder directory contains all the .csv files on the root.
The others folder contains all the other images used in the project.
The online internship directory contains all the .py files required by the
project.
The myDatabase folder is used to store the registration details during the
file upload process.
CHAPTER 7
CONCLUSION
This project is aimed at developing a web scraper called
Internship Finder that helps users search for a quality internship.
Internship Finder is an application that any user can access, with results based on
the filters applied. The system can be used to automate the workflow of finding
internships that satisfy the following criteria.
Addresses the problems students face while searching for an internship on
large websites like internshala.com.
To reduce the effort of scrolling through listings in search of a suitable
internship, the web scraper Internship Finder provides filters based on
location, duration and whether a stipend is offered. The scraper finally
displays the top five scraped internships.
CHAPTER 8
REFERENCES
www.geeksforgeeks.org
www.edureka.com
www.stackoverflow.com
www.towardsdatascience.com
www.wikipedia.com
www.google.com