BACHELOR OF TECHNOLOGY
GURU NANAK DEV ENGINEERING COLLEGE
LUDHIANA 141006
ACKNOWLEDGEMENT
I am highly grateful to Dr. Sehijpal Singh, Principal, Guru Nanak Dev Engineering College (GNDEC), Ludhiana, for providing this opportunity to carry out six-month industrial training at ANSH Infotech.
The constant guidance and encouragement received from Prof. G.S. Sodhi, Dean T&P, GNDEC Ludhiana, has been of great help in carrying out the project work and is acknowledged with reverential thanks.
I would like to express a deep sense of gratitude and profuse thanks to Mr. Ansh Aneja, Director/CEO of the company. Without his wise counsel and able guidance, it would have been impossible to complete this report in its present form.
The help rendered by Mr. Akashdeep Singh, Data Scientist of the CSE department, with the experimentation is gratefully acknowledged.
I express gratitude to the other faculty members of the Information Technology department of GNDEC for their intellectual support throughout the course of this work.
Finally, I am indebted to all who have contributed to this report and to a friendly stay at Ansh Infotech.
Dinesh Kumar.
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
2.1) Overview
3.3) Database Design
3.4) ER Diagram
5.1) Conclusion
REFERENCES
LIST OF FIGURES
Figure 1: ER Diagram
LIST OF TABLES
Table 1: DFD Symbols
[Chapter-1] Introduction To Company
The company was founded with its core values focused on being Creative, People Driven, Socially Conscious, Intense and Ethical. The company provides software products and IT services for a variety of industries including insurance, banking, capital markets and retail. These products and services include business intelligence, web development, IT consulting and various transaction processing services. We undertake projects located around the world and are committed to creating a challenging and rewarding work environment in which our team members are goal-focused and motivated throughout our clients' engagements. Our goal is to provide the most cost-effective solutions while keeping in mind the urgency to launch products without sacrificing quality. Excellence lies at our heart; we constantly challenge convention. Excellence, Partnership and Commitment are the three hallmarks with which we approach our clients; we believe in them, and you will see this in our service. The company also has a Training division that offers learning management and training delivery solutions to corporations, institutions and individuals. It is a global leader in skills and talent development with three main lines of business: Corporate Learning Group (CLG), Skills
[Chapter-2] Introduction To Project
[2.1] Overview
Prediction and diagnosis of heart disease have become a challenge faced by doctors and hospitals both in India and abroad. In order to reduce the large number of deaths from heart disease, a quick and efficient detection technique needs to be developed. Data mining techniques and machine learning algorithms play a very important role in this area. Researchers are accelerating their work to develop software that, with the help of machine learning algorithms, can help doctors take decisions regarding both prediction and diagnosis of heart disease. The main objective of this work is to predict the heart disease of a patient using machine learning algorithms; a comparative study of the performance of the various machine learning algorithms is done through graphical representation of the results. The highest mortality in both India and abroad is due to heart disease, so it is vital to check this death toll by correctly identifying the disease at an initial stage. The matter has become a headache for doctors both in India and abroad. Nowadays doctors are adopting many scientific technologies and methodologies for both identification and diagnosis not only of common diseases, but also of many fatal diseases. Successful treatment is always attributed to a right and accurate diagnosis. Doctors may sometimes fail to take accurate decisions while diagnosing the heart disease of a patient; heart disease prediction systems that use machine learning algorithms assist in such cases to obtain accurate results.
Though there are many existing systems, they tend to be very expensive: if a user wants a regular heart check-up, hospital visits become quite costly. A better alternative is a heart disease prediction system developed and modified in Python, which can give the user an idea of whether he or she is likely to face a heart problem in the future. Through this, a user can regularly check for heart problems to get a rough idea without spending a huge amount of money in hospitals.
A good SRS defines how the software system will interact with all internal modules, hardware and other programs, and how human users will interact with it across a wide range of real-life scenarios. Using the Software Requirements Specification (SRS) document, the QA lead and managers create the test plan. It is very important that testers be clear about every detail specified in this document in order to avoid faults in test cases and their expected results. It is highly recommended to review the SRS document before starting to write test cases.
The hardware requirements cover the hardware needed for the system to work. They include:
1) Pentium 4 computer (minimum).
2) 512 MB RAM (minimum).
3) Hard disk: 40 GB (minimum).
4) Network interface (for communication).
This assessment is based on an outline design of system requirements, to determine whether the organization has the technical expertise to handle completion of the project. When writing the report, the following were under consideration:
1) A brief description of the framework, to assess further factors which could affect the study.
2) The part of the business being examined (e.g. marketing, economic).
3) The human and economic factors.
4) The possible solutions to the problem within minimum time.
The technical feasibility assessment is focused on gaining an understanding of the present technical resources of the organization and their applicability to the expected needs of the proposed system. It is an evaluation of the hardware and software and how they meet the needs of the proposed system. After the above analysis, it was concluded that the system is technically feasible.
The purpose of an economic feasibility study is to demonstrate the net benefit of a proposed project
for accepting or disbursing electronic funds/benefits, taking into consideration the benefits and
costs to the agency, other state agencies, and the general public as a whole.
Operational feasibility is the measure of how well a proposed system solves the problems and takes advantage of the opportunities identified during scope definition, and how it satisfies the requirements identified in the requirements analysis phase of development. The operational feasibility assessment focuses on the degree to which the proposed development project fits in with the existing business environment and objectives with regard to development schedule, delivery date, corporate culture and existing business processes. To ensure success, desired operational outcomes must be imparted during design and development. These include such design-dependent parameters as reliability, maintainability, supportability, usability, producibility, disposability, sustainability, affordability and others.
3.3) Database Design
A data dictionary is a structured repository of data about data. It is a set of rigorous definitions of all the data elements and data structures used in the system.
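As a minimal illustration, a few entries of such a data dictionary can be written down directly; the field names, units and value codings below are assumptions based on common public heart-disease data sets, not definitions taken from this project.

# A sketch of data dictionary entries for a heart-disease data set.
# Field names, units and value codings are assumed, not taken from this report.
data_dictionary = {
    "age":    {"type": "integer", "unit": "years", "description": "Age of the patient"},
    "sex":    {"type": "integer", "values": {0: "female", 1: "male"}, "description": "Sex of the patient"},
    "chol":   {"type": "integer", "unit": "mg/dl", "description": "Serum cholesterol"},
    "target": {"type": "integer", "values": {0: "no disease", 1: "disease"}, "description": "Diagnosis label"},
}

for field, meta in data_dictionary.items():
    print(field, "->", meta["description"])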
[Table-1] Data flow diagram (DFD) symbols:
1) An entity: a source of data or a destination for data.
2) A process: a task that is performed by the system.
3) A data store: a place where data is held between processes.
4) A data flow: the movement of data between the other elements.
3.4) ER Diagram
An entity relationship diagram (ERD) shows the relationships of entity sets stored in a database. An
entity in this context is a component of data. In other words, ER diagrams illustrate the logical
structure of databases.
At first glance an entity relationship diagram looks very much like a flowchart. It is the specialized
symbols, and the meanings of those symbols, that make it unique.
3.5) Specific Requirements
The functional requirements part discusses the functionalities required from the system. The system is considered to perform a set of high-level functions {f_i}. The functional view of the system is shown. Each function f_i of the system can be considered as a transformation of a set of input data (i_i) to the corresponding set of output data (o_i). The user can get some
Python uses dynamic typing, and a combination of reference counting and a cycle-detecting
garbage collector for memory management. It also features dynamic name resolution (late binding),
which binds method and variable names during program execution.
Python's design offers some support for functional programming in the Lisp tradition.
It has filter(), map(), and reduce() functions; list comprehensions, dictionaries, and sets;
and generator expressions. The standard library has two modules (itertools and functools) that
implement functional tools borrowed from Haskell and Standard ML.
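A minimal sketch of these features (illustrative code, not taken from this project):

from functools import reduce
from itertools import count, islice

nums = [1, 2, 3, 4, 5]

evens   = list(filter(lambda n: n % 2 == 0, nums))  # filter(): [2, 4]
squares = list(map(lambda n: n * n, nums))          # map(): [1, 4, 9, 16, 25]
total   = reduce(lambda a, b: a + b, nums)          # functools.reduce(): 15

squares_lc  = [n * n for n in nums]   # list comprehension
squares_gen = (n * n for n in nums)   # generator expression (lazily evaluated)

# itertools: take the first five squares from an infinite sequence.
first_five = list(islice((n * n for n in count(1)), 5))

print(evens, squares, total, squares_lc, list(squares_gen), first_five)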
The language's core philosophy is summarized in the document The Zen of Python (PEP 20),
which includes aphorisms such as:
1) Beautiful is better than ugly.
2) Explicit is better than implicit.
3) Simple is better than complex.
4) Readability counts.
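The full set of aphorisms can be printed from any Python interpreter:

# Displays the complete Zen of Python (PEP 20).
import this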
Rather than having all of its functionality built into its core, Python was designed to be highly
extensible. This compact modularity has made it particularly popular as a means of adding
programmable interfaces to existing applications. Van Rossum's vision of a small core language
with a large standard library and easily extensible interpreter stemmed from his frustrations
with ABC, which espoused the opposite approach.
While offering choice in coding methodology, the Python philosophy rejects exuberant syntax
(such as that of Perl) in favor of a simpler, less-cluttered grammar. As Alex Martelli put it: "To
describe something as 'clever' is not considered a compliment in the Python culture." Python's
philosophy rejects the Perl "there is more than one way to do it" approach to language design in
favor of "there should be one—and preferably only one—obvious way to do it".
[Figure: A support vector machine is a classifier that divides its input space into two regions, separated by a linear boundary; in the figure it has learned to distinguish black and white circles.]
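A minimal sketch of such a classifier with scikit-learn (synthetic toy data, not the data set used in this project):

from sklearn import svm

# Toy 2-D points: class 0 clustered near the origin, class 1 clustered near (1, 1).
X = [[0.0, 0.0], [0.2, 0.3], [0.1, 0.4],
     [1.0, 1.0], [0.9, 0.8], [1.1, 0.9]]
y = [0, 0, 0, 1, 1, 1]

# A linear kernel learns a single linear boundary between the two regions.
clf = svm.SVC(kernel="linear")
clf.fit(X, y)

print(clf.predict([[0.1, 0.2], [1.0, 0.9]]))  # expected: [0 1]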
Among other categories of machine learning problems, learning to learn learns its own inductive bias based on previous experience. Developmental learning, elaborated for robot learning, generates its own sequences (also called a curriculum) of learning situations to cumulatively acquire repertoires of novel skills through autonomous self-exploration and social interaction with human teachers, using guidance mechanisms such as active learning, maturation, motor synergies, and imitation.
A core objective of a learner is to generalize from its experience. Generalization in this context is
the ability of a learning machine to perform accurately on new, unseen examples/tasks after having
experienced a learning data set. The training examples come from some generally unknown
probability distribution (considered representative of the space of occurrences) and the learner has
to build a general model about this space that enables it to produce sufficiently accurate predictions
in new cases.
The computational analysis of machine learning algorithms and their performance is a branch of
theoretical computer science known as computational learning theory. Because training sets are
finite and the future is uncertain, learning theory usually does not yield guarantees of the
performance of algorithms. Instead, probabilistic bounds on the performance are quite common.
The bias–variance decomposition is one way to quantify generalization error.
For the best performance in the context of generalization, the complexity of the hypothesis should match the complexity of the function underlying the data. If the hypothesis is less complex than the function, then the model has underfit the data. If the complexity of the model is increased in response, then the training error decreases. But if the hypothesis is too complex, then the model is subject to overfitting and generalization will be poorer.
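A minimal sketch of this effect, fitting polynomials of increasing degree to noisy synthetic data (illustrative only):

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)  # noisy samples

x_new = np.linspace(0, 1, 100)     # unseen inputs
y_new = np.sin(2 * np.pi * x_new)  # true underlying function

for degree in (1, 4, 12):
    coeffs = np.polyfit(x, y, degree)          # hypothesis complexity = degree
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_mse  = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    # Degree 1 underfits, degree 4 is about right, degree 12 overfits:
    # training error keeps falling while error on unseen inputs rises again.
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, unseen MSE {test_mse:.3f}")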
In addition to performance bounds, computational learning theorists study the time complexity and
feasibility of learning. In computational learning theory, a computation is considered feasible if it
can be done in polynomial time. There are two kinds of time complexity results. Positive results
show that a certain class of functions can be learned in polynomial time. Negative results show that
certain classes cannot be learned in polynomial time.
Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field
changed its goal from achieving artificial intelligence to tackling solvable problems of a practical
nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward
methods and models borrowed from statistics and probability theory.[14] It also benefited from the
increasing availability of digitized information, and the ability to distribute it via the Internet.
Machine learning also has intimate ties to optimization: many learning problems are formulated as
minimization of some loss function on a training set of examples. Loss functions express the
discrepancy between the predictions of the model being trained and the actual problem instances
(for example, in classification, one wants to assign a label to instances, and models are trained to
correctly predict the pre-assigned labels of a set of examples). The difference between the two
fields arises from the goal of generalization: while optimization algorithms can minimize the loss
on a training set, machine learning is concerned with minimizing the loss on unseen samples.
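A small sketch of this distinction, measuring the same loss on the training set (which the optimizer minimizes) and on held-out samples (which machine learning actually cares about); the data here is synthetic and illustrative:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Training loss is what fitting minimizes; held-out loss estimates
# the loss on unseen samples.
print("train log-loss:", log_loss(y_tr, model.predict_proba(X_tr)))
print("test  log-loss:", log_loss(y_te, model.predict_proba(X_te)))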
2) Data set is linked and its details are shown.
4) Data set is arranged in order.
5) Define the relations between the various entities in the data set required for disease prediction.
6) Import the sklearn library for logistic regression. The data set is divided into two parts in the ratio 80:20, 80% for training and 20% for testing (a minimal sketch follows this list).
7) Testing.
8) Accuracy check.
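A minimal sketch of steps 6 to 8 (the file name "heart.csv" and the label column "target" are assumptions for illustration; the report's actual data set and code are not reproduced here):

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed file and column names, for illustration only.
df = pd.read_csv("heart.csv")
X = df.drop(columns=["target"])
y = df["target"]

# Step 6: split the data 80:20 into training and testing parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)  # train on the 80% split

# Steps 7 and 8: test on the held-out 20% and check accuracy.
predictions = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))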
4.3) Testing
Software testing is an investigation conducted to provide stakeholders with information about the
quality of the product or service under test. Software testing can also provide an objective,
independent view of the software to allow the business to appreciate and understand the risks of
software implementation. Test techniques include the process of executing a program or
application with the intent of finding software bugs (errors or other defects).
Software testing involves the execution of a software component or system component to evaluate
one or more properties of interest. In general, these properties indicate the extent to which the
component or system under test:
3) Performs its functions within an acceptable time.
4) Is sufficiently usable.
As the number of possible tests for even simple software components is practically infinite, all
software testing uses some strategy to select tests that are feasible for the available time and
resources. As a result, software testing typically (but not exclusively) attempts to execute a
program or application with the intent of finding software bugs (errors or other defects). Testing is an iterative process: when one bug is fixed, it can illuminate other, deeper bugs, or can even create new ones.
Software testing can provide objective, independent information about the quality of software and
risk of its failure to users and/or sponsors.
Software testing can be conducted as soon as executable software (even if partially complete)
exists.
The overall approach to software development often determines when and how testing is
conducted.
For example, in a phased process, most testing occurs after system requirements have been defined
and then implemented in testable programs. In contrast, under an Agile approach, requirements,
programming, and testing are often done concurrently.
Data-driven testing (DDT) is a term used in the testing of computer software to describe testing
done using a table of conditions directly as test inputs and verifiable outputs as well as the process
where test environment settings and control are not hard-coded. In the simplest form the tester
supplies the inputs from a row in the table and expects the outputs which occur in the same row.
The table typically contains values which correspond to boundary or partition input spaces. In the
control methodology, test configuration is "read" from a database.
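A minimal sketch of this idea in Python; the function under test and the table values are hypothetical:

import csv
import io

# Hypothetical function under test: classify a risk score into a label.
def classify_risk(score):
    return "high" if score >= 0.5 else "low"

# The test table: each row holds an input and its expected output.
# In practice the rows would come from a CSV file or a database.
TEST_TABLE = io.StringIO("""score,expected
0.0,low
0.49,low
0.5,high
1.0,high
""")

for row in csv.DictReader(TEST_TABLE):
    actual = classify_risk(float(row["score"]))
    status = "PASS" if actual == row["expected"] else "FAIL"
    print(f"{status}: input={row['score']} expected={row['expected']} actual={actual}")

The boundary values 0.49 and 0.5 in the table correspond to the boundary input spaces mentioned above.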
Data-driven testing is the creation of test scripts to run together with their related data sets in a framework. The framework provides re-usable test logic to reduce maintenance and improve test coverage. Input and result (test criteria) data values can be stored in one or more central data sources or databases; the actual format and organization can be implementation specific. The data comprises variables used for both input values and output verification values. In advanced (mature) automation environments data can be harvested from a running system using a purpose-built custom tool or sniffer; the DDT framework then performs playback of the harvested data, producing a powerful automated regression testing tool.
Navigation through the program, reading of the data sources, and logging of test status and
information are all coded in the test script.
A test case is typically documented with the following fields:
1. Test Case ID
2. Test Scenario
3. Test Case Description
4. Test Steps
5. Prerequisite
6. Test Data
7. Expected Result
8. Test Parameters
9. Actual Result
10. Environment
11. Information Comments
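For illustration, a hypothetical test case for the prediction system documented with these fields might read (all values below are made up, not taken from the project):
Test Case ID: TC-01
Test Scenario: Heart disease prediction for a valid record
Test Case Description: Verify that the trained model returns a prediction for a complete patient record
Test Steps: 1. Load the trained model. 2. Submit one patient record. 3. Read the predicted label.
Prerequisite: Model trained on the 80% training split
Test Data: One complete record from the 20% test split
Expected Result: A label of 0 (no disease) or 1 (disease) is returned
Test Parameters: Default model parameters
Actual Result: A label is returned as expected
Environment: Machine meeting the minimum hardware requirements above, with Python installed
Information Comments: Illustrative example only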
Exhaustive testing is not possible due to the large number of data combinations and the large number of possible paths in the software. Scenario testing makes sure that the end-to-end functionality of the application under test is working as expected, and checks that all business flows work as expected. In scenario testing the tester needs to put himself or herself in the end users' shoes and perform actions the way end users would use the application under test. The preparation of scenarios is the most important part of scenario testing; to prepare scenarios, the tester needs to consult or take help from the client, stakeholders or developers.
[Chapter-5] Conclusion and Future Scope
5.1) CONCLUSION
Heart disease detection can be of great help to the whole of society. By using one of the above models, a framework can be developed that can determine heart disease by predicting it from various features. The heart disease determined can further be used to predict the longevity of an individual using one of the machine learning algorithms. These results can be used by physicians for further diagnosing and treating the patient in case of a heart disorder.
The goals that are achieved by the software are:
1) Optimum utilization of resources.
2) Efficient management of records.
3) Simplification of operations.
4) Less processing time to get the required information.
5) User-friendly.
6) Portable and flexible for further enhancement.
7) The user can manage his account and his order transactions.
8) A store manager can give various offers, whichever fit best.
9) Users can redeem these offers and earn benefits.
10) Users can order without going around looking for products.
As no application is perfect, every application needs amendments with the passage of time, so there is a real need for updates to the application. Certain modules and features to be added are:
1) A retailer interface wherein, whenever the quantity of any product falls below 10% of a given amount, a message shall be sent to the retailers asking them for their prices; the one offering the lowest cost price can be contacted.
2) Better user interface designs will be introduced, which will enhance the performance of the application.
3) A common application for all the stores can be developed in which each store is recognized by a unique code.
4) A generalized version can be formed in which services will not be limited to a single filling station.
REFERENCES
1) https://adc.bmj.com/content/81/2/172.short
2) https://rpubs.com/ssshinde/Heart_disease_prediction
3) https://www.kaggle.com/amanajmera1/framingham-heart-study-dataset
4) https://www.sciencepubco.com/index.php/ijet/article/view/10557
5) https://ieeexplore.ieee.org/abstract/document/938240/
6) https://www.sciencedirect.com/science/article/pii/089561119500005B
7) https://www.sabudh.org
8) https://www.pucho.com
9) https://www.udemy.com