PROJECT REPORT
ON
Research Methodology For Commerce Lab
BATCH: 2022-2025
Project Supervisor
Name: Dr. Pramod Kumar Nayak.
Signature:
ACKNOWLEDGEMENT
They have provided me with valuable guidance, sustained efforts, and a friendly approach. It
would have been difficult to achieve the results in such a short period without their help.
I deem it my duty to record my gratitude towards the Project Supervisor, Dr. Pramod Kumar
Nayak, who devoted his precious time to interacting with me, guiding me, and giving me the
right approach to accomplish the task, and who also helped me to enhance my knowledge and
understanding of the project.
Signature:
Name of Student: Rahul Singh Rawat
Enroll. No:01519588822
Course: B.com(H)
Semester: 3rd Sem.
DECLARATION
I hereby declare that the following documented project report titled “Research Methodology
Lab Project” is an original and authentic work done by me for the partial fulfillment of the
BACHELOR OF COMMERCE (HONOURS) degree program.
I hereby certify that all the endeavours put into the fulfillment of the task are genuine and
original to the best of my knowledge, and I have not submitted it earlier elsewhere.
Signature:
Course: B.COM(H)
Classification of Research
• On the basis of purpose: Descriptive, Exploratory, Correlational
• On the basis of application: Pure/Basic/Fundamental Research, Applied/Action
• On the basis of research method: Conceptual, Empirical
• On the basis of measurement: Quantitative, Qualitative
• Other basis: Time period/Era of time, Orientation, Place/Environment
Types of Data
• Qualitative and Quantitative
• Primary Data and Secondary Data
A. Primary Data: Primary data collection means obtaining information from authentic, first-hand
sources. Primary data-collection techniques help researchers or service providers acquire
precise and current information on study participants. These techniques entail contacting a
specific population and gathering information from them via surveys, interviews,
experiments, observations, and other means. There are two types of primary data-gathering
methods: quantitative methods and qualitative methods.
• Quantitative Methods: Statistical methods are typically used in quantitative
approaches for market research and demand forecasting. These methods anticipate
demand by using previous data. These fundamental data collecting techniques are
typically employed in the production of long-term projections. Techniques for statistical
analysis have very little subjectivity, making them extremely dependable.
a) Time Series Analysis: A time series is a sequential arrangement of a variable's
values at regular intervals; the long-run pattern in those values is referred to as a
trend. An organisation can forecast the demand for its goods and services for the
anticipated period by using these patterns.
b) Smoothing Techniques: Smoothing techniques can be applied when there are no
discernible trends in the time series. They remove random fluctuations from the
historical demand, which helps to determine the underlying patterns and levels of
demand used to forecast future demand. The simple moving average method and the
weighted moving average method are the most commonly used smoothing
techniques.
c) Barometric Method: This technique, sometimes referred to as the leading
indicators approach, is used by researchers to forecast future trends by analysing
present developments. Past events serve as leading indicators when they are used
to forecast future events.
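The two smoothing methods named above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the monthly demand figures, and the weights are assumptions made up for the example, not values from this project.

```python
def simple_moving_average(demand, window):
    """Forecast the next period as the plain average of the last `window` values."""
    if window <= 0 or window > len(demand):
        raise ValueError("window must be between 1 and the number of observations")
    return sum(demand[-window:]) / window


def weighted_moving_average(demand, weights):
    """Forecast the next period as a weighted average of the most recent values.

    `weights` is ordered oldest to newest, so the last weight applies to the
    most recent observation.
    """
    if not weights or len(weights) > len(demand):
        raise ValueError("need between 1 and len(demand) weights")
    recent = demand[-len(weights):]
    return sum(w * d for w, d in zip(weights, recent)) / sum(weights)


# Hypothetical monthly demand figures (units sold), oldest first.
monthly_demand = [120, 130, 125, 140, 135, 150]

print(round(simple_moving_average(monthly_demand, window=3), 2))
print(round(weighted_moving_average(monthly_demand, weights=[1, 2, 3]), 2))
```

Giving the most recent month the largest weight makes the weighted forecast react faster to recent changes than the simple average does.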
• Qualitative Methods: Methods for gathering qualitative data are particularly helpful
when historical data is unavailable or when no mathematical computations or
numbers are required. Qualitative research deals directly with words, sounds, feelings,
emotions, colours, and other non-quantifiable aspects. These methods rely on a
variety of factors, such as judgment, intuition, emotion, hypothesis, and experience.
Quantitative approaches, by contrast, frequently fail to reach disadvantaged communities,
take a long time to collect data, and do not reveal the motivations underlying participants'
answers. Therefore, combining quantitative and qualitative methodologies is the ideal approach.
a) Surveys: Data from the target audience is gathered through surveys, which also
provide insights into their choices, thoughts, and feedback on the items and services
offered. There is usually a large selection of question types available in survey
software. To save time and work, you may alternatively utilize a pre-made survey
template. Online questionnaires can have their themes, logos, and other elements
changed to match the company's branding. They may be shared via a variety of
platforms, including social media, email, websites, offline apps, and QR codes. You
may choose the channel based on your audience's demographics and sources.
Survey software may provide a variety of reports and run analytics algorithms to
find hidden insights once the data has been gathered. You may get information on
response rate, completion rate, demographic filters, export and sharing options, and
other topics on a survey dashboard. Adding third-party app integration to your
survey builder can help you get the most out of your online real-time data-gathering
efforts. The effectiveness of reporting and analytics together is essential to practical
business intelligence, as reporting disseminates the insights that are gleaned from
analytics to stakeholders.
b) Polls: A poll consists of a single multiple-choice question. When you need to
quickly gauge the opinions of the audience, you may use polls. People are more
likely to respond to them since they are brief. Online polls may be integrated into a
variety of platforms, much like surveys. After responding to the question, the
respondents may also see how their response compares to that of other respondents.
c) Interviews: Using this approach, the interviewer speaks with the respondents over
the phone or in person. In in-person interviews, the interviewer poses a series of
questions to the subject and records the latter's answers. The interviewer may do a
phone interview if it is not possible to meet with the subject. Only a small number
of respondents are appropriate for this type of data collection. If there are several
participants, it would take too much time and effort to perform the same procedure.
B. Secondary Data: Retrieving data that is already available from sources other than the
intended audience is known as secondary data collection. When using secondary data, the
researcher consults existing data sources rather than "collecting" data.
The two main categories of secondary data sources are published and unpublished data. As
the names imply, unpublished data is unreleased private information that researchers or
other people have documented, whereas published data has been publicised and made
available for use by the public or commercial sector. When selecting published data sources,
Drow strongly advises taking into account the text's level of discussion and depth of
analysis, the influence it has had on the development of the area of study, the date of
publication, the author's qualifications, and the reliability of the source. The researcher
can obtain data from sources both internal and external to the organization.
❖ Internal sources of secondary data:
Organization’s health and safety records
Mission and vision statements
Financial Statements
Magazines
Sales Report
CRM Software
Executive summaries
❖ External sources of secondary data:
Government reports
Press releases
Business journals
Libraries
Internet
Both quantitative and qualitative methods may be used in secondary data collection. Since
secondary data is more readily available than primary data, it requires less effort and
money to obtain. Nevertheless, it is often difficult to confirm the validity of data collected
via secondary data-gathering techniques. Regardless of the approach you use for gathering
data, direct engagement with decision-makers is essential to ensure they comprehend the
findings and commit to acting on them. Researchers consequently need to be extremely
careful about how the information is analyzed and presented. Keep in mind that the data
collection technique used has a significant impact on the usefulness of the data.
1.5 Variables
The word "variable" is commonly used in research projects. When planning research that
involves quantitative analysis, it is important to define and identify the variables. In any
research, variables are of greater interest than constants. Therefore, it is essential that those
new to research understand this word and its associated notions. In layman's terms, a variable
is something that can change or take several values. "A variable is something that varies, as
the name suggests." It might be something like body temperature, income, anxiety level,
height, or weight. Each of these takes a value along a continuum and varies from person to
person. A variable may be social, physical, or demographic and include things like language,
cuisine, fashion, money, occupation, humidity, temperature, and religion. Certain variables,
like gender, blood group, and order of birth, are specific and well defined, while others may
be far more nebulous and abstract. A variable is a property with a range of values. It is also a
logical grouping of attributes. Attributes are the characteristics or features that describe a
thing. For instance, if gender is the variable, male and female are the attributes. If residence
is the variable, urban, semi-urban, and rural are the attributes. Thus, the attributes here
describe a person's place of residence.
Understanding the relationships between different study factors is important for researchers.
To provide a precise description of the relationship between the variables, it is crucial to define
the variables. The number of variables that may be assessed is unlimited, but the more
variables, the more intricate the research and statistical analysis will be. Additionally, the
amount of time needed for data collection increases with the length of the list of variables.
Through operationalization, variables may be characterised in terms of quantifiable
components. Operationalization transforms complex ideas into clear notions that can then be
scientifically measured. To enable measurement and quantification, each variable must be
defined. In other words, the variable must be made operational. Different types of variables
influence a study in different ways, viz. independent and dependent variables, active and
attribute variables, continuous, discrete, and categorical variables, extraneous variables, and
demographic variables.
▪ Independent & dependent variable: The antecedent is the independent variable, while
the consequence is the dependent variable. If the independent variable is an active
variable, researchers adjust its values to examine its impact on another variable. For
example, in a study of whether anxiety affects responsiveness to a painkiller, we vary the
degree of anxiety to determine whether it changes responsiveness to the painkiller. The
active independent variable is the degree of anxiety.
The variable that the independent variable affects is known as the dependent variable. In
this example, the dependent variable is responsiveness to the painkiller. The dependent
variable depends on the independent variable.
▪ Active and attribute variables: Variables are frequently traits of study participants,
including their age, weight, or attitudes about their health. "Active variables are those that
the researcher creates; attribute variables are those that cannot be manipulated."
Independent variables can also be active variables. Consider, for instance, a study of how
well a communication board meets the needs of patients who are intubated. Since the
communication board is designed by the researcher and may be changed to meet study
requirements or patient needs, it is considered an "active independent variable." It is
the independent variable, or the cause.
An attribute variable is one that is not changed while the study is being conducted. Though
it has limits, it can also be the independent variable. A few examples of attribute
variables are blood type, age, gender, and eye colour. A researcher may, for example, look
into how age affects weight. Although we cannot alter a person's age, researchers may
examine individuals of varying ages and weights. It is possible for an attribute variable in
one study to be an active variable in another.
▪ Continuous, discrete, and categorical variables: Variables can at times take a broad
range of values on a continuum. "A continuous variable has an infinite range of values it
can take on between two points." For the continuous variable weight, the range of values
between 1 and 2 kg is infinite and includes 1.005, 1.7, 1.33333, and so on. In practice,
each measurement falls somewhere within the range, so each person receives a score
between 1 and 2. Conversely, a discrete variable, which represents separate, countable
quantities, has a finite number of values between any two points. Categorical variables
belong to nominal measurement. In nominal measurement, the set of objects being
measured is divided into two or more subsets. "They have a straightforward
requirement that each member of the subset be given the same name (nominal) and
number and that each member be treated equally." That is to say, we are unable to quantify
or even rank order the categories; instead, we can only check whether particular things fall
into specific, discrete categories. For instance, there are only two values for the variable
gender (male and female). Categorical variables are those that have a limited range of
discrete, non-quantitative values.
▪ Extraneous variables: Occasionally, when research is over, we discover that the actual
outcome does not match our expectations. Even after every precaution has been taken,
the result is unanticipated. The reason for this is extraneous variables. Extraneous
variables are those that have not been given enough consideration in the study but may
have an impact on research findings. All studies contain extraneous variables, which can
affect how study variables are measured and how they relate to one another.
"Confounding variables are extraneous variables that are identified before the study is
started but cannot be controlled, or that are identified while the study is underway." Some
external factors may influence the relationship between the research variables even though
the researcher cannot observe them. These variables are called intervening
variables. For example, a girl's knowledge and practices help in maintaining menstrual
hygiene. Here, motivation, mother and friends, and mass media are some intervening
variables that may also help in maintaining menstrual hygiene. Thus, if these factors
are not controlled, it would be impossible to know what the underlying cause is.
▪ Demographic variables: "The traits or features of the subjects that are gathered to
characterise the sample are known as demographic variables." Another name for them is
sample characteristics. This implies that these factors characterise the research sample
and establish whether or not the samples are representative of the target population.
Researchers can elucidate the links between demographic characteristics and dependent
variables, despite the fact that demographic variables cannot be controlled. A few typical
demographic factors are age, gender, marital status, employment, and income, among
others.
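Operationalization, described earlier in this section, can be illustrated with a short sketch that turns the abstract idea of "anxiety" into a measurable score and then into a categorical variable. The four-item questionnaire, the 1 to 5 rating scale, and the cut-off points are all hypothetical choices made for this example.

```python
def anxiety_score(item_ratings):
    """Sum the ratings (1 = never ... 5 = always) given to each anxiety item."""
    if not item_ratings or any(r not in (1, 2, 3, 4, 5) for r in item_ratings):
        raise ValueError("every rating must be a whole number from 1 to 5")
    return sum(item_ratings)


def anxiety_level(score, n_items):
    """Convert the total score into a category using hypothetical cut-offs."""
    average = score / n_items
    if average < 2.5:
        return "low"
    if average < 3.5:
        return "moderate"
    return "high"


# Hypothetical ratings from one respondent on a four-item scale.
ratings = [4, 5, 4, 5]
total = anxiety_score(ratings)
print(total, anxiety_level(total, len(ratings)))
```

The total score behaves as a discrete quantitative variable, while the derived level is a categorical variable, so the same concept can be analysed either way depending on the research question.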
1.6 Hypothesis:
A hypothesis is a prediction of the data, results, and conclusion of your study. It
originates from intuition and curiosity. When you construct a hypothesis, you are
effectively making an educated guess based on existing data and assumptions, which is then
confirmed or refuted via the scientific method. Research is done in order to observe a certain
phenomenon. Thus, a hypothesis defines the phenomenon in question. It does this by using two
variables: an independent variable and a dependent variable. The cause of the observation is
the independent variable, and the result of the cause is the dependent variable. This is best
shown by the statement "Mixing red and blue forms purple." Since you are deliberately
combining the two colours, mixing red and blue is the independent variable in this hypothesis.
Since the formation of purple depends on the independent variable, it is the dependent
variable. Hypotheses fall into two main categories, the null hypothesis and the alternative
hypothesis:
➢ Null Hypothesis: A null hypothesis, represented by the symbol H0, states that there is no
relationship between two variables. It is a negative statement such as "Athletes' on-field
performance is not affected by attending physiotherapy sessions." Here, the author
asserts that physiotherapy sessions have no bearing on players' on-field results. If
any effect is observed, it is only coincidental.
➢ Alternative Hypothesis: An alternative hypothesis, denoted as H1 or Ha, is thought of
as the antithesis of a null hypothesis. It states clearly that a relationship exists between
the dependent and independent variables. "Athletes' on-field performance is improved by
attending physiotherapy sessions" is an example of an alternative hypothesis, as is
"At 100°C, water evaporates." The alternative hypothesis further branches into
directional and non-directional hypotheses:
❖ Directional Hypothesis: A directional hypothesis is one that predicts the direction of
the outcome, either positive or negative. In H1 it is expressed with the '<' or '>' symbol.
❖ Non-Directional Hypothesis: A non-directional hypothesis asserts only that the
dependent variable is affected. It does not say whether the effect is positive or
negative. In H1 it is expressed with the '≠' symbol.
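To make the difference between directional and non-directional hypotheses concrete, the sketch below runs an exact permutation test on the physiotherapy example. The permutation test is chosen here only because it needs no statistical tables; the report itself does not prescribe a particular test, and the performance scores and function name are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_test(treated, control):
    """Exact permutation test on the difference in group means.

    Returns the observed difference, the directional p-value
    (H1: treated > control) and the non-directional p-value
    (H1: treated != control).
    """
    pooled = list(treated) + list(control)
    n_treated, n_control = len(treated), len(control)
    observed = mean(treated) - mean(control)
    total = sum(pooled)
    tolerance = 1e-9  # guards against floating-point ties

    splits = as_large = as_extreme = 0
    # Under H0 the group labels are arbitrary, so every way of choosing
    # which scores count as "treated" is equally likely.
    for indices in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in indices)
        diff = group_sum / n_treated - (total - group_sum) / n_control
        splits += 1
        if diff >= observed - tolerance:
            as_large += 1
        if abs(diff) >= abs(observed) - tolerance:
            as_extreme += 1
    return observed, as_large / splits, as_extreme / splits


# Hypothetical on-field performance scores.
with_physio = [78, 85, 82, 90, 88]
without_physio = [70, 75, 80, 72, 77]

diff, p_directional, p_non_directional = permutation_test(with_physio, without_physio)
print(f"difference in means: {diff:.1f}")
print(f"directional p-value (H1: >): {p_directional:.4f}")
print(f"non-directional p-value (H1: !=): {p_non_directional:.4f}")
```

A small p-value is evidence against H0. The non-directional p-value is never smaller than the directional one, because it counts extreme differences in both directions.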
1.7 Questionnaire
Questionnaires are a fundamental and versatile data collection tool in the realm of research
methodology. They play a pivotal role in the collection and analysis of data across various
disciplines, ranging from the social sciences to healthcare, marketing, and beyond. A
questionnaire is essentially a systematic set of questions designed to elicit specific information,
opinions, or feedback from participants. These instruments can be administered in various
formats, including paper-based forms, online surveys, or interviews. The key aspects of
questionnaires, covering their importance, design, administration, and analysis, are outlined
below.
Key Aspects:
➢ Structured Data Collection: Questionnaires offer a structured and systematic
approach to gathering data. They provide researchers with a standardized set of
questions to pose to participants, ensuring that every individual is exposed to the same
inquiries, often with predefined response options. This structured nature makes
questionnaires ideal for research that requires consistent data collection.
➢ Quantitative and Qualitative Applications: Questionnaires are versatile tools used in
both quantitative and qualitative research. In quantitative studies, researchers typically
employ closed-ended questions that yield numerical data for statistical analysis. In
contrast, qualitative research may incorporate open-ended questions to elicit more
nuanced responses, facilitating in-depth exploration of complex phenomena. As such,
questionnaires can be tailored to suit the specific research objectives and
methodologies.
➢ Standardization and Consistency: One of the key advantages of questionnaires is the
level of standardization they offer. Every participant is presented with the same set of
questions, thereby ensuring that the data collection process is uniform. This
standardization enhances the reliability and comparability of the data, making it suitable
for research conducted on a large scale, across diverse populations, or in multi-site
studies. Researchers carefully design questionnaires to align with their research goals
and to obtain data that is relevant and insightful.
➢ Variety of Administration Methods: The administration of questionnaires can take
various forms. Traditional paper-based questionnaires are still widely used, particularly
in face-to-face surveys, mail surveys, or surveys distributed at specific events or
locations. With the advent of digital technology, online questionnaires have gained
immense popularity. These can be administered via email, websites, or survey
platforms, making it easier to reach a broader audience and collect data efficiently.
Furthermore, telephone interviews and in-person interviews can also be considered
questionnaire administration methods, as they involve asking participants a set of
structured questions.
➢ Data Analysis and Interpretation: Once data is collected through questionnaires, the
process of analysis and interpretation becomes crucial. Quantitative data obtained from
closed-ended questions can be subjected to statistical analyses, such as descriptive
statistics, correlation, regression, or inferential tests, depending on the research
objectives. Qualitative data gathered from open-ended questions can undergo content
analysis to identify patterns, themes, and narratives. The results of the analysis provide
valuable insights and contribute to answering research questions, supporting
hypotheses, or forming a basis for further exploration in the given field.
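As a minimal sketch of the descriptive analysis mentioned above, the snippet below summarises the answers to one closed-ended question. The 1 to 5 agreement scale, the sample answers, and the function name are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean, median


def summarise_closed_ended(responses, scale=(1, 2, 3, 4, 5)):
    """Return the count, mean, median and percentage share of each option."""
    if not responses:
        raise ValueError("no responses to summarise")
    counts = Counter(responses)
    n = len(responses)
    return {
        "n": n,
        "mean": mean(responses),
        "median": median(responses),
        # Options nobody selected still appear, with a share of 0.
        "percent": {option: 100 * counts[option] / n for option in scale},
    }


# Hypothetical answers to "The online survey was easy to complete"
# (1 = strongly disagree ... 5 = strongly agree).
answers = [4, 5, 3, 4, 2, 5, 4]
summary = summarise_closed_ended(answers)
print(summary["n"], round(summary["mean"], 2), summary["median"])
for option, share in summary["percent"].items():
    print(option, f"{share:.1f}%")
```

Open-ended answers cannot be summarised this way until they have been assigned codes.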
1.8 Coding
When a researcher has completed collecting information or data, this information is ready to
be processed and analyzed. Quantitative data is information that is measurable and focuses on
numerical values, unlike qualitative data which is more descriptive. During the data processing
step, the collected data is transformed into a form that is appropriate to manipulate and analyze.
The process in which raw data is transformed into a standardized form suitable for machine
processing and analysis is called coding. In other words, coding is the act of assigning
numerical values to a set of data in order to make the analysis simpler. Coding can be used to
quantify both manifest content i.e. the tangible or concrete surface content (data), and latent
content i.e. the underlying meaning behind this information. The difference between manifest
content and latent content is very important when it comes to survey research. It is advisable
to do a pilot or a pretest of the instrument of data collection as it would help uncover the
potential problems with the study and accordingly help make changes in the tool. It will also
give the researcher an idea of how the data will look. On the basis of this, the researcher can
work out the layout of the codebook keeping in mind the responses collected for each variable,
guiding him to provide enough variables to capture all the richness, complexity, and variety of
data that has been collected. Depending on what shape the data comes in, the researcher will
have to decide how to code this information, with the help of one, two, or multiple variables.
❖ Data Coding: Though the preparation of a codebook begins before actual data
collection, once the instrument has been designed and pretested, data coding as a step in
the research process takes place after data collection is complete, alongside data entry.
It is important to keep the following points in mind when coding data:
a) Identification variables: Unique identification of the respondent/ questionnaire/
response sheet is extremely important as it helps the researcher verify and check data.
The identification variable is a unique number corresponding to every respondent which
has to be accommodated in a special field at the beginning of each record. For example,
001, 002, etc. may be used as identification variables.
b) Code categories: Code categories should be mutually exclusive, exhaustive, and
precisely defined. Each interview response should fit into one and only one category.
Ambiguity will cause coding difficulties and problems with the interpretation of the data
also. An example of this would be while recording literacy levels of youth in a slum
community, the coding should include not just those who have gone through the formal
system of education, but also those who have participated in non-formal education
programs and are therefore literate.
c) Preserving original information: Data once coded is retained and becomes final; hence
it is important to code as much detail as possible by recording the original data rather
than collapsing or bracketing the information. With original or detailed data, the research
analyst can explore other meaningful relationships between variables beyond those
initially selected, without being restricted by the way the data was entered or coded.
For example, when recording the occupation of women in a slum community being
surveyed, each type of home-based small enterprise (small-scale businesses such as
making snacks, knick-knacks, etc.) could be given a separate code.
d) Closed-ended questions: Responses to survey questions that are precoded in the
questionnaire should retain this coding scheme in the machine-readable data to avoid
errors and confusion. For example, in the above-mentioned study, if the women
respondents were pre-divided by age and marital status, this precoding should be
retained.
e) Open-ended questions: For open-ended items, investigators can either use a
predetermined coding scheme or review the initial survey responses to construct a
coding scheme based on major categories that emerge. Any coding scheme and its
derivation should be reported in study documentation. Increasingly, investigators submit
the full verbatim text of responses to open-ended questions to archives so that users can
code these responses themselves. However, such responses may contain sensitive
information and may involve the risk of identification; they must therefore be reviewed
prior to disclosure.
f) Check-coding: Check-coding provides an important means of quality control in the
coding process. In this process some cases are repeated with an independent coder in
order to verify the coding assigned and rule out discrepancies and ambiguities if any.
g) Series of responses: If a series of responses requires more than one field, organizing
the responses into meaningful major classifications becomes helpful. Responses within
each major category are assigned the same first digit. Secondary digits can distinguish
specific responses within the major categories. Such a coding scheme permits analysis
of the data using broad groupings or more detailed categories.
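Several of the points above (identification variables, preserving the original response, first-digit grouping, and check-coding) can be sketched together. The codebook entries, the code numbers, and the function names below are hypothetical and only illustrate the idea.

```python
# Hypothetical codebook: the first digit is the major category,
# the second digit distinguishes specific responses within it.
OCCUPATION_CODES = {
    "homemaker": 10,
    "home-based snack making": 21,
    "home-based knick-knack making": 22,
    "domestic worker": 31,
    "factory worker": 32,
}
UNCODED = 99  # responses that do not fit any defined category


def code_responses(raw_responses):
    """Attach an ID, the original text, a detailed code and a broad group."""
    records = []
    for number, answer in enumerate(raw_responses, start=1):
        code = OCCUPATION_CODES.get(answer.strip().lower(), UNCODED)
        records.append({
            "id": f"{number:03d}",         # identification variable: 001, 002, ...
            "occupation_raw": answer,      # original information is preserved
            "occupation_code": code,       # detailed category
            "occupation_group": code // 10,  # broad grouping from the first digit
        })
    return records


def agreement_rate(codes_a, codes_b):
    """Check-coding: share of cases where two independent coders agree."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


for record in code_responses(["Homemaker", " Home-based snack making ", "Tailor"]):
    print(record)
print(agreement_rate([10, 21, 99, 31], [10, 22, 99, 31]))
```

Responses that land in the uncoded category are a signal that the codebook is not yet exhaustive and may need a new category.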
2 RESEARCH METHODOLOGY LAB
2.1 Hardware Requirements
• Laptop or desktop computer (monitor, CPU, keyboard, mouse)
• Processor: A modern multi-core CPU, such as Intel Core i3 or equivalent.
• Memory (RAM): Minimum 2GB, recommended 4-8GB for larger datasets.
• Storage: 500MB HDD/SSD space for RStudio installation.
• Graphics: Integrated graphics are sufficient for data analysis and visualization.
• Internet Connection: A stable internet connection is useful for accessing online
resources, downloading packages, and collaborating on research projects.
• Operating System: RStudio is compatible with Windows, macOS, and Linux.
2.3 Software Requirement
Using the R language for research requires several software tools. R is a programming
language and environment for statistical computation and graphics that is robust and
available as free software. The following are the specific software requirements for using
R in research methodology:
• R Programming Environment: The R programming language is a fundamental
software requirement for research methodology involving data analysis and statistical
research. R is an open-source language known for its versatility and robust capabilities
in data manipulation, statistical modeling, and data visualization. Researchers must start
by installing R, which is compatible with Windows, macOS, and Linux operating
systems. This programming environment serves as the core tool for managing and
analyzing research data, allowing researchers to perform a wide range of statistical and
computational tasks.
• Integrated Development Environment (IDE): To enhance productivity and
streamline research workflows, researchers often rely on an Integrated Development
Environment (IDE). Among IDEs, RStudio is the most popular choice. It offers an
intuitive and user-friendly interface for writing, running, and debugging R code.
Additionally, RStudio simplifies package management, project organization, and report
generation, making it a valuable tool for maintaining a structured and efficient research
process. With a multitude of features, RStudio significantly enhances the researcher's
experience by facilitating code development and management within the R environment.
• R Packages and Libraries: The strength of R lies in its extensive collection of packages
and libraries designed to address specific research needs. The selection of relevant
packages and libraries should be based on the specific objectives of the research. For
data manipulation and transformation, 'dplyr' is an indispensable package. It provides a
comprehensive set of functions for selecting, filtering, and aggregating data. 'ggplot2'
is a go-to package for data visualization and graphical representation, offering flexibility
and customization in creating visualizations. Researchers dealing with date and time
data may find 'lubridate' particularly useful for date management and parsing. 'tidyr' aids
in data tidying and reshaping tasks, making data frames more amenable to analysis. The 'tidyverse'
collection bundles these and other packages, creating a cohesive data analysis toolkit
that harmonizes data manipulation, visualization, and modeling. For research projects
involving predictive modeling and machine learning, the 'caret' package is versatile and
valuable, offering a unified framework for model training, testing, and evaluation.
Researchers should be prepared to adapt and expand their package selection as their
research evolves, as specific projects may require additional specialized packages.
• Version Control and Collaboration: Collaborative research projects greatly benefit
from the implementation of version control systems, with Git being the preferred choice.
Git allows researchers to track changes in their code, manage collaborative work, and
ensure the integrity and reproducibility of their research results. By setting up a Git
repository, researchers can easily collaborate with peers, track the evolution of their
work over time, and maintain transparency throughout the research process. Online
platforms such as GitHub, GitLab, and Bitbucket provide hosting services for Git
repositories, making it convenient to share code with collaborators, conduct peer
reviews, and ensure that the project remains well-organized.
• Data Management and Visualization Tools: Researchers often require supplementary
software tools for data management and visualization in addition to the capabilities
offered by R. Excel or Google Sheets can be invaluable for initial data entry, simple data
cleaning, and basic data exploration. These tools provide a user-friendly interface for
quick data manipulation and ad hoc analysis, making them valuable assets in the early
stages of a research project. For more advanced data visualization and the creation of
interactive dashboards, researchers can turn to specialized tools like Tableau or
Microsoft Power BI. These tools allow researchers to present their findings effectively,
create impactful visual representations of their data, and cater to specific reporting
requirements.
When utilizing the R language in research methodology, specific software requirements
encompass the installation of R, the use of an integrated development environment (such
as RStudio), the selection of relevant R packages and libraries, the integration of version
control and collaboration tools (such as Git and online platforms), and the incorporation
of supplementary data management and visualization tools as necessary. These software
components form a comprehensive toolkit that ensures research analysis and reporting
are conducted efficiently, systematically, and in a reproducible manner, ultimately
contributing to the credibility and rigor of the research.
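The software requirements summarized above can be checked from within R itself. The following is a minimal sketch using only base R; the package names in the vector are taken from the text, and the vector should be adjusted to the needs of the project. The install step is left commented out so that nothing is downloaded unintentionally.

```r
# Report the R version in use (base R, always available).
cat("R version:", R.version.string, "\n")

# Packages named in the text; adjust this vector to the project's needs.
needed <- c("dplyr", "tidyr", "ggplot2", "lubridate", "caret")

# requireNamespace() returns TRUE or FALSE without attaching the package.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of the packages that would still need install.packages().
missing_pkgs <- needed[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```

Running this at the start of a project gives a quick record of which components of the toolkit are already in place.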
3 Introduction To Statistical Software ‘R’ For Data Analysis
Statistical software plays a pivotal role in modern data analysis, offering researchers,
analysts, and data scientists powerful tools to make sense of large and complex datasets. Among
the options available, the open-source statistical programming language 'R' stands out as a
versatile tool for data analysis. R, often referred to as the 'lingua franca' of statistics, is
known for its flexibility, its extensibility, and its active community of users and developers.
It was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand,
in the early 1990s and has since become a go-to choice for statistical and data analysis tasks.
One of the defining features of R is its adaptability to diverse data analysis needs. It excels
in tasks such as data manipulation, statistical modeling, hypothesis testing, and data
visualization. Researchers, statisticians, and data analysts use R to draw insights from data
and derive meaningful conclusions.
Another strength of R is its rich ecosystem of packages and libraries. These packages extend
R's functionality, providing specialized tools for tasks such as regression analysis, time
series forecasting, and machine learning. With over 18,000 packages available on the
Comprehensive R Archive Network (CRAN), users can tailor their analyses to the specific
requirements of their projects.
R's data manipulation capabilities are highly regarded, with packages like 'dplyr' and 'tidyr'
offering streamlined solutions for filtering, summarizing, and reshaping datasets. This enables
users to clean and prepare data efficiently for further analysis. For statistical modeling, R
offers a range of packages, including 'stats', 'lme4', and 'survival', which support regression
analysis, mixed-effects models, and survival analysis. R also provides advanced data
visualization tools, with 'ggplot2' being a popular choice for creating customized,
publication-quality plots.
A notable feature of R is its scripting capability. Users can write scripts to automate data
analysis processes, making their work more efficient and reproducible. In addition, R Markdown
allows the creation of dynamic reports that blend code, visualizations, and narrative, making
it easier to communicate and share results.
The open-source nature of R fosters a collaborative community of users and developers. This
environment encourages knowledge-sharing and gives users a steady stream of updates and
improvements. Users can contribute packages, share code, and seek assistance through various
forums and mailing lists. As a result, R remains a continuously evolving platform that keeps
pace with advancements in data analysis and statistics.
R's compatibility with other data analysis tools is a further advantage. It can integrate with
databases, big data platforms, and other software, so that researchers can access and analyze
data from various sources.
R is not without its challenges. As open-source software, it may not have the same level of
commercial support as some proprietary counterparts, and beginners can face a steep learning
curve. However, the community-driven support system and abundant online resources help address
these challenges.
In summary, R is a powerful and adaptable statistical software that has become a primary choice
for data analysis. Its versatility, rich ecosystem of packages, and strong community support
make it a valuable tool for researchers, statisticians, and data analysts seeking to uncover
insights from complex datasets, whether in academic research, business analytics, or data
science projects.
4 Screen Shots And Descriptions
4.1 Installation of R & R Studio (Screenshots):
• First, open Google and search for "download R and RStudio for Windows".
• Click on the first link. The page shows the downloads for R and RStudio on its left-hand
and right-hand sides; download and install both.
• RStudio is now successfully installed on your computer. The RStudio Desktop IDE
interface is shown in the figure below:
• Array: Compared to matrices, arrays can have more than two dimensions. We can use
the array() function to create an array, and the dim parameter to specify the dimensions:
• List: A list in R can contain many different data types inside it. A list is a collection of data
which is ordered and changeable. To create a list, use the list() function:
• Data Frame: A data frame is a tabular data structure in R, used to organize and store
data in a structured manner. It consists of rows and columns, where each row represents
an individual observation or data point, and each column represents a particular variable
or attribute. Data frames are widely used in data analysis and statistical computing due
to their flexibility and ease of manipulation.
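The three data structures described above can be sketched in base R as follows. This is only an illustration: the object names (arr, member, survey) and all values are made up for the example and are not taken from the survey data.

```r
# Array: the values 1 to 24 arranged as 2 rows x 3 columns x 4 layers.
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)        # dimensions of the array
arr[2, 3, 1]    # element in row 2, column 3, layer 1

# List: elements of different types, accessed by name or position.
member <- list(name = "Member A", age = 24L, visits = c(3, 4, 2))
member$age <- 25L          # lists are changeable
member[["visits"]]

# Data frame: one row per respondent, one column per variable.
survey <- data.frame(
  respondent   = 1:3,
  gender       = c("Female", "Male", "Male"),
  satisfaction = c(4, 5, 3)
)
str(survey)                 # column names and types
mean(survey$satisfaction)   # columns can be analyzed directly
```

The data frame is the structure used most often for questionnaire data, because each coded answer becomes a column that can be summarized or tabulated.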
A. Importing Data:
➢ Select the File to Import: Navigate to the File menu and choose "Import
Data" or "Import Dataset."
➢ Choose File Type: Select the file type of the data you want to import,
such as CSV, TXT, or Excel.
➢ Specify File Location: Browse to the location of the file and select it.
➢ Configure Import Options: Depending on the file type, you may need to
configure additional import options, such as delimiter settings for CSV files
or sheet selection for Excel files.
➢ Import Data: Click the "Import" button to import the data into R.
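The same import can also be done from a script, which makes the step reproducible. The sketch below uses base R's read.csv(); the temporary file and its three rows are hypothetical stand-ins for the file that would be selected in the dialog.

```r
# Create a small example CSV in a temporary location (hypothetical data),
# standing in for the file you would browse to in the import dialog.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("respondent,age_group,satisfaction",
             "1,16-21,4",
             "2,22-25,5",
             "3,26-35,3"), csv_path)

# read.csv() assumes a header row and a comma delimiter by default.
gym <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect what was imported.
head(gym)
str(gym)
```

For tab-delimited text files, read.delim() plays the same role; Excel files need an add-on package, and 'readxl' is one commonly used option.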
B. Exporting Data:
➢ Select Data to Export: Highlight the data frame or specific variables you
want to export.
➢ Choose File Type: Navigate to the File menu and choose "Export Data" or
"Export Dataset." Select the desired file type, such as CSV, TXT, or Excel.
➢ Specify File Location: Browse to the location where you want to save the
exported file and enter the desired filename.
➢ Configure Export Options: Depending on the file type, you may need to
configure additional export options, such as delimiter settings for CSV files
or sheet selection for Excel files.
➢ Export Data: Click the "Export" button to export the data from R.
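Menu entries can differ between RStudio versions, so a script-based export is a useful alternative. The sketch below writes a data frame to a CSV file with base R's write.csv(); the data frame, its values, and the file name gym_results.csv are assumptions made for the example.

```r
# A small data frame to export (hypothetical values).
results <- data.frame(
  question   = c("Q3", "Q4"),
  mean_score = c(3.8, 3.1)
)

# Build the output path; tempdir() is used here so the example
# does not write into the working directory.
out_path <- file.path(tempdir(), "gym_results.csv")

# row.names = FALSE avoids an extra column of row numbers in the file.
write.csv(results, out_path, row.names = FALSE)

# Confirm the file exists and read it back as a check.
file.exists(out_path)
check <- read.csv(out_path)
print(check)
```

In a real project, out_path would point to a folder inside the project directory instead of a temporary one.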
➢ Null Hypothesis (H0): There is no significant level of customer satisfaction among gym
members, and no areas for improvement to enhance member retention and loyalty can be identified.
➢ Alternate Hypothesis (H1): There is a significant level of customer satisfaction among gym
members, and areas for improvement to enhance member retention and loyalty can be identified.
7. DATA PREPARATION
7.1 Preparation of Questionnaire:
The questionnaire was prepared by Rahul Singh Rawat, who is conducting the “STUDY
OF CUSTOMER SATISFACTION IN GYM.”
INSTRUCTION: On the following pages there are some questions with their probable
answers given against them. Read them carefully and put a tick in the blank space
given against your answer. Mark only one answer for each question. An illustration is
given below. There is no time limit.
Kindly fill in all the necessary details:
Section 1:
1) Name _________
2) Age Group (in years)
a) 16-21
b) 22-25
c) 26-35
d) Above 35
3) Gender
a) Female
b) Male
c) Other
4) Annual household Income
a) Less than Rs. 2,50,000
b) Rs. 2,50,000-5,00,000
c) Rs. 5,00,001-8,00,000
d) Rs. 8,00,001-10,00,000
e) Above Rs. 10,00,000
5) Qualification
a) Matric
b) 12th Pass
c) Graduate
d) Postgraduate
e) Other
6) Occupation
a) Govt. Job
b) Private Job
c) Student
d) Business
e) Other
Section 2:
1) How long have you been a member of GYM?
a) Less than 3 months
b) 3-6 months
c) 6 months to 1 year
d) More than 1 year
2) How frequently do you visit Gym in a week?
a) Less than 1 time per week
b) 1-2 times per week
c) 3-4 times per week
d) More than 4 times per week
3) How satisfied are you with the overall cleanliness and hygiene of gym facilities?
a) Very dissatisfied
b) Dissatisfied
c) Neutral
d) Satisfied
e) Very Satisfied
4) How would you rate the quality and variety of fitness equipment available at Gym?
a) Poor
b) Fair
c) Good
d) Excellent
5) How would you rate the professionalism and helpfulness of the staff at the gym,
including trainers and Front Desk Personnel?
a) Unprofessional and Unhelpful
b) Somewhat professional and helpful
c) Professional and helpful
d) Very Professional and Helpful
6) How satisfied are you with the value for money you receive as a Gym member?
a) Poor value for money
b) Fair value for money
c) Good value for money
d) Excellent value for money
e) Can’t say anything
7) How likely are you to recommend the gym to a friend or family Member?
a) Not likely at all
b) Somewhat likely
c) Likely
d) Very likely
8) How satisfied are you with the overall customer service experience at the gym,
including communication and problem resolution?
a) Very dissatisfied
b) Somewhat dissatisfied
c) Neutral
d) Satisfied
e) Very satisfied
9) What are the main reasons for choosing a GYM as your fitness facility?
a) Convenient location
b) Reputation and brand name
c) Availability of desired fitness equipment
d) Variety of group fitness classes
e) Qualified trainers
f) Membership price and promotion
g) Recommendation from friend/family
h) Other
10) How satisfied are you with the availability and condition of fitness equipment at the
GYM, including cardio machines, weightlifting equipment, and accessories?
a) Not satisfied
b) Somewhat satisfied
c) Neutral
d) Satisfied
e) Very satisfied
11) How would you rate the effectiveness of the fitness programs and workouts offered at
the gym in helping you achieve your fitness goals?
a) Ineffective
b) Somewhat effective
c) Neutral
d) Effective
e) Very effective
12) How satisfied are you with the availability of parking facilities at the GYM?
a) Not satisfied
b) Somewhat satisfied
c) Neutral
d) Satisfied
e) Very satisfied
13) How satisfied are you with the overall atmosphere and ambiance of the GYM,
including Lighting and music?
a) Very satisfied
b) Satisfied
c) Neutral
d) Dissatisfied
e) Very dissatisfied
14) How would you rate the overall customer service experience at GYM?
a) Very dissatisfied
b) Somewhat satisfied
c) Satisfied
d) Very satisfied
15) How would you rate the quality of group fitness classes at GYM?
a) Excellent
b) Good
c) Neutral
d) Poor
16) How would you rate the ventilation and air conditioning at GYM?
a) Excellent
b) Good
c) Neutral
d) Average
e) Poor
17) How satisfied are you with the locker room facilities at GYM?
a) Very satisfied
b) Satisfied
c) Neutral
d) Dissatisfied
e) Very dissatisfied
18) Is the GYM a good fit for Beginners?
a) Yes
b) No
19) Is the GYM a good fit for experienced fitness enthusiasts?
a) Yes
b) No
20) How would you rate the overall experience at GYM?
a) Excellent
b) Good
c) Neutral
d) Poor
21) How much money do you spend in a month for GYM training?
a) 1,000
b) 2,000
c) 3,000
d) 4,000
e) Other______
• CODING OF A QUESTIONNAIRE
1) Name _________
2) Age Group (in years)
a) 16-21
b) 22-25
c) 26-35
d) Above 35
[“16-21-1”] [“22-25-2”] [“26-35-3”] [“Above 35-4”]
3) Gender
a) Female
b) Male
c) Other
[“Female-1”] [“Male-2”] [“Other-3”]
4) Annual household Income
a) Less than Rs. 2,50,000
b) Rs. 2,50,000-5,00,000
c) Rs. 5,00,001-8,00,000
d) Rs. 8,00,001-10,00,000
e) Above Rs. 10,00,000
[“Less than Rs. 2,50,000-1”] [“Rs. 2,50,000-5,00,000-2”] [“Rs. 5,00,001-8,00,000-3”]
[“Rs. 8,00,001-10,00,000-4”] [“Above Rs. 10,00,000-5”]
5) Qualification
a) Matric
b) 12th Pass
c) Graduate
d) Postgraduate
e) Other
[“Matric-1”] [“12th Pass-2”] [“Graduate-3”] [“Postgraduate-4”] [“Other-5”]
6) Occupation
a) Govt. Job
b) Private Job
c) Student
d) Business
e) Other
[“Govt. Job-1”] [“Private Job-2”] [“Student-3”] [“Business-4”] [“Other-5”]
Section 2:
1) How long have you been a member of GYM?
a) Less than 3 months
b) 3-6 months
c) 6 months to 1 year
d) More than 1 year
[“Less than 3 months-1”] [“3-6 months-2”] [“6 months to 1 year-3”] [“More than 1
year-4”]
2) How frequently do you visit Gym in a week?
a) Less than 1 time per week
b) 1-2 times per week
c) 3-4 times per week
d) More than 4 times per week
[“Less than 1 time per week-1”] [“1-2 times per week-2”] [“3-4 times per week-3”]
[“More than 4 times per week-4”]
3) How satisfied are you with the overall cleanliness and hygiene of gym facilities?
a) Very dissatisfied
b) Dissatisfied
c) Neutral
d) Satisfied
e) Very Satisfied
[“Very dissatisfied-1”] [“Dissatisfied-2”] [“Neutral-3”] [“Satisfied-4”] [“Very
Satisfied-5”]
4) How would you rate the quality and variety of fitness equipment available at Gym?
a) Poor
b) Fair
c) Good
d) Excellent
[“Poor-1”] [“Fair-2”] [“Good-3”] [“Excellent-4”]
5) How would you rate the professionalism and helpfulness of the staff at the gym,
including trainers and Front Desk Personnel?
a) Unprofessional and Unhelpful
b) Somewhat professional and helpful
c) Professional and helpful
d) Very Professional and Helpful
[“Unprofessional and Unhelpful-1”] [“Somewhat professional and helpful-2”]
[“Professional and helpful-3”] [“Very Professional and Helpful-4”]
6) How satisfied are you with the value for money you receive as a Gym member?
a) Poor value for money
b) Fair value for money
c) Good value for money
d) Excellent value for money
e) Can’t say anything
[“Poor value for money-1”] [“Fair value for money-2”] [“Good value for money-3”]
[“Excellent value for money-4”] [“Can’t say anything-5”]
7) How likely are you to recommend the gym to a friend or family Member?
a) Not likely at all
b) Somewhat likely
c) Likely
d) Very likely
[“Not likely at all-1”] [“Somewhat likely-2”] [“Likely-3”] [“Very likely-4”]
8) How satisfied are you with the overall customer service experience at the gym,
including communication and problem resolution?
a) Very dissatisfied
b) Somewhat dissatisfied
c) Neutral
d) Satisfied
e) Very satisfied
[“Very dissatisfied-1”] [“Somewhat dissatisfied-2”] [“Neutral-3”] [“Satisfied-4”]
[“Very satisfied-5”]
9) What are the main reasons for choosing a GYM as your fitness facility?
a) Convenient location
b) Reputation and brand name
c) Availability of desired fitness equipment
d) Variety of group fitness classes
e) Qualified trainers
f) Membership price and promotion
g) Recommendation from friend/family
h) Other
[“Convenient location-1”] [“Reputation and brand name-2”] [“Availability of desired
fitness equipment-3”] [“Variety of group fitness classes-4”] [“Qualified trainers-5”]
[“Membership price and promotion-6”] [“Recommendation from friend/family-7”]
[“Other-8”]
10) How satisfied are you with the availability and condition of fitness equipment at the
GYM, including cardio machines, weightlifting equipment, and accessories?
a) Not satisfied
b) Somewhat satisfied
c) Neutral
d) Satisfied
e) Very satisfied
[“Not satisfied-1”] [“Somewhat satisfied-2”] [“Neutral-3”] [“Satisfied-4”] [“Very
satisfied-5”]
11) How would you rate the effectiveness of the fitness programs and workouts offered at
the gym in helping you achieve your fitness goals?
a) Ineffective
b) Somewhat effective
c) Neutral
d) Effective
e) Very effective
[“Ineffective-1”] [“Somewhat effective-2”] [“Neutral-3”] [“Effective-4”] [“Very
effective-5”]
12) How satisfied are you with the availability of parking facilities at the GYM?
a) Not satisfied
b) Somewhat satisfied
c) Neutral
d) Satisfied
e) Very satisfied
[“Not satisfied-1”] [“Somewhat satisfied-2”] [“Neutral-3”] [“Satisfied-4”] [“Very
satisfied-5”]
13) How satisfied are you with the overall atmosphere and ambiance of the GYM,
including Lighting and music?
a) Very satisfied
b) Satisfied
c) Neutral
d) Dissatisfied
e) Very dissatisfied
[“Very satisfied-1”] [“Satisfied-2”] [“Neutral-3”] [“Dissatisfied-4”] [“Very
dissatisfied-5”]
14) How would you rate the overall customer service experience at GYM?
a) Very dissatisfied
b) Somewhat satisfied
c) Satisfied
d) Very satisfied
[“Very dissatisfied-1”] [“Somewhat satisfied-2”] [“Neutral-3”] [“Satisfied-4”] [“Very
satisfied-5”]
15) How would you rate the quality of group fitness classes at GYM?
a) Excellent
b) Good
c) Neutral
d) Poor
[“Excellent-1”] [“Good-2”] [“Neutral-3”] [“Poor-4”]
16) How would you rate the ventilation and air conditioning at GYM?
a) Excellent
b) Good
c) Neutral
d) Average
e) Poor
[“Excellent-1”] [“Good-2”] [“Neutral-3”] [“Average-4”] [“Poor-5”]
17) How satisfied are you with the locker room facilities at GYM?
a) Very satisfied
b) Satisfied
c) Neutral
d) Dissatisfied
e) Very dissatisfied
[“Very satisfied-1”] [“Satisfied-2”] [“Neutral-3”] [“Dissatisfied-4”] [“Very
dissatisfied-5”]
18) Is the GYM a good fit for Beginners?
a) Yes
b) No
[“Yes-1”] [“No-2”]
19) Is the GYM a good fit for experienced fitness enthusiasts?
a) Yes
b) No
[“Yes-1”] [“No-2”]
20) How would you rate the overall experience at GYM?
a) Excellent
b) Good
c) Neutral
d) Poor
[“Excellent-1”] [“Good-2”] [“Neutral-3”] [“Poor-4”]
21) How much money do you spend in a month for GYM training?
a) 1,000
b) 2,000
c) 3,000
d) 4,000
e) Other______
[“1,000-1”] [“2,000-2”] [“3,000-3”] [“4,000-4”] [“Other-5”]
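The coding scheme above, where each answer label is replaced by its position number, can be applied in R with factor(). The sketch below uses the labels of Section 2, question 3; the vector of answers is hypothetical and only illustrates the conversion.

```r
# Response labels in questionnaire order; the position gives the code.
satisfaction_levels <- c("Very dissatisfied", "Dissatisfied", "Neutral",
                         "Satisfied", "Very Satisfied")

# Hypothetical raw answers to Section 2, question 3.
answers <- c("Satisfied", "Neutral", "Very Satisfied",
             "Satisfied", "Dissatisfied")

# factor() fixes the level order; as.integer() returns the codes 1 to 5.
coded <- as.integer(factor(answers, levels = satisfaction_levels))
print(coded)

# Frequency of each code, including codes that nobody chose.
print(table(factor(coded, levels = 1:5)))
```

Fixing the level order explicitly matters because, without the levels argument, factor() sorts labels alphabetically and the codes would no longer match the questionnaire.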
7.4 Tabulation (description and Excel sheet screenshots)
• The data we have collected is recorded in the table given below:
• Max.
• Min.