RESEARCH METHODOLOGY
3.1 Introduction
This chapter discusses the system analysis, tools, and methods employed in this research work.
The methodology guides the systematic approach we take to gathering data, preparing it,
choosing models, and evaluating them. By combining machine learning with credit card
transaction data, we seek to create a robust and accurate system capable of identifying and
stopping fraudulent transactions, thereby enhancing security and trust in the constantly
changing world of digital money. We follow a rigorous process throughout in order to gain
insights, validate our findings, and contribute to the field of fraud detection technology.
3.2.1 Dataset
In this research, we use a dataset of credit card transactions made by European cardholders
over two days in September 2013. The dataset contains 284,807 transactions in total, of which
0.172% are fraudulent. It has 30 features: V1, ..., V28, Time, and Amount. All the attributes
within the dataset are numerical. The last column represents the class (type of transaction),
where a value of 1 denotes a fraudulent transaction and a value of 0 denotes a legitimate one.
The features V1 to V28 are not named for data security and integrity reasons (kaggle.com, 2021).
To address the class imbalance, we applied the Synthetic Minority Oversampling Technique
(SMOTE) in the data preprocessing phase of the proposed framework in Fig. 1. SMOTE works by
picking minority-class samples that are close to each other in the feature space, drawing a line
between those data points, and creating a new instance of the minority class at a point along
the line.
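The interpolation step described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the implementation used in this study; the function name smote_sketch, the parameters k and seed, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbors at random.
        neighbor = minority[rng.choice(others[:k])]
        # Place the new sample at a random position on the line segment
        # between the base point and the chosen neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class (fraud) with two hypothetical features.
fraud = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_sketch(fraud, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than being exact duplicates of existing rows.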
3.3 Method of Data Collection
Primary and secondary data collection are the two approaches used to gather information for
research or analysis purposes.
Primary data collection involves gathering original information directly from the source or
through direct engagement with respondents. This approach enables researchers to gather
firsthand knowledge that is specific to their study aims. There are several methods for
gathering primary data, including:
b. Interviews: In interviews, the researcher and the responder interact directly. They can take
place in person, over the phone, or via video conferencing. Structured interviews (with preset
questions), semi-structured interviews (providing flexibility), and unstructured interviews (more
conversational) are all options.
c. Observations: Researchers observe and document natural behaviors, actions, or events. This
method can be used to collect data about human behavior and interactions without requiring
direct involvement.
d. Experiments: Experiments entail manipulating one or more variables to see how they affect
the outcome. Researchers gather data under controlled conditions in order to draw conclusions
about cause-and-effect relationships.
e. Focus Groups: A focus group is a small group of people who meet in a controlled setting to
discuss a given issue. This strategy aids in comprehending the participants' ideas, perceptions,
and experiences.
Data collection entails gathering relevant information on a given subject of research. The
data required for this research was acquired from a secondary source, namely the dataset
available in the Kaggle repository.
The fitness method is defined as a function that receives a candidate solution (a feature
vector) and determines how fit it is. Fitness is measured by the accuracy that a particular
feature vector yields when the random forest (RF) method is tested within the genetic
algorithm (GA). Algorithm 1 provides more details about the implementation of RF in the GA.
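As a rough illustration of the fitness idea described above, the sketch below scores a candidate feature vector by the test accuracy it yields. It is not Algorithm 1: to stay self-contained it substitutes a simple nearest-centroid classifier for RF, and the names fitness, select_columns, and centroid_accuracy, the binary mask encoding, and the toy data are all assumptions made for illustration.

```python
import math


def select_columns(rows, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def centroid_accuracy(X_train, y_train, X_test, y_test):
    """Stand-in classifier (nearest centroid); the source uses RF at this step."""
    centroids = {}
    for label in set(y_train):
        members = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    correct = 0
    for x, y in zip(X_test, y_test):
        predicted = min(centroids, key=lambda c: math.dist(x, centroids[c]))
        correct += predicted == y
    return correct / len(y_test)


def fitness(mask, X_train, y_train, X_test, y_test, score=centroid_accuracy):
    """Return the test accuracy obtained with the features selected by mask."""
    if not any(mask):
        return 0.0  # assumption: an empty feature set is treated as least fit
    return score(select_columns(X_train, mask), y_train,
                 select_columns(X_test, mask), y_test)


# Toy data: column 0 separates the classes, column 1 is noise.
X_train = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [1.0, 1.0]]
y_train = [0, 0, 1, 1]
X_test = [[0.15, 1.1], [0.95, 4.9]]
y_test = [0, 1]

informative = fitness([1, 0], X_train, y_train, X_test, y_test)
noisy = fitness([0, 1], X_train, y_train, X_test, y_test)
```

In this toy setup the mask that keeps only the informative column scores higher than the mask that keeps only the noise column, which is the signal a GA would use to prefer one candidate over another. Any scorer with the same four-argument signature, such as one that trains an RF model, could be passed through the hypothetical score parameter.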