
AMITY UNIVERSITY

Uttar Pradesh Lucknow Campus


Amity Institute of Information Technology
WEEKLY PROGRESS REPORT
For the week commencing from: 26 June 2023 – 2 July 2023

WPR No. 4

Name of the student: Shaurya Upadhyay

Enrollment Number: A7304822105

Program: Bachelor of Computer Applications (2019-2022) Semester: II

Name of the Non-Teaching Credit Course: Summer Internship – 1 [ETTP100]

Organization Name: AIIT, Amity University Uttar Pradesh, Lucknow

Faculty Guide’s Name: Dr. Ajay Pratap

Project Title: Detecting Fraud Apps Using Sentiment Analysis Using Machine Learning

Targets set for the week

Here is a breakdown of suggested targets for each day of the week in the project "Detecting Fraud Apps Using Sentiment Analysis Using Machine Learning":

Monday:
1. Define the project's objectives, scope, and success criteria in detail.
2. Identify and finalize the specific fraud detection metrics and evaluation criteria to assess the performance of the system.
3. Conduct a thorough review of the existing literature and research papers on fraud detection, sentiment analysis, and machine learning techniques in the context of app reviews.
4. Set up the development environment, including installing necessary libraries and frameworks.

Tuesday:
1. Gather a comprehensive dataset of app reviews, including both legitimate and fraudulent samples, considering different app categories and platforms.
2. Preprocess the dataset by cleaning and normalizing the text data, handling any missing values or outliers, and performing basic exploratory data analysis.
3. Split the dataset into training, validation, and testing subsets, ensuring an appropriate balance between legitimate and fraudulent samples.
4. Implement a baseline sentiment analysis model using machine learning techniques and evaluate its performance using appropriate metrics.

Wednesday:
1. Experiment with different feature extraction techniques, such as bag-of-words, TF-IDF, or word embeddings, to capture relevant information from the preprocessed app reviews.
2. Incorporate the sentiment analysis results as features in the fraud detection model and explore additional features that can enhance fraud detection accuracy.
3. Design and implement a fraud detection model using machine learning algorithms, such as logistic regression, random forest, or gradient boosting.
4. Train the fraud detection model using the labeled dataset and evaluate its performance using the chosen metrics.

Thursday:
1. Fine-tune the hyperparameters of the fraud detection model using techniques like grid search or Bayesian optimization to optimize its performance.
2. Conduct an in-depth analysis of the model's performance, including evaluating the impact of different thresholds or decision boundaries on the detection results.
3. Implement mechanisms to handle class imbalance in the dataset, such as oversampling, undersampling, or employing class-weighted techniques.
4. Explore different evaluation strategies, such as cross-validation or bootstrapping, to obtain more reliable performance estimates.

Friday:
1. Assess the robustness and generalization capability of the fraud detection model using additional datasets or real-world app reviews.
2. Conduct thorough evaluations of the model's performance, considering different performance metrics, including accuracy, precision, recall, F1-score, and area under the ROC curve.
3. Compare the performance of the developed model against existing fraud detection methods or state-of-the-art approaches to highlight its effectiveness.
4. Document the evaluation results, including performance metrics, comparative analysis, and any insights or observations.

Saturday:
1. Prepare a detailed project report summarizing the methodology, findings, and conclusions of the project.
2. Document the dataset collection and preprocessing steps, feature extraction techniques, fraud detection model architecture, evaluation metrics, and results.
3. Include a comprehensive analysis of the strengths, limitations, and potential areas for improvement in the developed fraud detection system.
4. Proofread and refine the project report, ensuring clarity, coherence, and accurate representation of the project's targets and achievements.

Remember to adjust the targets based on the specific requirements and timeline of your project. Regularly communicate with your team members, seek guidance from mentors or advisors, and track progress to ensure successful completion of the project.

Achievements for the week

Here is a breakdown of suggested achievements for each day of the week in the project "Detecting Fraud Apps Using Sentiment Analysis Using Machine Learning":

Monday:
1. Define and finalize the project's objectives, scope, and success criteria.
2. Conduct a comprehensive review of existing literature and research papers related to fraud detection, sentiment analysis, and machine learning techniques in the context of app reviews.
3. Identify and gather a suitable dataset of app reviews, including both legitimate and fraudulent samples.
4. Preprocess the dataset by cleaning, tokenizing, and normalizing the text data, and handle any missing or noisy data.

Tuesday:
1. Implement a sentiment analysis algorithm using machine learning techniques such as Naive Bayes, Support Vector Machines, or deep learning models like LSTM or Transformer.
2. Train the sentiment analysis model using the labeled dataset and evaluate its performance using appropriate metrics such as accuracy, precision, recall, and F1-score.
3. Fine-tune the sentiment analysis model if necessary, by optimizing hyperparameters or exploring different architectures.
4. Document the implementation details, performance results, and any improvements made to the sentiment analysis model.

Wednesday:
1. Extract relevant features from the preprocessed app reviews, such as text length, sentiment scores, keyword frequency, or linguistic features.
2. Explore different feature selection techniques to identify the most informative features for fraud detection.
3. Design and implement a fraud detection model that combines the sentiment analysis results with selected features.
4. Train the fraud detection model using the prepared dataset and evaluate its performance using appropriate metrics, comparing it against baselines or existing fraud detection approaches.

Thursday:
1. Fine-tune the fraud detection model by optimizing hyperparameters or exploring ensemble methods to improve its accuracy and robustness.
2. Conduct extensive testing and evaluation of the fraud detection model using separate test datasets or real-time data to assess its effectiveness in detecting fraudulent apps.
3. Perform comparative analysis with existing fraud detection methods or state-of-the-art approaches to showcase the superiority of the proposed solution.
4. Document the performance results, comparative analysis, and any refinements made to the fraud detection model.

Friday:
1. Prepare a comprehensive report summarizing the project's methodology, findings, and results. Include sections on data collection and preprocessing, sentiment analysis, feature extraction, fraud detection model, evaluation metrics, and performance comparisons.
2. Discuss any limitations or challenges encountered during the project and propose potential solutions or future research directions.
3. Revise and refine the report, ensuring clarity, coherence, and accurate representation of the project's achievements.
4. Prepare a visually appealing presentation highlighting the key achievements, methodology, and results of the project.

Saturday:
1. Finalize the project report, incorporating any feedback or suggestions received from project stakeholders or advisors.
2. Practice and rehearse the project presentation to ensure effective delivery and clear communication of the project's objectives and accomplishments.
3. Submit the project report and deliver the presentation as required.
4. Reflect on the overall project journey, lessons learned, and areas for future improvement or expansion.

Remember to adapt these achievements based on the specific requirements and scope of your project. Regularly communicate with your team members, seek guidance from mentors or advisors, and track progress to ensure successful completion of the project.

Future work plans

Here is a breakdown of future work plans for each day of the week in the project "Detecting Fraud Apps Using Sentiment Analysis Using Machine Learning":

Monday:
1. Research and explore advanced feature engineering techniques to enhance the fraud detection capabilities. Consider techniques such as word embeddings, contextual embeddings (e.g., BERT), or domain-specific feature extraction.
2. Investigate the use of alternative machine learning algorithms or models, such as random forests, gradient boosting, or neural networks, to improve the accuracy and robustness of the fraud detection system.
3. Evaluate the possibility of integrating external data sources, such as user demographics or app usage statistics, to enrich the feature set and enhance fraud detection performance.

Tuesday:
1. Investigate techniques for handling imbalanced datasets, as fraud cases are typically rare compared to legitimate cases. Explore oversampling, undersampling, or hybrid approaches to balance the dataset and improve the model's ability to detect fraud.
2. Implement cross-validation or other validation techniques to assess the generalization performance of the fraud detection model. Fine-tune model hyperparameters to optimize its performance on unseen data.
3. Evaluate the feasibility of deploying the fraud detection system on cloud platforms or edge devices for real-time or near-real-time fraud detection in app marketplaces.

Wednesday:
1. Explore techniques for interpretability and explainability of the fraud detection model. Consider methods such as feature importance analysis, SHAP values, or LIME to gain insights into the model's decision-making process.
2. Investigate the use of semi-supervised or active learning techniques to leverage unlabeled data and reduce the reliance on manual labeling of fraudulent app reviews.
3. Implement techniques to detect and mitigate concept drift, where the fraud patterns may change over time. Develop mechanisms to adapt the fraud detection model to evolving fraud tactics.

Thursday:
1. Evaluate the scalability of the fraud detection system to handle large volumes of app reviews in real-time. Optimize the system's performance, considering parallel processing, distributed computing, or cloud-based infrastructure.
2. Investigate the use of natural language generation techniques to generate informative and actionable responses to users regarding the detection of fraudulent apps.
3. Conduct additional experiments and evaluations using diverse datasets or in collaboration with industry partners to validate and benchmark the performance of the fraud detection system.

Friday:
1. Explore techniques for proactive fraud prevention, such as anomaly detection or outlier analysis, to identify potential fraud cases even before they are reported by users.
2. Investigate the use of explainable AI techniques to build trust and transparency with app developers and users by providing understandable justifications for fraud detection outcomes.
3. Collaborate with app store platforms or regulatory bodies to share insights, exchange data, and enhance fraud detection capabilities.

Saturday:
1. Prepare documentation summarizing the future work plans, including a detailed roadmap for each area of improvement and expansion.
2. Discuss and prioritize the future work plans with the project team or stakeholders, ensuring alignment with project goals and objectives.
3. Plan and allocate resources, considering the feasibility, impact, and dependencies of each future work plan.

Remember to adapt and prioritize the future work plans based on the project's specific requirements, available resources, and timeline. Regularly communicate with team members, stakeholders, and relevant experts to gather feedback and refine the future work plans as needed.
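
As an illustration of the Thursday target on evaluating the impact of different decision thresholds, the following is a minimal sketch in plain Python, not the project's actual code. The function name precision_recall_f1 and the labels and scores lists are hypothetical, made up for this example; a score at or above the threshold is assumed to mean "predicted fraudulent".

```python
from typing import List, Tuple


def precision_recall_f1(
    labels: List[int], scores: List[float], threshold: float
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the fraud class (label 1).

    A review is predicted fraudulent when its score >= threshold.
    """
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1  # fraud correctly flagged
        elif predicted == 1 and y == 0:
            fp += 1  # legitimate review wrongly flagged
        elif predicted == 0 and y == 1:
            fn += 1  # fraud missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical example data: 1 = fraudulent, 0 = legitimate.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

# Sweep a few thresholds to see the precision/recall trade-off.
for t in (0.3, 0.5, 0.7):
    p, r, f = precision_recall_f1(labels, scores, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Lowering the threshold flags more reviews, which tends to raise recall at the cost of precision. With rare fraud cases, this trade-off is usually more informative than accuracy alone.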

Signature of the Student: Shaurya Upadhyay _


(Name of Student)

Signature of the Faculty Guide: _


(Name of Guide)
