0% found this document useful (0 votes)

78 views6 pages

IJCRT PaperFormat

The document introduces TextGuardianship, a Python and AI-powered plagiarism detection tool that utilizes advanced Natural Language Processing and machine learning techniques to accurately identify potential plagiarism in text. It features an intuitive interface for users to upload text or PDF documents, providing detailed reports on similarity scores and matched sources. The system has been evaluated for performance, achieving high precision and recall rates, and also includes functionality for detecting AI-generated content.

Uploaded by

chinmaypathak.1907

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

78 views6 pages

IJCRT PaperFormat

Uploaded by

chinmaypathak.1907

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

www.ijcrt.

org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882

TEXTGUARDIANSHIP
PYTHON & AI-POWERED PLAGIARISM DETECTION
1
Milind Patekar, 2 Chinmay Pathak, 3 Sushant Patil, 4 Devendra Shete
1
Final Year BE Student, 2 Final Year BE Student, 3 Final Year BE Student, 4 Final Year BE Student
1
Department of Artificial Intelligence and Data Science,
1
Ajeenkya DY Patil School of Engineering, Pune, India
________________________________________________________________________________________________________

Abstract: In the age of the internet, maintaining the authenticity of text content has become more and more important for academic
institutions, content developers, and professionals. This paper introduces TextGuardianship, a Python and AI-based plagiarism
detection tool that aims to provide a sound and effective means of detecting potential plagiarism. The system utilizes cutting-edge
Natural Language Processing (NLP) methods and machine learning models, such as TF-IDF vectorization, cosine similarity, and
semantic analysis, to scan and compare text data with high precision. Main aspects of TextGuardianship are the provision for both
direct text uploading and PDF upload, allowing easy content scanning in its original format. The platform also boosts user
comprehension through comprehensive reports identifying matched sources and offering similarity scores.

Index Terms - Plagiarism Detection, Natural Language Processing, Text Similarity, Python, TF-IDF, AI in Education
________________________________________________________________________________________________________
I. INTRODUCTION
During this period of technology advancements, access and sharing information easily have further contributed to high-risk cases
of unconscious and deliberate plagiarism. While a boom of publications online and more academic assignments increase the
frequency and volume of output, safeguarding the genuineness and purity of writings in terms of uniqueness has grown imperative
for scholars, publishers, as well as online content contributors. Old plagiarism detection software usually depends on rudimentary
keyword matching or shallow web scraping, which may fail to identify semantically equivalent or paraphrased material.
To overcome this challenge, we introduce TextGuardianship, a Python and AI-based plagiarism detector tool that offers accurate,
efficient, and informative analysis of text data. The system employs state-of-the-art Natural Language Processing (NLP) algorithms
and machine learning techniques to examine text similarity, identify paraphrased passages, and create comprehensive similarity
reports.
Our method utilizes Python libraries like NLTK, scikit-learn, and TensorFlow to conduct semantic analysis, TF-IDF vectorization,
and cosine similarity calculations. The system also allows direct PDF uploads, enabling users to verify academic papers, research
articles, or reports in their native format. TextGuardianship not only detects matched content but also gives detailed reports
highlighting plagiarized portions and suggesting possible sources.
This paper presents the architecture, functionality, and performance of TextGuardianship, showcasing its capacity to advance
academic integrity and original writing in the age of AI.
II. EASE OF USE
TextGuardianship features a simple, intuitive interface for easy accessibility by students, authors, and teachers. One can paste raw
text or import PDF documents for immediate plagiarism scanning. The system returns results complete with similarity scores,
highlighted matches, and matched sources in an easy-to-read report.

III. RESEARCH METHODOLOGY

This chapter describes the processes and equipment involved in the development, implementation, and assessment of the
TextGuardianship plagiarism detection system. The approach is intended to make the system technically sound and useful in
practice under real-world conditions.

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 25

www.ijcrt.org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882

3.1 Data and Sources of Data

In order to create and test the effectiveness of the TextGuardianship system, a heterogeneous corpus was built. This dataset consists
of text taken from scholarly research articles, blog posts, and Wikipedia articles. These were chosen to reflect various writing styles,
sentence structures, and topics. To make it realistic, multiple levels and forms of duplicated content were added to the dataset. This
involved direct copy-paste, paraphrasing, synonym replacement, and sentence reordering. The mix of original and tampered texts
provided thorough coverage of overt as well as latent plagiarism types, making the dataset effective for training as well as testing
the detection system.

3.2 Tools and Technologies

The system was implemented on top of the Python programming language because of its large ecosystem of natural language
processing (NLP) libraries and utilities. The most important libraries and frameworks utilized are:

1.Natural Language Toolkit (NLTK): Used for basic preprocessing operations like tokenization, removal of stop-words, and
normalization of text.

2.spaCy: Utilized for more complex NLP tasks like lemmatization, part-of-speech tagging, and named entity recognition.

3.Sentence-BERT (SBERT): Used to produce high-quality sentence embeddings that maintain semantic meaning, so that the system
is able to identify paraphrased or contextually equivalent content.

4.Scikit-learn: Used for similarity computations, evaluation metric calculations, and other support machine learning tasks.

Together, these packages helped to create a solid pipeline for text ingestion, processing, comparison, and plagiarism detection.

3.3 Methodological Steps

The plagiarism detection process within TextGuardianship is based on a structured pipeline with four main steps:

1.Preprocessing: The input text is cleaned by processes such as removal of punctuation, conversion to lower case, and tokenization.
This is done to maintain uniformity and make the text ready for semantic analysis.

2.Embedding Generation: Every sentence from the input and reference text is converted into a high-dimensional numeric vector by
the Sentence-BERT (SBERT) model. They store semantic meaning, allowing for correct comparison beyond word matching.

3.Similarity Detection: Cosine similarities are calculated between the input text embeddings and the reference dataset embeddings.
A similarity threshold is set a priori to decide whether a sentence or paragraph is detected as plagiarized.

4.Visualization and Reporting: The system marks plagiarized parts in the original text and shows similarity scores. A comprehensive
report is produced summarizing the overall extent of plagiarism detected, facilitating easy interpretation of results.

3.4 Model Architecture

The TextGuardianship system has a lean architecture centered around effectively detecting plagiarized content through
user input. The central workflow is described as follows:

1.User Input Interface:

The user is asked to provide a Title (subject of the content) and Description (actual content or paragraph). This description serves
as the main input for plagiarism analysis.

2.Text Preprocessing:
The description provided is preprocessed with standard Natural Language Processing (NLP) operations including:

• Lowercasing
• Removal of punctuation
• Tokenization

This cleans the data before comparison without distracting formatting noise.

3.Embedding Generation:
The transformed input text is mapped to semantic vector representations using Sentence-BERT (SBERT). These embeddings
encode the meaning of each sentence in context.

4.Similarity Checking:
The process compares the input embeddings with a reference dataset (Wikipedia pages, blogs, and academic journals) using cosine
similarity. It computes a similarity score for each sentence.

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 26

www.ijcrt.org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882

5.Threshold Evaluation:
If any sentence crosses a predetermined similarity threshold of 80% or higher, the system tags the content as "Plagiarism Detected."
Otherwise, it marks it as original.

6.Output Display:
The output is presented to the user, indicating:

• The similarity score

• Whether plagiarism was detected or not
• Visual highlights if necessary

This design guarantees fast and efficient plagiarism detection through semantic comprehension, as opposed to relying merely on
word-matching algorithms

3.4.2.1 Model for Text Similarity Detection

The center of the plagiarism detection system relies on a semantic similarity model with Sentence-BERT (SBERT). Once the user
provides input in the form of plain text or a PDF document with a title and description, the system processes the text (tokenization,
normalization). Subsequently, SBERT creates sentence embeddings for the input and reference documents in the data set.

Cosine similarity is calculated between the embeddings. Any piece of text that exceeds the threshold of similarity at 70% is marked
as plagiarized. This model supports strong detection of not only direct copying but also paraphrasing content.

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 27

www.ijcrt.org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882

3.4.2.2 Model for AI Content Detection

Apart from plagiarism, the system also uses AI-generated content detection via the Gemini API, which can detect if the provided
input is most likely to come from an AI language model. It analyzes language structure, repetition, coherence, and semantic depth
to estimate AI presence.

This detection distinguishes between content written by a human and machine-generated content, further supporting the academic
integrity check. The output is reported with the plagiarism score, aiding an overall content authenticity assessment.

3.4.3.1 TF-IDF Model for Plagiarism Detection

TF-IDF is a statistical measure used to evaluate how important a word is to a document in a collection. It is used in this
system to vectorize input and reference texts for similarity comparison.

Term Frequency (TF):

TF(t, d) = (Number of times term 't' appears in document 'd') ÷ (Total number of terms in document 'd')

Inverse Document Frequency (IDF):

IDF(t) = log(Total number of documents ÷ (1 + Number of documents containing term 't'))

TF-IDF Score:
TF-IDF(t, d) = TF(t, d) × IDF(t)

After converting both input and reference documents using TF-IDF, cosine similarity is applied to detect similarity levels between
them. A high similarity score (e.g., >70%) indicates potential plagiarism.

IV. RESULTS AND DISCUSSION

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 28

4.1 System Evaluation Metrics

The plagiarism detection system was tested with a test collection of documents whose plagiarism and originality levels are
known. The following measures were used to compare performance:

1.Precision: 0.89
2.Recall: 0.86
3.F1-score: 0.875

These figures show that the system performs very well at detecting plagiarized material while avoiding false positives.

4.2 Threshold Analysis

The plagiarism is flagged when similarity is above 70%. The tests indicated that the threshold optimizes detection accuracy
while avoiding raising alerts for small text overlaps or coincidental occurrences.

4.3 AI Content Detection

The system also uses Gemini API for detection of AI text. The results indicated that the majority of AI-written segments were
properly identified, allowing further analysis of originality and content validity.

4.4 Visual Output

The report then indicates plagiarized portions and shows similarity percentages. The system can accept content directly or through
PDF, and it produces a comprehensive result summary.

V. ACKNOWLEDGMENT
We want to thank Prof. Neelam Jain, our esteemed project guide, for her unwavering support, expert supervision, and invaluable
feedback throughout the development of this project. Her encouragement and deep understanding of artificial intelligence and data
science helped shape our ideas and guided us through technical challenges.

We also express our sincere gratitude to the Department of Artificial Intelligence and Data Science, Ajeenkya DY Patil School
of Engineering, Pune, for providing a strong academic foundation, access to essential infrastructure, and a collaborative environment
conducive to research and innovation.

Our heartfelt thanks also go to the college management and faculty members who contributed to our academic and professional
growth during this journey.

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 29

REFERENCES

[1] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for
Language Understanding. arXiv preprint.
https://arxiv.org/abs/1810.04805

[2] Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv preprint.
https://arxiv.org/abs/1908.10084

[3] Bird, S., Klein, E., & Loper, E. (2009). Natural Language Processing with Python. O'Reilly Media.
https://www.nltk.org/book/

[4] Pedregosa, F., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research.
https://scikit-learn.org/

[5] Google AI. (2024). Gemini API Documentation.

https://ai.google.dev/gemini-api

[6] Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing &
Management.
https://doi.org/10.1016/0306-4573(88)90021-0

[7] Jain, A., & Sharma, R. (2020). AI-based Plagiarism Detection Techniques in Indian Academia: A Review. International Journal
of Computer Applications.
https://doi.org/10.5120/ijca2020920034

[8] Patel, K., & Joshi, H. (2021). Implementation of Text Similarity Algorithms for Plagiarism Detection in Indian Research Papers.
International Journal of Engineering Research & Technology (IJERT), Vol. 10, Issue 04.
https://www.ijert.org/implementation-of-text-similarity-algorithms-for-plagiarism-detection

[9] Kumar, A., & Singh, M. (2019). Survey on Plagiarism Detection Tools and Techniques in India. Proceedings of the International
Conference on Computer Communication and Informatics (ICCCI).
https://ieeexplore.ieee.org/document/8707095

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 30

NLP-Based Plagiarism Detection System
No ratings yet
NLP-Based Plagiarism Detection System
11 pages
My Project
No ratings yet
My Project
16 pages
Sheild Plagiarism Detection Improving Accuracy and Efficiency Enhancement in Text and Image Similarity Detection
No ratings yet
Sheild Plagiarism Detection Improving Accuracy and Efficiency Enhancement in Text and Image Similarity Detection
2 pages
Plagiarism Detection Final PPT - PPTX (1) - 1
No ratings yet
Plagiarism Detection Final PPT - PPTX (1) - 1
37 pages
My Projeact
No ratings yet
My Projeact
21 pages
A New Era of Plagiarism The Danger of Cheating Using AI
No ratings yet
A New Era of Plagiarism The Danger of Cheating Using AI
6 pages
Ijresm V4 I4 34
No ratings yet
Ijresm V4 I4 34
3 pages
Generative AI Report
No ratings yet
Generative AI Report
15 pages
Fcomp 7 1504725
No ratings yet
Fcomp 7 1504725
20 pages
6014
No ratings yet
6014
36 pages
Plagiarism Checker Using Levenshtein & Google
No ratings yet
Plagiarism Checker Using Levenshtein & Google
6 pages
Utilization of NLP Techniques in Plagiarism Detection System Through Semantic Analysis Using Word2Vec
No ratings yet
Utilization of NLP Techniques in Plagiarism Detection System Through Semantic Analysis Using Word2Vec
6 pages
Jhgythjgcn B
No ratings yet
Jhgythjgcn B
7 pages
AI Text Detection System Development
No ratings yet
AI Text Detection System Development
2 pages
IJCRT2409745
No ratings yet
IJCRT2409745
6 pages
AI Based Student's Assignments Plagiarism Detector
No ratings yet
AI Based Student's Assignments Plagiarism Detector
11 pages
A System For Detection of Plagiarism of Ideas Based On Deep Learning Algorithm
No ratings yet
A System For Detection of Plagiarism of Ideas Based On Deep Learning Algorithm
8 pages
NLP-Based Text Plagiarism Checker
No ratings yet
NLP-Based Text Plagiarism Checker
18 pages
Ijarcce 2024 134107
No ratings yet
Ijarcce 2024 134107
6 pages
Ai Mini Project
No ratings yet
Ai Mini Project
16 pages
AI Assignment Detection System
No ratings yet
AI Assignment Detection System
10 pages
Online Assignment Plagiarism Checker System
No ratings yet
Online Assignment Plagiarism Checker System
3 pages
Academic Plagiarism Detection
No ratings yet
Academic Plagiarism Detection
3 pages
IJCRT2312092
No ratings yet
IJCRT2312092
6 pages
Plagiarism Detection Algorithm Using Natural Language Processing Based On Grammar Analyzing
No ratings yet
Plagiarism Detection Algorithm Using Natural Language Processing Based On Grammar Analyzing
13 pages
Source Code Plagiarism
No ratings yet
Source Code Plagiarism
41 pages
Plagiarism Chapter 1 N 2
No ratings yet
Plagiarism Chapter 1 N 2
14 pages
Tijani Adesola
No ratings yet
Tijani Adesola
11 pages
Two-Phase Plagiarism Detection System
No ratings yet
Two-Phase Plagiarism Detection System
13 pages
Paper 4
No ratings yet
Paper 4
6 pages
Identifying Machine-Paraphrased Plagiarism: Bibtex Ris Enw
No ratings yet
Identifying Machine-Paraphrased Plagiarism: Bibtex Ris Enw
22 pages
JETIR1706044
No ratings yet
JETIR1706044
3 pages
A Unified Framework For Text Extraction and Plagiarism Detection in Image-Based Content Using OCR and NLP
No ratings yet
A Unified Framework For Text Extraction and Plagiarism Detection in Image-Based Content Using OCR and NLP
10 pages
AI Self-Detection in Text Generation
No ratings yet
AI Self-Detection in Text Generation
12 pages
Review1
No ratings yet
Review1
19 pages
Wahle 2022 B
No ratings yet
Wahle 2022 B
23 pages
(IJCST-V8I4P13) :M. Chilakarao, K. Sri Sahitya, K. Hari Priya, N. Bala Manikanta, M. Deepika
No ratings yet
(IJCST-V8I4P13) :M. Chilakarao, K. Sri Sahitya, K. Hari Priya, N. Bala Manikanta, M. Deepika
8 pages
Report4 Merged Organized 1 30
No ratings yet
Report4 Merged Organized 1 30
30 pages
Ilovepdf Merged Pagenumber
No ratings yet
Ilovepdf Merged Pagenumber
59 pages
Signed Report
No ratings yet
Signed Report
37 pages
Python Plagiarism Checker Project Report
No ratings yet
Python Plagiarism Checker Project Report
18 pages
Mini - Project - Final (1) .PPTX (Read-Only) .
No ratings yet
Mini - Project - Final (1) .PPTX (Read-Only) .
15 pages
Coat Corpus of Artificial Texts
No ratings yet
Coat Corpus of Artificial Texts
26 pages
Synopsis 6 Sem 34 ECE
No ratings yet
Synopsis 6 Sem 34 ECE
9 pages
AuthenTracK Tracking and Verifying Content Authenticity With AI
No ratings yet
AuthenTracK Tracking and Verifying Content Authenticity With AI
8 pages
Text Similarity with Cosine in Python
No ratings yet
Text Similarity with Cosine in Python
4 pages
AuthenTracK Tracking and Verifying Content Authenticity With AI
No ratings yet
AuthenTracK Tracking and Verifying Content Authenticity With AI
8 pages
A Comprehensive Framework For Semantic Similarity Analysis of Human and AI-Generated Text Using Transformer Architectures and Ensemble Techniques
No ratings yet
A Comprehensive Framework For Semantic Similarity Analysis of Human and AI-Generated Text Using Transformer Architectures and Ensemble Techniques
4 pages
AI Text Detection Project Synopsis
No ratings yet
AI Text Detection Project Synopsis
5 pages
Basawashreeeeeeeee
No ratings yet
Basawashreeeeeeeee
10 pages
AI On Plagism
No ratings yet
AI On Plagism
10 pages
Source-Code Plagiarism Detection Tool
No ratings yet
Source-Code Plagiarism Detection Tool
16 pages
Basawashree 1
No ratings yet
Basawashree 1
10 pages
Plagiarism PPT 1
No ratings yet
Plagiarism PPT 1
21 pages
Bi Lab Manual Cl-IV Be Ai&Ds
No ratings yet
Bi Lab Manual Cl-IV Be Ai&Ds
67 pages
Capgemini Selects 2025 Batch 1
No ratings yet
Capgemini Selects 2025 Batch 1
12 pages
Computational Intelligence Endsem
No ratings yet
Computational Intelligence Endsem
8 pages
Ci Unit 4
No ratings yet
Ci Unit 4
14 pages
Plagiarism Report Victim Families of Will Get Justice. The Harshest Response Will Be Given 1749400347672
No ratings yet
Plagiarism Report Victim Families of Will Get Justice. The Harshest Response Will Be Given 1749400347672
1 page
Business Intelligence Endsem
No ratings yet
Business Intelligence Endsem
12 pages
Power Bi 4
No ratings yet
Power Bi 4
20 pages
Business Intelligence Endsem
No ratings yet
Business Intelligence Endsem
10 pages
CTF 2025 Recruitment Tasks Overview
No ratings yet
CTF 2025 Recruitment Tasks Overview
2 pages
2025 Hyundai i20 Brochure Overview
No ratings yet
2025 Hyundai i20 Brochure Overview
9 pages
Coal Liquefaction Processes Explained
No ratings yet
Coal Liquefaction Processes Explained
8 pages
Concreto Armado: Diseño y Tecnología
No ratings yet
Concreto Armado: Diseño y Tecnología
11 pages
Human Performance Maturity Model White Paper Final
100% (1)
Human Performance Maturity Model White Paper Final
12 pages
Personnel Lifting Safety Devices
No ratings yet
Personnel Lifting Safety Devices
3 pages
7504 - IND Application Form
No ratings yet
7504 - IND Application Form
16 pages
Freight Car Models and Specifications
No ratings yet
Freight Car Models and Specifications
50 pages
RISO EZ 370E Parts List Overview
100% (2)
RISO EZ 370E Parts List Overview
116 pages
R Programming Basics Guide
No ratings yet
R Programming Basics Guide
49 pages
Ridhima Yadav - G023 - 80511020325: Brand Management Faculty: Dr. Smriti Pande
No ratings yet
Ridhima Yadav - G023 - 80511020325: Brand Management Faculty: Dr. Smriti Pande
5 pages
FEECO Agglomeration Equipment Overview
No ratings yet
FEECO Agglomeration Equipment Overview
8 pages
Indian Contract Act
No ratings yet
Indian Contract Act
15 pages
BCA IT Question Bank - Lala Lajpat Rai
No ratings yet
BCA IT Question Bank - Lala Lajpat Rai
4 pages
Options Trading Analysis for Mary
No ratings yet
Options Trading Analysis for Mary
8 pages
Getnet Gr-534w User Manual
No ratings yet
Getnet Gr-534w User Manual
147 pages
Project Flexibility and Tradeoff Matrix
No ratings yet
Project Flexibility and Tradeoff Matrix
2 pages
Screw Air Compressor Common Fails
No ratings yet
Screw Air Compressor Common Fails
5 pages
09 Managing Plant Diseases
100% (1)
09 Managing Plant Diseases
25 pages
Ma - Na.Pa - Bhavan - Lohgaon Ajinkya DY Patil Vidyapeeth: 158 Bus Time Schedule & Line Map
No ratings yet
Ma - Na.Pa - Bhavan - Lohgaon Ajinkya DY Patil Vidyapeeth: 158 Bus Time Schedule & Line Map
5 pages
C Process
No ratings yet
C Process
1 page
NetCalculator - CodeProject
No ratings yet
NetCalculator - CodeProject
5 pages
List of Candidates Applied For Upward Movement (FOR ALLOTMENT ON 23/09/2020)
No ratings yet
List of Candidates Applied For Upward Movement (FOR ALLOTMENT ON 23/09/2020)
2 pages
TVS Raider Service Invoice 2786
No ratings yet
TVS Raider Service Invoice 2786
2 pages
Baltimor Condenser
0% (1)
Baltimor Condenser
44 pages
Spot Satellite
No ratings yet
Spot Satellite
8 pages
Mathematical Models of Mechanical Systems
No ratings yet
Mathematical Models of Mechanical Systems
12 pages
Serial Peripheral Interface (SPI) : Available Online at
No ratings yet
Serial Peripheral Interface (SPI) : Available Online at
8 pages
Telecom Digital Transformation: Technology
100% (1)
Telecom Digital Transformation: Technology
31 pages
Mazda RX-7 Bodyshop Repair Manual
100% (68)
Mazda RX-7 Bodyshop Repair Manual
8 pages

IJCRT PaperFormat

Uploaded by

IJCRT PaperFormat

Uploaded by

www.ijcrt.

org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882

III. RESEARCH METHODOLOGY

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 25

3.1 Data and Sources of Data

3.2 Tools and Technologies

3.3 Methodological Steps

3.4 Model Architecture

1.User Input Interface:

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 26

• The similarity score

3.4.2.1 Model for Text Similarity Detection

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 27

3.4.2.2 Model for AI Content Detection

3.4.3.1 TF-IDF Model for Plagiarism Detection

Term Frequency (TF):

Inverse Document Frequency (IDF):

IV. RESULTS AND DISCUSSION

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 28

4.1 System Evaluation Metrics

4.2 Threshold Analysis

4.3 AI Content Detection

4.4 Visual Output

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 29

[5] Google AI. (2024). Gemini API Documentation.

IJCRT1601009 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org 30

You might also like