
SECURING THE LLM CHATBOT

- Threats and mitigation

Ahmed Mahfooz Ali Khan alaa23@student.bth.se (PA2562)


Sai Nikhil Sakhetapuram sask23@student.bth.se (PA2562)
Syed Tafazzul Hazqueel tasy23@student.bth.se (PA2562)
Surya Gujja sugu23@student.bth.se (PA2562)
Rajeev Varma Kanna raka23@student.bth.se (PA2562)

Dept. Computer Science & Engineering
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden

ABSTRACT:
This report highlights the critical need for securing LLM chatbot systems against potential
threats. We identify important security flaws that leave such a system vulnerable to
unauthorized access, data breaches, and malicious queries. To address these issues, the report
proposes comprehensive mitigation strategies that incorporate user authentication, encryption,
regular audits, and monitoring. Implementing these measures is imperative for safeguarding
the LLM chatbot and ensuring its long-term success.

INTRODUCTION:
The LLM project aims to deploy a chatbot for user queries, which makes security important.
A lack of security safeguards leaves the system vulnerable to attack, endangering user privacy
and data integrity. This introduction highlights the significance of safeguarding the LLM
chatbot, not only to mitigate threats but also to strengthen users' satisfaction, confidence, and trust.

POTENTIAL THREATS
The potential threats associated with the LLM chatbot are as follows:

 DATA BREACH: A data breach occurs when confidential information gathered by the chatbot is
accessed by unauthorised parties. This data comprises personal information such as names and
addresses, conversation history, financial information, and behavioural insights gained from user
interactions. As with any other breach, there may be serious repercussions for both users and the
company responsible for the chatbot.

It can occur due to:

Security vulnerabilities: weak encryption, software bugs, and coding errors can create entry
points for attackers.
Third-party breaches: vulnerabilities in external services can expose user data.
Targeted attacks: skilled attackers may take advantage of particular flaws in the chatbot's
programming or hardware.

 SQL INJECTIONS: SQL injection is a code-injection technique used to attack data-driven
systems. The attacker inserts malicious SQL statements into an entry field for execution. SQL
injection exploits a fault in the program's handling of input, for example when user input is not
strongly typed and is handled unexpectedly, or when user input is incorrectly filtered for
string-literal escape characters embedded in SQL queries.
SQL injection attacks give attackers the ability to modify balances or cancel transactions,
fake identities, tamper with existing data, and disclose all information.

How SQL injection might occur in a chatbot:

User input handling: the chatbot allows users to input information, which is then used to build
SQL queries.
Lack of input filtering: if the chatbot does not adequately check and filter user input, dangerous
input containing SQL code can reach the database, as the sketch below shows.
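
To make the risk concrete, the following minimal sketch (in Python, using the built-in sqlite3 module; the table, rows, and inputs are hypothetical) contrasts a query built by string concatenation, which is injectable, with the parameterized form recommended later in this report:

    import sqlite3

    # Hypothetical chatbot backend; table and rows are illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user TEXT, item TEXT)")
    conn.execute("INSERT INTO orders VALUES ('alice', 'laptop')")

    def lookup_unsafe(user_input):
        # VULNERABLE: input is concatenated directly into the SQL string,
        # so input like  ' OR '1'='1  returns every row in the table.
        query = "SELECT item FROM orders WHERE user = '" + user_input + "'"
        return conn.execute(query).fetchall()

    def lookup_safe(user_input):
        # SAFE: a parameterized query keeps the input out of the SQL code,
        # so it is treated as data, never as part of the statement.
        return conn.execute("SELECT item FROM orders WHERE user = ?",
                            (user_input,)).fetchall()

    print(lookup_unsafe("' OR '1'='1"))  # leaks all rows
    print(lookup_safe("' OR '1'='1"))    # returns an empty list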

 ADVERSARIAL ATTACK: These attacks are a specific type of cyberattack that targets machine
learning models. They involve crafting carefully modified inputs with the intention of deceiving the
model into producing incorrect outputs or predictions. These attacks pose a grave danger to the
security and reliability of AI systems, particularly as these systems become ever more integrated
into our daily lives.

How do they work:

The attacker examines the inner workings of a machine learning model, including its strengths
and flaws.
They then craft specially modified inputs called “adversarial examples”: for instance, slightly
altered text, images, or other data that appear normal to a human but contain hidden changes
that trick the model.
When the model receives these examples, it produces incorrect predictions or outputs.
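
A real attack perturbs inputs against a learned model; the toy sketch below uses a naive keyword filter as a stand-in for a classifier to show the principle: a character substitution that a human barely notices flips the output. All names and strings are illustrative.

    # Toy stand-in for a learned classifier: a naive keyword blocklist.
    BLOCKLIST = {"free money", "wire transfer"}

    def classify(text):
        return "malicious" if any(k in text.lower() for k in BLOCKLIST) else "benign"

    original = "Claim your free money now"
    # "Adversarial example": every Latin 'e' becomes a Cyrillic 'е' (U+0435),
    # visually near-identical but no longer matching the blocklist.
    adversarial = original.replace("e", "\u0435")

    print(classify(original))     # malicious
    print(classify(adversarial))  # benign: the perturbation fools the filter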

 MISINFORMATION AND DISINFORMATION: Misinformation can occur in the setting of an
LLM chatbot when the model generates responses that are factually incorrect. This can happen
because of biases in the training data, misinterpretation of context, or unintentional errors in
language generation.
Disinformation, on the other hand, occurs when malicious actors deliberately manipulate the
model to produce misleading material that spreads incorrect information.

MITIGATION RECOMMENDATIONS:

DATA BREACHES:

 Robust encryption: Use strong encryption algorithms to safeguard data both in transit
and at rest; a minimal sketch follows this list.
 Monitor performance: Continuously monitor the chatbot for suspicious activity and update
security settings as necessary.
 Data minimization: Only gather the information required for the chatbot to work; avoid storing
sensitive information unless it is necessary.
 Regular data security audits: Conduct routine audits to proactively identify and address
vulnerabilities in your data security procedures.
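
As a minimal sketch of encryption at rest, the snippet below uses the Fernet recipe from the widely used Python cryptography package; the record is made up, and in practice the key would live in a key-management service, never alongside the data it protects.

    from cryptography.fernet import Fernet  # pip install cryptography

    key = Fernet.generate_key()   # in production: fetch from a KMS, never hard-code
    cipher = Fernet(key)

    record = b"user: alice | message: my card ends in 4242"  # illustrative data
    token = cipher.encrypt(record)     # safe to write to disk or a database
    restored = cipher.decrypt(token)   # recoverable only with the key

    assert restored == record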

SQL INJECTIONS:

 Input validation and sanitization: Scrutinize user input for malicious code or harmful
instructions before processing the data; see the validation sketch after this list.
 Parameterized queries: Use prepared statements to isolate user input from SQL code,
preventing the injection of malicious commands (as in the earlier sketch).
 Web Application Firewall (WAF): Implement a WAF to filter and block malicious traffic. SQL
injection attempts can be identified and stopped by a WAF before they even reach the application.
 Education and awareness: Educate system administrators and developers on secure coding
techniques and SQL injection hazards, and foster a security-aware culture in the development
team.
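
The sketch below illustrates the validation step with a simple allowlist; the pattern and length cap are assumptions for this example, since a real policy depends on what the chatbot legitimately needs to accept, and validation complements rather than replaces parameterized queries.

    import re

    # Illustrative allowlist: letters, digits, spaces, and basic punctuation,
    # capped at 200 characters. Tighten or relax per the chatbot's domain.
    SAFE_INPUT = re.compile(r"[A-Za-z0-9 .,?!-]{1,200}")

    def validate(user_input: str) -> str:
        if not SAFE_INPUT.fullmatch(user_input):
            raise ValueError("rejected: input contains disallowed characters")
        return user_input

    validate("What is the status of order 1042?")  # passes
    validate("1042'; DROP TABLE orders; --")       # raises ValueError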

ADVERSARIAL ATTACKS:

 Adversarial training: Adding adversarial samples to the training data makes the model more
resilient to these kinds of attacks; a sketch of this augmentation step follows the list.
 Ensemble methods: Since different models may have distinct vulnerabilities, robustness against
adversarial attacks can often be improved by combining predictions from multiple models
or by employing ensemble approaches.
 Input and output constraints: Define constraints on acceptable input and output ranges. This
helps prevent the model from generating false or malicious responses.
 Ethical guidelines and governance: Establish ethical guidelines for the use of LLMs and
implement governance practices to ensure responsible deployment. Regularly evaluate and
update these guidelines to address emerging threats and ethical considerations.
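
The data-augmentation side of adversarial training can be sketched as follows; the perturbation (random adjacent-character swaps) and the tiny dataset are stand-ins for real adversarial example generation against the actual model.

    import random

    def perturb(text: str, n_swaps: int = 2) -> str:
        # Swap a few adjacent characters to mimic a small adversarial edit.
        chars = list(text)
        for _ in range(n_swaps):
            i = random.randrange(len(chars) - 1)
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)

    train_set = [("claim your free money", "malicious"),
                 ("the meeting is at noon", "benign")]
    # Train on originals plus perturbed copies with the same labels, so the
    # model also learns the right answer for attack-like inputs.
    augmented = train_set + [(perturb(t), label) for t, label in train_set]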

MISINFORMATION AND DISINFORMATION:

 Filtering and content moderation: Implement content-filtering mechanisms to recognize and
remove false information. Use predefined rules, heuristics, or machine learning models to flag or
block content that raises concerns; a rule-based sketch follows this list.
 Source verification: Evaluate the credibility of information sources. Give users details
about the information's origins and encourage them to verify facts against reputable,
trustworthy sources.
 User feedback system: Establish a framework through which users can report false
information. Use this input to address potential issues and improve the accuracy of the
chatbot's responses.
 Transparency and explainability: Make the chatbot's decision-making process more transparent
and understandable. Provide justification for the sources of information or the reasoning behind
responses, helping users better evaluate the reliability of the information.
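
A rule-based moderation pass might look like the sketch below; the rules are illustrative heuristics, and a real deployment would combine such rules with machine learning classifiers and human review, as described above.

    # Illustrative heuristics; phrase lists are assumptions for this sketch.
    RULES = [
        ("unsupported certainty", ["it is proven that", "everyone agrees"]),
        ("medical claim", ["cures", "guaranteed treatment"]),
    ]

    def moderate(response: str):
        hits = [name for name, phrases in RULES
                if any(p in response.lower() for p in phrases)]
        if hits:
            return None, hits   # block the response or route it to review
        return response, []     # pass the response through unchanged

    text, flags = moderate("It is proven that this herb cures colds.")
    print(text, flags)          # None ['unsupported certainty', 'medical claim']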

CONCLUSION: The LLM chatbot faces potential threats such as data breaches, SQL
injections, adversarial attacks, and misinformation and disinformation. To mitigate the risk of
data breaches, strong encryption, ongoing monitoring, and data minimization are
indispensable. Input validation, parameterized queries, and education can all be used to
counter SQL injections. Defending against adversarial attacks incorporates ethical standards,
ensemble techniques, and adversarial training. Filtering, source verification, user feedback
mechanisms, and transparency are all crucial in the fight against disinformation. In today's
technological environment, putting these mitigation techniques into practice is crucial to
guaranteeing the secure, dependable, and ethical use of LLM chatbots.

