
M.A.R.S

MULTI-MODAL AI
RESEARCH SYSTEM
TEAM
Guide: Ms. Kavitha
ASHLESHA CHOPANE (202101070109)
MAHESHWARI JADHAV (202101040176)
SAHIL BHOITE (202101070121)
School of Computer Engineering
LIST OF CONTENTS

01 PROBLEM STATEMENT
02 OBJECTIVES
03 WORK DONE IN THIS SEMESTER
04 PROPOSED BLOCK DIAGRAM
05 BASIC IMPLEMENTATION
06 SELECTED REFERENCES

PROBLEM STATEMENT
To develop an AI-driven solution for streamlined management
of diverse professional files using a multi-modal approach.

OBJECTIVES

To develop robust file processing capabilities for diverse formats, focusing on efficient text and image
extraction.

To integrate and optimize AI models for tasks such as question answering, content modification, data
analysis, and document similarity.

To create a user-friendly Streamlit interface that enables seamless file interaction and conversational AI.

To implement collaborative features that let users share files and maintain a conversation history.
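
The conversation-history objective can be illustrated with a small data structure. This is a minimal sketch, not the project's code: the `Conversation` class, its `max_messages` limit, and the `as_prompt` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    """Chat history tied to one uploaded file (hypothetical structure)."""

    file_name: str
    max_messages: int = 20  # assumed cap so the prompt stays bounded
    messages: List[Dict[str, str]] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        """Append one message and drop the oldest ones beyond the cap."""
        if role not in ("user", "assistant"):
            raise ValueError("role must be 'user' or 'assistant'")
        self.messages.append({"role": role, "content": content})
        excess = len(self.messages) - self.max_messages
        if excess > 0:
            del self.messages[:excess]

    def as_prompt(self) -> str:
        """Flatten the history into plain text that can be prepended to a query."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)
```

Keeping the history as a list of role/content dictionaries makes it easy to store per user or per shared file, and the cap avoids unbounded prompt growth in long sessions.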

WORK DONE IN THIS SEMESTER
W= F * D

Developed file processing functionality for diverse types, focusing on PDFs and
implementing text extraction for NLP tasks.

Fine-tuned various modules using the Gradient AI library and deployed quantized
Llama-2 and PaLM models for resource-efficient inference.

Explored diverse language models and ML modules, integrating Llama-2, PaLM, and
Hugging Face's T5-XXL for QA and similarity tasks.

Implemented a Streamlit app allowing users to upload PDFs, ask questions, and
receive responses through a conversational interface.
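
The similarity step behind question answering over an uploaded file can be sketched as follows. This is an illustrative example under stated assumptions: the word-count `embed` function is a toy stand-in for a learned embedding model, and `most_similar` is a hypothetical helper name, not part of MARS.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(question: str, chunks: List[str]) -> Tuple[int, float]:
    """Return the index and score of the chunk closest to the question."""
    if not chunks:
        raise ValueError("chunks must not be empty")
    q = embed(question)
    scores = [cosine_similarity(q, embed(c)) for c in chunks]
    best = max(range(len(chunks)), key=scores.__getitem__)
    return best, scores[best]
```

In a real pipeline the selected chunk would be passed to the language model as context for the answer; swapping `embed` for a model-based embedding leaves the rest of the logic unchanged.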

PROPOSED FLOWCHART DIAGRAM

BASIC IMPLEMENTATION: FINE-TUNING MARS MODULE

Gradient AI Integration:
Leveraged Gradient AI for fine-tuning, focusing on QA and similarity tasks.
Deployed quantized Llama-2 models for efficient inference.

Extended Dataset and Fine-Tuning:
Augmented the dataset with diverse queries, including team details.
Fine-tuned the model over multiple iterations to improve its performance.

Testing and Deployment:
Evaluated the model's responses to varied queries after fine-tuning.
Deployed the enhanced model for conversational use in the Multi-modal AI Research System (MARS).
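
The dataset-augmentation step can be sketched as a helper that turns question/answer pairs into fine-tuning samples. This is a sketch under assumptions: the instruction/response prompt template and the `{"inputs": ...}` sample shape are assumed for illustration and should be checked against the fine-tuning service actually used; `build_samples` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Assumed instruction-style template; adjust to the target model's format.
PROMPT_TEMPLATE = "### Instruction: {question}\n\n### Response: {answer}"


def build_samples(qa_pairs: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Format QA pairs as samples, skipping blanks and duplicate questions."""
    samples = []
    seen = set()
    for question, answer in qa_pairs:
        question, answer = question.strip(), answer.strip()
        if not question or not answer:
            continue  # incomplete pairs add noise to fine-tuning
        key = question.lower()
        if key in seen:
            continue  # keep the first answer for a repeated question
        seen.add(key)
        samples.append(
            {"inputs": PROMPT_TEMPLATE.format(question=question, answer=answer)}
        )
    return samples


# Example pair based on the team details given in this deck.
team_samples = build_samples([("Who is the guide for MARS?", "Ms. Kavitha")])
```

Each fine-tuning iteration can extend the list of pairs and rebuild the samples, which matches the iterative workflow described above.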
STREAMLIT INTERFACE

Enabled PDF uploads, used PyPDF2 for text extraction, and implemented text splitting,
with the resulting chunks embedded using Google PaLM embeddings.

Developed user-friendly PDF interaction through a chat-based system.
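
The extraction and splitting steps above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes a PyPDF2 release that provides `PdfReader`, and the `chunk_size` and `overlap` defaults are arbitrary illustrative values. The embedding call is omitted because its exact API is not given in the deck.

```python
from typing import List


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page in a PDF using PyPDF2."""
    # Imported lazily so split_text below works without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # Pages without a text layer may yield an empty result, hence `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks, which tends to help retrieval when a question targets text near a split point.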

LOOKING AHEAD: OUR ROADMAP

Expanding MARS capabilities to handle an even wider array of file types, ensuring comprehensive
coverage for various professional documents.

Enhancing MARS with the ability to extract valuable information directly from images, broadening
its scope to include diverse data sources.

Introducing features for visualizing data through images and graphs, leveraging generative AI to
enhance understanding and presentation.

Enabling MARS to generate and interact with files, including code snippets, documents,
presentations, and more, enhancing its utility for a variety of professional tasks.

CONCLUSION

MARS transforms file management with advanced AI, adept at handling diverse
formats.
Our progress showcases advancements in data processing, collaboration, and model
integration.
Future plans involve refining file processing, further fine-tuning the AI models, adding image
data extraction, and enhancing the user interface.
Our commitment includes innovative methodologies and emerging tech to fully
unlock MARS's potential for efficient and intelligent file management.

REFERENCES
D. Carlander-Reuterfelt et al., "JAICOB: A Data Science Chatbot," IEEE Access, vol. 8, pp. 180672-
180680, Sep. 2020.
L. Labadze, M. Grigolia, and L. Machaidze, "Role of AI chatbots in education: systematic literature
review," Computers & Education: Open Access, vol. 165, pp. 107182, Apr. 2021.
I. V. Serban et al., "A Deep Reinforcement Learning Chatbot," arXiv preprint arXiv:1709.02349, 2017.
L. R. Soenksen et al., "Integrated multimodal artificial intelligence framework for healthcare
applications," Journal of Healthcare Engineering, vol. 2022, pp. 1-12, Feb. 2022.
P. Jung and L. Lee, "A Multimodal Deep Learning Model Using Text, Image, and Code Data for
Improving Issue Classification Tasks," IEEE Transactions on Computational Intelligence and AI in
Medicine, vol. 5, pp. 1-10, Apr. 2023.
J. Chen et al., "AI Framework for Multimodal File Management," IEEE International Conference on Big
Data (Big Data), pp. 6093-6102, Dec. 2023.
J. Lipkova et al., "Artificial intelligence for multimodal data integration in oncology," Artificial
Intelligence in Medicine, vol. 129, pp. 100995, Oct. 2021.
THANKS
