294 Research Paper
Devansh Bansal
Computer Engineering Department, San Jose State University,
San Jose, USA
Email: devansh.bansal@sjsu.edu
Abstract: Within the dynamic realm of natural language processing (NLP) and artificial intelligence (AI), large language models (LLMs) have emerged as revolutionary instruments that have increased human-AI interaction manifold. These models have an exceptional capacity to produce text that closely resembles human language, using input prompts as a basis. They can understand information related to almost any domain and provide suitable and correct responses, e.g., interpreting mathematical formulas, writing code, and summarizing text. However, the crucial factor in realizing their complete capabilities and guaranteeing smooth collaboration between humans and AI lies in the field of prompt engineering, which combines both artistic and scientific elements. The goal of this study is to delve into the world of LLMs, understand how they work, and examine how prompts affect performance in terms of correctness and speed. We examine different prompt engineering methods and propose a layered architecture approach toward prompting foundational models such as GPT and DALL-E.

These apps and services essentially provide three main functionalities:

1. Text understanding
2. Content generation
3. Reasoning

Currently, researchers are trying to solve two main problems in this domain. The first is that these models are very expensive to train, and the training data strongly affects their performance; it is therefore crucial that the data is clean in all aspects, being neither biased nor incorrect. The second problem is that once these models are trained, the only way to interact with them is via text prompts or attachments comprising text files.

Consider the following two prompts given to ChatGPT running on GPT-3.5.
B. Problem Statement

In today's AI landscape, prompts play a crucial role. Large Language Models (LLMs) tend to respond differently to questions depending on who is asking. This variability can lead to different user perceptions of the model's performance. We see a lot of potential in these models, and there are many interesting ways to use them if we make user interactions simpler and more intuitive. Ideally, the model should accurately understand and respond to user questions in a way that feels human. To achieve this, it is crucial to understand how these models function, how they react to different inputs and prompts, and which algorithmic tweaks are needed to improve response accuracy and speed.

This study aims to achieve the following main objectives:

● Comprehend the workings and behavior of Large Language Models (LLMs) across diverse scenarios.
● Explore the applications of LLMs to gain a comprehensive understanding of their practical uses.
● Investigate the interaction dynamics between prompts and LLMs.
● Analyze strategies for crafting improved prompts to enhance model responses.
● Propose guidelines for architecture and design to facilitate the development of applications that optimize prompt-based interactions.

C. Prompting techniques and model behavior

In this section, we conduct an in-depth examination of various prompting techniques proposed by fellow researchers. A noteworthy study in this domain focused on prompt engineering tailored for text-to-image generative AI models. The authors undertook a comprehensive exploration, executing five distinct experiments that varied prompts, random seeds, optimization lengths, styles, and subjects. Notably, their findings indicated that incorporating prompt seeds and iteration lengths into the final prompts resulted in improved model responses. Moreover, the study highlighted that diverse styles were interpreted distinctively by the models, revealing nuances in the models' responsiveness. The researchers subsequently offered insightful prompt guidelines, aiming to inform the design of applications utilizing such models. [2]

Our research is based on the assumption that the adoption of AI in different spheres of human life will increase. The mobile apps that we use today will be taken over by AI agents that provide tailored experiences to users.

D. Motivation

In today's world, AI is significantly changing experiences and systems across different spheres, and in a couple of years this interaction between humans and AI will increase much more. It is important that everyone is aware of this technology and how to leverage it for their own use cases. LLMs represent a fundamental shift in how we interact with intelligent systems, and it is critical that we understand how they work to ensure their full utilization. In this project, we scope our goals to understanding these models from a prompting perspective and providing a set of guidelines for users so that they can write better prompts and get better results.

We propose an iterative prompt optimization pipeline with reinforced active learning for automated generation and testing customized to target outcomes. A model-agnostic waveform controller architecture updates prompts based on modular validation signals, allowing rapid tuning without human input. Before discussing the architecture and the implementation plan, let us first discuss the foundational models that we will use interchangeably under the hood.
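The proposed optimize-validate loop can be sketched in a few lines. The `mutate_prompt` rewriter and both validators below are hypothetical stand-ins for illustration only; the paper does not specify their implementations, and a real pipeline would call an actual LLM instead of the stub `model`.

```python
def mutate_prompt(prompt, weakest_signal):
    """Hypothetical prompt rewriter: appends a clarifying instruction
    chosen by whichever validation signal scored lowest."""
    hints = {
        "context": "Include all relevant background in the question.",
        "format": "Answer as a short, structured list.",
    }
    return prompt + " " + hints.get(weakest_signal, "")

def optimize_prompt(prompt, model, validators, max_iters=5, threshold=0.9):
    """Iteratively refine a prompt using modular validation signals.
    `model` maps a prompt to a response; each validator maps a
    response to a score in [0, 1]. No human input is required."""
    for _ in range(max_iters):
        response = model(prompt)
        scores = {name: v(response) for name, v in validators.items()}
        if min(scores.values()) >= threshold:
            break  # every validation signal is satisfied
        weakest = min(scores, key=scores.get)  # weakest signal drives the update
        prompt = mutate_prompt(prompt, weakest)
    return prompt

# Toy demonstration with a stub model and two validators
model = lambda p: p  # stand-in for a real LLM call
validators = {
    "context": lambda r: min(1.0, len(r) / 80),  # crude proxy: longer = more context
    "format": lambda r: 1.0,
}
final = optimize_prompt("Summarize the report.", model, validators)
```

The controller itself is model-agnostic: swapping the stub `model` for any LLM endpoint leaves the loop unchanged.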
A. GPT (Generative Pre-trained Transformer):

GPT, a pioneering model in the landscape of language models, has significantly contributed to advancements in prompt engineering for Large Language Models (LLMs). Developed by OpenAI, GPT employs the transformer architecture and is pre-trained on massive datasets, enabling it to generate coherent and contextually relevant text. Its ability to understand and respond to natural-language prompts has made it a cornerstone in the field. Researchers have extensively explored techniques to fine-tune GPT for specific tasks, highlighting its flexibility in adapting to diverse applications.

B. DALL-E:

DALL-E, an innovative model by OpenAI, extends the capabilities of generative models into the realm of images. Unlike traditional language-centric LLMs, DALL-E operates as a generative model for visual content. By conditioning on textual prompts, DALL-E generates images that correspond to the given descriptions. Understanding and manipulating prompts in the visual domain presents unique challenges, and DALL-E's architecture offers insights into the intersection of language and image prompt engineering.

C. CLIP (Contrastive Language-Image Pre-Training):

4.1 Working of LLMs

Large Language Models (LLMs) represent a transformative leap in natural language processing, with their intricate architectures enabling them to comprehend and generate human-like text. A cornerstone in this domain is the transformer architecture, widely adopted in models like GPT (Generative Pre-trained Transformer). This architecture relies on self-attention mechanisms, allowing the model to weigh the significance of different words in a sequence, fostering improved contextual understanding.

Pre-training is a crucial phase in the development of LLMs. It involves training the model on massive corpora to learn linguistic patterns and contextual relationships. This process equips the model with a broad understanding of language, enabling it to generate coherent text across various contexts. GPT exemplifies this pre-training paradigm, leveraging a diverse range of internet texts to acquire language proficiency.

Fine-tuning is the subsequent step, tailoring the pre-trained model for specific tasks. This phase involves exposing the model to task-specific datasets, allowing it to adapt its knowledge to particular domains. Researchers have explored diverse fine-tuning strategies to enhance the performance of LLMs in applications such as text completion, summarization, and question-answering.
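The self-attention mechanism described above can be illustrated with a minimal NumPy sketch of scaled dot-product attention. For brevity this single-head version attends over raw token embeddings; real transformers apply learned projection matrices W_Q, W_K, and W_V, and stack many such heads and layers.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention over token
    embeddings X, shape (tokens, dim). Queries, keys, and values are
    the raw embeddings here (no learned projections, for brevity)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # token-pair similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X                               # context-mixed embeddings

# Three toy token embeddings of dimension 4
X = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0]])
out = self_attention(X)  # each row now blends information from all tokens
```

Each output row is a convex combination of the input rows, which is exactly how the model "weighs the significance of different words in a sequence."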
and tailored responses. This augmented layer acts as a crucial intermediary step, fine-tuning the user's input to better align with the nuanced requirements of the LLM and, consequently, fostering improved performance in various application scenarios.

More formally:

R_t = α·R'_t + (1 − α)·R̃_t

where R_t is the reward at time t, R'_t is the Monte Carlo rollout estimate, R̃_t is the exponentially smoothed long-term reward, and α is a tuned weighting coefficient. This framework outperforms prior programmatic optimization techniques across the breadth of language use cases while maximizing safety.
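The reward blend is a direct weighted average and can be computed in a few lines. The numeric values and the smoothing factor `beta` below are illustrative assumptions, not taken from the text:

```python
def blended_reward(rollout_estimate, smoothed_reward, alpha):
    """R_t = alpha * R'_t + (1 - alpha) * R~_t: a weighted blend of the
    Monte Carlo rollout estimate and the exponentially smoothed
    long-term reward."""
    return alpha * rollout_estimate + (1 - alpha) * smoothed_reward

def update_smoothed(prev_smoothed, latest_reward, beta=0.9):
    """Exponential smoothing of the long-term reward; beta is a
    hypothetical smoothing factor, not specified in the text."""
    return beta * prev_smoothed + (1 - beta) * latest_reward

r = blended_reward(rollout_estimate=0.8, smoothed_reward=0.5, alpha=0.7)
# 0.7 * 0.8 + 0.3 * 0.5 = 0.71
```

A larger α trusts the fresh rollout estimate more; a smaller α leans on the smoothed long-term signal.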
into the decision-making process of LLMs, particularly in critical domains such as healthcare and legal applications.

In conclusion, users should follow a four-step approach when formulating prompts:

1. Set up as much context as possible in the question.
2. Use correct spelling and grammar (to help prevent hallucination).
3. Start small and gradually build up the conversation (chain of reasoning).
4. Imagine you are talking to a human, not an AI.

References

6. L. Beurer-Kellner, M. L. Fischer and M. Vechev, "Prompting Is Programming: A Query Language for Large Language Models," Proceedings of the ACM on Programming Languages, vol. 7, no. PLDI, pp. 1946-1969, 2023, doi: 10.1145/3591300.

7. M. Kuzlu, Z. Xiao, S. Sarp, F. O. Catak, N. Gurler and O. Guler, "The Rise of Generative Artificial Intelligence in Healthcare," 2023 12th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 2023, pp. 1-4, doi: 10.1109/MECO58584.2023.10155107.

8. J. Mange, "Effect of Training Data Order for Machine Learning," 2019 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 2019, pp. 406-407, doi: 10.1109/CSCI49370.2019.00078.
1. M. K. Pehlivanoğlu, M. A. Syakura and N. Duru, "Enhancing Paraphrasing in Chatbots Through Prompt Engineering: A Comparative Study on ChatGPT, Bing, and Bard," 2023 8th International Conference on Computer Science and Engineering (UBMK), Burdur, Turkiye, 2023, pp. 432-437, doi: 10.1109/UBMK59864.2023.10286606.

11. Y. Miao, S. Li, J. Tang and T. Wang, "MuDPT: Multi-modal Deep-symphysis Prompt Tuning for Large Pre-trained Vision-Language Models," 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 2023, pp. 25-30, doi: 10.1109/ICME55011.2023.00013.