See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/373809003

Ten Simple Rules for Crafting Effective Prompts for Large Language Models

Article in SSRN Electronic Journal · January 2023


DOI: 10.2139/ssrn.4565553


1 author:

Zhicheng Lin
University of Science and Technology of China
61 PUBLICATIONS 1,150 CITATIONS




1
2
3
4 Ten Simple Rules for Crafting Effective Prompts for Large Language Models
5
6
7 Zhicheng Lin
8 The Chinese University of Hong Kong, Shenzhen
9 Email: zhichenglin@gmail.com
10
11
12
13 Author Note
14 Correspondence should be addressed to Zhicheng Lin (zhichenglin@gmail.com), PhD, The
15 Chinese University of Hong Kong, Shenzhen, China 518172
16
17 Acknowledgments: The writing was supported by the National Key R&D Program of China
18 (STI2030-Major Projects+2021ZD0204200), National Natural Science Foundation of China
19 (32071045), and Shenzhen Fundamental Research Program (JCYJ20210324134603010).
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46

1

Standfirst (abstract)
As large language models (LLMs) become increasingly integrated into research settings, the ability to interact effectively with these models becomes crucial. This article sets forth ten rules—from understanding LLM capabilities to the art of asking well-structured questions—that serve as a foundational guide for crafting prompts that maximize LLM utility.

Introduction
Why large language models (LLMs)?
LLMs employ deep learning—a subset of artificial intelligence (AI) that emulates the neural networks of the human brain (refer to Box 1 for a glossary and Box 2 for an LLM primer)—to generate human-like text in response to user queries, or "prompts." These models excel in a wide array of language-based tasks, ranging from rudimentary sentence completion to intricate problem-solving. Within a short span of time, LLMs have become a ubiquitous presence in the technological landscape (for a comparative analysis of leading LLMs, see Table 1). Their influence transcends the boundaries of computer science, permeating diverse disciplines including the behavioral, social, and biomedical sciences. In writing, programming, and visualization, they are rapidly becoming indispensable for both researchers and professionals [1].

Why prompts?
Interacting with LLMs may seem deceptively simple: just type a question and get an instant answer! However, effective engagement with these models proves to be more challenging and nuanced than it initially seems [2]. This imposes a substantial limitation on the utility of LLMs, as the quality of their output is directly tied to the quality of the prompts given—a consideration often overlooked in current discussions about the utility of LLMs. A well-crafted prompt can elicit a detailed, accurate, and contextually relevant response, thereby maximizing the model's performance. Conversely, a poorly structured one may result in a vague, incorrect, or irrelevant answer. This limitation stems in part from the inherent weakness of LLMs: despite their sophisticated algorithms and voluminous training data, they lack a true understanding of the world and rely on pattern recognition within that data. Another key factor is the models' capacity for in-context learning, which allows them to adapt temporarily to the prompts they receive, rendering these prompts crucial for conveying contextual information.

Thus, mastering the art and science of formulating effective prompts—sometimes termed "prompt engineering"—is essential for leveraging the capabilities of LLMs. Achieving optimal results requires a blend of domain-specific knowledge, model understanding, and skill. This article outlines ten simple rules to lay the foundation for mastery.

Rule 1: Understand the capabilities, limitations, and transparency issues of LLMs
LLMs are not silver bullets; they are powerful tools with specific strengths and weaknesses. Unlike traditional software that relies on predetermined algorithms, LLMs can interpret natural language commands and excel at many tasks, making them user-friendly and versatile. But they may produce different results even when given the same command, which makes them inconsistent. Understanding these nuances is crucial for their effective deployment. Below, we outline three key aspects to consider:

Capabilities:
• Intelligence: LLMs excel in tasks like summarizing articles, answering queries, and generating human-like text. Their performance often rivals or even surpasses human capabilities in these domains.
• Versatility: Trained on a plethora of data sources, LLMs can be applied to a wide range of tasks and disciplines, from natural language processing to more specialized scientific queries.
• Collaborative nature: These models are designed for iterative interaction, allowing for a more nuanced and context-aware dialogue through conversational prompts and feedback.

Limitations:
• Hallucination: LLMs can make statements that are confident-sounding but incorrect or misleading. While advanced techniques like retrieval-augmented generation (RAG) can mitigate this issue—by pulling relevant information from a database or corpus to inform the response [3]—hallucination remains a persistent concern. Always verify the output, especially when the stakes are high.
• Contextual fragility: Despite their capacity for in-context learning, LLMs can struggle to retain context in extended conversations, which can compromise the quality of their responses (see "context length" in Table 1).
• Biases, ethics, and security: LLMs are a mirror to the data they are trained on, reflecting both its wisdom and its biases. Be cautious of underrepresentation and other data-related biases. The potential for misuse is also real, particularly in sensitive areas like academic publishing, where sophisticated fake research threatens to corrupt the literature. Security is another major concern, particularly when using LLMs as autonomous agents: malicious instructions can be embedded within the prompts ("prompt injection"), potentially causing the models to execute unintended actions.

Transparency and reproducibility:
• Dynamic nature: The web-based, evolving nature of LLMs poses challenges to the reproducibility of results. Different versions may yield different outputs.
• Stochastic variability: The same prompt can yield different outputs due to the stochastic processes in the algorithms, adding another layer of complexity to reproducibility.
• Documentation: Given the iterative and evolving use of LLMs in a research project, capturing every interaction for the sake of transparency can be daunting (see "chat-history sharing" in Table 1).

In light of these capabilities and limitations, the art of crafting effective prompts becomes a nuanced skill—a skill that must be honed through learning and experience. Play with the models (see Rule 10). The more you interact with a model, the better you'll understand its nuances. To ensure accuracy and transparency in your work, verify the model's output and stay abreast of the latest best practices in AI documentation.
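
For the documentation concern in particular, a little automation goes a long way. The sketch below logs each prompt-response pair to a JSON Lines file; the function name and record fields are illustrative (not a standard), and you would capture the response text from whatever interface you use:

```python
import json
import datetime


def log_interaction(log_file, model, prompt, response, temperature=None):
    """Append one prompt-response pair to a JSON Lines file.

    Each line is a self-contained JSON record, so the log can later be
    inspected or shared to document how results were obtained.
    """
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model": model,              # model name and, ideally, version
        "temperature": temperature,  # sampling setting, if known
        "prompt": prompt,
        "response": response,
    }
    with open(log_file, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Usage: record each exchange as it happens
log_interaction("llm_log.jsonl", "gpt-4", "Summarize study X.", "The study found...")
```

Appending one record per line keeps the log readable and shareable even when a project accumulates hundreds of interactions.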

Rule 2: Stick to one question at a time
When playing with the models, resist the temptation to bombard them with a barrage of questions in a single prompt. Instead, adopt a disciplined approach: ask one question at a time—an etiquette that speakers at podiums would readily appreciate. Just as a single hook is more likely to catch what you are aiming for than a tangled mess of multiple hooks, the precision of a solitary question is your friend when fishing for insights. Thus, start with your main question. Once satisfied, proceed to ask subsequent questions in sequence, as you would naturally in conversation (see Rule 3). This focused, stepwise approach carries several advantages:

• Minimize ambiguity by allowing the model to focus solely on one task.
• Lessen unintended biasing effects between related questions.
• Allow you to scrutinize the accuracy of each answer individually.
• Facilitate the building of a logical line of inquiry.

Of course, exceptions exist—such as when asking a set of questions that are not interrelated. But generally, treat prompts with single-minded purpose. Much as cognitive load diminishes human performance, piling on multiple questions hinders these models. A focused, conversational flow allows them to shine.

Rule 3: Start simple, then refine
Start your interaction with a simple, clear prompt. A complex prompt risks confusing more than illuminating. Aim to distill your query down to its essence. Once the model demonstrates comprehension, build on that foundation iteratively. This is akin to using a search engine: you start with a broad query, inspect the results, and then add keywords to filter toward relevance. With LLMs, your "keywords" are follow-up prompts guiding the model toward the desired response. This iterative approach offers several benefits:

• Avoid over-specification when objectives are not yet fully defined.
• Enable course-correction of initial misses.
• Incrementally build context for more nuanced interactions.
• Foster a conversational dynamic.

Optimal prompts often emerge through a pivoting path. Starting simple enables efficient tuning through dialogue.

Rule 4: Add relevant context
LLMs have no inherent knowledge beyond their training data. They rely on the contextual specifics you provide as a guiding compass to generate nuanced, relevant responses. A well-framed question should:

• Clarify jargon: Make explicit any specialized terminology or concepts relevant to your query. For example, instead of asking the LLM to explain RLHF, spell the term out ("reinforcement learning with human feedback").
• Embed specifics: Root your query in specific details to guide the LLM toward a more accurate, relevant interpretation. Thus, instead of asking it to draft a cover letter, provide it with the job ad and your CV to add relevant context.
• Prioritize evidence: Ground your interactions in relevant factual information. Rather than asking the model about the best way to achieve eternal happiness, provide it with a peer-reviewed study and ask it questions based on those findings.

The aim is not to inundate the LLM with general knowledge but to prime it with the particularities pertinent to your question. When queries brim with relevant details, LLMs generate more insightful, nuanced responses.
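
In practice, adding context often amounts to assembling the background material into the prompt itself. The cover-letter example above can be sketched as a small template; the function and the template wording are illustrative, not a prescribed format:

```python
def build_cover_letter_prompt(job_ad: str, cv: str) -> str:
    """Assemble a context-rich prompt rather than a bare request.

    The job ad and CV give the model the specifics it cannot know
    on its own, anchoring the draft in your actual background.
    """
    return (
        "Draft a one-page cover letter for the job described below, "
        "drawing only on the experience listed in the CV.\n\n"
        f"Job ad:\n{job_ad}\n\n"
        f"CV:\n{cv}"
    )


# Usage: the same template works for any job ad and CV pair
prompt = build_cover_letter_prompt("Postdoc in cognitive science...", "PhD, 10 papers...")
```

Templating context this way also makes your prompting reproducible: the same structure, filled with different specifics, yields comparable requests across sessions.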

Rule 5: Be explicit in your instructions
To get your favorite drink ("a large hot latte with oat milk" if you are like me), you don't walk into a random coffee shop and order "A cup of coffee, please!" Don't expect LLMs to read your mind either. Imprecise requests risk off-target responses as the LLM grasps for intent. Clarity is key. Specify exactly what you want. Instead of "Revise the text," consider a more explicit instruction. What stylistic approach are you aiming for? Who is your intended audience? Do you have a particular focus, like clarity or brevity? Another example: rather than asking for name suggestions, be explicit about your constraints: "The name must start with a verb, and the implicit subject/actor is the user." Hence the rule: try to be explicit in stating the task, its objectives, the desired emphasis, and any constraints (see also Rule 9). Explicit instructions help:

• Minimize ambiguity about the task and its scope.
• Enable the LLM to concentrate its capabilities on your specific needs.
• Lessen unintended biasing effects from vague associations.
• Provide clear criteria to judge the model's accuracy.

While LLMs are designed for conversational refinement, explicit instructions can streamline the process by clearly declaring your aims upfront. Steer the dialogue by articulating your purpose and constraints.
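
The "Revise the text" example can likewise be made systematic: spell out the style, audience, focus, and constraints as explicit slots. A minimal sketch (the function and phrasing are illustrative):

```python
def build_revision_prompt(text, style, audience, focus, constraints=()):
    """Turn a vague 'Revise the text' into an explicit instruction.

    Each slot (style, audience, focus, constraints) forces you to
    state what you actually want, instead of leaving it implicit.
    """
    lines = [
        f"Revise the text below in a {style} style for {audience}.",
        f"Prioritize {focus}.",
    ]
    lines += [f"Constraint: {c}" for c in constraints]
    lines += ["", "Text:", text]
    return "\n".join(lines)


# Usage: every preference is stated upfront, not discovered by trial and error
prompt = build_revision_prompt(
    "Our results was significant.",
    style="concise academic",
    audience="journal reviewers",
    focus="clarity and brevity",
    constraints=["keep under 100 words", "use active voice"],
)
```

Listing constraints one per line also gives you a checklist for judging the model's answer against your own criteria.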

Rule 6: Ask for lots of options
To exploit the potential of LLMs, ask for an array of options rather than a single suggestion. Request three analogies to explain a concept, five ideas to begin the introduction, 10 alternatives to replace the final paragraph, 20 candidate names for a function—let the model serve up food for thought and choose from the buffet. Asking for lots of options offers several advantages:

• Encourage the model to explore multiple avenues, enhancing the creativity and diversity of the output.
• Provide you with a comprehensive set of options, minimizing the risk of settling for a suboptimal or biased suggestion.
• Facilitate iterative refinement.

Treat LLMs as a versatile ideation partner. Asking for plentiful options, from myriad angles, enriches your decision process. Abundant choice unlocks maximal utility.

Rule 7: Assign characters
LLMs are capable of simulating various roles to offer specialized feedback or unique perspectives. Instead of asking for generic advice or information, consider instructing the model to role-play. It can act as a typical member of your audience to provide feedback on the writing, as a writing coach to help revise the manuscript, as a Tibetan yak specialized in human physiology to explain the impact of high altitudes, or as a sentient cheesecake explaining in cheesecake analogies—the possibilities are endless. Assigning characters provides several benefits:

• Contextualize the model's responses, making them more relevant to your specific needs.
• Allow for a more interactive and engaging dialogue with the model.
• Yield more nuanced and specialized information, enhancing the quality of the output.
• Provide a creative approach to problem-solving, encouraging out-of-the-box thinking.

Personas provide framing to yield responses from unique vantage points. By leveraging the role-playing capabilities of LLMs, you can obtain more targeted and contextually appropriate responses—and have more fun in the process.

Rule 8: Show examples, don't just tell
When interacting with LLMs, specificity is your ally (see Rules 4, 5, and 9). A particularly effective approach is to embody your intent with concrete examples, as LLMs are adept at few-shot learning—learning from examples [4]. Rather than a vague "Create a chart for this data," provide an example: "Create a bar chart for this data, similar to the one in Figure 3 of the attached paper." Just as showing a picture to a hairstylist is far superior to trying to describe your desired haircut, providing explicit examples—whether a code snippet for a programming query or a sample sentence for a writing task—serves as an invaluable guide for the model. By providing a tangible reference, you accomplish several objectives:

• Clarify the context, enabling the LLM to better grasp the nuances of your request.
• Reduce the number of iterations needed to achieve the desired output.
• Offer a benchmark against which to evaluate the model's output.

Examples act as a roadmap for the LLM, guiding it toward generating responses that are closely aligned with your expectations. Consider supplementing your instructions with illustrative examples to catalyze performance.
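
Few-shot prompting can itself be mechanized: a template that prepends worked examples before the actual query. A minimal sketch, where the function, the instruction, and the example pairs are all illustrative:

```python
def few_shot_prompt(examples, query, instruction):
    """Build a few-shot prompt: instruction, worked examples, then the query.

    Each (input, output) pair shows the model exactly what form
    the answer should take, rather than describing it abstractly.
    """
    parts = [instruction, ""]
    for inp, out in examples:
        parts += [f"Input: {inp}", f"Output: {out}", ""]
    # The query is left with an open "Output:" for the model to complete
    parts += [f"Input: {query}", "Output:"]
    return "\n".join(parts)


# Usage: two worked examples, then the case you actually want answered
prompt = few_shot_prompt(
    examples=[
        ("The mitochondria is the powerhouse of the cell.",
         "mitochondria -> energy production"),
        ("Ribosomes synthesize proteins.",
         "ribosomes -> protein synthesis"),
    ],
    query="The nucleus stores genetic material.",
    instruction="Extract the organelle and its function as 'organelle -> function'.",
)
```

Ending the prompt with an open "Output:" line is a common few-shot convention: the model's most natural continuation is an answer in the same format as the examples.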

Rule 9: Declare your preferred response format
LLMs tend to be verbose. Specifying your desired format—bullet points, reading level, tone, and so on—helps constrain the possible outputs, improving relevance. For example, instead of "Summarize the key findings," declare the response format: "Summarize the key findings in bullet points and use language a high school student would understand." Declaring the format upfront also provides clear criteria for evaluating LLM performance. Some options:

• Bullet points for summarizing concisely.
• A casual tone for accessibility.
• Code comments for documentation.
• A restricted response length for conciseness, or reading level for comprehension.

Declaring your preferred format sets clear expectations to streamline prompting. Constraints foster relevance.
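
A format declaration becomes especially useful when the output feeds downstream code: you can state the format in the prompt and then check the model's response mechanically. A sketch, assuming a hypothetical JSON format of your own choosing (the field names and the sample response are made up):

```python
import json

# The format stated in the prompt, so the model knows the target shape
FORMAT_SPEC = (
    "Summarize the key findings as JSON with exactly two fields: "
    '"findings" (a list of short strings) and '
    '"confidence" ("low", "medium", or "high").'
)


def parse_summary(response_text):
    """Verify that a model response obeys the declared format."""
    data = json.loads(response_text)
    assert set(data) == {"findings", "confidence"}, "unexpected fields"
    assert isinstance(data["findings"], list), "findings must be a list"
    assert data["confidence"] in {"low", "medium", "high"}, "bad confidence value"
    return data


# A well-formed (hypothetical) response passes; a malformed one raises an error
summary = parse_summary('{"findings": ["Effect replicated"], "confidence": "medium"}')
```

The declared format thus doubles as an acceptance test: responses that cannot be parsed are sent back for another iteration rather than silently accepted.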

Rule 10: Experiment, experiment, experiment!
Effective prompting is not formulaic; small tweaks can sometimes yield dramatic differences. Consider two demonstrations.

• Demonstration one: Across a range of reasoning tasks, simply adding the instruction "let's think step by step" to the prompt in GPT-3 leads to much-improved performance [5]—a form of Chain of Thought (CoT) prompting.
• Demonstration two: While LLMs may falter in direct queries involving complex calculations, they shine in generating functional computer code that solves the same problems.
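
Demonstration two can be made concrete. Asked to compute compound interest directly, a model may produce a plausible but wrong number; asked to write code, it can produce something like the sketch below, which computes the answer exactly (this is an illustration of the kind of code a model might generate, not output from any particular model):

```python
def compound_amount(principal, annual_rate, years, compounds_per_year=12):
    """Exact compound-interest calculation: the kind of arithmetic
    LLMs often get wrong when asked to do it 'in their head'."""
    periods = compounds_per_year * years
    rate_per_period = annual_rate / compounds_per_year
    return principal * (1 + rate_per_period) ** periods


# Running the generated code gives deterministic arithmetic,
# unlike a model's token-by-token numeric guess
amount = compound_amount(10_000, 0.05, 30)
print(round(amount, 2))
```

The prompt shift is small ("write code that computes X" instead of "compute X"), but it moves the hard part from the model's pattern-matching to an actual interpreter.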

The two cases demonstrate just how sensitive LLMs are to prompts. LLMs are "weird." You must therefore experiment, experiment, experiment! Productive use of LLMs requires ongoing, creative experimentation. Consider:

• Vary phrasings, lengths, specificity, and constraints.
• Toggle between different examples, contexts, and instructions.
• Attempt both conversational and concise declarative prompts.
• Try the same prompts on different LLMs.

Treat prompts as testable hypotheses. Use results to inform iterations. Not all attempts will succeed, but evidence accrues with each. With tenacity, optimal results will emerge.

Conclusion
LLMs stand as unparalleled ideation partners in the realm of natural language tasks. Yet their utility hinges on the art of effective prompting. This article presents ten simple rules for crafting effective prompts, serving as a foundational guide to unlocking the full potential of LLMs. The rules establish core principles for success: know the models' capabilities and limitations; ask focused, one-at-a-time questions; start simple and refine iteratively. Central themes encompass the importance of framing with relevant details, explicitly declaring aims, and illustrating intent through examples. Additional recommendations include assigning personas and requesting diverse options to tap into the versatility of LLMs; specifying format to set expectations; and engaging in continuous experimentation for optimal outcomes. As LLMs continue their rapid advancement, these rules provide a roadmap for users to extract maximal utility. While the skill of effective prompting may not lead to eternal happiness, it promises to pay increasing dividends in productivity—and maybe happiness too.

References

1. Lin, Z. Why and how to embrace AI such as ChatGPT in your academic life. Royal Society Open Science 10, 230658, doi:10.1098/rsos.230658 (2023).
2. Zamfirescu-Pereira, J. D., Wong, R. Y., Hartmann, B. & Yang, Q. in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems 1-21 (2023).
3. Lewis, P. et al. in Advances in Neural Information Processing Systems Vol. 33, 9459-9474 (2020).
4. Brown, T. et al. in Advances in Neural Information Processing Systems Vol. 33, 1877-1901 (2020).
5. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y. & Iwasawa, Y. in Advances in Neural Information Processing Systems Vol. 35, 22199-22213 (2022).

Table 1. Comparison of major leading LLMs as of September 2023

Feature | GPT-3.5/GPT-4 (ChatGPT by OpenAI; Bing by Microsoft) | Claude 2 (by Anthropic) | PaLM 2 (Bard by Google AI) | LLaMA 2 (by Meta AI)
Support for file input* | Limited (via plugins) | Text files | Image files | None
Context length (in tokens) | 4,096/8,192 | 100,000 | 8,000 | 4,096
Internet access | Limited (via plugins or Microsoft Bing) | No | Yes | No
Editable prompts after execution | Yes | No | Yes | No
Support for third-party plugins | Yes | No | No | No
Availability of global settings | Yes | No | No | Yes
Chat-history sharing | Yes | No | Yes | No
Subscription requirement | GPT-4 only | No | No | No
Open-source status | No | No | No | Yes#

Note: Access to these LLMs may be restricted in certain regions (e.g., Claude is currently only accessible in the US and the UK). Additionally, some features may not be available in all regions. Among the listed LLMs, GPT-4 is the largest and most powerful but is trained on data only up to September 2021.

* ChatGPT-4's "Advanced Data Analysis" mode allows uploading of files for data analysis and code execution.
# With restrictive clauses (not permitted for training other language models; special license required for large-scale applications).

ChatGPT: https://chat.openai.com
Claude: https://claude.ai
Bard: https://www.google.com/bard
LLaMA: https://www.llama2.ai

Box 1. Glossary

Application programming interface (API): A set of rules for software entities to communicate with each other.
Artificial intelligence (AI): The simulation of human intelligence in machines.
Biases: Prejudices in machine learning models that arise from the data they are trained on.
Chain of Thought (CoT) prompting: A technique to improve a model's reasoning capabilities by adding specific instructions like "let's think step by step."
Context length: The maximum number of tokens that a model can consider from the conversation history.
Deep learning: A subfield of AI that mimics the neural networks of the human brain to analyze data.
Hallucination: Incorrect or misleading statements generated by LLMs.
Large language models (LLMs): Machine learning models trained on vast datasets to perform language-based tasks.
Neural networks: Computational models inspired by the human brain, used in machine learning algorithms to solve complex problems.
Parameter: An adjustable weight representing the strength of connections between artificial neurons in a neural network.
Prompt: A user query or instruction that triggers a response from an LLM.
Prompt engineering: The art and science of crafting effective prompts to interact with LLMs.
Prompt injection: Malicious instructions embedded within prompts to make the model perform unintended actions.
Reinforcement learning with human feedback (RLHF): A type of machine learning where models learn to make decisions based on real-time feedback provided by human evaluators.
Retrieval-augmented generation (RAG): A technique to pull relevant information from a database to inform the model's response.
Self-attention: A mechanism in transformers that evaluates the relevance of different segments of input text.
Token: The unit of text that is processed by an LLM.
Transformers: A type of deep learning architecture designed to handle sequential data.

Box 2. Understanding and using LLMs

LLMs primarily use deep learning for their initial training to "learn" from large amounts of data (e.g., web pages, academic articles, and books). In particular, they use a type of deep learning architecture called transformers, which is designed to handle sequential data. A key feature of transformers is self-attention, a mechanism that allows the model to evaluate the relevance of different segments of the input text when making predictions to generate output.

Specifically, text is broken down into smaller units called tokens, which can be as small as a character or as large as a word. These tokens are then converted into numerical values, serving as the model's input. Inside the model are a large number of adjustable weights, commonly referred to as parameters. These parameters represent the strengths of connections between artificial neurons in the model's neural network architecture. During training, these parameters are fine-tuned to capture complex linguistic patterns. Being pre-trained on extensive language datasets—such as the 1.4 trillion tokens used for LLaMA 2—enables these models to generate text that is remarkably similar to human language.
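
The token-to-number step can be illustrated with a toy word-level tokenizer. Real LLMs use learned subword schemes (such as byte-pair encoding) with vocabularies of tens of thousands of tokens; the tiny vocabulary below is made up purely for illustration:

```python
def build_vocab(corpus):
    """Assign an integer id to each distinct whitespace-separated token."""
    vocab = {}
    for word in corpus.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def encode(text, vocab, unk_id=-1):
    """Map text to the integer ids the model actually consumes.

    Unknown words get a sentinel id; real tokenizers instead fall
    back to smaller subword pieces so nothing is truly unknown.
    """
    return [vocab.get(word, unk_id) for word in text.split()]


vocab = build_vocab("the cat sat on the mat")
ids = encode("the mat sat", vocab)
print(ids)  # → [0, 4, 2]
```

These integer sequences, not the raw characters, are what pass through the model's layers of weighted connections.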

While deep learning forms the backbone of their training, LLMs also incorporate reinforcement learning from human feedback for further refinement. Specifically, after the initial training phase, these models enter a fine-tuning stage where they interact with human evaluators. The evaluators provide real-time feedback on the model's responses, effectively "rewarding" or "penalizing" the model based on the accuracy and relevance of its output. This reinforcement learning process allows the model to adapt and improve over time, becoming more aligned with human values and expectations.

Advanced LLMs have become invaluable tools for researchers, from summarizing scientific literature to creative brainstorming. Indeed, since the publication of the transformer architecture in 2017, several leading LLMs have emerged that are accessible via web-based conversational interfaces (see Table 1 for a comparison of four leading LLMs). To use these LLMs, users generally need to register on their respective websites. For those seeking more flexible integration, API (application programming interface) access is often available, allowing users to send HTTP requests to the LLM providers' servers to incorporate the models into various applications and services, such as creative writing platforms or translation services. Some providers may require a subscription or use a pay-per-use model.

When choosing an LLM, consider the specific tasks you need the model for, its availability in your region, and your personal preferences (see Table 1). Build intuition by experimenting with different LLMs on various tasks and comparing their outputs. As LLMs are continually being improved, it pays to stay updated with release notes and new features.
