Prompt Engineering Guidelines For Using Large Language Models
1 Chalmers University of Technology, Gothenburg, Sweden
2 University of Gothenburg, Gothenburg, Sweden
1 Introduction
Section 2 presents the relevant background literature that motivates our re-
search goals. Section 3 describes our research design and execution. Section 4
presents the results of our study and answers RQ1, RQ2, and RQ3. We
discuss the implications of the gathered results along with the threats to valid-
ity of our study in Section 5. Finally, in Section 6, we conclude our study and
present potential areas for further research based on the discussed implications.
quality results. “Few-shot prompting” has been observed to improve the LLM
output quality compared to alternative methods Chen et al. [2023], West et al.
[2021]. Conversely, other studies highlight cases where few-shot prompting falls
short, suggesting that its effectiveness depends on specific conditions Yu et al.
[2022], Clavié et al. [2023].
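As a minimal, hedged illustration of the technique, the sketch below shows the shape of such a few-shot prompt for a requirements-classification task; the labels, example requirements, and the classify_requirement helper are our own illustrative assumptions rather than material taken from the cited studies.

```python
# Minimal sketch of a few-shot prompt for labelling requirements as
# functional (F) or non-functional (NF). The examples and labels are
# illustrative assumptions, not taken from the studies cited above.

FEW_SHOT_TEMPLATE = """Classify each requirement as F (functional) or NF (non-functional).

Example 1:
[INPUT] The system shall encrypt all stored user data.
[OUTPUT] NF

Example 2:
[INPUT] The user shall be able to export reports as PDF files.
[OUTPUT] F

Example 3:
[INPUT] {requirement}
[OUTPUT]"""


def classify_requirement(requirement: str, query_llm) -> str:
    """Fill the template and send it to a text-completion LLM client.

    query_llm is assumed to be a callable that takes a prompt string and
    returns the model's completion as a string.
    """
    prompt = FEW_SHOT_TEMPLATE.format(requirement=requirement)
    return query_llm(prompt).strip()
```

The numbered examples and explicit [INPUT]/[OUTPUT] markers follow the formatting conventions reported for few-shot prompts in the literature West et al. [2021].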
Furthermore, research on models like GPT-3, GPT-3.5, and ChatGPT indi-
cates persistent challenges in tasks requiring emotional understanding or mathe-
matical reasoning, regardless of the prompting approach used Yang et al. [2023].
Additionally, these models are prone to confidently generating incorrect facts, a
phenomenon known as hallucinations. Various studies propose mitigation strate-
gies to address these inaccuracies Koralus and Wang-Maścianica [2023]. Recent
studies have also explored applying prompt engineering guidelines to new do-
mains, including SE. These studies often propose domain-specific adaptations
of existing guidelines, expanding the applicability of prompt engineering tech-
niques White et al. [2023a], Yu et al. [2022], Clavié et al. [2023], Chen et al.
[2023], Yang et al. [2023]. These sources, while crucial for shaping and enhanc-
ing our understanding of prompt engineering for LLMs, do not address the
lack of standard prompt engineering guidelines applicable to RE activities,
further strengthening the motivation for our study.
Through the works of Pressman [2005] and Sommerville and Sawyer [1997],
the five main RE activities identified in the
literature are elicitation, analysis, specification, validation, and management.
These activities are crucial for successful product development and face various
challenges. Our mapping of prompt engineering guidelines to RE was done while
focusing primarily on these five activities.
3 Methodology
We based our systematic review process partially on the guidelines for conducting
a systematic literature review Kitchenham and Charters [2007]. To construct a
comprehensive search strategy, we identified relevant keywords through prior
research in related domains and a snowballing approach. Based on our research
4 Results
The results of this study are presented in three subsections, each corresponding
to an RQ. Subsection 4.1 addresses RQ1, presenting a non-exhaustive list of
prompt engineering guidelines. These guidelines are categorised into the nine
most frequently occurring themes identified in our review. Subsection 4.2 ex-
plores the advantages and limitations of applying the identified prompt engi-
neering guidelines in RE based on expert interviews, providing insights neces-
sary for answering RQ2. Subsection 4.3 answers RQ3 by mapping the identified
guidelines to relevant RE activities.
From the 28 included primary studies, we identified and extracted 36 prompt en-
gineering guidelines. These guidelines were categorised into nine distinct themes
based on their defining characteristics. Each theme was defined to capture the
underlying concepts and group similar guidelines, providing a clearer overview
of their overall intent and purpose.
Context The “Context” theme revolves around guidelines that relate, in any
manner, to the contextual information provided in prompts.
Reasoning The “Reasoning” theme captures guidelines that aim to affect rea-
soning capabilities or the ability to think through complex problems or tasks in
a generated output.
Keywords The “Keywords” theme represents guidelines that involve any use of
single-word modifiers in prompts.
Fig. 1. Theme distribution of the extracted guidelines from the studies in the review.
Table 3. Thematic Classification of Prompt Engineering Guidelines

Context (C)
1. Adding context to examples in prompts produces more efficient and informative output. Reynolds and McDonell [2021]
2. Provide context to all prompts to avoid output hallucinations. Kumar [2023]
3. Provide context of the prompt to ensure a closely related output. Lubiana et al. [2023], Zhou et al. [2022a]
4. The more context tokens are pre-appended to prompts, the more fine-grained the output. [Link] et al. [2022a]
5. The Context Manager pattern enables users to specify or remove context for a conversation with an LLM. White et al. [2023b]
6. Providing more context and instructions is an effective strategy to increase the semantic quality of the output. Vogelsang [2024]

Persona (P)
1. Improves the generation quality by conditioning the prompt with an identity, such as “Python programmer” or “Math tutor”. Wei et al. [2023]
2. To explore the requirements of a software-reliant system, include: “I want you to act as the system”, “Use the requirements to guide your behaviour”. White et al. [2023c]

Templates (T)
1. To improve reasoning and common sense in output, follow a template such as: “Reason step-by-step for the following problem. [Original prompt inserted here]”. Koralus and Wang-Maścianica [2023]
2. The following prompt template has shown an impressive quality of AI art: “[Medium] [Subject] [Artist(s)] [Details] [Image repository support]”. Oppenlaender et al. [2023]
3. “I am going to provide a template for your output; This is the template: PATTERN with PLACEHOLDERS”. White et al. [2023b]

Disambiguation (D)
1. Ensure any areas of potential miscommunication or ambiguity are caught by providing a detailed scope: “Within this scope”, “Consider these requirements or specifications”. White et al. [2023c]
2. To find points of weakness in a requirements specification, consider including: “Point out any areas of ambiguity or potentially unintended outcomes”. White et al. [2023c]
3. The persona prompt method can be used to consider potential ambiguities from different perspectives. White et al. [2023a]

Reasoning (R)
1. Prepending “Let’s think step by step” improves zero-shot performance. Zhou et al. [2022b]
2. Extending the previously known “Let’s think step by step” with “to reach the right conclusion” highlights decision-making in the prompt. Clavié et al. [2023]
3. Chain-of-Thought (CoT) prompting improves LLM performance and factual inconsistency evaluation compared to [Link] et al. [2023], Luo et al. [2023]
4. Tree of Thoughts (ToT) allows LLMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action. Yao et al. [2024]
5. The intent of the Cognitive Verifier pattern is to force the LLM to always subdivide questions into additional questions that can be used to provide a better answer to the original question. White et al. [2023b]

Analysis (A)
1. Self-consistency boosts the performance of chain-of-thought prompting. Wang et al. [2022a]
2. The best-of-three strategy improves LLM output stability and the detection of hallucinations. Ronanki et al. [2024a]
3. Emotion-enhanced CoT prompting is an effective method to leverage emotional cues to enhance the ability of ChatGPT on mental health analysis. Yang et al. [2023]

Keywords (K)
1. When picking the prompt, focus on the subject and style keywords instead of connecting words. Liu and Chilton [2022b]
2. Pre-appending keywords to prompts is shown to greatly improve performance by providing the language model with appropriate context. Gero et al. [2022]
3. Modifiers/keywords can be added to the details or image repository sections of a template such as: “[Medium] [Subject] [Artist(s)] [Details] [Image repository support]”. Oppenlaender et al. [2023]
4. The inclusion of multiple descriptive keywords tends to align results closer to expectations. Hao et al. [2022]

Wording (W)
1. In translation tasks, adding a newline before the phrase in a new language increases the odds that the output sentence is still English. Reynolds and McDonell [2021]
2. A complete sentence definition with stop words performs better as a prompt than a set of core terms extracted from that definition after removing the stop words. Yong et al. [2022]
3. Words such as “well-known” and “often used to explain” are successful for analogy generation. Bhavya et al. [2022]
4. Modifying prompts to resemble pseudocode tends to be the most successful approach in coding tasks. Denny et al. [2023], White et al. [2023c]
5. Prompts containing explicit algorithmic hints perform better in engineering tasks. Denny et al. [2023]

Few-shot Prompts (F)
1. Inclusion of “Question:” and “Answer:” improves the response, but rarely gives a binary answer. Trautmann et al. [2022]
2. For easier understanding, number the examples in few-shot prompting. West et al. [2021]
3. The format of [INPUT] and [OUTPUT] should linguistically imply the relationship between them. West et al. [2021]
4. Specifications can be added to each [INPUT] and [OUTPUT] pair to give extra insight into complicated problems. West et al. [2021]
5. In few-shot prompting, include a rationale in each shot (input–rationale–output). Wang et al. [2022b]
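To illustrate how several of the guideline themes in Table 3 can be combined in practice, the sketch below assembles a single requirements-review prompt from the Persona (P2), Context (C2, C6), Disambiguation (D2), and Reasoning (R1, R2) guidelines. The exact wording, the example system, and the build_review_prompt helper are our own illustrative assumptions rather than phrasings prescribed by the cited studies; the resulting string can be sent to any LLM client.

```python
# Illustrative sketch only: the prompt wording and this helper are assumptions
# for demonstration, not phrasings prescribed by the studies cited in Table 3.

def build_review_prompt(system_context: str, requirements: str) -> str:
    """Combine Persona, Context, Disambiguation, and Reasoning guidelines."""
    return (
        # Persona (P2): ask the model to act as the system under specification.
        "I want you to act as the system described below. "
        "Use the requirements to guide your behaviour.\n\n"
        # Context (C2, C6): ground the prompt to reduce hallucinations.
        f"System context:\n{system_context}\n\n"
        f"Requirements under review:\n{requirements}\n\n"
        # Disambiguation (D2): ask for weaknesses and ambiguities.
        "Within this scope, point out any areas of ambiguity or potentially "
        "unintended outcomes.\n"
        # Reasoning (R1, R2): zero-shot chain-of-thought cue.
        "Let's think step by step to reach the right conclusion."
    )


if __name__ == "__main__":
    prompt = build_review_prompt(
        system_context="A battery management unit for an electric vehicle.",
        requirements="R1: The system shall report cell temperature quickly.",
    )
    print(prompt)  # pass this string to the LLM client of your choice
```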
iteratively, any unclear aspects of your requirements and, as I say, any ambigu-
ities where there could be potential misunderstanding.” They further suggested
that integrating persona-based guidelines could enhance this process by enabling
ambiguity exploration from multiple perspectives. They stressed the importance
of justifications, explaining that there is a need for reasoning behind why a
requirement is deemed ambiguous or weak.
Expert 2 identified a key advantage, noting that “it [LLM] could help to
point out weaknesses in a requirement specification” and that “prompt engi-
neering could help point out ambiguity.” However, they also acknowledged that
ambiguity might not always be a concern, referencing prior studies on smaller
teams where such issues had minimal impact.
Expert 3 exclusively highlighted advantages, emphasising that these guide-
lines could be beneficial not only for requirement specification but also for re-
quirements review. They provided an example related to guideline D2, stating,
“So this is a useful prompt for sure, it’s going to change the way we make re-
quirements reviews.” They further elaborated on the critical role of requirements
reviews, particularly in safety-critical contexts.
The proposed mapping between RE activities and the identified guideline themes,
along with the corresponding justifications, is outlined in this section. A detailed
representation of this mapping is provided in Table 4.
5 Discussion
One of the main objectives of this study was to explore how existing prompt
engineering guidelines can help users in leveraging LLMs in RE activities. Our
findings indicate that while various guidelines exist, they originate from a diverse
range of models, including text-to-text models such as GPT-3.5 and BERT, as
well as text-to-image and multi-modal models like DALL-E and the combination
of CLIP and VQGAN. Although some guidelines exhibit overlap, our analysis
reveals notable differences, suggesting that the role of prompt engineering will
continue to grow in importance. Notably, the majority of guidelines identified
through our literature review were mentioned only once across studies. This
shows that, while there is much interest in improving the performance of the
LLM through coming up various prompt based approaches, this field of study is
still in its early stages and needs to mature before this becomes an established
discipline of study. The guidelines and techniques need to be validated across
broader range of use cases to reach a generalisable conclusion on which techniques
work reliably enough for which tasks and use cases and what are the constraints
and limitations of them.
We also examined how these guidelines apply to different RE activities by
mapping them to guideline themes. Our interviews confirmed that while multiple
interpretations of these mappings exist, there was general agreement on the
value of the guideline themes. The template theme guidelines seem to be the most
relevant ones, as they are mapped to three of the five RE activities. The keyword,
persona and reasoning theme guidelines were mapped to two activities each
while the context, analysis and disambiguation themes were only mapped to
one activity. However, based on the experts’ insights, while the context and
disambiguation guidelines are applicable to fewer use cases, they bring more
value to the use cases they are applicable to compared to the rest of the guideline
themes. This indicates that prompt engineering guidelines can serve two major
purposes: applying broadly across various RE activities, and bringing focused
value to the specific tasks within which they are used.
Another interesting observation is that these seven themes of prompt engi-
neering guidelines make up 72% of all the prompt engineering guidelines we
found. The remaining guidelines, those focused on few-shot prompting and
wording, which make up 28% of the guidelines identified in our literature review,
were deemed irrelevant to RE and were not part of the mapping. This indicates
that, while existing prompt engineering guidelines are fairly applicable to RE
activities, there is also a risk that a considerable share of them are not. This
points to the need for either developing new prompt engineering guidelines spe-
cific to RE, or fine-tuning existing guidelines to make them more applicable and
effective for RE. Overall, our findings show both the potential and limitations
of prompt engineering guidelines for using LLMs in RE. Providing context is
key to improving LLM performance, but more research is needed to refine these
guidelines and explore their use in more advanced RE activities.
While we followed established methods and techniques for the collection and
analysis of the data to ensure methodological rigour and soundness, it is essential
to acknowledge and address potential threats to the validity of the findings.
Internal Validity: Despite selecting a review period from 2018 to the present,
there is a risk of omitting relevant studies. However, as large language models
(LLMs) largely emerged after 2017, and prompt-related guidelines first appeared
in 2021, this risk is likely minimal. Limited available literature and inconsisten-
cies in terminology pose risks. Selection bias may arise from reviewing a re-
stricted set of studies. Open discussions among researchers aimed to minimise
misinterpretations.
External Validity: The study’s findings may have limited generalisabil-
ity due to its specific focus and the rapidly evolving field. Including unpublished
preprints from arXiv introduces potential bias but helps capture cutting-edge re-
search. Interviewed experts may have personal biases influencing their responses,
potentially introducing confirmation bias.
Construct Validity: The study’s inclusion and exclusion criteria may affect
the construct validity. Also, the interviewed RE experts’ varying understanding
of the terminology within the guidelines is another threat.
6 Conclusion
The primary motivation to conduct and present this study was to address the
lack of domain-specific prompt engineering guidelines for using LLMs, especially
within RE. To that end, our study provided an overview of existing prompt
engineering guidelines. We examined the benefits and limitations of these guide-
lines through interviews with three RE experts. Our interviews highlighted the
usefulness of the identified guidelines, especially those in the context, template,
keywords, analysis, reasoning, persona, and disambiguation themes, for various
RE activities. Based on the insights gathered from these interviews, we proposed
a mapping of prompt engineering guidelines to different RE activities.
We believe that as LLMs become more reliable in RE activities, they could
help reduce errors, detect issues earlier in the process, and prevent costly correc-
tions later in development. This, in turn, could lower project failure rates and
improve overall efficiency in RE workflows. Our findings contribute to the im-
provement of RE practices by supporting the development of tailored prompt en-
gineering guidelines. Future work should explore how general prompt guidelines
can be adapted for specific domains like RE and whether additional fine-tuning
of these guidelines is required to make them more effective.
Bibliography
Waad Alhoshan, Alessio Ferrari, and Liping Zhao. Zero-shot learning for re-
quirements classification: An exploratory study. Information and Software
Technology, 159:107202, 2023a.
Waad Alhoshan, Alessio Ferrari, and Liping Zhao. Zero-shot Learning for Re-
quirements Classification: An Exploratory Study. Information and Software
Technology, 159:107202, 2023b.
AutoGPT. “Empower your digital tasks with AutoGPT”, 2025. URL [Link]. Accessed: 19/02/2025.
Bhavya Bhavya, Jinjun Xiong, and Chengxiang Zhai. Analogy generation by
prompting large language models: A case study of instructgpt. arXiv preprint
arXiv:2210.04186, 2022.
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan,
Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda
Askell, et al. Language Models are Few-shot Learners. Advances in Neural
Information Processing Systems, 33:1877–1901, 2020.
Shan Chen, Yingya Li, Sheng Lu, Hoang Van, Hugo JWL Aerts, Guergana K
Savova, and Danielle S Bitterman. Evaluation of chatgpt family of models
for biomedical reasoning and classification. arXiv preprint arXiv:2304.02496,
2023.
Jonathan H Choi, Kristin E Hickman, Amy Monahan, and Daniel Schwarcz.
ChatGPT Goes to Law School. Available at SSRN, 2023.
Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, and
Thomas Brightwell. Large language models in the workplace: A case
study on prompt engineering for job type classification. arXiv preprint
arXiv:2303.07142, 2023.
Daniela S. Cruzes and Tore Dybå. Research synthesis in software engineering:
A tertiary study. Information and Software Technology, 53(5):440–455, 2011.
ISSN 0950-5849. Special Section on Best Papers from XP2010.
Paul Denny, Viraj Kumar, and Nasser Giacaman. Conversing with copilot: Ex-
ploring prompt engineering for solving cs1 problems using natural language.
In Proceedings of the 54th ACM Technical Symposium on Computer Science
Education V. 1, pages 1136–1142, 2023.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-
training of deep bidirectional transformers for language understanding. arXiv
preprint arXiv:1810.04805, 2018.
James W Drisko and Tina Maschi. Content analysis. Oxford university press,
2016.
Katy Ilonka Gero, Vivian Liu, and Lydia Chilton. Sparks: Inspiration for science
writing using language models. In Designing Interactive Systems Conference,
pages 1002–1019, 2022.
Yaru Hao, Zewen Chi, Li Dong, and Furu Wei. Optimizing prompts for text-to-
image generation. arXiv preprint arXiv:2212.09611, 2022.
Mubin Ul Haque, Isuru Dharmadasa, Zarrin Tasnim Sworna, Roshan Namal
Rajapakse, and Hussain Ahmad. "I think this is the most disruptive technology":
Exploring Sentiments of ChatGPT Early Adopters using Twitter Data. arXiv
preprint arXiv:2212.05856, 2022.
Barbara Kitchenham and Stuart Charters. Guidelines for performing systematic
literature reviews in software engineering. Technical report, version 2, January 2007.
Philipp Koralus and Vincent Wang-Maścianica. Humans in humans out: On gpt
converging toward common sense in both success and failure. arXiv preprint
arXiv:2303.17276, 2023.
Krishna Kumar. Geotechnical parrot tales (gpt): Harnessing large language
models in geotechnical engineering. arXiv e-prints, pages arXiv–2304, 2023.
Vivian Liu and Lydia B Chilton. Design Guidelines for Prompt Engineering
Text-to-Image Generative Models. In Proceedings of the 2022 CHI Conference
on Human Factors in Computing Systems, CHI ’22, New York, NY, USA,
2022a. Association for Computing Machinery. ISBN 9781450391573.
https://doi.org/10.1145/3491102.3501825.
Vivian Liu and Lydia B Chilton. Design guidelines for prompt engineering text-
to-image generative models. In Proceedings of the 2022 CHI Conference on
Human Factors in Computing Systems, pages 1–23, 2022b.
Tiago Lubiana, Rafael Lopes, Pedro Medeiros, Juan Carlo Silva, Andre Nico-
lau Aquime Goncalves, Vinicius Maracaja-Coutinho, and Helder I Nakaya.
Ten quick tips for harnessing the power of chatgpt/gpt-4 in computational
biology. arXiv preprint arXiv:2303.16429, 2023.
Zheheng Luo, Qianqian Xie, and Sophia Ananiadou. Chatgpt as a factual
inconsistency evaluator for abstractive text summarization. arXiv preprint
arXiv:2303.15621, 2023.
Jonas Oppenlaender, Rhema Linder, and Johanna Silvennoinen. Prompting ai
art: An investigation into the creative skill of prompt engineering. arXiv
preprint arXiv:2303.13534, 2023.
Ethan Perez, Douwe Kiela, and Kyunghyun Cho. True Few-shot Learning with
Language Models. Advances in neural information processing systems, 34:
11054–11070, 2021.
Roger S Pressman. Software engineering: a practitioner’s approach. Pressman
and Associates, 2005.
Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Im-
proving language understanding by generative pre-training. 2018.
Laria Reynolds and Kyle McDonell. Prompt programming for large language
models: Beyond the few-shot paradigm. In Extended Abstracts of the 2021
CHI Conference on Human Factors in Computing Systems, pages 1–7, 2021.
Colin Robson. Real World Research: A Resource for Social Scientists and
Practitioner-Researchers. Blackwell, 1993.
Alberto D. Rodriguez, Katherine R. Dearstyne, and Jane Cleland-Huang.
Prompts matter: Insights and strategies for prompt engineering in auto-
mated software traceability. In 2023 IEEE 31st International Requirements
Engineering Conference Workshops (REW), pages 455–464, 2023.
https://doi.org/10.1109/REW57809.2023.00087.
Krishna Ronanki, Beatriz Cabrero-Daniel, and Christian Berger. Chatgpt
as a tool for user story quality evaluation: Trustworthy out of the box? In Agile
Processes in Software Engineering and Extreme Programming – Workshops,
pages 173–181. Springer Nature Switzerland, 2024a. ISBN 978-3-031-48550-3.
Krishna Ronanki, Beatriz Cabrero-Daniel, Jennifer Horkoff, and Christian
Berger. Requirements Engineering Using Generative AI: Prompts and
Prompting Patterns, pages 109–127. Springer Nature Switzerland, Cham,
2024b. https://doi.org/10.1007/978-3-031-55642-5_5.
Yuya Sasaki, Hironori Washizaki, Jialong Li, Dominik Sander, Nobukazu Yosh-
ioka, and Yoshiaki Fukazawa. Systematic literature review of prompt engineer-
ing patterns in software engineering. In 2024 IEEE 48th Annual Computers,
Software, and Applications Conference (COMPSAC), pages 670–675, 2024.
C.B. Seaman. Qualitative Methods in Empirical Studies of Software Engineering.
IEEE Transactions on Software Engineering, 25(4):557–572, 1999.
https://doi.org/10.1109/32.799955.
Ian Sommerville and Pete Sawyer. Requirements Engineering: A Good Practice
Guide. John Wiley & Sons, Inc., USA, 1st edition, 1997. ISBN 0471974447.
Per Erik Strandberg. Ethical Interviews in Software Engineering. In 2019
ACM/IEEE International Symposium on Empirical Software Engineering and
Measurement (ESEM), pages 1–11, 2019.
Dietrich Trautmann, Alina Petrova, and Frank Schilder. Legal prompt
engineering for multilingual legal judgement prediction. arXiv preprint
arXiv:2212.02199, 2022.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you
need. Advances in neural information processing systems, 30, 2017.
Andreas Vogelsang. From specifications to prompts: On the future of generative
large language models in requirements engineering. IEEE Software, 41(5):
9–13, 2024.
Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang,
Aakanksha Chowdhery, and Denny Zhou. Self-consistency improves chain
of thought reasoning in language models. arXiv preprint arXiv:2203.11171,
2022a.
Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, and Denny
Zhou. Rationale-augmented ensembles in language models. arXiv preprint
arXiv:2207.00747, 2022b.
Jing Wei, Sungdong Kim, Hyunhoon Jung, and Young-Ho Kim. Leveraging large
language models to power chatbots for collecting user self-reported data. arXiv
preprint arXiv:2301.05843, 2023.
Peter West, Chandra Bhagavatula, Jack Hessel, Jena D Hwang, Liwei Jiang,
Ronan Le Bras, Ximing Lu, Sean Welleck, and Yejin Choi. Symbolic knowledge
distillation: from general language models to commonsense models. arXiv
preprint arXiv:2110.07178, 2021.
Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry
Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C Schmidt. A
prompt pattern catalog to enhance prompt engineering with chatgpt. arXiv
preprint arXiv:2302.11382, 2023a.
Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry
Gilbert, Ashraf Elnashar, Jesse Spencer-Smith, and Douglas C. Schmidt. A
Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT,
2023b.
Jules White, Sam Hays, Quchen Fu, Jesse Spencer-Smith, and Douglas C
Schmidt. Chatgpt prompt patterns for improving code quality, refactoring, re-
quirements elicitation, and software design. arXiv preprint arXiv:2303.07839,
2023c.
Kailai Yang, Shaoxiong Ji, Tianlin Zhang, Qianqian Xie, and Sophia Ananiadou.
On the evaluations of chatgpt and emotion-enhanced prompting for mental
health analysis. arXiv preprint arXiv:2304.03347, 2023.
Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao,
and Karthik Narasimhan. Tree of thoughts: Deliberate problem solving with
large language models. Advances in Neural Information Processing Systems,
36, 2024.
Gunwoo Yong, Kahyun Jeon, Daeyoung Gil, and Ghang Lee. Prompt engi-
neering for zero-shot and few-shot defect detection and classification using a
visual-language pretrained model. Computer-Aided Civil and Infrastructure
Engineering, 2022.
Fangyi Yu, Lee Quartey, and Frank Schilder. Legal prompting: Teaching a lan-
guage model to think like a lawyer. arXiv preprint arXiv:2212.01326, 2022.
Liping Zhao, Waad Alhoshan, Alessio Ferrari, Keletso J Letsholo, Muideen A
Ajagbe, Erol-Valeriu Chioasca, and Riza T Batista-Navarro. Natural Language
Processing for Requirements Engineering: A Systematic Mapping Study. ACM
Computing Surveys (CSUR), 54(3):1–41, 2021.
Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to
prompt for vision-language models. International Journal of Computer Vision,
130(9):2337–2348, 2022a.
Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis,
Harris Chan, and Jimmy Ba. Large language models are human-level prompt
engineers. arXiv preprint arXiv:2211.01910, 2022b.