
A Survey of Techniques for

Maximizing LLM Performance


Techniques for Refining the Performance of Large Language Models
https://www.youtube.com/watch?v=ahnGLM-RC1Y
Organized by Richard
Twitter: richchat
WeChat official account: 檬查查
Optimizing LLMs is hard
• Extracting signal from the noise is not easy
• Performance can be abstract and difficult to measure
• Knowing when to use which optimization is hard
• Today's talk is about maximizing performance.
You should leave here with:
• A mental model of what the options are
• An appreciation of when to use one over the other
• The confidence to continue on the journey yourself
Optimizing LLM performance is not always linear
1 – Prompt Engineering
Prompt engineering - Strategies for optimization

Start with
• Write clear instructions
• Split complex tasks into simpler subtasks
• Give GPTs time to "think"
• Test changes systematically

Extend to
• Provide reference text
• Use external tools
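The starting strategies above can be sketched as a prompt-assembly helper. This is a hypothetical illustration, not an official API: the task, subtasks, and reference text are placeholder values.

```python
# Hypothetical sketch: composing a prompt that applies the starting
# strategies (clear instructions, simpler subtasks, time to "think",
# grounding in reference text). All inputs are illustrative.

def build_prompt(task: str, steps: list[str], reference: str) -> str:
    """Assemble a prompt with explicit instructions and staged subtasks."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    return (
        f"You are a careful assistant. Task: {task}\n\n"
        "Work through the following subtasks in order, and show your "
        "reasoning for each step before giving a final answer.\n"
        f"{numbered}\n\n"
        "Use only the reference text below; if the answer is not there, "
        "say so.\n"
        f"Reference:\n{reference}"
    )

prompt = build_prompt(
    task="Summarize the refund policy.",
    steps=["Find the relevant clauses", "Summarize them in two sentences"],
    reference="Refunds are accepted within 30 days with a receipt.",
)
print(prompt)
```

Testing changes systematically then means varying one part of this template at a time and measuring against a fixed evaluation set.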
Prompt engineering
Intuition - Best place to start, and can be a pretty good place to finish
Good for
• Testing and learning early
• When paired with evaluation, it provides your baseline and sets up further optimization

Not good for
• Introducing new information
• Reliably replicating a complex style or method, i.e. learning a new programming language
• Minimizing token usage
2 – Retrieval-Augmented Generation
RAG vs fine-tuning
RAG
• Giving the model access to domain-specific context
RAG
Intuition - If you want to give your LLM domain knowledge, then RAG is
likely the best next step

Good for
• Introducing new information to the model to update its knowledge
• Reducing hallucinations by controlling content

Not good for
• Embedding understanding of a broad domain
• Teaching the model to learn a new language, format or style
• Reducing token usage
RAG – success story
RAG - Cautionary Tale
RAG - How to think about eval
3 – Fine-tuning
Fine-tuning
• Continuing the training process on a smaller, domain-specific dataset to optimize a model for a specific task
Fine-tuning Benefits
• Improve model performance on a specific task
  • Often a more effective way of improving model performance than prompt engineering or few-shot learning (FSL)
• Improve model efficiency
  • Reduce the number of tokens needed to get a model to perform well on your task
  • Distill the expertise of a large model into a smaller one
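Fine-tuning starts with a training dataset. The sketch below prepares examples in the chat-style JSONL format used by common fine-tuning APIs; the system message and question/answer pairs are placeholders for a real domain-specific dataset.

```python
import json

# Sketch: converting domain examples into chat-format JSONL records
# for fine-tuning. One JSON object per line, each holding a full
# system/user/assistant exchange. All content here is illustrative.

examples = [
    ("What is your return window?", "Returns are accepted within 30 days."),
    ("Do you ship overseas?", "Yes, we ship to most countries."),
]

records = [
    {
        "messages": [
            {"role": "system", "content": "You are a support agent for Acme."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }
    for question, answer in examples
]

jsonl = "\n".join(json.dumps(record) for record in records)
print(jsonl.splitlines()[0])
```

Because the desired tone and structure are baked into the assistant turns, the fine-tuned model can reproduce them without lengthy instructions at inference time.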
Fine-tuning
Intuition - If prompt engineering isn't helping, fine-tuning likely isn't right for your use-case

Good for
• Emphasizing knowledge that already exists in the model
• Customizing the structure or tone of responses
• Teaching a model very complex instructions

Not good for
• Adding new knowledge to the base model
• Quickly iterating on a new use-case
Fine-tuning – Canva success story
Fine-tuning - Cautionary Tale
Fine-tuning Steps
Fine-tuning Best Practices
Fine-tuning+RAG
Best of both worlds
• Fine-tune the model to understand complex instructions
• Minimize prompt-engineering tokens
• More space for retrieved context
• Use RAG to inject relevant knowledge into the context
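The combination above can be sketched as a prompt builder for a fine-tuned model. Because the model was trained on the task format, the prompt carries only retrieved context and the question; `retrieve` here is a hypothetical stand-in for any retrieval step.

```python
# Sketch: fine-tuning + RAG. The fine-tuned model already understands
# the task, so no lengthy instructions are needed, leaving more of the
# context window for retrieved knowledge. retrieve() is a placeholder.

def retrieve(question: str) -> str:
    # Placeholder: a real system would query a vector store.
    return "Refunds are accepted within 30 days with a receipt."

def build_rag_prompt(question: str) -> str:
    context = retrieve(question)
    # Minimal scaffolding: the fine-tuned model was trained on this
    # structure, so instruction tokens are saved for context instead.
    return f"Context:\n{context}\n\nQuestion: {question}"

prompt = build_rag_prompt("What is the refund policy?")
print(prompt)
```

Compare this with the untuned case, where the same prompt would also need several paragraphs of instructions and examples.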
4 – Application of Theory
Challenge
RAG – What we did
RAG - Evaluation
Fine-tuning
The optimization flow
Thank you!