
Overview:

Databricks has released Dolly 2.0, an open-source, instruction-following LLM. It has been fine-tuned on a transparent, freely available dataset that is also licensed for commercial use.

What is Dolly 2.0?

● Dolly 2.0 is an open-source, instruction-following LLM.

● It is trained on a human-generated instruction dataset licensed for both research and commercial use.

● The model is based on the EleutherAI Pythia model family and has 12 billion parameters.

● Databricks fine-tuned EleutherAI's pythia-12b on their instruction dataset to produce Dolly 2.0, which they claim performs better than the original Dolly, trained on the synthetic Alpaca dataset.

● The model requires significant hardware to run due to its size (see the loading sketch after this list).
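The weights are published on the Hugging Face Hub as databricks/dolly-v2-12b. Below is a minimal sketch of loading the model for generation with the transformers library; the memory figure in the comments is a rough assumption, and a smaller Pythia-based variant or quantization may be needed on modest hardware.

```python
# Minimal sketch: loading Dolly 2.0 for text generation with Hugging Face
# transformers. A 12B-parameter model needs roughly 24 GB+ of GPU memory
# even in bfloat16 (an estimate), so plan hardware accordingly.
import torch
from transformers import pipeline

generate = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,  # halves memory versus float32
    trust_remote_code=True,      # the repo ships a custom instruct pipeline
    device_map="auto",           # spread layers across available GPUs
)

print(generate("Explain what instruction tuning is in two sentences."))
```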

About the Dataset

● The dataset, called databricks-dolly-15k, contains 15,000 high-quality, human-generated prompt-response pairs designed specifically for instruction tuning large language models (see the loading sketch after this list).

● The dataset contains natural and expressive training records that cover a wide range of behaviours, from brainstorming and content generation to information extraction and summarization.

● The dataset was generated by professionals, is high in quality, and contains long answers for most tasks.

● The dataset is released under the Creative Commons Attribution-ShareAlike 3.0 (CC BY-SA 3.0) licence, which means anyone can use, modify, or extend it for any purpose, including commercial applications.
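As a quick sketch, assuming the dataset is hosted on the Hugging Face Hub as databricks/databricks-dolly-15k with instruction, context, response, and category fields, it can be inspected like this:

```python
# Minimal sketch: inspecting databricks-dolly-15k with the Hugging Face
# datasets library. The repository id and field names are assumptions
# based on the public release.
from collections import Counter

from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

# Each record is a human-written prompt-response pair.
print(dolly[0]["instruction"])
print(dolly[0]["response"])

# Tally the behaviour categories (brainstorming, summarization, ...).
print(Counter(dolly["category"]))
```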

Commercial Use

● Most existing instruction-following models prohibit commercial use, so Databricks created this new dataset in order to produce an open-source model that can be used commercially.

● Databricks is open sourcing the entirety of Dolly 2.0, including the training code, dataset, and model weights, making it suitable for commercial use.

● Any organisation can create and customise powerful LLMs that can talk to people, without paying for API access or sharing data with third parties (see the prompt-formatting sketch after this list).
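Customising a model on this data means rendering each record into a training string. The template below is a hypothetical illustration; the exact prompt format Databricks used for fine-tuning may differ.

```python
# Hypothetical sketch: turning a dolly-15k-style record into a single
# training string for instruction fine-tuning. The "### Instruction:" /
# "### Response:" template is an illustrative assumption, not the
# confirmed Databricks format.
def format_record(record: dict) -> str:
    context = f"\nContext: {record['context']}" if record["context"] else ""
    return (
        "### Instruction:\n"
        f"{record['instruction']}{context}\n\n"
        "### Response:\n"
        f"{record['response']}"
    )

# Usage with a made-up record in the dataset's schema.
example = {
    "instruction": "Summarize the release of Dolly 2.0 in one sentence.",
    "context": "",
    "response": "Databricks released Dolly 2.0, an open-source "
                "instruction-following LLM licensed for commercial use.",
    "category": "summarization",
}
print(format_record(example))
```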
A Few Drawbacks

● As per the information available on the GitHub page for the dataset, Wikipedia was used as a reference while developing prompts and responses. This means that any biases present in Wikipedia could potentially be reflected in the final dataset.

● Additionally, some of the individuals involved in creating the dataset were not native English speakers, which could result in inconsistencies.

● Furthermore, the demographic composition of the team responsible for the dataset's creation could also contribute to biases specific to their backgrounds being present in the dataset.

Important Download Links

● Link to Dolly 2.0 Demo:
● Link to Dataset:
● Link to Alpaca-compatible dataset:
● Link to blog:

Conclusion (Encouraging Innovation)

Open-source datasets and models encourage commentary, research, and innovation that will help ensure everyone benefits from advances in artificial intelligence technology. Databricks hopes that Dolly and the open-source dataset will act as the seed for a multitude of follow-on works, which may serve to bootstrap even more powerful language models. Dolly 2.0 is not meant to be state of the art, but rather a good model for following instructions.
