Conference Paper · May 2023


Meta AI’s Open Pretrained Transformer (OPT): The Future
of Text Generation?
Chi Wee Tan1, a), Yuen Kei Khor1, b), Jia Hou Tan2, c), Gloria Jennis Tan3, d)

Author Affiliations
1 Faculty of Computing and Information Technology, Tunku Abdul Rahman University of Management and
Technology, Jalan Genting Kelang, 53300, Setapak, Kuala Lumpur, Malaysia
2 The Institution of Engineering and Technology, Futures Place, Kings Way, Stevenage, Hertfordshire, SG1 2UA,
United Kingdom
3 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Cawangan Terengganu,
Kuala Terengganu, Malaysia

Author Emails
a) Corresponding author: chiwee@tarc.edu.my
b) khoryk-wm17@student.tarc.edu.my
c) jiahou21.jt@gmail.com
d) gloria@uitm.edu.my

Abstract. The IT industry is growing more influential with the advent of Artificial Intelligence (AI). Among these
advances, chatbots can assist businesses in resolving customer service problems: they can create emails, blog posts,
social media ads, and web content faster, more efficiently, and at lower cost. Natural Language Generation (NLG)
is a subfield of Natural Language Processing (NLP) concerned with the automatic generation of human-readable
text by a computer. In a recent study, 35% of consumers wanted more companies to use chatbots to improve their
communication strategy and deliver a better experience. Texts produced by this technique have higher conversion
rates, are categorized as unique content, and rank higher. Companies can provide a better customer service
experience with high-quality automated texts, which helps increase productivity and generate more sales. In this
study, we identified the Open Pre-trained Transformer (OPT) released by Meta AI, a suite of language models with
up to 175 billion parameters trained on publicly available data sets, as the text-generation system for this project.
Gradio is used to deploy our 1.3-billion-parameter OPT text-generator model. A pilot study with blind validation
was carried out to understand and compare the performance of the proposed solution against human-level
performance. According to the study, more than 80% of respondents agree that OPT models are beneficial and
believe that they can respond with and generate human-like texts.

INTRODUCTION

The development of AI text generation systems brings potential commercial value to companies. A company must
produce blog entries, marketing content, customer care responses, and recruiting blogs in order to remain
operational. Copywriting and marketing copy generation are challenging tasks for companies and copywriters alike;
there is a reason why a quality copywriting article may cost more than $10,000 [1]. Certain published pieces even
require weekly updates, which leads to extensive resource and time utilization. According to studies, each blog post
is typically 1,500 words long and takes 4 hours to write. An AI text generator can therefore help employees save
valuable time and resources: it requires only a few keywords from employees to start producing articles. Artificial
intelligence techniques enable machines to carry out tasks that typically necessitate human intelligence. Thanks to
advancements in high-performance computing and augmented data storage capacities, AI technologies have gained
more capabilities and are now widely implemented in applications spanning from basic daily activities, intelligent
assistants, and finance to highly specialized command and control operations and national security [2]. By
implementing artificial intelligence, smart devices and computers are capable of, for example, understanding text
and reading it out loud, hearing voices and responding to them, viewing images and recognizing objects in them,
and even anticipating what will happen next. As AI technology has progressed, it has become able to detect patterns
in human interactions and behaviour, allowing it to analyse and interpret social activity in more sophisticated ways
[3]. This has been achieved through natural language processing, machine learning, and other advanced techniques
that enable AI systems to comprehend and process complex human interactions. With these capabilities, AI has the
potential to provide valuable insights into social trends, as well as identify areas of concern or opportunities for
improvement in various aspects of human life. Additionally, the use of AI to analyse social activity has implications
for fields such as marketing, politics, and law enforcement, where understanding and predicting human behaviour is
crucial for success.

In addition, AI text generation systems allow machines to sound like humans: AI text generators are able to learn
how the target audience speaks. As a result, the generated texts are more likely to convert, are categorized as unique
content, and rank higher. Companies can therefore provide a better customer service experience with high-quality
automated texts. Customers can find all the information they need in the product description rather than spending
time researching on the internet, which helps increase the productivity of the company and generate more sales.

LITERATURE REVIEW
Machine learning algorithms are complex computer programs that have been designed to learn and improve their
performance by analysing vast amounts of data. As the name suggests, these algorithms "learn" from the data they
are presented with, continually refining their interpretation and analysis of information over time, much like humans
do through the process of learning. In this way, machine learning algorithms represent a significant breakthrough in
the field of artificial intelligence, as they enable computers to become progressively more sophisticated in their
ability to analyse complex information.

One of the key advantages of machine learning algorithms is their ability to adapt and modify their own
parameters based on their previous performance. By doing so, they can continually enhance their ability to make
accurate predictions and generate insights about the data they are analysing. This continual self-adjustment is made
possible by the use of advanced techniques such as natural language processing, neural networks, and deep learning,
which allow machine learning algorithms to process and analyse vast amounts of data with incredible speed and
precision. In addition to their use in data analysis and prediction, machine learning algorithms have applications in a
wide range of fields, including medicine, finance, marketing, and transportation. For example, in the medical field,
machine learning algorithms can be used to analyse patient data and predict the likelihood of certain diseases or
conditions, enabling doctors to take proactive measures to prevent or treat these conditions. In finance, machine
learning algorithms can be used to analyse financial data and predict market trends, helping investors make more
informed decisions about their investments. Overall, the ability of machine learning algorithms to learn and adapt
based on their past performance represents a significant advancement in the field of artificial intelligence, with
wide-ranging applications and potential benefits for a variety of industries and fields.

Open Pre-trained Transformer (OPT) and Generative Pre-trained Transformer (GPT) are pre-trained language
models that range from 125M to 175B parameters, the upper end being roughly twice the number of neurons in a
human brain. These transformers are trained on large portions of publicly available internet text to model how
humans write, speak, and reply to messages. OPT and GPT are primarily employed for text generation, wherein
users can obtain a comprehensive article by simply inputting keywords, after which the machine autonomously
generates the relevant content.
TABLE 1. Comparison between contemporary solutions from market players

Solution: Open Pre-trained Transformer (OPT)
Year founded: May 2022
Founding company: Meta AI
License pricing: Free
Supported features:
• Text generation
• Solve simple maths problems
• Answer reading comprehension questions
Common applications:
• Text generator
• Dialogue application
• Zero/few-shot learning
Limitations:
• Does not perform satisfactorily when given declarative commands or direct questions
• Exhibits repetitive behaviour and is susceptible to becoming stuck in a cycle
• Has the potential to generate statements that are factually inaccurate
• Demonstrates a heightened inclination to produce harmful language and perpetuate damaging stereotypes

Solution: Generative Pre-trained Transformer (GPT)
Year founded: February 2019
Founding company: OpenAI
License pricing: Paid
Supported features:
• Construct human-like text
• Productivity boosters
• Translation
Common applications:
• Chatbot
• Text generator
• Copywriting/content writer
• Document editor
Limitations:
• True intelligence is lacking
• Capacity to violate an individual's right to privacy
• Bias

For this project, we have chosen to work with the Open Pre-trained Transformer (OPT) and Generative
Pre-trained Transformer (GPT) models, which are considered to be among the most powerful and appropriate
models for a text generation machine. These language models have been trained on massive text collections and
have demonstrated impressive abilities in generating text and performing zero-shot and few-shot learning. However,
it should be noted that access to the largest of these models has been restricted to a few well-resourced labs, with
limited interaction available through paid APIs. This lack of accessibility poses a challenge for researchers looking
to study the workings of large language models, which can hinder progress in addressing issues related to
robustness, bias, and toxicity [4]. In contrast, the OPT models and GPT-2 are open-source and freely available,
making them ideal for university students who wish to develop text-generation machines. Additionally, these
models come in varying parameter sizes, ranging from 125 million to 1.3 billion, making them suitable for
low-resource environments. They require only 1.10 GB of RAM and 2 vCPUs, which means the machine can run
smoothly on Google Colab without overburdening the developers' devices.

STRATEGIC AND TEST PLAN

As outlined in the literature review, machine learning algorithms improve their performance as they acquire more
data, autonomously adjusting their parameters based on their prior performance in predicting outcomes on a given
dataset. This self-correcting mechanism enables an algorithm to continuously refine itself, achieving high accuracy
and efficiency in data processing and analysis. In this project, we use OPT and GPT to generate text based on given
keywords and evaluate the output using Word2Vec and Human-Level Performance (HLP) through a survey.

Models
The text generation system employed in our work utilizes the Open Pre-trained Transformer (OPT), an
innovative model created by the AI team at Meta (Meta AI). Nine models are available, containing different
numbers of parameters ranging from 125 million to 175 billion [4]. The models are hosted on Hugging Face, an
open-source platform that allows users to build, train, and deploy machine learning models. The team chose the
model containing 1.3 billion parameters (OPT-1.3b) to deploy as the text generator for this project.
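As a minimal sketch of how the model can be loaded, assuming the `transformers` library is installed (the `facebook/opt-1.3b` checkpoint is roughly 2.6 GB and is downloaded on first use); the prompt below is S1 from Table 2:

```python
from transformers import pipeline

# Load OPT-1.3b from the Hugging Face Hub (downloads ~2.6 GB on first run)
generator = pipeline("text-generation", model="facebook/opt-1.3b")

# Generate a continuation for one of the survey prompts (S1 in Table 2)
result = generator("a poem of sonnet about my love",
                   max_new_tokens=50, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```

By default the pipeline returns the prompt followed by the generated continuation; sampling parameters such as `top_p` are illustrative choices, not the authors' exact settings.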

Tokenizer
For this project, we employed the OPT-1.3b tokenizer, which was created by the Meta AI team and can also be
obtained through Hugging Face's open-source platform.
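As an illustrative sketch, assuming `transformers` is installed (the tokenizer files are only a few megabytes), the OPT-1.3b tokenizer can be loaded and used to encode text into token ids and decode them back:

```python
from transformers import AutoTokenizer

# Load the byte-level BPE tokenizer shipped with OPT-1.3b
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

text = "I love natural language processing!"
ids = tokenizer.encode(text)  # token ids, prefixed with the BOS token
decoded = tokenizer.decode(ids, skip_special_tokens=True)
print(ids, decoded)
```

Skipping special tokens on decode recovers the original string exactly, which is a quick sanity check that the tokenizer round-trips input text.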

Libraries
Several libraries were used in this project: TensorFlow to deploy machine learning models; the pipeline from the
Hugging Face Transformers library; pandas to read CSV files; NumPy and Matplotlib to generate graphs such as
bar charts; Gradio to produce the graphical user interface for entering text inputs; the NLTK library; and the
Word2Vec model from Gensim.

FIGURE 1. Test plan strategy


RESULT & DISCUSSION

Blind Survey with Human Level Performance


The proposed testing plan was developed to evaluate and analyze the results of text generation for the poem,
article-writing, and general text-generation tasks. Unlike translation, calculation, and speech dialogue, these text
generators do not have a specific schema for evaluating the correctness of their outputs. For comparison, a GPT-2
model was given the same input text as OPT-1.3b. A Google Form consisting of ten multiple-choice questions was
created for human respondents to indicate their preference between the outputs generated by the two models.
Word2Vec was employed to evaluate which output is more closely associated with the input text.

TABLE 2. Sample input text to test the capabilities of the system in text generation

Sentence ID   Input Text
S1 “a poem of sonnet about my love”
S2 “A poem of burning my old year”
S3 “I love natural language processing!”
S4 “2. Literature Review of Convolutional Neural Network”
S5 “Kentucky Fried Chicken is sucked because”
S6 “Thanos can be defeated by”
S7 “Actions Speak Louder Than Words meaning”
S8 “Recipe of Hainan chicken rice”
S9 “This question con9lan7firm1 will come out in the exam”
S10 “limitations and future works of open pre-trained transformer”

Data was collected through a Google form distributed to students aged between 20 and 22, from various faculties
at TAR UMT. The form captured responses to both OPT and GPT models but did not identify which model
produced which result. Respondents were asked to select the answer they believed best addressed the questions at a
human-level. Based on this selection, volunteers were able to identify which model they thought was more accurate
and relevant. Figure 2 visually displays the results, where blue represents the OPT model, red represents the GPT
model, yellow indicates support for both models, and green represents cases where both models produced
inadequate results.

Word2Vec
The primary aim and advantage of Word2Vec is to group similar word vectors together in vector space by
seeking mathematical similarities. By dispersing word attributes like contextual information into numerical
representations known as vectors, Word2Vec enables this process. In this project, we leveraged Word2Vec's built-in
function, Word's Movement Distance, to determine the distance between the input and output results, thereby
allowing us to discern the relationship between them.

1 “con9lan7firm” is an Out-of-Vocab (OOV) word referring to an emphatic form of “confirm” in Malaysian colloquial usage.
FIGURE 2. The result of Blind Survey with Human Level Performance

TABLE 3. The relatedness of the generated text to the input, measured using Word2Vec (lower is better)

Sentence ID   OPT 1.3b                GPT 1.5b
S1 3.508538795407386 ✔ 3.89882738573764
S2 2.901428526459573 ✔ 3.69009982602391
S3 3.367989234614098 ✔ 3.50985044976469
S4 2.807018637308871 ✔ 3.55788136250483
S5 3.387234708242395 ✔ 3.77833743981631
S6 4.050523441074861 ✔ 4.12097752029902
S7 2.821987852875757 ✔ 3.55518145969468
S8 3.316757178047051 ✔ 4.09982252755007
S9 3.306534900087090 ✔ 3.63010798679142
S10 2.980123223491885 ✔ 3.49222622331955
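The comparison implied by Table 3 can be reproduced directly from the listed distances; this snippet (scores copied from the table) counts how many sentences OPT wins and computes each model's mean distance:

```python
# Word Mover's Distance scores from Table 3 (lower is better)
opt = [3.508538795407386, 2.901428526459573, 3.367989234614098,
       2.807018637308871, 3.387234708242395, 4.050523441074861,
       2.821987852875757, 3.316757178047051, 3.306534900087090,
       2.980123223491885]
gpt = [3.89882738573764, 3.69009982602391, 3.50985044976469,
       3.55788136250483, 3.77833743981631, 4.12097752029902,
       3.55518145969468, 4.09982252755007, 3.63010798679142,
       3.49222622331955]

# OPT wins a sentence when its distance is strictly lower
opt_wins = sum(o < g for o, g in zip(opt, gpt))
mean_opt = sum(opt) / len(opt)
mean_gpt = sum(gpt) / len(gpt)
print(opt_wins, round(mean_opt, 3), round(mean_gpt, 3))  # → 10 3.245 3.733
```

OPT-1.3b produces the lower (better) distance on all ten sentences, with a mean distance of about 3.245 versus 3.733 for GPT 1.5b.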

CONCLUSION
In a nutshell, we can see that the vast majority of respondents are in favor of the OPT model. In their opinion, OPT
is capable of responding with and generating texts that resemble those produced by humans. It should be noted,
however, that GPT is only slightly worse than OPT in this regard. Furthermore, over 80% of respondents agree
that both models are capable of producing useful information and text, and only a small number think that neither
model's results are useful or human-like.

ACKNOWLEDGMENTS
The authors express their gratitude for the financial and technological support provided by the Faculty of
Computing and Information Technology at Tunku Abdul Rahman University of Management and Technology.
Furthermore, they extend their heartfelt appreciation to a team of Natural Language Processing students, Mr. Chia
Kit Yau, Mr. Chew Man Hee, and Mr. Yap Wai Yen, who assisted in completing the puzzle of this project.

REFERENCES

[1] R. W. Bly, The Copywriter's Handbook: A Step-by-Step Guide to Writing Copy That Sells. Henry Holt, 2005.
[2] N. Anantrasirichai and D. Bull, “Artificial intelligence in the creative industries: a review,” Artificial
Intelligence Review, vol. 55, no. 1, pp. 589–656, Jul. 2021, doi: 10.1007/s10462-021-10039-7.
[3] N. Ferina, A. Agung, G. Sri, and D. Professor, “Opportunities and Challenges of Instagram Algorithm in
Improving Competitive Advantage,” Int. J. Innov. Sci. Res. Technol., vol. 4, no. 1, 2019. Accessed: Jan. 06,
2023. [Online]. Available: www.ijisrt.com
[4] S. Zhang et al., “OPT: Open Pre-trained Transformer Language Models,” arXiv preprint arXiv:2205.01068,
May 2022. Accessed: Dec. 02, 2022. [Online]. Available: http://arxiv.org/abs/2205.01068
