FinGPT: Democratizing Internet-Scale Financial Data With LLMs

To read more such articles, please visit our blog https://socialviews81.blogspot.com/
Introduction
The field of natural language processing (NLP) has witnessed
remarkable advances in recent years, thanks to the development of
large-scale pre-trained language models (LMs) such as BERT, GPT, and
T5. These LMs are trained on massive amounts of general-domain text
data, such as Wikipedia, news articles, and web pages, and can capture
rich linguistic and semantic knowledge. However, when it comes to
domain-specific applications, such as finance, these LMs may not
perform well due to the lack of domain-relevant data and vocabulary.
A team of researchers from Columbia University and New York
University (Shanghai) has developed FinGPT, a financial large language
model that leverages internet-scale data for fine-tuning. Their goal is to
democratize internet-scale data for financial large language models
(FinLLMs) by introducing an open-source, data-centric framework that
automates the collection and curation of real-time financial data from
diverse sources on the Internet.
What is FinGPT?
FinGPT stands for Financial Generative Pre-trained Transformer. It is a
data-centric framework that also introduces a simple yet effective strategy
for fine-tuning financial LLMs (FinLLMs) using the inherent feedback from
the market, dubbed Reinforcement Learning with Stock Prices (RLSP).
Key Features of FinGPT
FinGPT has several key features that make it unique and powerful for
financial NLP applications:
● FinGPT democratizes access to high-quality financial data from
various online sources, such as company websites, financial news,
blogs, forums, reports, and social media. This enables FinGPT to
capture diverse and rich financial knowledge and vocabulary.
● FinGPT uses an automatic data curation pipeline that can update
the financial data on a regular basis, such as monthly or weekly.
This ensures that FinGPT stays up-to-date with the latest trends
and developments in the financial domain.
● FinGPT leverages the strengths of some of the best available
open-source large language models (LLMs) and fine-tunes them
for financial language modeling. This allows FinGPT to inherit the
general linguistic and semantic knowledge from the LLMs and
enhance it with domain-specific knowledge from the financial data.
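The data curation feature above can be pictured with a small sketch. It is only an illustration and not FinGPT's actual pipeline: the `NewsItem` record, the `curate` function, and the filtering rules (drop empty text, drop stale items, drop duplicates) are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NewsItem:
    """Hypothetical record for one collected document (not a FinGPT type)."""
    source: str
    published: date
    text: str


def curate(items, cutoff):
    """Keep recent, non-empty, de-duplicated items, oldest first."""
    seen = set()
    kept = []
    for item in sorted(items, key=lambda i: i.published):
        # Normalize case and whitespace so near-identical copies match.
        normalized = " ".join(item.text.lower().split())
        if not normalized or item.published < cutoff:
            continue  # skip empty or stale documents
        if normalized in seen:
            continue  # skip duplicates of text we already kept
        seen.add(normalized)
        kept.append(item)
    return kept


raw = [
    NewsItem("news", date(2023, 7, 3), "Acme beats earnings estimates"),
    NewsItem("forum", date(2023, 7, 4), "acme  beats earnings estimates"),
    NewsItem("blog", date(2023, 5, 1), "Old post about Acme"),
    NewsItem("news", date(2023, 7, 5), "   "),
]
clean = curate(raw, cutoff=date(2023, 7, 1))
print([item.source for item in clean])
```

Running such a step on a schedule (weekly or monthly, as described above) is one simple way a dataset could be kept current; a real pipeline would likely add source-specific cleaning as well.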
Capabilities/Use Cases of FinGPT
FinGPT can enable various applications that require financial domain
knowledge and natural language understanding, such as:
● Robo-advisor: FinGPT can provide personalized and automated
financial advice to users based on their goals, preferences, and
risk profiles. FinGPT can also generate natural language
explanations and recommendations for the users to understand
the rationale behind the advice.
● Sentiment analysis for algorithmic trading: FinGPT can analyze
the sentiment of financial text data, such as news articles, tweets,
or reports, and infer the market sentiment and trends. FinGPT can
also generate trading signals or strategies based on the sentiment
analysis results.
● Low-code development: FinGPT can facilitate the development
of financial applications or services by using natural language
instructions or queries. FinGPT can also generate code or scripts
based on the natural language inputs and execute them
accordingly.
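As a rough illustration of the sentiment analysis use case, the sketch below turns a batch of sentiment labels into a simple trading signal. The labels are assumed to come from a sentiment model such as FinGPT but are hard-coded here, and the `signal_from_sentiment` function and its 0.2 threshold are hypothetical choices for this example, not part of FinGPT and not trading advice.

```python
def signal_from_sentiment(labels, threshold=0.2):
    """Map a batch of sentiment labels to 'buy', 'sell', or 'hold'.

    The score is (positives - negatives) / total. The threshold is an
    arbitrary illustrative value.
    """
    if not labels:
        return "hold"
    score = (labels.count("positive") - labels.count("negative")) / len(labels)
    if score > threshold:
        return "buy"
    if score < -threshold:
        return "sell"
    return "hold"


# Labels a sentiment model might assign to five headlines about one stock.
headlines = ["positive", "positive", "neutral", "negative", "positive"]
print(signal_from_sentiment(headlines))
```

With three positive labels and one negative label out of five, the score is 0.4, which is above the threshold, so the sketch returns "buy".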
How does FinGPT work?
The FinGPT framework is designed to gather a comprehensive amount
of accessible financial data from the internet and provide a unified data
interface for developers. It incorporates data curation pipelines to ensure
that only high-quality data is used in training. Additionally, FinGPT
uses reinforcement learning to tune large language models (LLMs) on
market feedback and adapts them with Low-Rank Adaptation (LoRA).
Because this lightweight approach trains only a small number of
additional parameters, it can significantly reduce costs.
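To see why LoRA is lightweight, it helps to count parameters. LoRA keeps a pre-trained weight matrix W frozen and learns a low-rank update B·A instead, where B has shape (d_out × r) and A has shape (r × d_in). The sketch below compares the two counts for a single matrix; the 4096 × 4096 size and rank 8 are assumed values for illustration, not figures from the FinGPT paper.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # fine-tuning W directly
    lora = rank * (d_out + d_in)   # training B (d_out x r) and A (r x d_in)
    return full, lora


full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank 8):    {lora:,} trainable parameters")
print(f"LoRA share:       {lora / full:.2%}")
```

Under these assumptions, the low-rank update trains well under 1% of the parameters that full fine-tuning of the same matrix would, which is where the savings in memory and compute come from.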

source - https://arxiv.org/pdf/2307.10485.pdf
FinGPT consists of four layers, each with its own unique function. The
first layer is the data source layer, which offers unified data APIs. The
second layer is the data curation layer, which is responsible for cleaning
and processing the fine-tuning data. The third layer is the LLM layer,
which is capable of accommodating any pre-trained LLM. The fourth and
final layer is the application layer, which applies the fine-tuned model to
diverse financial applications. This four-layer design makes FinGPT
highly extensible and adaptable to a wide range of financial applications.
Performance Evaluation Against Other Models
The researchers conducted three experiments to showcase distinct
fine-tuning methodologies for FinGPT. The first experiment used
Reinforcement Learning with Stock Prices (RLSP) for labeling,
leveraging market feedback. The second experiment used an
external LLM, such as GPT-4, for labeling, enabling the model to distill
knowledge from an already capable LLM. The third experiment involved
full-shot fine-tuning, using the entirety of the training data to refine
the model. The results of these experiments showed significant
improvements over prevailing LLMs, underscoring the promise of
building financial large language models (FinLLMs) through fine-tuning.
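One way to read "labeling with market feedback" is that a news item is labeled by how the stock price moved after it was published. The sketch below illustrates that idea only; the `label_from_price_change` function, the 2% threshold, and the sample prices are assumptions for this example, and the paper should be consulted for the exact rule the researchers used.

```python
def label_from_price_change(price_before, price_after, threshold=0.02):
    """Label a news item by the relative price move that followed it."""
    change = (price_after - price_before) / price_before
    if change > threshold:
        return "positive"
    if change < -threshold:
        return "negative"
    return "neutral"


# (headline, price before the news, price after the news); made-up values.
samples = [
    ("Acme raises full-year guidance", 100.0, 104.0),
    ("Acme recalls flagship product", 100.0, 95.0),
    ("Acme schedules annual meeting", 100.0, 100.5),
]
labeled = [(text, label_from_price_change(p0, p1)) for text, p0, p1 in samples]
for text, label in labeled:
    print(f"{label:>8}: {text}")
```

The appeal of this kind of labeling is that it needs no human annotators: the market itself supplies the training signal.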

FinGPT was compared with other models such as LLaMA and
BloombergGPT (as shown in the table above) on various financial tasks.
The results showed that FinGPT had a consistent advantage over these
models and exhibited substantial improvement in tasks such as
sentiment classification and quantitative trading.
In another experiment, also shown in the table above, the researchers
applied instruction fine-tuning to the same datasets as in the previous
experiment. With full-shot fine-tuning, FinGPT outperformed
BloombergGPT, whose model and API have not been released.
Three models were trained in this experiment: FinGPT, FinGPT-8bit,
and FinGPT-4bit. FinGPT is the LoRA model based on ChatGLM,
fine-tuned on instructions built from the training set. FinGPT-8bit and
FinGPT-4bit are versions of the same model trained in 8-bit and 4-bit
precision, respectively. Thanks to LoRA and high-quality instruction
datasets, the researchers were able to reduce the training cost of LLMs
for sentiment analysis from $2.67 million to $262.
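Instruction fine-tuning needs the training set rewritten as instruction, input, and answer records. The sketch below shows one plausible shape for such records in a sentiment task. The field names, the wording of the instruction, and the prompt layout are assumptions for illustration and may differ from the templates the FinGPT repositories actually use.

```python
# Hypothetical instruction text; the real template may be worded differently.
INSTRUCTION = (
    "What is the sentiment of this news? "
    "Please choose an answer from {negative/neutral/positive}."
)


def to_instruction_sample(text, label):
    """Turn one labeled sentence into an instruction/input/output record."""
    return {"instruction": INSTRUCTION, "input": text, "output": label}


def to_prompt(sample):
    """Render the part of the record that the model sees before answering."""
    return (
        f"Instruction: {sample['instruction']}\n"
        f"Input: {sample['input']}\n"
        "Answer: "
    )


sample = to_instruction_sample("Acme raises full-year guidance", "positive")
prompt = to_prompt(sample)
print(prompt + sample["output"])
```

During fine-tuning the model is trained to continue the prompt with the expected answer; at inference time the same prompt is used and the model's continuation is read as the predicted label.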
For more details on the cost estimates and F1 score calculations,
please refer to the research paper.
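For readers unfamiliar with the F1 score, the sketch below computes it for a three-class sentiment task. It uses macro averaging (the unweighted mean of the per-class scores), which is an assumption made here; the paper may report a different averaging scheme. The labels are made-up examples, not results from the paper.

```python
def f1_per_class(y_true, y_pred, cls):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)  # true positives
    fp = sum(t != cls and p == cls for t, p in pairs)  # false positives
    fn = sum(t == cls and p != cls for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, c) for c in classes) / len(classes)


y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

In this toy example the model never predicts "neutral", so that class scores 0 and pulls the macro average down even though three of the four predictions are correct.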

How to access and use this model?
FinGPT is an open-source framework that can be accessed through its
code repositories on GitHub. The model can also be accessed through
the Hugging Face website. FinGPT is designed to be data-centric and
open-source for open finance, allowing developers to access its code
and use it for their own purposes.
To use FinGPT, developers can visit the GitHub repositories and follow
the instructions provided in the README files. The repositories contain
detailed information on how to install and use the framework, as well as
examples of how to fine-tune the model for specific financial tasks.
Note that the code is currently shared for academic purposes under
the MIT education license.
If you would like to know more about the FinGPT model, all relevant
links are provided in the 'Source' section at the end of this article.
Future Work
The research team, in collaboration with the AI4Finance Foundation,
aims to improve access to high-quality financial data and expand to
other markets.
● They plan to investigate more parameter-efficient fine-tuning
methods and fine-tune pre-trained FinLLMs for better performance.
● They intend to explore advanced prompting strategies, support
longer context windows, and use retrieval-augmented generation
techniques.
● They plan to evaluate and mitigate potential bias and fairness
issues and enhance the experience of low-code development.
● FinLLMs could potentially be applied to SWAPs for decision
support and risk analysis, as well as event detection and outlier
detection for predicting market trends and managing risk.

The community is encouraged to use the FinGPT framework to train
their own LoRA weights, utilizing a variety of data sources, to expedite
advancements in the field of FinLLMs.
Conclusion
FinGPT is a breakthrough in financial NLP that leverages internet-scale
data to fine-tune open-source LLMs. It enables various applications that
require financial domain knowledge and natural language understanding.
FinGPT is a milestone in the journey of artificial intelligence across
industries, especially finance, as it democratizes access to high-quality
FinLLMs and stimulates innovation and opportunities in open finance.
Source
Research paper - https://arxiv.org/abs/2307.10485
Research document - https://arxiv.org/pdf/2307.10485.pdf
Code repo 1 - https://github.com/AI4Finance-Foundation/FinGPT
Code repo 2 - https://github.com/AI4Finance-Foundation/FinNLP
Model - https://huggingface.co/oliverwang15/FinGPT_ChatGLM2_Sentiment_Instruction_LoRA_FT
Website - https://ai4finance-foundation.github.io/FinNLP/
License - https://github.com/AI4Finance-Foundation/FinNLP/blob/main/LICENSE
