
Blockchain For AI and Data-Science

Mesum Sultan Shaikh


Department of Software Engineering, NED University of Engineering and Technology, Pakistan

Abstract – Data science, artificial intelligence (AI), and blockchain have recently become some of the most popular and
disruptive technologies. The increasing volume and variety of data has pushed academics and businesses to base decisions
on the analysis of big data. Artificial intelligence, in turn, gives computers human-like intelligence and decision-making
abilities. Blockchain technology, meanwhile, offers the potential to automate Bitcoin payments while providing
decentralized, secure, and trustworthy access to a shared ledger of information, transactions, and records. In this paper, I
examine blockchain research and applications in data science and artificial intelligence, as well as the potential
benefits and drawbacks of the technology. I also summarize blockchain systems and protocols that are largely
targeted toward data science and artificial intelligence. Furthermore, I examine open research issues associated
with the use of blockchain in data science and AI.

Index Terms—Data Science, Artificial Intelligence (AI), Blockchain, Big Data, Bitcoin Payment.

I. INTRODUCTION

We are currently experiencing a data influx. Data is expanding at a remarkable rate, and every
individual generates new data every second. Massive volumes of data have been gathered and
generated by sensing applications, IoT devices, social media, and online applications, assisting
in the evolution of AI and data science [1]. Various machine learning and deep learning algorithms
[2] can use such data to perform a range of analytics. Currently, a large percentage of AI and
data science machine learning and deep learning models rely on a highly centralized training model
in which a group of servers runs a specific model against training and validation datasets, and
many companies, including Google, Apple, Facebook, and Amazon, manage massive amounts of data to
make informed decisions [3].

Big data is derived from a range of sources and is divided into three categories: structured,
unstructured, and semi-structured data. Data utilization is difficult because the great bulk of
big data is unstructured or semi-structured. Data gathering, storage, access, sharing, analysis,
administration, and visualization all present challenges on several fronts, as do data protection
and confidentiality. There is a great deal of data from many areas of life, yet it is fragmented
and must be successfully integrated. Because of insufficient storage infrastructure and analytic
tools, many businesses assess just a tiny fraction of their vast volumes of data [4]. Furthermore,
because data that is processed and kept centrally can be hacked and modified, the centralized
structure of AI may lead to data tampering [5]. The authenticity and provenance of the data
sources are not assured [6]. This might lead to AI findings that are exceedingly inaccurate,
dangerous, and damaging.

The fields of data science, artificial intelligence, and blockchain technology are all thriving.
Some people may believe that these technologies are incompatible. They are, in reality,
complementary technologies. According to Neimeth [7], the blockchain database might be valued at
up to 20 percent of the whole big data industry by 2030, with a yearly profit of nearly 100
billion dollars. Blockchain, like other new technologies, is progressively altering the way some
sectors work. What happens if these technologies are used concurrently?

To answer this question, a better grasp of the distinctions and connections between blockchain,
data science, and artificial intelligence is required.

II. BACKGROUND

This section gives an overview of blockchain, data science, and AI.

a. Data Science

"Data science" refers to the process of translating data into outcomes and intelligence. It is a
suite of tools, techniques, and procedures that humans and computers employ to transform data into
knowledge. Data science, which turns data into valuable knowledge, has been affected by big data,
and its progress is intrinsically related to big data. It explores research methods in the big
data setting. Academic and corporate circles are increasingly focusing on practical research
driven by big data. Big data technology is now widely employed in a variety of disciplines such as
genetics, meteorology, finance, and healthcare [8].

b. Artificial Intelligence (AI)

AI is defined as the study of "intelligent agents," that is, any technology that observes its
environment and takes actions to maximize its chances of success [9]. The vast majority of AI
systems now under development are specialized expert systems that make judgments based on a
knowledge database. Many scholars, on the other hand, are attempting to develop AI systems capable
of applying truly smart decision-making processes to a limited range of challenges, some of which
may have a positive impact on our everyday lives.

c. Blockchain

The growing popularity of Bitcoin draws attention to the blockchain technology that underpins it.
The scope of
blockchain is far wider than that of Bitcoin. Blockchain is a distributed, immutable ledger that
records all transactions and distributes them to all participants. It is a decentralized method in
which no third-party organizations are involved. The consent of the participants validates each
transaction. Every transaction on the blockchain is shared with all nodes. A transaction cannot be
deleted once it has been entered in the ledger. This feature increases system transparency
compared with centralized payments that involve third parties [10]. Simply said, blockchain is a
method of data transfer. It enables users to safely communicate data and make node-to-node
payments without the use of a mediator or centralized management system.

III. RELATED SURVEY

In this work, the authors investigate how the advent of new technologies has the potential to
change B2B relationships and give rise to dark-side effects via mechanisms that have not
previously been investigated or understood. In the process, they show how theories that have been
largely ignored in the literature might yield unique insights when investigating the dark side of
B2B relationships. [11]

AI and blockchain are two of the most disruptive technologies, with the potential to dramatically
change the way we live, work, and interact. The authors discuss current initiatives and examine
the promising future of their integration in order to answer the question, "What can smart,
decentralized, and secure systems achieve for our society?" [5]

This paper describes SecNet, an architecture that enables secure data storage, computing, and
sharing in a large-scale Internet environment. Its goal is a more secure cyberspace with real big
data, and therefore enhanced AI with plentiful data sources, achieved by integrating three key
components: 1) a blockchain-based data sharing platform with ownership guarantees; 2) an AI-based
secure computing platform; and 3) a trusted value-exchange mechanism that allows participants to
obtain economic incentives when sharing their data or services, thereby encouraging data sharing
and improving AI performance. In addition, the authors examine SecNet's typical use scenario,
alternative deployment strategies, and its effectiveness in terms of network security and economic
revenue. [12]

The author of this paper notes that blockchain technology has proven exceptionally effective at
securely conducting distributed transactions and has a wide range of uses, including Bitcoin
cryptocurrency management and smart contracts. The potential of blockchain for data science
applications has recently been examined. The paper looks into blockchain technology and how it can
be applied in data science and cyber security. [13]

The authors of this study discuss several aspects of big data and the issues it brings, as well as
some future research avenues. [8]

This study focuses on revealing and resolving the limitations of blockchain in terms of
confidentiality and security, yet many of the proposed solutions lack specific evaluation of their
effectiveness. Many other blockchain scaling challenges, such as throughput and latency, have been
neglected. Based on the outcomes of the study, suggestions for further research directions are
offered to researchers. [2]

This study focuses on revealing and resolving the limitations of blockchain from the viewpoints of
safety and security, whereas many of the proposed solutions lack specific evaluation of their
effectiveness. Many other blockchain scaling challenges, such as throughput and latency, have gone
unaddressed. Based on the study's findings, recommendations for future research areas are offered
to researchers. [9]

The authors of this study emphasize that blockchain technology provides a digital trust mechanism
for humans, which improves the efficiency of value exchange and decreases costs, and that a truly
credible and efficient Internet of Value is approaching. They also discuss Nebula AI's work in
developing a decentralized artificial intelligence computing blockchain. [3]

This paper introduces big data technology and its importance in today's world, along with existing
initiatives that are effective and critical in transforming science into big science and in
transforming society. The numerous hurdles encountered in adopting big data technology and its
tools (Hadoop), and the problems encountered with Hadoop itself, are also thoroughly discussed.
The study finishes with recommendations for good big data practices. [4]

This study describes the impact of blockchain on big data and its outcomes. The author also
discusses how blockchain can be used to add another data layer to the big data analytics process.
[14]

The purpose of this research is to provide a technique for developing a blockchain-based
environment for exchanging health care data. The technique considers privacy concerns and data
exchange, and treats patients as the focal point for data regulation. [15]

This research suggests leveraging blockchain technology to develop access control systems that
allow the evaluation of access control policies to be traced. The basic concept of the approach is
to encode attribute-based access control rules as smart contracts and deploy them on a blockchain,
transforming policy evaluation into a fully public smart contract execution. [16]

This study demonstrated the many possibilities for developing accurate artificial intelligence
models in e-Health using blockchain, which is an open network for information sharing and
permission. Healthcare practitioners will be able to access the blockchain to view the patient's
medical information, and AI will use a variety of proposed algorithms and decision-making
capabilities, as well as vast amounts of data. [17]

In this paper, the authors briefly describe various methods and platforms that attempt to bring
the strengths of blockchain into robotic applications, improve AI technologies, or fix issues that
exist in the major blockchains, which can lead to robotic applications with enhanced functionality
and security. They give an overview, analyze the approaches, and finish with their thoughts on the
future of this technological integration. [18]

In this paper, the author emphasizes current achievements in this
subject and calls for improved interpretability in artificial intelligence. Further, it provides
two methods for explaining the predictions of deep learning models: one that evaluates the
prediction's sensitivity to changes in the input, and a second that decomposes the decision in
terms of the input variables. These approaches are evaluated on three classification tasks. [19]

IV. CHARACTERISTICS OF BLOCKCHAIN

The following are the five fundamental properties of blockchain technology:

Decentralization: All blockchain participants have full access to the database and its historical
records. The information and data are not within the control of a single entity. Without the need
for a mediator, each party can directly examine the information of its business partners.

Security: Blockchain technology can prevent data from being tampered with. Each participant in the
blockchain system has its own copy of the database, and each node holds all of the data. Even if a
node is broken or attacked, the database remains unaffected.

Data privacy and security: Because data sharing among blockchain nodes follows a strict consensus
procedure, blockchain does not require trust and can exchange data based on addresses instead of
human identities. At the same time, the blockchain employs encryption to assure data protection;
even if information is leaked, it cannot be interpreted.

Smart contract: A smart contract is a piece of software that runs automatically when specific
conditions are satisfied. Due to the digital structure of the ledger, blockchain transactions can
be tied to computational logic and are essentially programmable. As a result, users can construct
algorithms and rules that automatically trigger transactions between nodes.

Controllability: Since the blockchain structure stores all previous data with timestamps, starting
from the first block, any data on the blockchain can be tracked back to its origin. This increases
the traceability and visibility of blockchain data.

V. THE IMPACT OF BLOCKCHAIN ON DATA SCIENCE AND AI

The value of data science lies in extracting pertinent data from data sets in order to create data
products. Due to data confidentiality, complexity, and an imbalance between demand and supply, the
current state of data exchange has substantially hampered the growth of data science in the
community. Data gathering, sharing, and confidentiality protection are key hurdles to the growth
of data science. Many institutions, businesses, and government entities [20] are investigating
blockchain applications in logistics networks, health care records, voting, energy supply,
ownership management, and critical national infrastructure security [21], because blockchain
properties such as transparency, security, traceability, and privacy can complement data science.

Many of the problems of AI and blockchain can be efficiently addressed by merging the two
technical ecosystems [22]. AI algorithms rely on data or information to learn, infer, and make
final decisions. Machine learning algorithms perform better when data is acquired from a
dependable, safe, trusted, and credible data repository or platform. Blockchain is a distributed
ledger that allows data to be stored and transacted in a fashion that is cryptographically signed,
confirmed, and agreed upon by all mining nodes. Blockchain data has excellent integrity and
resilience and is highly resistant to tampering. When smart contracts are used to make decisions
and run analytics with machine learning algorithms, the results can be trusted and are not open to
dispute. The combination of AI and blockchain has the potential to build a safe, immutable, and
decentralized system for the very sensitive data that AI-driven systems must collect, store, and
use [23]. This approach leads to major gains in data and information security in a variety of
domains, including medical, personal, banking and financial, trading, and legal data.

The figure below depicts how AI and data science might profit from the availability of numerous
blockchain platforms for running machine learning algorithms and tracing data saved on
decentralized P2P storage systems.

VI. ADVANTAGES OF BLOCKCHAIN IMPACT OVER DATA SCIENCE AND AI

Despite breakthroughs in data science and AI technologies, data risk management is not without
hurdles, including data silos, poor data performance, and data leaks. The core philosophy of
blockchain is one of decentralization, openness, and transparency. Blockchain technology has the
potential to solve the Internet ecosystem's trust problem, hence propelling the growth of big data
and artificial intelligence. In this part, we look at some of the potential advantages of
combining blockchain with data science and AI.

a. Security and Privacy

Because of its decentralized structure, blockchain helps ensure data privacy and security. The
great majority of data is stored on centralized servers, which might result in data leaks and
loss, and cybercriminals frequently target such servers. Blockchain technology decentralizes data
control, making large-scale data access and manipulation more difficult. Furthermore, blockchain
transaction data, such as the transfer address, amount, and transaction time, is open and public,
but the identity
of the owner of the transaction address is anonymous. The encryption used in blockchain technology
separates a person's identity from user data.

b. Credibility and Transparency

Through the automated deployment of smart contracts, with blockchain technology acting as a
bridge, data science and artificial intelligence (AI) might provide data analysis for buyers.
Smart contracts decrease human interference and redundancy. Blockchain technology is coupled with
automated contract execution, which lets parties that are not familiar with one another trust the
outcome through monitoring and accurate assessment of all data.

c. Data Interpretation

In the era of big data, data protection difficulties encompass not only the protection of personal
privacy but also data interpretation, which tries to anticipate people's health and behavior.

d. Data Sharing

To some extent, blockchain has addressed the security issue associated with data exchange.
Healthcare is a classic cross-sector scenario, with no single organization having access to all
data. Blockchain provides a solution for data exchange by integrating cloud computing,
node-to-node transmission, a consensus process, and encryption techniques. The blockchain removes
data silos and increases the value of data. In healthcare, blockchain may be used to strike a
balance between security and the accessibility of digitized medical records.

e. Protection of Data Sovereignty

Because there is no effective means to monitor how data is used and who retains it, the majority
of data is not actually held by its owners. As a result, there is no system in place to track down
or penalize offenders who use data without restriction.

VII. OPEN CHALLENGES IN THE BLOCKCHAIN IMPACT ON DATA SCIENCE AND AI

This part explores the primary challenges that must be overcome when implementing blockchain
technology in data science and artificial intelligence. Blockchain was developed very quickly, and
the supporting technology infrastructure is still maturing. Blockchain is still in its "infancy"
in comparison to most other technologies. Although blockchain has tremendous potential benefits
for data management, there are also significant challenges. This section discusses some of the
identified issues.

a. Scalability

A massive amount of data must be managed and evaluated within data science and AI. As a result,
data storage must be scalable while supporting high transaction speeds. Scalability is the ability
of a system, network, or process to expand its capabilities to handle a growing workload. Although
blockchain has major advantages over conventional systems, it is prone to the scalability problem,
which impedes real-time transactions. The larger the blockchain, the more time it takes to
replicate data to other network nodes. This affects both new nodes and those brought back up after
an extended period of inactivity.

b. Difficulty in Accurate Analysis

The scale of data will continue to grow due to the exponential expansion of blockchain technology
applications. Blockchain data fusion will widen and enrich data in a variety of business
scenarios. Although blockchain protects anonymity and privacy, it also makes obtaining critical
information from blockchain data sets extremely difficult. Mapping the connection between accounts
and addresses is the most important and complex aspect of on-chain data analysis. The more
confidential the blockchain data, the more complicated it is to profile users accurately.
Researchers can only make broad generalizations from this information; accurately forecasting
people's future behavior and impacts is challenging.

c. Consensus Upgrade

A consensus algorithm is a mechanism that allows several network users to reach agreement. Because
the public blockchain is decentralized, the nodes must agree on the validity of each transaction.
The goal of the consensus algorithm is to validate that all blocks adhere to the protocol rules
and that all transactions are carried out in a secure environment. Whenever blockchain is employed
in data science, the network's membership must be diverse.

d. Intensified Competition

Without a doubt, data is the bedrock of data processing. Every company wants to acquire as much
data as possible in order to improve its long-term competitiveness. Because the concepts of
"blockchain," "data science," and "AI" are mutually beneficial, an increasing number of start-ups
are venturing into the blockchain, data science, and AI fields. Big data firms' current profit
model may be characterized as "data-tools-services." The first barrier to competitiveness is the
collection of data.

VIII. CONCLUSION

Data science, artificial intelligence, and blockchain can be used in conjunction to transform how
we process and understand data. The massive volume, variety, and rapid growth rate of data, as
well as the rapid advancement of data applications, have placed enormous demands on optimizing the
user volume, parallelization, and energy consumption of queries for privacy protection services.
In this research, we have provided an introduction to data science and artificial intelligence
before delving into big data technologies and the threats they face. After that, we have gone
through
blockchain, covering blockchain architecture and key blockchain characteristics. We also discuss
the possible advantages, drawbacks, and unsolved challenges of combining blockchain with data
science. As we adopt blockchain technology, we must further establish its objectives and
principles, concentrate on research and analysis of data security risks, continually monitor
development trends, and actively seek regulatory channels.

IX. REFERENCES

[1] M. Koch, “Artificial intelligence is becoming natural,” Cell, vol. 173, no. 3, pp. 531–533,
2018.

[2] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural networks, vol. 61,
pp. 85–117, 2015.

[3] N. A. Team, “Nebula ai-A decentralized ai blockchain whitepaper,” 2018.

[4] Katal, A., M. Wazid, and R. Goudar. Big data: issues, challenges, tools and good practices.
in 2013 Sixth international conference on contemporary computing (IC3). 2013. IEEE.

[5] T. N. Dinh and M. T. Thai, “Ai and blockchain: A disruptive integration,” Computer, vol. 51,
no. 9, pp. 48–53, 2018.

[6] Y. Qi and J. Xiao, “Fintech: Ai powers financial services to improve people’s lives,”
Communications of the ACM, vol. 61, no. 11, pp. 65–69, 2018.

[7] Neimeth, C. What can be uncovered when big data meets the blockchain. JUN 29, 2017; Available
from: https://www.infoworld.com/article/3203748/what-can-beuncovered-when-big-data-meets-the-blockchain.html.

[8] Yin, S. and O. Kaynak, Big data for modern industry: challenges and trends [point of view].
Proceedings of the IEEE, 2015. 103(2): p. 143-146.

[9] D. Marr, “Artificial intelligence-A personal view,” Artificial Intelligence, vol. 9, no. 1,
pp. 37–48, 1977.

[10] Yli-Huumo, J., et al., Where is current research on blockchain technology-a systematic
review. PloS one, 2016. 11(10): p. e0163477.

[11] David M. Gligor; Kishore Gopalakrishna Pillai; Ismail Golgeci (2021). Theorizing the dark
side of business-to-business relationships in the era of AI, big data, and blockchain. Journal of
Business Research.

[12] Wang, Kai; Dong, Jiaqing; Wang, Ying; Yin, Hao (2019). Securing Data with Blockchain and AI.
IEEE.

[13] Bhavani Thuraisingham (2020). Blockchain Technologies and Their Applications in Data Science
and Cyber Security. 2020 3rd International Conference on Smart BlockChain (SmartBlock).

[14] Fedak, V., Blockchain and big data: The match made in heavens. Towards Data Science, 2018.

[15] Cichosz, S.L., et al., How to use blockchain for diabetes health care data and access
management: an operational concept. Journal of diabetes science and technology, 2019. 13(2):
p. 248-253.

[16] Maesa, D.D.F., P. Mori, and L. Ricci, A blockchain based approach for the definition of
auditable Access Control systems. Computers & Security, 2019. 84: p. 93-119.

[17] Tagde, P., Tagde, S., Bhattacharya, T. et al. Blockchain and artificial intelligence
technology in e-Health. Environ Sci Pollut Res 28, 52810–52831 (2021).

[18] V. Lopes and L. A. Alexandre, “An overview of blockchain integration with robotics and
artificial intelligence,” arXiv preprint arXiv:1810.00329, 2018.

[19] W. Samek, T. Wiegand, and K.-R. Müller, “Explainable artificial intelligence: Understanding,
visualizing and interpreting deep learning models,” arXiv preprint arXiv:1708.08296, 2017.

[20] Walport, M., Distributed ledger technology: Beyond blockchain. UK Government Office for
Science, 2016. 1.

[21] Xu, X., et al. A taxonomy of blockchain-based systems for architecture design. in 2017 IEEE
International Conference on Software Architecture (ICSA). 2017. IEEE.

[22] A. Panarello, N. Tapas, G. Merlino, F. Longo, and A. Puliafito, “Blockchain and iot
integration: A systematic survey,” Sensors, vol. 18, no. 8, p. 2575, 2018.

[23] T. Marwala and B. Xing, “Blockchain and artificial intelligence,” arXiv preprint
arXiv:1802.04451, 2018.

[24] Salah, K.; Rehman, M. H.; Nizamuddin, N.; Al-Fuqaha, A. (2019). Blockchain for AI: Review and
Open Research Challenges. IEEE

[25] Jiameng Liu; Shaoliang Peng; Chengnian Long; Lijun Wei; Yunhao Liu; Zhihui Tian (2020).
Blockchain for Data Science. Proceedings of the 2020 The 2nd International Conference on
Blockchain Technology.

X. AUTHOR’S BIOGRAPHY

I am Mesum Sultan. I am a third-year student of software engineering at NED University of
Engineering and Technology. I have hands-on experience in data science, AI, and web development.
I am further exploring blockchain and Web 3.0.
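
As a supplementary illustration of the ledger properties discussed in the Background and Characteristics sections (each block commits to the one before it, so an entered record cannot be changed unnoticed and can be traced back to the first block), the following is a minimal Python sketch. It is not taken from any particular blockchain platform: the block layout, the function names (block_hash, make_block, build_chain, is_valid), and the choice of SHA-256 over a JSON encoding are assumptions made for illustration only, and consensus, digital signatures, and networking are left out entirely.

```python
import hashlib
import json
import time


def block_hash(index, timestamp, data, prev_hash):
    """Return the SHA-256 digest of a block's contents (hypothetical layout)."""
    payload = json.dumps(
        {"index": index, "timestamp": timestamp, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(index, data, prev_hash, timestamp=None):
    """Create a block that commits to its own data and to the previous block."""
    ts = time.time() if timestamp is None else timestamp
    return {
        "index": index,
        "timestamp": ts,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, ts, data, prev_hash),
    }


def build_chain(records):
    """Build a toy chain: a fixed first block followed by one block per record."""
    chain = [make_block(0, "genesis", "0" * 64, timestamp=0.0)]
    for i, record in enumerate(records, start=1):
        chain.append(make_block(i, record, chain[-1]["hash"]))
    return chain


def is_valid(chain):
    """Recompute every block hash and check every link to the previous block."""
    for i, block in enumerate(chain):
        expected = block_hash(
            block["index"], block["timestamp"], block["data"], block["prev_hash"]
        )
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


if __name__ == "__main__":
    ledger = build_chain(["alice pays bob 5", "bob pays carol 2"])
    print(is_valid(ledger))  # prints True: the untouched chain is consistent
    ledger[1]["data"] = "alice pays bob 500"
    print(is_valid(ledger))  # prints False: the edited record no longer matches its hash
```

In this toy version a single party could simply recompute all later hashes after editing a record; in the setting the paper describes, it is the replication of the ledger across nodes and their agreement on its contents that makes such a rewrite impractical.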
