
BAN 702 Module 1 Assignment: Types of Data, Big Data and Artificial Intelligence

Overview
Please answer the following questions. Although each question could require many pages to answer
fully, I expect that you will write just a few paragraphs to serve for each answer for this assignment.
Please answer the questions thoughtfully; do not use single-word answers or quick phrases to answer
the questions (except for the last question). Save your typed answers as either a Word document or a
.pdf file and upload that file to Canvas by the required due date/time.

Questions

1. You were asked to read one of the possible applications-oriented readings as the fourth
reading for this module based on your relative interest in the domains provided. Provide a
very brief summary of the reading without copying the abstract. Do you think that the domain
you selected is currently using big data or artificial intelligence effectively? Why or why not?
What are the key issues in using big data or artificial intelligence in that domain? Make sure
you define big data or artificial intelligence, as relevant to your selected reading, in your
answer.

The reading I chose is “Big Data in Accounting: An Overview” by Miklos A. Vasarhelyi, Alexander
Kogan, and Brad M. Tuttle. The essay discusses how Big Data can be incorporated into different fields
of accounting, such as auditing and general accounting, and it covers the sources, uses, and challenges
of Big Data in these two domains. According to the reading, a wave of change is coming to how
accounting data and records are collected: nontraditional sources of data are emerging, accountancy
standards need to change, and audit analytics are creating new opportunities through Big Data. The
reading discusses not only the challenges of using Big Data but also how this advancement can benefit
the different accounting domains.
The domains in this reading are accounting and auditing. The reading suggests that the accounting
domain has yet to use Big Data in its practices and standards. From my understanding, Big Data
wouldn’t be effective right away in this domain, since it can cause a paradigm shift in which economic
activities are traced and measured earlier and more deeply. The rise in technology has made it possible
for accounting businesses to handle more data than they have in the past. The reading suggests that the
audit domain is trying to use Big Data but has not yet done so effectively. It states that there have been
major challenges with assurance processes when businesses have larger systems that produce Big Data.
The reading demonstrates several key issues in using Big Data in the accounting and audit domains.
First, Big Data can have several different meanings: what a small accounting firm considers Big Data
may differ from what a Big 4 accounting firm does. In general, however, Big Data implies that the
amount of data is at or beyond the limit of what the relevant information systems can store and/or
process. The key issues in accounting are that Big Data leaves companies with several questions about
whether the new measurements can be reliable with regard to different variables, whether they fit GAAP
guidelines, and whether or not the data can be extracted. In companies with Big Data, only a minimal
portion of it is directly related to financial reporting. Internal reporting has become enormous, and the
added information makes the data less structured. The key issues in auditing are that systems have
increased in size and complexity, so there is little possibility that primary assurance can be accomplished
manually, and that auditing tends to lag behind upper management in the adoption of technology, which
raises questions about the willingness of professionals to adopt new technologies such as Big Data.

2. A few of the articles mentioned that Walmart collects more than 2.5 petabytes of data every
hour from its customer transactions. That is an extraordinarily large quantity of data, but do
you think that this data could be considered “big data”? Why or why not? What type of data
composes a customer transaction (either through the cash register at a Walmart store or an
online purchase)? Make sure that you discuss the characteristics of big data and evaluate
Walmart’s data based on those characteristics. Remember that the volume of data alone
does not qualify it as “big data.”

I believe the more than 2.5 petabytes of data that Walmart collects every hour from its customer
transactions should be considered “big data”. One reason is that a single petabyte is equivalent to about
20 million filing cabinets’ worth of text, which is an enormous amount of data to collect within an hour.
The data also helps Walmart improve its operational efficiency through big data analytics. The data
collected when a customer completes a transaction is unstructured. Looking at the characteristics of big
data, the data Walmart collects has significant volume: its datasets are measured in petabytes, which
require powerful processing technologies. There is also a lot of variety, which further qualifies Walmart’s
data as big data. Walmart is able to collect and store several different types of data, not only from
customer transactions but also from click actions on its website. The velocity characteristic is met as
well: Walmart collects the 2.5 petabytes of data within an hour, which is not a long time to collect that
much data. For a company such as Walmart, one can say the veracity characteristic is also present.
Walmart wouldn’t be as successful as it is if it didn’t have quality, accurate and trustworthy data being

3. A few of the articles discuss structured vs. unstructured data. What is the difference between
structured and unstructured data? Is a tweet structured or unstructured data? Is a Word
document structured or unstructured data? Is a customer transaction at Walmart structured
or unstructured data? Explain your reasoning for deciding whether a given type of data is
structured or unstructured.

Structured data is composed of clearly defined data types whose pattern makes them easily
searchable, while unstructured data is everything else: data that is usually not as easily
searchable, in formats such as audio, video, and social media postings. A tweet from Twitter can be
considered a little bit of both. Tweets can be mined with advanced algorithms, which makes them
unstructured. However, tweets can also be considered structured, since they come with log files or
other metadata. A Word document is considered unstructured data because its content is essentially
information that can’t be stored neatly in a database. Any miscellaneous documents you have can
be considered unstructured data. A customer transaction at Walmart is considered unstructured
data because this data is very difficult to search, since there is so much of it. This data is also
generated by machines.
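
To make the "little bit of both" view of a tweet concrete, here is a minimal Python sketch. The `Tweet` record and its field names are hypothetical, made up for illustration, and are not Twitter's actual data format. The metadata fields can be queried with an exact match, while the free text can only be searched approximately (here with a crude keyword check standing in for real text mining).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    # Structured part: fixed fields with clearly defined types (hypothetical names).
    tweet_id: int
    author: str
    created_at: datetime
    # Unstructured part: free text with no predefined pattern.
    text: str


tweets = [
    Tweet(1, "alice", datetime(2020, 1, 5, 9, 30), "Loving the new store layout!"),
    Tweet(2, "bob", datetime(2020, 1, 6, 14, 0), "Checkout line was way too long today"),
    Tweet(3, "alice", datetime(2020, 1, 7, 18, 45), "Long wait again, but staff were friendly"),
]

# Structured query: an exact match on a typed field is easy and unambiguous.
by_alice = [t.tweet_id for t in tweets if t.author == "alice"]

# Unstructured query: the text has to be searched or mined; this keyword
# check is only a rough stand-in for a real text-mining algorithm.
keywords = ("long", "wait")
mentions_wait = [
    t.tweet_id for t in tweets if any(k in t.text.lower() for k in keywords)
]

print(by_alice)       # tweet ids written by "alice"
print(mentions_wait)  # tweet ids whose text mentions a keyword
```

The author lookup would work the same way on any number of tweets, whereas the keyword check would miss a complaint phrased differently, which is the practical difference between the two kinds of data.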

4. Identify a chat bot/answer bot that you have used (or could have used) while frequenting a
website of your choice. Describe the purpose of that chat bot/answer bot. Was the bot able to
work with you successfully? If yes, what data do you think the bot needed to interface with
you successfully? If no, did the bot need additional data, and what additional data could have
helped make the artificial intelligence of the bot work more successfully?

A chat bot/answer bot is a computer program designed to simulate conversation with human users,
especially over the internet. The chat bot that I discovered was on the Lululemon website. The purpose
of this chat bot is customer support: it can help a customer with questions about orders, products,
returns, and fit/sizing. The bot was somewhat successful. To interface with me successfully, the bot
needed my account information, name, and email, which I had to enter. However, it was not fully
successful, because it then asked for more information, such as the order number and the type of item
I had purchased, since that information had not been requested beforehand. Asking for the order number
and my other questions up front, before I was placed in the queue, would have sped up the process.

5. The article in Harvard Business Review titled “Big Data: The Management Revolution,”
mentions that as of 2012, about 2.5 exabytes of data are created each day. That number is a
little confusing and potentially misleading, so let’s put it into more practical terms. Use a
search engine to find one “quantity” of data that is interesting to you. For example, you might
ask, how many text messages are sent each hour? Or how many tweets are sent each day?
How many total tweets have been stored since Twitter originated in 2006? How many email
messages are sent each day? How much email does Google store? How many orders are
processed by Amazon each day? I have completed the following chart with two examples: the
number of reported cat videos available on YouTube and the amount of data required for
patient visits to the Emergency Room (ER) in one hospital. Fill out the chart on the next page
with one additional type of data that you have identified. Make sure that you complete the
multiplication to determine how many bytes of data would be stored for the type of data that
you have selected.

Chart columns:
- Type of data
- What is the published “quantity” of the data and what is the date of publication?
- How many “bytes” or characters of data does that quantity mean? A byte is equivalent to an
  English language character (A, B, $, 2).
- Where do you think the data is stored?
- Do you think this data is “big”?

Type of data: Cat videos on YouTube
Quantity and date of publication: 2 million cat videos stored on YouTube (2015)
Bytes of data: Assuming an average video length of 4.5 minutes at decent quality = 75 MB per video.
Approx. 150 terabytes.
Where the data is stored: YouTube is owned by Google, so the videos are stored on Google-owned
servers all over the world. The videos are probably stored redundantly, so the amount of storage
required is at least two or three times the amount calculated.
Is this data “big”? Probably yes. There is a large volume, and the velocity is high (because new cat
videos are added frequently). However, there is very little variety of data; it is all video.

Type of data: Emergency room visits to a hospital in a region with about 500,000 potential patients
Quantity and date of publication: 117,362 visits/year (2016)
Bytes of data: To store demographic data, prescriptions, and tests with no imaging = 5,000 bytes per
visit. Approx. 560 megabytes.
Where the data is stored: Stored on the hospital server in one location.
Is this data “big”? Probably no. The volume is comparatively small, and although there are many visits
to the ER, this is relatively low velocity, and the data format is predictable with little variety.

Type of data: Songs on Apple Music
Quantity and date of publication: 50 million songs
Bytes of data: Assuming the average song is 3.5 minutes long, one song is about 30.73 MB. Since there
are 50 million songs, this equals about 1,536 terabytes.
Where the data is stored: This would be stored on iCloud for Apple products.
Is this data “big”? Yes, the data is considered “big”. There is a large volume of songs compared to
Spotify or SoundCloud. Velocity is high as well, since music from different artists is constantly added
throughout the week. There is a lot of variety as well, since there is more than one genre.
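
The multiplication behind the three chart entries can be checked with a short Python sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and treats the 30.73 figure for an average song as megabytes, which is an assumption, although it is the reading that reproduces the 1,536-terabyte total. The function name `total_bytes` is made up for illustration.

```python
# Decimal (SI) units; these are an assumption, since the chart does not say
# whether it uses decimal or binary units.
MB = 10**6
TB = 10**12


def total_bytes(item_count: int, bytes_per_item: int) -> int:
    """Total storage = number of items x average size of one item."""
    return item_count * bytes_per_item


# Cat videos on YouTube: 2 million videos at 75 MB each.
cat_videos = total_bytes(2_000_000, 75 * MB)

# ER visits: 117,362 visits per year at 5,000 bytes each.
er_visits = total_bytes(117_362, 5_000)

# Apple Music: 50 million songs at 30.73 MB each (unit assumed to be MB).
apple_music = total_bytes(50_000_000, 30_730_000)

print(f"Cat videos:  {cat_videos / TB:,.1f} TB")
print(f"ER visits:   {er_visits / MB:,.1f} MB (decimal), "
      f"{er_visits / 2**20:,.1f} MiB (binary)")
print(f"Apple Music: {apple_music / TB:,.1f} TB")
```

Under these assumptions the cat-video and Apple Music totals match the chart (150 TB and about 1,536 TB). The ER total comes to roughly 587 MB in decimal units and about 560 in binary units (1 MiB = 2^20 bytes), which suggests the chart's 560-megabyte figure was computed with binary megabytes.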
