
Introduction

The goal of this report is to critically discuss the task of developing a bespoke data science product
relating to a specific application domain as part of an R&D project. This assignment is split into two
tasks. Firstly, to produce the data science product for an end user with little to no knowledge in data

work. Secondly, this report will critically evaluate the development process used to produce the app,
looking at the design of the product, the product's development and how the project was managed.

This assignment sets out to create a data science product based on stock market data. The goal is an
interactive dashboard called Stock Market Dashboard (SMD), where the user, as part of a
financial company, can load data on different company stocks and have it presented in a visually
appealing, easy-to-understand and informative way.

Product design
Product design is important before developing any software-based product, allowing the developer to
plan what will happen during development and how the product will look at the end. During the project,
an initial design was made to ensure that the end-product had all the requirements and functionality for
it to work in an industry scenario (Elliott, 2000).

The first product design decision to be made was which data and data source the product would
be built on. To do this, an assessment of the product needed to be made. With the product being a
dashboard for stock market data for finance companies, a snapshot of historical stock data seemed
appropriate. The S&P 500 stock data set (Kaggle.com, 2019)
was chosen as it contained time-series stock market data and it could be used to develop relevant charts
that could provide insight into the data.
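
As a rough illustration of what working with this kind of time-series data involves, the sketch below
groups daily rows by ticker. It is written in Python rather than R purely for illustration. The column
layout (date, open, high, low, close, volume, Name) is an assumption about the snapshot file, and the
sample rows and the load_prices helper are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Illustrative rows only; the column layout is an assumption about the snapshot file.
SAMPLE_CSV = """date,open,high,low,close,volume,Name
2013-02-11,14.89,15.01,14.26,14.46,8882000,AAL
2013-02-08,15.07,15.12,14.63,14.75,8407500,AAL
2013-02-08,67.71,68.40,66.89,67.85,158168416,AAPL
2013-02-11,68.07,69.28,67.61,68.56,129029425,AAPL
"""


def load_prices(handle):
    """Group rows by ticker: {ticker: [(date, close), ...]} sorted by date."""
    series = defaultdict(list)
    for row in csv.DictReader(handle):
        day = date.fromisoformat(row["date"])
        series[row["Name"]].append((day, float(row["close"])))
    for points in series.values():
        points.sort()  # tuples sort by date first, giving chronological order
    return dict(series)


prices = load_prices(io.StringIO(SAMPLE_CSV))
print(sorted(prices))       # tickers found in the sample
print(prices["AAL"][0])     # earliest (date, close) pair for one ticker
```

Keeping one chronologically sorted series per ticker is what later makes per-stock charts and
comparisons straightforward.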

The goal of this product is to allow the user to gain a large amount of insight into the data by presenting
it clearly and understandably. To do this, an interactive dashboard product was chosen.
Having interactive charts that can be easily manipulated to provide the information
the user is looking for could allow for quicker at-a-glance analysis of the data in an industry where fast
decision making is very important. While the financial industry has many means of doing a deep
analysis of stock market data, complexity has caused quick analysis and easy comparisons of data to be
overlooked, leaving an opportunity to create a streamlined solution.

Looking at the application domain and end-user requirements for the product, there are three main
factors to be considered: speed, accessibility and clarity. The application must meet these factors
because the financial industry is centred around making fast but well-informed decisions to ensure
maximum profit can be made (Lehtola and Kauppinen, 2006). Focusing on these factors
should result in an application that allows an industry professional to make a fast and
well-informed decision using the SMD.

When designing a product, it is important to understand the product's functional and non-functional
requirements as this helps with the planning and design phase of development. Functional and
non-functional requirements are also useful in helping ensure that the product being developed is meeting
its initial goals, and can act as a checklist of what needs to be added to the application (Dua and Raj,
2018). Planning these requirements should allow for a better-quality product to be developed by the end of
the project.

Functional Requirements:

Import data

User selectable stock data

Stock comparison graphs

Select different data features

Deep dive into specific stock
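
One way the "User selectable stock data" and "Stock comparison graphs" requirements could be approached
is to rebase each selected stock's closing prices to a common starting value, so that stocks with very
different prices can share one chart. The sketch below is in Python for illustration only and is not
taken from the SMD itself. The rebase helper, the tickers and the prices are all hypothetical.

```python
def rebase(closes, base=100.0):
    """Scale a price series so that its first value equals `base`."""
    if not closes:
        return []
    first = closes[0]
    if first == 0:
        raise ValueError("first price must be non-zero")
    return [base * price / first for price in closes]


# Hypothetical closing prices for two made-up tickers.
closes = {
    "AAA": [50.0, 55.0, 60.0],
    "BBB": [200.0, 190.0, 210.0],
    "CCC": [10.0, 10.5, 9.5],
}

# The user's selection decides which series reach the comparison chart.
selected = ["AAA", "BBB"]
comparison = {ticker: rebase(closes[ticker]) for ticker in selected}

for ticker, values in comparison.items():
    print(ticker, [round(v, 2) for v in values])
```

After rebasing, each line on the chart shows relative change since the first date rather than raw
price, which is usually what a quick comparison needs.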

Non-functional Requirements:

Speed

Scalability

Reliability

Usability

Accessibility

Interactive

This product is intended as an initial look at the data for a financial industry professional. It is
designed to be easy to use and quickly accessible, so people can find information about the stock market
quickly. Because of this, there are two main use cases for this product. Firstly, it will be used in
financial offices as the first port of call to see how the stock market is doing, so it should work on PCs
and be easy to access. To ensure this, the product will be hosted on a website where a user can quickly go
to a URL and see the data. The second main use case for this product will be on mobile, as it should be
accessible for a user to quickly look at the data while they are on the go. This matters because financial
industry professionals often look at the financial markets outside of work hours and will need to access
the necessary data from home.

Product development
When considering the appropriate software tools, platforms and hardware methodologies, it is important
to look at what the goal of this product is. For the SMD product that is being developed, the main goal
is to create a data dashboard. There are a few tools that can be used to produce a data
dashboard like this, such as Elasticsearch and IBM Cognos. However, when looking at the functional
and non-functional requirements of the product, the cheapest and easiest solution is R Shiny
(Shiny.rstudio.com, 2019). R Shiny allows for the quick and easy development of simple-to-use data
dashboards built in the R language. It has the benefit of being very customisable since it is built in R
from the ground up, and it is free, which makes it more accessible to develop for. It also has the
capability to meet all the functional and non-functional requirements the project asks for (Beeley and
Sukhdeve, n.d.).

For any project, it is important to decide what software engineering methodology should be implemented
to get the best result from the development. For this SMD project, an agile methodology was
implemented. This was done because it gave regular opportunities to assess the project and make
changes when necessary. This is important when working with an unfamiliar technology such as R
Shiny, as it allows changes in the development to be made quickly, which can reduce the time wasted
when the plan has to change.

Figure 1: Agile software development cycle (lib.ncsu.edu, 2019)

A benefit of the agile development methodology is the way it integrates system testing into the
development cycle. When using an agile methodology, testing becomes a regular occurrence since the
development cycles are short and testing is one of the main steps. When testing is done during this
project, it will be done to ensure that the application does not have any bugs or crashes and that it runs
in a way that is satisfactory to the user. R Shiny has its own testing support in the "shinytest"
package, which can be used to record and run tests against the interface.

To evaluate the product at the end of the development cycle, an analysis of the product and how it meets
the functional and non-functional requirements will be made. If the product can justifiably meet all the
requirements set out for it, it should be complete to the required standard. The product
should also meet a list of quality assurance requirements, ensuring that the application is free of
bugs, crashes and any slowdown that could affect the user's experience.

Project management
When working on a large-scale product, time management is vital in ensuring that development goals
and the completion of the project are met by a deadline. This becomes evident when working in the real
world as part of a company because time is money and going over a deadline can result in losses for a
business. Because of this, time management tools can be vital in showing project progress and help
predict when a project will be complete.

As part of the SMD project, a Gantt chart was used to monitor the project and record time taken. In a
real-world scenario, time management tools like this could be used to develop a bill for a client if
needed. However, the main purpose of tools like a Gantt chart is to monitor project progress and assess
whether project deadlines are going to be met. It is also useful when assessing where issues have arisen
and where improvements can be made in future projects (Ong, Wang and Zainon, 2016).

Figure 2: Project Gantt Chart

For the project in this report, I used a Gantt chart to monitor my progress in developing the application.
I found it was a very useful planning tool in helping achieve project goals in a timely manner. It can also
be used to see my progress and skill in using the R Shiny toolset, as more complex tasks begin to take
less time as the project goes on. This is reflected in the chart.

When working on any data-focused project, it is important to evaluate the risks surrounding the data
that is being used and to do a risk assessment of personal data and identification. This risk assessment
is needed as a legal requirement since misuse of data, and especially private data, is a criminal offence
in the UK and most countries (Smouter, 2018). However, since this project focuses on stock market
data, which is publicly available and has no identifying features, this is not much of a concern.

Something that should be a concern when working with data in the stock market is getting access to
data that is not publicly available. This can breach insider trading law, which protects against making
investments on the stock market using private data. Such trading is prohibited because it gives an
unfair advantage to people who have more information than is publicly available. However, since the data
of the SMD is available on Kaggle and only contains publicly available data, this is not a concern.

It is also important to consider security and governance of both the data and the software that is being
developed. Since the security of data is becoming more of a requirement because of GDPR and the Data
Protection Act, it is important to take all the necessary steps to ensure data is protected and handled
correctly. Some of the requirements and considerations these new laws provide include ensuring the
data is used fairly, lawfully and transparently; used for specified, explicit purposes; and kept accurate
and up to date (Scott, 2019). These laws and considerations will be taken into account during the
development of the SMD.

Quality control of a software development project is important as it ensures that the product at the end
of development can be released without risk of breaking, having bugs, or providing an unsatisfactory user
experience. The process used for quality control can be highly dependent on the software development
methodology used. Using an agile methodology results in testing being a common occurrence in the
development cycle, which can boost the quality of the product by removing bugs early. This is closely
related to test-driven development (Crispin, 2006).

When developing a project, developer/user relationship management is vital in ensuring that the product
that is being developed is what the client or user wants. Having a good connection with a user could give
more insight into what the user wants from the product, which could result in adding or changing functional
and non-functional requirements throughout the development cycle. The better the relationship with the
user, the more chances there will be to receive feedback on the developed product. This should increase
the quality of the product and give a higher chance that the user will want to purchase the product once it
is finished (Cheruy, Robert and Belbaly, 2017).

When a product is fully developed, the marketing strategy is important in ensuring that the product can
make as much money as possible if it gets sold. Having a good relationship with the user during the
development of a product can result in the product being sold before it is finished if a user-tester likes

the product get sold, but this will only get the product so far. Another marketing strategy that will be
employed when selling the SMD is advertising and working with people inside the financial and
stock market industry. Giving free trials and demos can entice users in before selling the full product,
and if the product can be developed in collaboration with industry professionals, it should ensure that
most people will like what they see and want to buy (Singh, 2013).

Conclusion
During this assignment I have managed to develop a good initial prototype of the SMD product.
However, the product still lacks features and enhancements that are needed to be marketable.
Nevertheless, during this project I have developed a wealth of knowledge of how to turn data science
tools into a potential product rather than a single graph or algorithm. I have also gained knowledge in
using the R Shiny toolset to develop an interactive product that the user can alter to get the data, graphs
and insights that they are looking for.

However, there are still many opportunities for the project to be enhanced and expanded upon. For the
application to work in the real world, access to real-time data is required to ensure financial
professionals have the latest insights rather than old stock market snapshots. Also, for the application
to be industry ready, more analytical metrics that are useful to the user should be implemented,
such as forecasting and the stochastic relative strength index (STOCHRSI).
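
To give a sense of what adding the stochastic relative strength index would involve, the sketch below
computes it from a list of closing prices. It is written in Python for illustration and is not part of
the SMD prototype. It assumes the common definition, in which RSI uses Wilder's smoothing and STOCHRSI
is (RSI - lowest RSI) / (highest RSI - lowest RSI) over the same look-back period. The function names,
the short period and the prices are hypothetical.

```python
def rsi(closes, period=14):
    """Relative strength index with Wilder's smoothing (assumed variant)."""
    if len(closes) <= period:
        return []
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed the averages with a simple mean of the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    values = []
    for i in range(period, len(deltas) + 1):
        if i > period:
            avg_gain = (avg_gain * (period - 1) + gains[i - 1]) / period
            avg_loss = (avg_loss * (period - 1) + losses[i - 1]) / period
        if avg_loss == 0:
            values.append(100.0)  # convention chosen here when there are no losses
        else:
            rs = avg_gain / avg_loss
            values.append(100.0 - 100.0 / (1.0 + rs))
    return values


def stoch_rsi(closes, period=14):
    """Position of the latest RSI within its own range over `period` values."""
    r = rsi(closes, period)
    out = []
    for i in range(period - 1, len(r)):
        window = r[i - period + 1 : i + 1]
        lo, hi = min(window), max(window)
        # A flat RSI window has no range; 0.0 is an arbitrary choice for that case.
        out.append(0.0 if hi == lo else (r[i] - lo) / (hi - lo))
    return out


# Hypothetical closing prices; a short period keeps the example small.
closes = [10.0, 11.0, 10.5, 11.5, 12.0, 11.0, 11.5, 12.5, 12.0, 13.0]
print([round(v, 3) for v in stoch_rsi(closes, period=3)])
```

The result always lies between 0 and 1, with values near 1 meaning the RSI is at the top of its recent
range and values near 0 meaning it is at the bottom.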

Even so, the application I created can still act as a prototype and a proof of concept as a solution to the
problem financial professionals have. It provides quick, accessible and understandable stock market
data to the client that can be used to provide insight into the financial markets. If the project is expanded
through the suggestions mentioned above, it could become a fully developed, industry-used piece of
software.

References
Beeley, C. and Sukhdeve, S. R. (n.d.). Web Application Development with R Using Shiny.

Cheruy, C., Robert, F. and Belbaly, N. (2017). OSS popularity: Understanding the relationship
between user-developer interaction, market potential and development stage. Systèmes d'information
& management, 22(3), p.47.

Crispin, L. (2006). Driving Software Quality: How Test-Driven Development Impacts Software
Quality. IEEE Software, 23(6), pp.70-71.

Dua, R. and Raj, G. (2018). Quality Analysis for Web Services Recommendation Using Functional
and Non-Functional Requirement. SSRN Electronic Journal.

Elliott, J. (2000). Design of a product-focused customer-oriented process. Information and Software
Technology, 42(14), pp.973-981.

Kaggle.com. (2019). S&P 500 stock data. [online] Available at:
https://www.kaggle.com/camnugent/sandp500 [Accessed 17 May 2019].

Lehtola, L. and Kauppinen, M. (2006). Suitability of requirements prioritization methods for market-
driven software product development. Software Process: Improvement and Practice, 11(1), pp.7-19.

lib.ncsu.edu. (2019). Agile software development cycle. [image] Available at:
https://www.lib.ncsu.edu/news/main-news/agile-and-scrum-workshops-start-jan-31 [Accessed 24 May
2019].

Ong, H., Wang, C. and Zainon, N. (2016). Integrated Earned Value Gantt Chart (EV-Gantt) Tool for
Project Portfolio Planning and Monitoring Optimization. Engineering Management Journal, 28(1),
pp.39-53.

Scott, P. (2019). National Security, Data Protection, and Data Sharing after the Data Protection Act
2018. SSRN Electronic Journal.

Shiny.rstudio.com. (2019). Shiny. [online] Available at: https://shiny.rstudio.com/ [Accessed 23 May
2019].

Singh, S. (2013). A Study of Marketing Strategies of Software Firm: A Case of Syber Systems and
Solutions. Journal of Research in Marketing, 1(3), p.74.

Smouter, K. (2018). The Year of the GDPR. Research World, 2018(68), pp.48-49.

Appendix 1 App screenshots
