
Predicting the Selling Price of Cars Using Business Intelligence with the Feed-forward Backpropagation Algorithms
1st Nur Oktavin Idris
Department of Computerized Accounting
Sekolah Tinggi Manajemen Informatika dan Komputer Ichsan Gorontalo
Gorontalo, Indonesia
nuroktavin@stmik-ichsan.ac.id

2nd Aspian Achban
Department of Electrical Engineering and Information Technology
Universitas Gadjah Mada
Yogyakarta, Indonesia
aspian.achban@mail.ugm.ac.id

3rd Siti Andini Utiarahman
Department Design of Visual Communication
Universitas Ichsan Gorontalo
Gorontalo, Indonesia
siti_andini@unisan.ac.id

4th Jorry Karim
Department of Informatics Management
Sekolah Tinggi Manajemen Informatika dan Komputer Ichsan Gorontalo
Gorontalo, Indonesia
oyie.potlot@gmail.com

5th Fuad Pontoiyo
Magister of System Engineering
Universitas Gadjah Mada
Yogyakarta, Indonesia
fuad.pontoiyo@mail.ugm.ac.id

Abstract—The automotive industry grows more competitive every year as manufacturers release cars with innovative specifications. These specifications, together with a car's technology and performance, are used to determine its price. However, the industry now frequently releases new products or car types with the latest specifications, which causes prices to change and makes it difficult for car manufacturers to set a price. An approach to a decision-making strategy for predicting a car's price is therefore needed. One such approach is business intelligence, whose primary aspects are descriptive, predictive, and prescriptive. Using this concept, we implement Business Intelligence with the feed-forward backpropagation algorithm to predict the selling price of a car based on its specifications, and to predict the price of a car with the latest specifications that has never been on sale. The findings, obtained with a dataset of BMW specifications, show that the actual and predicted prices are close, with a mean error of 11.46%. The predicted price of a new car with new specifications is $55,754. This research aims to estimate the price of a car with the latest specifications, which is the focus of our business intelligence implementation.

Keywords—predicting the selling price of cars, business intelligence, feed-forward backpropagation.

I. INTRODUCTION

As one of the essential means of transportation in developing countries, cars are mainly used in business activities. Accurate and quick information is necessary and is one of the supporting factors for corporate management decision making. Using an application built on the Business Intelligence (BI) concept, in which BI acts as a business data analyst specializing in sales income and the prices related to that income, we collect, store, and transform information into accessible data. This helps a business user make better decisions. BI can show records, the most up-to-date analyses, and predictions of the business operations run by a corporation or organization [1]. A corporation can use these predictions when making innovations stimulated by the developments and challenges of the globalized, digital era and by increasing competitiveness, especially in the car industry.

The automotive industry grows more competitive every year as manufacturers release cars with innovative specifications. One of these manufacturers is BMW. BMW (Bavarian Motor Works) is one of the most notable automotive manufacturing companies and consistently releases luxurious cars equipped with the best technology and high performance [2]. Due to the preeminence and luxury its cars offer, BMW is priced at a relatively expensive selling rate. However, BMW is one of the public's preferences due to its specifications, features, and technology, which give high consideration to convenience and safety [3]. This is shown by BMW sales, which consistently grew in Indonesia by 10% as of 2019 [4]. Although BMW sales declined in the first semester of 2020 due to the COVID-19 pandemic [5], BMW Group could mitigate the decline by applying appealing and effective strategies such as releasing new products [6]. The products came in various series and specifications offered at various selling rates, allowing potential customers to select a car that suits their preference, lifestyle, and budget.

A car's price is commonly determined based on its specifications, technology, and performance. However, today the automotive industry frequently releases a car product or series with the latest specifications, and the price is adjusted to the specifications. Car prices are therefore very likely to change. This perplexes both customers and car manufacturers when they are determining the price of a car to buy or sell. BI is thus needed to deal with this situation, as the approach can be implemented to support business decisions, from operational to strategic activities, especially in determining a car's price, based on

Authorized licensed use limited to: Univ of Calif Santa Barbara. Downloaded on May 21,2021 at 13:22:28 UTC from IEEE Xplore. Restrictions apply.
both internal and external information [7]. Considering the specifications or features with which a car is equipped, we can predict the car's price. This step is usually overlooked by car manufacturers when they are determining a car's price. Instead, they build the price upon the car's production cost, competitiveness, demand, and profit [8].

To predict a car's price, we can use the feed-forward backpropagation method. The method is an efficient and flexible decision-making method that can be applied in any condition and to multiple datasets. The feed-forward backpropagation method was also applied by Worasucheep [9], who made a prototype of an application that could forecast share prices on the NYSE and SET markets. The feed-forward backpropagation algorithm was selected because it is suitable for predicting product prices and the stock market. In this research, we thus analyze a car's predicted price based on the specifications and features of BMW using a regression analysis, which allows us to investigate the relationship between features and price using the feed-forward backpropagation method. The method can also be used to predict the price of a car with the latest specifications that is about to be released.

The paper comprises several parts. Part I contains an introduction elaborating on the issues of determining a car's selling price and the object being studied. Part II explains the literature framework taken from previous studies. Part III describes the research method used. Part IV presents the research findings and discussions. Finally, Part V contains the conclusions of the experiment conducted and future work.

II. RELATED WORKS

The prediction of car prices is a field of wide research interest, as it requires knowledge from an expert system. Gegic et al. [10] use machine learning techniques (ANN, SVM, and Random Forest) to predict second-hand car prices in Bosnia-Herzegovina. However, the techniques can only be implemented as an ensemble, so they have to use data collected through a web scraper written in PHP to make predictions. The accuracy identified in an evaluation using the testing data is 87.83%.

In predicting the price of a car, the machine learning technique significantly affects the acquisition process for the expert system. Van Thai et al. [11] build a knowledge-based system using qualitative data and use quantitative data analysis to predict second-hand car prices. They conduct preprocessing by narrowing the car features and hence acquire qualitative data (the brand, name, actuator, seller, fuel, color, and origin) and numeric variables (kilometer and age). However, a knowledge-based system is required to convert the non-numeric data into quantitative data. Similarly, using multiple linear regression, Noor and Jan [8] keep the significant variables and delete the others to predict car prices. The variables they select are used to build a multiple linear regression model.

In this study, we build a system that can be used to predict car prices based on car specifications, as well as the price of new cars with the latest specifications that are not yet available at dealers, using a dataset of BMW specifications and sales. The feed-forward backpropagation algorithms are selected using the Business Intelligence concept, whose main aspects are descriptive, predictive, and prescriptive. Worasucheep [9] applies the same algorithms to predict share prices on the NYSE and SET markets by building a prototype application. Besides their reliability in predicting product prices and share markets, the algorithms have other features, i.e.:

• The ability to model a non-linear process without prior knowledge of the characteristics of the process.
• Prediction results in experiments that are better than those acquired using other artificial neural network methods.
• The ability to predict future price movements, which helps in making decisions.
• The flexibility that makes the method applicable to various datasets and in any condition.

III. METHODOLOGY

A. Dataset

This research uses a dataset of BMW acquired from Kaggle. BMW is a top-selling luxurious car [12]. Since BMW is expected to remain popular in the coming years, we require the car's predicted price. The dataset consists of 15 features and 325 instances from the years 1995-2017. The 15 features are the car model (car series), production year (1995-2017), engine fuel (four types), engine HP (in the range of 170-600), engine cylinders (four types), transmission type (three types), drive wheels (three types), number of doors (two and four), market category (a value of 1), vehicle size (three types), vehicle style (seven types), highway MPG (range of 18-45), city MPG (range of 10-32), popularity (3916), and MSRP (minimum and maximum of 4697 and 141200 respectively). The dataset is divided into two parts, i.e. training data consisting of 264 instances and testing data consisting of 60 instances.

B. Business Intelligence

Business Intelligence (BI) is a set of tools that assist in the strategic design of a corporation by collecting, storing, and analyzing data to help make decisions [13]. BI is considered the most effective framework for decision making. The following are several BI concepts:

• BI is defined as a decision-making tool that works by using information and knowledge acquired from various sources and data structures [14].
• BI is a set of information systems and technology that is supposed to help managers make decisions regarding the operational activities of their companies based on internal and external information [7].
• BI is a framework consisting of a business decision-making concept, theory, and method which uses an evidence-based supporting system [15].

By the above definitions, BI is considered the most effective framework for decision-making. Fig. 1 describes the processes of how BI works [15].

BI needs software that can perform data analysis, for example database querying, reporting (SAP, Oracle, and others), multidimensional data analysis (OLAP), and data mining (predictive analysis, text mining, web mining) [15]. BI activities are:

• Descriptive: describing the characteristics of data and the relationship between entities. It can also be used to find problems (when the problems occur and what they are) [15] and to describe the average specification of cars and how it relates to price.
• Predictive: predicting the future trend, including what will happen and how it happens [16] [17]. At this stage, a prediction model, such as a prediction model of car price based on car specifications, is formed from the available data.
• Prescriptive: analyzing what actions are required, why the actions should be taken, and what the consequences are [18]. An example is predicting the selling price of cars that will be produced with new specifications.

C. Regression

Regression is a statistical method for identifying how one feature affects another. The affecting feature is the independent feature, while the other is the dependent feature. Analyzing the features allows management to identify how the relationship between features correlates with price. The correlation can serve different objectives. The most important objective is prediction, as prediction is one of the typical uses of regression [19].

D. Feed-forward Backpropagation

Feed-forward backpropagation is a part of Artificial Neural Networks (ANN) and a type of supervised learning. It imitates the human brain during a learning process. It learns from training data, and the result of the learning is used to build a dataset, as indicated in Fig. 2. Backpropagation reduces errors in the network output and works in multilayer networks.

A multilayer network consists of an input layer, a hidden layer, and an output layer. Each layer has interconnected neurons that function as activators. Meanwhile, the weight of each connection either increases or decreases, depending on activation. Among all neural networks, the feed-forward backpropagation algorithms have the highest convergence level [20]. In this study, the feed-forward backpropagation algorithms are used to predict changes in car prices. First, all training data are input to obtain the best weights. The testing process then uses the weights that have been determined to predict the testing data.

Fig. 1. Procedures of business intelligence

Fig. 2. Neural network scheme

IV. EXPERIMENT AND RESULTS

A. Feed-forward Backpropagation Algorithms to Predict Car Prices

In the first experiment, we randomly choose 15 car specifications as testing data and test them against the 264 training data. The experiment aims to investigate whether the system can predict car prices based on the specifications entered. The specifications entered are those of cars that have been produced and sold. In Business Intelligence, this stage belongs to the predictive analysis stage.

Fig. 3 is the neural network scheme used in this research. The network is supervised and consists of 14 inputs, one for each feature. One hidden layer with ten neurons uses sigmoid activation. It is followed by one output layer representing the price. The experiment of this study uses Matlab tools. On the input layer there is no computation; the input signal is only sent to the hidden layer. On the hidden layer and the output layer, weights and biases are computed, and the output of the hidden and output layers is computed based on the activation function.

Fig. 3 is the visualized version of our research. The weights for the training data are set. Training validation is carried out using 265 training data to ensure that the training functions properly. The training process used to predict car prices is presented in Fig. 4.

The neural network uses a supervised approach, which shows the learning behavior for the training set (the blue line), the validation set (the green line), the testing set (the red line), and a fourth plot that integrates all sets (the black line), as presented in Fig. 4. Based on the observation in the training set, the majority of performances are on the optimal line with an error of 0.98567. The validation phase is shown at the top right. Besides, based on the testing phase, the error is up to 0.99542. Finally, the bottom right indicates that the error in the three phases is 0.9882. In the artificial neural network, the predicted score Y (the score generated by the ANN) is expected to be equivalent to the target score T (the expected score). As the predicted score Y is initially random, the artificial neural network self-rectifies through the learning process to achieve a predicted score Y that is close to the target score T, as presented in Fig. 4, which indicates that Y = T. Therefore, it can be seen that the training data and target data entered result in a good score and accuracy. The training data are stored and used to predict car prices.
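The network structure described above (14 inputs, one hidden layer of ten sigmoid neurons, and one output for the price) can be illustrated with a short sketch. This is a minimal NumPy illustration and not the authors' Matlab implementation: the synthetic data, the linear output neuron, the mean-squared-error loss, the learning rate, the number of epochs, and all function names are assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation used by the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in=14, n_hidden=10, seed=0):
    """Random initial weights and zero biases for a 14-10-1 network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Feed-forward pass: inputs -> sigmoid hidden layer -> linear output."""
    H = sigmoid(X @ params["W1"] + params["b1"])
    Y = H @ params["W2"] + params["b2"]  # linear output is an assumption
    return H, Y


def train(params, X, T, lr=0.05, epochs=500):
    """Full-batch gradient descent; returns the MSE recorded at each epoch."""
    n = X.shape[0]
    losses = []
    for _ in range(epochs):
        H, Y = forward(params, X)
        err = Y - T
        losses.append(float(np.mean(err ** 2)))
        # Backpropagation: push the output error back through both layers.
        dY = 2.0 * err / n
        dW2 = H.T @ dY
        db2 = dY.sum(axis=0)
        dZ1 = (dY @ params["W2"].T) * H * (1.0 - H)  # sigmoid derivative
        dW1 = X.T @ dZ1
        db1 = dZ1.sum(axis=0)
        params["W1"] -= lr * dW1
        params["b1"] -= lr * db1
        params["W2"] -= lr * dW2
        params["b2"] -= lr * db2
    return losses


# Synthetic stand-in for 264 training rows with 14 scaled features each.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (264, 14))
T = (X @ rng.uniform(0.0, 1.0, (14, 1))) / 14.0  # synthetic scaled "price"

params = init_params()
losses = train(params, X, T)
H, Y = forward(params, X)
print(f"MSE before training: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

In practice the features and prices would be scaled before training, and the trained weights would be reused unchanged to predict the testing data, as the text describes.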

Fig. 3. Neural network layer

Fig. 4. Training data regression using six epochs

A system experiment is then carried out by testing the trained network with 15 randomly selected testing data. In this experiment, we enter only the car specifications and not the car prices. The results of the experiment are listed in Table I.

TABLE I. THE EXPERIMENT RESULT OF CAR PRICES

Car specifications (encoded feature values)    Actual Price    Predicted Price
1  2  1  1  2  27  18  3916                    37200           38910
1  2  2  1  1  28  20  3916                    39600           41408
1  2  2  1  1  28  19  3916                    31500           32418
1  2  2  1  2  28  19  3916                    44400           46671
2  2  3  1  1  31  21  3916                    46450           43086
1  2  3  1  2  32  21  3916                    49050           38556
2  2  3  1  2  32  21  3916                    51050           43543
1  2  3  1  1  32  21  3916                    44450           36269
2  4  2  2  4  34  22  3916                    43000           38618
2  4  3  2  4  30  20  3916                    49200           47013
2  4  3  2  4  30  20  3916                    49650           46931
2  4  2  2  4  33  23  3916                    43950           36399
2  4  1  2  5  33  22  3916                    41950           44128
1  4  3  2  6  33  25  3916                    50150           37855
1  4  2  2  6  35  23  3916                    37500           37261

Fig. 5. The comparison of the experiment result

The error value of each prediction is the price gap expressed as a percentage of the predicted price:

\[
\text{Error value (\%)} = \frac{\lvert \text{Actual price} - \text{Predicted price} \rvert}{\text{Predicted price}} \times 100 \qquad (1)
\]

We find the error values shown in Table II. In these experiments, testing 15 randomly selected data gives an average error value of 11.46%. The error value can be reduced by adding more data to the training data.

B. Implementing Business Intelligence to Predict Car Prices

In its implementation in Business Intelligence, the system can be used to predict the prices of cars that will be produced and whose specifications have never been made or sold before. In Business Intelligence, this use of the system belongs to the prescriptive analysis stage.

After the experiments using testing data with car specifications taken from the BMW dataset, we carry out further experiments to predict car prices for specifications that have not been made by BMW. The specifications we test are indicated in Table III.

All specifications indicated in Table III are entered as testing data and tested using the training data that have been prepared. To make predictions, we use a simulation function in Matlab, as in:

[equation (2) is illegible in the source] (2)

After the experiments, we find that the predicted price of a car with new specifications is $55,754. The experiments imply that the system can be used to predict the price of cars with new specifications that are about to launch. However, the predicted price does not account for currency rate developments or global economic factors.

TABLE II. ERROR VALUES FOUND IN THE EXPERIMENTS

Car    Actual Price ($)    Predicted Price ($)    Price Gap ($)    Error Value (%)
1.     37200               38910                  1710             4.39
2.     39600               41408                  1808             4.37
3.     31500               32418                  918              2.83
4.     44400               46671                  2271             4.87
5.     46450               43086                  3364             7.81
6.     49050               38556                  10494            27.22

Car    Actual Price ($)    Predicted Price ($)    Price Gap ($)    Error Value (%)
7.     51050               43543                  7507             17.24
8.     44450               36269                  8181             22.56
9.     43000               38618                  4382             11.35
10.    49200               47013                  2187             4.65
11.    49650               46931                  2719             5.79
12.    43950               36399                  7551             20.75
13.    41950               44128                  2178             4.94
14.    50150               37855                  12295            32.48
15.    37500               37261                  239              0.64
Average error value                                                11.46

TABLE III. NEW SPECIFICATIONS OF BMW USED IN DATA TESTING

Specification        Value    Information
Model                1        1 Series
Year                 2019
Engine Fuel Type     3        Diesel
Engine HP            300      Horsepower
Engine Cylinders     4
Transmission Type    1        Manual
Driven_Wheels        2        All-wheel drive
Number of Doors      2
Market Category      1
Vehicle Size         1        Compact
Vehicle Style        1        Coupe
Highway MPG          30
City MPG             35
Popularity           3916

V. CONCLUSION

This research implements Business Intelligence and uses the data of BMW's specifications and prices as a dataset. The implementation is conducted using the feed-forward backpropagation algorithms in the predictive and prescriptive phases.

In the predictive phase, we predict the BMW price based on its specifications by evaluating the price of cars with specifications that have been made and sold. The step aims to determine whether the predicted price matches the actual price. From the experiment, we observe that the actual price is close to the predicted price, with a mean error of 11.46% when testing 15 randomly selected data. The error value can be reduced by adding more data to the training data.

In the prescriptive phase, we carry out an experiment in which we predict the price of a car with new specifications that have never been made or sold. It results in a predicted price of $55,754. This experiment aims to help identify the price of a car with the latest specifications, which is the focus of our implementation of Business Intelligence. However, it should be highlighted that the predicted price is based only on the specifications and the popularity in the previous sales data, whereas an actual car price is also affected by currency, the global economy, and other factors.

VI. FUTURE WORKS

For future work, we are interested in using k-fold cross-validation to select the training and testing data and in using datasets of other cars. We are also interested in finding other methods to predict a car's price.

ACKNOWLEDGMENT

The authors would like to thank the Research Institution of Sekolah Tinggi Manajemen Informatika dan Komputer Ichsan Gorontalo.

REFERENCES
[1] V. Yesmaya, P. Studi, and T. Informasi, “Infrastruktur Business Intelligence Mendukung Data Mining Dalam Proses E-Business,” Infrastruktur Bus. Intell. Mendukung Data Min. Dalam Proses E-bus., vol. 02, pp. 189–199, 2013.
[2] BMW Group, “The BMW Group company profile”, https://www.bmwgroup.com/en/unternehmen.html, [accessed: 28-Sept-20]
[3] BMW Group, “Production today and tomorrow”, https://www.bmwgroup.com/en/unternehmen.html, [accessed: 28-Sept-20]
[4] A. Mega Nanda, “Penjualan BMW di Indonesia Masih Aman dari Imbas Corona”, https://otomotif.kompas.com/read/2020/04/05/162100115/penjualan-bmw-di-Indonesia-masih-aman-dari-imbas-corona, [accessed: 26-Sept-20]
[5] BMW Group, “Investor Presentation September 2020”, https://www.bmwgroup.com/content/dam/grpw/websites/bmwgroup_com/ir/downloads/en/2020/Investor_Presentation/BMW_Investor_Presetation_2020.pdf, [accessed: 28-Sept-20]
[6] BMW Group, “Quarterly Report”, https://www.bmwgroup.com/content/dam/grpw/websites/bmwgroup_com/ir/downloads/en/2020/q2/Q2_2020_BMW_Group_EN_Online.pdf, [accessed: 28-Sept-20]
[7] R. Alsufyani and V. Chang, "Risk Analysis of Business Intelligence in Cloud Computing," 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), Vancouver, BC, 2015, pp. 558-563, doi: 10.1109/CloudCom.2015.84.
[8] K. Noor and S. Jan, “Vehicle price prediction system using machine learning techniques,” International Journal of Computer Applications, vol. 167, pp. 27–31, 06 2017
[9] C. Worasucheep, “A stock price forecasting application using neural networks with multi-optimizer,” 2016 IEEE 9th Int. Work. Comput. Intell. Appl. IWCIA 2016 - Proc., no. 1, pp. 63–68, 2017.
[10] E. Gegic, B. Isakovic, D. Keco, Z. Masetic, and J. Kevric, “Car price prediction using machine learning techniques,” TEM J., vol. 8, no. 1, pp. 113–118, 2019.
[11] D. Van Thai, L. Ngoc Son, P. V. Tien, N. Nhat Anh, and N. T. Ngoc Anh, “Prediction car prices using quantify qualitative data and knowledge-based system,” Proc. 2019 11th Int. Conf. Knowl. Syst. Eng. KSE 2019, pp. 1–5, 2019.
[12] Rully, Kurniawan, “Segmen Mobil Mewah, Penjualan BMW kalahkan Mercedez Benz”, https://otomot.kompas.com/read/2020/07/21/174200115/segmen-mobil-mewah-penjualan-bmw-kalahkan-mercedes-benz?page=all
[13] E. Sahafizadeh and E. Ahmadi, “Prediction of Air Pollution of Boushehr City Using Data Mining,” in 2009 Second International

Conference on Environmental and Computer Science, 2009, pp. 33–36.
[14] G. Ibarra-Berastegi, J. Saenz, A. Ezcurra, A. Elias, and A. Barona, “Using neural networks for short-term prediction of air pollution levels,” in 2009 International Conference on Advances in Computational Tools for Engineering Applications, 2009, pp. 498–502.
[15] C. Shen, Z. Riaz, M. S. Palle, Q. Jin, and F. Peña-Mora, “Big Data Analytics as a Service for Business Intelligence,” Lect. Notes Comput. Sci. Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinforma., vol. 9373, no. September, pp. 247–260, 2015.
[16] D. Delen and H. Demirkan, Data, information, and analytics as services, vol. 55. 2013.
[17] L. Volonino and E. Turban, Information Technology for Management: Improving Strategic and Operational Performance, 8th Edition. Danvers, MA: Wiley & Sons.
[18] M. Minelli, M. Chambers, and A. Dhiraj, “Big data, big analytics : emerging business intelligence and analytic trends for today’s businesses.” 2013.
[19] N. Burak, Z. Hn, Ö. Uzun, G. Yeg, N. Kalki, and C. Erthemo, “Similarity of Distributions Which Belongs to Data Estimated by Regression Analysis and Real Data,” Med. Technol. Natl. Congr., pp. 2–5, 2017.
[20] N. Burak, Z. Hn, Ö. Uzun, G. Yeg, N. Kalki, and C. Erthemo, “Similarity of Distributions Which Belongs to Data Estimated by Regression Analysis and Real Data,” Med. Technol. Natl. Congr., pp. 2–5, 2017.
Handbook. Mill Valley, CA: University Science, 1989.

