Abstract - In today's transportation systems, it is essential to effectively control trip time and battery usage, especially for electric modes such as e-bikes. The foundation of this research is the enhancement of the dataset with trip details, ambient conditions, and battery-specific information via strategic feature engineering. Using data mining techniques on a large dataset, the goal is to properly estimate trip time and battery usage in a rental e-bike sharing system. Journey time and energy consumption are estimated using machine learning models, such as Random Forest, Gradient Boosting Machines, and Linear Regression; model efficacy is measured by performance metrics. Through Predictive Mobility Insights, Power BI's visualizations further improve data interpretation for stakeholders, bringing in a new age of intelligent, responsive, and environmentally friendly urban mobility.

Keywords - Power-BI, E-bikes, Data mining, Data visualization

I. INTRODUCTION

In the dynamic landscape of contemporary urban transportation, there is a growing emphasis on sustainable and eco-friendly mobility solutions. Electric bikes, or e-bikes, have emerged as a promising remedy for challenges such as urban congestion, air pollution, and limited parking space. E-bikes provide a convenient and environmentally friendly transportation option, appealing to both commuters and leisure riders. However, to achieve precise predictions, it's crucial to consider various factors influencing trip duration and battery usage. Variables like traffic, terrain, weather conditions, and rider behavior play significant roles.

By enriching the dataset with real-time ambient conditions and battery specifics, we can improve the accuracy of predictions. This allows for better fleet management and ensures a more satisfying user experience.

Efficient management of e-bike fleets and optimizing the user experience hinges on two critical factors: trip duration prediction and battery consumption management. The ability to accurately predict how long a trip will take and how much battery power will be consumed during that journey can significantly enhance the usability and sustainability of e-bike sharing systems. This project, titled "Enhancing E-Bike Trips Through Precise Duration and Battery Consumption Prediction Through Machine Learning & Power BI," aims to tackle these essential challenges.

II. LITERATURE REVIEW

The research emphasizes the importance of trip duration and battery management in modern transportation, especially within the context of electric modes of transport. By integrating weather parameters and battery-specific data into the dataset, the study aims to enhance predictive accuracy. The machine learning models mentioned are employed to estimate both journey time and energy usage, indicating a quantitative approach to solving the problem. The abstract mentions the use of performance indicators to evaluate the models, but specific metrics are not detailed. Furthermore, the abstract highlights the use of Power BI for data visualization, emphasizing its role in conveying complex data to stakeholders effectively.

Portigliotti, F., Rizzo, A.: ‘A network model for an urban bikesharing system’, IFAC-PapersOnLine, 2017, 50, (1), pp. 15633–15638 [1].

E-bikes, like other forms of active transportation, can improve individual and community health. This theme of the literature review considers the effects of e-bikes on both physical and mental health and well-being.

Bourne, J., Sauchelli, S., Perry, R., Page, A., Leary, S., England, C., & Cooper, A. (2018). Health benefits of electrically-assisted cycling: a systematic review.
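[Figure: workflow diagram with the labels Customer data, Split, Train / Test, Algorithm (LR, DT, RF), Selecting the best result, Input, Trained model, Predicted output]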
C. Split Data:
Splitting a dataset involves dividing it into two parts: training data and testing data. In this study, the split approach is employed for training and assessment.
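The hold-out split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not give its split ratio, so `test_fraction=0.25`, the `seed`, the helper name `split_train_test`, and the toy `trips` records are all hypothetical.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and split them into (train, test) lists.

    test_fraction and seed are illustrative assumptions; the paper does
    not state the ratio it used.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows become the test set, the rest the train set.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical trip records: (temperature_c, hour_of_day, rentals)
trips = [(t, h, 10 * h + t) for t in range(5) for h in range(8)]
train_rows, test_rows = split_train_test(trips)
print(len(train_rows), len(test_rows))  # 30 10
```

Shuffling before the cut keeps both parts representative when the records are stored in time order. Fixing the seed makes the split reproducible across runs.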
D. Train Data
Training data is the information, here user data, that is used to train an ML model. Analysing or processing a training dataset requires some human input. How much human participation is needed can vary depending on the machine learning algorithms used and the kind of problem they are meant to solve.
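As a rough illustration of the training step, the sketch below fits the three regressors evaluated in this paper (linear regression, decision tree, random forest) on a training split and picks the one with the lowest test RMSE. The tree and forest hyperparameters are the ones reported in this paper; the synthetic data, the 75/25 split, and names such as `rmse_by_model` are assumptions for illustration only, not the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the rental data (assumption: the real dataset is not shown).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(600, 3))
y = 500 * X[:, 0] + 200 * np.sin(6 * X[:, 1]) + rng.normal(0, 20, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters for the tree and the forest follow the values quoted in the paper.
models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(
        max_depth=10, min_samples_leaf=40, min_samples_split=50, random_state=1
    ),
    "random_forest": RandomForestRegressor(
        max_depth=10, min_samples_leaf=40, min_samples_split=50, random_state=2
    ),
}

rmse_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # learn from the training split only
    residuals = y_test - model.predict(X_test)
    rmse_by_model[name] = float(np.sqrt(np.mean(residuals ** 2)))

# Select the candidate with the lowest held-out RMSE.
best_model = min(rmse_by_model, key=rmse_by_model.get)
print(rmse_by_model, best_model)
```

Which model wins on this synthetic data says nothing about the real dataset; the point is only the fit, score, and select loop.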
E. Test Data
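The held-out test data is where the error metrics reported in the results (MSE, RMSE, MAE, R2, and adjusted R2) would be computed. The sketch below shows the standard definitions in plain Python; the function name `regression_metrics`, the toy values, and `n_features=2` are hypothetical, and the paper does not state how many features entered its adjusted R2.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, MAE, R2 and adjusted R2 for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalises R2 for the number of predictors used.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": r2,
        "adjusted_r2": adjusted_r2,
    }


# Hypothetical test-set targets and predictions.
y_true = [100.0, 200.0, 300.0, 400.0, 500.0]
y_pred = [110.0, 190.0, 310.0, 390.0, 510.0]
metrics = regression_metrics(y_true, y_pred, n_features=2)
print(metrics)

# Consistency check on the reported results: RMSE should equal sqrt(MSE).
reported_mse = {
    "linear_regression": 137241.308,
    "decision_tree": 91524.533,
    "random_forest": 84111.621,
}
reported_rmse = {name: round(math.sqrt(mse), 2) for name, mse in reported_mse.items()}
print(reported_rmse)
```

The last check reproduces the RMSE column of the results table from its MSE column, which is a quick way to catch transcription slips in reported figures.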
A. VISUALIZING DISTRIBUTIONS
Data Collecting and Traffic Sensing Unit:
This unit is responsible for storing and managing the
collected data. It might include memory storage or USB
services to hold the data, which can be valuable for various
purposes, including analysis and archiving.
Fig 4. Graphical representation of visualizing distributions

Fig 5. Graphical representation of checking outliers

We see outliers in some columns, such as Sunlight, Wind, Rain, and Snow, but we do not treat them because they may not be true outliers: snowfall, rainfall, and similar conditions are themselves rare events in some countries.

We treated the outliers in the target variable by capping them at the IQR limits.

C. MANIPULATING THE DATASET:

Fig 6. Graphical representation of checking linearity

A. LINEAR REGRESSION

Fig 7. Graphical representation of Linear Regression algorithm

We plotted the absolute values of the beta coefficients, which can be read in parallel with the feature importances of tree-based algorithms.

Since the performance of the simple linear model is not very good, we experimented with some more complex models.

B. DECISION TREE
DecisionTreeRegressor(max_depth=10, min_samples_leaf=40, min_samples_split=50, random_state=1)

The decision tree performs considerably better than linear regression, with a test R2 score of more than 70%.

C. RANDOM FOREST REGRESSOR

Fig 9. Graphical representation of Random Forest regressor

RandomForestRegressor(max_depth=10, min_samples_leaf=40, min_samples_split=50, random_state=2)

Random forest also performs well on both train and test data, with an R2 score of 77% on the train data and around 75% on the test data.

XI. RESULTS

ALGORITHM      LINEAR REGRESSION   DECISION TREE   RANDOM FOREST
MSE            137241.308          91524.533       84111.621
RMSE           370.460             302.530         290.020
MAE            254.740             188.507         178.308
TRAIN R2       0.58346             0.75989         0.77380
TEST R2        0.59240             0.72818         0.75019
ADJUSTED R2    0.58959             0.72551         0.74774

Fig 10. Results of exploratory data analysis

XII. CONCLUSION

Functioning day is the most influential feature and temperature is in second place for the linear regressor.

Temperature is the most important feature for the decision tree.

Functioning day is the most important feature and Winter is the second most important for the linear regressor.

RMSE Comparisons:
1) Linear Regressor RMSE: 370.46
2) Decision Tree Regressor RMSE: 302.53
3) Random Forest Method RMSE: 290.02

The feature temperature is near the top of the list for all the regressors.

Random Forest has the lowest RMSE of the three, so it can be considered the best model for the given problem.

REFERENCES

[1] Portigliotti, F., Rizzo, A.: ‘A network model for an urban bike sharing system’, IFAC-PapersOnLine, 2017, 50, (1), pp. 15633–15638.
[2] Bourne, J., Sauchelli, S., Perry, R., Page, A., Leary, S., England, C., & Cooper, A. (2018). Health benefits of electrically-assisted cycling: a systematic review. International Journal of Behavioral Nutrition and Physical Activity.
[3] Calafiore, G.C., Portigliotti, F., Rizzo, A.: ‘A network model for an urban bike-sharing system’, IFAC-PapersOnLine, 2017, 50, (1), pp. 15633–15638system for measuring traffic parameters,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Puerto Rico, June 1997, pp. 496–501.
[4] Shaheen, S., Guzman, S., Zhang, H.: ‘Bikesharing in Europe, the Americas, and Asia: past, present, and future’, Transp. Res. Rec., J. Transp. Res. Board, 2010, 2143, pp. 159–167.
[5] Giraud-Carrier, C., Vilalta, R., Brazdil, P.: ‘Introduction to the special issue on metalearning’, Mach. Learn., 2004, 54, (3), pp. 187–193.
[6] Turner, S., Eisele, W., Benz, R., et al.: ‘Travel Time Data Collection Handbook’, Federal Highway Administration, Report FHWA-PL-98-035, 1998.
[7] Li, Y., DimitriosGunopulos, C.L., Guibas, L.: ‘Urban travel time prediction using a small number of GPS floating cars’. Proc. of the 25th ACM SIGSPATIAL Int. Conf. on Advances in Geographic Information Systems, USA, 2017.
[8] Mridha, S., NiloyGanguly, S.B.: ‘Link travel time prediction from large scale endpoint data’. Proc. of the 25th ACM SIGSPATIAL Int. Conf. on Advances in Geographic Information Systems, USA, 2017.
[9] Miura, H.: ‘A study of travel time prediction using universal kriging’, Top, 2010, 18, (1), pp. 257–270.
[10] Kwon, J., Coifman, B., Bickel, P.: ‘Day-to-day travel-time trends and travel time prediction from loop-detector data’, Transp. Res. Rec.: J. Transp. Res. Board, 2000.
[11] Zhang, X., Rice, J.A.: ‘Short-term travel time prediction’, Transp. Res. C: Emerging Technologies, 2003, 11, (3), pp. 187–210.
[12] Brazdil, P., Soares, C., Costa, J.D.: ‘Ranking learning algorithms: using IBL and metalearning on accuracy and time results’, Mach. Learn., 2003, 50, pp. 251–277.
[13] Zarmehri, M.N., Soares, C.: ‘Using meta learning for prediction of taxi trip duration using different granularity levels’. Int. Symp. on Intelligent Data Analysis, Cham, 2015, pp. 205–216.