Uber Ridesharing
Clustering
Determining optimal position of cabs
Now upload the file using the first button. You can also mount
Google Drive using the last button.
#3.1
Data Collection
Every time someone books a ride through Uber's application, Uber
saves their location with a timestamp. Uber has been collecting this
data since it began operations, and it uses this data to
determine the optimal position of cabs.
#3.2
#3.3
K Means
K Means is an unsupervised clustering algorithm used to find
clusters in unlabelled data. It divides the data into K clusters,
where K is chosen according to the problem being solved and the
efficiency required. The algorithm takes data as input and returns
the centroids of the K clusters.
Step 2: Clustering
In this step, the algorithm enters its loop and computes the distance
between each data point and each centroid. Each data point is
assigned to the cluster whose centroid is closest. After this step, we
have K clusters of data points.
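As a minimal sketch of this assignment step, the snippet below computes the distance from every point to every centroid and picks the closest one. It assumes Euclidean distance (the slides only say "distance"), and the function name, coordinates, and starting centroids are made up for illustration.

```python
import numpy as np

def assign_to_nearest_centroid(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    # distances[i, j] = Euclidean distance from point i to centroid j
    # (Euclidean distance is an assumption of this sketch)
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Hypothetical (latitude, longitude) pickup points and K = 2 centroids
points = np.array([
    [40.70, -74.00],
    [40.71, -74.01],
    [40.80, -73.95],
    [40.81, -73.96],
])
centroids = np.array([
    [40.70, -74.00],
    [40.80, -73.95],
])

labels = assign_to_nearest_centroid(points, centroids)
print(labels)  # cluster index assigned to each point
```

Points that share a label form one cluster, so after this call the data is split into K groups.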
#3.4
How to use K Means in Python:
In Python, the K Means algorithm can be imported from the
sklearn library.
from sklearn.cluster import KMeans
Parameters:
n_clusters : The number of clusters to form
Train model
model.fit(X)
Parameters:
X : DataFrame or NumPy array
Result
model.cluster_centers_
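Putting the pieces above together, an end-to-end run might look like the sketch below. The coordinates are invented toy data, not real Uber records, and `n_clusters=2`, `n_init=10`, and `random_state=0` are illustrative choices rather than values from the slides.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up pickup coordinates (latitude, longitude), for illustration only
X = np.array([
    [40.70, -74.00], [40.71, -74.01], [40.72, -74.00],
    [40.80, -73.95], [40.81, -73.96], [40.82, -73.95],
])

# n_clusters is picked by hand for this toy data;
# random_state only makes the run repeatable
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# Train the model
model.fit(X)

# Result: one centroid per cluster, shape (n_clusters, n_features)
centroids = model.cluster_centers_
print(centroids)

# Cluster index assigned to each row of X
print(model.labels_)
```

In the cab-positioning setting, each row of `cluster_centers_` would be read as a candidate location for positioning cabs.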