
3/14/24, 10:02 AM SAP HANA PAL – K-Means Algorithm or How to do Cust... - SAP Community

and will point the AFL Wrapper Generator to the different table types that we just created */

DROP TABLE PDATA_TELCO;

CREATE COLUMN TABLE PDATA_TELCO(
    "ID" INT,
    "TYPENAME" VARCHAR(100),
    "DIRECTION" VARCHAR(100)
);

/* Fill the table */

INSERT INTO PDATA_TELCO VALUES (1, '_SYS_AFL.PAL_KMEANS_DATA_TELCO', 'in');

INSERT INTO PDATA_TELCO VALUES (2, '_SYS_AFL.PAL_CONTROL_TELCO', 'in');

INSERT INTO PDATA_TELCO VALUES (3, '_SYS_AFL.PAL_KMEANS_RESASSIGN_TELCO', 'out');

INSERT INTO PDATA_TELCO VALUES (4, '_SYS_AFL.PAL_KMEANS_CENTERS_TELCO', 'out');

/* Creates the KMeans procedure that executes the KMeans Algorithm */

https://community.sap.com/t5/technology-blogs-by-members/sap-hana-pal-k-means-algorithm-or-how -to-do-customer-segmentation-for-the/ba-p/12976696/page/2 10/39



"NAME" VARCHAR (50),

"INTARGS" INTEGER,

"DOUBLEARGS" DOUBLE,

"STRINGARGS" VARCHAR (100)

);

/* Fill the parameters table */

INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('THREAD_NUMBER',2,null,null); -- Number of threads to be used during the execution

INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('GROUP_NUMBER',3,null,null); -- Number of clusters

INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('INIT_TYPE',4,null,null); -- How to select the initial center of each cluster

INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('DISTANCE_LEVEL',2,null,null); -- Which distance to use. In this case I'm using Euclidean

INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('MAX_ITERATION',100,null,null); -- Maximum number of iterations
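
To make the role of these parameters more concrete, here is a minimal sketch of plain K-Means in Python. It is not PAL's implementation: the function name `kmeans`, its arguments and the sample points are hypothetical and only mirror GROUP_NUMBER, MAX_ITERATION and the Euclidean distance setting. The initial centers are either passed in or picked at random, which is an assumption and does not reproduce any of PAL's INIT_TYPE options.

```python
import math
import random


def kmeans(points, group_number=3, max_iteration=100, initial_centers=None, seed=0):
    """Plain K-Means with Euclidean distance (illustrative sketch, not PAL)."""
    if initial_centers is not None:
        centers = list(initial_centers)
    else:
        # Assumption: random starting centers, unrelated to PAL's INIT_TYPE.
        centers = random.Random(seed).sample(points, group_number)

    assignment = [None] * len(points)
    for _ in range(max_iteration):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(group_number), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing changed, so we stop before max_iteration
        assignment = new_assignment

        # Update step: move each center to the mean of its members.
        for c in range(group_number):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centers


# Made-up 2D points forming three obvious groups.
sample_points = [
    (1, 1), (1, 2), (2, 1),
    (10, 10), (10, 11), (11, 10),
    (20, 1), (21, 1), (20, 2),
]
labels, centers = kmeans(
    sample_points,
    group_number=3,
    max_iteration=100,
    initial_centers=[(1, 1), (10, 10), (20, 1)],
)
print(labels)
```

With random starting centers the result can depend on the seed, which is one reason the way the initial centers are chosen is exposed as a parameter at all.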


As you can see, the distance drops sharply between 2 and 3 clusters; after 3 it keeps going down, but much more slowly. So the “elbow” is clearly at 3 clusters, which means that I should use 3 clusters to run the algorithm. So now I’m going to run the algorithm again using the right number of clusters. This is the result:
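
The elbow reading described here can also be checked numerically. The sketch below is only an illustration: `find_elbow` is a hypothetical helper, the distance totals are made-up numbers (not the values from the chart), and picking the point where the drop shrinks the most is one simple heuristic among several.

```python
def find_elbow(distance_by_k):
    """Return the k where the curve bends the most.

    The bend at k is (drop going into k) minus (drop going out of k).
    Needs at least three values of k; returns None otherwise.
    """
    ks = sorted(distance_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        drop_before = distance_by_k[prev_k] - distance_by_k[k]
        drop_after = distance_by_k[k] - distance_by_k[next_k]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Made-up totals: a big drop from 2 to 3 clusters, small drops afterwards.
illustrative_distances = {2: 900.0, 3: 300.0, 4: 250.0, 5: 220.0, 6: 200.0}
print(find_elbow(illustrative_distances))
```

In practice it is still worth looking at the chart, since a noisy curve can produce more than one plausible bend.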

The first column is the Customer ID and the second column is the cluster that has been assigned to that customer. So
based on how customers use their mobile phones, the K-Means algorithm clustered my customers in the following way:

Customer ID 1 thru 10 --> Cluster 2
Customer ID 10001 thru 10010 --> Cluster 1
Customer ID 20001 thru 20010 --> Cluster 0
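
A summary like the one above can be produced from the two-column result with a small grouping step. This is a sketch in Python rather than a query against the PAL output table: the helper name `summarize` is hypothetical, and the (customer ID, cluster) pairs are rebuilt from the ID ranges listed above instead of being read from HANA.

```python
from collections import defaultdict


def summarize(assignments):
    """Map each cluster to (lowest ID, highest ID, number of customers)."""
    ids_by_cluster = defaultdict(list)
    for customer_id, cluster in assignments:
        ids_by_cluster[cluster].append(customer_id)
    return {
        cluster: (min(ids), max(ids), len(ids))
        for cluster, ids in sorted(ids_by_cluster.items())
    }


# Pairs rebuilt from the ranges listed above.
assignments = (
    [(i, 2) for i in range(1, 11)]
    + [(i, 1) for i in range(10001, 10011)]
    + [(i, 0) for i in range(20001, 20011)]
)

for cluster, (low, high, count) in summarize(assignments).items():
    print(f"Cluster {cluster}: {count} customers, IDs {low} to {high}")
```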
