

Business Computation and Analysis

Super Market Sales

Novita Hastuti Zen


School of Business and Management

Institut Teknologi Bandung

1.1. Data description
The Super Market Sales dataset consists of 1000 examples, 0 special attributes, and 17 regular attributes.

No.  Attribute      Attribute Type  Data Type    Level of Measurement
1.   InvoiceId      Polynomial      Categorical  Nominal
2.   Branch         Polynomial      Categorical  Ordinal
3.   City           Polynomial      Categorical  Ordinal
4.   Quantity       Integer         Numerical    Discrete
5.   Gender         Polynomial      Categorical  Nominal
6.   Unit Price     Real            Numerical    Ratio
7.   Customer Type  Polynomial      Categorical  Nominal
8.   Product Line   Polynomial      Categorical  Nominal
9.   Tax            Real            Numerical    Ratio
10.  Total          Real            Numerical    Ratio
11.  Date           Date            Categorical  Nominal
12.  Time           Polynomial      Numerical    Interval
13.  Payment        Polynomial      Categorical  Ordinal
14.  COGS           Real            Numerical    Interval
15.  Gross Margin   Real            Numerical    Ratio
16.  Gross Income   Real            Numerical    Ratio
17.  Rating         Real            Numerical    Interval

This dataset can be used as a reference for making business decisions. The data
has no missing values, so the decisions can be based on clean data. From the data,
I want to analyze which branches produce the most sales and which goods sell the
most.
1.2. Data Pre-processing
The data is clean and has no missing values.

From the statistics, I know that there are no missing values, so I do not need to
clean the data. All of the data has an impact on other data, which can be seen in
the market data analysis, where I look at the relationships between the attributes
and whether they affect each other.
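
The missing-value check described above was done with the statistics view in RapidMiner. As a rough illustration only, the same check can be sketched in Python. The three sample rows and their values are made up for the example (they are not taken from the real 1000-row dataset), and `count_missing` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical toy sample: the values are invented for illustration only.
SAMPLE_CSV = """Branch,City,Quantity,Total
A,Yangon,7,100.0
C,Naypyitaw,5,50.0
A,Yangon,,75.0
"""

def count_missing(csv_text):
    """Return {column name: number of empty cells} for CSV data given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row[name]
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts

missing = count_missing(SAMPLE_CSV)
print(missing)
```

If every count is 0, as reported for this dataset, no cleaning of missing values is needed.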
However, I also want to see whether the data has outliers. To find outliers, I use
the Generate Attributes operator in RapidMiner.
Before applying Generate Attributes, I normalize the data so that the values are
expressed in terms of the standard deviation.
In the function expression, I write the function shown in the picture above so that
the outliers can be identified. Of all the data, 10 examples are true outliers. I
treat the outliers by removing them, because they can cause problems with the
data. I eliminate the outliers using Filter Examples in RapidMiner.
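
The normalize, flag, and filter steps above can be sketched in Python as follows. This is a minimal sketch, not the RapidMiner process itself: the exact function expression is in the missing picture, so the z-score rule and the `threshold` parameter are assumptions, and the sample values are invented.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Normalize values to z-scores (distance from the mean in standard deviations)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def split_outliers(values, threshold=3.0):
    """Split values into (kept, outliers); the threshold is an assumed cut-off."""
    scores = z_scores(values)
    kept = [v for v, z in zip(values, scores) if abs(z) <= threshold]
    outliers = [v for v, z in zip(values, scores) if abs(z) > threshold]
    return kept, outliers

# Hypothetical toy values, not taken from the supermarket data.
values = [10, 12, 11, 13, 12, 11, 10, 12, 11, 100]
kept, outliers = split_outliers(values, threshold=2.0)
print(kept, outliers)
```

A threshold of 2.0 is used here only because the toy sample is very small; on 1000 examples a cut-off of 3 standard deviations is a common choice.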

After cleaning, the results show that no outliers appear in the data.
- I want to know which branches produce the most sales
- I want to know which goods sell the most
So, from these problems, I can identify a strategy for the company to have many


1. To find out which items are often purchased together, I use "Market Basket
Analysis" in RapidMiner.

- For the first step, I import the data that I will process
- After that, I set the “aggregation attribute” and “group by attributes”
parameters of “Aggregate” as in the following picture.

- Then set the parameter of “Rename”

- Also set the parameter of “Set Role”

- For “FP-Growth” and “Create Association Rules”, I follow the template.
- Then click “Run”
From the data, I use Market Basket Analysis to see the relationship between City
and Product Line.
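
To make the idea of a rule between City and Product Line concrete, here is a minimal Python sketch that computes support and confidence for rules of the form City -> Product Line by direct counting. It is not FP-Growth and not the RapidMiner template; the function name, the `min_support` parameter, the "Other" product line, and the sample rows are all hypothetical.

```python
from collections import Counter

def city_to_product_rules(rows, min_support=0.0):
    """Return (city, product_line, support, confidence) for each observed pair.

    support    = share of all rows containing the (city, product_line) pair
    confidence = share of the city's rows that have this product line
    """
    n = len(rows)
    city_counts = Counter(city for city, _ in rows)
    pair_counts = Counter(rows)
    rules = []
    for (city, line), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            confidence = count / city_counts[city]
            rules.append((city, line, support, confidence))
    # Strongest rules first.
    rules.sort(key=lambda rule: rule[3], reverse=True)
    return rules

# Hypothetical toy rows of (City, Product Line), invented for illustration.
rows = [
    ("Yangon", "Fashion"),
    ("Yangon", "Fashion"),
    ("Yangon", "Other"),
    ("Naypyitaw", "Other"),
]
rules = city_to_product_rules(rows)
for city, line, support, confidence in rules:
    print(f"{city} -> {line}: support={support:.2f}, confidence={confidence:.2f}")
```

A high confidence for a rule means that sales in that city are concentrated in that product line, which is the kind of relationship the analysis looks for.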


For the predictive analysis, I choose Classic Decomposition and two variables, COGS
and Gross Income, to analyze the forecast.

From this, I get the COGS trend.

Next, to forecast, I adapt the module and use the Moving Average Filter.
However, I choose only COGS for the forecast.
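
The idea behind a moving-average forecast can be sketched in Python as below. This is an assumption-laden illustration: the window size, the trailing (not centered) alignment, and the sample values are chosen for the example, and the RapidMiner Moving Average Filter may use a different alignment or weighting.

```python
def moving_average(series, window):
    """Trailing simple moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def forecast_next(series, window):
    """Naive one-step forecast: the mean of the last `window` observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window

# Hypothetical daily COGS totals, invented for illustration.
cogs = [100, 120, 110, 130, 140]
smoothed = moving_average(cogs, window=3)
next_value = forecast_next(cogs, window=3)
print(smoothed, next_value)
```

A larger window gives a smoother trend but reacts more slowly to recent changes in COGS.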

I use manual analysis and calculation because of many problems with Excel and RapidMiner.
1. To find which products each branch sells, I calculate in Excel.

The most-sold product line is Fashion. Counting manually, it has 178 sales out of
1000 sales across 6 product categories. Of the 3 cities, the most products are
sold in Yangon, unlike Naypyitaw. The company can increase its product stock in
Yangon. It can do so in Naypyitaw too, but there it may be better to increase
discounts.
The most popular product line per 1000 sales is Fashion. The demand for fashion may
be very high because it comes largely from female customers.
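
The manual count can also be reproduced with a few lines of Python. The sketch below uses invented sample rows and a hypothetical helper `top_category`. Applied to the real data it should give the share reported above, 178 of 1000 sales, or 17.8%, for Fashion.

```python
from collections import Counter

def top_category(rows, key):
    """Return (name, count, share) of the most frequent value of `key`."""
    counts = Counter(row[key] for row in rows)
    name, count = counts.most_common(1)[0]
    return name, count, count / len(rows)

# Hypothetical toy rows, invented for illustration; "Other" is a placeholder.
rows = [
    {"product_line": "Fashion", "city": "Yangon"},
    {"product_line": "Fashion", "city": "Naypyitaw"},
    {"product_line": "Other", "city": "Yangon"},
    {"product_line": "Fashion", "city": "Yangon"},
]
top_line = top_category(rows, "product_line")
top_city = top_category(rows, "city")
print(top_line, top_city)
```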
