
Hubei University of Technology (HBUT)

湖北工业大学

Name: Rashid Md Mamunur (潘安)

Student Id: 1811562126

Class: 19lc Software Engineering

Assignment: 1
1. You are approached by the marketing director of a local company, who believes that he has devised
a foolproof way to measure customer satisfaction. He explains his scheme as follows: “It’s so simple
that I can’t believe that no one has thought of it before. I just keep track of the number of customer
complaints for each product. I read in a data mining book that counts are ratio attributes, and so, my
measure of product satisfaction must be a ratio attribute. But when I rated the products based on my
new customer satisfaction measure and showed them to my boss, he told me that I had overlooked
the obvious, and that my measure was worthless. I think that he was just mad because our best-
selling product had the worst satisfaction since it had the most complaints. Could you help me set
him straight?”
(a) Who is right, the marketing director or his boss? If you answered, his boss, what would you do to
fix the measure of satisfaction?
(b) What can you say about the attribute type of the original product satisfaction attribute?

Answer: (a).
The boss is right: the marketing director has overlooked the obvious. The raw number of complaints
is not a meaningful measure of satisfaction because it ignores how many units of each product were
sold. A best-selling product can attract the most complaints simply because it has the most
customers. To fix the measure, divide the number of complaints for each product by the number of
units sold, and compare these complaint rates across products.
A further consideration is the minimum number of units that must be sold before the rate is
reliable. For example, suppose a store sells two products, x and y: 100 units of x and 2 units of y. If
the store receives 30 complaints about product x and 1 complaint about product y, the complaint
rates are 30% and 50% respectively. At a quick glance, the boss would rush to fix the product with
the 50% complaint rate, even though only 2 units of that product were sold and the severity of the
complaint is unknown. Therefore, a product should be included in the analysis only once its sales
reach a minimum number of units.
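
The complaint-rate calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `complaint_rate` and the threshold `MIN_UNITS` are assumptions made for the sketch, and the cut-off value of 30 units is arbitrary. The sales figures are the ones from the example in the answer.

```python
def complaint_rate(complaints, units_sold):
    """Return the number of complaints per unit sold."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return complaints / units_sold


# Figures from the example above: product -> (units sold, complaints).
sales = {"x": (100, 30), "y": (2, 1)}

# Hypothetical cut-off: ignore products with too few sales to judge.
MIN_UNITS = 30

# Complaint rate for every product.
rates = {
    product: complaint_rate(complaints, units)
    for product, (units, complaints) in sales.items()
}

# Keep only the products whose sales reach the minimum threshold.
reliable = {
    product: rate
    for product, rate in rates.items()
    if sales[product][0] >= MIN_UNITS
}

for product, rate in rates.items():
    note = "" if product in reliable else " (too few units sold to be reliable)"
    print(f"product {product}: {rate:.0%}{note}")
```

With these numbers, product y has the higher rate but is filtered out by the threshold, which matches the argument made in the answer.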

(b).
The count of complaints is itself a ratio attribute, so that part of the director's analysis is correct.
However, the counts are not comparable across products as a measure of satisfaction, because each
count is not based on the same scale (the number of units sold), which results in a biased data set.
This is like having a sample of temperatures measured in Celsius, Kelvin, and Fahrenheit and
reporting only the numbers without first converting all measurements to one common scale.

2. Which of the following quantities is likely to show more temporal autocorrelation: daily rainfall or
daily temperature? Why?

Answer:
A feature shows temporal autocorrelation if values measured close together in time are more similar
than values measured farther apart. The same idea applies in space: locations that are closer to each
other tend to be more similar than locations that are farther away. Temperature tends to change
gradually, whereas rainfall can be very localized and intermittent: the amount of rainfall can change
abruptly from one day, or one location, to the next.
Therefore, daily temperature is likely to show more temporal autocorrelation than daily rainfall.
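
One common way to quantify this is the lag-1 autocorrelation, which compares each day's value with the next day's. The sketch below is a minimal illustration, not part of the original answer. The function name `lag1_autocorrelation` is an assumption, and the two series are made-up values chosen only to mimic a smooth temperature curve and intermittent rainfall. They are not real measurements.

```python
def lag1_autocorrelation(series):
    """Return the lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    # Sum of squared deviations from the mean (the denominator).
    variation = sum((value - mean) ** 2 for value in series)
    if variation == 0:
        raise ValueError("series is constant")
    # Sum of products of deviations for consecutive days (the numerator).
    covariation = sum(
        (series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)
    )
    return covariation / variation


# Made-up illustrative series (NOT real data).
temperature = [14, 15, 17, 18, 20, 21, 21, 19, 18, 16, 15, 13]  # changes gradually
rainfall = [0, 12, 0, 0, 25, 0, 3, 0, 0, 18, 0, 0]              # abrupt, intermittent

r_temperature = lag1_autocorrelation(temperature)
r_rainfall = lag1_autocorrelation(rainfall)
print(f"temperature: {r_temperature:.2f}, rainfall: {r_rainfall:.2f}")
```

For these illustrative series the temperature value is clearly positive while the rainfall value is negative, which is the pattern the answer describes.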

3. Based on the data in Table 1 in Chapter 4, draw separate decision trees to predict which category
the lion, owl and crocodile belong to?

Answer:

Body temperature, hibernation, and having legs are the attributes in the dataset that decide whether
an animal is a mammal or a non-mammal.
The other attributes do not separate the two classes, because both mammals and non-mammals
include creatures that are aquatic or aerial, have a wide range of skin colors, and may or may not
give birth.

4. We further explore the cosine and correlation measures.
(a) What is the range of values that are possible for the cosine measure?
(b) If two objects have a cosine measure of 1, are they identical? Explain.

Answer:

(a)
[-1, 1]. Often the data has only non-negative entries, and in that case the range is [0, 1].

(b)
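Not necessarily. All we know is that the values of their attributes differ by a positive constant
factor, that is, the two vectors point in the same direction. For example, (1, 2) and (2, 4) have a
cosine measure of 1 but are not identical.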
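
Both parts of this answer can be checked with a short Python sketch. The function name `cosine_similarity` and the example vectors are assumptions chosen for illustration. The function applies the standard definition: the dot product divided by the product of the vector lengths.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, different length: cosine is 1 but the objects differ.
same_direction = cosine_similarity([1, 2], [2, 4])

# Opposite direction: the lower end of the range.
opposite = cosine_similarity([1, 2], [-1, -2])

# Non-negative vectors with nothing in common: cosine is 0.
orthogonal = cosine_similarity([1, 0], [0, 1])

print(same_direction, opposite, orthogonal)
```

The first pair shows that a cosine measure of 1 does not imply identical objects, and the second pair shows that negative values require attributes with opposite signs.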
