湖北工业大学 (Hubei University of Technology, HBUT)
Answer:
(a)
The boss is correct; the marketing director is overlooking the obvious. The number of complaints is a
meaningless measure of satisfaction when it does not account for the number of products purchased.
To fix the measure, compare the number of complaints filed for each product with the number of units
of that product sold.
To determine which product has the worst complaint record, compare complaint rates, that is, the number
of complaints divided by the number of units sold. A further consideration is the minimum number of
units a product must have sold before its rate can be trusted. For example, suppose a store sold 100 units
of product x and 2 units of product y, and received 30 complaints about x and 1 complaint about y. The
complaint rates are then 30% for x and 50% for y. Looking only at the rates, the boss would rush to fix
the product with the 50% complaint rate, even though only 2 units of that product were sold and the
severity of the complaint is unknown. A minimum number of units sold should therefore be required
before a product's complaint rate is included in the analysis.
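The rate-plus-threshold idea can be sketched in a few lines of Python. This is only an illustration of the example above: the function name `complaint_rates` and the cutoff `min_units=30` are assumptions chosen for the sketch, not values given in the exercise.

```python
def complaint_rates(sales, complaints, min_units=30):
    """Return complaints per unit sold for each product.

    min_units is a hypothetical cutoff: products that sold fewer units
    are reported as None because their rate rests on too few sales.
    """
    rates = {}
    for product, units in sales.items():
        if units < min_units:
            rates[product] = None  # too few sales to judge
        else:
            rates[product] = complaints.get(product, 0) / units
    return rates


# Numbers from the example: 100 units of x, 2 units of y.
sales = {"x": 100, "y": 2}
complaints = {"x": 30, "y": 1}

print(complaint_rates(sales, complaints))               # y is filtered out
print(complaint_rates(sales, complaints, min_units=1))  # raw rates for both
```

With the cutoff in place, product y is flagged as having too little data instead of being ranked above product x.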
(b)
Treating the original product satisfaction attribute (the complaint counts) as a ratio attribute is correct.
However, the counts are not comparable across products, because each count is measured against a
different number of units sold, which makes the data set biased. This is like taking a set of temperatures
measured in Celsius, Kelvin, and Fahrenheit and reporting the raw numbers without first converting all
measurements to one common scale.
2. Which of the following quantities is likely to show more temporal autocorrelation: daily rainfall or daily
temperature? Why?
Answer:
A feature shows temporal autocorrelation if measurements taken closer together in time are more similar
with respect to the values of that feature than measurements taken farther apart. Temperature usually
changes gradually from one day to the next, whereas rainfall is intermittent: a day of heavy rain is often
followed by a dry day, so the amount can change abruptly. The same argument holds in space, since
physically close locations are more likely to have similar temperatures than similar amounts of rainfall,
because rainfall can be very localized. Therefore, daily temperature shows more temporal autocorrelation
than daily rainfall.
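The temporal autocorrelation asked about in the question can be measured with a lag-1 autocorrelation coefficient, which compares each day's value with the next day's. The sketch below is only illustrative: the function name `lag1_autocorrelation` is an assumption, and both series are made-up toy values, not measurements.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: covariance of consecutive values
    divided by the total variance around the series mean."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


# Made-up toy series (illustration only): temperature drifts slowly,
# rainfall alternates between dry days and isolated wet days.
temperature = [20, 21, 22, 23, 22, 21, 20, 19]
rainfall = [0, 12, 0, 0, 15, 0, 9, 0]

print(lag1_autocorrelation(temperature))
print(lag1_autocorrelation(rainfall))
```

For these toy series the slowly varying one has the larger coefficient, which is the pattern the answer expects for real daily temperature versus daily rainfall.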
3. Based on the data in Table 1 in Chapter 4, draw separate decision trees to predict which category the
lion, owl, and crocodile belong to.
Answer:
Name       Body Temperature  Skin Cover  Gives Birth  Aquatic Creature  Aerial Creature  Has Legs  Hibernates  Class Label
Lion       Warm-blooded      hair        yes          no                no               yes       no          mammal
Owl        Warm-blooded      feathers    no           no                yes              yes       no          bird
Crocodile  Cold-blooded      scales      no           no                no               yes       no          reptile
Table 1
From Table 1 and the decision tree for the lion, we can predict that a lion is a mammal.
From Table 1 and the decision tree for the owl, we can predict that an owl is a bird.
From Table 1 and the decision tree for the crocodile, we can predict that a crocodile is a reptile.
Body temperature, hibernation, and legs are the attributes in the dataset that decide whether an animal is
a mammal or a non-mammal, because both mammals and non-mammals include creatures that are aquatic
or aerial, have a range of skin covers, and may or may not give birth.
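A decision tree is a sequence of attribute tests, so it can be written as nested conditions. The sketch below is one hypothetical tree that happens to be consistent with the three rows of Table 1; it is not the tree from the original figures, and the choice of splits (body temperature first, then gives birth) and the name `classify` are assumptions made for illustration.

```python
def classify(animal):
    """Hypothetical tree fitted by hand to the three rows of Table 1."""
    if animal["body_temperature"] == "cold-blooded":
        return "reptile"
    # Warm-blooded branch: split on whether the animal gives birth.
    if animal["gives_birth"] == "yes":
        return "mammal"
    return "bird"


# The attributes this sketch uses, copied from Table 1.
animals = {
    "lion": {"body_temperature": "warm-blooded", "gives_birth": "yes"},
    "owl": {"body_temperature": "warm-blooded", "gives_birth": "no"},
    "crocodile": {"body_temperature": "cold-blooded", "gives_birth": "no"},
}

for name, attributes in animals.items():
    print(name, "->", classify(attributes))
```

A tree built from only three records will not generalize; it only shows how each prediction follows a path of tests from the root to a leaf.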
4. We further explore the cosine and correlation measures.
(a) What is the range of values that are possible for the cosine measure?
(b) If two objects have a cosine measure of 1, are they identical? Explain.
Answer:
(a)
[-1, 1]. Often the data has only non-negative entries, and in that case the range is [0, 1].
(b)
Not necessarily. All we know is that the values of their attributes differ by a constant positive factor, so
the objects point in the same direction but may differ in magnitude.
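Both answers can be checked numerically. The following is a minimal sketch using the standard definition of the cosine measure (dot product divided by the product of the vector lengths); the function name `cosine` and the example vectors are chosen for illustration only.

```python
import math


def cosine(x, y):
    """Cosine measure: dot product over the product of Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]  # y = 2 * x, so the two objects are not identical

print(math.isclose(cosine(x, y), 1.0))   # same direction
print(x == y)                            # but different objects
print(cosine([1.0, 0.0], [-1.0, 0.0]))   # opposite directions: lower bound
print(cosine([1.0, 0.0], [0.0, 1.0]))    # orthogonal, non-negative entries
```

The first two lines of output show a cosine of 1 for two different objects; the last two show the lower end of the range for general data and for non-negative data.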