Mosquito Mark Final PDF

Mosquito Repellant A/B Testing

Data Encoding:
I asked the researchers to encode their results as summarized in the table below. Note that SAgree and SDisagree stand for Strongly Agree and Strongly Disagree.

Out[2]:
   SAgree  Agree  Neutral  Disagree  SDisagree
0       5      4        3         2          1

For the Recommendation column, 0 was assigned to Commercial by default, since respondents were not asked to provide a recommendation for the commercial product.

Out[3]:
   HighlyRecommended  Moderate  Recommended  LeastRecommended  NotRecommended  Commercial
0                  5         4            3                 2               1           0

Similarly, Gender was encoded as 1 - Male and 0 - Female. Finally, Age was binned as summarized by the table below.

Out[4]:
   15 below  16-20  21-25  26-30  31-35  36 above
0         1      2      3      4      5         6
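
The encodings above can be sketched as plain lookup tables. This is only an illustration: the names `LIKERT`, `RECOMMENDATION`, `GENDER`, `age_bin` and `encode_response` are hypothetical, the Out[3] header is read as six columns to match its six codes, and it is an assumption that respondents gave an exact age rather than ticking a range.

```python
# Codes taken from the tables above; all names here are hypothetical.
LIKERT = {"SAgree": 5, "Agree": 4, "Neutral": 3, "Disagree": 2, "SDisagree": 1}
RECOMMENDATION = {
    "HighlyRecommended": 5, "Moderate": 4, "Recommended": 3,
    "LeastRecommended": 2, "NotRecommended": 1, "Commercial": 0,
}
GENDER = {"Male": 1, "Female": 0}


def age_bin(age):
    """Map an age in whole years to the 1-6 age-range code."""
    if age <= 15:
        return 1
    if age >= 36:
        return 6
    # 16-20 -> 2, 21-25 -> 3, 26-30 -> 4, 31-35 -> 5
    return 2 + (age - 16) // 5


def encode_response(gender, age, price_answer, recommendation):
    """Encode a few fields of one raw response (assumed input format)."""
    return {
        "Gender": GENDER[gender],
        "Age Range": age_bin(age),
        "Price": LIKERT[price_answer],
        "Recommendation": RECOMMENDATION[recommendation],
    }


encoded = encode_response("Female", 23, "Agree", "NotRecommended")
print(encoded)
```

The remaining questionnaire columns (Packaging, Fragrance and so on) would go through `LIKERT` in the same way as Price.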

Below are the responses of the first 3 respondents; each respondent has one row per product (Commercial and Own).

Out[5]:
   RespondentNumber  Gender  Age Range     Product  Price  Packaging  Fragrance  Span  Safeness  Effectiveness  Duration  Efficiency  Recommendation
0                 1       1          3  Commercial      5          5          4     4         5              3         3           4               0
1                 1       1          3         Own      2          3          3     3         2              1         1           2               1
2                 2       0          3  Commercial      2          4          4     5         4              5         3           5               0
3                 2       0          3         Own      2          2          2     3         2              1         2           3               2
4                 3       1          6  Commercial      3          3          1     3         2              4         3           4               0
5                 3       1          6         Own      3          3          3     3         2              3         3           3               3

First, let us check the distribution of the Recommendations. Out of the 100 respondents, ~50% 'recommended' the product. Also, looking at the distribution, ~70% of the
respondents chose a rating of 3-5. By gender, we have slightly more female respondents. Finally, most of the respondents who gave a poor recommendation
are female.
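
As a rough sketch of how such shares can be computed, the snippet below tallies recommendations with the standard library. It uses only the three 'Own' rows shown in Out[5] as stand-in data (the full 100-respondent dataset is not reproduced here), and treating codes 1-2 as a 'poor' recommendation is an assumption.

```python
from collections import Counter

# (Gender, Recommendation) for the three 'Own' rows shown in Out[5].
own_rows = [(1, 1), (0, 2), (1, 3)]


def share_by_value(values):
    """Return the fraction of entries taking each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {value: counts[value] / total for value in sorted(counts)}


# Share of respondents per recommendation code.
recommendation_share = share_by_value([rec for _, rec in own_rows])

# Share of respondents who chose a recommendation of 3-5.
share_3_to_5 = sum(1 for _, rec in own_rows if rec >= 3) / len(own_rows)

# Gender split among 'poor' recommendations (codes 1-2, an assumed cutoff).
poor_by_gender = Counter(gender for gender, rec in own_rows if rec <= 2)

print(recommendation_share, share_3_to_5, poor_by_gender)
```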

Out[6]:

Next, let's examine the distribution of the recommendation by age group. Most of the respondents fall in age ranges 3, 4 and 6 (21-25, 26-30 and 36 above). The distribution by
age range looks fairly similar across all recommendations.

19/09/2019, 8:58 am
Out[7]:

Now, we will turn our attention to the questionnaire. Let us examine how the ratings of our product compare to those of the commercial product.

Out[9]:

Out[11]:

Here's what the graphs above are telling us:

We are getting most of the 'negative' votes, except for Fragrance.
While the above observation is true, in terms of the 'positive' reviews we are on par with the commercial product.
Our best attributes are Packaging, Span, Duration, Efficiency and Effectiveness.
For our best attributes, the distribution of the ratings by gender is almost equal. This means that both genders agree on the strengths of our product.

Finally, to better understand why our respondents chose their respective recommendations, we will use Logistic Regression (note that this is a multi-class problem). We chose
this learning algorithm because it assigns a coefficient to each metric that, after some calculations, can be interpreted as a probability.

The coefficients $\theta_i$ are chosen such that the below function is minimized.
$$-y\cdot \log(h(\theta)) - (1-y)\cdot \log(1-h(\theta))$$

where $h(\theta)$ is the sigmoid function applied to the linear combination of the coefficients and the data. Note that we're only concerned with the coefficients, not with how well
the model generalizes, since our sample size is relatively small.
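
To make the loss concrete, here is a minimal sketch of the per-example cross-entropy for a binary label, written without an intercept term. The function names and the sample values are hypothetical, and the regularization penalty that a library implementation would add is left out; for the multi-class case, one such binary loss per class (one-vs-rest) is assumed.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(theta, x, y):
    """Cross-entropy for one example with label y in {0, 1}, no intercept."""
    z = sum(t * xi for t, xi in zip(theta, x))  # linear combination
    h = sigmoid(z)
    return -y * math.log(h) - (1 - y) * math.log(1 - h)


# Hypothetical example: two ratings (5 and 4) and a positive label.
x, y = [5, 4], 1
loss_at_zero = log_loss([0.0, 0.0], x, y)   # h = 0.5 for all-zero coefficients
loss_at_small = log_loss([0.1, 0.1], x, y)  # coefficients pointing towards y = 1
print(loss_at_zero, loss_at_small)
```

With all coefficients at zero the prediction is 0.5 and the loss equals log 2; coefficients that push the prediction towards the true label lower it.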

Out[14]: LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=False,
                            intercept_scaling=1, l1_ratio=None, max_iter=100,
                            multi_class='warn', n_jobs=None, penalty='l2',
                            random_state=None, solver='warn', tol=0.0001, verbose=0,
                            warm_start=False)

Out[15]:
    Gender  Age Range     Price  Packaging  Fragrance      Span  Safeness  Effectiveness  Duration  Efficiency
0  0.43817   0.397997  0.468527   0.507815    0.27758  0.620712  0.476683       0.605174  0.503099    0.660538

For interpretability's sake, we only show the first row of the assigned probabilities. Notice how our best attributes received the highest probabilities.
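
The report does not spell out the calculation that turns coefficients into probabilities. One plausible reading, shown here purely as an assumption, is that each coefficient is passed through the sigmoid function, so a coefficient of zero maps to 0.5 and positive coefficients map above it. The coefficient values below are hypothetical and are not taken from the fitted model.

```python
import math


def coefficient_to_probability(coef):
    """Squash a raw coefficient into (0, 1) with the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-coef))


# Hypothetical coefficients, not the ones behind Out[15].
example_coefs = {"Fragrance": -1.0, "Price": 0.0, "Efficiency": 1.0}

probabilities = {
    name: coefficient_to_probability(coef)
    for name, coef in example_coefs.items()
}
print(probabilities)
```

Under this reading, values above 0.5 in the table would correspond to metrics with a positive coefficient for that class.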

Summary
~70% of the respondents are likely to recommend the product.
Although we receive most of the lower votes on most of the metrics, we're almost on par with the commercial product on the higher votes.
Our best attributes are Packaging, Span, Effectiveness, Duration and Efficiency.

Recommendations
Apply more sophisticated sampling techniques to avoid bias.
Increase the sample size.

Analysis made by
Benjamin Reyes Cabalona Jr.
Associate Data Scientist at Novare Technologies
benjamin.cabalonajr@novare.com.hk
benjamin.cabalonajr@outlook.com
