8/10/2021 A Large Number Of Insurance Records Are To Be Exam... | Chegg.com

Question:

A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraudulent. A sample database is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample database (total number of records, N = 800), the model ends up correctly classifying 310 frauds and 270 non-frauds. It misses 90 frauds, and classifies 130 records incorrectly as frauds when they were not.

a. Produce the classification matrix for the sample as it stands.
b. Find the adjusted misclassification rate (adjusting for the oversampling).
c. What percentage of new records would you expect to be classified as fraudulent?

Expert Answer

Cooper Park answered this


a.

A classification matrix sorts all cases from the model into categories by determining whether the predicted value matched the actual value. All the cases in each category are then counted, and the totals are displayed in the matrix.

The 1% fraud rate describes the historical database, not this sample. Because of the oversampling, the sample is balanced: 310 + 90 = 400 of the 800 records are actual frauds, and 270 + 130 = 400 are actual non-frauds.

In this case, the classification matrix is given as:

                     Predicted fraud   Predicted non-fraud    Total
Actual fraud                     310                    90      400
Actual non-fraud                 130                   270      400
Total                            440                   360      800

If the model predicts fraud when it is actually fraud, such records are 310.

If the model predicts non-fraud when it is actually non-fraud, such records are 270.

If the model predicts non-fraud when it is actually fraud, such records are 90.

If the model predicts fraud when it is actually non-fraud, such records are 130.

The matrix is 2 x 2 because there are only 2 categories (fraud and non-fraud); with n categories for classification it would be an n x n matrix.
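
The four counts can also be tallied in code. The following is a minimal Python sketch rather than part of the original answer; the dictionary layout and the names `confusion`, `actual_totals`, and `predicted_totals` are illustrative assumptions.

```python
# Counts reported in the question for the oversampled sample (N = 800).
# Keys are (actual, predicted) pairs; this layout is an illustrative choice.
confusion = {
    ("fraud", "fraud"): 310,          # frauds correctly flagged
    ("fraud", "non-fraud"): 90,       # frauds the model missed
    ("non-fraud", "fraud"): 130,      # non-frauds wrongly flagged as fraud
    ("non-fraud", "non-fraud"): 270,  # non-frauds correctly passed
}

labels = ["fraud", "non-fraud"]

# Row totals: number of records of each actual class in the sample.
actual_totals = {a: sum(confusion[(a, p)] for p in labels) for a in labels}
# Column totals: number of records the model assigned to each class.
predicted_totals = {p: sum(confusion[(a, p)] for a in labels) for p in labels}
total = sum(confusion.values())

# Print the matrix with actual classes as rows and predictions as columns.
print(f"{'':<18}{'pred fraud':>12}{'pred non-fraud':>16}")
for a in labels:
    row = f"{'actual ' + a:<18}"
    row += f"{confusion[(a, 'fraud')]:>12}{confusion[(a, 'non-fraud')]:>16}"
    print(row)
print(f"total records: {total}")
```

The row totals make the balance of the oversampled data explicit, which is the starting point for any adjustment back to the historical fraud rate.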

https://www.chegg.com/homework-help/questions-and-answers/large-number-insurance-records-examined-develop-model-predicting-fraudulent-claim…

b.

Misclassification rate:

NOTE: The misclassification rate is the rate of wrong predictions.

Here, 130 + 90 = 220 of the total 800 predictions were wrong, so the misclassification rate on the sample is 220/800, i.e., 27.5%.

Actual % of fraud in the sample = 400/800 = 50%
Predicted % of fraud by the model = (310 + 130)/800 = 55%

These figures reflect the balanced sample, so the share of non-frauds has to be adjusted to match the historical data.

The historical percent of fraud = 1%

Then non-fraud = 99%

Now adjust it in our model. To reweight to the actual proportion of 0's (non-frauds) and 1's (frauds), keep the 400 frauds and add enough 0's to get the original balance (1 : 99). If x is the total number of records after reweighting, then

400 + 0.99x = x

x = 40,000

This means the reweighted sample contains 39,600 non-frauds, so each of the 400 sampled non-frauds counts 99 times. The added records are split in the same ratio as the sampled non-frauds: 130/400 = 32.5% classified as fraud and 270/400 = 67.5% classified as non-fraud.

So now the reweighting is done and the adjusted matrix is:

                     Predicted fraud   Predicted non-fraud    Total
Actual fraud                     310                    90      400
Actual non-fraud              12,870                26,730   39,600
Total                         13,180                26,820   40,000

Therefore, the adjusted misclassification rate is now

(90 + 12,870)/40,000 = 12,960/40,000 = 32.4%
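
The reweighting can be checked with a short Python sketch. It is not part of the original answer and assumes one specific reading of the adjustment: the 400 sampled frauds are kept as they are, and the non-fraud row is scaled until frauds make up 1% of all records. The function name `reweight_non_frauds` and its parameters are hypothetical.

```python
def reweight_non_frauds(tp, fn, fp, tn, fraud_rate):
    """Scale the non-fraud row so frauds make up `fraud_rate` of all records.

    tp, fn: frauds correctly flagged / missed in the oversampled sample.
    fp, tn: non-frauds wrongly flagged / correctly passed in the sample.
    """
    frauds = tp + fn
    non_frauds = fp + tn
    # Total size at which the sampled frauds equal the target fraud rate.
    total = frauds / fraud_rate
    # Weight applied to every sampled non-fraud record.
    weight = (total - frauds) / non_frauds
    return tp, fn, fp * weight, tn * weight


# Sample counts from the question, historical fraud rate of 1%.
tp, fn, fp_adj, tn_adj = reweight_non_frauds(310, 90, 130, 270, fraud_rate=0.01)

adjusted_total = tp + fn + fp_adj + tn_adj
# Errors are the missed frauds plus the reweighted false alarms.
adjusted_error = (fn + fp_adj) / adjusted_total
print(f"adjusted misclassification rate: {adjusted_error:.1%}")
```

Under this assumption each sampled non-fraud receives a weight of 99, so the false alarms dominate the adjusted error and the rate comes out at roughly 32.4%.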

c.

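Now apply the model to data with the historical balance, i.e., 400 frauds and 39,600 non-frauds in 40,000 records. It flags 310 of the 400 frauds and 130/400 = 32.5% of the non-frauds, which is 0.325 x 39,600 = 12,870 records.

The model would therefore end up predicting 310 + 12,870 = 13,180 of the 40,000, or 32.95%, of the new records as fraud.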
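
The same percentage can be reached without building a reweighted matrix, by weighting the model's class-wise flag rates with the historical class shares. This Python sketch is an illustration added here, not part of the original answer, and all variable names are assumptions.

```python
# Rates at which the model flags a record as fraud, estimated from the sample.
flag_rate_given_fraud = 310 / (310 + 90)       # share of actual frauds flagged
flag_rate_given_non_fraud = 130 / (130 + 270)  # share of non-frauds flagged

# Historical class shares stated in the question.
p_fraud = 0.01
p_non_fraud = 1 - p_fraud

# Law of total probability: overall share of records flagged as fraud.
share_flagged = (flag_rate_given_fraud * p_fraud
                 + flag_rate_given_non_fraud * p_non_fraud)
print(f"share of new records classified as fraudulent: {share_flagged:.2%}")
```

This relies on the assumption that the model behaves on new records the way it did on the sample within each class.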
