SECTION 1: INTRODUCTION

The motor insurance industry is typically characterized by cyclical swings in combined operating
ratios (CORs) and premiums. Insurers can use these cycles to stay competitive during downturns and
to build up reserves during recoveries. This cyclicality makes auto insurance an interesting market
to study. Over the last decade, the share of the European property and casualty (P&C) insurance
market devoted to car insurance has declined steadily. Even so, vehicle insurance remained the most
important P&C business line in 2016, accounting for 38% of the industry's total, followed by
property insurance (27%) and general liability insurance (11%) (InsuranceEurope, 2019). In 2009, a
high COR of 108.1% and the worst underwriting results led to a loss of €5.5bn in the European auto
insurance business. This was a direct effect of the Great Recession that began in 2007 and
continued through 2008. A COR of 109% was recorded, with France being a major contributor. COR and
underwriting results dipped during 2008-2010, but both have been on the increase since 2016. The
French auto insurance market remained unprofitable in 2016 despite some encouraging signs of
recovery (the COR fell from its 2009 peak of 106.9% to 106% in 2016). The main driver of the high
COR was the rising cost of claims, which cut into insurers' profits.
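To make the COR figures above concrete, the following minimal Python sketch computes a combined operating ratio as claims plus expenses divided by premiums. The function name and the monetary amounts are hypothetical, chosen only so that the result lands near the 108.1% quoted above; the exact definition of the ratio (for example, which premium base is used) varies between sources and is an assumption here.

```python
def combined_operating_ratio(claims, expenses, premiums):
    """COR = (claims + expenses) / premiums; a value above 1.0 means an underwriting loss."""
    if premiums <= 0:
        raise ValueError("premiums must be positive")
    return (claims + expenses) / premiums

# Hypothetical amounts (in millions); they are not taken from the cited report.
cor = combined_operating_ratio(claims=80.0, expenses=28.1, premiums=100.0)
print(f"COR: {cor:.1%}")
print("underwriting loss" if cor > 1.0 else "underwriting profit")
```

Read this way, any COR above 100% means that claims and expenses together exceeded the premiums collected.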
This study addresses questions about the optimal size and efficiency of France's auto insurance
sector. Size and output are proxied by the claims data used in this study, namely the frequency and
severity of claims. Machine learning techniques, implemented in MATLAB, are used to analyze the
data, since they provide the best possible outcomes that can be inferred from a finite set of data
sources. Factors including population density, region, vehicle age, engine power, fuel type,
vehicle brand, and driver age are compared against the claims data. The annual dataset covers
413,169 policyholders with motor third-party liability (MTPL) insurance. Claim frequency and claim
severity are held in two separate datasets, FreMTPLfreq and FreMTPLsev. Machine learning falls into
two primary categories. The first is supervised, or predictive, learning, in which the computer is
given labeled data and learns to map inputs to outputs. The second is unsupervised, or descriptive,
learning, in which the computer explores the data and discovers patterns on its own (Murphy, 2012).
The very large number of zero-claim observations makes claim frequency hard to model in this line
of insurance, because the model must be able to capture them. For MTPL policies, "no claims" means
that no claims were made against the insurer during the policy's effective period, not that no
accidents occurred.
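The zero-claim issue described above can be illustrated with a small Python sketch that computes claim frequency per unit of exposure and the share of policies with no claims. The toy portfolio is invented for illustration and is not drawn from FreMTPLfreq; the function names are likewise hypothetical.

```python
# Hypothetical toy portfolio: (claim_count, exposure_in_years) per policy.
# These numbers are invented for illustration, not taken from the dataset.
policies = [
    (0, 1.00), (0, 0.50), (1, 1.00), (0, 0.25),
    (0, 1.00), (2, 1.00), (0, 0.75), (0, 0.50),
]

def claim_frequency(portfolio):
    """Total claims divided by total exposure (claims per policy-year)."""
    total_claims = sum(n for n, _ in portfolio)
    total_exposure = sum(e for _, e in portfolio)
    return total_claims / total_exposure

def zero_claim_share(portfolio):
    """Fraction of policies that reported no claim at all."""
    zeros = sum(1 for n, _ in portfolio if n == 0)
    return zeros / len(portfolio)

print(f"claim frequency: {claim_frequency(policies):.3f} per policy-year")
print(f"zero-claim share: {zero_claim_share(policies):.1%}")
```

Even in this small example most policies carry a count of zero, which is the pattern a frequency model has to reproduce.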

SECTION 2: LITERATURE REVIEW


Previous investigations have found that some locations are more prone to accidents than others.
Data analysis studies of this type help to predict accident-prone locations and to monitor the
hazards and exposure that influence accidents in such places (Sakhare and Kasbe, 2017). How or when
an accident will occur cannot be predicted with certainty, and many factors, including road and
weather conditions, visibility, the number of people in the vehicle, and the speed at the time of
the accident, may all play a role in its severity. Insurers must now take into account a wealth of
new information that allows accidents to be predicted more accurately while remaining consistent
with accepted societal norms.
The population studied, its diversity, the socioeconomic status of the study participants, and the
period to which the findings are applied all affect the epidemiological data that can be gathered
(Dunne et al., 2020). Most of the publications reviewed had rather small patient populations, which
makes their findings less reliable, especially for a densely populated region like South America.
There is also a disparity between rural areas and major cities in population and medical
facilities. That study found that urban dwellers are more likely to be at risk than their rural
counterparts.
Comparing traffic fatality rates in metropolitan centers with those in rural areas reveals a
striking disparity. Although India's major cities bear a somewhat smaller share of the country's
accident burden, over half of the country's urban districts have a greater casualty risk than their
rural (mofussil) counterparts (Singh, 2017). It is therefore of the utmost importance to recognize
the devastating nature of road accidents and injuries and to take appropriate action.
Annual mileage also appears to be a good risk indicator: statistically, an individual's annual
mileage is a reliable predictor of his or her likelihood of being at fault in a collision (Lemaire
et al., 2016). Although mileage may be a proxy for the age of a vehicle, it can also indicate how
hard the engine has been worked, which can shorten the car's life.
Two measures of exposure were proposed: one compares drivers' involvement in a specific kind of
overtaking accident with their involvement in overtaking accidents in general, and the other
compares the average age of at-fault drivers with that of other drivers in the sample (Clarke et
al., 1998). Ten categories of overtaking accident are recognized, of which only three are discussed
at length here: a collision with a right-turning vehicle (the most common injury accident for
overtakers), which can occur when either a young driver makes a poor decision when overtaking or an
older driver makes a poor choice when turning right; a head-on collision, which can happen to
drivers of any age due to exposure; and a 'reverse and lose control' accident, which is more common
among younger drivers.
Several machine learning methods may be used to model claim frequency and insurance pricing,
including regression analysis, decision trees, neural networks, and boosting algorithms such as
XGBoost (Weerasinghe and Wijegunasekara, 2016). While these models have impressive predictive
ability, they differ significantly from regression models in that their parameters are difficult to
interpret and their computation time is long. The authors' results show that the neural network has
the highest predictive accuracy of the three models they compare. They nevertheless state that
logistic regression is the most effective model for understanding the relationships between the
variables.
The major strength of machine learning (ML) is working with large volumes of data, which makes it
well suited to the insurance industry. ML can help with all three types of dataset: structured,
semi-structured, and unstructured. With state-of-the-art predictive accuracy, it can be applied
throughout the whole value chain to risk, claims, and customer activities. Machine learning has a
wide variety of potential applications in the insurance industry, from detecting fraud to improving
accuracy in assessing risk and calculating premiums (Burri et al., 2019). Artificial intelligence
is not a new idea, yet it has become widely used only in recent years. The three main types of
learning are supervised learning, unsupervised learning, and reinforcement learning. Over the past
several decades, most insurance companies have relied on supervised learning to assess risk, using
known parameters in different combinations to obtain a desired outcome.

SECTION 3: AIM(S)


The goal of this research is to use machine learning to make sense of data from the French motor
third-party liability insurance market. Specifically, MATLAB is used to study the claims in the
dataset in relation to characteristics such as population density, region, vehicle age, vehicle
power, fuel type, vehicle brand, driver age, and claim amount. Claims data from the French motor
insurance sector will be analyzed using both supervised and unsupervised methods. The study will
allow premiums to be estimated from these criteria. The aim is to determine how much each factor
affects claims, so that underwriters can prioritize which portfolios to work on in order to reduce
risk.

SECTION 4: RESEARCH QUESTIONS


The following questions will be investigated:
1. What impact do factors such as population density, region, vehicle age, engine power, and
exposure have on the claims that are filed?
2. What is the current state of the claims market in French auto insurance?
3. When and how can the French auto insurance market use controls to lower claim costs?

SECTION 5: BENEFITS

This research provides a fundamental review of the factors influencing claim volumes in the French
auto insurance sector. Its goal is to disentangle the relative weight of each factor's influence on
claims, information that underwriters may use to prioritize which portfolios need attention to
reduce risk. Insurers that write motor policies may use the findings as a starting point for
developing a risk strategy, prediction, and control model that will increase the sustainability of
operations while also catering to the needs of policyholders. The findings may also shed light on
strategies that car insurance firms might use to spread risk and widen their coverage.

SECTION 6: FRAMEWORK AND METHODOLOGY

This analysis uses a causal research design to identify which aspects of French auto insurance
affect claim sizes and which do not. Informed by the analysis of the primary factors associated
with the largest claim amounts, the insurance industry can then take the necessary steps. For
instance, larger claims may indicate greater risk and should therefore prompt the insurer to
propose new premium structures or to revise premium levels (Gulati, 2009). Causal research is not
without limitations, however, since outcomes may reflect coincidence rather than the actual
circumstances.
The approach is quantitative: relationships among the variables are evaluated using classification
models trained with supervised learning. Visual analysis, such as scatterplots and other diagrams,
was also judged useful for interpreting the associations. With data that can be measured,
calculated, and analyzed, one can form a more confident view of the relationship between an
endogenous and an exogenous variable.
The k-nearest-neighbors (kNN) model is a non-parametric classification technique that is both easy
to use and effective (Guo et al., 2003). To use kNN, however, an appropriate value of k must be
chosen, and this value has a substantial impact on classification performance. In that sense, the
kNN method is biased by the value of k. There are several ways to determine the optimal k, but one
of the simplest is to run the algorithm repeatedly with different values of k and choose the one
that gives the best results.
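The try-several-values procedure for choosing k can be sketched in a few lines of Python (the study itself works in MATLAB; Python is used here only for illustration). The feature pairs (driver age, vehicle age), the labels, and the candidate values of k are all hypothetical, and the features are left unscaled to keep the sketch short; a real analysis would normally standardize them first.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, valid, k):
    """Share of validation points that kNN labels correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)

def best_k(train, valid, candidates):
    """Return the candidate k with the highest validation accuracy (smallest k wins ties)."""
    return max(sorted(candidates), key=lambda k: accuracy(train, valid, k))

# Hypothetical toy data: ((driver_age, vehicle_age), label). Invented for illustration.
train = [
    ((20, 12), "claim"), ((22, 10), "claim"), ((24, 11), "claim"),
    ((45, 3), "no_claim"), ((50, 2), "no_claim"),
    ((55, 4), "no_claim"), ((48, 5), "no_claim"),
]
valid = [((21, 11), "claim"), ((52, 3), "no_claim"), ((30, 8), "claim")]

for k in (1, 3, 7):
    print(f"k={k}: validation accuracy {accuracy(train, valid, k):.2f}")
print("chosen k:", best_k(train, valid, (1, 3, 7)))
```

With k equal to the full training-set size, every query receives the majority label, which shows why an overly large k degrades performance.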
As noted above, kNN was evaluated as a classification model for use in supervised learning for the
purpose of data analysis. Supervised learning allowed the analyst to model the data and to examine
its significance and relationships within a predefined learning algorithm. The results would be
used to confirm the effects of well-known variables in auto insurance that increase or decrease
claim amounts. Such findings were crucial for determining how each of the above characteristics may
be used for predictive claim control, which may or may not be applicable to the French auto
insurance market.
Linear regression, one of the most important models in statistics, is used to determine the
association between a dependent and an independent variable (Brown, 2009). A variation on this
model, known as multiple linear regression, examines the association between a dependent variable
and several independent variables. The investigation centers on a multiple linear regression model,
which is implemented as a MATLAB script.
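As a rough illustration of what a multiple linear regression fit involves, the Python sketch below solves the least-squares normal equations directly. It is not the study's MATLAB script: the predictors (vehicle power, driver age), the noise-free target values, and the function names are all hypothetical, chosen so that the recovered coefficients can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend a column of ones
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical predictors: (vehicle_power, driver_age). Invented for illustration.
rows = [(4, 30), (6, 45), (5, 22), (9, 60), (7, 35)]
# Noise-free synthetic claim amounts: 100 + 20 * power - 3 * age.
targets = [100 + 20 * power - 3 * age for power, age in rows]

intercept, b_power, b_age = fit_linear(rows, targets)
print(f"intercept={intercept:.2f}, power={b_power:.2f}, age={b_age:.2f}")
```

Each fitted coefficient is read as the change in the dependent variable per unit change in one predictor with the others held fixed, which is what allows the factors to be ranked by influence.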
