
Predicting Forest Fires with Machine Learning

Sophie Tan
Maplesoft

Introduction
Forest fires are devastating and can rage out of control. The 2017 forest fires in
British Columbia, for example, forced thousands of people to evacuate, burned 1.2 million
hectares of forest, and caused more than CAD $500 million in damage. Moreover,
pollution from the fires caused respiratory problems for people living hundreds of
kilometers away.

Predicting the source and spread of forest fires could have considerable benefits for
human health and life, the economy, and the environment. Such predictions could help
identify areas at higher risk; for example, authorities with limited resources could
choose to focus on monitoring specific areas.

Many techniques have been developed, including approaches that use satellite
images, historical weather data and computational fluid dynamics.

The advent of high-power, low-cost computing has enabled the use of machine
learning and neural networks for predicting forest fires. Machine learning can detect
trends and patterns that humans often overlook, and the approach has recently been
gaining media attention.

The forest fires data set from the UCI Machine Learning Repository (attached to this
workbook) records several observations (from the northeast region of Portugal)
against the burnt area of a forest fire. The contents of the data file are described
here.

The Month and Day columns from the original dataset are not considered in
this application.
The first 10 columns are continuous numerical indices that characterize each
forest fire occurrence (location, meteorological data, etc.). The last column is
the burnt area in hectares.

Employing the DeepLearning package, this application trains a neural network with
the forest fire data set, and predicts the burnt area of a forest fire for a given
location and given environmental and ecological conditions.

The donor of the dataset suggests applying a log transformation to the burnt area to
reduce its skew towards 0. However, this application uses the original data.
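For readers who want to try the suggested transformation, the following is a minimal sketch. It assumes the form ln(1 + area), a common choice when many values are exactly 0, since ln(0) is undefined. The source does not state which form the donor intended. The names log_area and inv_log_area are hypothetical helpers and are not part of the original worksheet.

```maple
# Hypothetical helpers (not in the original worksheet).
# Assumed form of the log transformation: ln(1 + area).
log_area := a -> ln(1 + a):        # forward transform, defined for a >= 0
inv_log_area := y -> exp(y) - 1:   # inverse: maps a transformed value back to hectares

# Apply the forward transform to a few illustrative areas (in hectares).
map(log_area, [0, 0.5, 10.0, 100.0]);

# Round trip: the inverse recovers the original area.
inv_log_area(log_area(10.0));
```

If a network were trained on the transformed area instead of the original one, its predictions would be on the log scale and would need to be passed through the inverse before being read as hectares.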

Reference: Dataset accessed from forestfires.csv (UCI Machine Learning Repository)

Import Data
> restart;
> train_data := Import("this:///forestfires_train.csv");

(2.1)

> test_data := Import("this:///forestfires_test.csv");

(2.2)

> cols := ColumnLabels(train_data)[..-2];


(2.3)
The dataset contains 517 instances: 413 are used for training and 104 for
testing, which is roughly an 80/20 split.

Training and evaluating the Deep Neural Regressor


> with(DeepLearning):
> fc := [seq(NumericColumn(convert(u, string), shape = [1]), u in
cols)]:
In the model, we use three hidden layers with 32, 64, and 32 nodes,
respectively.
> regressor := DNNRegressor(fc, hidden_units=[32,64,32]):
> regressor:-Train(train_data[1..10], train_data[11], steps = 20,
num_epochs = 20, shuffle = false):
We specify that the entire dataset is passed to the model 20 times (num_epochs =
20), that training runs for 20 steps (steps = 20), and that the training data batches
are not randomly shuffled (shuffle = false). Note that the initial weights of the
neural network are selected randomly, which means that each time we run the Train
command, even with the same input data, we get a slightly different model, and
therefore different accuracy results and predictions.
Now that we have trained the DNN regressor, we can evaluate its effectiveness with the
test data. In this case, the average loss is ~6448.78, which means that the model is
not very effective. Note that this is a difficult regression task and that the focus of
this application is to demonstrate the use of the regressor. Based on these evaluations,
we can decide whether to adjust the architecture of the model above.
> regressor:-Evaluate(test_data[1..10], test_data[11], steps =
200);

(3.1)
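To put the reported loss on the same scale as the data, one can take its square root. This is a minimal sketch that assumes the average loss is a mean squared error, so that its units are hectares squared. The worksheet does not state the loss function explicitly. The names avg_loss and rmse are introduced here only for illustration.

```maple
# Assumption: the reported average loss is a mean squared error, in ha^2.
avg_loss := 6448.78:

# Root mean squared error: a typical size of the prediction error, in hectares.
rmse := sqrt(avg_loss);
```

Under that assumption the typical error is about 80 ha, which is large for a dataset whose burnt areas are skewed towards 0. Because the initial weights are random, the loss and this figure will differ from run to run.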

Build Predictor Function


> predictor := proc (ds)
    regressor:-Predict(Transpose(DataFrame(ds)), num_epochs = 1, shuffle = false)[1]
end proc;
(4.1)

Above we have built a predictor function that takes an arbitrary set of measurements
as a DataSeries and returns a prediction generated by the trained DNN regressor.
Now we can pass the predictor a set of values that represent a hypothetical forest fire
that starts at a given location under a given set of weather conditions, and use it to
predict the burnt area of said forest fire.
> ds := DataSeries([4,4,88.1,25.7,67.6,3.8,14.1,43,2.7,0], labels=
cols):
> predictor(ds);
(4.2)

According to our regression model, a forest fire with the given traits (location,
weather, etc.) will have a burnt area of ~3.96 ha.
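The same predictor can be called in a loop to explore what-if scenarios. The sketch below varies only the seventh value and keeps the rest of the example input fixed. It assumes that the seventh column is the temperature in degrees Celsius, following the column order of the UCI file once Month and Day are dropped; check this against cols before relying on it. It also assumes that predictor and cols are already defined as above. The names T and ds_T are hypothetical.

```maple
# Hypothetical what-if study (not in the original worksheet).
# Assumption: the 7th entry is the temperature in degrees Celsius.
for T in [10.0, 20.0, 30.0] do
    # Same location and indices as the example above, with a different temperature.
    ds_T := DataSeries([4, 4, 88.1, 25.7, 67.6, 3.8, T, 43, 2.7, 0], labels = cols):
    # Show the temperature next to the predicted burnt area.
    print(T, predictor(ds_T));
end do:
```

Given the high test loss, differences between these predictions should be read as a demonstration of the workflow and not as reliable estimates.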
