
International Journal of Coal Preparation and Utilization

ISSN: 1939-2699 (Print) 1939-2702 (Online) Journal homepage: http://www.tandfonline.com/loi/gcop20

Identification of Coal and Gangue by Feed-forward Neural Network Based on Data Analysis

Wei Hou

To cite this article: Wei Hou (2017): Identification of Coal and Gangue by Feed-forward Neural
Network Based on Data Analysis, International Journal of Coal Preparation and Utilization, DOI:
10.1080/19392699.2017.1290609

To link to this article: https://doi.org/10.1080/19392699.2017.1290609

Accepted author version posted online: 16 Feb 2017.
Published online: 15 Mar 2017.


Wei Hou
School of Engineering and Applied Science, University of California, Los Angeles, CA, USA

ABSTRACT

While coal is the major power source around the globe, gangue is unwanted in power plants. Thus, separating gangue from coal is a crucial part of the preprocessing step of mining. With the development of computational technologies, it has become possible to improve the effectiveness of gangue separation. By establishing a coal-gangue separation system based on the differences between coal and gangue in surface texture and grayscale features, this paper proposes a method that combines image feature extraction with an artificial neural network to identify gangue. In addition, this method will enable robots, instead of humans, to pick the gangue. Ultimately, automated coal-gangue separation, more efficient raw coal sorting, and higher coal quality can be achieved if the method proposed in this paper is applied in the coal industry.

ARTICLE HISTORY
Received 1 October 2016
Accepted 31 January 2017

KEYWORDS
Automation separation; coal; coal beneficiation; gangue; image recognition; neural network

Introduction
Coal is a main energy source and an important industrial raw material in China, accounting for more than 70% of primary energy production and consumption [1]. Gangue is the unwanted by-product of coal, as it can cause incomplete combustion in power plants. The majority of coal refineries separate gangue manually. The mechanical methods are roughly categorized as dry processes and wet processes. Dry processes do not involve the use of water, whereas in wet processes water is the main medium for washing and jigging. K. Guru Raghavendra Reddy compared the main methods in use, as shown in Table 1 [2]. All of these methods have their own disadvantages, such as pollution and low efficiency.
With the boom in computer science and interdisciplinary studies, the possibility of combining image recognition with coal-gangue identification technologies is being explored [3]. Ma X. discussed an identification method based on wavelet analysis [4]. K. Guru Raghavendra Reddy explored the difference between the surfaces of coal and gangue in average grayscale and variance [2]. Liang et al. showed that it is possible to use SVM and neural networks to develop a high-accuracy coal-gangue recognition system [5]. Zelin Zhang et al. (2011) proposed a method of extracting characteristic parameters to forecast coal particle density [6]. This paper introduces a novel method based on the difference between coal and gangue in their chemical composition and formation process.

CONTACT Wei Hou weihou@g.ucla.edu School of Engineering and Applied Science, University of California,
Los Angeles, CA 90095, USA.
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/gcop.
© 2017 Taylor & Francis

Table 1. Comparison of coal cleaning methods.

Sl. no. | Methods | Advantages | Disadvantages | Costs
1 | Jigs | Large capacity; inexpensive; worldwide usage | Lower separation than dense-medium | Inexpensive
2 | Dense-medium separators | Good separation | Small capacities | Expensive
3 | Hydro cyclones | Simple structure | Water consumption | Inexpensive
4 | Concentration | Inexpensive; good pyrite separation | Small capacities | Inexpensive
5 | Froth flotation | Good results on fines | Complex; poor pyrite separation | Expensive
6 | Dry cleaning | No water required | Used for metallurgical coals; size <0.5 mm | Lower than wet processes

Specifically, the way coal and gangue are formed causes differences between them in surface gloss, color, and texture, so it is possible to build an image recognition device based on those differences. This paper adopts a self-developed image processing algorithm for gangue recognition, which combines feature extraction with an artificial neural network to identify gangue, and uses robots instead of humans to pick the gangue.
The differences in surface and texture between coal and gangue are what enable humans to identify them in the preprocessing step of mining. These differences come from the chemical composition of coal and gangue.
Coal is derived from ancient plants whose physical and chemical properties changed through biochemical reactions and geological processes. Coal is a black solid mineral composed mainly of carbon, hydrogen, oxygen, and nitrogen. Gangue is a low-carbon, hard by-product of coal. This black stone is formed because of the unevenness of precipitation and base. It is composed of various kinds of stone. The difference between the surfaces of coal and gangue is rather obvious in gray scale and texture, which provides the basis for image recognition of coal and gangue.

Materials and methods


Research overview
The contrast between the surface properties of coal and gangue is shown in Figure 1. The difference appears in the gray scale and texture of their surfaces. Theoretically, it is feasible to identify coal and gangue based on the gray scale difference between them.
To bionically simulate the manual identification of gangue, the extraction of gangue image characteristics and the computation of related parameters are carried out by an image identification system, the identification is implemented by an artificial neural network, and robots replace humans in picking gangue, so as to automate gangue separation. In sum, in this system an integrated computer system replaces the human brain, an image processing system replaces the eye, and a control system replaces the nervous system.
The purpose of this experiment is to establish the relationship between selected features and the identity of an unknown ore (whether it is coal or gangue). To do so, we first have to gather a large amount of data on coal and gangue, respectively. Then, we can train a neural network on the data we gathered. The neural network can be used to identify whether the ore is coal or gangue based on the ore's features.

Figure 1. Images of coal (a) and gangue (b). The histogram next to each image is the grayscale histogram of the coal and the gangue, respectively, showing the distribution of grayscale levels. As shown in the figure, the grayscale levels of the pixels are more dispersed in the image of coal than in the image of gangue.

Table 2. Proximate analysis of raw coal, coal, and gangue.

Sample | M_ad/% | A_ad/% | V_daf/% | FC_daf/% | C_ad/% | H_ad/% | O_ad/% | N_ad/%
Raw coal | 2.23 | 28.31 | 26.02 | 43.45 | 58.03 | 3.26 | 5.28 | 0.92

Analysis of the sample


A proximate analysis of the raw coal was conducted. The results are shown in Table 2.

Feature extraction device


(1) Illuminant
The available illuminants include bar lights, line lights, and dome lights. Because of its great ability to show texture details, a dome light was chosen as the illuminant.

(2) Industrial Camera
To extract enough features from the image, the industrial CCD camera must have high image quality. The camera chosen produces images of 1024×768 pixels to ensure enough surface information to identify coal and gangue. A conceived hardware structure design of the robotic coal-gangue separation system is shown in Figure 2.

Image processing
Since it is inefficient to keep all the surface information in the process of recognition, a
total of nine features are selected to represent the related features that differentiate coal
from gangue.

Figure 2. This figure shows a design of an automated coal-gangue separation system using the
algorithm discussed in this article.

(1) Grayscale

In photography and computing, a grayscale or grayscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information. Images of this sort, also known as black-and-white, are composed exclusively of shades of gray, varying from black, at the weakest intensity, to white, at the strongest.

(2) Grayscale Histogram

A “grayscale histogram” is a type of histogram that acts as a graphical representation of the grayscale distribution in a digital image. It plots the number of pixels for each grayscale value. By looking at the histogram for a specific image, a viewer will be able to judge the entire grayscale distribution at a glance.
Grayscale histogram of the image is a discrete function of grayscale. It is proven to be efficient in identifying coal and gangue [7]. Grayscale histogram is represented by the following function:

$$H(i) = \frac{n_i}{N}, \quad i = 0, 1, \ldots, L-1$$

Table 3. Features from grayscale histogram.

Feature | Definition | Expression
Average | Reflects the average grayscale value of an image | $\mu = \sum_{i=0}^{L-1} i\,H(i)$
Variance | Reflects how spread out the grayscale values are | $\sigma^2 = \sum_{i=0}^{L-1} (i - \mu)^2 H(i)$
Smoothness | Represents how continuously the grayscale changes in an image | $\mu_k = \frac{1}{1 + \sigma^2}$
Skewness | Reflects the asymmetry of the grayscale histogram | $\mu_s = \frac{1}{\sigma^3} \sum_{i=0}^{L-1} (i - \mu)^3 H(i)$
Energy | Reflects the regularity of the scattering of grayscale values | $\mu_N = \sum_{i=0}^{L-1} H(i)^2$
Entropy | Reflects the regularity of the scattering of grayscale values on the histogram | $\mu_E = -\sum_{i=0}^{L-1} H(i) \log_2 [H(i)]$

In the function above, i represents the gray level, L represents the number of gray levels, n_i represents the number of pixels with gray level i, and N represents the total number of pixels. The function gives the proportion of pixels in the whole image that have a specific gray level, that is, the frequency with which gray level i appears in the image.
Coal and gangue differ in their grayscale histograms [8], so the grayscale histogram is chosen as one of the features to be extracted for coal-gangue identification.
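As a minimal sketch of the histogram definition, the following Python function computes H(i) = n_i / N from a flat list of integer gray levels. The function name, the toy pixel list, and the choice of four gray levels are illustrative assumptions and are not taken from the experiment.

```python
def grayscale_histogram(pixels, levels=256):
    """Return the normalized histogram H(i) = n_i / N for i = 0 .. levels-1."""
    counts = [0] * levels              # n_i: number of pixels with gray level i
    for value in pixels:
        counts[value] += 1
    total = len(pixels)                # N: total number of pixels
    return [n_i / total for n_i in counts]


# Hypothetical toy image with 8 pixels and L = 4 gray levels.
toy_pixels = [0, 0, 1, 1, 1, 3, 3, 3]
hist = grayscale_histogram(toy_pixels, levels=4)
print(hist)  # [0.25, 0.375, 0.0, 0.375]
```

Because every pixel falls into exactly one bin, the values of H(i) always sum to 1.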

(3) Gray Features


Based on the chemical properties of coal and gangue and the difference between their chemical compositions, the following grayscale features are chosen as the foundation of the identification algorithm. The features extracted from the grayscale histogram are shown in Table 3.
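The six histogram features of Table 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the study: the function name and the uniform toy histogram are hypothetical, a zero-variance image is assigned zero skewness, and empty bins are skipped in the entropy sum.

```python
import math


def histogram_features(hist):
    """Compute the six Table 3 features from a normalized histogram H(i)."""
    pairs = list(enumerate(hist))                        # (gray level i, H(i))
    mean = sum(i * h for i, h in pairs)
    variance = sum((i - mean) ** 2 * h for i, h in pairs)
    smoothness = 1.0 / (1.0 + variance)
    sigma = math.sqrt(variance)
    third_moment = sum((i - mean) ** 3 * h for i, h in pairs)
    # Assumption: a flat (zero-variance) image is given zero skewness.
    skewness = third_moment / sigma ** 3 if sigma > 0 else 0.0
    energy = sum(h * h for _, h in pairs)
    # Empty bins are skipped; h * log2(h) tends to 0 as h tends to 0.
    entropy = -sum(h * math.log2(h) for _, h in pairs if h > 0)
    return {
        "average": mean,
        "variance": variance,
        "smoothness": smoothness,
        "skewness": skewness,
        "energy": energy,
        "entropy": entropy,
    }


# Hypothetical uniform histogram over L = 4 gray levels.
features = histogram_features([0.25, 0.25, 0.25, 0.25])
print(features)
```

For the uniform toy histogram the average is 1.5, the variance is 1.25, the skewness is 0, the energy is 0.25, and the entropy is 2 bits, which makes it a convenient sanity check.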

(4) Image texture


In this algorithm, a co-occurrence matrix is used to represent the image texture, with features including energy, entropy, correlation, and contrast.
A grayscale matrix $P = \{p_{ij};\ 0 < i \le m,\ 0 < j \le n\}$ is used to represent the image, where m is the height of the image and n is its width. Spatial grayscale co-occurrence matrices are constructed for different directions a (a = 0°, 45°, 90°, 135°) from the second-order conditional functions p(i,j,d,a), where p(i,j,d,a) represents the probability of gray level i jumping to gray level j in direction a at distance d.
In this project, d is set to 1. Based on the matrix, three features are extracted in four directions: energy E, entropy ε_s, and contrast I. The expressions and definitions of those features are shown in Table 4.

Table 4. Features from co-occurrence matrix.

Feature | Definition | Expression
Energy | Energy reflects the regularity of scattering of grayscale value | $E = \sum_i \sum_j [p(i, j, d, a)]^2$
Entropy | Entropy reflects the complexity and irregularity of scattering of grayscale value | $\varepsilon_s = -\sum_i \sum_j p(i, j, d, a) \log [p(i, j, d, a)]$
Contrast | Contrast reflects the intensity of variation in a segment of the image | $I = \sum_i \sum_j (i - j)^2\, p(i, j, d, a)$
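To make the co-occurrence features concrete, the sketch below builds p(i, j, d, a) for one direction and evaluates the three Table 4 expressions. Several details are assumptions because the text does not fix them: ordered (non-symmetric) pair counting, the row and column offsets chosen for each angle, the natural logarithm in the entropy, and the toy 2×3 image.

```python
import math

# Assumed (row, column) steps per direction, with rows increasing downward.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def cooccurrence(image, levels, angle, d=1):
    """Normalized co-occurrence matrix p(i, j, d, a) of a 2-D list of gray levels."""
    step_r, step_c = OFFSETS[angle]
    dr, dc = step_r * d, step_c * d
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for this distance and direction")
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Energy, entropy, and contrast as defined in Table 4."""
    energy = sum(v * v for row in p for v in row)
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    contrast = sum((i - j) ** 2 * v
                   for i, row in enumerate(p) for j, v in enumerate(row))
    return energy, entropy, contrast


# Hypothetical 2x3 image with two gray levels, direction 0 degrees, d = 1.
toy_image = [[0, 0, 1],
             [0, 1, 1]]
p = cooccurrence(toy_image, levels=2, angle=0)
energy, entropy, contrast = texture_features(p)
print(p, energy, entropy, contrast)
```

Repeating the two calls for 45°, 90°, and 135° gives the features in the remaining three directions.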

Artificial neural network


To establish a classifier, the output code “01” was set to represent coal, and “10” was set to represent gangue. Training on the feature data matched with these known results establishes the relationship between features and results.
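The two-bit coding can be illustrated with a small decoding helper. The text does not state how the two continuous outputs are turned into a class, so the rule below (pick the larger output) is an assumption; it is consistent with the predicted classes reported in the result tables. The function name is hypothetical.

```python
def decode_output(first, second):
    """Map a two-output prediction to a class label.

    Target "01" (first = 0, second = 1) encodes coal and "10" encodes gangue.
    Assumption: the larger of the two outputs decides the class.
    """
    return "gangue" if first > second else "coal"


print(decode_output(0.05, 0.96))  # coal
print(decode_output(1.0, 0.0))    # gangue
print(decode_output(0.55, 0.68))  # coal
```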
The neural network this paper discusses is a feed-forward neural network trained with the error backpropagation algorithm. The feed-forward neural network model [9] is expressed as follows:

$$y(x, w) = f\left( \sum_{j=1}^{M} \omega_j \phi_j(x) \right)$$

The error function [8] is expressed as follows:

$$E(w) = \sum_{n=1}^{N} E_n(w)$$

MATLAB provides a neural network toolbox that enables us to use the feed-forward BP neural network directly [10].
The design of the algorithm consists of nine features as input data (based on the co-occurrence matrix: energy, entropy, and inertia, that is, contrast; based on histogram statistics: average, variance, smoothness, skewness, energy, and entropy), a hidden layer of 15 nodes, and 2 outputs. Each output ranges from 0 to 1.
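A forward pass through a network of this shape can be sketched in plain Python. This is not the MATLAB toolbox implementation: it assumes the selected network 1–15 (nine inputs, one hidden layer of 15 nodes, two outputs), and the tanh hidden activation, the sigmoid output activation, and the random weights are illustrative assumptions. No training is performed.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 9, 15, 2


def sigmoid(z):
    """Logistic function; keeps each output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, n_out, n_in):
    """Random weights and biases for one fully connected layer (untrained)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def layer(inputs, weights, biases, activation):
    """Apply activation(w . x + b) for every node of the layer."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(features, hidden_params, output_params):
    hidden_w, hidden_b = hidden_params
    output_w, output_b = output_params
    hidden = layer(features, hidden_w, hidden_b, math.tanh)
    return layer(hidden, output_w, output_b, sigmoid)


rng = random.Random(0)                                # fixed seed for repeatability
hidden_params = init_layer(rng, N_HIDDEN, N_INPUTS)
output_params = init_layer(rng, N_OUTPUTS, N_HIDDEN)
features = [rng.random() for _ in range(N_INPUTS)]    # placeholder feature vector
outputs = forward(features, hidden_params, output_params)
print(outputs)
```

In practice the weights would come from backpropagation training on the labeled feature vectors rather than from a random generator.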

Result and discussion


Neural network selection
Although the feed-forward neural network model is well established, for this particular application we need to find the appropriate number of hidden layers and of nodes in each layer.
We train 10 networks on the data obtained from 30 pieces of coal and 30 pieces of gangue. Each of these 10 networks has either 1 or 2 hidden layers, and the number of nodes in each layer is 5, 10, 15, 20, or 25. They are named network 1–5, network 1–10, network 1–15, network 1–20, network 1–25, network 2–5, network 2–10, network 2–15, network 2–20, and network 2–25. The results are shown in Table 5.
From Table 5, the best validation MSE of network 1–5, network 1–15, network 2–5, network 2–10, network 2–15, and network 2–25 is significantly smaller than that of the rest. However, their training performance plots are different.
In those six training performance plots (Figure 3), the test MSE of network 2–10 and network 2–25 keeps increasing from the start of training and becomes much higher than the validation MSE and than the test MSE of the other networks.
Among the remaining four networks, we chose the one with the lowest validation MSE: network 1–15.
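The selection procedure described above can be written out as a short script using the validation MSE values reported in Table 5. The numeric cutoff for "significantly smaller" is an assumption introduced here (the text gives no threshold); the two networks excluded for rising test MSE are the ones named in the text.

```python
# Best validation MSE per candidate, keyed as "layers-nodes" (values from Table 5).
validation_mse = {
    "1-5": 1.53e-09, "1-10": 1.28e-01, "1-15": 1.70e-11,
    "1-20": 8.37e-02, "1-25": 3.88e-03,
    "2-5": 7.38e-11, "2-10": 5.04e-10, "2-15": 1.45e-09,
    "2-20": 2.11e-03, "2-25": 7.60e-10,
}

MSE_CUTOFF = 1e-6                      # assumed cutoff for "significantly smaller"
rising_test_mse = {"2-10", "2-25"}     # excluded because their test MSE keeps increasing

shortlist = sorted(name for name, mse in validation_mse.items() if mse < MSE_CUTOFF)
candidates = [name for name in shortlist if name not in rising_test_mse]
selected = min(candidates, key=validation_mse.get)

print(shortlist)
print(selected)  # 1-15
```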

Table 5. MSE (mean square error) of each possible solution.

Number of layers | Number of nodes in each layer | Best validation MSE
1 | 5 | 1.53E-09
1 | 10 | 1.28E-01
1 | 15 | 1.70E-11
1 | 20 | 8.37E-02
1 | 25 | 3.88E-03
2 | 5 | 7.38E-11
2 | 10 | 5.04E-10
2 | 15 | 1.45E-09
2 | 20 | 2.11E-03
2 | 25 | 7.60E-10

Figure 3. This figure shows the performance of the selected networks. The plots are the performance of
network 1-5(A), network 1-15(B), network 2-5(C), network 2-10(D), network 2-15(E), and network 2-25(F).

Table 6. Algorithm (9-arguments) identification result.

Item | Predicting result (output 1) | Predicting result (output 2) | Predicted class | A_ad/% | True or False
Coal 1 | 0.05 | 0.96 | Coal | 8.29 | TRUE
Coal 2 | 0 | 1 | Coal | 8.67 | TRUE
Coal 3 | 0 | 1 | Coal | 9.31 | TRUE
Coal 4 | 0 | 1 | Coal | 10.29 | TRUE
Coal 5 | 0 | 1 | Coal | 12.33 | TRUE
Coal 6 | 0 | 1 | Coal | 9.18 | TRUE
Coal 7 | 0 | 1 | Coal | 10.79 | TRUE
Coal 8 | 0 | 1 | Coal | 11.03 | TRUE
Coal 9 | 0 | 1 | Coal | 10.38 | TRUE
Coal 10 | 0 | 1 | Coal | 8.39 | TRUE
Gangue 1 | 1 | 0 | Gangue | 73.48 | TRUE
Gangue 2 | 1 | 0 | Gangue | 68.02 | TRUE
Gangue 3 | 1 | 0 | Gangue | 62.84 | TRUE
Gangue 4 | 1 | 0 | Gangue | 80.05 | TRUE
Gangue 5 | 1 | 0 | Gangue | 74.29 | TRUE
Gangue 6 | 1 | 0 | Gangue | 65.79 | TRUE
Gangue 7 | 1 | 0 | Gangue | 74.37 | TRUE
Gangue 8 | 1 | 0 | Gangue | 68.49 | TRUE
Gangue 9 | 1 | 0 | Gangue | 71.45 | TRUE
Gangue 10 | 0 | 1 | Coal | 56.92 | FALSE

Table 7. Confusion matrix (9-arguments algorithm).

 | Gangue (expected) | Coal (expected)
Gangue (predicted) | 9 | 0
Coal (predicted) | 1 | 10

Of the 60 sets of data, 40 are randomly selected to train the network (20 of coal and 20 of gangue), and the remaining 20 sets are reserved as testing data. The testing results are shown in Table 6.
As indicated by Tables 6 and 7, if we treat coal as the positive class (true) and gangue as the negative class (false), since the major product of a refinery is coal rather than gangue, the accuracy is 95%, the precision is 90.91%, and the recall is 100%.
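The reported figures follow directly from the confusion matrix in Table 7 when coal is the positive class. The short check below uses the standard definitions of accuracy, precision, and recall; the function and variable names are illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall


# Table 7 with coal as the positive class:
# 10 coal predicted as coal (TP), 1 gangue predicted as coal (FP),
# 0 coal predicted as gangue (FN), 9 gangue predicted as gangue (TN).
accuracy, precision, recall = binary_metrics(tp=10, fp=1, fn=0, tn=9)
print(f"{accuracy:.2%} {precision:.2%} {recall:.2%}")  # 95.00% 90.91% 100.00%
```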

Excluding insignificant arguments


As indicated in Table 8, the differences between coal and gangue in entropy (from the co-occurrence matrix), contrast, energy (from the grayscale histogram), and entropy (from the grayscale histogram) are not significant enough to be used to identify coal and gangue. Thus, those four arguments are excluded from the target arguments, leaving five.
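One way to reproduce this screening is to rank the features by the relative difference between the class means, using the outlier-free means listed in Table 8. The text does not state a numeric criterion, so the 10% threshold below is an assumption; it happens to separate the five retained features from the four excluded ones. The dictionary keys and function name are illustrative.

```python
# (mean of coal, mean of gangue) without outliers, copied from Table 8.
means_without_outliers = {
    "energy (co-occurrence)": (0.1346721, 0.1579183),
    "entropy (co-occurrence)": (2.5599156, 2.4129081),
    "contrast (co-occurrence)": (0.5319656, 0.5204203),
    "average grayscale": (54.50559, 94.007786),
    "variance": (22.560604, 19.993976),
    "smoothness": (0.0075456, 0.006703),
    "skewness": (0.2627407, 0.0834587),
    "energy (histogram)": (0.0172694, 0.0188293),
    "entropy (histogram)": (6.1931122, 6.0899834),
}


def relative_difference(coal_mean, gangue_mean):
    """Absolute difference of the class means divided by their average."""
    return abs(coal_mean - gangue_mean) / ((coal_mean + gangue_mean) / 2.0)


THRESHOLD = 0.10   # assumed cutoff; not stated in the text

kept = [name for name, (coal, gangue) in means_without_outliers.items()
        if relative_difference(coal, gangue) >= THRESHOLD]
dropped = [name for name in means_without_outliers if name not in kept]

print(kept)
print(dropped)
```

With these inputs, the retained features are energy (co-occurrence), average grayscale, variance, smoothness, and skewness.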
Afterwards, the remaining arguments and the corresponding data are used to train a new network. After going through the same selection process as for the 9-arguments network, a network with 2 hidden layers and 15 nodes in the first layer is selected since it generates the best result.
Of the 60 sets of data, 40 are randomly selected to train the network (20 of coal and 20 of gangue), and the remaining 20 sets are reserved as testing data. The testing results are shown in Table 9.
As indicated by Tables 9 and 10, if we treat coal as the positive class (true) and gangue as the negative class (false), since the major product of a refinery is coal rather than gangue, the accuracy is 90%, the precision is 83.3%, and the recall is 100%.
Table 8. Candidate arguments analysis. Energy, entropy, and contrast marked (co-occ.) are from the co-occurrence matrix; the remaining columns are from the grayscale histogram.

Item | Energy (co-occ.) | Entropy (co-occ.) | Contrast (co-occ.) | Average grayscale | Variance | Smoothness | Skewness | Energy (hist.) | Entropy (hist.)
Average of coal | 0.1346721 | 2.6059464 | 0.6942153 | 54.50559 | 22.560604 | 0.008898 | 0.3417263 | 0.0172694 | 6.1931122
Average of gangue | 0.1702504 | 2.4129081 | 0.5204203 | 94.007786 | 19.993976 | 0.006703 | 0.1440586 | 0.0196205 | 6.0899834
St. dev. (coal) | 0.0569394 | 0.5028017 | 0.3959866 | 20.049266 | 8.8399972 | 0.0070386 | 0.4403702 | 0.0052002 | 0.4832815
St. dev. (gangue) | 0.1066504 | 0.5314626 | 0.2329108 | 13.01329 | 6.327536 | 0.0038651 | 0.1761478 | 0.0077325 | 0.4819778
Avg. coal - avg. gangue | 0.0355782 | 0.1930383 | 0.173795 | 39.502195 | 2.5666285 | 0.002195 | 0.1976677 | 0.0023512 | 0.1031288
St. dev. (coal)/avg. coal | 42.28% | 19.29% | 57.04% | 36.78% | 39.18% | 79.10% | 128.87% | 30.11% | 7.80%
St. dev. (gangue)/avg. gangue | 62.64% | 22.03% | 44.75% | 13.84% | 31.65% | 57.66% | 122.28% | 39.41% | 7.91%
Difference avg./mean avg. | 11.67% | 3.85% | 14.31% | 26.60% | 6.03% | 14.07% | 40.69% | 6.37% | 0.84%
Mean without outlier coal | 0.1346721 | 2.5599156 | 0.5319656 | 54.50559 | 22.560604 | 0.0075456 | 0.2627407 | 0.0172694 | 6.1931122
Mean without outlier gangue | 0.1579183 | 2.4129081 | 0.5204203 | 94.007786 | 19.993976 | 0.006703 | 0.0834587 | 0.0188293 | 6.0899834
Difference avg. (without outliers) | 0.0232461 | 0.1470075 | 0.0115453 | 39.502195 | 2.5666285 | 0.0008425 | 0.179282 | 0.0015599 | 0.1031288
Difference avg./mean avg. (without outliers) | 15.89% | 5.91% | 2.19% | 53.20% | 12.06% | 11.83% | 103.57% | 8.64% | 1.68%

Table 9. Algorithm (5-arguments) identification result.

Item | Predicting result (output 1) | Predicting result (output 2) | Predicted class | A_ad/% | True or False
Coal 1 | 0.13 | 0.84 | Coal | 8.29 | TRUE
Coal 2 | 0 | 1 | Coal | 8.67 | TRUE
Coal 3 | 0 | 1 | Coal | 9.31 | TRUE
Coal 4 | 0 | 1 | Coal | 10.29 | TRUE
Coal 5 | 0 | 1 | Coal | 12.33 | TRUE
Coal 6 | 0 | 1 | Coal | 9.18 | TRUE
Coal 7 | 0 | 1 | Coal | 10.79 | TRUE
Coal 8 | 0 | 1 | Coal | 11.03 | TRUE
Coal 9 | 0 | 1 | Coal | 10.38 | TRUE
Coal 10 | 0 | 1 | Coal | 8.39 | TRUE
Gangue 1 | 1 | 0 | Gangue | 73.48 | TRUE
Gangue 2 | 1 | 0 | Gangue | 68.02 | TRUE
Gangue 3 | 0 | 0.99 | Coal | 62.84 | FALSE
Gangue 4 | 1 | 0 | Gangue | 80.05 | TRUE
Gangue 5 | 1 | 0 | Gangue | 74.29 | TRUE
Gangue 6 | 1 | 0 | Gangue | 65.79 | TRUE
Gangue 7 | 1 | 0 | Gangue | 74.37 | TRUE
Gangue 8 | 1 | 0 | Gangue | 68.49 | TRUE
Gangue 9 | 1 | 0 | Gangue | 71.45 | TRUE
Gangue 10 | 0.55 | 0.68 | Coal | 56.92 | FALSE

Table 10. Confusion matrix (5-arguments algorithm).

 | Gangue (expected) | Coal (expected)
Gangue (predicted) | 8 | 0
Coal (predicted) | 2 | 10

Comparison and analysis of the two algorithms


As the two previous parts indicate, the accuracy is 95% for the 9-arguments algorithm and 90% for the 5-arguments algorithm. However, the 5-arguments algorithm takes less time because it has fewer parameters to calculate.
The 9-arguments algorithm still has its own advantages, namely its higher identification accuracy and precision. More importantly, the one data point wrongly identified by the 9-arguments algorithm (gangue 10) is also wrongly identified by the 5-arguments algorithm, whereas gangue 3, which the 5-arguments algorithm misidentifies, is identified correctly by the 9-arguments algorithm. In other words, the 9-arguments algorithm can eliminate some of the corner cases that the 5-arguments algorithm cannot.
In addition, both algorithms mistakenly classified gangue 10 as coal. Gangue 10 has the lowest ash content among all the gangue samples. The low ash content indicates a large degree of similarity to coal in terms of both surface and chemical properties. Thus, the resemblance of gangue 10 to the coal samples is the cause of the wrong prediction.

Conclusion
The training results in this series of experiments show promising application potential. For the 9-arguments algorithm, the validation mean square error reaches as low as 1.6994e-11 at epoch 23. For the 5-arguments algorithm, the validation mean square error reaches as low as 5.6051e-10, as indicated by the data generated. The accuracy of the two algorithms reaches 95% and 90%, respectively. The high accuracy shows that the algorithm is ready for industrial-level application.
In addition, compared to the 5-arguments algorithm, the 9-arguments algorithm can solve some of the corner cases that the 5-arguments algorithm cannot, because it calculates more parameters; the cost is more time spent in the process of identification.

ORCID
Wei Hou http://orcid.org/0000-0001-8023-6395

References
[1] Wu, B., and M. Yang. 2012. Analysis on Technical and Economic Policy of Comprehensive
Utilization of Coal Ash and Gangue in China. Energy of China 11: 8–11.
[2] Reddy, K. Guru Raghavendra, and D. P. Tripathy. 2013. Separation of gangue from coal based on histogram thresholding. International Journal of Technology Enhancements and Emerging Engineering Research 1(4): 31–34.
[3] Ma, X., and Y. Jiang. 2003. Application of Digital Image Processing Technology in Automatic Separation System of Coal and Gangue. Computer Technology and Automation 22: 178–180.
[4] Ma, X., and J. Zang. 2008. Coal gangue image process approaches with wavelet analysis.
Congress on image and signal processing, 352–356. Sanya, Hainan, China: IEEE Computer
Society.
[5] Liang, H., H. Cheng, T. Ma, Z. Pang, and Y. Zhong. 2010. Identification of coal and gangue
by self-organizing competitive neural network and SVM. International Conference on
Intelligent Human-Machine Systems and Cybernetics, 41–45. Nanjing, Jiangsu, China: IEEE.
[6] Zhang, Z., J. Yang, and Y. Wang. 2011. Image recognition of coal pieces based on MATLAB
and forecasting of density and production. In Coal preparation technology 1: 53–55.
[7] Gao, K., C. Du, H. Wang, and S. Zhang. 2013. An efficient of coal and gangue recognition
algorithm. International Journal of Signal Processing Image Processing & Pattern Recognition
6: 345–354.
[8] Li, W., Y. Wang, B. Fu, and Y. Lin. 2010. Coal and coal gangue separation based on computer
vision. Fifth international conference on Frontier of Computer Science and Technology, IEEE
Computer Society, Changchun, Jilin, China, 467–472.
[9] Bishop, Christopher M. 2006. Pattern recognition and machine learning. New York: Springer
Science+Business Media, LLC, 227–245.
[10] Gonzalez, Rafael C., Richard E. Woods, Steven L. Eddins. 2013. Digital image processing using
MATLAB. Beijing: Publishing House of Electronics Industry.
