Reflection On The Use of Self-Organizing Map For Text Clustering in Engineering Change Process Analysis: A Case Study

REFLECTION
On the Use of Self-Organizing Map for Text Clustering in Engineering Change Process Analysis: A Case Study
This article affected me because it shows that analyzing engineering change requests (ECRs) requires a deep
knowledge of the different kinds of ECRs managed during the engineering change process, and that this work
can be difficult and time-consuming. The use of SOM text clustering in engineering change process analysis
appears to be a promising direction for further research.
This article catches the reader's attention by showing some limitations of applying SOM text clustering and
classification. The first limitation is linked to texts written in natural language. The terms contained in
different texts may be similar even if the engineering change requests concern different products, and this
similarity of terms may affect the performance of SOM-based clustering. The second limitation is linked to the
use of SOM as a classification method: classification requires a labelled training dataset.
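To make the first limitation concrete for myself, I wrote the minimal Python sketch below. It is not taken from the article: the three ECR texts are invented, and the bag-of-words vectors with cosine similarity are only an assumed stand-in for whatever text representation the study actually used. It shows how two requests about different products can look almost identical when only their terms are compared.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented ECR texts (hypothetical, for illustration only).
ecr_pump = "replace gasket material to stop leakage in pump housing"
ecr_valve = "replace gasket material to stop leakage in valve housing"
ecr_label = "update warning label text on control panel"

# Different products, but eight of the nine terms are shared.
sim_same_words = cosine_similarity(term_vector(ecr_pump), term_vector(ecr_valve))
# No shared terms at all.
sim_different = cosine_similarity(term_vector(ecr_pump), term_vector(ecr_label))

print(round(sim_same_words, 3), sim_different)
```

The pump and valve requests differ in a single word, so a term-based clustering would probably place them in the same cluster even though they concern different products.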
This article made me realize that SOM text clustering can be an effective tool for improving engineering
change process analysis. In particular, the study highlights the following advantages:
(1) Text mining methods make it possible to analyze unstructured data and derive high-quality information.
The main difficulty in ECR analysis lies in analyzing texts written in natural language.
(2) Clustering analysis of past ECRs stored in the company automatically groups ECRs on the basis of
similarity between documents. When a new change is triggered, the company can quickly focus on
the cluster of interest. Clustering can help the company find out whether a similar change was already
managed in the past, analyze the best solution adopted, and learn to avoid the mistakes made
in the past.
(3) Using SOM for ECR text clustering automatically organizes large document collections. Compared
with other clustering algorithms, the main advantage of SOM text clustering is that the similarity of
the texts is preserved in the spatial organization of the neurons. The distance among prototypes in the
SOM map can therefore be considered an estimate of the similarity between documents belonging to
different clusters.
In addition, a SOM can first be computed using a representative subset of old input data. New input can then be
mapped straight onto the most similar model without recomputing the whole mapping. The study concerned the
post-change stage of the engineering change process, in which past engineering change data are analyzed to
discover information that can be exploited in new engineering changes.
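As a note to myself on how a new ECR could be mapped onto an existing map, here is a minimal Python sketch. It rests on assumptions that do not come from the article: the 2x2 grid, the prototype values and the three document vectors are all hypothetical, and Euclidean distance is assumed as the similarity measure. It finds the most similar prototype (the best-matching unit) for a new vector without retraining, and uses the grid distance between units as a rough estimate of how similar two documents are.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def best_matching_unit(prototypes, vector):
    """Grid coordinate of the prototype closest to `vector`."""
    return min(prototypes, key=lambda pos: euclidean(prototypes[pos], vector))


# Hypothetical prototypes of an already trained 2x2 map, keyed by (row, col).
prototypes = {
    (0, 0): [1.0, 0.0, 0.0],
    (0, 1): [0.7, 0.3, 0.0],
    (1, 0): [0.0, 0.6, 0.4],
    (1, 1): [0.0, 0.0, 1.0],
}

# Invented document vectors (for example, normalized term weights).
new_ecr = [0.9, 0.1, 0.0]
old_ecr = [0.6, 0.4, 0.0]
far_ecr = [0.1, 0.0, 0.9]

bmu_new = best_matching_unit(prototypes, new_ecr)
bmu_old = best_matching_unit(prototypes, old_ecr)
bmu_far = best_matching_unit(prototypes, far_ecr)

# Distance on the grid between the units the documents fall on.
near = euclidean(bmu_new, bmu_old)
far = euclidean(bmu_new, bmu_far)
print(bmu_new, bmu_old, bmu_far, near, far)
```

The new ECR lands next to the old one on the grid and far from the third, which is the sense in which the map's spatial organization reflects document similarity.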
This article was very detailed, and I have no comments or questions about it.
REFLECTION
Multistrategy Self-Organizing Map Learning for Classification Problems
This article affected me through its discussion of the classification process, in which large classes of
objects are normally separated into smaller classes. This approach can be very complicated because of the
difficulty of identifying the criteria, especially for procedures involving complex data structures. In this
scenario, Machine Learning (ML) techniques have been introduced by many researchers as alternative solutions.
Among the ML methods and tools, Artificial Neural Network (ANN), Fuzzy Set, Genetic Algorithm (GA), Swarm
Intelligence (SI), and Rough Set are commonly used by researchers.
This article catches the reader's attention with its multistrategy learning approach, proposing the Enhanced
Self-Organizing Map with Particle Swarm Optimization (ESOMPSO) for classification problems. The proposed
method was successfully implemented on machine learning datasets: XOR, Cancer, Glass, Pen digits, and Iris.
The analysis was done by comparing the results for each dataset produced by the Self-Organizing Map (SOM),
Enhanced Self-Organizing Map (ESOM), Self-Organizing Map with Particle Swarm Optimization (SOMPSO)
and ESOMPSO with different distance measurements.
This article made me realize that the experiments were conducted to investigate the performance of the
proposed methods. Comparisons are made among ESOMPSO, SOM with PSO (SOMPSO), and Enhanced SOM (ESOM).
The results are validated in terms of classification accuracy and quantisation error (QE) on standard
machine learning datasets: Iris, XOR, Cancer, Glass, and Pen digits. The experiments show that the
proposed methods, ESOMPSO and SOMPSO, give better accuracy despite a higher convergence time.
The proposed method has been tested on various standard datasets, with substantial comparisons
against the existing SOM network and various distance measurements. The results show that the proposed
method is promising, with better average accuracy and quantisation error than the other methods, as
well as a convincing significance test.
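To check my own understanding of the two validation measures, I wrote the short Python sketch below. It is an assumption-laden illustration, not the authors' code: quantisation error is taken here in its usual sense (the mean distance from each sample to its nearest prototype), the prototypes, samples and labels are invented, and Manhattan distance is included only as an example of swapping the distance measurement, since I do not know exactly which measures the paper compared.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def manhattan(u, v):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))


def quantisation_error(prototypes, samples, distance=euclidean):
    """Mean distance from each sample to its nearest prototype."""
    total = sum(min(distance(p, s) for p in prototypes) for s in samples)
    return total / len(samples)


def classification_accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Invented toy data (hypothetical, not from any of the benchmark datasets).
prototypes = [[0.0, 0.0], [1.0, 1.0]]
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

qe_euclidean = quantisation_error(prototypes, samples)
qe_manhattan = quantisation_error(prototypes, samples, distance=manhattan)
accuracy = classification_accuracy(["a", "b", "b", "a"], ["a", "b", "a", "a"])
print(qe_euclidean, qe_manhattan, accuracy)
```

A lower quantisation error means the prototypes sit closer to the data, while accuracy measures how often the label assigned through the map agrees with the true class.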
Application of Self-Organizing Maps on Time Series Data for Identifying Interpretable Driving Manoeuvres
This article affected me by pointing out that, although it is quite common to classify driving styles as
Aggressive, Normal, Gentle, and so on, city buses face more restrictions because of fixed schedules and traffic.
Aggressiveness and defensiveness cannot be easily computed, and a driver can exhibit both depending on the
context. For instance, a driver can be aggressive while entering a bus stop and defensive while leaving. The
proposed method does not label the driving style as aggressive or defensive; instead, it differentiates
driving manoeuvres.
This article catches the reader's attention by noting that understanding how a product is used is essential
for any manufacturer, in particular for further development. The driver's driving style is a significant factor
in the usage of a city bus. This work proposes a new method to observe various driving manoeuvres in regular
operations and identify the patterns in these manoeuvres. The significant advance of this method over other
engineering approaches is the use of uncompressed data instead of transformations into certain performance
indicators. Here, the time series inputs were preserved and prepared as 10-second frames using a sliding window
technique and fed into Kohonen's Self-Organizing Map (SOM) algorithm. This produced high accuracy in the
identification and classification of manoeuvres and, at the same time, a highly interpretable solution that can
be readily used for suggesting improvements.
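To picture how the 10-second frames might be prepared, I wrote this minimal Python sketch of a sliding window. The sampling rate of one reading per second, the 5-sample step between frames and the speed values are my own assumptions; the article only states that 10-second frames were produced with a sliding window.

```python
def sliding_windows(series, window_size, step):
    """Return overlapping frames of `window_size` samples, advancing by `step`."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(series) - window_size
    return [series[i:i + window_size] for i in range(0, last_start + 1, step)]


SAMPLE_RATE_HZ = 1   # assumption: one reading per second
FRAME_SECONDS = 10   # frame length stated in the article
STEP_SAMPLES = 5     # assumption: frames overlap by half

# Invented speed signal, 25 seconds long (hypothetical values).
speed = list(range(25))

frames = sliding_windows(speed, FRAME_SECONDS * SAMPLE_RATE_HZ, STEP_SAMPLES)
print(len(frames), frames[0], frames[-1])
```

Each frame keeps the raw readings in order, so it can be passed to the SOM as one input vector without first being reduced to a summary indicator.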
This article made me realize that driving manoeuvres provide essential insights for automotive
manufacturers who want to understand how their vehicles are used and to improve their design. They are also
useful for individual drivers and fleet owners who want to understand their vehicle usage in order to improve
their operation and service.
However, other methods are supervised and require labelled data to predict certain specific
manoeuvres, such as exiting a roundabout or stopping at a traffic light. The proposed method is an
unsupervised approach that requires no prior knowledge about the manoeuvres performed. This approach
enables the learning of a wider spectrum of manoeuvres that are not captured by supervised approaches. The
use of the Self-Organizing Map algorithm improves the interpretability of the results.
The ubiquitous self-organizing map for non-stationary data streams.
This article affected me because, at present, all kinds of stream data processing based on instantaneous data
have become critical issues for the Internet, the Internet of Things (ubiquitous computing), social networking
and other technologies. The massive amounts of data being generated in all these environments create the need
for algorithms that can extract knowledge readily.
This article catches the reader's attention by observing that data streams, as opposed to simple datasets, are
now generated naturally within several applications. Such applications include network monitoring, web mining,
sensor networks, telecommunications, and financial applications. All have vast amounts of data arriving
continuously. Being able to produce clustering models in real time is of great importance within these
applications. Hence, learning from streams is not only required in ubiquitous environments but is also
relevant to other current hot topics, namely Big Data. The rationale behind the requirement of learning from
streams is that the amount of information being generated is too big to be stored in devices, where traditional
mining techniques could be applied. Data streams arrive continuously and are potentially unbounded. Therefore,
it is impossible to keep the entire stream in memory.
This article made me realize that the paper presents an improved version of the ubiquitous
self-organizing map (UbiSOM), a variant tailored for real-time exploratory analysis over data streams. According
to the authors' literature review and experiments, it is the first SOM algorithm capable of learning stationary
and non-stationary distributions while maintaining the original SOM properties. It introduces a novel average
neuron utility assessment metric in addition to the previously used average quantization error. Both are used in
a drift function that measures the performance of the map over non-stationary data and allows the learning
parameters to be estimated accordingly. The experiments show that this is a reliable method to achieve the
proposed goal and that the assessment metrics are fairly robust. The UbiSOM outperforms current SOM algorithms
on stationary and non-stationary data streams.
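To see how two assessment metrics could feed a single drift value, I sketched the idea in Python. This is only my assumed form, a weighted combination in which a high average quantization error and a low average neuron utility both raise the drift; the weight `beta`, the assumption that both metrics are normalized to the range 0 to 1, and the example values are hypothetical, and the exact formula should be checked against the paper.

```python
def drift(avg_quantization_error, avg_neuron_utility, beta=0.7):
    """Assumed drift value: weighted mix of error and (1 - utility).

    Both inputs are assumed to be normalized to [0, 1]. `beta` weights the
    quantization error against the neuron utility (illustrative default).
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    return beta * avg_quantization_error + (1.0 - beta) * (1.0 - avg_neuron_utility)


# Invented readings: a stable stream versus one whose distribution is changing.
stable = drift(avg_quantization_error=0.1, avg_neuron_utility=0.9)
changing = drift(avg_quantization_error=0.5, avg_neuron_utility=0.4)
print(stable, changing)
```

Under this assumed form, a larger drift value would signal that the map no longer fits the incoming data well, so the learning parameters could be raised to let the map adapt faster.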
Self-organizing maps as an approach to exploring spatiotemporal diffusion patterns

This article affected me by showing that the search for synchrony is not unique to epidemiology but originates
in innovation diffusion and ecology, and that it occurs in many other disciplines. Hence, multiple methods exist
to quantify and map synchrony. Among these methods, wavelets are frequently used, as they also make it possible
to study non-stationarity (trends) in time series. Wavelets analyse disease diffusion in the frequency domain,
where synchrony can be identified via the coherence in the phase of the number of disease cases at each
geographic location.
This article catches the reader's attention by presenting a SOM-based method to analyse the spatiotemporal
diffusion of infectious diseases. The method is based on training a SOM on a larger time series (including
multiple waves) and mapping individual outbreaks back onto it for characterisation and comparison. Through a
number of experiments, the authors show how this method can be applied to find synchronies between spatial
locations and to compare the spatiotemporal diffusion patterns of different epidemics. They also demonstrate how
different types of data organisation (in space and time) can help to reveal different information. Several types
of secondary clustering (hierarchical, enhanced U-matrix and component planes) are shown that can be used to
improve the SOM's performance. The integration of SOMs with other visualisation techniques, especially
Sammon's Projection and GIS, was used to detect, interpret and visualise spatiotemporal patterns. The results of
the method are consistent with diffusion patterns found using other methods, which makes SOMs an interesting
alternative that is worth exploring further, for instance by applying the method to a larger dataset in a more
dynamic geographic environment, by coupling it to a spatially explicit disease model, or by using it for
near-real-time disease monitoring.
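To understand what mapping an individual outbreak back onto a trained map could look like, I wrote the minimal Python sketch below. Everything in it is a hypothetical illustration: the four prototypes stand in for a map already trained on the whole time series, each vector holds invented case shares for two districts, and Euclidean distance is assumed. It assigns every time step of one wave to its most similar unit and keeps the sequence of units visited as a trajectory.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def bmu_index(prototypes, vector):
    """Index of the prototype closest to `vector`."""
    return min(range(len(prototypes)), key=lambda i: euclidean(prototypes[i], vector))


def trajectory(prototypes, wave):
    """Units visited by a wave over time, with consecutive repeats merged."""
    path = []
    for vector in wave:
        unit = bmu_index(prototypes, vector)
        if not path or path[-1] != unit:
            path.append(unit)
    return path


# Hypothetical prototypes; each vector is (share in district A, share in district B).
prototypes = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

# Invented wave: cases appear in district A first, then B, then fade out.
wave = [[0.1, 0.0], [0.8, 0.1], [0.9, 0.2], [0.9, 0.8], [0.2, 0.9], [0.1, 0.1]]

path = trajectory(prototypes, wave)
print(path)
```

Two waves that trace similar paths over the map would then count as having similar diffusion patterns, which is how I understand the comparison between epidemics.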
This article made me realize that SOMs (combined with Sammon's Projection and GIS) are applicable to
spatiotemporal diffusion analyses. The study shows how to visualise diffusion patterns in order to identify
(dis)similarity between individual waves, and between individual waves and an overall time series, by performing
an integrated analysis of synchrony and diffusion trajectories. Both stable and incidental synchronisation
between medical districts were identified, as well as two distinct groups of epidemic waves: a uniformly
structured, fast-developing group and a multiform, slow-developing group. Diffusion trajectories for the
fast-developing group indicate a typical diffusion pattern from Reykjavik to the northern and
eastern parts of the island. For the other group, diffusion trajectories are heterogeneous, deviating from the
Reykjavik pattern.
