
7th IFAC Conference on Manufacturing Modelling, Management, and Control
International Federation of Automatic Control
June 19-21, 2013. Saint Petersburg, Russia

A Semantic Mapping Approach to Retrieve Manufacturing Information Resources:
STMicroelectronics' Case Study
S. Bouzid*,**. C. Cauvet*. C. Frydman*
J. Pinaton**

*Laboratory for Systems and Information Sciences (LSIS), University of Aix-Marseille, France
(e-mail: sara.bouzid@lsis.org; corine.cauvet@lsis.org; claudia.frydman@lsis.org).
**STMicroelectronics, Rousset area, France
(e-mail: sara.bouzid@st.com; jacques.pinaton@st.com)

Abstract: Controlling a manufacturing process in a company requires that company engineers use the
same manufacturing indicators. However, the generalization of Commercial Off-The-Shelf
systems in manufacturing companies to get such indicators has rapidly increased the
quantity of manufacturing information resources, where a resource represents an indicator in a specific
format. As a consequence, retrieving such resources has become a real challenge for the engineers
because these resources usually lack a semantic description. We propose in this paper a semantic
mapping approach to associate business semantics with the resources during their search. The core
contents of the approach are presented in general and an implementation example is given.
Keywords: Resource Retrieval, Mapping, Semantic Approach, Manufacturing Indicator

1. INTRODUCTION

With the growing need to monitor manufacturing processes in industrial companies, sharing and
using the same manufacturing information, such as control indicators, among company engineers is
necessary to carry out this task efficiently. The use of off-the-shelf distributed systems has
become common in many industries because of their ability to provide flexible management and
maintenance of manufacturing data. However, the extensive use of these systems by end users
often leads to an overflow of manufacturing indicators. In fact, such systems provide a wide range
of functionalities (data extraction, analysis, reporting, etc.), which often overlap. Company
engineers can thus obtain several calculations of indicators in several formats. This increases
the number of information resources, because each indicator in a specific format is a resource.
As a consequence, to use a manufacturing indicator, an engineer must spend a lot of time searching
in resource repositories, sometimes without effective results, even when using a search engine.
The main reason behind this difficulty is that these resources usually lack descriptive metadata
that give them meaning. This meaningful description is known as semantics. We try within
STMicroelectronics to improve the retrieval of manufacturing information (MI) resources using a
semantic mapping approach. This approach relies on a manufacturing process ontology and a
process-control dictionary to find correspondences between an indicator stored in a resource
repository and its relationship with the control of the manufacturing process.

The outline of this paper is as follows. The second section introduces the problem description.
The third section presents the related works. The fourth section describes the semantic mapping
approach. Finally, the last section presents an implementation example with the validation
technique used in the STMicroelectronics Company.

2. PROBLEM DESCRIPTION

Controlling manufacturing processes with a set of indicators enables industrial companies to
ensure continuous control of their activities. Process control relies on a set of standard tools
and methods established in manufacturing companies to guarantee the use of reliable and efficient
indicators by company engineers. The complexity of the information system in such companies leads
them to use Commercial Off-The-Shelf (COTS)¹ distributed systems for engineering data management
and analysis, so as to facilitate the engineers' access to manufacturing data. Moreover, the
modular and distributed aspect of these systems allows easy adaptation to changes and reduces
long-term maintenance costs. However, their extensive use to produce manufacturing information
and indicators can lead to an MI-resource overflow.

Indeed, the study of the information system within STMicroelectronics points out this problem of
information-resource overflow. STMicroelectronics is a manufacturing company which uses COTS
systems for the control of its activity (production of electronic chips). These systems have
diverse components which offer several functionalities that can overlap. Therefore, an engineer
can get and process data in different ways, entailing the increase of the MI resources

¹ COTS are specialized software, designed for specific applications such as medical billing,
chemical analysis, statistical analysis, etc. They can be used with little or no modification.

978-3-902823-35-9/2013 © IFAC 2069 10.3182/20130619-3-RU-3018.00201


2013 IFAC MIM
June 19-21, 2013. Saint Petersburg, Russia

in the company. We found within STMicroelectronics more than 20 kinds of data management,
analysis and reporting systems. Examples of used systems are Business Object for reporting,
Kla ace xp for data analysis and statistical calculation, Apf for real time data extraction and
reporting and so on (Fig. 1).

Fig. 1. Examples of COTS systems within STMicroelectronics

There is an average of 200 to 400 resources per system. As a consequence, it is almost
impossible for an engineer to retrieve an indicator among the huge quantity stored in resource
repositories. In fact, because these resources essentially contain figures (e.g., flow charts,
histograms, etc.) and numerical data, they usually lack semantics. Commercial data management
systems were generally designed to enable easy access to and processing of data, with fewer
considerations about how to share and retrieve the produced resources. Hence, using a document
management system or a search engine will not improve the retrieval of such resources as long as
they still lack semantics. We are trying to address this lack through a semantic approach.

3. RELATED WORKS

The issue of semantic retrieval of resources in manufacturing companies has become an emerging
field of study. The first semantic-based techniques were applied to web search. With the
emergence of the ontology paradigm, semantic techniques have attracted interest in other fields
such as software-component retrieval (Y. Peng et al. 2009; Alnusair & Zhao 2010), document
retrieval (Vdovjak & Houben 2001) and knowledge management systems (Cubranic et al. 2003). Few
research works tackle MI-resource retrieval because this kind of resource is specific to
industries and needs specific semantics. There is a recent work by Li & Qiao (2012) about
manufacturing-information retrieval. The authors proposed an ontology for manufacturing
information and showed its usefulness using the web engine of their data management system. There
are other similar works focusing on the retrieval of engineering documents. McMahon et al. (2004)
developed an integrated retrieval system where engineering documents can be annotated using
pre-identified concepts and retrieved using a faceted-classification mechanism. Finally, Yao et
al. (2009) proposed an integrated environment for engineering-information retrieval using an
engineering ontology. This environment realizes the mapping between user queries and engineering
documents.

Most of the research works that use a semantic retrieval process emphasize the use of one or
many business ontologies. Such ontologies give a description of a domain as a whole, without
differentiating between business semantics and the linguistic aspect. We propose in our approach
to separate the business description of a domain from the definition of its vocabulary. In
addition, a specific combined string similarity technique is used in the mapping approach to
broaden the scope of matching of concepts.

4. THE SEMANTIC MAPPING APPROACH

4.1 Approach overview

The proposal tries to improve MI-resource retrieval using a semantic description of the
resources. To that aim, we developed an approach to automatically identify business descriptions
for MI resources. Because these resources aim at controlling a manufacturing process, our goal is
to determine their relevance to the process control. A process-control dictionary referencing the
main terminologies of the process-control domain (standard methods, standard indicators, etc.) is
used. A manufacturing process ontology related to the business activity of the company is also
used. The mapping process is then based on a semantic mapping algorithm which uses string
similarity and statistical similarity techniques. These techniques make it possible to identify,
as accurately as possible, a business description for a given resource (Fig. 2).

The main purpose of such a mapping technique is to enable company engineers to easily access MI
resources and to use them for their business needs. They can explore and retrieve a resource
according to its description and regardless of its location in the company network. This approach
can also be used to reference the resources with their description, like a resource annotation
system.

Fig. 2. The semantic mapping approach

The proposed approach can be summarized in four steps.

The first step is to identify the business description that will bring the resources closer to
the manufacturing objectives of the company. A meta-model for MI-resource description is provided
for this purpose. This meta-model is also used to prepare an expert model, which is a manual
description of a set of selected resources. We thereby validate the resulting semantic
description.


The second step of the approach focuses on gathering the business concepts, terminologies,
relationships between concepts and all the vocabulary that will help in building the semantic
description of the resources. A process-control dictionary and a manufacturing process ontology
are proposed. The process-control dictionary provides the vocabulary of concepts that makes the
link between the raw description of the resources and the targeted business semantics. Note that
this dictionary is related to the process-control domain because the indicators involved in this
approach are used for manufacturing process control in general. The manufacturing process
ontology is a global business description of the manufacturing activity. It provides the link
between the process-control methods used in the company and the manufacturing process
description.

The third step in our approach is the mapping process. It consists in mapping the basic raw
description that a resource can have (for example, its name, its location) with the concepts of
the dictionary and the ontology. After several processing steps, the mapping system returns an
estimated description of the manufacturing resources.

The final step in the proposed approach consists in validating the resulting description using
the experts' description.

An overview of each step is presented in the next sub-sections.

4.2 The resource description meta-model

Fig. 3 depicts the meta-model that we defined within STMicroelectronics for the MI-resource
description.

Fig. 3. The meta-model for MI-resource description

A resource can have two types of descriptive concepts: business ones and operational ones. The
manufacturing objective is a kind of business description. The control indicator and the control
domain are kinds of operational descriptions.

The indicator represents the main subject of a resource. It is mainly composed of data objects
which can be related to the manufacturing process or to the process-control domain or both. For
example, if we take the indicator "control chart on lot", the control chart is a concept in the
process-control domain and the lot is a manufacturing concept. This indicator can be provided
with a temporal dimension, such as daily, weekly, etc. This dimension represents the time period
from which manufacturing data were collected. Finally, an indicator has an output form which
refers to its visual representation. For example, a trend or a control chart are kinds of output
forms.

The control domain can be a method or an objective of the process-control domain in general.
Examples of process-control methods implemented within STMicroelectronics are SPC, FDC, Run to
Run, Sampling, etc. The main standard process-control methods are based on statistical
techniques. The control indicators are defined and calculated within the company by using these
methods for a specific manufacturing objective. Thus, the manufacturing objective represents the
business purpose addressed by the MI resources.

4.3 The process-control dictionary

The proposed dictionary gives the required terminologies of the process-control domain.
Linguistic dictionaries, such as the well-known WordNet², are too generic for process-control
concepts. For example, a generic dictionary can give the definition of statistics, of a process
and of control, but it cannot give the meaning of statistical process control (SPC), whereas SPC
is a standard process-control method. Our dictionary aims at defining these specific
terminologies. It is based on the following meta-model (Fig. 4), which uses some generic concepts
of linguistic dictionaries (i.e. synonym, hypernym, hyponym).

Fig. 4. The meta-model of the process-control dictionary

The process-control dictionary is composed of a list of concepts representing the entries of the
dictionary. Each concept may have a synonym or a variant. Note that a variant represents another
form of a concept, like an abbreviation. For example, the out-of-control concept is also known as
OOC.

A concept has some relation types. Examples of these relations are hypernym and hyponym. The
hypernym is a concept that encompasses another concept, that is, a kind of generalization. The
hyponym is a kind of specialization. If we take again the concept out of control, its hypernym is
SPC.

² http://wordnet.princeton.edu/
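As an illustration only, the dictionary entries described so far (concepts with synonyms, variants, hypernyms and hyponyms) could be represented as follows. This is a minimal Python sketch under our own assumptions, not the implementation used in the paper; the class names, field names and the normalization rule are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Concept:
    """One dictionary entry (field names are illustrative, not the authors' schema)."""
    name: str
    synonyms: Set[str] = field(default_factory=set)
    variants: Set[str] = field(default_factory=set)   # other forms, e.g. abbreviations
    hypernyms: Set[str] = field(default_factory=set)  # more general concepts
    hyponyms: Set[str] = field(default_factory=set)   # more specific concepts


class ProcessControlDictionary:
    """Hypothetical in-memory dictionary indexed by every known surface form."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}
        self._index: Dict[str, str] = {}  # normalized surface form -> canonical key

    @staticmethod
    def _norm(term: str) -> str:
        # Assumed normalization: lowercase, hyphens treated as spaces.
        return " ".join(term.lower().replace("-", " ").split())

    def add(self, concept: Concept) -> None:
        key = self._norm(concept.name)
        self._concepts[key] = concept
        for form in {concept.name, *concept.synonyms, *concept.variants}:
            self._index[self._norm(form)] = key

    def lookup(self, term: str) -> Optional[Concept]:
        key = self._index.get(self._norm(term))
        return self._concepts[key] if key is not None else None

    def link(self, general: str, specific: str) -> None:
        """Record that `general` is a hypernym of `specific`, and the inverse."""
        g, s = self.lookup(general), self.lookup(specific)
        if g is None or s is None:
            raise KeyError("both concepts must already be in the dictionary")
        g.hyponyms.add(s.name)
        s.hypernyms.add(g.name)


d = ProcessControlDictionary()
d.add(Concept("statistical process control", variants={"SPC"}))
d.add(Concept("out of control", variants={"OOC"}))
d.link("SPC", "OOC")
print(d.lookup("ooc").hypernyms)
```

Indexing every synonym and variant under one canonical entry means that a token such as "ooc" found in a resource name resolves to the same concept as "out of control".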


Finally, each concept in the dictionary is defined with a set of key concepts. They serve to
provide the scope of definition of the concept involved. For example, the concept out of control
is, in fact, a process-control indicator. Limit, target and control chart are examples of key
concepts that define its scope.

4.4 The manufacturing process ontology

An ontology is "an explicit specification of a conceptualization" (Gruber 1993). It represents a
description of a domain of knowledge or discourse. Ontologies have been created to improve the
communication between humans, between humans and computers and between computers by offering a
unique and standardized vocabulary. For the purpose of semantic mapping, we developed a
manufacturing process ontology for STMicroelectronics to get a standard description of its
business process. This ontology relies on four views (Fig. 5):

- Organization view: references the business activities of the company
- Function view: describes the manufacturing objectives of the company for each business activity
- Control view: describes the process-control objectives and methods as implemented within the
  company
- Data view: gathers the data types involved in the manufacturing process

Fig. 5. The upper concepts of the manufacturing process ontology

These views refer in fact to the descriptive levels of the ARIS architecture (Ferdian 2001). They
were chosen as a starting point for the domain description because the ARIS approach seeks to
reduce the complexity of modeling business processes using those levels, and because the ontology
has been built with a top-down strategy.

The manufacturing process ontology and the process-control dictionary are linked through the
process-control methods. In fact, the process-control dictionary gives the link between a raw
description of an indicator in a resource and the process-control domain (including
process-control methods, objectives, concepts correlated with the domain and so on), whereas the
manufacturing process ontology gives the link between a process-control concept and how it is
used in the manufacturing activity. In the latter case, the semantic relations between the data
view, the control view and the function view are decisive in building this link.

4.5 The mapping process

The mapping approach seeks to automatically find a semantic description, referenced in the
ontology, for an MI resource. Note that by the term mapping we mean here finding associations or
correspondences between at least two concepts. Mapping approaches are widely used in
ontology-based works when an ontology needs to be created by mapping two ontology structures
(Pirrò & Talia 2007; Ehrig & Staab 2004). The mapping notion is also used in information
retrieval on the web, in document retrieval and in component retrieval systems (Vdovjak & Houben
2001). In these systems, the user query is generally mapped with the available data and
information. String similarity techniques are used to map information with the vocabulary
associated with the processed data and knowledge (Pellegrino & Corno 2006; S. Li & Qiao 2012).
Business ontologies and linguistic dictionaries are also used in this context to enhance the
matching of user queries. In our approach, we combine semantic analysis with string similarity
techniques. The following mapping techniques are applied:

String similarity measuring: it consists in the syntactic matching of concepts. There is a set of
well-known string similarity functions in the literature, including edit-distance, Jaro-Winkler,
Jaccard similarity, TF-IDF, and so forth (F. Lin 2007). We use the edit-distance function, also
known as Levenshtein distance. In the mapping process, this function calculates the distance
between each concept of the resource name and the concepts of the dictionary. The best
edit-distance under a given threshold between two concepts is selected. After several
experiments we set the threshold to 0.7. This distance is combined with another string
similarity function called the Dice coefficient (D. Lin 1998). This one returns an average
similarity coefficient between two sets of concepts (instead of two concepts).

Reasoning-based technique: this technique relies on searching the target description using the
semantic relations between the concepts that are defined in the manufacturing process ontology.

Our approach focuses on identifying the generic concepts that describe a resource in the proposed
meta-model (Fig. 3), which are the key indicator, the control domain and the manufacturing
objective. For example, taking the resource name "ooc analysis by equipment", by mapping these
concepts with our approach, the system must obtain the following results:

Indicator: <ooc>
Control domain: <SPC>
Manufacturing objective: <Lot control>

Fig. 6 summarizes the mapping process. The resource name represents the starting point in the
approach. We first tokenize the resource name and eliminate the stop words. Tokenizing means
transforming a string into simple tokens. For example:

<ooc-analysis> : <ooc> <analysis>

Example of eliminating stop words:

<ooc analysis by equipment> : <ooc> <analysis> <equipment>
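The tokenizing and stop-word examples above, together with the two string similarity measures described in this sub-section, can be sketched as follows. This is a hedged Python illustration and not the implementation used in the paper: the stop-word list, the normalization of the edit distance to a [0, 1] similarity, the reading of the 0.7 threshold as a minimum similarity, and the use of the standard set-based Dice coefficient are all assumptions, since the paper does not detail them.

```python
import re

# Illustrative stop-word list (assumption; the actual list is not given in the paper).
STOP_WORDS = {"by", "of", "on", "the", "for", "and", "in"}


def tokenize(name):
    """Split a resource name into lowercase tokens on non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9a-z]+", name.lower()) if t]


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings (assumed scaling)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(token, entries, threshold=0.7):
    """Return the entry most similar to `token`, or None if none reaches the threshold."""
    if not entries:
        return None
    score, entry = max((edit_similarity(token, e), e) for e in entries)
    return entry if score >= threshold else None


def dice(tokens_a, tokens_b):
    """Set-based Dice coefficient: 2|A & B| / (|A| + |B|)."""
    a, b = set(tokens_a), set(tokens_b)
    return 0.0 if not a and not b else 2 * len(a & b) / (len(a) + len(b))


tokens = remove_stop_words(tokenize("ooc analysis by equipment"))
print(tokens)
print(best_match("equipement", ["equipment", "lot", "control chart"]))
print(dice(tokens, ["ooc", "analysis"]))
```

The edit-based measure tolerates spelling variations of a single token, whereas the Dice coefficient compares two sets of tokens as a whole; how the two scores are weighted against each other is left open here because the paper does not specify it.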


The string similarity function combines the edit-distance function and the Dice coefficient, so
as to get a better mapping result between the concepts of the resource name and the concepts of
the dictionary. This step makes it possible to find the key indicator that represents the subject
of the resource, and suggests the corresponding control domain. Once the control domain is
identified, the system searches all the relationships around this concept in the manufacturing
process ontology, according to its four descriptive views. This step focuses on logical reasoning
on the relations between concepts in the ontology. The system tries in that way to infer the
corresponding manufacturing objective.

Fig. 6. Overview of the mapping process

5. IMPLEMENTATION AND VALIDATION

The general approach has been tested within STMicroelectronics on a set of 384 heterogeneous
resources (from analysis and reporting tools). A mapping module has been developed using PHP for
the user interface. The C language was also used for string similarity calculation. The
manufacturing process ontology and the process-control dictionary have been implemented using
XML. The mapping system takes as input any location of a resource repository in the company
network and produces as output the semantic description of these resources (cf. Appendix A). The
types of descriptions are pre-identified at the beginning of the approach (with the resource
meta-model) and are taken from the ontology during the mapping process (Fig. 6).

The experimentation of the approach showed good results, with an average precision of 90% and an
average recall of 85%. Such results required several experiments and corrections in the semantic
models, as well as in the mapping process. The validation process is done manually in CSV files.
In fact, the system automatically exports the results to CSV files each time a mapping is done.
This output is then compared with the experts' description (Fig. 7). In this way, the validation
is done progressively for each panel of resources selected within the company. The aim of the
validation step is to evaluate the effectiveness of the mapping techniques, and to check the
coherency of the dictionary and the ontology, until their content is stabilized. The limit of
this technique is that it is time consuming, in particular when there are a lot of resources to
describe. However, in such a context, only the business experts have the necessary knowledge to
validate the results.

Fig. 7. Validation technique

6. CONCLUSION

This paper presents a semantic mapping approach to enhance the retrieval of MI resources in
industries. This approach gives guidelines to search and add semantics to these resources based
on four main elements: a meta-model for MI-resource description, a process-control dictionary, a
manufacturing process ontology and a mapping process. These elements made it possible to obtain a
novel search approach to explore and retrieve MI resources in industries. Furthermore, compared
with the existing works, only our approach provides a business description in the search results.
This description is centralized in a dictionary and an ontology, which, to some extent, unify the
vocabulary used in an activity domain. As a result, the end users can better explore and
understand the existing indicators in resource repositories, instead of developing new ones.
Also, the problem of resource overflow can be progressively reduced.

The experimentation of our approach within STMicroelectronics showed its effective application
and identified a promising solution over time. Our contribution also pinpointed a lack of
research works in the field of semantic-based resource retrieval in manufacturing companies.

Future works focus on improving the validation method of the approach. Giving guidelines for the
maintaining of the


dictionary and the ontology must also be considered, in order to regularly ensure a consistent
semantic content over time.

REFERENCES

Alnusair, A. & Zhao, T., 2010. Component Search and Reuse: An Ontology-based Approach. In
Knowledge Creation Diffusion Utilization. Las Vegas, Nevada, USA, pp. 258-261.
Cubranic, D. et al., 2003. Tools for light-weight knowledge sharing in open-source software
development. In Workshop on Open-Source Software - International Conference on Software
Engineering. Portland, Oregon, USA, pp. 683-697.
Ehrig, M. & Staab, S., 2004. QOM - quick ontology mapping. In Third International Semantic Web
Conference (ISWC). Hiroshima, Japan: LNCS, pp. 683-697.
Ferdian, 2001. A Comparison of Event-driven Process Chains and UML Activity Diagram for Denoting
Business Processes.
Gruber, T.R., 1993. Toward Principles for the Design of Ontologies Used for Knowledge Sharing.
Knowledge Creation Diffusion Utilization, pp. 907-928.
Hajmoosaei, A. & Kareem, S.A., 2008. An Approach for Semantic Query Mapping on the Heterogeneous
Web Data. In First International Conference on the Applications of Digital Information and Web
Technologies (ICADIWT). Ostrava, pp. 555-562.
Li, S. & Qiao, L., 2012. Ontology-based Modeling of Manufacturing Information and its Semantic
Retrieval. In Proceedings of the 16th International Conference on Computer Supported Cooperative
Work in Design. Wuhan, China, pp. 540-545.
Lin, D., 1998. An Information-Theoretic Definition of Similarity. In The 15th international
conference on Machine Learning. pp. 296-304.
Lin, F., 2007. State of the Art: Automatic Ontology Matching, Jonkoping, Sweden.
McMahon, C. et al., 2004. Waypoint: An Integrated Search and Retrieval System for Engineering
Documents. Journal of Computing and Information Science in Engineering, 4(4), p. 329.
Pellegrino, P. & Corno, F., 2006. An extensible platform for semantic classification and
retrieval of multimedia resources. In SWAP 2006 - Proceedings of the 3rd Italian Semantic Web
Workshop. Pisa, Italy, pp. 18-20.
Peng, Y. et al., 2009. An Ontology-Driven Paradigm for Component Representation and Retrieval. In
Ninth IEEE International Conference on Computer and Information Technology. Xiamen, China: IEEE,
pp. 187-192.
Pirrò, G. & Talia, D., 2007. UFOme: A User Friendly Ontology Mapping Environment. In 4th Italian
Semantic Web Workshop on Semantic Web Applications And Perspectives (SWAP). Italy.
Vdovjak, R. & Houben, G., 2001. RDF Based Architecture for Semantic Integration of Heterogeneous
Information Sources. In Proceedings of the International Workshop on Information Integration on
the Web. Rio de Janeiro, Brazil, pp. 51-57.
Yao, Y., Lin, L. & Dong, J., 2009. Research on Ontology-Based Multi-source Engineering
Information Retrieval in Integrated Environment of Enterprise. In International Conference on
Interoperability for Enterprise Software and Applications. China: IEEE, pp. 277-282.

Appendix A. EXAMPLE OF THE MAPPING INTERFACE

