and Control
International Federation of Automatic Control
June 19-21, 2013. Saint Petersburg, Russia
*Laboratory for Systems and Information Sciences (LSIS), University of Aix-Marseille,
France (e-mail: sara.bouzid@lsis.org; corine.cauvet@lsis.org; claudia.frydman@lsis.org).
**STMicroelectronics, Rousset area, France
(e-mail: sara.bouzid@st.com; jacques.pinaton@st.com)
in the company. We found more than 20 kinds of data management, analysis and reporting systems within STMicroelectronics. Examples of systems in use are Business Object for reporting, Kla ace xp for data analysis and statistical calculation, Apf for real-time data extraction and reporting, and so on (Fig. 1).

Fig. 1. Examples of COTS systems within STMicroelectronics

Each system holds 200 to 400 resources on average. As a consequence, it is almost impossible for an engineer to retrieve an indicator among the huge quantity stored in resource repositories. Because these resources essentially contain figures (e.g., flow charts, histograms, etc.) and numerical data, they usually lack semantics. Commercial data management systems were generally designed to enable easy access to and processing of data, with less consideration for how to share and retrieve the produced resources. Hence, using a document management system or a search engine will not improve the retrieval of such resources as long as they still lack semantics. We address this lack through a semantic approach.

3. RELATED WORKS

The issue of semantic retrieval of resources in manufacturing companies has become an emerging field of study. The first semantic-based techniques were applied to web search. With the emergence of the ontology paradigm, semantic techniques have attracted interest in other fields such as software-component retrieval (Y. Peng et al. 2009; Alnusair & Zhao 2010), document retrieval (Vdovjak & Houben 2001) and knowledge management systems (Cubranic et al. 2003). Few research works tackle MI-resource retrieval because this kind of resource is specific to industry and needs specific semantics. A recent work by Li & Qiao (2012) addresses manufacturing-information retrieval. The authors proposed an ontology for manufacturing information and showed its usefulness using the web engine of their data management system. Other similar works focus on the retrieval of engineering documents. McMahon et al. (2004) developed an integrated retrieval system where engineering documents can be annotated using pre-identified concepts and retrieved using a faceted-classification mechanism. Finally, Yao et al. (2009) proposed an integrated environment for engineering-information retrieval using an engineering ontology. This environment realizes the mapping between user queries and engineering documents.

Most of the research works that use a semantic retrieval process emphasize the use of one or many business ontologies. Such ontologies give a description of a domain as a whole, without differentiating between business semantics and the linguistic aspect. In our approach, we propose to separate the business description of a domain from the definition of its vocabulary. In addition, a specific combined string similarity technique is used in the mapping approach to broaden the scope of concept matching.

4. THE SEMANTIC MAPPING APPROACH

4.1 Approach overview

The proposal aims to improve MI-resource retrieval using a semantic description of the resources. To that end, we developed an approach to automatically identify business descriptions for MI resources. Because these resources aim at controlling a manufacturing process, our goal is to determine their relevance to the process control. A process-control dictionary referencing the main terminologies of the process-control domain (standard methods, standard indicators, etc.) is used. A manufacturing process ontology related to the business activity of the company is also used. The mapping process is then based on a semantic mapping algorithm which uses string similarity and statistical similarity techniques. These techniques identify, as accurately as possible, a business description for a given resource (Fig. 2).

The main purpose of such a mapping technique is to enable company engineers to easily access MI resources and to use them for their business needs. They can explore and retrieve a resource according to its description, regardless of its location in the company network. This approach can also be used to reference the resources with their description, like a resource annotation system.

Fig. 2. The semantic mapping approach

The proposed approach can be summarized in four steps.

The first step is to identify the business description that will bring the resources closer to the manufacturing objectives of the company. A meta-model for MI-resource description is provided for this purpose. This meta-model is also used to prepare an expert model, which is a manual description of a set of selected resources. The expert model is then used to validate the resulting semantic description.
2013 IFAC MIM
The second step of the approach focuses on gathering the business concepts, terminologies, relationships between concepts and all the vocabulary that will help in building the semantic description of the resources. A process-control dictionary and a manufacturing process ontology are proposed. The process-control dictionary provides the vocabulary of concepts that links the raw description of the resources to the targeted business semantics. Note that this dictionary is related to the process-control domain because the indicators involved in this approach are used for manufacturing process control in general. The manufacturing process ontology is a global business description of the manufacturing activity. It provides the link between the process-control methods used in the company and the manufacturing process description.

The third step in our approach is the mapping process. It consists in mapping the basic raw description that a resource can have (for example, its name, its location) to the concepts of the dictionary and the ontology. After several processing steps, the mapping system returns an estimated description of the manufacturing resources.

The final step in the proposed approach consists in validating the resulting description using the experts' description.

An overview of each step is presented in the next sub-sections.

4.2 The resource description meta-model

Fig. 3 depicts the meta-model that we defined within STMicroelectronics for the MI-resource description.

weekly, etc. This dimension represents the time period from which manufacturing data were collected. Finally, an indicator has an output form which refers to its visual representation. For example, a trend and a control chart are kinds of output forms.

The control domain can be a method or an objective of the process-control domain in general. Examples of process-control methods implemented within STMicroelectronics are SPC, FDC, Run to Run, Sampling, etc. The main standard process-control methods are based on statistical techniques. The control indicators are defined and calculated within the company by using these methods for a specific manufacturing objective. Thus, the manufacturing objective represents the business purpose addressed by the MI resources.

4.3 The process-control dictionary

The proposed dictionary gives the required terminologies of the process-control domain. Linguistic dictionaries, such as the well-known WordNet, are too generic for process-control concepts. For example, a generic dictionary can give the definition of statistics, of a process and of control, but it cannot give the meaning of statistical process control (SPC), even though SPC is a standard process-control method. Our dictionary aims at defining these specific terminologies. It is based on the following meta-model (Fig. 4), which uses some generic concepts of linguistic dictionaries (i.e. synonym, hypernym, hyponym).
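As a rough illustration of the dictionary meta-model of Section 4.3, the sketch below represents a dictionary entry with the linguistic relations named in the text (synonym, hypernym, hyponym). The class name `DictionaryConcept`, its field names, the helper `find_concept` and the two sample entries are assumptions made for illustration; they are not taken from the STMicroelectronics dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DictionaryConcept:
    """One entry of a process-control dictionary (hypothetical structure)."""
    name: str
    synonyms: set[str] = field(default_factory=set)
    hypernyms: set[str] = field(default_factory=set)  # more general concepts
    hyponyms: set[str] = field(default_factory=set)   # more specific concepts


def find_concept(term: str, dictionary: list[DictionaryConcept]) -> DictionaryConcept | None:
    """Return the entry whose name or synonyms match the term, ignoring case."""
    wanted = term.strip().lower()
    for concept in dictionary:
        labels = {concept.name.lower()} | {s.lower() for s in concept.synonyms}
        if wanted in labels:
            return concept
    return None


# Illustrative entries only; the real dictionary content is not given in the paper.
dictionary = [
    DictionaryConcept(
        name="statistical process control",
        synonyms={"SPC"},
        hypernyms={"process-control method"},
    ),
    DictionaryConcept(
        name="process-control method",
        hyponyms={"statistical process control"},
    ),
]

entry = find_concept("spc", dictionary)
print(entry.name if entry else "not found")
```

Keeping the relations as plain sets of concept names keeps the sketch small; a fuller model would probably link entries to each other directly.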
Finally, each concept in the dictionary is defined with a set of key concepts. They delimit the scope of definition of the concept. For example, the concept out of control is, in fact, a process-control indicator. Limit, target and control chart are examples of key concepts that define its scope.

4.4 The manufacturing process ontology

An ontology is "an explicit specification of a conceptualization" (Gruber 1993). It represents a description of a domain of knowledge or discourse. Ontologies have been created to improve the communication between humans, between humans and computers and between computers by offering a unique and standardized vocabulary. For the purpose of the semantic mapping, we developed a manufacturing process ontology for STMicroelectronics to get a standard description of its business process. This ontology relies on four views (Fig. 5):

- Organization view: references the business activities of the company
- Function view: describes the manufacturing objectives of the company for each business activity
- Control view: describes the process-control objectives and methods as implemented within the company
- Data view: gathers the data types involved in the manufacturing process

Fig. 5. The upper concepts of the manufacturing process ontology

These views correspond to the descriptive levels of the ARIS architecture (Ferdian 2001). They were chosen as a starting point for the domain description because the ARIS approach seeks to reduce the complexity of modeling business processes using those levels, and because the ontology has been built with a top-down strategy.

The manufacturing process ontology and the process-control dictionary are linked through the process-control methods. The process-control dictionary gives the link between a raw description of an indicator in a resource and the process-control domain (including process-control methods, objectives, concepts correlated with the domain and so on), whereas the manufacturing process ontology gives the link between a process-control concept and how it is used in the manufacturing activity. In the latter case, the semantic relations between the data view, the control view and the function view are decisive in building this link.

4.5 The mapping process

The mapping approach seeks to automatically find a semantic description, referenced in the ontology, for an MI resource. By mapping, we mean finding associations or correspondences between at least two concepts. Mapping approaches are widely used in ontology-based works when an ontology needs to be created by mapping two ontology structures (Pirrò & Talia 2007; Ehrig & Staab 2004). The mapping notion is also used in information retrieval on the web, in document retrieval and in component retrieval systems (Vdovjak & Houben 2001). In these systems, the user query is generally mapped to the available data and information. String similarity techniques are used to map information to the vocabulary associated with the processed data and knowledge (Pellegrino & Corno 2006; S. Li & Qiao 2012). Business ontologies and linguistic dictionaries are also used in this context to enhance the matching of user queries. In our approach, we combine semantic analysis with string similarity techniques. The following mapping techniques are applied:

String similarity measuring: it consists in the syntactic matching of concepts. There is a set of well-known string similarity functions in the literature, including edit-distance, Jaro-Winkler, Jaccard similarity, TF-IDF, and so forth (F. Lin 2007). We use the edit-distance function, also known as Levenshtein distance. In the mapping process, this function calculates the distance between each concept of the resource name and the concepts of the dictionary. The best edit-distance under a given threshold between two concepts is selected. After several experiments we set the threshold to 0.7. This distance is combined with another string similarity function called the Dice coefficient (D. Lin 1998), which returns an average similarity coefficient between two sets of concepts (instead of two concepts).

Reasoning-based technique: this technique relies on searching the target description using the semantic relations between the concepts that are defined in the manufacturing process ontology.

Our approach focuses on identifying the generic concepts that describe a resource in the proposed meta-model (Fig. 3), which are the key indicator, the control domain and the manufacturing objective. For example, taking the resource name "ooc analysis by equipment", by mapping these concepts with our approach, the system must obtain the following results:

Indicator: <ooc>
Control domain: <SPC>
Manufacturing objective: <Lot control>

Fig. 6 summarizes the mapping process. The resource name is the starting point of the approach. We first tokenize the resource name and eliminate the stop words. Tokenizing means transforming a string into simple tokens. For example:

<ooc-analysis> : <ooc> <analysis>

Example of eliminating stop words:

<ooc analysis by equipment> : <ooc> <analysis> <equipment>
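The tokenization and stop-word elimination illustrated above can be sketched as follows. The splitting rule (any run of non-alphanumeric characters separates tokens) and the small stop-word set are assumptions; the paper does not list the stop words actually used, and the function names are illustrative.

```python
from __future__ import annotations

import re

# Assumed stop-word list; the list used in the paper is not specified.
STOP_WORDS = {"by", "of", "the", "for", "and", "in", "on"}


def tokenize(resource_name: str) -> list[str]:
    """Split a resource name into lowercase tokens on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", resource_name.lower()) if tok]


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop-word list."""
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(tokenize("ooc-analysis"))
# ['ooc', 'analysis']
print(remove_stop_words(tokenize("ooc analysis by equipment")))
# ['ooc', 'analysis', 'equipment']
```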
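The string similarity measuring of Section 4.5 (edit distance with a 0.7 threshold, combined with the Dice coefficient) can be sketched as follows. Two points are assumptions, since the paper does not spell them out: the threshold is applied to an edit distance rescaled to a similarity in [0, 1], and the Dice coefficient counts two tokens as shared when their similarity reaches that threshold. This is one plausible reading, not the authors' implementation, and all function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings (assumed scaling)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(token, vocabulary, threshold=0.7):
    """Return the closest vocabulary term, or None if none reaches the threshold."""
    best = max(vocabulary, key=lambda term: edit_similarity(token, term), default=None)
    if best is not None and edit_similarity(token, best) >= threshold:
        return best
    return None


def dice_coefficient(tokens_a, tokens_b, threshold=0.7):
    """Dice coefficient 2*|shared| / (|A| + |B|), with fuzzy matches counted as shared."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    shared = sum(1 for tok in set_a if best_match(tok, set_b, threshold) is not None)
    shared = min(shared, len(set_b))  # keeps the coefficient within [0, 1]
    return 2.0 * shared / (len(set_a) + len(set_b))


resource_tokens = ["ooc", "analysis", "equipment"]
concept_tokens = ["ooc", "analyses"]
print(best_match("analyses", resource_tokens))            # analysis
print(dice_coefficient(resource_tokens, concept_tokens))  # 0.8
```

In this reading, the edit distance picks the closest dictionary concept for each token, while the Dice coefficient scores how well the whole token set of a resource name overlaps with the token set of a candidate concept.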
The string similarity function combines the edit-distance function and the Dice coefficient, so as to get a better mapping result between the concepts of the resource name and the concepts of the dictionary. This step identifies the key indicator that represents the subject of the resource, and suggests the corresponding control domain. Once the control domain is identified, the system searches all the relationships around this concept in the manufacturing process ontology, according to its four descriptive views. This step relies on logical reasoning over the relations between concepts in the ontology. In that way, the system tries to infer the corresponding manufacturing objective.

The experimentation of the approach showed good results, with an average of 90% precision and 85% recall. Such results required several experiments and corrections in the semantic models, as well as in the mapping process. The validation process is done manually in csv files: the system automatically exports the results to csv files each time a mapping is done, and this output is then compared with the experts' description (Fig. 7). In this way, the validation is done progressively for each panel of resources selected within the company. The aim of the validation step is to evaluate the effectiveness of the mapping techniques and to check the coherency of the dictionary and the ontology until their content is stable. The limit of this technique is that it is time consuming, in particular when there are a lot of resources to describe. However, in such a context, only the business experts have the necessary knowledge to validate the results.
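Although the validation described here is manual, the comparison between the exported results and the experts' description can be illustrated by the following sketch. It assumes csv files with hypothetical column names (`resource`, `indicator`, `control_domain`, `objective`) and counts a description as correct only when all three fields match the expert model. The paper does not state how precision and recall were computed, and the rows below are made up for illustration; they are not results from the paper.

```python
from __future__ import annotations

import csv
import io


def load_descriptions(csv_text: str) -> dict[str, tuple[str, str, str]]:
    """Map each resource name to its (indicator, control domain, objective) triple."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["resource"]: (row["indicator"], row["control_domain"], row["objective"])
        for row in reader
    }


def precision_recall(system: dict, expert: dict) -> tuple[float, float]:
    """Precision over the system's descriptions, recall over the expert model."""
    correct = sum(1 for name, desc in system.items() if expert.get(name) == desc)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(expert) if expert else 0.0
    return precision, recall


# Made-up rows; only the first resource name comes from the paper's example.
expert_csv = """resource,indicator,control_domain,objective
ooc analysis by equipment,ooc,SPC,Lot control
resource B,indicator B,FDC,objective B
resource C,indicator C,SPC,objective C
"""
system_csv = """resource,indicator,control_domain,objective
ooc analysis by equipment,ooc,SPC,Lot control
resource B,indicator B,SPC,objective B
"""

precision, recall = precision_recall(
    load_descriptions(system_csv), load_descriptions(expert_csv)
)
print(f"precision={precision:.2f} recall={recall:.2f}")
# precision=0.50 recall=0.33
```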
dictionary and the ontology must also be considered, in order to regularly ensure a consistent semantic content over time.

REFERENCES

Alnusair, A. & Zhao, T., 2010. Component Search and Reuse: An Ontology-based Approach. In Knowledge Creation Diffusion Utilization. Las Vegas, Nevada, USA, pp. 258–261.
Cubranic, D. et al., 2003. Tools for light-weight knowledge sharing in open-source software development. In Workshop on Open-Source Software - International Conference on Software Engineering. Portland, Oregon, USA, pp. 683–697.
Ehrig, M. & Staab, S., 2004. QOM – quick ontology mapping. In Third International Semantic Web Conference (ISWC). Hiroshima, Japan: LNCS, pp. 683–697.
Ferdian, 2001. A Comparison of Event-driven Process Chains and UML Activity Diagram for Denoting Business Processes.
Gruber, T.R., 1993. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. Knowledge Creation Diffusion Utilization, pp. 907–928.
Hajmoosaei, A. & Kareem, S.A., 2008. An Approach for Semantic Query Mapping on the Heterogeneous Web Data. In First International Conference on the Applications of Digital Information and Web Technologies (ICADIWT). Ostrava, pp. 555–562.
Li, S. & Qiao, L., 2012. Ontology-based Modeling of Manufacturing Information and its Semantic Retrieval. In Proceedings of the 16th International Conference on Computer Supported Cooperative Work in Design. Wuhan, China, pp. 540–545.
Lin, D., 1998. An Information-Theoretic Definition of Similarity. In The 15th international conference on Machine Learning. pp. 296–304.
Lin, F., 2007. State of the Art: Automatic Ontology Matching, Jonkoping, Sweden.
McMahon, C. et al., 2004. Waypoint: An Integrated Search and Retrieval System for Engineering Documents. Journal of Computing and Information Science in Engineering, 4(4), p.329.
Pellegrino, P. & Corno, F., 2006. An extensible platform for semantic classification and retrieval of multimedia resources. In SWAP 2006 - Proceedings of the 3rd Italian Semantic Web Workshop. Pisa, Italy, pp. 18–20.
Peng, Y. et al., 2009. An Ontology-Driven Paradigm for Component Representation and Retrieval. In Ninth IEEE International Conference on Computer and Information Technology. Xiamen, China: IEEE, pp. 187–192.
Pirrò, G. & Talia, D., 2007. UFOme: A User Friendly Ontology Mapping Environment. In 4th Italian Semantic Web Workshop on Semantic Web Applications And Perspectives (SWAP). Italy.
Vdovjak, R. & Houben, G., 2001. RDF Based Architecture for Semantic Integration of Heterogeneous Information Sources. In Proceedings of the International Workshop on Information Integration on the Web. Rio de Janeiro, Brazil, pp. 51–57.
Yao, Y., Lin, L. & Dong, J., 2009. Research on Ontology-Based Multi-source Engineering Information Retrieval in Integrated Environment of Enterprise. In International Conference on Interoperability for Enterprise Software and Applications. China: IEEE, pp. 277–282.