All content following this page was uploaded by Mario Ricardo Nascimento Marques Junior on 26 July 2021.
Abstract: Databooks are the main documents for monitoring and validating work in the oil industry. However, analyzing these Databooks requires many professional man-hours and remains subject to errors. One way to optimize this task is an intelligent search and validation system for this documentation: once the documents are scanned, algorithms could assist the professional in analyzing the Databooks through automatic indexing and checks. A problem that arises from this approach is the difficulty of developing a conceptual model that is valid for all Databooks and for all types of components and structures subject to inspection procedures. Because this modeling is done manually by professionals, the task is difficult, slow, and often inefficient. To mitigate this problem, this work proposes a system that automatically generates a conceptual model from textual descriptions of the inspection elements. This is done by developing an ontology that describes the standardized knowledge on the inspection of the different types of components in oil platform structures, for the validation of inspection reports and quality certificates with respect to completeness and compliance criteria.
[Figure: system diagram; recoverable labels: JSON File, Data insert, Database]
lightweight data exchange format with great performance (Izquierdo and Cabot, 2013). JSON is a human-readable format that consists of sets of objects (that is, types or concepts) described by name/value pairs (that is, fields or attributes). JSON has no explicit schema: nothing specifies the internal structure of JSON objects, so the schema is implicit. Schemaless data is particularly interesting when dealing with non-uniform data (for example, non-uniform types or custom fields) or in schema migration (Fowler, 2013). However, it can become a difficulty in data integration scenarios, where it becomes necessary to discover, at least partially, the structure in order to process the data correctly.

In (Izquierdo and Cabot, 2013), an approach is presented to automatically discover the implicit schema of a set of JSON documents. Model-oriented techniques were used to design a process in which initial schema excerpts are discovered from each individual service and then combined into a composite model that describes the underlying domain model of the application, which facilitates the understanding of JSON-based services. The approach was implemented in Java and made available on the hosting platform GitHub (Izquierdo, 2014).

3 PROPOSED SYSTEM

As Databooks are usually scanned documents or PDF documents, processing is necessary to extract only the relevant information. A dedicated module performs this processing through OCR (Optical Character Recognition) algorithms, transforming the Databook into a JSON file. Assuming that the information extracted from a Databook is contained in a JSON file, the system of Figure 3 has been proposed.

The main objective of the proposed system is to allow the integration of the diverse types of components and documents and the correct assignment of data even under different structures. The proposed system has two main functionalities. The first is to generate a database from an ontology: from an OWL file, an algorithm should generate the database schema corresponding to the ontology, and from this schema the database will be created. The development of this functionality will build on the works reported in (Afzal et al., 2016), (Panawong et al., 2016), and (Abo Zidan et al., 2019). In addition, the schema generated from the ontology will be used in the schema matching.

The second functionality consists of taking a JSON file of unknown schema and inserting its data into the respective database tables. To do this, it is first necessary to infer which schema the file corresponds to. For this stage, the work presented by (Izquierdo and Cabot, 2013) will be taken as a reference, and libraries that perform this function will also be investigated.

Once the schema of the JSON file is known, it will be matched against the schema generated from the ontology. To perform the schema matching, the tool proposed by (Aumueller et al., 2005), called Coma++ (a system for flexible combination of schema matching approaches), will be used as the basis. This tool was chosen because it allows the matching of different file types (for example, OWL and SQL), in addition to being an open source project under the AGPL license. Currently, the tool cannot work with JSON files, so this capability should be developed. Then, a file containing the schema matches is generated. Finally, based on the JSON file and the result of the schema matching, an algorithm inserts the data into the corresponding database tables.
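
The two functionalities described in this section can be illustrated with a minimal Python sketch. It is not the proposed implementation: the ontology is a hand-written dictionary standing in for a parsed OWL file, the matcher is a plain string-similarity comparison from the standard library rather than Coma++, and every name in it (`ONTOLOGY`, `create_schema`, `infer_schema`, `match_fields`, `insert_record`, the `weld_inspection` table and its fields) is hypothetical and chosen only for illustration.

```python
import difflib
import json
import sqlite3

# Hypothetical stand-in for the classes and datatype properties of an OWL file.
ONTOLOGY = {
    "weld_inspection": {
        "joint_id": "TEXT",
        "inspector_name": "TEXT",
        "inspection_date": "TEXT",
    },
}

def create_schema(conn, ontology):
    """Functionality 1: create one table per ontology class.
    Identifiers come from the trusted ONTOLOGY dict in this sketch."""
    for table, columns in ontology.items():
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
        conn.execute(f"CREATE TABLE {table} ({cols})")

def infer_schema(records):
    """Collect the implicit schema: each field name with the Python types seen."""
    schema = {}
    for record in records:
        for key, value in record.items():
            schema.setdefault(key, set()).add(type(value).__name__)
    return schema

def normalize(name):
    """Lowercase a name and drop common separators before comparing."""
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")

def match_fields(json_fields, columns, threshold=0.6):
    """Map each JSON field to its most similar column, if similar enough."""
    mapping = {}
    for field in json_fields:
        best, best_score = None, 0.0
        for column in columns:
            score = difflib.SequenceMatcher(
                None, normalize(field), normalize(column)
            ).ratio()
            if score > best_score:
                best, best_score = column, score
        if best is not None and best_score >= threshold:
            mapping[field] = best
    return mapping

def insert_record(conn, table, record, mapping):
    """Functionality 2: insert only the fields that were matched to a column."""
    matched = [key for key in record if key in mapping]
    columns = ", ".join(mapping[key] for key in matched)
    placeholders = ", ".join("?" for _ in matched)
    values = [record[key] for key in matched]
    conn.execute(f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", values)

# Usage: a made-up JSON extract whose field names differ from the column names.
conn = sqlite3.connect(":memory:")
create_schema(conn, ONTOLOGY)
records = json.loads(
    '[{"JointID": "J-101", "inspectorName": "A. Silva",'
    ' "InspectionDate": "2021-03-04", "page": 12}]'
)
schema = infer_schema(records)
mapping = match_fields(schema, ONTOLOGY["weld_inspection"])
for record in records:
    insert_record(conn, "weld_inspection", record, mapping)
rows = conn.execute(
    "SELECT joint_id, inspector_name, inspection_date FROM weld_inspection"
).fetchall()
print(mapping)
print(rows)
```

In this sketch, fields that find no sufficiently similar column (such as `page`) are simply skipped. A name-similarity threshold is a much weaker criterion than the combination of matchers that a dedicated schema matching tool provides, so it should be read only as a placeholder for that step.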
4 CONCLUSION AND FUTURE WORK

This article proposed a system for integrating quality control data from components in the Brazilian oil and gas industry. The system builds on tools that have already been developed, for example Coma++ (Aumueller et al., 2005) and Ontology2RDB (Abo Zidan et al., 2019). The main objective of the system is to provide a structure that allows access to and management of this data, improving the management of equipment quality control. In the case of manual data verification, inspectors will therefore perform the task faster because the information is easier to access. This quality control data can also be provided to a system that infers the completeness and compliance of the Databooks.

As future work, tests are planned to validate the functionalities of the tools mentioned in this work. From these tests, it will be possible to evaluate which tool best suits the proposed system. An important step for the implementation of this work is the development of an ontology applied to the inspection of components of the oil industry. First, the literature should be searched for ontologies applied to this domain; then an ontology should be developed that captures the peculiarities of the analyzed domain. Finally, after the development and integration of the proposed system, the effective gains from its use must be evaluated.

REFERENCES

Abo Zidan, R., Almustafa, M., and Tahawi, M. (2019). Ontology2rdb for storing ontology in relational database. Journal of Theoretical and Applied Information Technology, 97:2229–2240.
Afzal, H., Waqas, M., and Naz, T. (2016). Owlmap: fully automatic mapping of ontology into relational database schema. International journal of advanced computer science and applications, 7(11):7–15.
Almeida, M. B. (2014). Uma abordagem integrada sobre ontologias: Ciência da informação, ciência da computação e filosofia. Perspectivas em Ciência da Informação, 19(3):242–258.
Aumueller, D., Do, H.-H., Massmann, S., and Rahm, E. (2005). Schema and ontology matching with coma++. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data, pages 906–908.
Bernstein, P. A., Madhavan, J., and Rahm, E. (2011). Generic schema matching, ten years later. Proceedings of the VLDB Endowment, 4(11):695–701.
de Freitas, A. L. P. (2017). Proposta de uma Ontologia para Tratamento de Divergência de Dados na Aplicação de Fusão de Sensores em Plantas Industriais. Master's thesis, Universidade Federal do Rio Grande, Brasil.
Duarte, G. (2010). O Controle da Qualidade em Processos de Produção Mecânica Não-Seriada. Escola Politécnica da Universidade de São Paulo.
Elmasri, R. and Navathe, S. B. (2011). Database systems, volume 9. Pearson Education Boston, MA.
Fowler, M. (2013). Schemaless data structures. URL http://martinfowler.com/articles/schemaless.
Izquierdo, J. L. C. (2014). JSON discoverer tool. https://github.com/SOM-Research/jsonDiscoverer.
Izquierdo, J. L. C. and Cabot, J. (2013). Discovering implicit schemas in json data. In International Conference on Web Engineering, pages 68–83. Springer.
Nascimento, R. S. (2010). Qualidade na soldagem em uma empresa fabricante de estruturas metálicas soldadas do setor de óleo e gás. VI Congresso Nacional de Excelência em Gestão.
Panawong, J., Ruangrajitpakorn, T., and Buranarach, M. (2016). An automatic database generation and ontology mapping from owl file. In JIST (Workshops & Posters), pages 20–27.
Petróleo-Brasileiro-S.A (2017). ABC da inspeção de fabricação. https://www.petronect.com.br/irj/go/km/docs/pccshrcontent/Site%20Content%20%28Legacy%29/Portal2018/arquivos/cep/abc inspecao.pdf.
Rachman, A. and Ratnayake, R. C. (2019). An ontology-based approach for developing offshore and onshore process equipment inspection knowledge base. In International Conference on Offshore Mechanics and Arctic Engineering, volume 58783, page V003T02A084. American Society of Mechanical Engineers.
Rodrigues, D. d. A. (2013). Casamento de esquemas de banco de dados aplicando aprendizado ativo. Master's thesis, Universidade Federal do Amazonas, Brasil.
Staab, S. and Studer, R. (2010). Handbook on ontologies. Springer Science & Business Media.
Wand, Y., Storey, V. C., and Weber, R. (1999). An ontological analysis of the relationship construct in conceptual modeling. ACM Transactions on Database Systems (TODS), 24(4):494–528.