
Ref. Ares(2017)6377400 - 28/12/2017

D2.2 Requirements and specification for integrated database

Work Package 2
Lead Authors (Org) Donna Dykeman (GRANTA)

Contributing Author(s) (Org) Dave Cebon (GRANTA), Andrea Berto (GRANTA), Nic Austin
(GRANTA), Borek Patzak (CTU), Vít Šmilauer (CTU), Carlos Kavka
(ESTECO)
Reviewers (Org) Salim Belouettar (LIST) and Gaetano Giunta (LIST)
Due Date 30-09-2017
Date 08-12-2017
Version V05

Dissemination level

PU: Public
PP: Restricted to other programme participants
RE: Restricted to a group specified by the consortium
CO: Confidential, only for members of the consortium X

COMPOSELECTOR H2020 Project


Funding from the European Union’s Horizon 2020 research and innovation programme under grant
agreement No 721105. | www.composelector.net Page 1
Versioning
Version | Date | Author | Instruction
v.0.1 | 20.09.2017 | D. Dykeman (GRANTA) | First draft
v.0.2 | 03.11.2017 | D. Dykeman (GRANTA), A. Berto (GRANTA), N. Austin (GRANTA) | First draft released for partner review
v.0.3 | 05.12.2017 | B. Patzák (CTU), V. Šmilauer (CTU), C. Kavka (ESTECO) | WP2 Review
v.0.4 | 08.12.2017 | All partners | Consortia Review
- | 15.12.2017 | - | Submission

Acronyms
- API: Application Programming Interface
- BDSS: Business Decision Support System
- EMMC: European Materials Modelling Council
- EMMO: European Materials Modelling Ontology
- KPI: Key Performance Indicator
- IP: Intellectual Property
- MODA: Modelling Data Elements
- MuPIF: Multi-Physics Integration Framework
- RoMM: Review of Materials Modelling
- STK: Scripting Toolkit (specific to GRANTA MI)

Disclaimer:
This document’s contents are not intended to replace consultation of any applicable legal
sources or the necessary advice of a legal expert, where appropriate. All information in
this document is provided “as is” and no guarantee or warranty is given that the information
is fit for any particular purpose. The user, therefore, uses the information at its
sole risk and liability. For the avoidance of all doubts, the European Commission has no
liability in respect of this document, which is merely representing the authors’ view.

TABLE OF CONTENTS
Versioning
Acronyms
Disclaimer
Introduction
Stakeholders
Database Requirements
Materials Information Management
Database Integration Requirements for Materials Information Management
Data Structure
Classification and relationships
Interoperability
General Interoperability Requirements
Generic Interoperability Example
Specific Interoperability Example
Conclusions
References
ANNEX 1 - EMMO

Introduction
COMPOSELECTOR focuses on the development of a business decision support system (BDSS) which
enhances and integrates existing software codes, targets decision makers and provides them with
easy-to-use tools and procedures for choosing the right polymer-matrix composite (PMC)
processes and modelling options. The modelling part focuses on the development and demonstration of
integrated solutions based on multi-disciplinary, multi-model and multi-field approaches for
decision-making in the selection, design and fabrication of PMCs.

The purpose of this deliverable is to identify and publicly share the requirements for an integrated
database with multi-scale modelling platforms and tools for the benefit of end-users in materials and
manufacturing sectors. Integration can refer to the seamless interface with the end-user’s
environment for decision-making (full integration), or it may mean the seamless exchange of data
between different decision-making environments (interoperability). Computational frameworks may
refer to centralized, cloud-hosted Hubs, or to tools which can be downloaded for local or centralized
installation. This deliverable draws on Granta Design’s experience as the developer of a
database system dedicated to materials and process information management
(GRANTA MI [1]) for engineering and science applications. Granta has full integrations with leading
CAD, CAE and PLM systems, and has developed functionality for interoperability with both
centralized Hubs and tools for multi-scale modelling.

The deliverable introduces the perspective of end-users from manufacturing and materials industries
on data storage requirements and interoperability. The data storage requirements relate
specifically to materials and process information management, which is the core use of the data and
information generated by the MuPIF framework [2] with integrated modelling codes. The system used
for this purpose will henceforth be called the database. There are two end-user perspectives which
can be applied to an integrated database for the platform:
1) Workflows executed from a central location with an authorized connection to the database (DB);
optimally, this is done at the DB server.
2) Workflows executed at distributed locations, in which case authorized connections (licenses to
the Granta API, etc.) are needed.
Note that in either case, the actual models will be executed on remote servers provided by project
partners.

The focus of this deliverable is on perspective 2; however, the same technical requirements can
be employed for perspective 1 with changes to the requirements for access control, business decision
system integration, and data license rights. Figure 1 illustrates perspective 2 and how an enterprise
database integrates with a computational framework.

Figure 1 Generic architectural overview of an enterprise database (integrated within a
manufacturing company) interoperable with multi-scale modelling platform components

In Figure 1, GRANTA MI [1] represents a locally installed materials information environment which
is behind the firewall of a company. The local and open components communicate securely via VPN,
and together create an open innovation framework for the company. The components on the side of
the open system can also be adopted by an enterprise for implementation behind their firewall.

The remainder of this work details definitions which provide the context for integration (BDSS,
materials information management, end-users) and requirements for integration with business
decision support systems from an enterprise perspective (infrastructure and tools, interoperability,
data needs). The specifics of data formats will be deferred until D2.4 Storage Platform Alpha Release
when the MODAs are also defined. Granta Design’s database system, GRANTA MI [1], will serve
as the case study for this deliverable since it is being used in integration studies with multi-scale
modelling frameworks/Hubs (e.g. MuPIF, SimPhoNy [3], nanoHUB [4]), among others. Therefore,
the content takes on a best practice approach for an integrated materials information database
system. A demonstration of integration with the MuPIF platform is also included.

Stakeholders
Drawing from statements across other COMPOSELECTOR deliverables and Granta’s
understanding of materials information management end-users, there are different types of
stakeholders in COMPOSELECTOR which can be characterized by unique and intersecting
requirements. Note that this is a general classification of stakeholders:

● End-user: may have limited technical expertise in software development and is
interested in executing the workflows and inspecting the corresponding results
(represented by manufacturing end-users from enterprises such as Dow, Goodyear, Airbus).
● Platform developer: develops the computational platform or tool and serves the needs of
all other stakeholders (CTU).
● Component developer/integrator: is the stakeholder who either develops its own piece of
software (i.e. LIST, ESI, e-Xstream, Granta Design) or wraps an existing computational tool
into the platform (i.e. POLITO, INSA).
● Translator: is the modelling expert stakeholder who has high technical competency and
domain knowledge for modelling and supports the design of the simulation workflow for the
end-user, along with helping to establish the value of the modelling results and activity for
the end-user, if needed (defined as a Translator by the EMMC [5]).
● Admin and IT-support staff: is the stakeholder who maintains the resources and the
network availability of the computers connecting to the platform. This stakeholder may exist
within the End-user, or may be a representative of the Platform developer, or a representative
able to combine support for the Platform and Components (although this scenario is less
likely unless a new software entity is formed supported by legal integration).

Two specific end-user personas are described here to support the requirements of a database for
integration with the COMPOSELECTOR components and the company business decision support
systems (informatics and decision-making tools such as WP5 tools, CAD, CAE, PLM, ERP):
● Materials Engineer/Scientist – individuals with the responsibility of selecting materials
based on requirements from structural engineers and processing, developing new materials,
enhancing materials, understanding the behavior of materials in-service, defining process
conditions and specifications. This stakeholder is typically entrusted with materials
qualification (whether internal as is the case for automotive, or to international standard as is
the case for aerospace).
● Structural Engineer – individuals ensuring the performance of a material for an application
typically through physical test design and simulation analysis. In the case of aerospace, this
individual will follow a critical path to part certification to international standards agreed by
the FAA and its international partners (e.g. EASA). For automotive, the combined car
structure must reach an international crash standard (e.g. NCAP), and hence any composite
part performance is homologated within the overall structural performance.

Database Requirements
Materials Information Management
Materials Information Management refers to the full traceability of physical and virtual
data/information in the life cycle of a material and/or part, as illustrated in Figure 2. Emphasis has
been placed on materials information management in the COMPOSELECTOR project since it is
intrinsic to capturing traceability for model input and output and the physical tests which validate the
models, and is often lost at the product or application level. Product lifecycle management typically
appends materials information by a link to a data store or a PDF of finalized results, whereas materials
information management ensures the right data point is available to the right stakeholder, in the
format for the right environment, with appropriate context for decision-making. It means making
the best available materials and process information accessible when and where authorized
users need it (correct data type, units and tools for viewing, selecting/assigning, analyzing).
Materials information management and related tools have demonstrated significant cost and risk
reductions across a business (a recent quote from Rolls-Royce reported annual savings of £6.9M
across three sites due to implementation of Granta’s materials information management system).

Figure 2 Illustration depicting the huge range of data, information and knowledge that is
associated with a particular material at different stages in the product lifecycle – engineering,
economic, regulatory, environmental, manufacturing, and more [courtesy of Granta Design].

Database Integration Requirements for Materials Information Management

High-level requirements for robust materials information management systems as defined by end-
users from materials and process groups and structural analysis groups include [Granta, EMMC]:
• Physical and virtual test data (input and output), and associated metadata (reduction,
translation, analysis, interpretation, etc.) for reproducibility.
• Well-pedigreed, traceable, reliable data (i.e. ‘Gold Source’) such that data can be trusted
by users across the business; this is a core measure of quality as defined by the EMMC,
along with the amount of relevant data available.
• Data that is statistically significant and validated to ensure users have confidence that
the data is trustworthy, which requires storage of associated analysis and metadata for
analysis parameters.
• Access controlled, limiting rights to edit data and defining who is authorized to make
changes; versions of the data are managed and change processes exist; and confidentiality of
materials data is maintained (e.g. ITAR restrictions).
• Consistent and well managed (i.e. not just a file dump), acknowledging that data has a
structure appropriate for its end use (often agreed within a domain), that tools and
workflows enable users to transfer data in and out efficiently, and that authorization processes
ensure questionable data is not used.
• Domain specific schema is required backed by meaning behind attributes, agreed across
domain stakeholders (manufacturers, materials suppliers, equipment manufacturers,
operators, digital tool developers).
• Ability to compare both physical and virtual test data as both are increasingly used as a
combined source of primary data to extend the range of data available for selection and to
search across data spaces, and as validation.
• Flexible schema with importers which can be updated ‘on-the-fly’, particularly relevant for
researchers.
• Large volume store of key output files with relevant links to reduced, analyzed data.
• Workflow enabled database allowing data to be pushed out to and received from multi-scale
platforms such as MuPIF, contributing to the traceability of data sources, modelling data, simulation
workflows, etc.
• Export data to Excel for further analysis as this is often the requirement for bespoke internal
tools, domain specific statistical software, notably for business decision-making.
• Data model which includes record links is key in linking data and information to pedigree
for both experimental and virtual data.
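
The record-link requirement in the last bullet can be illustrated with a minimal sketch. It is not
GRANTA MI's data model: the Record class, its field names and the record identifiers are all
hypothetical, and the store is a plain dictionary. It only shows how following links from a simulation
record recovers the physical test and material batch it depends on.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Illustrative database record (hypothetical, not a GRANTA MI type)."""
    record_id: str
    kind: str  # e.g. "batch", "physical test", "simulation"
    links: list = field(default_factory=list)  # ids of records this one derives from


def trace_pedigree(records, start_id):
    """Return the ids of every record reachable from start_id by following links."""
    seen = []
    stack = [start_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against revisiting shared or circular links
        seen.append(current)
        stack.extend(records[current].links)
    return seen


# Made-up example: a simulation validated against a test on a known batch.
records = {
    r.record_id: r
    for r in [
        Record("batch-001", "batch"),
        Record("test-017", "physical test", links=["batch-001"]),
        Record("sim-042", "simulation", links=["test-017"]),
    ]
}

pedigree = trace_pedigree(records, "sim-042")
```

The same traversal works for experimental and virtual records alike, which is the point of storing
links in the data model rather than in free-text notes.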

Data Structure
Data structure, by definition, is a prescribed way of organizing data for efficient use by computer
software algorithms. It refers to the digital infrastructure needed to represent data appropriately, i.e.
the data types required for specific domains. Formal definitions for data types exist (abstract data,
opaque data, transparent data, protocols, and design by contract) depending on whether they are
used by an interface between codes or hidden in subroutines within a single code. This work will
settle on those types associated with physical and virtual data, as required to interface
between the software components in the COMPOSELECTOR project. In this case, typical data types
include functional data, short and long strings, arrays, points and tabular data. Specific data types
will be defined with partners as models are selected from each code for the application cases. The
concept of creating ‘universal connection tools’ or ‘common APIs’ has been floated. This integration
approach requires further consideration by software providers, as it must bridge several domain fields
in which data structure is typically supported by the classification of domain-specific data into data
type requirements (e.g. fatigue data stored as functional data) for storage, visualization, combining
data sets, searchability, data analytics, statistical analysis, and easy transfer to other file formats
required by decision-making tools (e.g. ESI, e-Xstream tools). Such a bridge would be a significant
undertaking beyond the project scope and outcomes. For COMPOSELECTOR,
emphasis will be placed on application use case workflows and specific data type requirements once
the MODAs and selection of models are further defined (M12).
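
As a rough illustration of the data types named above, the sketch below models a point value, a
functional data series and a table. The class names, fields and the sample S-N values are assumptions
made for this example only; the project's actual data types are to be defined with partners.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class PointValue:
    """A single number with its unit, e.g. a tensile modulus."""
    value: float
    unit: str


@dataclass(frozen=True)
class TabularData:
    """Named columns plus rows; each row holds one entry per column."""
    columns: tuple
    rows: tuple


@dataclass(frozen=True)
class FunctionalData:
    """A y(x) series such as a fatigue curve; xs must be sorted ascending."""
    x_name: str
    y_name: str
    xs: tuple
    ys: tuple

    def at(self, x):
        """Linearly interpolate y at x; values outside the stored range are rejected."""
        if not self.xs[0] <= x <= self.xs[-1]:
            raise ValueError("x is outside the stored range")
        i = bisect_left(self.xs, x)
        if self.xs[i] == x:
            return self.ys[i]  # exact hit on a stored point
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up fatigue data stored as functional data.
sn_curve = FunctionalData(
    x_name="log10(cycles)",
    y_name="stress amplitude (MPa)",
    xs=(4.0, 5.0, 6.0),
    ys=(400.0, 300.0, 250.0),
)
stress_at_4_5 = sn_curve.at(4.5)
modulus = PointValue(135.0, "GPa")
```

Keeping the axis names and units with the values is what lets a stored curve be plotted, compared
or exported without consulting the code that produced it.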

Classification and relationships


The hierarchical classification of domain-specific data is typically shared in the form of a schema or
ontology, the latter of which provides an over-arching framework, and the former relates information
on a very detailed level with associated units, data types, and links between data for full traceability
(as required to support materials qualification and certification programmes). Granta’s schema for
composites is validated by industry members (e.g. MDMC, EMIT, AutoMATiC consortia [6]) and will
be used in the COMPOSELECTOR project to link aspects of structure-process-property
relationships and metadata for reporting (e.g. batch number, supplier, test house, design allowable)
across both physical and virtual data. The EMMC is developing a high-level ontology known as
EMMO [7], which is undergoing review and was anticipated to be released for public review in November
2017 (an illustration of the EMMO is shared in Annex 1). The EMMO is designed to be adopted by
existing hierarchical database solutions, and for the representation of data for searchability (e.g.
research literature, file stores). Nomenclature and metadata for materials modelling are supported by
the material modelling definitions as published in the RoMM [8] and materials modelling metadata
as templated in the MODA [9] developed by the EMMC, the latter of which continues to be expanded
by members of the EMMC to include business KPIs (to be released publicly in the coming year).
Nomenclature of physical test data and its associated metadata is supported by test standards
(ASTM, ISO, SAE, etc.).

Interoperability
General Interoperability Requirements
This section presents general information on interoperability requirements to support the integration
of materials modelling information for business decision-making, before commenting on specific
interoperability options for the database. For software, interoperability is the ability of different
codes to exchange information and data through a common set of exchange formats. Individual codes
must be able to read and write the same file formats and to use the same protocols, both preferably
defined and updated by a standards group. Interoperability can raise technical and organizational
issues. It can affect data ownership (hence licensing is addressed in the Data
Management Plan) and usability for decision-making (i.e. business process interoperability).
Interoperability is achieved through common testing standards (as addressed in the Software Quality
deliverable); product engineering standards; industry/community partnership, as undertaken by the
EMMC interoperability working group with input from software companies; common technology and
IP to reduce the variability in exchange formats (e.g. 3rd party libraries or open source developments);
and standard implementation, as promoted by the EMMC via the concept of common open APIs. The
last of these is explored in this project via D2.5 Report on API definition for models, high-level data,
and data storage platform. However, it is acknowledged that this approach requires an agreed
taxonomy and ontology (or schema) to facilitate the classification and identification of data; ongoing
activities within the EMMC are expected to target this issue at a high level. Technology barriers for
interoperability, as defined by the EMMC community [9], are:
• Common API interfaces for uploading, retrieval and deciphering of information and data –
as noted, this is the goal of the MuPIF API (D2.5);
• Standard data formats, ontologies, protocols missing due to lack of tools and domain
specific definitions – Granta’s industry schema for composites (Figure 3) will be used to store
application use case data, and the EMMO will continue to be followed and adopted upon
release; Force and COMPOSELECTOR are working together to define CUDS-based data
structures for common data types to achieve interoperability.
• Common nomenclature (i.e. semantics) between different domains – not an issue for
COMPOSELECTOR, but more widely for the application of the database for a solution to
perspective 1 (Introduction);
• Metadata to enable the use/re-use of resources, notably to support confidence of data in
business decision-making – EMMC has established metadata for materials modelling
(MODA) which continues to be expanded upon to support business decisions; metadata will
also be supported by Granta’s composite schema in the context of the project.
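
The first barrier, a common API for uploading and retrieving data, can be pictured with a short
sketch. This is not the MuPIF API and not the interface defined in D2.5: the DataStore contract, the
InMemoryStore class, the copy_item helper and the sample values are all hypothetical. It only shows
that two back ends sharing one small contract can exchange data, with the metadata travelling
alongside each value.

```python
from typing import Any, Protocol


class DataStore(Protocol):
    """Hypothetical minimal contract that any storage back end would satisfy."""

    def upload(self, key: str, value: Any, metadata: dict) -> None: ...

    def retrieve(self, key: str) -> tuple: ...


class InMemoryStore:
    """Toy back end used only to exercise the contract."""

    def __init__(self):
        self._items = {}

    def upload(self, key, value, metadata):
        # Refuse data that arrives without a unit, so it stays decipherable.
        if "unit" not in metadata:
            raise ValueError("metadata must state the unit")
        self._items[key] = (value, dict(metadata))

    def retrieve(self, key):
        return self._items[key]


def copy_item(source: DataStore, target: DataStore, key: str) -> None:
    """Move one item between two stores using only the shared contract."""
    value, metadata = source.retrieve(key)
    target.upload(key, value, metadata)


# Made-up value: a modulus produced by a model and passed to another store.
modelling_side = InMemoryStore()
database_side = InMemoryStore()
modelling_side.upload("E11", 135.0, {"unit": "GPa", "source": "virtual"})
copy_item(modelling_side, database_side, "E11")
```

In practice the agreed taxonomy and metadata matter more than the method names; the contract is
only useful if both sides interpret the keys and units the same way.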

Figure 3 GRANTA MI schema for a generic material class, linking material and process pedigree,
test data, statistical data, design data, and various reports [courtesy Granta Design]

Generic Interoperability Example


A high-level infrastructure for the connection between the GRANTA MI database system and the
MuPIF computational framework is illustrated in Figure 4 from the end-user perspective. The
workflow as denoted by numbers along the arrows in Figure 4 is explained here with relevant material
decision examples:
1. The story of the End-User begins within a decision-making environment which is linked to
KPIs for a material-based design decision (KPIs in this context cover business and
modelling performance). The information for this material/process-based decision can come
from information and data stored within the database, viewed in supporting data mining and
decision-making tools or as evidenced in a report. Now, these tools may exist within the
GRANTA MI suite of tools, or within tools which the database integrates with (CAD, CAE,
PLM). Example decisions include:
a. the need for improvement according to a KPI (e.g. reduction of cost) which indicates
a lack of physical or virtual information to understand the behavior of a material or
part to reduce the number of coupon tests; the comparison of physical and virtual
data may clearly indicate an inconsistency (either more virtual or physical information
is required, or a change in one or the other);
b. a requirement for substitution or a change in the material design (e.g. a material will
no longer be allowed based on upcoming EU REACH regulations), for which a specific
tool may run and output a report against CAS numbers and/or constituents as defined
in the material and process pedigree.
The KPIs will be made publicly available by the EMMC in the coming year.
2. The need for modelling support to address the KPI may be triggered by the Workflow
Manager which sends a request to the MuPIF tool via Granta’s MI:Scripting Toolkit (STK [11])
to execute the simulation workflows, in this project using Python. There are at least two ways
to engage with MuPIF from the Database web portal:
a. Workflow Manager can launch a request for information or simulation to MuPIF, and
a request for information/data to be stored in the Database;
b. MuPIF can call on the database to pull/push data/information for simulations.
This work will focus on the first option. It should also be noted that the development of an
interoperable digital workflow is based on the outline of the modelling workflow as defined by
the MODA template.
3. Input requirements for the codes, and metadata associated with the MuPIF tool and the codes
it reaches out to, are stored in the database. Intermediate and final results from the codes may
be stored in a large data store, or the database, as appropriate. Access control is applied at
the level of the database, beyond which interfaces exist to pull the data for analysis and
downloading to other external tools in a web portal.
4. Physical test data are imported for data-based modelling and/or model validation, and
pedigree is captured for both physical and virtual test data.
5. Analysis of results and comparison against physical test data can take place in the Web
database portal or data analysis tools (step 1) to reassess the KPI. Test data can be remotely
imported directly from equipment to ensure zero data loss. Reference data can also be added
to the database for data mining by external applications.
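
The comparison of physical and virtual data mentioned in decision 1a and step 5 can be sketched as
follows. The function names, the property names, the numerical values and the 10 % tolerance are
assumptions chosen for illustration; the deliverable does not prescribe a comparison metric.

```python
def relative_deviation(physical, virtual):
    """Relative difference of a virtual value from its (non-zero) physical reference."""
    return abs(virtual - physical) / abs(physical)


def flag_inconsistencies(physical, virtual, tolerance):
    """Return property names that need attention.

    A property is flagged when it is missing from either data set, or when the
    virtual value deviates from the physical one by more than the tolerance.
    """
    flagged = []
    for name in sorted(set(physical) | set(virtual)):
        if name not in physical or name not in virtual:
            flagged.append(name)
        elif relative_deviation(physical[name], virtual[name]) > tolerance:
            flagged.append(name)
    return flagged


# Made-up values for one material, keyed by property name.
physical_tests = {"tensile_modulus_GPa": 130.0, "tensile_strength_MPa": 1500.0}
simulations = {"tensile_modulus_GPa": 133.0, "tensile_strength_MPa": 1800.0}

needs_attention = flag_inconsistencies(physical_tests, simulations, tolerance=0.10)
```

A flagged property would then prompt either further physical testing or a revised simulation, which
is the decision the KPI in step 1 is meant to support.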

Specific Interoperability Example


Granta Design has undertaken work to prove the concept of interoperability with the MuPIF
framework by using the example simulation scripts that are part of the MuPIF installation. Although
the example is simple, it is sufficient to demonstrate data exchange between
GRANTA MI and MuPIF via Python scripting. Specifically, the workflow retrieves input data from
GRANTA MI for MuPIF to start the modelling chain, and subsequently writes the model outputs back
to GRANTA MI at the end of the calculations. MuPIF workflows have a generic API that enables input
parameters to be set, the workflow to be executed, and results to be collected for general use. The
specific workflow for Example01 is detailed below:

1) MuPIF Example01 requires 3 inputs, hardcoded in the script:
o Time
o Target time
o Time step number
2) Granta removed the hardcoded inputs and replaced them with a call to the GRANTA MI
system, where these inputs are stored in dedicated database fields, inside a temporary
“MuPIF Example01 test” record:

Figure 5 Screenshot of GRANTA MI database simulation record demonstrating
model inputs and placeholders for results.

In Figure 5 above, only the simulation inputs are initially present in the record, and the fields to store
outputs (Simulation results) are empty. After running the Example01 script, the simulation outputs are
imported into the database, together with additional metadata, as shown in Figure 6 below. The
decision to store simulation outputs within the same record which contains the inputs, metadata,
etc., is for the convenience of the end-user, but is not prescriptive. Different database record profiles
are possible depending on how the data in the database will be used downstream by end-users (e.g.
data/information can be formatted for display in other business-decision tools).

Figure 6 Screenshot showing the addition of simulation results with visual analysis
capabilities, held within the database record
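
The round trip shown in Figures 5 and 6 (read inputs from a record, run the workflow, write outputs
back to the same record) can be sketched in plain Python. The sketch uses neither the MI:Scripting
Toolkit nor MuPIF: the dictionary stands in for the database record, run_placeholder_workflow stands
in for the modelling chain and does not reproduce what Example01 computes, and the input values are
assumed.

```python
# A dictionary standing in for the temporary "MuPIF Example01 test" record.
record = {
    "inputs": {"time": 0.0, "target_time": 1.0, "time_step_number": 0},
    "results": {},   # empty until the workflow has run
    "metadata": {},
}


def run_placeholder_workflow(inputs):
    """Stand-in for the modelling chain: it only reports the time span covered."""
    return {"elapsed": inputs["target_time"] - inputs["time"]}


def execute(record, workflow):
    """Read inputs from the record, run the workflow, and store outputs in place."""
    inputs = dict(record["inputs"])       # copy so the stored inputs stay untouched
    results = workflow(inputs)
    record["results"] = results           # outputs share the record with the inputs
    record["metadata"]["workflow"] = workflow.__name__
    return record


execute(record, run_placeholder_workflow)
```

Storing results in a separate, linked record would need only a change to the two assignment lines in
execute, which mirrors the point that the single-record layout is a convenience rather than a
requirement.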
Conclusions
The GRANTA MI database system is highly configurable and represents an industry standard
database for materials and process information management. GRANTA MI currently integrates or
interoperates with several leading commercial solvers, pre-solvers and CAD packages, and is
actively working towards a solution in COMPOSELECTOR and other projects to facilitate
interoperability with multi-scale modelling frameworks.

Next steps will include:


1. working through MuPIF examples of increasing complexity with relevant open software
identified as end-user codes of interest (M6-12);
2. making the database available to start trials with individual case studies at M12, as anticipated
by D2.4, having met the basic requirements for interoperability with MuPIF;
3. expanding MuPIF interoperability once all models have been defined for the three project
case studies and interoperability established with commercial solvers (M13-24);
4. building and integrating business tools to address relevant KPIs related to materials and
process selection/substitution decisions for manufacturing industries (M13-40);
5. transferring data/information relevant for the specific workflows of the projects,
visualization and data analytics, supporting business KPIs and case study dissemination
(M18-48).

References
[1] GRANTA MI, http://www.grantadesign.com/products/mi/
[2] MuPIF, https://sourceforge.net/projects/MuPIF/
[3] SimPhoNy, http://www.SimPhoNy-project.eu/
[4] nanoHUB, https://nanoHUB.org/
[5] EMMC, Translator’s Working Group, https://emmc.info/translators/
[6] Granta Design Consortia (MDMC, EMIT, AUTOMATIC),
http://www.grantadesign.com/consortia/
[7] EMMO, European Materials Modelling Ontology (work in progress),
https://emmc.info/emmc-workshop-on-interoperability-in-materials-modelling/
[8] 2017 A. DeBaas, L. Russo, Review of Materials Modelling VI. European Commission.
[9] 2017 MODA (Modelling Data Elements) templates. EMMC.
https://emmc.info/moda-workflow-templates/
[10] EMMC, Interoperability working group, https://emmc.info
[11] MI:Scripting Toolkit (STK), http://www.grantadesign.com/products/mi/integration.htm

ANNEX 1 - EMMO

Figure A1.1 EMMO branch example

A branch of EMMO, showing the taxonomy for the materials entity object, is presented in Figure A1.1.
The EMMO is a work in progress and will be further developed by the EMMC. This diagram was
shared at the EMMC Interoperability Workshop [7]. [courtesy EMMC]
