
Supporting Publications 2012:EN-233

TECHNICAL REPORT OF EFSA

Technical report on the use of Excel/XML files for submission of data to the
Zoonoses system [1]
European Food Safety Authority [2, 3]

European Food Safety Authority (EFSA), Parma, Italy

ABSTRACT
This report summarises the technical aspects and the requirements to be considered in the future
implementation of XML and Excel submission to EFSA of data on zoonoses, antimicrobial resistance,
animal population and food-borne outbreaks in accordance with Directive 2003/99/EC. It applies to
aggregated and sample-based data, and defines the practical support expected by the Member States in
this exercise. In order to move to a single reporting system maintained by EFSA, it suggests using
EFSA’s Data Collection Framework tool and adopting the exchange protocols defined in
“EFSA’s Guidance on Data Exchange”. It highlights the need to maintain the Zoonoses Web
Reporting Application until fully automatic transmission is in place in all the Member States, and
summarises the outcomes of the pilot on AMR isolate-based data submission run in 2011, including
some suggestions on how to extend the ‘EFSA Standard Sample Description (SSD)’ model to support
the collection of AMR isolate-based data.

© European Food Safety Authority, 2012

KEY WORDS
Zoonoses, Zoonotic Agents, Animal population, Antimicrobial Resistance, Food-borne Outbreaks,
Data transmission, XML Submission, Excel Submission, Data Collection Framework, Controlled
Terminology, Standard Sample Description

[1] On request from EFSA, Question No EFSA-Q-2011-00226, approved on 26 January 2012.
[2] Correspondence: zoonoses@efsa.europa.eu
[3] Acknowledgement: EFSA wishes to thank the members of the ‘Working Group on the use of XML and Excel files in the
Zoonoses system’: Cristian Apostu, Matthias Hartung, Elina Lahti, Eileen O'Dea, Peter Sewell, Verena Spiteller,
Attila Tarpai and EFSA staff: Fabrizio Abbinante, Stefano Cappè, Marco Leoni, Jane Richardson and Kenneth Mulligan
for the support provided to this technical output.

Suggested citation: European Food Safety Authority; Technical report on the use of Excel/XML files for submission of data
to the Zoonoses system. Supporting Publications 2012:EN-233. [33 pp.]. Available online:
http://www.efsa.europa.eu/en/publications


Technical report on the use of Excel/XML files
for submission of data to the Zoonoses system

SUMMARY
The European Food Safety Authority (EFSA) is charged with coordinating the annual reporting of
zoonoses, zoonotic agents, animal population, antimicrobial resistance and food-borne outbreaks in the
European Union under Directive 2003/99/EC, as well as with analysing and summarising the data
collected. For data collection purposes, EFSA currently provides a web-based reporting application
in which data are entered manually and, for food-borne outbreaks and
antimicrobial resistance data, the possibility of submitting data as XML (Extensible Markup Language) files.

EFSA is extending the use of XML in order to fully automate the annual submission of data and to
improve the quality of the information received by minimising the errors due to manual data entry. In
addition, a gradual move to collection of sample-based data by XML is foreseen. At meetings of the Task
Force on Zoonoses Data Collection (an EFSA network), Member States have expressed the need to be
supported by EFSA in the future creation and submission of XML files. For some Member States,
Excel is currently the most widely used tool for collecting, storing and aggregating data. Those
Member States expressed their need for support in translating data from Excel into XML files.

To address these requirements and to better plan the next steps in using XML files for reporting on
zoonoses, antimicrobial resistance and food-borne outbreaks, EFSA set up a Working Group in order
to agree how EFSA can best support the Member States in the use of Excel and XML files.

The conclusions and recommendations of the ‘Working Group on the use of XML and Excel files in
the Zoonoses system’ are included in this technical report and are for consideration by the Zoonoses
Task Force and by EFSA. The Working Group suggests using EFSA’s Data Collection Framework tool to
support the automatic submission to EFSA of aggregated and sample-based zoonoses data, and adopting
the exchange protocols already defined in “EFSA’s Guidance on Data Exchange”. This will enable a
smooth move to a single reporting system maintained by EFSA. The “Zoonoses Web Reporting
Application” should nevertheless be maintained so that the Member States can continue to report zoonoses
data until fully automatic transmission is in place in all countries. The use of “simple flat
models” instead of “complex data models” is advisable, as is support for “Comma Separated
Values” files in addition to XML and Excel files. The documentation provided to
the Member States should also be improved compared with that provided by EFSA in past
years.


TABLE OF CONTENTS
Abstract
Summary
Table of contents
Background as provided by EFSA
Terms of reference as provided by EFSA
1. Introduction
2. AMR Isolate-based pilot
2.1. Scope of the AMR pilot
2.2. AMR isolate-based data model
2.3. Outcomes of the AMR pilot
2.3.1. The submission process
2.3.2. Data entry to Excel
2.3.3. Transformation from Excel into XML
2.3.4. Submission to Data Collection Framework (DCF)
2.3.5. Data aggregation
2.3.6. Overview of the AMR pilot
3. Submission of XML and Excel files in the Zoonoses system
3.1. Milestones of the XML submission in Zoonoses
3.2. Current submission of data in Zoonoses
3.3. Future submission of data in Zoonoses
3.4. Submission of Zoonoses data through DCF
3.4.1. Exchange Protocol
3.4.2. Validation of pick lists in DCF
3.4.3. Implementation of business rules in DCF
3.5. Supported XML schema for Zoonoses in DCF
3.5.1. XML schema: generic data model vs flat data model
3.6. Controlled Terminology maintenance
3.6.1. Publication of pick lists
3.6.2. Update of pick lists
3.6.3. Notification to Member States
3.7. Support of Excel and CSV files
3.7.1. Excel data models
3.7.2. Excel macros and Excel tools
3.7.3. Re-issuing Excel tools
4. Support expected by Member States for XML/CSV/Excel submission
4.1. Documentation to be provided to the Member States
4.2. Helpdesk
4.3. Training and meetings
4.4. Supporting IT Tools
Conclusions and recommendations
References
Appendices
A. AMR isolate-based data model
B. Proposed revision of the SSD model to accommodate AMR isolate-based reporting
Abbreviations


BACKGROUND AS PROVIDED BY EFSA


EFSA’s Biological Monitoring Unit (BIOMO) is promoting the use of XML in the data transmission
of zoonoses, antimicrobial resistance and food-borne outbreaks data to EFSA. This is to both automate
the annual submission of data and to improve the quality of the information received by minimising
errors due to manual data entry. In addition, XML allows the submission of sample-based data,
whereas the manual data entry of sample-based data is not practical. In September 2010 a meeting
with Member States' IT experts was held in Parma to address how to extend the use of XML files in
annual reporting on zoonoses. It was decided not to extend the use of XML files in 2011 but to
implement XML submission for all table types in 2012. At that meeting, the question of whether to use
Excel files for the submission of data to the Zoonoses system was also discussed, as Excel is
currently the most widely used tool in the national zoonoses reporting systems.

At the Zoonoses Task Force meeting held in Parma on 9-10 October 2010, the possibility of using Excel
files as an intermediate step to allow a smooth migration from manual data entry to submission of
XML files was examined. In this context, some Member States expressed the need to set up a Working
Group comprising Member States' experts. This Working Group should discuss the technical aspects
and the possible solutions on how to use XML and Excel files in zoonoses, antimicrobial resistance
and food-borne outbreaks reporting for both aggregated and sample-based data, and define the practical
support needed by the Member States in this exercise.

In this context, it was also decided to run a pilot in 2011 on isolate-based antimicrobial resistance
(AMR) data with some volunteer reporting countries using XML and the Data Collection Framework
(DCF) tool and adapting the “Standard Sample Description for Food and Feed” to the specific AMR
needs.

In order to concretely support the Member States in their migration to XML transfer and the
submission of XML files to the Zoonoses database, it was also decided to launch a Grant call under
Article 36 in order to co-finance the implementation of the XML data submission at Member States
level.


TERMS OF REFERENCE AS PROVIDED BY EFSA


1. The Biological Monitoring Unit is asked to run a pilot in 2011 with volunteer Member States on
the submission of AMR sample-based data in XML using the Data Collection Framework
(DCF).

2. The Biological Monitoring Unit is asked to set up a Working Group of EFSA which shall:
• monitor the pilot on AMR isolate-based data submission run in 2011 and consider any
potential technical difficulties regarding data extraction from the Member States’ databases
and data transmission to EFSA using the Data Collection Framework tool;
• consider how to use XML and Excel files for the provision of aggregated and sample-based
data on zoonoses, animal population, antimicrobial resistance and food-borne outbreaks;
• investigate the possibility of implementing and maintaining Excel Macros;
• define the IT and non-IT support that the Member States expect from EFSA related to use
of XML files and/or Excel files in the reporting;
• report on the progress done to the Task Force on Zoonoses Data Collection; and
• prepare an EFSA technical report on the subjects.

Members of the Biological Monitoring Unit and members of the EFSA's ITOP Unit will also
participate in the Working Group.


1. Introduction
The European Food Safety Authority (EFSA) is charged with coordinating the annual reporting of
zoonoses, zoonotic agents, animal population, antimicrobial resistance and food-borne outbreaks in the
European Union under Directive 2003/99/EC [4], as well as with analysing and summarising the data
collected. EFSA’s Biological Monitoring Unit (BIOMO) is promoting the use of XML in the
transmission of zoonoses, animal population, antimicrobial resistance and food-borne outbreaks data
to EFSA. This is both to automate the annual submission of data and to improve the quality of the
information received by minimising errors due to manual data entry. In this context, EFSA has set up a
Working Group in order to agree how EFSA can best support the Member States in the use of Excel
and XML [5] files.

As defined in the terms of reference, the “Working Group on the use of XML and Excel files in the
Zoonoses system” has discussed the following issues:

• outcomes of the pilot on Antimicrobial Resistance (AMR) isolate-based data submission;
• use of XML and Excel files for the provision of aggregated and sample-based data on
zoonoses, animal population, antimicrobial resistance and food-borne outbreaks, including
the possibility of implementing and maintaining Excel macros;
• IT and non-IT support that the Member States expect from EFSA related to the use of XML
files and/or Excel files in the reporting.

2. AMR Isolate-based pilot


EFSA analysed data on antimicrobial resistance (AMR) in depth during 2009 and 2010 by producing
two Community Summary reports covering the years 2004-2008 and by finalising, in close
collaboration with the European Centre for Disease Prevention and Control (ECDC), a third Summary
Report for the year 2009. The AMR data analysed in those reports were submitted by the
Member States as aggregated data.

An AMR expert Working Group was set up in 2010 by EFSA [6] to propose specifications on the
optimal and scientifically sound way of reporting and analysing AMR data at EU level. Although
work is still ongoing, the Working Group has already acknowledged the need for AMR data to be
collected and analysed at bacterial isolate level. This proposal was recently discussed with the Task
Force on Zoonoses Data Collection and was well received. A number of reporting
countries put themselves forward to participate in a pilot AMR isolate-based data collection in 2011.

2.1. Scope of the AMR pilot


The AMR pilot had two main objectives. The first objective was to test the collection of AMR data at
a deeper level of granularity and verify whether the collection of data at isolate level enables more
in-depth scientific analysis. AMR isolate-level data give access to the multi-resistance patterns of
isolates, and thus allow identification of multi-resistance and description of the clonal spread of
resistance. Furthermore, information on multi-resistance is of utmost importance in investigating the
association between antimicrobial use and antimicrobial resistance.
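
To illustrate why isolate-level records make this analysis possible where aggregated counts do not, the following Python sketch derives a resistance pattern per isolate and lists the isolates resistant to several antimicrobials. It is an illustration only: the record layout (isolate identifier, antimicrobial, resistant yes/no) and the threshold of three antimicrobials are assumptions, not definitions taken from the pilot data model.

```python
from collections import defaultdict


def resistance_patterns(records):
    """Group per-antimicrobial results by isolate and return each isolate's pattern.

    `records` is an iterable of (isolate_id, antimicrobial, resistant) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    patterns = defaultdict(set)
    for isolate_id, antimicrobial, resistant in records:
        drugs = patterns[isolate_id]  # also registers fully susceptible isolates
        if resistant:
            drugs.add(antimicrobial)
    return {iso: tuple(sorted(drugs)) for iso, drugs in patterns.items()}


def multi_resistant(patterns, threshold=3):
    """Isolates resistant to at least `threshold` antimicrobials (assumed cut-off)."""
    return sorted(iso for iso, drugs in patterns.items() if len(drugs) >= threshold)
```

Aggregated tables only report how many isolates were resistant to each antimicrobial, so the per-isolate combinations computed here cannot be recovered from them.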

The second objective, more technical and analysed in this report, was to test the submission of XML
files and Excel files to the Zoonoses system using the Data Collection Framework (DCF). In addition,
it was also possible to test the aggregation of isolate-based data and the migration of the aggregated
data into the Zoonoses database.

[4] Directive 2003/99/EC of the European Parliament and of the Council of 17 November 2003 on the monitoring of zoonoses
and zoonotic agents, amending Council Decision 90/424/EEC and repealing Council Directive 92/117/EEC. OJ L 325,
12.12.2003, p. 31–40.
[5] XML = eXtensible Markup Language
[6] Under mandate M-2010-0291


2.2. AMR isolate-based data model


The AMR data model used in the pilot exercise was developed by EFSA together with the “AMR
expert Working Group” and was based on EFSA’s “Standard Sample Description for Food and Feed
(SSD)” model (EFSA, 2010a). Analysis of the AMR data collection needs showed
that some data fields needed for the collection of AMR isolate-based data were missing from the SSD
model, so it was decided to use an “ad-hoc” data model based on the SSD model but adding the
missing parts.

The so-called “AMR data model” used in the pilot exercise is shown in Annex A. This model
will probably need to be further revised and corrected by AMR experts in order to
improve the collection and analysis of AMR isolate-based data. These modifications, or extensions,
are outside the scope of this technical report.

As an outcome of the pilot exercise, this Working Group carried out a detailed analysis of the differences
between the SSD model and the model used in the AMR pilot, and makes some proposals on how to
extend the SSD model in order to accommodate the AMR isolate-based data collection needs. The
results of this comparison are reported in “Annex B. - Proposed revision of the SSD model to
accommodate AMR isolate-based reporting” and are for consideration by the future “SSD Review
Committee” to be set up by EFSA.

2.3. Outcomes of the AMR pilot


2.3.1. The submission process

During the AMR pilot, two alternative submission routes were made available by EFSA to the
Reporting Countries, as visualized in Figure 1: manual submission of Excel files or manual submission
of XML files. The possibility of using web services instead of manual submission, which is available in
DCF, was not tested or used by any reporting country during the pilot exercise.


Figure 1: Submission process in the AMR pilot

As shown in the diagram, some countries manually entered the data in a specific “data entry”
spreadsheet developed by EFSA and included in the provided “Excel pilot workbook”. The
information was automatically coded into a “to be submitted” Excel spreadsheet. Other countries
exported the data from their database directly into Excel, either in the format requested by EFSA (i.e.
the “to be submitted” format) or into the “data entry” spreadsheet. At this point, some countries created
an XML file from the “to be submitted” file, whereas other countries decided to export the data from
their database directly into XML files. At the end of this process, either an XML file or an Excel file
was ready to be submitted to DCF.

The final stage was for EFSA to aggregate the submitted data and return a summary of the aggregation
to the Member States for review. At the end of the pilot, with the agreement of the reporting country,
some aggregated tables generated from the isolate-based data provided in the pilot were migrated into
the Zoonoses database and used as official “AMR Quantitative Tables” for the
preparation of the 2010 EU Summary Report.

The following text summarizes the conclusions based on issues encountered along the process steps.

2.3.2. Data entry to Excel

The Excel template provided by EFSA proved its worth. Manual data entry into this Excel file was
more convenient than into the "Zoonoses Web Reporting Application". However, a very large Excel file
can cause memory issues and difficulties in transmission. A consideration for the future may be
to distribute its content across more than one file.

Despite the acknowledged help provided by the Excel file, this stage proved to require the highest
workload because the source data had to be mapped to fixed pick lists and field definitions.


The length of the pick lists, and especially the composition of non-hierarchical information within the
“food categories or animal species”, “sampling stage” and “sampling context” pick lists, made the
mapping a daunting task. Therefore, a revision of these pick lists by EFSA is suggested. Beyond this, the
harmonisation of controlled terminologies between EFSA and ECDC is desirable to reduce mapping
efforts.

The automatic coding of entered terms which was implemented in the Excel tool provided in the pilot
functioned flawlessly (on condition that the entered term was valid). This was of significant benefit to
Member States.
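
The automatic coding step can be pictured as a lookup of each entered term in a pick list, with invalid terms reported back rather than coded. The Python sketch below shows the idea only; the pick-list terms and codes are invented placeholders, not values from EFSA's controlled terminology, and the Excel tool used in the pilot may have worked differently.

```python
# Hypothetical pick list: terms and codes are placeholders, not EFSA catalogue values.
SAMPLING_STAGE_PICK_LIST = {
    "farm": "ST01",
    "slaughterhouse": "ST02",
    "retail": "ST03",
}


def code_term(term, pick_list):
    """Return the code for a term, or raise ValueError if the term is not in the pick list."""
    key = term.strip().lower()
    if key not in pick_list:
        raise ValueError(f"'{term}' is not a valid pick-list term")
    return pick_list[key]


def code_column(terms, pick_list):
    """Code a column of entered terms, collecting invalid entries instead of stopping."""
    coded, errors = [], []
    for row_number, term in enumerate(terms, start=1):
        try:
            coded.append(code_term(term, pick_list))
        except ValueError as exc:
            coded.append(None)  # leave the cell uncoded so the reporter can correct it
            errors.append((row_number, str(exc)))
    return coded, errors
```

Returning the row number with each error mirrors the condition noted above: coding only succeeds when the entered term is valid, so invalid rows need to be traceable.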

In addition to the substantial support for the participants supplied by EFSA (i.e. web-conferences at
the beginning of the project, data dictionary, XML schema documentation and diagram, assistance by
email) even further documentation would be useful for getting started in such an exercise.

2.3.3. Transformation from Excel into XML

Due to the flat hierarchy of the schema used in the AMR pilot, the transformation of an Excel file into a
well-structured XML file was straightforward. It is advisable to use exactly the same header as
defined in EFSA's "Guidance on Data Exchange" for the Standard Sample Description message,
as the one used for the pilot was, by mistake, slightly different and needed correction.

If EFSA provides or suggests an IT tool for the transformation of data to XML, it is
recommended that this be a free tool rather than a commercial product, or that any
proprietary tool be complemented with at least one open source option (in the same spirit as the
suggestion to use Comma Separated Values (CSV) files instead of Excel files).
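
As a rough illustration of why a flat schema makes this transformation simple, the following Python sketch turns a flat table (here read from CSV text) into one XML element per row using only the standard library. The tag names `dataset` and `result` and the column names in the usage note are hypothetical; a real file would have to use the element names and the message header defined in the applicable schema and in the "Guidance on Data Exchange".

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_flat_xml(csv_text, root_tag="dataset", row_tag="result"):
    """Turn a flat table (header row plus data rows) into one XML element per row.

    Column names are used directly as element names, so they must be valid XML names.
    The default tag names are placeholders, not names from any EFSA schema.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            if column and value:  # skip empty cells and stray extra columns
                ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `rows_to_flat_xml("isolateId,antimicrobial\nISO-1,AMP\n")` yields a `dataset` element containing a single `result` with two child elements. Because every row maps to one record with no nesting, no knowledge of parent-child relationships is needed, which is the practical advantage of the flat model.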

2.3.4. Submission to Data Collection Framework (DCF)

The file upload via DCF worked well. However, the need to limit the size of uploaded files was
highlighted since large files can cause a time out during the submission phase. It is suggested to split
the data into several smaller files and to alert the reporters to the possible use of .zip files.
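
The suggestion to split large submissions and compress them could be scripted along the following lines. This is only a sketch in Python: the part-naming scheme, the row limit and the choice to bundle all parts in one archive are assumptions made for illustration, and the file sizes and archive layout that DCF actually accepts are not specified here.

```python
import csv
import io
import zipfile


def split_and_zip(header, rows, max_rows_per_file, target):
    """Write rows into several CSV parts of limited size and bundle them in one .zip archive.

    `target` may be a file path or a binary file-like object.
    Returns the names of the parts written to the archive.
    """
    part_names = []
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for start in range(0, len(rows), max_rows_per_file):
            buffer = io.StringIO()
            writer = csv.writer(buffer)
            writer.writerow(header)  # repeat the header so each part is self-describing
            writer.writerows(rows[start:start + max_rows_per_file])
            name = f"part_{start // max_rows_per_file + 1:03d}.csv"
            archive.writestr(name, buffer.getvalue())
            part_names.append(name)
    return part_names
```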

To accelerate the data testing and correction cycle, access to DCF should be granted to Member State
data managers and not only to the zoonoses reporting officers.

Concerning enhancements of DCF, the following points were highlighted:

• Currently DCF supports only Excel files in ‘xls’ format. EFSA should investigate the
possibility of supporting the ‘xlsx’ format as well.
• The load result “Partially inserted” is misleading, as it does not clearly indicate that there was
an error. Use of the phrase “Error(s) in loaded data” would be clearer.
• The user interface could be more user-friendly with regard to the design, alignment, labels and
colours of buttons.
• The performance of DCF in general should be improved in terms of responsiveness of the
application.

2.3.5. Data aggregation

The aggregation of the isolate-based data revealed errors within the provided data that were not
detected by data validation (e.g. due to wrong “isolate codes” reported in the pilot or rows not
submitted to EFSA). The pilot showed the need for human review of the submitted data to ensure
completeness and correctness of the reported samples, and the need to implement business rules
both on the data submitted to DCF and, as additional rules applied by EFSA, on the
aggregated data.
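
The two ideas in this section, a business rule applied to the submitted records and an aggregation of isolate-level records into summary counts, can be sketched in Python as follows. The field names (`isolate_id`, `antimicrobial`, `resistant`) and the specific rule (no isolate/antimicrobial pair reported twice) are hypothetical examples, not the rules or data model actually used by EFSA.

```python
from collections import Counter, defaultdict


def find_duplicate_isolates(records):
    """Example business rule: the same isolate/antimicrobial pair must not be reported twice."""
    counts = Counter((r["isolate_id"], r["antimicrobial"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


def aggregate(records):
    """Count tested and resistant isolates per antimicrobial."""
    summary = defaultdict(lambda: {"tested": 0, "resistant": 0})
    for r in records:
        cell = summary[r["antimicrobial"]]
        cell["tested"] += 1
        if r["resistant"]:
            cell["resistant"] += 1
    return dict(summary)
```

Running a check such as `find_duplicate_isolates` before `aggregate` matters because a duplicated or miscoded isolate silently inflates the aggregated counts, which is the kind of error the pilot only discovered after aggregation.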


2.3.6. Overview of the AMR pilot

It can be concluded that the pilot was a success with the contributing Member States all able to
provide isolate-based AMR data within the agreed three month project window. Furthermore, from a
technical point of view, no major problems were encountered during any phase of the pilot and where
minor issues were raised these were resolved quickly and efficiently.

With regard to the tools made available by EFSA, the Excel template received praise for its ease of
use and provided a common starting point for those submitting Excel files to DCF; similarly, the
mechanism whereby the Excel files were transformed to XML was simple and straightforward.
Finally, from a technical point of view, the submission of files through the DCF interface was
trouble-free, even though the usability of DCF could be improved.

Some elements of the process were more time-consuming, the most obvious being the
mapping of the original data to EFSA's pick lists. In fact, this data management step was identified as
the main workload of the project. Any facilitation of this task would be most useful and will therefore
be considered for future projects. Provision of more documentation on the process would also be
useful. Although not as significant a draw on resources, it was noted that time was also spent validating
the submitted data once they had been aggregated into summary reports. Development of business rules
to validate Member States’ submissions and ensure data quality, as mentioned above, should reduce the
need for this over time. Overall, it was felt that the pilot was a valuable step in the evolution of
zoonoses data collection.


3. Submission of XML and Excel files in the Zoonoses system

3.1. Milestones of the XML submission in Zoonoses

The submission of XML files to the Zoonoses system started in 2008 with a pilot project for
submission of food-borne outbreaks (FBO) data and continued in 2009 with an extended pilot for
submission of AMR aggregated data. For this purpose, EFSA developed an XML schema supporting
the submission of XML files for both FBO and AMR aggregated data. This FBO and AMR XML
Schema, developed in 2008 and extended in 2009, was based on a so-called “generic model” with the
intent to support the submission of all tables in Zoonoses by using the same generic XML Schema.
The complex structure of the XML Schema supporting the generic model and the complexity of the
validation of the pick lists discouraged the Member States from using this schema for reporting
zoonoses data to EFSA; from 2008 to 2011, only two countries were able to submit XML files using
the generic model.

In 2010, EFSA ran a survey of the Member States in order to understand the status of the national data
collection systems in place in the different Member States and their willingness to submit data to
EFSA using XML files. The survey showed the evident need to develop a system able to support not
only the submission of XML files but also the submission of Excel files and to develop a plan for
XML/Excel submission together with the Member States. For this reason, EFSA decided to hold the
first Zoonoses IT Task Force meeting with the Member States’ IT experts, to discuss the outcome of
the survey and to better plan the transmission of XML files to the Zoonoses database. At that meeting,
it was decided to put the development of XML submission on hold and to have one year of discussion,
piloting and trial. To follow up these decisions, in 2011 EFSA set up the “Working Group on the use
of XML and Excel files in the Zoonoses system” (the group responsible for this technical report). It was
also decided to give concrete support to the Member States by launching a call for Grants under Article 36
in order to co-finance 20-month projects for the development of XML submission of Zoonoses data,
and to run a pilot (using DCF) for the submission of AMR isolate-based data using XML and Excel
files.

To conclude these initiatives and to discuss and plan the next steps, EFSA organised the 2nd Zoonoses
IT Task Force meeting with the Member States’ IT experts on 21-22 September 2011.

The following paragraphs summarise the decisions taken and suggestions proposed to the Task Force
on Zoonoses Data Collection by the experts of the “Working Group on the use of XML and Excel files
in the Zoonoses system”, taking into consideration the conclusions of the 2nd Zoonoses IT Task Force
meeting.


3.2. Current submission of data in Zoonoses

An overview of the Zoonoses Reporting System in place for the reporting season in 2011 is shown in
Figure 2.

Figure 2: Zoonoses Reporting System 2011

In 2011, EFSA supported:

• The “Zoonoses Web Reporting Application” for the manual provision of aggregated
data in the Zoonoses database.
• Two “loaders”: one for the ‘generic model’ to report FBO and AMR aggregated data
using XML files only, and one for the ‘simple flat model’ to report AMR isolate-based
data to DCF using XML or Excel files.
• Two “User Management modules”: one in Zoonoses and one in DCF.
• Two “Dictionary Management modules”: one in Zoonoses and one in DCF.

The migration of the AMR isolate-based data submitted to DCF during the pilot was performed by
EFSA after aggregation and after validation (via email) by the Member States.

This reporting system is not efficient: for EFSA, because of the high maintenance costs (duplication of
systems), and for the Member States, because they have to interface with two different systems.

3.3. Future submission of data in Zoonoses

In order to improve the current system, to minimise the maintenance costs and to allow the use of one
single system, EFSA, together with the IT experts of the Member States at the last Zoonoses IT Task
Force meeting, agreed to use DCF for the provision of XML and Excel files to EFSA, and to support
also the submission of CSV files. Therefore, the scenario foreseen for the future Zoonoses reporting
system is illustrated in Figure 3:

Figure 3: Zoonoses Reporting System in future

In this scenario, EFSA will support:

• The “Zoonoses Web Reporting Application” for the manual provision of aggregated
data in the Zoonoses database. This web application is to be maintained until all the
Member States are able to submit all the data via XML or Excel/CSV files.
• One single XML/Excel/CSV “loader” in DCF supporting ‘simple flat models’ for
reporting aggregated data, narrative text forms, isolate-based or sample-based data
using XML, Excel or CSV files.
• One single “Dictionary Management module”
• One single “User Management module”

In the new system, aggregated data submitted through DCF will be automatically imported into the
Zoonoses database, and isolate-based or sample-based data submitted through DCF will be
automatically aggregated and then migrated into the Zoonoses database, where the Member States will
be able to access and validate the submitted data.

Data submitted via XML/Excel/CSV should be corrected or amended by submitting a new
XML/Excel/CSV file instead of manually correcting through the “Zoonoses Web Reporting
Application”.

3.4. Submission of Zoonoses data through DCF

In the scenario described in the previous paragraph, Member States will have several possibilities for
submitting Zoonoses data to EFSA. The process of preparing and submitting Excel, CSV or XML files
to the Zoonoses database via DCF is illustrated in Figure 4, which covers all the possibilities provided
to the Member States.

Figure 4: Submission process in future

3.4.1. Exchange Protocol

For the exchange protocol, the Zoonoses system will embrace the manual (manual post) and automatic
(web-services or FTP) submissions of XML/Excel/CSV files already supported in the DCF system and
defined in the EFSA “Guidance on Data Exchange” (EFSA, 2010b).

Any future modification of this guidance should also be discussed and agreed with all relevant experts,
including Zoonoses experts, in order to take into consideration the needs and requirements of the
provision of Zoonoses data to EFSA. This is anticipated to be the function of the Standard Sample
Description (SSD) and Data Transmission review committee which EFSA intends to convene.

In this context, this Working Group suggests that DCF should accept, for “manual post”, XML messages
having “dataset” as root element. It turned out that DCF ignores all header information contained in
an XML file when it is uploaded manually. For automatic submissions, all the header elements of the
“message” entity should be mandatory. The presence of the “message” element should be checked only
in the latter case. The SSD XML Schema is already able to validate documents having either “message”
or “dataset” as root element.
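
The check suggested above can be sketched as follows. This is only an illustration under stated assumptions: the element names “message” and “dataset” come from this report, whereas the “header” container and the header element names in REQUIRED_HEADER_ELEMENTS are hypothetical placeholders, since the report does not list the header elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical header element names: the report does not list them.
REQUIRED_HEADER_ELEMENTS = ("senderCode", "dcCode")


def local_name(tag):
    """Drop an XML namespace prefix such as '{urn:example}message'."""
    return tag.rsplit("}", 1)[-1]


def check_submission(xml_text, channel):
    """Return a list of problems; an empty list means the file passes."""
    root = ET.fromstring(xml_text)
    root_name = local_name(root.tag)

    if channel == "manual":
        # Header information is ignored on manual upload, so both roots pass.
        if root_name in ("message", "dataset"):
            return []
        return [f"unexpected root element <{root_name}>"]

    # Automatic submission: <message> with a complete header is required.
    if root_name != "message":
        return ["automatic submissions need <message> as root element"]
    # Assumed layout: the header elements sit in a <header> child of <message>.
    header = next((c for c in root if local_name(c.tag) == "header"), None)
    if header is None:
        return ["missing <header> element"]
    filled = {local_name(c.tag) for c in header if (c.text or "").strip()}
    return [f"missing header element <{name}>"
            for name in REQUIRED_HEADER_ELEMENTS if name not in filled]
```

Calling check_submission(text, "manual") accepts either root element, while check_submission(text, "automatic") additionally reports every header element that is absent or empty.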

For the automatic processing of an Excel file by DCF, it is essential that the first row matches the
specified template exactly.
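
A data provider could check this requirement before uploading, along the lines of the sketch below. It assumes the sheet has been exported to CSV so that only the standard library is needed, and it uses a shortened, illustrative template built from a few short names of the AMR isolate-based data model in Appendix A; the real template is the one published by EFSA.

```python
import csv
import io

# Illustrative excerpt only: the real template is the one published by EFSA.
TEMPLATE_HEADER = ["repYear", "repCountry", "lang", "labCode",
                   "zoonose", "sampType", "labIsolCode"]


def read_first_row(csv_text):
    """Return the first row of a CSV export as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)))


def header_problems(first_row, template=TEMPLATE_HEADER):
    """Compare the first row with the template: same names, same order, same case."""
    problems = []
    if len(first_row) != len(template):
        problems.append(
            f"expected {len(template)} columns, found {len(first_row)}")
    for position, (found, expected) in enumerate(zip(first_row, template), start=1):
        if found != expected:
            problems.append(
                f"column {position}: expected {expected!r}, found {found!r}")
    return problems
```

The comparison is deliberately strict (no trimming, no case folding), because the requirement is an exact match.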

3.4.2. Validation of pick lists in DCF

For the submission of XML, Excel and CSV files, validation of the pick list’s terms is undertaken by
DCF during the upload process.

It is suggested, in order to minimise the maintenance of the XML schema and Excel mapping tools
(see 3.7.2- Excel macros and Excel tools), that the validation of pick list’s terms should be kept inside
the DCF loading system and not be included inside the XML Schema. In this way, the pick lists
validation will be the same for submitting data in XML, in Excel or in CSV format.

3.4.3. Implementation of business rules in DCF

Business rules applicable to the Zoonoses data collection should all be implemented inside DCF. As in
the case of validation of pick list terms, this will minimise the maintenance costs and will allow
the use of the same tool for all the input types, i.e. the same business rules will be used when
submitting data in XML, in Excel or in CSV format.
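
The benefit of keeping pick list validation and business rules in one place can be illustrated with the sketch below: each input format is first reduced to the same list of rows, and a single validation function is then applied. This is not the DCF implementation. The XML layout (a “dataset” root containing “result” elements) and the three-country pick list excerpt are assumptions made for the example; the two business rules (repYear between 1980 and 2020, lang equal to "en") are taken from Appendix A.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Illustrative pick list excerpt; the real lists are maintained by EFSA.
PICK_LISTS = {"repCountry": {"AT", "DE", "IE"}}


def rows_from_csv(text):
    """Read a CSV file with a header row into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_xml(text):
    """Read an assumed <dataset><result>...</result></dataset> layout."""
    return [{child.tag: (child.text or "") for child in result}
            for result in ET.fromstring(text)]


def validate(row):
    """One set of checks, applied whatever the input format was."""
    errors = []
    for field, allowed in PICK_LISTS.items():
        if row.get(field) not in allowed:
            errors.append(f"{field}: {row.get(field)!r} is not in the pick list")
    year = row.get("repYear") or ""
    if not (year.isdigit() and 1980 <= int(year) <= 2020):
        errors.append("repYear: must be between 1980 and 2020")
    if row.get("lang") != "en":
        errors.append('lang: only "en" is allowed')
    return errors
```

Because validate only sees dictionaries, adding a new input format means adding one reader function, not a second copy of the rules.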

3.5. Supported XML schema for Zoonoses in DCF

In 2011, the Zoonoses reporting system supported two different types of XML schema.

The ‘Zoonoses generic model’, used from 2008 to 2011, is based on the principle that any data
collection can be mapped to a model that answers the following logical questions: “What”, “Who”,
“Where”, “When” and “Why”. Such a model can accommodate different data collection needs, as
different tables can be mapped against the same list of generic attributes and elements. In practical
terms, this means that one generic data model (and therefore the same XML schema) can be used to
collect several tables, which is the case of the Zoonoses data collection system.

The ‘AMR Isolate-based model’ is based instead on a simple structure in which the table’s data elements
are mapped against a simple flat file. This model, introduced in spring 2011, proved easy to
understand and implement; after its publication, twelve Member States involved in the AMR pilot
project were able to submit data to EFSA within weeks.

A comparison between the two different schemas is provided in the following paragraphs.

3.5.1. XML schema: generic data model vs flat data model

The use of the original generic schema was advocated by the attendees at the 1st Zoonoses IT Task
Force in September 2010. Whilst the schema was considered comprehensive and allowed one data
model to be used for all zoonosis data types, it was also very complex.

The benefits of a single schema include low maintenance and support for the one-to-many
relationships inherent in the zoonosis sample data between samples and analysis results. Use of a
schema supporting such relationships would reduce the volume of data to be transmitted from Member
States to EFSA due to non-repetition of sample information for each result associated with the sample.

The main disadvantage of the generic schema is its complexity. In addition, many Member States use
Excel as a mid-step in applying EFSA codes to their national data prior to creation of XML data per
the schema and this process is not available when using the complex generic schema.

The flat file schema was discussed in depth at the 2nd Zoonoses IT Task Force meeting in September
2011. The discussion groups were unanimous in advocating adoption of the simple flat file schema.
Significantly, the only Member State to have used the generic schema for AMR data submission also
advocated adoption of the simple flat file schema in preference to the generic one.

The main benefit of the flat file schema is its simplicity, which makes it readily understandable. In
addition, the support EFSA provided for its use in the AMR XML Pilot project, including the Excel
mid-step for EFSA code creation, made its use straightforward for Member States. Although the flat
file schema should be kept as simple as possible, the use of ‘choice elements’ in the schema could
sometimes be considered (e.g. in the AMR data model, two alternative mandatory sections could be
implemented: one to be used for dilution results and one for diffusion results).
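
The ‘choice’ between a dilution section and a diffusion section can be pictured with the following sketch, which works on a flat row instead of an XML schema. The field names are the short names listed in Appendix A; the grouping of the fields into two sections and the error messages are assumptions made for illustration, not part of any published schema.

```python
# Short names from Appendix A, grouped into the two alternative sections.
DILUTION_FIELDS = ("lowest", "highest", "MIC")
DIFFUSION_FIELDS = ("diskConc", "diskDiam", "IZD")


def is_filled(row, field):
    """A field counts as filled when it is present and not empty."""
    return row.get(field) not in (None, "")


def section_errors(row):
    """Require exactly one section, and require that section to be complete."""
    has_dilution = any(is_filled(row, f) for f in DILUTION_FIELDS)
    has_diffusion = any(is_filled(row, f) for f in DIFFUSION_FIELDS)
    if has_dilution == has_diffusion:
        # Either both sections or neither section was provided.
        return ["exactly one of the dilution or diffusion sections must be filled in"]
    chosen = DILUTION_FIELDS if has_dilution else DIFFUSION_FIELDS
    return [f"{field} is mandatory in the chosen section"
            for field in chosen if not is_filled(row, field)]
```

In an XML Schema the same constraint would typically be expressed with an xs:choice between two element groups, which keeps the rule in the schema at the cost of a slightly less flat structure.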

There are challenges associated with the simple schema. These include:
• Necessity for EFSA to maintain separate schemas for each of the Zoonoses data types
• Risk of incorrect de-normalisation even when starting from a correct dataset
• Data volumes are increased due to de-normalisation

Data volume implications are not significant when transmitting aggregated data for Zoonoses or AMR.
Data volume is also not an issue for sample-based AMR data, which is typically a relatively small
dataset. If a move to sample-based data reporting for Zoonoses were to be piloted, data volumes for
some larger Member States could become a challenge, and this would need to be explored with the
Member States affected. However, experience in other EFSA data collection projects using DCF has shown
that the benefits of a simple flat schema far outweigh any issues associated with data volumes.
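
The effect of de-normalisation on data volume can be made concrete with a small sketch. It assumes a nested representation in which each sample carries a list of results (the one-to-many relationship discussed above) and flattens it into one row per result; the field names follow Appendix A, while the codes and values are made up.

```python
def flatten(samples):
    """Turn nested samples into flat rows, repeating sample fields per result."""
    rows = []
    for sample in samples:
        sample_fields = {k: v for k, v in sample.items() if k != "results"}
        for result in sample["results"]:
            rows.append({**sample_fields, **result})
    return rows


def count_values_nested(samples):
    """Number of values transmitted when sample fields are sent once."""
    return sum(len(s) - 1 + sum(len(r) for r in s["results"]) for s in samples)


def count_values_flat(rows):
    """Number of values transmitted when every row is self-contained."""
    return sum(len(row) for row in rows)


# Made-up example: one sample tested against two substances.
samples = [{
    "sampType": "15961", "sampY": 2011, "sampM": 3,
    "results": [
        {"substance": "03561", "MIC": "0.5"},
        {"substance": "03562", "MIC": "4"},
    ],
}]
flat_rows = flatten(samples)
```

Here the nested form carries 7 values and the flat form 10, because the three sample fields are repeated on both rows; the overhead grows with the number of results per sample, which is why it matters mainly for large sample-based datasets.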

In conclusion, Member States are strongly in favour of the use of the simple flat file schema for future
Zoonoses data collection needs. It is also recommended (where the tables involved are similar) to
minimise the number of simple flat file schemas used, as in the case of the “Serovar” and “Phagetypes”
tables. This means that for future Zoonoses data collection needs, the following simple flat file schemas
should be developed:

a. Prevalence tables
b. Serovar and Phagetypes tables
c. AMR Isolate-based tables
d. AMR Quantitative (Aggregated) tables
e. AMR Qualitative (Aggregated) tables
f. AMR Cut-off values tables
g. Animal population table
h. Disease Status tables
i. FBO tables
j. Text Forms

Concerning the use of simple flat file schemas, there are technical aspects to be considered in more
detail, and reporting solutions to be suggested, that are outside the scope of this technical report. For
example, when reporting aggregated data in Zoonoses for prevalence tables, the information regarding
serovars and phagetypes should be an optional section in order to allow the reporting of no positive
findings. In this way, the simple flat schema for Prevalence tables can support the submission of no
positive results and also the submission of results where more than one serovar or phagetype has been
identified from a sample7. All these technical aspects are left to EFSA for analysis and are to be
addressed in the documentation provided to the Member States (see 4.1 - Documentation).

3.6. Controlled Terminology maintenance

The adoption of the simple flat file schema has no implications for validation of correct values against
the EFSA’s controlled terminology values (also known as ‘pick lists’). Since pick list validation will
be done on the DCF platform, rather than in the published XML schema (see 3.4.2- Validation of pick
lists in DCF), pick list validation will remain the same for Member States submitting data in XML, in
Excel or in CSV format.

7
Some of these issues could be solved by starting the collection of sample-based data by adopting or extending the SSD data
model.

In this context, to support the reporting of data to EFSA, it is crucial to create an effective
system for publishing and maintaining the EFSA controlled terminology, together with a
notification system that informs the Member States about any modification to the
published pick lists. The following paragraphs cover these requirements in detail.

3.6.1. Publication of pick lists

Pick lists are to be made available to Member States on the Internet (i.e. the EFSA Website and
through web-services).

Member States should be able to download the pick lists from the Internet or to subscribe to a
dedicated web-service tool which will allow the synchronisation of the EFSA’s pick lists with Member
States’ copies stored in the national systems.

Both a full list of all controlled terminology values and a list of changes since the previous version
will be made available for download or synchronisation, to assist Member States that implement
mapping from national values to EFSA values in their systems.

There are a number of options for the format of publication and these include XML, Excel, CSV and
all of these should be supported.

3.6.2. Update of pick lists

Pick lists for Member States’ use in the submission of data in XML, Excel or CSV format
should be made available as early as possible. The Working Group suggests that EFSA should define a
mechanism for ongoing identification and notification to EFSA of new values that will be needed
during the reporting season. This mechanism should be available to Member States throughout the
year, as analysis results are continuously recorded in the national systems.

The EFSA proposal for annual amendments should be published well in advance of the reporting
period of each year. This will allow consideration of the suggestions by EFSA and the Zoonoses Task
Force group in good time and should contribute to minimising the changes necessary during the
reporting period.

Pick lists will include for each value the ‘Valid from (date)’, ‘Valid to (date)’, ‘Last update (date)’
information to facilitate identification of the changes, the dates of changes and the period for which
values were valid.
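
As an illustration of how these dates might be used on the Member State side, the sketch below checks whether a term was valid on a given day. The dictionary layout and the sample term are hypothetical, and treating the ‘Valid to’ date as inclusive is an assumption, since the report does not say.

```python
from datetime import date


def is_valid_on(term, day):
    """True if the term was valid on the given day.

    Assumptions: 'valid_to' is None for terms still in use, and the
    'valid_to' date itself still counts as valid (inclusive end).
    """
    if day < term["valid_from"]:
        return False
    return term["valid_to"] is None or day <= term["valid_to"]


# Hypothetical pick list term, for illustration only.
term = {
    "code": "00123",
    "valid_from": date(2008, 1, 1),
    "valid_to": date(2010, 12, 31),
    "last_update": date(2010, 12, 31),
}
```

With these dates a reporting system can accept a deprecated code for data that refer to a period in which the code was still valid, and reject it for later periods.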

Notification to Member States of changes should include identification of insertions, amendments and
deletions8 and should be sufficiently detailed to allow reporting officers to know whether the changes
are relevant to their Member State’s data.

Where the changes are not applicable, ensuring synchronisation of the Member State’s national
information system and the EFSA pick lists may not be urgent.
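
A change list of this kind, and the relevance check a reporting officer would make, can be sketched as follows. The sketch assumes that each pick list version is held as a dictionary from code to a small record with a label and a ‘valid_to’ value, and that a “deletion” appears as a term whose ‘valid_to’ has been filled in (terms are deprecated, never physically removed); these structures are illustrative and not an EFSA format.

```python
def change_list(previous, current):
    """Classify the differences between two versions of a pick list."""
    changes = {"inserted": [], "amended": [], "deprecated": []}
    for code in sorted(current):
        new, old = current[code], previous.get(code)
        if old is None:
            changes["inserted"].append(code)
        elif old["valid_to"] is None and new["valid_to"] is not None:
            # A "deletion": the term is kept but flagged as no longer valid.
            changes["deprecated"].append(code)
        elif old != new:
            changes["amended"].append(code)
    return changes


def relevant_changes(changes, codes_in_use):
    """Keep only the changes that touch codes used in the national data."""
    return {kind: [code for code in codes if code in codes_in_use]
            for kind, codes in changes.items()}
```

A Member State whose data use none of the changed codes gets three empty lists from relevant_changes and can postpone synchronisation.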

3.6.3. Notification to Member States

In order to allow data transmitters to register their interest and receive automatic notifications of
changes in the EFSA pick lists, the Working Group proposes that interested users in the
Member States (reporting officers as well as national reporters or data managers) be able to subscribe
to a notification system that provides full lists, change lists and the re-issued Excel Mapping Tool
(see 3.7.2 - Excel macros and Excel tools) whenever updated by EFSA. Subscribers should be able to
choose their preferred download format in the subscription setup.

8
No term should be physically deleted from the Controlled Terminology; a “deletion” of a term will result in flagging the
term as “deprecated” and filling in its “Valid to (date)”.

3.7. Support of Excel and CSV files

The DCF allows the submission of Excel files and also CSV files without additional need of
enhancing the DCF. The Working Group suggests supporting both formats for the submission of
Zoonoses data to EFSA.

The CSV file format is not proprietary, has no limitations on the number of rows that can be provided
and is not linked to a specific version of any proprietary tool; it can therefore be better supported
over time. Member States can use Excel to prepare the data for submission and then convert it to
CSV format to minimise the possible errors due to Excel handling.

Excel can also be used as an intermediate step to transform the national data into XML files to be
submitted to EFSA. If this is the case, EFSA should provide any needed support and guidelines for
this transformation. It is nevertheless not recommended to develop a specific tool to be distributed to
the Member States for converting Excel files into XML files.

3.7.1. Excel data models

The data models defined by EFSA for the provision of both aggregated and sample-based data should
be available also in Excel format and not only in XSD format.

The data models in Excel format should indicate:

• The data elements to be provided for each type of table, with these details:
o Data element code (to be used as reference code in the data model)
o Data element short name (to be used in the corresponding XML schema)
o Data element long name (to be used to refer to the data element in the guidelines)
o Data element label (to be used in the "Zoonoses Web Reporting Application" or in
PDF)
o Data element description
• The type of the data elements (i.e. numeric, text, date)
• The length of the data fields
• The pick lists to be used for the data fields, where a pick list is applicable
• Information about mandatory data fields and optional data fields
• Business rules to be applied to the data elements
• Some examples of valid data elements

3.7.2. Excel macros and Excel tools

For optimising and enhancing the use of Excel and CSV files, EFSA should support the Member
States as much as possible especially in the mapping of the national pick lists onto the EFSA pick lists
and in the translation of the data for submission into “codes” accepted by DCF.

In this context, the Working Group suggests EFSA should not develop and distribute ad-hoc Excel
Macros but should provide Member States with precise guidelines and examples on how to map and
codify the national pick lists against the EFSA pick lists and how to prepare an Excel or CSV file
ready to be uploaded in DCF.

A simple mapping tool in Excel format, like the one provided during the AMR Isolate-based pilot,
should be provided by EFSA to support the Member States. This can be achieved for example by
providing three spreadsheets:

• Spreadsheet_1 with the data fields extracted from the national reporting system, in the
order defined by the data model

• Spreadsheet_2 with the EFSA pick lists, into which the national pick lists are imported for mapping

• Spreadsheet_3, where, by means of the VLOOKUP functionality and by using the mapping of
Spreadsheet_2, the extracted data entered in Spreadsheet_1 are converted into “codes” accepted
by DCF.

This mapping tool should be available for download to the Member States.
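
The logic of the three spreadsheets can be expressed in a few lines of code, which may help readers who prepare their files outside Excel. In the sketch below the national rows play the role of Spreadsheet_1, the mapping dictionary that of Spreadsheet_2 and the returned rows that of Spreadsheet_3. All national terms and EFSA codes shown are made up for illustration; unmapped values are reported explicitly, much as a failed VLOOKUP shows up as an error value in the sheet.

```python
def encode_rows(national_rows, mapping):
    """Replace national values with EFSA codes, field by field.

    Returns the coded rows and a list of (row number, field, value)
    entries for every value that has no mapping yet.
    """
    coded, unmapped = [], []
    for line_no, row in enumerate(national_rows, start=1):
        coded_row = {}
        for field, value in row.items():
            lookup = mapping.get(field)
            if lookup is None:
                coded_row[field] = value          # no pick list: copied as is
            elif value in lookup:
                coded_row[field] = lookup[value]  # national value -> EFSA code
            else:
                coded_row[field] = None
                unmapped.append((line_no, field, value))
        coded.append(coded_row)
    return coded, unmapped


# Made-up national terms and codes, for illustration only.
MAPPING = {
    "zoonose": {"SALM-ENT": "00001"},
    "method": {"broth dilution": "001"},
}
NATIONAL_ROWS = [
    {"zoonose": "SALM-ENT", "method": "broth dilution", "labIsolCode": "X1"},
    {"zoonose": "SALM-TYP", "method": "broth dilution", "labIsolCode": "X2"},
]
coded_rows, unmapped_values = encode_rows(NATIONAL_ROWS, MAPPING)
```

The list of unmapped values is the part worth reviewing before upload, since it shows exactly which national terms still need an EFSA code.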

3.7.3. Re-issuing Excel tools

When pick lists are updated, the EFSA simple mapping tool which assists Member States in allocating
EFSA codes to their national data, needs to be re-issued. The Working Group sees the maintenance of
this Excel file by EFSA as a valuable support to Member States.

4. Support expected by Member States for XML/CSV/Excel submission


This Working Group discussed the support expected by the Member States in reporting
XML/Excel/CSV files to Zoonoses.

Some recommendations for EFSA about the needs for documentation, helpdesk, IT tools, training and
meetings are reported in the next paragraphs.

4.1. Documentation to be provided to the Member States

The following documentation is suggested to support the collection of Zoonoses data using DCF via
XML/CSV/Excel files:

• Publication of a “Read me” file to indicate where to find all the information about the
documentation available
• Publication of data elements collected and their definitions in PDF format
• Publication of data models in Excel format (see 3.7.1- Excel data models)
• Publication of XML schemas (both in XSD and as a diagram)
• Publication of EFSA’s pick list and guidelines about the notification system (see 3.6.1-
Publication of pick lists and 3.6.3-Notification to Member States)
• Publication of Excel mapping tools (see 3.7.2 - Excel macros and Excel tools)
• Publication of business rules
• Publication of meta data values applicable for the data collection (e.g. value for "dcCode" as
the unique identifier of the data collection)
• Publication of sample files for each table (in CSV, Excel and XML format)
• Publication of the DCF User Manual (user interface and registration process)

4.2. Helpdesk

It is suggested that the helpdesk provided by EFSA to the Member States for the collection of
Zoonoses data should continue, supporting the Member States in the use of DCF for Zoonoses
monitoring purposes. Therefore, it is recommended that the zoonoses_support@efsa.europa.eu mailbox be
used for any technical or scientific requests for support.

4.3. Training and meetings

It is suggested that EFSA organises training or data collection workshops. These could be provided at
EFSA premises, in conjunction with Task Force on Zoonoses Data Collection and Zoonoses IT Task
Force meetings, or, upon request, in the Member States (to allow wider participation of data managers
and scientists from official reporting organisations and laboratories). These training sessions or
workshops should focus on the following topics:

• Techniques for mapping between reporting organisation’s controlled terminologies and EFSA
pick lists
• Understanding XML and XSDs
• Creating XML files from Excel and the use of the XML map functionality
• Submitting data files via the DCF
• Interpretation of the DCF automated error messages
• Sharing best practice in data collection, preparation and reporting

To support these workshops, training material would be developed and, when possible, use will be
made of existing training materials developed by the ECDC. The materials would include PowerPoint
presentations, frequently asked questions, video clips and worked examples with Excel workbooks.
The training materials would be available for download from the same location as the documentation
described in point 4.1.

4.4. Supporting IT Tools

To support the creation and provision of XML/Excel/CSV files through DCF, it is suggested that
EFSA support Member States with some basic tools to allow the mapping of the national pick lists
against the EFSA pick lists as already described in paragraph 3.7.2 - Excel macros and Excel tools.

In addition, EFSA should suggest some non-proprietary and free tools to allow the conversion of
Excel and CSV files into XML files.

CONCLUSIONS AND RECOMMENDATIONS


The conclusions and recommendations of this Working Group for the submission of XML and Excel
files in the Zoonoses system are summarised below. For details, it is recommended to refer to the
corresponding chapters of the technical report.

AMR Isolate based pilot


As an outcome of the AMR isolate-based reporting pilot, it is suggested:
• to ask AMR experts to revise the list of data elements collected for AMR isolate-based
analysis needs, if there is a need to improve the analysis of antimicrobial resistance tests;
• to continue the use of DCF for reporting AMR isolate-based data to Zoonoses;
• to align and harmonise the “message header” and the “transmission section” of all XML
schemas used in DCF;
• to improve the DCF interface to provide better feedback and readability of data to data
providers;
• pending the introduction of the Isolate concept in the SSD model (see Annex B), to consider
the use of an extended SSD model for reporting AMR isolate-based data.

Automatic transfer of data to Zoonoses


With reference to automatic transfer of data to the Zoonoses database, it is recommended:
• to adopt and use the Data Collection Framework tool (DCF) and, in more detail:
o to support not only submission of Excel and XML files but also of CSV files;
o to adopt and use the validation of pick lists inside DCF;
o to adopt and use the validation of business rules inside DCF;
o to adopt and use the exchange protocols defined in the EFSA "Guidance on Data
Exchange”;
• to implement one single “User Management” module;
• to implement one single “Dictionary Management” module;
• to drop the “generic data model” so far used in Zoonoses;
• to implement “flat data models” for the reporting of aggregated and sample-based data on
Zoonoses;
• to review and harmonise the EFSA’s pick lists:
o between the different Data Collection projects in EFSA;
o between EFSA and ECDC;
• to implement an automatic mechanism of publishing EFSA’s pick lists and a notification
system to inform Member States when there are some changes to the EFSA pick lists;
• not to develop Excel Macros but to provide some “Excel tools” to help data providers in the
mapping of national pick lists against the EFSA pick lists;
• to grant access to the test environment in DCF to all data managers of the Member States.

IT and Non-IT Support


With reference to the IT and non-IT Support expected by Member States for automatic transfer of data
to the Zoonoses database, it is recommended:
• to improve the documentation provided to the Member States;
• to extend the Zoonoses helpdesk in supporting XML/Excel/CSV submission;
• to organise training and ad-hoc meetings for the Member States.
The aforementioned recommendations are addressed to EFSA and to the Zoonoses Task Force to
allow proper planning and implementation of XML/Excel/CSV transmission in Zoonoses.

In addition, the suggestions on how to extend the SSD data model to accommodate AMR isolate-based
data collection needs are reported in “Annex B. - Proposed revision of the SSD model to
accommodate AMR isolate-based reporting” and are for consideration by the future “SSD Review
Committee”.

REFERENCES
European Food Safety Authority, 2010a. Standard sample description for food and feed. EFSA Journal
2010; 8(1):1457 [54 pp.]. doi:10.2903/j.efsa.2010.1457
European Food Safety Authority, 2010b. Guidance on Data Exchange. EFSA Journal 2010; 8(11):1895
[50 pp.]. doi:10.2903/j.efsa.2010.1895
Langual, 2010. The International Framework for Food Description 2010.
http://www.langual.org/langual_Thesaurus.asp

APPENDICES

A. AMR ISOLATE-BASED DATA MODEL

Element Code Element Short Element Type Mandatory Definition Picklist Business Business Rules in SAS
Name name for Rules in
XML/Excel the XML
transfer Schema

AMR.00 Result Code resultCode xs:String(60) Mandatory Unique For the purpose of the 2011 AMR Pilot, EFSA and not the
identification organisation sending the data will build the "Isolate's Result
number of the Code".
AST result (a row
of the data table) The resultCode will be built as follows:
in the
transmitted file. resultCode =[repYear] + [repCountry] + ["Z" + zoonose] + ["S" +
sampType] + ["M"+method] + ["A"+AMR_substance] + "D" +
The result code SampY+SampM+SampD] + ["I" + labIsolCode]
must be
maintained at - "repCountry" is the ISO-3166-1-alpha-2 of the reporting
organisation Country
level and it will - "zoonose","sampType","AMR_Substance" are the
be used in corresponding codes in the pick lists (with as many "0" as needed
further in front of the code to reach the length of 5); Example: "Code =
updated/deletion 123" --> "00123"
operation from - "method" is the corresponding code in the pick list (with as
the senders. many "0" as needed in front to reach the length of 3) Example:
"Code = "12" --> "012"
- "sampD" may be missing, in this case it is replaced by "00"
Examples of resultCode, where sampD ("day of sampling") is
known or not known:
-2009ATZ00121S159613M023A03561D20011203IISOLCX324
-2009ATZ00121S159613M023A03561D20011200IISOLCX324

Supporting Publications 2012: EN-233 23


Technical report on the use of Excel/XML files
for submission of data to the Zoonoses system

AMR.01 Reporting Year repYear xs:decimal(4, Mandato Reporting >=1980 and


0) ry Year <=2020
AMR.02 Reporting Country repCountry xs:string(2) Mandato Reporting LK_Country
ry Country
AMR.03 Language lang xs:string(2) Mandato language LK_Language ="en" ="en"
ry used to fill (for the pilot only
in the free "en" is allowed)
text fields
INFO ABOUT
AMR.04 Laboratory Identification Code labCode xs:string(100) Optional Identificatio Free text
COUNTRY
n code of
AND LABORATORY
the
laboratory
in the
country
performing
susceptibilit
y test (AST)
of the
isolate

AMR.05 Zoonosise zoonose xs:string(5) Mandato Zoonosis LK_Zoonose


ry species,
serovar or
phagetype
AMR.06 Sample Type (food category or sampType xs:string(5) Mandato Food LK_Species
animal species) ry categories
or animal
species of
the sample
AMR.07 Laboratory Isolate Code labIsolCode xs:string(20) Mandato Alphanumer Free text
ry ic code
INFO ABOUT THE TYPE OF SAMPLE given to the
AND THE ISOLATE isolate by
the
laboratory
that
performs
the AST
AMR.08 Total number of isolates labTotIsol xs:decimal(4, Optional Total >0
available in the laboratory 0) Number of
isolates
available in
the
laboratory

Supporting Publications 2012: EN-233 24


Technical report on the use of Excel/XML files
for submission of data to the Zoonoses system

AMR.9 Program Code progCode xs:string(20) Optional Unique Free Text


identification code
of the programme
or project for
which the sample
analysed was
taken
AMR.10 Area of Sampling sampArea xs:string(5) Optional Area or Region or LK_NUTS to be
Province of the specified
sampling (in when
accordance to the sampType
NUTS standard) refers to an
animal
species
AMR.11 Sampling Context sampConte xs:string(5) Optional Sampling Context LK_Sampling
xt Context
AMR.12 Sampling Stage sampStage xs:string(5) Optional Sampling Stage LK_Sampling_Stag
e
AMR.13 Sampling Details sampDetail xs:string(100) Optional Sampling Details Free Text
s
INFO ABOUT AMR.14 Sampling Year sampY xs:decimal(4, Mandatory Year of sampling.
SAMPLING 0)
AMR.15 Sampling Month sampM xs:decimal(2, Mandatory Month of
0) sampling.
AMR.16 Sampling Day sampD xs:decimal(2, Optional Day of sampling.
0)
AMR.17 Isolation Year IsolY xs:decimal(4, Optional Year when the
0) isolation was
completed
AMR.18 Isolation Month IsolM xs:decimal(2, Optional Month when the
0) isolation was
completed
AMR.19 Isolation Date IsolD xs:decimal(2, Optional Day when the
0) isolation was
completed
AMR.20 Susceptibility Test Year analisysY xs:decimal(4, Optional Year when the AST
0) was completed
AMR.21 Susceptibility Test Month analisysM xs:decimal(2, Optional Month when the
0) AST was
completed
AMR.22 Susceptibility Test Day analisysD xs:decimal(2, Optional Day when the AST
0) was completed

Supporting Publications 2012: EN-233 25


Technical report on the use of Excel/XML files
for submission of data to the Zoonoses system

INFO ABOUT METHOD AMR.23 Method method xs:string(3) Mandatory Method for the LK_Method
AND antimicrobial
ANTIMICROBIAL SUBSTA NCE susceptibility
testing
AMR.24 Antimicrobial substance xs:string(5) Mandatory Antimicrobial LK_Substances
Substance Substance
AMR.25 Cut-off value cutoffValue xs:double Mandatory Cut-off value

AMR.2 Lowest lowest xs:double Optional Lowest limit (for lowest>=0. Mandatory
6 (limit) dilution) 008< and when
lowest<=40 Laboratory
96 Method is =
"dilution"
and
highest>lowe
st
AMR.2 Highest highest xs:double Optional Highest limit (for highest>=0 Mandatory
7 (limit) dilution) .008< and when
INFO ABOUT MIC Values
highest<=4 Laboratory
(FOR DILUTION)
096 Method is =
"dilution"
and
highest>lowe
st
AMR.2 MIC values MIC xs:string(5) Optional MIC value (mg/L) LK_MIC_Values Mandatory
8 (mg/L) (Minimal Inhibitory when
Concentration) Laboratory
Method is =
"dilution"

Supporting Publications 2012: EN-233 26


Technical report on the use of Excel/XML files
for submission of data to the Zoonoses system

INFO ABOUT Disk and IZD Values (FOR DIFFUSION)
AMR.29 | Disk concentration (µg) | diskConc | xs:double | Optional | Quantity of antimicrobial per the disk | >0 | Mandatory when Laboratory Method is = "diffusion"
AMR.30 | Disk diameter (mm) | diskDiam | xs:double | Optional | Diameter of the disk used | disk diameter>=6 and disk diameter<=35 | Mandatory when Laboratory Method is = "diffusion" and IZD>=disk diameter
AMR.31 | IZD values (mm) | IZD | xs:double | Optional | IZD value (mm) (Inhibition Zone Diameter) | IZD>=6 and IZD<=35 | Mandatory when Laboratory Method is = "diffusion" and IZD>=disk diameter

AMR.32 | comment | ResComm | xs:string(250) | Optional | Additional comment for the AST result | Free Text
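
The range and conditional rules listed for AMR.26 to AMR.31 can be illustrated with a minimal Python sketch. It is not part of the pilot tooling: the function name, the argument names and the use of the literal strings "dilution" and "diffusion" for the method are assumptions made for illustration (the real method codes come from LK_Method).

```python
from typing import List, Optional


def check_ast_result(
    method: str,
    lowest: Optional[float] = None,
    highest: Optional[float] = None,
    mic: Optional[str] = None,
    disk_conc: Optional[float] = None,
    disk_diam: Optional[float] = None,
    izd: Optional[float] = None,
) -> List[str]:
    """Return a list of rule violations (empty when the record passes)."""
    errors: List[str] = []

    def require(name: str, value) -> bool:
        # Conditionally mandatory element: must be present for this method.
        if value is None:
            errors.append(f"{name} is mandatory when method is {method!r}")
            return False
        return True

    def in_range(name: str, value: float, low: float, high: float) -> None:
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside [{low}, {high}]")

    if method == "dilution":  # assumed label, see lead-in
        has_low = require("lowest", lowest)
        has_high = require("highest", highest)
        require("MIC", mic)
        if has_low:
            in_range("lowest", lowest, 0.008, 4096)
        if has_high:
            in_range("highest", highest, 0.008, 4096)
        if has_low and has_high and not highest > lowest:
            errors.append("highest must be greater than lowest")
    elif method == "diffusion":  # assumed label, see lead-in
        has_conc = require("diskConc", disk_conc)
        has_diam = require("diskDiam", disk_diam)
        has_izd = require("IZD", izd)
        if has_conc and not disk_conc > 0:
            errors.append("diskConc must be greater than 0")
        if has_diam:
            in_range("diskDiam", disk_diam, 6, 35)
        if has_izd:
            in_range("IZD", izd, 6, 35)
        if has_diam and has_izd and izd < disk_diam:
            errors.append("IZD must be at least the disk diameter")
    return errors
```

Calling `check_ast_result("dilution", lowest=0.008, highest=64, mic="0.5")` returns an empty list, while a diffusion record with missing disk values returns one message per missing element.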
Examples of types:

xs:string(40): text field (= String) of maximum 40 characters
xs:decimal(4,0): numeric field made of 4 digits and no decimals, e.g. 2010
xs:double: numeric field with decimals, e.g. 124.56

The data type xs:double and the other numeric data types that allow a decimal separator require the
decimal separator to be “.”; the decimal separator “,” is not allowed.
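
As a rough illustration of these type rules, the following Python sketch checks text values before they are written to an XML or Excel file. The helper names are hypothetical, the integer check assumes that "4 digits" means "up to 4 digits", and the xs:double check deliberately accepts only plain decimal notation (no exponent form), which is stricter than the XML Schema definition.

```python
import re

# Plain decimal notation with "." as the only permitted separator.
_DOUBLE = re.compile(r"[+-]?\d+(\.\d+)?")


def is_string(text: str, max_len: int) -> bool:
    """xs:string(n): text of at most max_len characters."""
    return len(text) <= max_len


def is_decimal(text: str, digits: int) -> bool:
    """xs:decimal(n,0): up to `digits` ASCII digits and no decimals (assumption)."""
    return re.fullmatch(rf"\d{{1,{digits}}}", text, re.ASCII) is not None


def is_double(text: str) -> bool:
    """xs:double (simplified): digits with an optional '.' fraction; ',' is rejected."""
    return _DOUBLE.fullmatch(text) is not None


print(is_decimal("2010", 4))   # a year such as 2010
print(is_double("124.56"))     # accepted: "." separator
print(is_double("124,56"))     # rejected: "," separator
```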


B. PROPOSED REVISION OF THE SSD MODEL TO ACCOMMODATE AMR ISOLATE-BASED REPORTING

The version of the SSD and the associated controlled terminologies were compared with the AMR
isolate-based data model used in the AMR isolate-based pilot (version 2.1). This Annex contains a
description of the main differences between the two data models and some suggestions for the future
“SSD review committee” on how to possibly extend the SSD model in order to accommodate the
AMR isolate-based data collection needs.

Isolate Entity

The major difference between the SSD and the AMR pilot model is that the SSD model lacks the
“Isolate” entity and concept.

The SSD is based upon five key entities: “Laboratory”, “Local Organisation”, “Laboratory Sample”,
“Laboratory Sub-Sample”, and “Result”. This structure allows laboratory samples to be subdivided into
sub-samples, if specified by legislation or analytical protocol, and the associated laboratory results to
be reported for one (see footnote 9) or more sub-samples.

The AMR pilot data model is designed to allow reporting of the results of the antimicrobial
susceptibility testing of bacterial isolates. It is very important to be able to link all the susceptibility
test results of a single isolate in order to investigate multiple antibiotic resistance patterns.

Figure 5 below shows a suggestion on how to extend the entity relationships model of SSD in order to
accommodate the AMR isolate-based model.

Figure 5: Extension of SSD entity relationships to include Isolate entity.

9 In this case no sub-samples should be specified and the “sub-sample” entity is skipped.


An isolate is characterised by the following elements:

• the unique isolate code used as an identifier in the laboratory,
• the species or relevant subtypes of the isolate (including the phagetype or serovar where
available),
• the date of isolation.
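
The role of the “Isolate” entity in linking results can be sketched in Python as follows. This is only an illustration under stated assumptions: the class, the function, the flat row layout and the `resistant` flag are hypothetical, and the sample values are invented for the example rather than taken from the pilot data.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# Hypothetical flat row: (isolate code, species, isolation date, substance, resistant?)
Row = Tuple[str, str, str, str, bool]


@dataclass
class Isolate:
    code: str                 # unique isolate code used in the laboratory
    species: str              # species or relevant subtype
    isolation_date: str       # date of isolation, e.g. "2011-05-17"
    resistant_to: List[str] = field(default_factory=list)


def link_results(rows: Iterable[Row]) -> Dict[str, Isolate]:
    """Group per-substance results under the isolate they belong to."""
    isolates: Dict[str, Isolate] = {}
    for code, species, isolation_date, substance, resistant in rows:
        isolate = isolates.setdefault(code, Isolate(code, species, isolation_date))
        if resistant:
            isolate.resistant_to.append(substance)
    return isolates


rows = [
    ("ISO-1", "Salmonella Typhimurium", "2011-05-17", "ampicillin", True),
    ("ISO-1", "Salmonella Typhimurium", "2011-05-17", "tetracycline", True),
    ("ISO-1", "Salmonella Typhimurium", "2011-05-17", "ciprofloxacin", False),
    ("ISO-2", "Escherichia coli", "2011-06-02", "ampicillin", False),
]
isolates = link_results(rows)
print(isolates["ISO-1"].resistant_to)
```

Because every result carries the isolate code, the multiple resistance pattern of an isolate can be read directly from its `resistant_to` list; without the shared code the individual results could not be combined.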

Missing AMR data elements in the SSD model

The SSD has been developed with a minimum number of mandatory elements, on the understanding
that specific topic areas may need to specify additional elements as mandatory.

This approach is already applied in pesticide monitoring, where optional elements in the SSD related to the
maximum residue limit (MRL) and compliance assessment are mandatory for this data collection,
thereby ensuring that MRL compliance can be assessed at EU level. Since many EU reporting
organisations have already implemented the SSD in their data management systems, any new elements
added to the SSD would be optional; however, AMR reporting guidelines could specify the
topic-specific mandatory fields.

A number of data fields used in the AMR pilot data model do not have a direct correspondence in the
SSD. Some AMR numeric data fields used to collect the cut-off values, the antimicrobial susceptibility
test and the parameters under which the test was performed are missing in the SSD model.

The Minimum Inhibitory Concentration (MIC) for dilution tests is reported as a categorical value
with an associated MICVALUES pick list.

There is no direct correspondence with the numerical values describing substance concentration and
the analytical method performance parameters in the current SSD. As a consequence, the SSD data
model would need to be extended to allow the reporting of these new data elements.

Mandatory SSD data elements missing in the AMR model

Conversely, some mandatory elements in the SSD are not included in the AMR data
model. These include:

• Laboratory sample code
• Country of origin (of the sample)
• Product treatment
• Sampling method
• Laboratory accreditation
• Type of parameter
• Type of result

If SSD is extended and adopted for AMR isolate based reporting, when no information is available the
value “Unknown” can be used to report on “Country of origin”, “Product treatment” and “Sampling
method”. Where missing, the other controlled terminologies could be extended to include the value
“Unknown”. In the latter case, it will be necessary to make this ‘Unknown’ value valid only for AMR
isolate-based data, since other data collections require mandatory completion of these fields based on
existing controlled terminologies which exclude ‘Unknown’.

However, the information reported on “Laboratory sample code”, “Country of origin”, “Laboratory
accreditation”, “Type of parameter” and “Type of result” can be informative when assessing data
quality. The addition of these data fields to the AMR data collection model could therefore also be
considered by the AMR experts who will discuss the revision of the AMR isolate-based model used in
the pilot exercise.
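
The idea of making the value “Unknown” valid only for AMR isolate-based data can be sketched as a validation step that depends on the data collection. The sketch below is hypothetical throughout: the element name, the placeholder terms "TERM_A" and "TERM_B" and the data collection labels are assumptions for illustration and do not reproduce any SSD controlled terminology.

```python
from typing import Dict, Set

# Terms valid for every data collection (placeholder values, not real SSD codes).
BASE_TERMS: Dict[str, Set[str]] = {
    "labAccreditation": {"TERM_A", "TERM_B"},
}

# Extra terms accepted only when the record belongs to the AMR data collection.
AMR_ONLY_TERMS: Dict[str, Set[str]] = {
    "labAccreditation": {"Unknown"},
}


def is_valid_term(element: str, value: str, data_collection: str) -> bool:
    """Check a value against the terminology allowed for one data collection."""
    allowed = set(BASE_TERMS.get(element, set()))
    if data_collection == "AMR":
        allowed |= AMR_ONLY_TERMS.get(element, set())
    return value in allowed


print(is_valid_term("labAccreditation", "Unknown", "AMR"))
print(is_valid_term("labAccreditation", "Unknown", "PESTICIDES"))
```

Keeping the AMR-only terms in a separate set leaves the existing terminologies untouched for the other data collections, which continue to reject “Unknown”.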


Revision of the SSD pick lists

In order to describe the “Isolate” entity added to the SSD model, the SSD’s PARAM controlled
terminology needs to be revised to include all the species, subtypes, phagetypes and serovars
available in the Zoonoses pick list.

In order to describe the food, feed and animal species items used in the AMR data collection
and included in the Zoonoses pick list, the existing FOODEX and MATRIX pick lists in the SSD
need to be reviewed and adapted. This review, especially for FOODEX, should take into consideration
EFSA’s future harmonised food classification system, yet to be published, and assess the possibility of
using the “product code” element of the SSD.

Revision of the AMR Isolate-based pick lists

In the AMR pilot, two pick lists were identified that have no exact correspondence to
the SSD elements. The following proposals for revision are made:

• The pick list “Sampling Stage” at level 1 can be a subset of the SSD’s “Sampling point” data
element, associated with the SSD’s SMPNT controlled terminology.

Some entries of the current “Sampling Stage” level 2 refer to “domestic” and “imported”. This
information should be kept in a different column or derived from the field “Country of origin (of
the sample)” which is currently not collected in Zoonoses for aggregated tables. The addition of
the field “Country of origin (of the sample)” with its specific pick-list should be considered for
future sample-based data collection needs in Zoonoses. For the moment, to support backward
compatibility it is suggested to still collect “domestic” and “imported” in the Level 2 of the
Sampling Stage pick lists, where applicable.

In order to support reporting “Sampling Stage” at level 2 and level 3 an additional “Sample Type”
data element in SSD could be considered with a controlled terminology potentially based on the
“Langual 2010 C: Part of plant or animal codes” (Langual 2010) or alternatively by using the new
EFSA harmonised food classification system after its adoption.

Splitting the “Sampling Stage” pick list into two separate pick lists as follows
may therefore facilitate the mapping to local data structures and to the SSD data model, and so improve
the reporting of this information:

Sampling Stage
Level 1 | Level 2
at AI station
at catering | domestic
at catering | imported
at cutting plant | domestic
at cutting plant | imported
at farm
at feed mill | domestic
at feed mill | imported
at game handling establishment
at hatchery
at hospital or care home
at packing centre
at processing plant | domestic
at processing plant | imported
at retail | domestic
at retail | imported
at slaughterhouse
at zoo
from hunting
in total
unspecified
at border control

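Sample Type
Level 1 | Level 2 | Level 3
animal sample | blood
animal sample | caecum
animal sample | ear
animal sample | eggs
animal sample | faeces
animal sample | fleece
animal sample | foetus/stillbirth
animal sample | lymph nodes
animal sample | meat juice
animal sample | milk
animal sample | mucosal swab | rectum-anal swab
animal sample | mucosal swab | vaginal swab
animal sample | mucosal swab | placental swab
animal sample | neck skin
animal sample | organ/tissue
animal sample | tonsil
food sample | blood
food sample | carcass swabs
food sample | meat
food sample | milk
food sample | neck skin
food sample | tonsil
feed sample
environmental sample | boot swabs
environmental sample | dust
unknown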

• The pick list “Sampling Context” at level 1 and level 2 is very similar to the SSD’s element
“Programme type”, categorised by the SSD’s SRCTYP controlled terminology. However,
“Sampling Context” level 1 and level 2 are partially but not fully covered by the “Programme
type”. This could be overcome by reviewing the controlled terminology SRCTYP and creating in
SSD new elements and controlled terminologies based on “Sampling Context” level 1 and level 2.
For example, one new data element can be added in SSD to describe the design of the programme
and the “Sampler” (i.e. “Sample taker”). The pick list “Sampling Context” at level 3 uses the
EUROSTAT classification and is equivalent to the SSD element “Sampling strategy” and the
associated controlled terminology SAMPSTR.


Splitting the “Sampling Context” pick list into three separate pick lists as
follows may therefore facilitate the mapping to local data structures and to the SSD data model, and so
improve the reporting of this information:

Sampling Context
Level 1 | Level 2
Clinical investigations
Control and eradication programmes
Monitoring
Surveillance
Survey | EU baseline survey
Survey | National survey
Unspecified

Sampler
Level 1
Industry sampling
Official sampling
Official and industry sampling
HACCP and own checks
Not applicable

Sampling Strategy
Level 1
Objective sampling
Census
Convenience sampling
Selective sampling
Suspect sampling
Unspecified
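
One way to represent the proposed split pick lists in a local data structure is as nested dictionaries, one per pick list, with a single helper that checks whether a reported combination of levels exists. The Python sketch below is an assumption-laden illustration: only an excerpt of the entries listed above is encoded, and the variable and function names are hypothetical.

```python
from typing import Dict

# Each pick list is a tree: key = term, value = dict of terms at the next level.
PickList = Dict[str, dict]

# Excerpts of the proposed pick lists (not the complete lists).
SAMPLING_STAGE: PickList = {
    "at catering": {"domestic": {}, "imported": {}},
    "at farm": {},
    "at retail": {"domestic": {}, "imported": {}},
    "at slaughterhouse": {},
}

SAMPLE_TYPE: PickList = {
    "animal sample": {
        "faeces": {},
        "mucosal swab": {"vaginal swab": {}},
    },
    "food sample": {"meat": {}, "milk": {}},
    "feed sample": {},
}

SAMPLING_CONTEXT: PickList = {
    "Monitoring": {},
    "Survey": {"EU baseline survey": {}, "National survey": {}},
}


def is_valid_path(pick_list: PickList, *levels: str) -> bool:
    """Return True when level 1, level 2, ... form an existing branch."""
    node = pick_list
    for level in levels:
        if level not in node:
            return False
        node = node[level]
    return True


print(is_valid_path(SAMPLING_STAGE, "at retail", "imported"))
print(is_valid_path(SAMPLING_STAGE, "at farm", "imported"))
print(is_valid_path(SAMPLE_TYPE, "animal sample", "mucosal swab", "vaginal swab"))
```

With separate trees, the place of sampling, the type of sample and the sampling context can each be validated and mapped independently, which is the benefit the proposed split aims for.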

Conclusions

The comparison of the SSD and AMR isolate-based data models indicates that, with the introduction
in SSD of the “Isolate” entity, with the addition in SSD of some new data elements and the revision of
some pick lists in both SSD and AMR models, it would be possible to use the SSD data model to
report on AMR isolate-based data.

The extension of the SSD model to meet future Zoonoses sample-based data collection needs
for food, feed or animal samples is outside the scope of this Working Group and should be discussed
by the “SSD review committee”, which should also consider and discuss new requirements in
detail (e.g. collection of animal samples, specification of storage of the sample until shelf life, etc.).


ABBREVIATIONS

AMR = Antimicrobial Resistance

CSV = Comma Separated Values

DCF = Data Collection Framework

ECDC = European Centre for Disease Prevention and Control

EFSA = European Food Safety Authority

FBO = Food-borne Outbreak

SSD = Standard Sample Description

XML = eXtensible Markup Language
