SoftwareX 19 (2022) 101101

Contents lists available at ScienceDirect

SoftwareX
journal homepage: www.elsevier.com/locate/softx

Original software publication

drtsans: The data reduction toolkit for small-angle neutron scattering at Oak Ridge National Laboratory✩
William T. Heller a,∗, John Hetrick b,∗, Jean Bilheux a, Jose M. Borreguero Calvo b, Wei-Ren Chen a, Lisa DeBeer-Schmitt a, Changwoo Do a, Mathieu Doucet a, Michael R. Fitzsimmons a, William F. Godoy b, Garrett E. Granroth a, Steven Hahn b, Lilin He a, Fahima Islam c, Jiao Lin d, Kenneth C. Littrell c, Marshall McDonnell b, Jesse McGaha b, Peter F. Peterson b, Sai Venkatesh Pingali a, Shuo Qian a, Andrei T. Savici a, Yingrui Shang a, Christopher B. Stanley e, Volker S. Urban a, Ross E. Whitfield b, Chen Zhang b, Wenduo Zhou b, Jay Jay Billings a,b,1, Matthew J. Cuneo a,2, Ricardo M. Ferraz Leal a,b,3, Tianhao Wang c,4, Bin Wu a,3

a Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
b Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
c Neutron Technologies Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
d Second Target Station Project Office, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
e Computational Science and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

article info

Article history:
Received 19 February 2022
Received in revised form 21 April 2022
Accepted 3 May 2022

Keywords:
Small-angle neutron scattering
Data reduction
Large-scale user facility

abstract

Data reduction is a critical step in a small-angle neutron scattering experiment. It corrects the data for instrument-specific artefacts, making them ready for analysis and interpretation, as well as for comparison against data collected with different small-angle neutron scattering instruments. Here, the drtsans software package developed at Oak Ridge National Laboratory for data reduction for the EQ-SANS, GP-SANS, and Bio-SANS instruments, which are located at the Spallation Neutron Source and High Flux Isotope Reactor, is described. The software and the rigorous development methods employed have positively impacted the scientific programs on the three SANS instruments.

© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

✩ Notice of copyright: This manuscript has been authored by UT-Battelle, LLC under Contract DE-AC05-00OR22725 with the U.S. Department of Energy (DOE). The U.S. government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
∗ Corresponding authors.
E-mail addresses: hellerwt@ornl.gov (William T. Heller), hetrickjm@ornl.gov (John Hetrick).
1 Current address: RuleLXII, Oak Ridge, TN 37830.
2 Current address: St. Jude Children’s Research Hospital, Memphis, TN 38105.
3 Current address: unknown.
4 Current address: Spallation Neutron Source Science Center, China Spallation Neutron Source, Gongguan, China.

https://doi.org/10.1016/j.softx.2022.101101
2352-7110/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
William T. Heller, John Hetrick, Jean Bilheux et al. SoftwareX 19 (2022) 101101

Code metadata

Current code version: v. 1.2.0
Permanent link to code/repository used for this code version: https://github.com/ElsevierSoftwareX/SOFTX-D-22-00051
Code Ocean compute capsule: N/A
Legal Code License: GNU GPLv3
Code versioning system used: git
Software code languages, tools, and services used: Python
Compilation requirements, operating environments & dependencies: Mantid, Jupyter, h5py, numpy, docutils, jsonschema, lmfit, matplotlib, mpld3, numexpr, pandas, sortedcontainers, tinydb, ipywidgets, access to the ORNL Neutron Sciences computer systems
If available, link to developer documentation/manual: https://code.ornl.gov/sns-hfir-scse/sans/sans-backend/-/blob/next/CONTRIBUTING.rst
Support email for questions: shangy@ornl.gov

1. Motivation and significance

Oak Ridge National Laboratory (ORNL) is home to three small-angle neutron scattering (SANS) instruments [1]. The GP-SANS and the Bio-SANS instruments are located at the High Flux Isotope Reactor (HFIR), while the EQ-SANS instrument is located at the Spallation Neutron Source (SNS). These instruments serve researchers from around the world working in a wide variety of scientific disciplines including structural biology, quantum materials, soft condensed matter physics and materials science. A vital step in an experiment performed with these instruments is data reduction, which takes the raw neutron data collected by the instrument and translates it into a scattering cross-section $d\Sigma(\vec{q})/d\Omega$, where $\vec{q}$ is the neutron momentum transfer, that is ready for data analysis.

From the time the three instruments began operation, in 2007 for the GP-SANS and Bio-SANS and in 2009 for the EQ-SANS, data reduction was accomplished through multiple software tools, such as macros developed by an instrument scientist in IgorPro (Wavemetrics, Inc.; Portland, OR, USA), Mantid [2], and the GRASP application that was developed at the Institut Laue Langevin [3], which is written in Matlab (The MathWorks, Inc.; Natick, MA, USA). The existence of various data reduction software packages created confusion among members of the user communities of the instruments, particularly when performing experiments on more than one of the SANS instruments at ORNL. The differences in how the various software packages functioned were significant enough that operational knowledge did not transfer from one package to another. Further, the software development process was not well-controlled, which allowed anyone to make changes at any time that impacted both underlying function and usage. Many such changes were made without consulting all stakeholders and were often not widely announced. After being unable to reach a consensus among the instrument scientists about the best way forward, ORNL decided to undertake a project to develop a new, unified software package for data reduction for the SANS instruments at ORNL [1]. The result was drtsans, which went into official use as the production data reduction software for the SANS instruments at ORNL in May 2020.

Here, the architecture, development approach and functionality of drtsans are described. An illustrative example of the use of the software and descriptions of the ways in which it is employed by the users of the instruments at ORNL are also presented. The impact of drtsans on the SANS user community at ORNL is discussed. Finally, future development directions are presented.

2. Software description

drtsans is written in Python 3 (https://www.python.org). Some data reduction operations employ routines from the Mantid software package [2] that are written in C++, which helps with the execution speed. Users configure the data reduction to suit their needs, such as by specifying the sample data, the data from the background measurement, options related to the output, sample thickness, and a wide variety of other settings. Then, drtsans reads the parameter set, which is supplied in JSON format (https://www.json.org), and performs the specified reduction. It is highly configurable, which is necessitated by the diversity of experiment-specific approaches to data reduction that must be taken. Once executed, the data reduction process requires no additional user intervention.

The data reduction with drtsans is almost always performed on the Linux computer system available to the users of the Neutron Sciences user facilities at ORNL. This computer system provides access to the raw data, the most recent version of drtsans, and the necessary calibration files. Deploying drtsans on ORNL systems also enabled performance evaluation and optimization of the computationally-intensive routines, such as those in Mantid [2]. In particular, Mantid's "LoadEventNexus" routine was identified as a bottleneck for loading raw data [4,5]. Being a member of a neutron beam time proposal team that has performed research using one of the SANS instruments at ORNL [1] is currently a prerequisite for accessing the systems and the data. More information about the neutron scattering facilities at ORNL and gaining access to the instruments can be found at https://neutrons.ornl.gov/.

2.1. Software architecture

The high-level structure of the drtsans code base is presented in Fig. 1. Both the code for testing drtsans and the actual data reduction code are contained under a single top-level project called sans-backend. Both drtsans and tests are organized into global methods, facility-specific methods and instrument-specific methods. Facility-specific methods are those that are suitable either for working with SANS instruments that use a single wavelength (denoted "Mono"), such as GP-SANS [6] and Bio-SANS [7], or for instruments that use time-of-flight methods (denoted "Tof"), such as the EQ-SANS [8]. This structure makes it possible to add an instrument, such as might take place at the proposed Second Target Station of SNS, with minimal impact on the data reduction code for the other instruments. Each instrument has a top-level application programming interface (API) that ensures that the best practices agreed upon by the instrument team are followed. Some methods are located in the highest level in the structure, and such methods are independent of the facility or instrument. As can be seen in the figure, the same structure is employed for the code in tests that is used for testing the methods in drtsans, except that there are unit tests for individual methods and integration tests for how the methods work together.

Fig. 1. Organization of the drtsans code.

2.2. Development approach

Development of drtsans used, and continues to use, software development processes that ensure the desired outcome. First, the data reduction method is developed by the instrument scientists and is documented in sufficient detail to make it possible to implement as software. The instrument scientists also provide at least one example of how the method should work when applied to data, which may be created by the instrument scientist or involve the use of actual instrument data. Then, software engineers review the information provided by the instrument scientists and request clarification, if necessary. The method and one or more tests are implemented by the software engineers, and the instrument scientists have the opportunity to review the code. The instrument scientists are responsible for inspecting the results for correctness. Key to the success of the development process is the rigorous unit testing of individual methods, together with integration testing. The testing of the developed code uses an automated pipeline to run drtsans against the suite of tests that have been developed and the expected outcomes of the various operations.

At any given time, there are three primary branches of the code that are deployed for instrument scientists and facility users: the development branch (dev), the quality assurance (qa) branch, and the release branch. These three branches are available to the instrument scientists and users through the use of conda environments (https://conda.io). The development branch is where new features are first implemented and it can be used to deploy emergency bug-fixes. Utilizing the development branch for data reduction is more feasible now that the rate of development of new features has decreased significantly. The qa branch is where the fully implemented and tested version with any updates is made available for testing by instrument scientists. Once the qa branch is tested and approved, it moves to the release branch that instrument scientists and users most often employ for data reduction. The development process successfully produced software that the instrument team or teams agree generates correctly-reduced data, and the process ensures that changes being made to the code are thoroughly reviewed before release.

2.3. Software functionalities

All of the methods that are necessary for reducing SANS data into absolute cross-section $d\Sigma(\vec{q})/d\Omega$ were implemented in drtsans. The list of available methods includes, but is not limited to, those for correcting for the per-pixel response of the detector, the wavelength-dependent sample transmission, the blocked beam background, and the "dark current" (shutter closed) background. Methods for calculating the instrument-dependent q-resolution, which is necessary for data analysis, were implemented for all three instruments. All three SANS instruments at ORNL save the data in "event mode", which keeps the arrival time and pixel for each neutron detected, and methods are available for dividing raw data files into individual time-binned data sets during the reduction process, which is very useful for studies of materials that are evolving with time. Methods are also available for performing wedge (sector) binning and for extracting annularly-binned data during the data reduction process to avoid additional processing steps.

drtsans can be invoked in a variety of ways. Importantly, drtsans is readily called from within other Python scripts, which makes it possible to process entire experiments with hundreds of different measurements with a single script. Jupyter notebooks (https://jupyter.org) are frequently used for data reduction. A large number of example Jupyter notebooks have been developed and are available to meet the needs of the vast majority of users of the ORNL SANS instruments. Another approach is to use a scripting interface that was originally developed by C. Do, which provides the flexibility of the Jupyter environment but can be run from a terminal. C. Do was also responsible for developing an approach for invoking drtsans using configurations in comma-separated-value files that are convenient to edit in a spreadsheet, which also makes it easy for a user to reduce an entire experiment at one time using a consistent method. Each instrument team has developed documentation and tutorials that are available on their respective instrument pages at https://neutrons.ornl.gov.

3. An illustrative example

The example presented here uses the scripting interface developed by C. Do. The script is shown in Listing 1. The two paths provided on lines 6 and 7 are for the intended location of the results ("output_path") and the location of the common files required for data reduction ("shared_path"). The block of code that begins with "eq4._" contains a series of entries that configure the parameters for the data reduction, such as when default values are not suitable. The first two lines of this section (lines 14–15) establish the default data reduction parameters to use and the specific template of the parameter file to employ, which is required because new features added to
drtsans often require additional configuration parameters. Then, the experiment-specific data reduction parameters are specified (lines 16–30). Most of these parameters are generally set by the instrument scientist working with a visiting user and are not changed by the user, with the exception of "qbintype" and "numqbins". The set of parameters available includes many that are experiment specific. There are also parameters that make it possible to over-write information that is saved in the metadata of the data files, such as when the sample environment has a different position than is defined as the "standard" position. The parameters used in this example are described in Table 1.

Table 1
Parameters used in the example presented in Listing 1.

Parameter               Meaning
ipts                    The unique identifier for the beam time proposal
darkfilename            A measurement of external sources of background signal
maskfilename            Regions of the detector to ignore during data reduction
sensitivityfilename     A measurement of the relative efficiency of the detector pixels
standardabsolutescale   A multiplicative constant to be applied during reduction to obtain absolute scattering cross-section
thickness               The thickness of the sample in mm
sampleaperturesize      The diameter of the aperture immediately prior to the sample in mm
detectoroffset          A distance that must be applied to obtain the correct sample-to-detector distance, which can change based on the experiment being performed
empty                   The run number of the empty beam transmission measurement that is also used for the determination of the direct beam position on the detector
bkgscatt                The run number of the background signal to be subtracted during reduction
bkgtrans                The measurement of the neutron transmission of equipment causing the background signal
qbintype                The method for spacing the q-values in the output result
numqbins                The number of q-values used when calculating the output result

Then, the sample run number and the run number of its neutron transmission measurement are specified ("scatt4m" and "trans4m"; lines 34–35) and are used to inform the data reduction which data sets are to be used ("samscatt" and "samtrans"). Next, the base portion of the output file names is specified ("filename"). Finally, the drtsans data reduction process is launched ("reduceNow(eq4)"). Note that lines 33–43 can be made into an iterative process using lists (lines 34–35) and a loop enclosing lines 37–43 for processing a set of files that were measured using the same instrument configuration.

     1  #!/usr/bin/env python3
     2  import sys
     3  sys.path.append('/SNS/EQSANS/shared/script/eqsanstools/')
     4  from eqsans_drtsans_script import *
     5
     6  output_path = "/SNS/EQSANS/IPTS-26522/shared/test/"
     7  shared_path = "/SNS/EQSANS/shared/NeXusFiles/EQSANS/2021B_mp/"
     8
     9  # ###########################
    10  # Reduction of data from IPTS-26522
    11  # 4m 2.5a config
    12  # ###########################
    13
    14  eq4 = EQVar('2021B/Default-aug21_qa.json')
    15  eq4._defaultjsonfile = '/SNS/EQSANS/shared/script/eqsanstools/eqsans_reduction_qa.json'
    16  eq4._ipts = "26522"
    17  eq4._outputdir = output_path
    18  eq4._darkfilename = shared_path + "EQSANS_124667.nxs.h5"
    19  eq4._maskfilename = shared_path + "beamstop_mask_4m.nxs"
    20  eq4._sensitivityfilename = shared_path + "Sensitivity_patched_thinPMMA_4m_124972.nxs"
    21  eq4._standardabsolutescale = "5.74715236899679"
    22  eq4._thickness = 1.0
    23  eq4._sampleaperturesize = "10"
    24  eq4._detectoroffset = 80.0
    25  eq4._empty = "124979"
    26  eq4._bkgscatt = "124983"
    27  eq4._bkgtrans = "124980"
    28
    29  eq4._qbintype = "log"
    30  eq4._numqbins = 100
    31
    32
    33  # the run numbers
    34  scatt4m = 124984
    35  trans4m = 124981
    36
    37  print("... reducing data set #" + str(scatt4m) + " at 4m, 2.5A")
    38  eq4._samscatt = str(scatt4m)
    39  eq4._samtrans = str(trans4m)
    40  eq4._filename = "EQSANS_" + str(scatt4m)
    41  print("..... reducing " + eq4._samscatt + " for " + eq4._filename)
    42  reduceNow(eq4)  # comment this line in order to skip
    43  print("..... process complete.")

Listing 1: Example script for configuring and calling drtsans.

The data reduction process produces several files that are saved to the specified location. Two of the files output by the data reduction process, which are plots generated using matplotlib (https://matplotlib.org) during execution of drtsans, are shown in Fig. 2. The reduced data is also provided in ASCII format suitable for import into data analysis packages, such as Sasview [9]. One of the important files output by the reduction is a comprehensive log file in HDF5 format (https://www.hdfgroup.org/). The log file includes important information derived during the data reduction that is not included in the reduced data files, such as the position of the neutron beam on the detector and the values of the neutron transmissions of the sample and background.

4. Impact

There are four main impacts of drtsans on the instrument scientists and users of the SANS instruments at ORNL [1]. First, the development process necessitated that all data reduction methods be documented by the instrument scientists through the description of the methods and the tests that are required by the developers. Documenting how data is to be reduced prompted discussion among the instrument scientists and gave all stakeholders the ability to question and refine the methods. The result is a staff that is confident in the methods and able to present them to interested instrument users.

A second important impact resulting from drtsans stems from the integrated code testing process. The extensive, automated testing that is integral to the development process ensured that the instrument scientists understood how the code was to work and could see it provide the desired result. The staff developed input and output data for the unit tests employed, which ensured that their methods were correctly translated to software. An increased level of confidence in the code is an important outcome from the testing.
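To make concrete how staff-supplied input and expected output can pin down a method, here is a minimal sketch of such a unit test. The function `apply_transmission`, its signature, and the numbers are invented for illustration and are not taken from drtsans; the sketch assumes a single scalar transmission (not the wavelength-dependent case) and uncorrelated uncertainties combined in quadrature.

```python
import math

def apply_transmission(intensity, error, transmission, transmission_error):
    """Divide an intensity by the transmission and propagate the uncertainty.

    Hypothetical helper: assumes the intensity and transmission
    uncertainties are uncorrelated.
    """
    corrected = intensity / transmission
    corrected_error = math.sqrt(
        (error / transmission) ** 2
        + (intensity * transmission_error / transmission**2) ** 2
    )
    return corrected, corrected_error

def test_apply_transmission_matches_worked_example():
    # Input values and expected results stand in for the worked example
    # that an instrument scientist would supply with the method description.
    corrected, corrected_error = apply_transmission(80.0, 8.0, 0.8, 0.06)
    assert math.isclose(corrected, 100.0, rel_tol=1e-12)
    assert math.isclose(corrected_error, 12.5, rel_tol=1e-12)
```

A test written this way can be collected by a runner such as pytest and executed automatically on every change, so a modification that alters the reduced values fails visibly instead of silently changing results.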

The third important impact that drtsans has made is that the controlled development approach and release process ensure that the data reduction software is not being changed by a single individual without other stakeholders being aware of the change. A key challenge that existed with the previous data reduction software for the SANS instruments at ORNL was that a single individual could modify it without the knowledge of others. Results would change, which could be evident in a measurement of a calibrated standard sample, or the approach for operating the software that was used previously would simply cease to function. Unexpected changes would require a frantic search for the reasons for any differences. It was often not possible to find the cause and fix it in a timely manner, which was a source of considerable stress for the instrument scientists and the users of the instruments because access to SANS instruments is a precious commodity. Because of the rigorous testing employed, one person can no longer change a method on the fly without the knowledge of others, which affords a great deal of stability for people performing data reduction, a process that is often rote and mechanical and should never be a source of stress during or after an experiment.

The fourth important impact from drtsans is the performance improvements propagated back to the Mantid framework. Unifying the SANS data reduction software made it possible to identify, understand and address bottlenecks in the loading of raw data using Mantid's "LoadEventNexus" routine. Improvements made to this one routine resulted in 10% to 30% decreases in the wall-clock time for a single data set to be reduced. Achieving faster data reduction workflows on the ORNL computational systems benefits all SANS users [4]. The changes to "LoadEventNexus" also improved the performance of the code for other instruments that use NeXus files [4]. The improved routine became available to all users of Mantid beginning with the 5.1.0 version that was released in September 2020.

Fig. 2. Example output from drtsans using the script presented in Listing 1. The images shown in the Figure are only a portion of the results that are generated during the data reduction process.

5. Conclusions

The development of drtsans provided an opportunity to transition the data reduction software used on the SANS instruments at ORNL from several packages to a single package. The development process, the rigorous testing implemented and the method of deploying the software to the systems of ORNL greatly improved control over the data reduction software that staff and users rely upon for performing their experiments. In principle, drtsans could be extended to enable data reduction on SANS instruments at other neutron scattering facilities, but doing so would require a considerable investment of resources. A suitable graphical user interface is highly desired by instrument scientists and members of the user community who work with low-angle diffraction data, such as is often encountered in studies of superconductors and skyrmions, and one will be pursued in the future. The development of automation for data reduction during experiments with minimal input from staff and users after initially configuring the desired method could prove valuable for users of the facility. Integrating data reduction with data analysis also has potential to greatly improve the user experience, and is being considered for future development.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research used resources at the Spallation Neutron Source and High Flux Isotope Reactor, DOE Office of Science User Facilities operated by the Oak Ridge National Laboratory, USA.

References

[1] Heller WT, Cuneo M, DeBeer-Schmitt L, Do C, He L, Heroux L, et al. The suite of small-angle neutron scattering instruments at Oak Ridge National Laboratory. J Appl Crystallogr 2018;51(2):242–8. http://dx.doi.org/10.1107/S1600576718001231.
[2] Arnold O, Bilheux JC, Borreguero JM, Buts A, Campbell SI, Chapon L, et al. Mantid-data analysis and visualization package for neutron scattering and μSR experiments. Nucl Instrum Methods Phys Res A 2014;764:156–66. http://dx.doi.org/10.1016/j.nima.2014.07.029.
[3] Dewhurst C. GRASP. 2021, Institut Laue Langevin, https://www.ill.eu/users/support-labs-infrastructure/software-scientific-tools/grasp.
[4] Godoy WF, Peterson PF, Hahn SE, Billings JJ. Efficient data management in neutron scattering data reduction workflows at ORNL. In: 2020 IEEE International Conference on Big Data. 2020, p. 2674–80. http://dx.doi.org/10.1109/BigData50022.2020.9377836.
[5] Godoy WF, Peterson PF, Hahn SE, Hetrick J, Doucet M, Billings JJ. Performance improvements on SNS and HFIR instrument data reduction workflows using Mantid. In: Nichols J, Verastegui B, Maccabe AB, Hernandez O, Parete-Koon S, Ahearn T, editors. Driving scientific and engineering discoveries through the convergence of HPC, Big Data and AI. Cham: Springer International Publishing; 2020, p. 175–86. http://dx.doi.org/10.1007/978-3-030-63393-6_12.
[6] Wignall GD, Littrell KC, Heller WT, Melnichenko YB, Bailey KM, Lynn GW, et al. The 40 m general purpose small-angle neutron scattering instrument at Oak Ridge National Laboratory. J Appl Crystallogr 2012;45(5):990–8. http://dx.doi.org/10.1107/S0021889812027057.
[7] Heller WT, Urban VS, Lynn GW, Weiss KL, O’Neill HM, Pingali SV, et al. The Bio-SANS instrument at the High Flux Isotope Reactor of Oak Ridge National Laboratory. J Appl Crystallogr 2014;47(4):1238–46. http://dx.doi.org/10.1107/S1600576714011285.
[8] Zhao JK, Gao CY, Liu D. The extended Q-range small-angle neutron scattering diffractometer at the SNS. J Appl Crystallogr 2010;43(5 Part 1):1068–77. http://dx.doi.org/10.1107/S002188981002217X.
[9] Doucet M, Cho JH, Alina G, Bakker J, Bouwman W, Butler P, et al. SasView version 5.0, Zenodo. 2020, http://dx.doi.org/10.5281/zenodo.3930098.