ASTM E2077-00(2010)
Standard Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data
ABSTRACT
This specification covers an analytical data interchange protocol for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. It does not provide for the storage of data acquired simultaneously with, and integrated with, the mass spectrometric data but on other detectors. The protocol, which is designed to benefit users of analytical instruments and increase laboratory productivity and efficiency, provides a standardized format for the creation of raw data files, library spectrum files, or results files. Such a file, which has a ".cdf" extension, contains typical header information such as instrument, sample, and acquisition method descriptions, followed by raw, library, or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol. The protocol is intended to perform one or more of the following functions: (1) transfer data between various vendors' instrument systems; (2) provide Laboratory Information Management Systems (LIMS) communications; (3) link data to document processing applications; (4) link data to spreadsheet applications; and (5) archive analytical data.
SCOPE
1.1 This specification covers a standardized format for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. This specification provides a protocol designed to benefit users of analytical instruments and increase laboratory productivity and efficiency.
1.2 The protocol in this specification provides a standardized format for the creation of raw data files, library spectrum files or results files. This standard format has the extension “.cdf” (derived from NetCDF). The contents of the file include typical header information like instrument, sample, and acquisition method description, followed by raw, library or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol.
1.3 This specification does not provide for the storage of data acquired simultaneous to and integrated with the mass spectrometric data, but on other detectors; for example, detectors attached to the mass spectrometer's liquid or gas chromatographic system. Related Specification E1947 and Guide E1948 describe the storage of 2-dimensional chromatographic data.
1.4 The software transfer vehicle used for the protocol in this specification is NetCDF, which was developed by the Unidata Program and is funded by the Division of Atmospheric Sciences of the National Science Foundation.
1.5 The protocol in this specification is intended to (1) transfer data between various vendors' instrument systems, (2) provide Laboratory Information Management Systems (LIMS) communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof. The protocol is a consistent, vendor independent data format that facilitates the analytical data interchange for these activities.
1.6 The protocol consists of:
1.6.1 This specification on mass spectrometric data, which gives the full definitions for each one of the generic mass spectrometric data elements used in implementation of the protocol. It defines the analytical information categories, which are a convenient way for sorting analytical data elements to make them easier to standardize.
1.6.2 Guide E2078 on mass spectrometric data, which gives the full details on how to implement the content of the protocol using the public-domain NetCDF data interchange system. It includes a brief introduction to using NetCDF and describes an API (Application Programming Interface) that is intended to be incorporated into application programs to read or write NetCDF files. It ...
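Clauses 1.2 and 1.4 above say that files carry a ".cdf" extension and use NetCDF as the transfer vehicle. As a rough illustration only, a reader could confirm that a file looks like NetCDF before attempting to parse it. The sketch below relies on the publicly documented NetCDF file signatures (classic files begin with the bytes "CDF" followed by a version byte; netCDF-4 files are HDF5 containers). Neither the signatures nor the hypothetical function name `sniff_netcdf` come from E2077 or E2078.

```python
# File signatures taken from the NetCDF format documentation, not from E2077.
_CLASSIC_VERSIONS = {1: "classic", 2: "64-bit offset", 5: "CDF-5"}
_HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # usually found at offset 0


def sniff_netcdf(path):
    """Return a short format label, or None if the file does not look like NetCDF."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    # Classic-family files: magic "CDF" plus one version byte.
    if len(head) >= 4 and head[:3] == b"CDF":
        return _CLASSIC_VERSIONS.get(head[3])
    # HDF5 container, which may or may not hold netCDF-4 content.
    if head == _HDF5_SIGNATURE:
        return "HDF5 container (possibly netCDF-4)"
    return None
```

A check like this only identifies the container format. It says nothing about whether the file follows the data element definitions of this specification.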
Standards Content (Sample)
NOTICE: This standard has either been superseded and replaced by a new version or withdrawn.
Contact ASTM International (www.astm.org) for the latest information
Designation: E2077 − 00 (Reapproved 2010)
Standard Specification for Analytical Data Interchange Protocol for Mass Spectrometric Data
This standard is issued under the fixed designation E2077; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This specification covers a standardized format for mass spectrometric data representation and a software vehicle to effect the transfer of mass spectrometric data between instrument data systems. This specification provides a protocol designed to benefit users of analytical instruments and increase laboratory productivity and efficiency.
1.2 The protocol in this specification provides a standardized format for the creation of raw data files, library spectrum files or results files. This standard format has the extension “.cdf” (derived from NetCDF). The contents of the file include typical header information like instrument, sample, and acquisition method description, followed by raw, library or processed data. Once data have been written or converted to this protocol, they can be read and processed by software packages that support the protocol.
1.3 This specification does not provide for the storage of data acquired simultaneous to and integrated with the mass spectrometric data, but on other detectors; for example attached to the mass spectrometer’s liquid or gas chromatographic system. Related Specification E1947 and Guide E1948 describe the storage of 2-dimensional chromatographic data.
1.4 The software transfer vehicle used for the protocol in this specification is NetCDF, which was developed by the Unidata Program and is funded by the Division of Atmospheric Sciences of the National Science Foundation.
1.5 The protocol in this specification is intended to (1) transfer data between various vendors’ instrument systems, (2) provide Laboratory Information Management Systems (LIMS) communications, (3) link data to document processing applications, (4) link data to spreadsheet applications, and (5) archive analytical data, or a combination thereof. The protocol is a consistent, vendor independent data format that facilitates the analytical data interchange for these activities.
1.6 The protocol consists of:
1.6.1 This specification on mass spectrometric data, which gives the full definitions for each one of the generic mass spectrometric data elements used in implementation of the protocol. It defines the analytical information categories, which are a convenient way for sorting analytical data elements to make them easier to standardize.
1.6.2 Guide E2078 on mass spectrometric data, which gives the full details on how to implement the content of the protocol using the public-domain NetCDF data interchange system. It includes a brief introduction to using NetCDF and describes an API (Application Programming Interface) that is intended to be incorporated into application programs to read or write NetCDF files. It is intended for software implementors, not those wanting to understand the definitions of data in a mass spectrometric dataset.
1.6.3 NetCDF Users Guide.
2. Referenced Documents
2.1 ASTM Standards:
E1947 Specification for Analytical Data Interchange Protocol for Chromatographic Data
E1948 Guide for Analytical Data Interchange Protocol for Chromatographic Data
E2078 Guide for Analytical Data Interchange Protocol for Mass Spectrometric Data
2.2 Other Standards:
EIA 232
IEEE 488
IEEE 802
Occupational Safety and Health Administration (OSHA)
1 This specification is under the jurisdiction of ASTM Committee E13 on Molecular Spectroscopy and Separation Science and is the direct responsibility of Subcommittee E13.15 on Analytical Data. Current edition approved Nov. 1, 2010. Published November 2010. Originally approved in 2000. Last previous edition approved in 2005 as E2077–00 (2005). DOI: 10.1520/E2077-00R10.
2 For more information on the NetCDF standard, contact Unidata at www.unidata.ucar.edu.
3 For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
4 Available from Electronic Industries Alliance (EIA), 2500 Wilson Blvd., Arlington, VA 22201, http://www.ecaus.org/eia.
5 Available from Institute of Electrical and Electronics Engineers, Inc. (IEEE), 445 Hoes Ln., Piscataway, NJ 08854, http://www.ieee.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
Standards-29 CFR part 1910
NetCDF User’s Guide
2.3 ISO Standards:
ISO 639:1988 Code for the representation of names of languages
ISO 8601:1988 Data elements and interchange formats (First edition published 1988-06-15; with Technical Corrigendum 1 published 1991-05-01)
ISO 9000 Quality Management Systems
ISO/IEC 8802
3. Terminology
3.1 Analytical Information Classes—The Mass Spectrometry Information Model categorizes mass spectrometric information into a number of information “classes.” There is not a direct mapping of these classes into the implementation categories described further below. The implementation categories describe the information hierarchy; the classes describe the contents within the hierarchy. The model presented here only partially addresses these classes. In particular, the last two (Processed Results and Component Quantitation Results) are not described at all. Only Implementation Category 1 is required for compliance within this specification. Information about the other implementation categories is provided for historical interest. The classes defined here are:
3.1.1 Administrative—information for administrative tracking of experiments.
3.1.2 Instrument-ID—information about the instrument that generally does not change from experiment to experiment.
3.1.3 Sample Description—information describing the sample and its history, handling and processing.
3.1.4 Test Method—all information used to generate the raw data and processed results. This includes instrument control, detection, calibration, data processing and quantitation methods.
3.1.5 Raw Data—the data as stored in the data file, along with any parameters needed to describe it.
3.1.6 Processed Results—processing information and values derived from the raw data.
3.1.7 Component Quantitation Results—individual quantitation results for components in a complex mixture.
3.2 Definitions for Administrative Information Class—These definitions are for those data elements that are implemented in the protocol. See Table 1.
TABLE 1 Administrative Information Class
NOTE 1—Particular analytical information categories (C1, C2, C3, C4, or C5) are assigned to each data element under the Category column. The meaning of this category assignment is explained in Section 5.
NOTE 2—The Required column indicates whether a data element is required, and if required, for which categories. For example, M1234 indicates that that particular data element is required for any dataset that includes information from Category 1, 2, 3, or 4. M4 indicates that a data element is only required for Category 4 datasets.
NOTE 3—Unless otherwise specified, data elements are generally recorded to be their actual test values, instead of the nominal values that were used at the initiation of a test.
NOTE 4—A table is not to be interpreted as a table of keywords. The software implementation is independent of the data element names used here, and is in fact quite different. Likewise, the datatypes given are not an implementation representation, but a description of the form of the data element name. That is, a data element labeled as floating point may, for example, be implemented as a double precision floating point number; in this document, it is sufficient to note it as floating point without reference to precision.
Data Element Name             Datatype         Category  Required
dataset-completeness          string           C1        M12345
protocol-template-revision    string           C1        M12345
netcdf-revision               string           C1        M12345
languages                     string           C1 or C5  . . .
administrative-comments       string           C1 or C2  . . .
dataset-origin                string           C1        M4
dataset-owner                 string           C1        . . .
dataset-date-time-stamp       string           C1        M1234
injection-date-time-stamp     string           C1        M1234
experiment-title              string           C1        . . .
experiment-cross-references   string array[n]  C3 or C4
operator-name                 string           C1        M4
experiment-type               string           C1 or C4  . . .
pre-experiment-program-name   string           C2 or C5  . . .
post-experiment-program-name  string           C2 or C5  . . .
number-of-times-processed     integer          C5
number-of-times-calibrated    integer          C5
calibration-history           string array[n]  C5
source-file-reference         string           C5        M4
source-file-format            string           C5
source-file-date-time-stamp   string           C5        M4
external-file-references      string array[n]  C5
error-log                     string           C5
3.2.1 administrative-comments—comments about the dataset identification of the experiment. This free text field is for anything in this information class that is not covered by the other data elements in this class.
3.2.2 calibration-history—an audit trail of file names and data sets which records the calibration history; used for Good Laboratory Practice (GLP) compliance.
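Note 2 of Table 1 packs the "required for which categories" rule into short codes. As an illustration only, the sketch below decodes a code such as M1234 into the category numbers it names and tests whether an element is mandatory for a dataset. The function names are hypothetical, and treating blank or ". . ." entries as "not required" is an assumption. The "any of the listed categories" reading follows the wording of Note 2.

```python
def parse_required(code):
    """Decode a Table 1 Required code such as "M1234" into {1, 2, 3, 4}.

    Blank or ". . ." entries are treated as "not required" (an assumption).
    """
    text = code.replace(" ", "").strip(".")
    if not text:
        return set()
    digits = text[1:]
    if text[0] != "M" or not digits or not set(digits) <= set("12345"):
        raise ValueError(f"unrecognized Required code: {code!r}")
    return {int(digit) for digit in digits}


def is_required(code, dataset_categories):
    """True if the element is mandatory for a dataset holding these categories."""
    return bool(parse_required(code) & set(dataset_categories))
```

For example, `is_required("M4", {1, 4})` is true because the dataset includes Category 4 information, while `is_required("M4", {1})` is false.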
3.2.3 dataset-completeness—indicates which analytical information categories are contained in the dataset. The string should exactly list the category values, as appropriate, as one or more of the following “C1+C2+C3+C4+C5,” in a string separated by plus (+) signs. This data element is used to check for completeness of the analytical dataset being transferred.
6 Available from Occupational Safety and Health Administration (OSHA), 200 Constitution Ave., Washington, DC 20210, http://www.osha.gov.
7 Available from Russell K. Rew, Unidata Program Center, University Corporation for Atmospheric Research, P.O. Box 3000, Boulder, CO 80307-3000.
8 Available from International Organization for Standardization (ISO), 1, ch. de la Voie-Creuse, CP 56, CH-1211 Geneva 20, Switzerland, http://www.iso.org.
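The dataset-completeness string defined in 3.2.3 lends itself to a simple validity check. A minimal sketch, assuming the only legal tokens are C1 through C5 and that duplicate or empty tokens should be rejected; the excerpt does not say how malformed strings are to be handled, and the function name is hypothetical.

```python
VALID_CATEGORIES = ("C1", "C2", "C3", "C4", "C5")


def parse_dataset_completeness(value):
    """Split a string such as "C1+C2" into an ordered list of category tokens."""
    tokens = value.split("+")
    seen = set()
    for token in tokens:
        if token not in VALID_CATEGORIES:
            raise ValueError(f"unknown category token: {token!r}")
        if token in seen:
            raise ValueError(f"duplicate category token: {token!r}")
        seen.add(token)
    return tokens
```

A receiving system could compare the returned list against the categories it actually finds in the file to confirm that the transfer was complete.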
3.2.4 dataset-date-time-stamp—indicates the absolute time of dataset creation relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff.
3.2.4.1 Discussion—This is a synthesis of ISO 8601:1988, which compensates for local time variations.
3.2.4.2 Discussion—The YYYYMMDDhhmmss expresses the local time, and time differential factor (ffff) expresses the hours and minutes between local time and the Coordinated Universal Time (UTC or Greenwich Mean Time, as disseminated by time signals), as defined in ISO 8601:1988. The time differential factor (ffff) is represented by a four-digit number preceded by a plus (+) or a minus (−) sign, indicating the number of hours and minutes that local time differs from the UTC. Local times vary throughout the world from UTC by as much as −1200 h (west of the Greenwich Meridian) and by as much as +1300 h (east of the Greenwich Meridian). When the time differential factor equals zero, this indicates a zero hour, zero minute, and zero second difference from Greenwich Mean Time.
3.2.4.3 Discussion—An example of a value for a datetime would be: 1991,08,01,12:30:23-0500 or 19910801123023-0500. In human terms this is 23 s past 12:30 PM on August 1, 1991 in New York City. Note that the −0500 h is 5 full hours time behind Greenwich Mean Time. The ISO standard permits the use of separators as shown, if they are required to facilitate human understanding. However, separators are not required and consequently shall not be used to separate date and time for interchange among data processing systems.
3.2.4.4 Discussion—The numerical value for the month of the year is used, because this eliminates problems with the different month abbreviations used in different human languages.
3.2.5 dataset-origin—name of the organization, address, telephone number, electronic mail nodes, and names of individual contributors, including operator(s), and any other information as appropriate. This is where the dataset originated.
3.2.6 dataset-owner—name of the owner of a proprietary dataset. The person or organization named here is responsible for this field’s accuracy. Copyrighted data should be indicated here.
3.2.7 error-log—information that serves as a log for failures of any type, such as instrument control, data acquisition, data proces
...
profile) form. Scans are represented as mass-intensity pairs, whether incrementally spaced or not.
library mass spectrum—a data set consisting of one or more spectra derived from a spectral library. This is distinguished from an experimental mass spectral data set in that each spectrum in the library set has associated chemical identification and other information.
3.2.10.2 Discussion—A required Raw Data Information parameter, the number of scans, is used to define the shape of the data in the file, that is, to differentiate between single and multiple spectrum files. Another parameter, the scan number, is used to determine whether multiple scan files have an order or relatedness between scans.
3.2.10.3 Discussion—Some instruments are capable of mixed mode data acquisition, for example, alternating positive/negative EI (Electron Ionisation) or CI (Chemical Ionisation) scans. In order to keep this interchange standard as simple as possible, each scan mode must be treated as a separate data set regardless of how the data are actually stored in the source data file. Alternating positive/negative EI data, for example, will generate two interchange files (possibly simultaneously, depending on the implementation); one for the positive EI scans and one for the negative EI scans. These files may be made mutually cross-referential using their “external-file-references” fields.
3.2.11 external-file-references—an array of strings listing file names referred to from within the raw data file. These could include, for example, tune parameter, method, calibration, reference, sequence, or other files. NetCDF files produced in parallel (such as paired files containing alternating EI/CI scans) should be cross-referenced here.
3.2.12 injection-date-time-stamp—indicates the absolute time of sample injection relative to Greenwich Mean Time. Expressed as the synthetic datetime given in the form: YYYYMMDDhhmmss±ffff. See dataset-date-time-stamp for details of the ISO standard definition of a date-time-stamp.
3.2.13 languages—optional list of natural (human) languages and programming languages delineated for processing by language tools.
3.2.13.1 ISO-639-language—indicates a language symbol and country code from Annex B and D of ISO 639:1988.
3.2.13.2 other-language—indicates the languages and dialect using a user-readable name; applies only for those lan-
...
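The date-time-stamp form described in 3.2.4, fourteen digits of local date and time followed by a signed four-digit time differential, can be parsed with the Python standard library. The following is a minimal sketch rather than a reference implementation: the function name is hypothetical, and it relies on the `%z` directive of `datetime.strptime`, which accepts offsets written as a sign followed by hhmm.

```python
from datetime import datetime, timezone


def parse_date_time_stamp(stamp):
    """Parse a 19-character stamp such as "19910801123023-0500"."""
    # Accept the typographic minus sign (U+2212) as well as ASCII "-".
    normalized = stamp.strip().replace("\u2212", "-")
    if len(normalized) != 19:
        raise ValueError(f"expected 19 characters, got {len(normalized)}")
    # Separators are not allowed in interchange, so the format has none.
    return datetime.strptime(normalized, "%Y%m%d%H%M%S%z")


# The example value from 3.2.4.3: 12:30:23 local time, five hours behind UTC.
local = parse_date_time_stamp("19910801123023-0500")
as_utc = local.astimezone(timezone.utc)
```

Converting to UTC, as in the last line, gives a value that can be compared across files written in different time zones.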