#394 closed (fixed)
mzML reader
Reported by: | Fredrik Levander | Owned by: | olle |
---|---|---|---|
Milestone: | Proteios 2.3 | Keywords: | |
Cc: |
Description
The spectrum reader interface should be implemented for mzML 1.0 once it is released.
Change History (15)
comment:1 Changed 15 years ago by
Status: | new → assigned |
---|
comment:2 Changed 15 years ago by
Traceability note: Previous ticket related to mzML was Ticket #248 (Peak list conversion to mzML).
comment:3 Changed 15 years ago by
Reference note:
Development of the mzML standard can be followed on the PSI wiki page mzML Development (PSI = Proteomics Standards Initiative). At the time of writing, 2008-03-26, the proposed release candidate version was still under review.
comment:4 Changed 15 years ago by
The first revision of the spectrum reader interface for mzML will be based on the current version of the mzML specification, version 0.99.1.
Differences between the mzData and mzML specifications that are relevant for obtaining the necessary spectrum data are described below. The basic encoding scheme for the spectra seems to be the same, although different parameters are allowed.
MzData spectrum tag:
<spectrum id="20">
  ...
  <mzArrayBinary>
    <data precision="64" endian="little" length="43">AAAAwN ... KCYgEA=</data>
  </mzArrayBinary>
  <intenArrayBinary>
    <data precision="64" endian="little" length="43">AAAAAA ... ABgnUA=</data>
  </intenArrayBinary>
</spectrum>
MzML spectrum tag:
<spectrum id="S20" ...>
  ...
  <binaryDataArray arrayLength="43" ...>
    <cvParam cvLabel="MS" accession="MS:1000523" name="64-bit float" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value=""/>
    ...
    <binary>AAAAwN ... KCYgEA=</binary>
  </binaryDataArray>
  <binaryDataArray arrayLength="43" ...>
    <cvParam cvLabel="MS" accession="MS:1000523" name="64-bit float" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value=""/>
    ...
    <binary>AAAAAA ... ABgnUA=</binary>
  </binaryDataArray>
</spectrum>
Byte ordering for mzML is always little endian.
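Illustration note: the encoding described above (Base64 text holding 64-bit floats in little-endian byte order) can be sketched as follows. This is only a minimal sketch, not the Proteios code. The class and method names are hypothetical, and java.util.Base64 (Java 8 or later) is assumed in place of the project's own Base64 utility.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Proteios code base. */
final class LittleEndianArrayDemo {

    /** Decodes Base64 text that holds 64-bit floats in little-endian byte order. */
    static List<Double> decode64BitLittleEndian(String base64) {
        // The MIME decoder ignores line breaks and other whitespace in the text.
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        List<Double> values = new ArrayList<>(raw.length / Double.BYTES);
        while (buf.remaining() >= Double.BYTES) {
            values.add(buf.getDouble());
        }
        return values;
    }

    /** Encodes values the same way, so the demo needs no external file. */
    static String encode64BitLittleEndian(double... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    public static void main(String[] args) {
        String encoded = encode64BitLittleEndian(445.12, 446.25, 1021.5);
        System.out.println(encoded);
        System.out.println(decode64BitLittleEndian(encoded));
    }
}
```

The same bytes read with big-endian order would give unrelated numbers, which is why the byte order has to be known before decoding.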
comment:5 Changed 15 years ago by
Clarification of the mzML spectrum tag.
The "binary" and "cvParam" tags are on the same level, i.e. both have the "binaryDataArray" tag as nearest parent tag. The example mzML spectrum tag should therefore be:
MzML spectrum tag:
<spectrum id="S20" ...>
  ...
  <binaryDataArray arrayLength="43" ...>
    <cvParam cvLabel="MS" accession="MS:1000523" name="64-bit float" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value=""/>
    ...
    <binary>AAAAwN ... KCYgEA=</binary>
  </binaryDataArray>
  <binaryDataArray arrayLength="43" ...>
    <cvParam cvLabel="MS" accession="MS:1000523" name="64-bit float" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value=""/>
    <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value=""/>
    ...
    <binary>AAAAAA ... ABgnUA=</binary>
  </binaryDataArray>
</spectrum>
comment:6 Changed 15 years ago by
severity: | 16 → 4 |
---|
Severity set to 4, since there may be unknown issues with encoding parameter values not used in the mzData standard. Without this uncertainty, the severity could be set to 2, since existing interfaces and classes like PeakListFileInterface and PeakListFileImpl can be used as templates for the new ones.
comment:7 Changed 15 years ago by
(In [2640]) Refs #394. First revision of spectrum reader for mzML files.
- Interface/file io/PeakListFileInterface.java in api/core/ updated in javadocs (no change in functionality). Reference to mzData files removed, as the interface is now used for general peak list files (spectrum files), not only mzData files.
- New interface/file io/SpectrumIdReaderInterface.java in api/core/ for obtaining a list of spectrum id values.
- New class/file io/MzMLFileReader.java in api/core/, that implements the PeakListFileInterface, SpectrumIdReaderInterface, and FileValidationInterface interfaces for mzML files. Only 32-bit float and 64-bit float precision are supported. The public method Double getRetentionTimeInMinutes() in SpectrumInterface is not supported, i.e. it will always return null.
- New test class/file io/TestMzMLFileReader.java in api/core/test/. JUnit test that uses an input mzML file.
comment:8 Changed 15 years ago by
(In [2653]) Refs #404. Refs #394. Refs #406. Refs #407. Interface/file io/SpectrumIdReaderInterface.java in api/core/ extended with a method to specify if the spectrum id values were obtained from spectrum order numbers, instead of explicit id values.
- Interface/file io/SpectrumIdReaderInterface.java in api/core/ extended with new public method boolean isSpectrumIdObtainedFromSpectrumOrderNumber() to specify if the spectrum id values were obtained from spectrum order numbers, instead of explicit id values.
- Class/file io/MzMLFileReader.java in api/core/ updated by adding public method boolean isSpectrumIdObtainedFromSpectrumOrderNumber(). Always returns 'false', since explicit id values are used.
- Class/file io/PklFileReader.java in api/core/ updated by adding public method boolean isSpectrumIdObtainedFromSpectrumOrderNumber(). Always returns 'true', since spectrum order numbers are used.
- Class/file io/MgfFileReader.java in api/core/ updated by adding public method boolean isSpectrumIdObtainedFromSpectrumOrderNumber(). Always returns 'true', since spectrum order numbers are currently used.
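Illustration note: the contract described above might look roughly like this from a caller's point of view. Only the method name isSpectrumIdObtainedFromSpectrumOrderNumber() is taken from the change log. The interface name, the getSpectrumIdList() method, the two reader classes, and the id values are hypothetical stand-ins, not the Proteios API.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the reader contract; getSpectrumIdList() is a hypothetical name. */
interface SpectrumIdReaderSketch {
    List<String> getSpectrumIdList();

    /** True if the ids are positions in the file rather than explicit id values. */
    boolean isSpectrumIdObtainedFromSpectrumOrderNumber();
}

/** mzML-like reader: ids come from explicit "id" attributes. */
final class ExplicitIdReader implements SpectrumIdReaderSketch {
    public List<String> getSpectrumIdList() {
        return Arrays.asList("S19", "S20");
    }

    public boolean isSpectrumIdObtainedFromSpectrumOrderNumber() {
        return false;
    }
}

/** PKL/MGF-like reader: ids are the order numbers of the spectra in the file. */
final class OrderNumberReader implements SpectrumIdReaderSketch {
    public List<String> getSpectrumIdList() {
        return Arrays.asList("1", "2");
    }

    public boolean isSpectrumIdObtainedFromSpectrumOrderNumber() {
        return true;
    }
}

final class SpectrumIdDemo {

    /** Callers can branch on the flag, e.g. when labelling ids for the user. */
    static String describe(SpectrumIdReaderSketch reader) {
        String kind = reader.isSpectrumIdObtainedFromSpectrumOrderNumber()
                ? "order numbers" : "explicit ids";
        return kind + ": " + reader.getSpectrumIdList();
    }

    public static void main(String[] args) {
        System.out.println(describe(new ExplicitIdReader()));
        System.out.println(describe(new OrderNumberReader()));
    }
}
```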
comment:9 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Ticket closed as a first version of an mzML reader has been implemented. When the specification for mzML 1.0 is released, the ticket may be reopened, or a new ticket created.
comment:10 Changed 15 years ago by
(In [2721]) Refs #430. Refs #394. First revision of support of zlib compressed data in the mzML reader.
- Class/file io/Base64Util.java in api/core/ updated:
  - New public static method List<Double> decode(boolean doublePrecision, boolean bigEndian, boolean zLibCompression, String dataString) with optional zlib decompression of the decoded byte array before conversion to a list of double values.
  - Previous public static method List<Double> decode(boolean doublePrecision, boolean bigEndian, String dataString) updated to call the new method with argument "zLibCompression" set to false, in order to avoid duplication of code.
- Class/file io/MzMLFileReader.java in api/core/ updated:
  - New String instance variable "compression" with accessor methods.
  - Private method void processStartElement(XMLStreamReader parser) updated to set the value of the new "compression" String instance variable based on cvParam name property values related to the data compression.
  - New private utility method boolean isZLibCompression() that returns true or false depending on the value of the "compression" String instance variable. Default is false.
  - Private method List<Double> dataItem(...) updated to accept a boolean argument indicating if zlib compression is used, List<Double> dataItem(boolean doublePrecision, boolean bigEndian, boolean zLibCompression, String dataBase64Raw). The compression flag is used when calling the updated decode(...) method in class Base64Util.
  - Private method void processEndElement(XMLStreamReader parser) updated to call the updated method List<Double> dataItem(...) with the compression flag obtained from the new utility method isZLibCompression().
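Illustration note: a decode method with the two signatures listed above could be sketched as below. This is not the code in Base64Util. The class name Base64UtilSketch and the encode(...) helper are hypothetical, and java.util.Base64 and java.util.zip.Inflater are assumed as the building blocks.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical stand-in for illustration; not the Proteios Base64Util class. */
final class Base64UtilSketch {

    /** Base64 decode, optionally inflate, then convert the bytes to doubles. */
    static List<Double> decode(boolean doublePrecision, boolean bigEndian,
                               boolean zLibCompression, String dataString) {
        byte[] raw = Base64.getMimeDecoder().decode(dataString);
        if (zLibCompression) {
            raw = inflate(raw);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw)
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int width = doublePrecision ? Double.BYTES : Float.BYTES;
        List<Double> values = new ArrayList<>(raw.length / width);
        while (buf.remaining() >= width) {
            values.add(doublePrecision ? buf.getDouble() : (double) buf.getFloat());
        }
        return values;
    }

    /** Older signature, delegating with compression switched off. */
    static List<Double> decode(boolean doublePrecision, boolean bigEndian,
                               String dataString) {
        return decode(doublePrecision, bigEndian, false, dataString);
    }

    private static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or invalid zlib data");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Invalid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Demo helper: little-endian 64-bit floats, optionally zlib compressed. */
    static String encode(boolean zLibCompression, double... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        byte[] raw = buf.array();
        if (zLibCompression) {
            Deflater deflater = new Deflater();
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                out.write(chunk, 0, deflater.deflate(chunk));
            }
            deflater.end();
            raw = out.toByteArray();
        }
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        String packed = encode(true, 445.12, 446.25, 1021.5);
        System.out.println(decode(true, false, true, packed));
    }
}
```

Decompression has to happen after the Base64 step and before the byte-to-number conversion, since the compressed bytes are what was Base64 encoded.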
comment:11 Changed 15 years ago by
(In [2792]) Refs #450. Refs #430. Refs #394. First revision of support of referenceable param groups in the mzML reader.
- Class/file io/MzMLFileReader.java in api/core/ updated:
  - New private list variable List<ReferenceableParamGroup> referenceableParamGroupList for storing data for referenceable param groups. The elements are instances of new private inner class ReferenceableParamGroup.
  - Private method void processStartElement(XMLStreamReader parser) updated to store values for referenceable param group data in a "referenceableParamGroup" XML block, and use the appropriate values if a "referenceableParamGroupRef" is encountered.
  - Private method void processEndElement(XMLStreamReader parser) updated to support "referenceableParamGroup" XML blocks.
comment:12 Changed 15 years ago by
(In [2793]) Refs #450. Refs #430. Refs #394. MzML reader updated for safer management of referenceable param groups.
- Class/file io/MzMLFileReader.java in api/core/ updated in private method void processStartElement(XMLStreamReader parser) to create a new, empty referenceableParamGroupList list when a "referenceableParamGroupList" XML tag is encountered.
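Illustration note: the handling of referenceable param groups described in the two change sets above can be sketched with a small StAX loop. This is a simplified sketch, not the MzMLFileReader code. The class and method names are hypothetical, only cvParam accession values are collected, and a map keyed by group id is used here in place of the list of ReferenceableParamGroup instances mentioned in the change log.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch; not the Proteios MzMLFileReader. */
final class ParamGroupSketch {

    /** Returns, per binaryDataArray, the cvParam accessions after resolving refs. */
    static List<List<String>> resolve(String xml) {
        Map<String, List<String>> groups = new HashMap<>();
        List<List<String>> arrays = new ArrayList<>();
        List<String> currentGroup = null; // non-null inside a referenceableParamGroup
        List<String> currentArray = null; // non-null inside a binaryDataArray
        try {
            XMLStreamReader parser = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (parser.hasNext()) {
                int event = parser.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = parser.getLocalName();
                    if (name.equals("referenceableParamGroupList")) {
                        // Start from an empty collection for each list tag.
                        groups = new HashMap<>();
                    } else if (name.equals("referenceableParamGroup")) {
                        currentGroup = new ArrayList<>();
                        groups.put(parser.getAttributeValue(null, "id"), currentGroup);
                    } else if (name.equals("binaryDataArray")) {
                        currentArray = new ArrayList<>();
                        arrays.add(currentArray);
                    } else if (name.equals("cvParam")) {
                        String accession = parser.getAttributeValue(null, "accession");
                        if (currentGroup != null) {
                            currentGroup.add(accession);
                        } else if (currentArray != null) {
                            currentArray.add(accession);
                        }
                    } else if (name.equals("referenceableParamGroupRef")
                            && currentArray != null) {
                        // Treat the reference as if the group's params were inline.
                        List<String> group =
                                groups.get(parser.getAttributeValue(null, "ref"));
                        if (group != null) {
                            currentArray.addAll(group);
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = parser.getLocalName();
                    if (name.equals("referenceableParamGroup")) {
                        currentGroup = null;
                    } else if (name.equals("binaryDataArray")) {
                        currentArray = null;
                    }
                }
            }
            parser.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return arrays;
    }

    public static void main(String[] args) {
        String xml = "<mzML><referenceableParamGroupList>"
                + "<referenceableParamGroup id='g1'>"
                + "<cvParam accession='MS:1000523'/><cvParam accession='MS:1000576'/>"
                + "</referenceableParamGroup></referenceableParamGroupList>"
                + "<spectrum id='S20'><binaryDataArray>"
                + "<referenceableParamGroupRef ref='g1'/>"
                + "<cvParam accession='MS:1000514'/><binary></binary>"
                + "</binaryDataArray></spectrum></mzML>";
        System.out.println(resolve(xml));
    }
}
```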
comment:13 Changed 15 years ago by
(In [2802]) Refs #454. Refs #450. Refs #430. Refs #394. First revision of use of accession number property values instead of name property values in the mzML reader:
- Class/file io/MzMLFileReader.java in api/core/ updated in private method void processStartElement(XMLStreamReader parser) to use "accession" instead of "name" property values for cvParam XML tags when obtaining data for array type, precision, and compression. In order to make the code more readable, the obtained accession string values are compared against new string constants defined for the class, with names indicating what each accession number represents.
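Illustration note: matching on accession numbers through named constants, as described above, could look like the sketch below. The class, constant, and method names are hypothetical. The accessions MS:1000523, MS:1000576, MS:1000514, and MS:1000515 appear in the example earlier in this ticket. MS:1000521 (32-bit float) and MS:1000574 (zlib compression) are taken from the PSI-MS controlled vocabulary and are not quoted in the ticket itself.

```java
/** Hypothetical sketch of accession-based cvParam handling. */
final class MzMLAccessionSketch {

    // Named constants keep the comparisons readable.
    static final String ACC_32_BIT_FLOAT = "MS:1000521";
    static final String ACC_64_BIT_FLOAT = "MS:1000523";
    static final String ACC_ZLIB_COMPRESSION = "MS:1000574";
    static final String ACC_NO_COMPRESSION = "MS:1000576";
    static final String ACC_MZ_ARRAY = "MS:1000514";
    static final String ACC_INTENSITY_ARRAY = "MS:1000515";

    boolean doublePrecision = false;
    boolean zLibCompression = false;
    String arrayType = "unknown";

    /** Updates the state for one cvParam; unknown accessions are ignored. */
    void applyCvParam(String accession) {
        if (ACC_64_BIT_FLOAT.equals(accession)) {
            doublePrecision = true;
        } else if (ACC_32_BIT_FLOAT.equals(accession)) {
            doublePrecision = false;
        } else if (ACC_ZLIB_COMPRESSION.equals(accession)) {
            zLibCompression = true;
        } else if (ACC_NO_COMPRESSION.equals(accession)) {
            zLibCompression = false;
        } else if (ACC_MZ_ARRAY.equals(accession)) {
            arrayType = "m/z";
        } else if (ACC_INTENSITY_ARRAY.equals(accession)) {
            arrayType = "intensity";
        }
    }

    public static void main(String[] args) {
        MzMLAccessionSketch state = new MzMLAccessionSketch();
        state.applyCvParam("MS:1000523");
        state.applyCvParam("MS:1000574");
        state.applyCvParam("MS:1000515");
        System.out.println(state.arrayType + " array, doublePrecision="
                + state.doublePrecision + ", zLibCompression=" + state.zLibCompression);
    }
}
```

Accession numbers are stable identifiers, while the "name" text of a term may be reworded between vocabulary releases, which is presumably the motivation for the change.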
comment:14 Changed 15 years ago by
(In [2805]) Refs #454. Refs #450. Refs #430. Refs #394. Support for obtaining retention time values in minutes added in the mzML reader:
- Class/file io/MzMLFileReader.java in api/core/ updated in private method void processStartElement(XMLStreamReader parser) to use "unitAccession" property values for cvParam XML tags when obtaining data for retention times ("scan time"). In order to make the code more readable, the obtained accession string values are compared against new string constants defined for the class, with names indicating what each accession number represents.
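Illustration note: converting a scan time to minutes based on the "unitAccession" value could be sketched as below. The class, constant, and method names are hypothetical. The unit accessions are an assumption taken from the Unit Ontology (UO:0000010 for second, UO:0000031 for minute); the ticket does not say which unit terms the reader actually compares against, and files may use other unit terms, which this sketch answers with null.

```java
/** Hypothetical sketch of unit-aware retention time handling. */
final class RetentionTimeSketch {

    // Assumed Unit Ontology accessions; not quoted in the ticket.
    static final String UNIT_SECOND = "UO:0000010";
    static final String UNIT_MINUTE = "UO:0000031";

    /** Returns the time in minutes, or null if the value or unit is unusable. */
    static Double retentionTimeInMinutes(String value, String unitAccession) {
        if (value == null || value.isEmpty()) {
            return null;
        }
        double time;
        try {
            time = Double.parseDouble(value);
        } catch (NumberFormatException e) {
            return null;
        }
        if (UNIT_MINUTE.equals(unitAccession)) {
            return time;
        }
        if (UNIT_SECOND.equals(unitAccession)) {
            return time / 60.0;
        }
        return null; // unknown unit: do not guess
    }

    public static void main(String[] args) {
        System.out.println(retentionTimeInMinutes("90", UNIT_SECOND));
        System.out.println(retentionTimeInMinutes("2.5", UNIT_MINUTE));
    }
}
```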
Ticket accepted.