Opened 9 years ago

Closed 9 years ago

Last modified 8 years ago

#805 closed (fixed)

Support for simplifying ProteomeXchange submission

Reported by: olle
Owned by: olle
Milestone: Proteios SE 2.19.0
Keywords:
Cc:

Description

Proteios SE should have support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file, and optionally copying it and selected files from the Proteios SE file system to an external directory.

The first version will just be a shell for a full-featured application:

  • The user will be allowed to enter data for inclusion in the PX file and select files of different types to be used.
  • Where possible, form fields should have probable default values filled in, e.g. from properties of the active project.
  • The user will have to know the correct Controlled Vocabulary (CV) terms to enter.
  • No file mapping will be written to the created file; it has to be added later.
  • The user may choose whether the created PX file and selected files should be copied to an external directory.

Change History (24)

comment:1 Changed 9 years ago by olle

Status: new → assigned

Ticket accepted.

comment:2 Changed 9 years ago by olle

References:

comment:3 Changed 9 years ago by olle

Design discussion:

The application will be created as a Proteios SE file context extension, i.e. it will be available through the "Extensions" button in the file table. It will be named "ProteomeXchange Submission Summary File Export", and will create a job to perform the desired action.

Main classes involved, in order perceived by the user:

  1. ProteomeXchangeExportForm (called from ProteomeXchangeExport)
  2. ProteomeXchangeExport (File context extension)
  3. SelectProteomeXchangeIncludeFilesStep1a
  4. ViewActiveDirectory in select mode
  5. SelectProteomeXchangeIncludeFilesStep1b
  6. CreateProteomeXchangeExportJob
  7. ProteomeXchangeExportPlugin

Main functionality of the classes:

Class | Functionality
ProteomeXchangeExportForm | Form for data related to PX file export itself, but not its content
ProteomeXchangeExport | Form for PX file meta-data content and file types to include
SelectProteomeXchangeIncludeFilesStep1a | Temporary storage of entered data, file selection (see below)
ViewActiveDirectory | File selection (see below)
SelectProteomeXchangeIncludeFilesStep1b | Temporary storage of file selection data, file selection (see below)
CreateProteomeXchangeExportJob | Creates job, stores assembled data as job parameters, resets used session parameters to null, in order to avoid unwanted side effects
ProteomeXchangeExportPlugin | Creates the PX file and optionally copies it and selected files to external directory

File selection:

Even with the restrictions stated in the ticket description, some design elements are non-trivial. The user should be allowed to select files of a number of file types that differ from the Proteios SE file types, e.g. the PX "raw" files may be of format mzML, mzXML, or mzData. The current Proteios SE file selection is managed by class ViewActiveDirectory in select mode (triggered by setting its VBoolean VSELECT parameter to true) and allows files to be selected from a single directory at a time, while files of the same PX file type may be located in different directories. It is therefore desirable to extend Proteios SE file selection to allow files to be selected from different directories while still being handled as a group. This can be implemented by adding an optional button to the file selection form, whereby the user can choose to continue selecting files of the same type.

The file selection will be implemented through the following valid parameters:

Parameter | Type | Class | Description
VPARAM | VString | NextContinueSelectionButtonNameField | Field holding name of button to continue selection of same file type. A non-null, non-empty value is used as a flag to indicate that such a button should be added.
VCONTINUESELECTION | VBoolean | ViewActiveDirectory | Boolean flag parameter set to true for the "Next - Continue selection of same file type" button
VFILETYPETOINCLUDE | VString | CreateProteomeXchangeExportJob | Parameter defining the next file type to select
VPARAM | VString | TitleField | Field holding title for file selection form
VPARAM | VString | NextButtonNameField | Field holding name of button for next step

The three classes involved in the selection process are:

  1. SelectProteomeXchangeIncludeFilesStep1a
    Stores entered data from previous form as session parameters, after which forward actions and names of CreateProteomeXchangeExportJob.VFILETYPETOINCLUDE, TitleField.VPARAM, NextButtonNameField.VPARAM, and NextContinueSelectionButtonNameField.VPARAM are set and the action is forwarded to ViewActiveDirectory in select mode.
  2. ViewActiveDirectory
    Displays a file selection form with two "next" buttons, one for continuing the current selection, one for the next step.
  3. SelectProteomeXchangeIncludeFilesStep1b
    Updates the stored file selection data with the newly selected files. If the value of ViewActiveDirectory.VCONTINUESELECTION is true, the action is directly forwarded back to ViewActiveDirectory, otherwise the next action is either ViewActiveDirectory with new values for CreateProteomeXchangeExportJob.VFILETYPETOINCLUDE, TitleField.VPARAM, and NextButtonNameField.VPARAM, or the action for the next step after file selection (in this case CreateProteomeXchangeExportJob).
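
The forwarding decision described for step 1b can be sketched as follows. This is only an illustration under stated assumptions, not the Proteios SE code: the class SelectionFlowSketch, the method nextAction, the string action names, and the queue of file types still to be selected are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of the step 1b forwarding decision; not Proteios SE code. */
final class SelectionFlowSketch {

    /** File types still waiting for a selection round (assumed bookkeeping). */
    private final Deque<String> remainingFileTypes;

    /** File type currently being selected, or null when selection is finished. */
    private String currentFileType;

    SelectionFlowSketch(List<String> fileTypesToInclude) {
        if (fileTypesToInclude.isEmpty()) {
            throw new IllegalArgumentException("At least one file type is required");
        }
        this.remainingFileTypes = new ArrayDeque<>(fileTypesToInclude);
        this.currentFileType = remainingFileTypes.poll();
    }

    /**
     * Returns the name of the next action. The argument plays the role of
     * the VCONTINUESELECTION flag: true means "same file type again".
     */
    String nextAction(boolean continueSelection) {
        if (!continueSelection) {
            // Move on to the next file type; poll() returns null when none is left.
            currentFileType = remainingFileTypes.poll();
        }
        if (currentFileType == null) {
            return "CreateProteomeXchangeExportJob";
        }
        return "ViewActiveDirectory:" + currentFileType;
    }

    public static void main(String[] args) {
        SelectionFlowSketch flow = new SelectionFlowSketch(List.of("raw", "peaklist"));
        System.out.println(flow.nextAction(true));  // more raw files from another directory
        System.out.println(flow.nextAction(false)); // switch to peaklist files
        System.out.println(flow.nextAction(false)); // selection done, create the job
    }
}
```

The point of the sketch is that the "continue" button only repeats the current round, so files from several directories accumulate under one file type before the flow advances.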

Other design comments:

The temporary list of file id values for each file type is stored as a session parameter in the form of a comma-separated string. This might lead to problems if a large number of files are selected, resulting in a long string.

As a convenience, new interface FileCrudeWriter and implementation FileCrudeWriterImpl classes in api/core/src/org/proteios/io/ will be created for writing pure text files in Proteios SE, in analogy with XMLCrudeWriter and XMLCrudeWriterImpl for XML files.

Last edited 9 years ago by olle

comment:4 Changed 9 years ago by olle

(In [4465]) Refs #805. First version of support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file, and optionally copy it and selected files inside Proteios SE file system to an external directory. This version is just a shell for a full-featured application.

New classes/files in api/core/ (convenience classes for writing pure text files):

  1. io/FileCrudeWriter.java
  2. io/FileCrudeWriterImpl.java

New classes/files in client/servlet/:

  1. action/file/CreateProteomeXchangeExportJob.java
  2. action/file/ProteomeXchangeExport.java
  3. action/file/SelectProteomeXchangeIncludeFilesStep1a.java
  4. action/file/SelectProteomeXchangeIncludeFilesStep1b.java
  5. gui/form/NextContinueSelectionButtonNameField.java
  6. gui/form/ProteomeXchangeExportForm.java

New classes/files in plugin/:

  1. plugins/ProteomeXchangeExportPlugin.java

Updated action classes/files in client/servlet/:

  1. action/directory/ViewActiveDirectory.java

Updated English dictionary and default icon settings files in client/servlet/:

  1. locale/en/dictionary
  2. icons/default

comment:5 Changed 9 years ago by olle

(In [4466]) Refs #805. Default icon settings file icons/default in client/servlet/ updated by removal of redundant line.

comment:6 Changed 9 years ago by olle

Design update:

  • The temporary list of file id values for each file type has so far been stored as a session parameter in the form of a comma-separated string, which may cause problems when the number of selected files is large. This can be avoided by using method List<Integer> getSessionAttributeList(VInteger param), introduced in change set [3922] in Ticket #714 (Moving many files fails), which allows the integer id values to be stored in a session attribute as a list.

Affected classes/files:

  1. action/file/CreateProteomeXchangeExportJob.java in client/servlet/:
    a. Public static final valid parameter VString VPROTEOMEXCHANGERESULTFILEIDSTR is exchanged for VInteger VPROTEOMEXCHANGERESULTFILEIDLIST, and analogously for the other ProteomeXchange file types.
    b. The List<Integer> file id values for the various file types are retrieved using method getSessionAttributeList(VInteger param).
    c. Private convenience methods List<Integer> fetchFileIdList(VString param) and List<Integer> listStringToIntegerList(String listStr, String separator) are no longer needed, and are therefore removed.
  2. action/file/SelectProteomeXchangeIncludeFilesStep1b.java in client/servlet/:
    a. Use of public static final valid parameter VString CreateProteomeXchangeExportJob.VPROTEOMEXCHANGERESULTFILEIDSTR is exchanged for VInteger CreateProteomeXchangeExportJob.VPROTEOMEXCHANGERESULTFILEIDLIST, and analogously for the other ProteomeXchange file types.
    b. The List<Integer> file id values for the various file types are retrieved using method getSessionAttributeList(VInteger param).
    c. Public static final HashMap<String, VString> fileTypeFileIdStrParamHashMap is exchanged for HashMap<String, VInteger> fileTypeFileIdListParamHashMap.
    d. Private method String updateFileTypeIdSelectionString(VString fileTypeParam, String fileIdStr) is exchanged for List<Integer> updateFileTypeIdSelectionList(VInteger fileTypeParam, List<Integer> fileIdList).
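
The list-based update in item 2d can be illustrated with a minimal sketch. It is written under assumptions: the class name is hypothetical, a plain Map stands in for the session, and the key is a String instead of a VInteger to keep the example self-contained. Only the method name updateFileTypeIdSelectionList is taken from the description above; skipping ids that are already stored is an assumption, since the ticket does not say how repeated selections are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of list-based storage of selected file ids. */
final class FileIdSelectionSketch {

    /** Stand-in for session attributes: one id list per file type key. */
    private final Map<String, List<Integer>> session = new HashMap<>();

    /**
     * Appends newly selected ids to the list stored for the file type and
     * returns a copy of the updated list. No string parsing is involved.
     */
    List<Integer> updateFileTypeIdSelectionList(String fileTypeParam, List<Integer> fileIdList) {
        List<Integer> stored = session.computeIfAbsent(fileTypeParam, key -> new ArrayList<>());
        for (Integer fileId : fileIdList) {
            // Assumption: an id selected twice is stored only once.
            if (!stored.contains(fileId)) {
                stored.add(fileId);
            }
        }
        return new ArrayList<>(stored);
    }

    public static void main(String[] args) {
        FileIdSelectionSketch sketch = new FileIdSelectionSketch();
        sketch.updateFileTypeIdSelectionList("raw", List.of(3, 7));
        // Second selection round for the same file type, from another directory.
        System.out.println(sketch.updateFileTypeIdSelectionList("raw", List.of(7, 9)));
    }
}
```
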
Last edited 9 years ago by olle

comment:7 Changed 9 years ago by olle

(In [4467]) Refs #805. Support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file updated by storing integer id values for selected files in a session attribute as a list instead of a comma-separated string of id values. This should avoid possible problems when the number of selected files is large.

  1. Class/file action/file/CreateProteomeXchangeExportJob.java in client/servlet/ updated:
    a. Public static final valid parameter VString VPROTEOMEXCHANGERESULTFILEIDSTR is exchanged for VInteger VPROTEOMEXCHANGERESULTFILEIDLIST, and analogously for the other ProteomeXchange file types.
    b. The List<Integer> file id values for the various file types are retrieved using method getSessionAttributeList(VInteger param).
    c. Private convenience methods List<Integer> fetchFileIdList(VString param) and List<Integer> listStringToIntegerList(String listStr, String separator) are no longer needed, and are therefore removed.
  2. Class/file action/file/SelectProteomeXchangeIncludeFilesStep1b.java in client/servlet/ updated:
    a. Use of public static final valid parameter VString CreateProteomeXchangeExportJob.VPROTEOMEXCHANGERESULTFILEIDSTR is exchanged for VInteger CreateProteomeXchangeExportJob.VPROTEOMEXCHANGERESULTFILEIDLIST, and analogously for the other ProteomeXchange file types.
    b. The List<Integer> file id values for the various file types are retrieved using method getSessionAttributeList(VInteger param).
    c. Public static final HashMap<String, VString> fileTypeFileIdStrParamHashMap is exchanged for HashMap<String, VInteger> fileTypeFileIdListParamHashMap.
    d. Private method String updateFileTypeIdSelectionString(VString fileTypeParam, String fileIdStr) is exchanged for List<Integer> updateFileTypeIdSelectionList(VInteger fileTypeParam, List<Integer> fileIdList).
Last edited 9 years ago by olle

comment:8 Changed 9 years ago by olle

Design update to include default file mappings:

In the following, it is assumed that only files selected to be included in the ProteomeXchange submission will be used in the file mapping. File types written in italics are ProteomeXchange submission file types.

  • A result file should be mapped to all files in the Hits table.
  • A peaklist file should be mapped to all raw files it is associated to via a sample.
  • A search file should be mapped to all peaklist files (raw or peaklist ProteomeXchange files) it is related to in the Hits table.
  • An other file that is a feature file should be mapped to the peaklist file (raw or peaklist ProteomeXchange file) it is related to in the Features table.

In order to simplify the necessary database queries, the following new public static methods should be added:

  1. Class/file core/Hit.java in api/core/:

    List<File> getUniqueIdentificationResultFiles(Project project, DbControl dc)
    List<File> getUniquePeakListFiles(Project project, DbControl dc)
    List<File> getUniquePeakListFilesForIdentificationResultFile(Project project, File identificationResultFile, DbControl dc)

  2. Class/file core/Feature.java in api/core/:

    List<File> getUniqueFeatureFiles(Project project, DbControl dc)
    List<File> getUniqueMsFilesForFeatureFile(Project project, File featureFile, DbControl dc)

Update of class/file plugins/ProteomeXchangeExportPlugin.java in plugin/:

  1. File mapping indices for included files are now calculated before the file mapping table is written, since a file may be mapped to one in a later part of the table. The file mapping index will be calculated by new private method HashMap<Integer,Integer> createFileIdMapIndexHashMap().
  2. The file mapping will be calculated by new private method HashMap<Integer,List<Integer>> createFileIdFileMappingListHashMap(DbControl dc, HashMap<Integer,Integer> fileIdMapIndexHashMap).
  3. New private convenience method List<Integer> updateFileMappingList(List<Integer> fileMappingList, List<File> candidateLinkFileList, File keyFile, HashMap<Integer,Integer> fileIdMapIndexHashMap) updates the file mapping list for a key file with ProteomeXchange file mapping indices for candidate files in the ProteomeXchange submission list, that are not already in the file mapping list for the key file.
  4. Private method int writeProteomeXchangeFme(FileCrudeWriter fileCrudeWriter, List<File> fileList, String fileType, int index) is updated to void writeProteomeXchangeFme(FileCrudeWriter fileCrudeWriter, List<File> fileList, String fileType, HashMap<Integer,Integer> fileIdMapIndexHashMap, HashMap<Integer,List<Integer>> fileIdFileMappingListHashMap). It will now write the file mapping for a file as a comma-separated list string of ProteomeXchange submission indices in ascending order.
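
The two-pass approach in items 1 and 4 (assign indices first, then write mappings) can be sketched as below. This is an illustration only: the class name and the method formatFileMapping are hypothetical, the method createFileIdMapIndexHashMap here takes a list of file ids as argument to keep the example self-contained, and numbering the rows from 1 is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch of index assignment before the mapping table is written. */
final class FileMappingIndexSketch {

    /** First pass: give every included file id a row index (1-based, an assumption). */
    static HashMap<Integer, Integer> createFileIdMapIndexHashMap(List<Integer> includedFileIds) {
        HashMap<Integer, Integer> fileIdMapIndex = new HashMap<>();
        int index = 1;
        for (Integer fileId : includedFileIds) {
            if (!fileIdMapIndex.containsKey(fileId)) {
                fileIdMapIndex.put(fileId, index++);
            }
        }
        return fileIdMapIndex;
    }

    /** Second pass: format mapped row indices as a comma-separated string, ascending. */
    static String formatFileMapping(List<Integer> mappedIndices) {
        List<Integer> sorted = new ArrayList<>(mappedIndices);
        Collections.sort(sorted);
        return sorted.stream().map(String::valueOf).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // File ids in the order they will appear in the table.
        HashMap<Integer, Integer> index = createFileIdMapIndexHashMap(List.of(40, 12, 77));
        // The file in row 1 refers to files in later rows, so their indices
        // must be known before row 1 is written.
        System.out.println(formatFileMapping(List.of(index.get(77), index.get(12)))); // prints 2,3
    }
}
```
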
Last edited 9 years ago by olle

comment:9 Changed 9 years ago by olle

(In [4468]) Refs #805. Support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file updated to include default file mappings:

File types written in italics are ProteomeXchange submission file types.

  • A result file is mapped to all files in the Hits table.
  • A peaklist file is mapped to all raw files it is associated to via a sample.
  • A search file is mapped to all peaklist files (raw or peaklist ProteomeXchange files) it is related to in the Hits table.
  • An other file that is a feature file is mapped to the peaklist file (raw or peaklist ProteomeXchange file) it is related to in the Features table.

In order to simplify the necessary database queries, the following new public static methods are added:

  1. Class/file core/Hit.java in api/core/:

    List<File> getUniqueIdentificationResultFiles(Project project, DbControl dc)
    List<File> getUniquePeakListFiles(Project project, DbControl dc)
    List<File> getUniquePeakListFilesForIdentificationResultFile(Project project, File identificationResultFile, DbControl dc)

  2. Class/file core/Feature.java in api/core/:

    List<File> getUniqueFeatureFiles(Project project, DbControl dc)
    List<File> getUniqueMsFilesForFeatureFile(Project project, File featureFile, DbControl dc)

Update of class/file plugins/ProteomeXchangeExportPlugin.java in plugin/:

  1. File mapping indices for included files are now calculated before the file mapping table is written, since a file may be mapped to one in a later part of the table. The file mapping index will be calculated by new private method HashMap<Integer,Integer> createFileIdMapIndexHashMap().
  2. The file mapping will be calculated by new private method HashMap<Integer,List<Integer>> createFileIdFileMappingListHashMap(DbControl dc, HashMap<Integer,Integer> fileIdMapIndexHashMap).
  3. New private convenience method List<Integer> updateFileMappingList(List<Integer> fileMappingList, List<File> candidateLinkFileList, File keyFile, HashMap<Integer,Integer> fileIdMapIndexHashMap) updates the file mapping list for a key file with ProteomeXchange file mapping indices for candidate files in the ProteomeXchange submission list, that are not already in the file mapping list for the key file.
  4. Private method int writeProteomeXchangeFme(FileCrudeWriter fileCrudeWriter, List<File> fileList, String fileType, int index) is updated to void writeProteomeXchangeFme(FileCrudeWriter fileCrudeWriter, List<File> fileList, String fileType, HashMap<Integer,Integer> fileIdMapIndexHashMap, HashMap<Integer,List<Integer>> fileIdFileMappingListHashMap). It will now write the file mapping for a file as a comma-separated list string of ProteomeXchange submission indices in ascending order.
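
The behaviour described for updateFileMappingList in item 3 can be sketched as follows. The sketch works on integer file ids instead of File items so that it is self-contained; the class name is hypothetical, and skipping the key file itself is an assumption about why the key file is passed to the method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of updating the file mapping list for one key file. */
final class UpdateFileMappingSketch {

    static List<Integer> updateFileMappingList(List<Integer> fileMappingList,
            List<Integer> candidateLinkFileIds, int keyFileId,
            Map<Integer, Integer> fileIdMapIndex) {
        List<Integer> updated = new ArrayList<>(fileMappingList);
        for (Integer candidateId : candidateLinkFileIds) {
            Integer mapIndex = fileIdMapIndex.get(candidateId);
            if (mapIndex == null) {
                continue; // candidate is not part of the submission
            }
            if (candidateId == keyFileId) {
                continue; // assumption: a file is not mapped to itself
            }
            if (!updated.contains(mapIndex)) {
                updated.add(mapIndex); // add each mapping index only once
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> fileIdMapIndex = new HashMap<>();
        fileIdMapIndex.put(10, 1);
        fileIdMapIndex.put(20, 2);
        fileIdMapIndex.put(30, 3);
        // Key file 10 is already mapped to index 2; file 99 is not in the submission.
        System.out.println(updateFileMappingList(List.of(2), List.of(20, 30, 99, 10), 10, fileIdMapIndex));
    }
}
```
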

comment:10 Changed 9 years ago by olle

(In [4469]) Refs #805. Class/file action/file/ProteomeXchangeExport.java in client/servlet/ updated in instruction text regarding file mapping.

comment:11 Changed 9 years ago by olle

Design update:

  • ProteomeXchange submission support should be made a hit context extension instead of a file context extension, i.e. it will be available through the "Extensions" button in the hit table instead of the file table.
  • The active project menu should be extended with a new cascade item "Publishing", with initial menu items for PRIDE XML export and ProteomeXchange submission support.

comment:12 Changed 9 years ago by olle

(In [4470]) Refs #805. ProteomeXchange submission changed to a hit context extension instead of a file context extension, i.e. it will be available through the "Extensions" button in the hit table instead of the file table. Some minor code clean-up also performed by removing lines commented out.

comment:13 Changed 9 years ago by olle

(In [4471]) Refs #805. Active project menu updated with new cascade item "Publishing", with initial menu items for PRIDE XML export and ProteomeXchange submission support:

  1. Class/file gui/MainMenu.java in client/servlet/ updated in creation of active project menu.
  2. English dictionary file locale/en/dictionary in client/servlet/ updated with new string key entry.

comment:14 Changed 9 years ago by olle

Design update of file mapping:

  • A result file should be mapped to all files in the Hits table, that are related to the peaklist file used to create the result file. In order to find the peaklist file used to create a PRIDE XML file, the latter should have an annotation to the name of the peaklist file.
  • A peaklist file is mapped to all raw files it is associated to via a sample, where the base name of the raw file equals that of the peaklist file. This will prevent a peaklist file from being mapped to all raw files that are technical replicates of a sample; it will be mapped only to the raw file used to create it.

comment:15 Changed 9 years ago by olle

(In [4472]) Refs #805. Refs #405. Support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file updated in default file mappings for result and peaklist files:

File types written in italics are ProteomeXchange submission file types.

  • A result file is mapped to all files in the Hits table, that are related to the peaklist file used to create the result file. In order to find the peaklist file used to create a PRIDE XML file, the latter should have an annotation to the name of the peaklist file.
  • A peaklist file is mapped to all raw files it is associated to via a sample, where the base name of the raw file equals that of the peaklist file. This will prevent a peaklist file from being mapped to all raw files that are technical replicates of a sample; it will be mapped only to the raw file used to create it.

In order to simplify the necessary database queries, the following new public static methods are added:

  1. Class/file core/Hit.java in api/core/:

    List<File> getUniqueIdentificationResultFilesForPeakListFile(Project project, File peakListFile, DbControl dc)

SQL query XML file conf/common-queries.xml in api/core/ is updated with new query with id "GET_UNIQUE_IDENTIFICATIONRESULTFILES_IN_HITS_FOR_PEAKLISTFILE_IN_PROJECT".

Update of class/file plugins/PrideExportPlugin.java in plugin/:

  1. Public method void doExport(DbControl dc, File outCoreFile, ProgressReporter progress) is updated to annotate the created PRIDE XML file with the filename of the peaklist file.

Update of class/file plugins/ProteomeXchangeExportPlugin.java in plugin/:

  1. Private method HashMap<Integer,List<Integer>> createFileIdFileMappingListHashMap(DbControl dc, HashMap<Integer,Integer> fileIdMapIndexHashMap) is updated by mapping a result file to all files in the Hits table, that are related to the peaklist file used to create the result file, and mapping a peaklist file to all raw files it is associated to via a sample, where the base name of the raw file equals that of the peaklist file.
  2. New private convenience method String fetchBaseFilename(File file) added. It returns the base filename of a file (filename without file extension).
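
A minimal sketch of the base name comparison follows. It takes a filename string instead of a Proteios SE File item, the class name and the helper hasSameBaseName are hypothetical, and cutting at the last dot is an assumption; the ticket does not state how names with several dots are treated.

```java
/** Hypothetical sketch of base filename matching between peaklist and raw files. */
final class BaseFilenameSketch {

    /** Returns the filename without its last extension (assumed behaviour). */
    static String fetchBaseFilename(String filename) {
        if (filename == null) {
            return null;
        }
        int dot = filename.lastIndexOf('.');
        // A leading dot (index 0) or no dot at all means there is no extension to strip.
        return dot > 0 ? filename.substring(0, dot) : filename;
    }

    /** True if both files share the same base name, e.g. run01.mgf and run01.mzML. */
    static boolean hasSameBaseName(String peaklistFilename, String rawFilename) {
        String peaklistBase = fetchBaseFilename(peaklistFilename);
        return peaklistBase != null && peaklistBase.equals(fetchBaseFilename(rawFilename));
    }

    public static void main(String[] args) {
        System.out.println(hasSameBaseName("run01.mgf", "run01.mzML")); // prints true
        System.out.println(hasSameBaseName("run01.mgf", "run02.mzML")); // prints false
    }
}
```

With this rule, a peaklist file is linked only to the raw file it was created from, even when technical replicates of the same sample are present.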

Last edited 9 years ago by olle

comment:16 Changed 9 years ago by olle

Note: Due to an error in the SVN commit message, changeset [4472] was referenced to Ticket #406 (PKL file reader) instead of Ticket #405 (Support export for publication). The error has been corrected in the Trac comments, but remains in the SVN commit message.

comment:17 Changed 9 years ago by olle

Design update:

  • An option for automatic file selection should be added, with the following properties:

    1. If chosen, any subsequent file type selections performed in the form will be ignored.
    2. The user will always be given the opportunity to select "other" files, but feature files in the project will automatically be included.
    3. If the new option is not selected, file selection can be performed as previously.

The automatic file selection assumes a certain work flow, and will be performed as follows in its first version:

  1. result files (currently PRIDE XML files) are *.pride.xml files in the project, that have a "PeaklistFileName" annotation. (Support for the "PeaklistFileName" annotation was added in change set [4472].)
  2. peaklist files are files in the project, that have the same names as the "PeaklistFileName" annotations in the result files.
  3. raw files are *.mzML, *.mzXML, and *.mzData files in the project, that have the same base name (filename without file extension) as the peaklist files.
  4. search files are files associated with the peaklist files in the Hits table.
  5. other files with feature data are files associated with raw peaklist files in the Features table. These will be added to any files of type other selected by the user.

A file should only be added once to a specific category.
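
Steps 1 to 3 of the automatic selection can be sketched on plain filenames as shown below. Everything here is illustrative: the class and method names are hypothetical, the map from result filename to "PeaklistFileName" annotation value stands in for real annotations, and case-insensitive matching of the raw file extensions is an assumption. A LinkedHashSet is used so that a file is only added once per category.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of automatic file selection on filenames. */
final class AutoFileSelectionSketch {

    static final List<String> RAW_EXTENSIONS = List.of(".mzml", ".mzxml", ".mzdata");

    static String baseName(String filename) {
        int dot = filename.lastIndexOf('.');
        return dot > 0 ? filename.substring(0, dot) : filename;
    }

    /** Steps 1-2: peaklist files are named by annotations on *.pride.xml files. */
    static Set<String> selectPeaklistFiles(Map<String, String> peaklistAnnotationByResultFile,
            List<String> projectFiles) {
        Set<String> selected = new LinkedHashSet<>();
        for (Map.Entry<String, String> entry : peaklistAnnotationByResultFile.entrySet()) {
            boolean isResultFile = entry.getKey().endsWith(".pride.xml");
            if (isResultFile && projectFiles.contains(entry.getValue())) {
                selected.add(entry.getValue());
            }
        }
        return selected;
    }

    /** Step 3: raw files share their base name with a selected peaklist file. */
    static Set<String> selectRawFiles(Set<String> peaklistFiles, List<String> projectFiles) {
        Set<String> peaklistBaseNames = new LinkedHashSet<>();
        for (String peaklistFile : peaklistFiles) {
            peaklistBaseNames.add(baseName(peaklistFile));
        }
        Set<String> selected = new LinkedHashSet<>();
        for (String file : projectFiles) {
            String lower = file.toLowerCase(Locale.ROOT);
            boolean isRawFormat = RAW_EXTENSIONS.stream().anyMatch(lower::endsWith);
            if (isRawFormat && peaklistBaseNames.contains(baseName(file))) {
                selected.add(file); // the set ignores repeated additions
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> projectFiles = List.of("a.mgf", "a.mzML", "a.pride.xml", "b.mzML");
        Set<String> peaklists = selectPeaklistFiles(Map.of("a.pride.xml", "a.mgf"), projectFiles);
        System.out.println(peaklists);
        System.out.println(selectRawFiles(peaklists, projectFiles));
    }
}
```
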

In the first version, automatic file selection will be performed by class CreateProteomeXchangeExportJob, whereby no changes are needed in plug-in class ProteomeXchangeExportPlugin. This ensures that the file selection is always performed before the job is created, but will cause a delay before the Jobs table appears. For small- to medium-sized projects the delay is small enough to ignore, but if it should become too annoying for larger projects, the design should be changed by performing the automatic file selection in the plug-in.

Last edited 9 years ago by olle

comment:18 Changed 9 years ago by olle

(In [4475]) Refs #805. Support for simplifying ProteomeXchange submission by creating a ProteomeXchange Submission Summary *.px file updated by adding option for automatic file selection via associations:

  1. Class/file action/file/ProteomeXchangeExport.java in client/servlet/ updated with new check box getIncludeFilesFromAssociationsCB coupled to new valid parameter VBoolean CreateProteomeXchangeExportJob.VGETINCLUDEFILESFROMASSOCIATIONS.
  2. Class/file action/file/SelectProteomeXchangeIncludeFilesStep1a.java in client/servlet/ updated to obtain value of valid parameter VBoolean CreateProteomeXchangeExportJob.VGETINCLUDEFILESFROMASSOCIATIONS from request and save it to session. If the parameter has value true, all file types will be marked for inclusion, but the user will only be able to select files of ProteomeXchange file type "other", to which automatically found feature files will later be added.
  3. Class/file action/file/CreateProteomeXchangeExportJob.java in client/servlet/ updated with new valid parameter VBoolean VGETINCLUDEFILESFROMASSOCIATIONS, whose value is obtained from session. If the parameter has value true, include files will be selected using associations and annotations. New private convenience method String fetchBaseFilename(File file) added. It returns the base filename of a file (filename without file extension).
  4. Class/file core/Feature.java in api/core/ updated with new public static method List<File> getUniqueFeatureFilesForMsFile(Project project, File msFile, DbControl dc). It uses new pre-defined query GET_UNIQUE_FEATUREFILES_IN_FEATURES_FOR_MSFILE_IN_PROJECT to find unique feature files for a peaklist file in project.
  5. SQL query XML file conf/common-queries.xml in api/core/ is updated with new query with id "GET_UNIQUE_FEATUREFILES_IN_FEATURES_FOR_MSFILE_IN_PROJECT".
  6. English dictionary file locale/en/dictionary in client/servlet/ updated with new string key entry.

comment:19 Changed 9 years ago by olle

Resolution: fixed
Status: assigned → closed

Ticket closed as the requested functionality has been added.

comment:20 Changed 9 years ago by Fredrik Levander

(In [4515]) Refs #805. Writing URL of files instead of local path for http-accessible files.

comment:21 Changed 9 years ago by olle

(In [4525]) Refs #818. Refs #805. File selection updated to avoid creating extra unwanted buttons, when navigating the directory tree to find files to select:

  1. Class/file action/directory/ViewActiveDirectory.java in client/servlet/ updated in private method String fetchNextButtonName(Boolean select, Boolean selectMoveDir, VString fieldName, String defaultName) to exchange a four-character string "null" for a null value, when obtained from a posted parameter or session attribute value. This has the side effect that no button can be created in this way having the name "null", but this is unlikely to cause any problems.
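
The normalization can be sketched as below. The class and method names are hypothetical and this is not the code of fetchNextButtonName. The demonstration in main shows one way a literal "null" string can arise in Java, namely when a null reference is converted to a string; whether that is the cause in Proteios SE is an assumption.

```java
/** Hypothetical sketch of exchanging the literal string "null" for a null value. */
final class NullStringSketch {

    /** Returns null for a null reference or the four-character string "null". */
    static String nullIfLiteralNull(String value) {
        return "null".equals(value) ? null : value;
    }

    public static void main(String[] args) {
        // Converting a null reference to a string yields the text "null".
        String posted = String.valueOf((Object) null);
        System.out.println(posted.length());                   // prints 4
        System.out.println(nullIfLiteralNull(posted) == null); // prints true
        System.out.println(nullIfLiteralNull("Next"));         // prints Next
    }
}
```
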

comment:22 Changed 9 years ago by olle

(In [4526]) Refs #805. Retraction of writing URL of files instead of local path for http-accessible files, that was added in changeset [4515], as the ProteomeXchange submission tool does not currently support this.

  1. Class/file plugins/ProteomeXchangeExportPlugin.java in plugin/ updated in private method void writeProteomeXchangeFme(FileCrudeWriter fileCrudeWriter, int index, String fileType, File file, String fileMapping) to write the local path again, instead of the URL, for http-accessible files.

comment:23 Changed 9 years ago by Fredrik Levander

(In [4552]) Refs #805. Reactivated writing of URL since this is now supported by the PX submission tool.

comment:24 Changed 8 years ago by Fredrik Levander

(In [4553]) Refs #805. Adding file associations to raw files from PRIDE XML files
