BASE 2.1 - User documentation: Getting started

This document describes how to get started with your new shiny BASE box.

Contents
  1. Administrative tasks
  2. User tasks

Last updated: $Date: 2009-04-06 12:52:39 +0000 (Mon, 06 Apr 2009) $
Copyright © 2006 The respective authors. All rights reserved.

This description will take you from uploading raw data to running the first analysis plug-ins. Both Affymetrix and Genepix data will be used as examples.

  1. Administrative tasks

    Most of the tasks in this section require more privileges than normal user credentials provide, so you should log in as an administrator when you perform them. As always, there are many ways to do things; what is presented here is the fastest path to get going with BASE without creating havoc in future use of the BASE server.

    1. Create an administrator account
      Log in as 'root' using the password you set during BASE initialization. Create an administrator account. Log out, and use the admin account for all administrative tasks. The root account should not be used as the administrator account.

    2. Create user accounts
      Create user accounts using the newly created administrator account. Give your users appropriate roles; most of them should be assigned to the 'User' role. You can create new roles if the default ones do not fit your needs.

    3. Create essential plug-in configurations
      Some plug-ins need to be configured before use, whereas others work out of the box. Some external plug-ins, such as the RMAExpress plug-in for BASE, have to be compiled before they can be added to your BASE server.

      Core plug-ins are a set of plug-ins that are automatically installed during setup. As pointed out above, these may need a configuration to work. To find out which plug-ins need a configuration, choose Administrate -> Plugins -> Definitions. This displays a list of all plug-ins available on your BASE server. If nothing is shown, the reason is that your newly created administrator does not own the definitions; change the view/preset to show shared items as well. On the 'Columns' tab you can select what information should be displayed in the list. Select 'Requires config' on the 'Columns' tab and move it up in the left panel. Finalize by clicking 'Ok'. The list now has an extra column of yes/no values. Mark the 'true' radio button in the 'Requires config' column to list only the plug-ins that need a configuration.

      The exercise above had two aims: i) to show you which plug-ins already exist, and ii) to find the three essential plug-ins that each need a configuration. Before any raw data can be stored in BASE, the slide design (reporter map) and the reporter information (probe set information in Affymetrix vocabulary) must be available in BASE. For these imports to work for Genepix data we need to create configurations for the "Reporter map importer" and the "Reporter importer" respectively, and a configuration is needed for the raw data import plug-in, "Raw data importer", as well. Affymetrix data import works differently: CDF and CEL files are not imported with a plug-in, so only a "Reporter importer" configuration is needed.

      The pre-installed plug-in definitions are already shared to the group 'Everyone' (all users are members of this group), but the configurations you create are not shared by default. Configurations should be shared to the users who need access to them, normally the group 'Everyone'. The reporter importers and reporter map importers are exceptions: they do not have to be shared to everyone, since these imports should only be done by a small group of users. Every reporter and reporter map should be imported only once, and all users should use the same reporter and reporter map information. The raw data importers should normally be shared to everyone since all users import data into BASE.

      Reporter importer
      Select the "Reporter importer" and click on the 'New configuration ...' tab. Enter a name in the 'Name' field (for example 'Affymetrix probe set importer' or 'Genepix reporter import') and click the 'Save and configure' button. The next pop-up expects the regular expressions needed to import reporter data. Here we just give the values to enter in the panel, but as of version 2.1 it is also possible to use an import format wizard. The wizard is started by clicking on the 'Test with file ...' button.

      Below we tabulate regexps to enter for a few file formats if you prefer to enter the regexp manually. Note 1: Here it is assumed that the default extended-properties.xml is used. Note 2: your browser may break lines in the table cells, treat these line breaks as a ' ' (space) character. Note 3: Depending on your files the values to input may be different.

      Column        | Affymetrix | Genepix .gpr | Genepix .txt
      Data Header   | "Probe Set ID","GeneChip Array",.* | "Block"\t"Column"\t"Row"\t"Name"\t"ID"\t.* | 384_number\t384_column\t384_row\t384_position\toligo_id.*
      Data splitter | (?!"),(?=") | \t | \t
      Remove quotes | true | true | true
      Name          | \Probe Set ID\ | \Name\ | \oligo_id\
      Reporter ID   | \Probe Set ID\ | \ID\ | \oligo_id\
      Gene symbol   | \Gene Symbol\ | | \gene_symbol_Ensembl*\
      Cluster ID    | \UniGene ID\ | | \RefSeq\

      Finish by clicking 'Next'. You can of course fill in the other entries as well.
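
      To see what the Affymetrix values do before entering them, the following minimal Python sketch applies the data header and data splitter expressions to a made-up two-line sample. It is an illustration only: BASE evaluates the expressions with Java's regular expression engine, the sample lines are hypothetical, and treating a \Probe Set ID\ entry as a lookup by column title is an assumption made here.

```python
import re

# Values copied from the Affymetrix column of the table above.
DATA_HEADER = r'"Probe Set ID","GeneChip Array",.*'
DATA_SPLITTER = r'(?!"),(?=")'


def split_line(line, remove_quotes=True):
    """Split one line with the data splitter, then optionally drop quotes."""
    fields = re.split(DATA_SPLITTER, line)
    if remove_quotes:
        fields = [f[1:-1] if len(f) >= 2 and f[0] == f[-1] == '"' else f
                  for f in fields]
    return fields


# Hypothetical sample lines, shortened for illustration.
header_line = '"Probe Set ID","GeneChip Array","Gene Symbol","UniGene ID"'
data_line = '"probe_1","ChipX","GENE1, alias","Hs.0001"'

# The data header expression must match the whole column title line.
is_data_header = re.fullmatch(DATA_HEADER, header_line) is not None

# Assumption: \Probe Set ID\ refers to the column with that title.
row = dict(zip(split_line(header_line), split_line(data_line)))

print(is_data_header)
print(row["Probe Set ID"], "|", row["Gene Symbol"])
```

      The comma inside the quoted 'GENE1, alias' value is not treated as a separator, because the splitter only matches a comma that is directly followed by a quote character.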

      Reporter map importer
      Affymetrix CDF files do not need a configuration since they are not imported to BASE with a plug-in. Genepix data requires a configuration. To configure a plug-in for Genepix reporter map imports, select the "Reporter map importer" from the plug-in definition list. Click on the 'New configuration ...' tab. Enter a name in the 'Name' field (for example 'Genepix reporter map importer') and click the 'Save and configure' button. The next pop-up expects the regular expressions needed to import reporter map data. Here we just give the values to enter in the panel, but as of version 2.1 it is also possible to use an import format wizard. The wizard is started by clicking on the 'Test with file ...' button. Below we tabulate the regexps to enter if you prefer to enter them manually:

      Column        | Genepix .gpr
      Data Header   | "Block"\t"Column"\t"Row"\t"Name"\t"ID"\t.*
      Data splitter | \t
      Remove quotes | true
      Reporter ID   | \ID\
      Block         | \Block\
      Column        | \Column\
      Row           | \Row\

      Finish by clicking 'Next'. You can of course fill in the other entries as well.

      Raw data importer
      Affymetrix CEL files do not need a configuration since they are not imported to BASE with a plug-in. Genepix data requires a configuration. Create a configuration for Genepix raw data import by selecting the "Raw data importer" from the plug-in definition list. Click on the 'New configuration ...' tab. Enter a name in the 'Name' field (for example 'Genepix raw data importer') and click the 'Save and configure' button. The next pop-up expects the regular expressions needed to import raw data. Here we just give the values to enter in the panel, but as of version 2.1 it is also possible to use an import format wizard. The wizard is started by clicking on the 'Test with file ...' button. Below we tabulate the regexps to enter if you prefer to enter them manually:

      Column                                      | Genepix .gpr
      Raw data type                               | Genepix
      Header                                      | "(.+)=(.*)"
      Data header                                 | "Block"\t"Column"\t"Row"\t"Name"\t"ID"\t.*"Ratio of Medians \(532\/635\)".*
      Data splitter                               | \t
      Min data columns                            | 48
      Max data columns                            | 48
      Block                                       | \Block\
      Column                                      | \Column\
      Row                                         | \Row\
      X                                           | \X\
      Y                                           | \Y\
      Reporter ID                                 | \ID\
      Spot diameter                               | \Dia.\
      Channel 1 foreground median                 | \F635 Median\
      Channel 1 foreground mean                   | \F635 Mean\
      Channel 1 foreground standard deviation     | \F635 SD\
      Channel 1 background median                 | \B635 Median\
      Channel 1 background mean                   | \B635 Mean\
      Channel 1 background standard deviation     | \B635 SD\
      Percent pixels within 1 standard deviation  | \% > B635+1SD\
      Percent pixels within 2 standard deviations | \% > B635+2SD\
      Percent saturated pixels                    | \F635 % Sat.\
      Channel 2 foreground median                 | \F532 Median\
      Channel 2 foreground mean                   | \F532 Mean\
      Channel 2 foreground standard deviation     | \F532 SD\
      Channel 2 background median                 | \B532 Median\
      Channel 2 background mean                   | \B532 Mean\
      Channel 2 background standard deviation     | \B532 SD\
      Percent pixels within 1 standard deviation  | \% > B532+1SD\
      Percent pixels within 2 standard deviations | \% > B532+2SD\
      Percent saturated pixels                    | \F532 % Sat.\
      Foreground pixels                           | \F Pixels\
      Background pixels                           | \B Pixels\
      Flags                                       | \Flags\

      Finish by clicking 'Next'. You can of course fill in the other entries as well.
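
      The header, data header, data splitter and min/max data columns values above work together when a file is read. The Python sketch below shows one plausible reading of that interplay on a synthetic Genepix-like file. It is not the BASE parser: the sample lines, the padded 'Extra' column names, and the choice to count lines with the wrong number of columns as skipped are all assumptions made for illustration.

```python
import re

# Values copied from the table above.
HEADER = r'"(.+)=(.*)"'
DATA_HEADER = (r'"Block"\t"Column"\t"Row"\t"Name"\t"ID"\t.*'
               r'"Ratio of Medians \(532\/635\)".*')
DATA_SPLITTER = r'\t'
MIN_COLUMNS = MAX_COLUMNS = 48


def parse(lines):
    """Collect key=value headers, then data rows with 48 columns."""
    headers, column_names, rows, skipped = {}, None, [], 0
    for line in lines:
        line = line.rstrip("\r\n")
        if column_names is None:
            # Still in the header section of the file.
            match = re.fullmatch(HEADER, line)
            if match:
                headers[match.group(1)] = match.group(2)
            elif re.fullmatch(DATA_HEADER, line):
                column_names = [c.strip('"')
                                for c in re.split(DATA_SPLITTER, line)]
            continue
        fields = re.split(DATA_SPLITTER, line)
        if MIN_COLUMNS <= len(fields) <= MAX_COLUMNS:
            rows.append(dict(zip(column_names,
                                 (f.strip('"') for f in fields))))
        else:
            skipped += 1  # assumption: such lines are simply ignored
    return headers, rows, skipped


# Synthetic sample: real column titles first, padded to 48 columns.
columns = ["Block", "Column", "Row", "Name", "ID", "X", "Y", "Dia.",
           "F635 Median", "Ratio of Medians (532/635)"]
columns += [f"Extra {i}" for i in range(48 - len(columns))]
sample = [
    '"Type=GenePix Results 3"',
    "\t".join(f'"{c}"' for c in columns),
    "\t".join(str(i) for i in range(48)),  # one complete data line
    "1\t2\t3",                             # too few columns
]

headers, rows, skipped = parse(sample)
print(headers, len(rows), skipped)
```

      Setting both 'Min data columns' and 'Max data columns' to 48 means that only lines with exactly 48 fields are treated as data in this sketch.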

    4. (Affymetrix only) Install RMAExpress plug-in for BASE
      This plug-in is required for Affymetrix data. It creates the root bioassay set that is needed to analyse Affymetrix data in BASE.

    5. Upload files
      You can upload files to the server before the import or, if you prefer, at import time. To upload files, choose 'View' -> 'Files' and click on the 'Upload file...' button. At upload time you assign a type to the file (this can be changed later if needed). Several files can be uploaded in one batch as a zip file or tar archive.
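
      If you have many files to upload, you can bundle them into one archive first. The Python sketch below is one way to do that with the standard zipfile module; the function name bundle_for_upload and the placeholder file names are hypothetical, and dropping the directory part of each path is a choice made here, not a BASE requirement.

```python
import tempfile
import zipfile
from pathlib import Path


def bundle_for_upload(files, archive_path):
    """Write the given files into one zip archive, dropping directory names."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, files):
            archive.write(path, arcname=path.name)
    return Path(archive_path)


# Demonstration with two empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    placeholders = [Path(tmp) / "array1.gpr", Path(tmp) / "array2.gpr"]
    for placeholder in placeholders:
        placeholder.touch()
    archive = bundle_for_upload(placeholders, Path(tmp) / "upload.zip")
    with zipfile.ZipFile(archive) as check:
        names = sorted(check.namelist())

print(names)
```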

      As administrator you need to upload the reporter map (array design) files and corresponding reporter files. For Affymetrix this translates to Affydesign.cdf files and Affydesign_annot.csv files.

    6. Reporter import
      Choose 'View' -> 'Reporters' and click on the 'Import...' tab. Click 'Next' in the pop-up without changing anything. Select the file with reporter information and click 'Next'. In the panel you can choose to update reporter information for already stored reporters. Click 'Next'. Set the job name and click 'Finish' to start the import. A progress report will be displayed. It is safe to close this page; the job will finish anyway.

      In some cases the reporter annotation file from Affymetrix fails to describe all probe sets on the chip, and you may get a message like 'Error: Unable to import root bioassay. Item not found: Reporter[externalId=AFFX-2315060]'. In this case you have to fall back to importing the probe set information from the CDF file:

      Import the reporters by choosing 'View' -> 'Reporters' and then clicking on 'Import...'. Click 'Next' without changing the 'auto detect' settings. Supply the CDF file name in the next dialog and click 'Next'. Start the import by clicking 'Next' in the parameter dialog. This will add the missing probe sets without changing existing probe sets. The new probe sets will not have any annotations associated with them since CDF files do not contain such information.
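
      Before falling back to the CDF import, you may want to confirm that the probe set named in the error message really is absent from the annotation file. The Python sketch below does this for a small in-memory sample. The helper name annotated_probe_sets and the sample content are hypothetical, and it assumes that the ID column is titled "Probe Set ID" and that lines starting with '#' are comments.

```python
import csv
import io


def annotated_probe_sets(csv_file, id_column="Probe Set ID"):
    """Return the set of probe set IDs listed in an annotation CSV file."""
    # Assumption: lines starting with '#' are comments, not data.
    lines = (line for line in csv_file if not line.startswith("#"))
    return {row[id_column] for row in csv.DictReader(lines)}


# Hypothetical annotation content; a real file would be opened with open().
sample = io.StringIO(
    "# hypothetical comment line\n"
    '"Probe Set ID","GeneChip Array"\n'
    '"probe_1","ChipX"\n'
)
ids = annotated_probe_sets(sample)

# The ID reported in the error message shown above.
missing = "AFFX-2315060" not in ids
print(missing)
```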

    7. Import reporter maps
      Performing an import of array designs is only needed once for every design added into BASE. Remember to share your designs (normally to group 'Everyone').

      To create a new design, choose 'Array LIMS' -> 'Array designs' and click on the 'New...' tab. Select a name for the design. If it is an Affymetrix design, mark the 'Affy chip' box and select a 'CDF file' (make sure that the CDF file is shared as well). Hint: for an Affymetrix design, use the CDF file name as the design name. If you are importing anything other than an Affymetrix design, make sure that you unmark the 'Affy chip' box. Finalize with 'Save'. For Affymetrix the import is now done, whereas for Genepix you have to perform a reporter map import.

      Genepix reporter map import
      Click on the newly created array design, and click on the 'Import...' tab. In the pop-up, click 'Next'. Browse for the reporter map file and click 'Next'. In the following dialogue you can specify how to treat errors encountered during the import. The default is 'fail', which may be too restrictive; trial and error is the way to go here for now. If you have trouble importing, try 'skip' for 'Default error handling' and 'Reporter not found'. You can safely leave the 'Character set' as it is.

  2. User tasks

    A normal user is not allowed to add array designs, reporter information, and a lot of other information to BASE. The reason is that much of this information should exist as only one copy in the database. For example, reporters should exist in only one copy because everyone uses the same reporters, and there is no need to store several copies of the same array design.

    A user will normally upload experimental data to BASE for import into the database. To be able to import the data, the array design used must be available in BASE at import time. If the array design is not available, a user with the proper credentials must add the array design to BASE.

    1. Create a new Project
      Go to 'View' -> 'Projects' and click on the 'New...' tab. Name the project and save. In the projects listing page, click 'Set active' for the new project. The selected project should be displayed at the top right as 'Active project'. Selecting an active project influences the behaviour of BASE: only information related to the active project is displayed, and items you create are automatically shared to the active project.

      For only the items shared to the active project to be displayed, the 'view/presets' pull-down menu options must be set properly. In 'view/preset' you can select what should be displayed; if only 'In current project' is selected, then only information shared to the active project will be shown.

    2. Create raw bioassays
      Upload your raw data files. Go to 'View' -> 'Raw bioassays' and click on the 'New...' tab. Name the raw bioassay. Select 'Raw data type'. Select the 'Array design'; if you cannot see any designs in the pop-up, make sure that the 'view/presets' pull-down menu has 'Shared to me' marked, and then select the appropriate design. Select 'CEL file' if you are creating an Affymetrix raw bioassay. You can leave the rest untouched. Repeat for all raw data files. (Yes, a batch creator would be nice.)

    3. Create a new experiment
      Go to 'View' -> 'Experiments' and click on the 'New...' tab. Name the experiment. Click the 'Add raw bioassays...' button, mark the raw bioassays to add to the experiment, and click 'Ok'. Finish by clicking the 'Save' button.

    4. Analysing an experiment step I
      Select the newly created experiment and, in the experiment window, select the 'Bioassay sets' tab. There are no bioassay sets yet, so you must create a root bioassay set. This step differs depending on what data is used in the experiment, but note that this section ends with information valid for all types of data.

      Affymetrix
      Click on the 'New root bioassay set ...' tab and select the RMAExpress plug-in in the pop-up. (If the plug-in does not appear as an option, the installation of the plug-in has failed. Please refer to Installing RMAExpress plug-in for BASE.) Click on the 'Next' button, and in the new window set the name for the bioassay set and select the bioassays you want to include in this analysis. Click 'Next'. In the next pop-up you can change information if you like, but it is not necessary; when done, click on 'Finish' to add the analysis job to the job queue. The job will start automatically when the server has an open slot for it. The last pop-up window will report progress on the job.

      All flavours
      You can safely close the job progress window; the server will run the job anyway. If you want to monitor the progress of the job after closing the job window, go to 'View' -> 'Jobs', locate your job in the list, and select it. When the job is done, the new bioassay set will be added to the list of bioassay sets for the experiment. Unfortunately the list does not update automatically, so you need to click on the 'Properties' tab and then on the 'Bioassay sets' tab to refresh the list of bioassay sets.

    5. Analysing an experiment step II
      Now you should have the first bioassay set. In the list of bioassay sets, each set has a number of icons attached that start tools. The set of tools available for a bioassay set is context sensitive, i.e., only tools that can handle the bioassay set are shown. At the time of writing there are four icons available: a simple plot tool, the experiment explorer, a filtering tool, and a plug-in runner. You have to try these on your own. The aim of this section is to highlight the fact that you can export the bioassay set data for use outside BASE.

      To export bioassay set data, select a bioassay set. In the bioassay set presentation window there is an export button just below the tabs at the top. (There is also an export button for the Sub analysis tree, which may be confusing at first, but this button exports information about the analysis tree.) Assuming you select the proper export button, you will be given a list of export formats. The list of supported exports depends on what plug-ins are added to the server, but there should at least be an export to MultiExperiment Viewer, maintained by the Dana-Farber Cancer Institute.

This concludes this short document on how to get going with BASE. We have not covered how to use the LIMS part of BASE, annotations, and much more. More documentation is being written; please browse the BASE web site for the latest available documents.