source: plugins/base2/net.sf.basedb.illumina/trunk/README @ 941

Last change on this file since 941 was 941, checked in by Martin Svensson, 15 years ago

References #174: license text should refer to the Illumina plug-in package


Copyright (C) 2008
This file is part of Illumina plug-in package for BASE.
Available at http://baseplugins.thep.lu.se/
BASE main site: http://base.thep.lu.se/
This is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 3
of the License, or (at your option) any later version.
The software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with BASE. If not, see <http://www.gnu.org/licenses/>.

Requirements

  1. BASE 2.9.0 or later.

For expression experiments:

  1. Illumina Bead Summary (IBS) files. The IBS files contain quantified probe intensities.
  2. Illumina Sentrix® Array binary manifest (BGX) file. The BGX files contain probe annotations.

For SNP experiments:

  1. Illumina SNP manifest files
  2. Illumina SNP raw data files

Tested using Illumina BeadArray Reader (version 1.7.0.44) and BeadScan (version 3.5.31.17122), the versions in use in Lund.

Introduction

This README file contains general information about the plug-in package and specific information about expression data. See the README_SNP file for specific information about SNP data.

The Illumina BeadArray Reader is a scanner that can read arrays including Illumina Sentrix® BeadChips and Sentrix® Array Matrices (SAMs). Operation of the BeadArray Reader and image acquisition from Sentrix® arrays are handled by the Illumina BeadScan software.

By default, the output from a BeadArray Reader scanner includes image data (IDAT) files, which can be read by data analysis software such as the Illumina BeadStudio software.

The Illumina plug-in package for BASE reads Illumina Sentrix® Array data from Illumina Bead Summary (IBS) files. The BeadArray Reader does not write IBS files by default; the scanner must be configured to do so. Once configured, the BeadArray Reader writes IBS files in addition to its default output files. To configure a BeadArray Reader to write IBS files, contact your local Illumina Field Application Scientist.

The IBS files are text files that contain bead-type level data for scanned Sentrix® arrays. The file format is explained in detail in the section Illumina Bead Summary files.

Illumina Bead Summary (IBS) files

The IBS files contain bead-type level data for scanned Sentrix® arrays. They are simple comma-separated text files with the file extension .csv. The BeadArray Reader writes the IBS files to the same directory as the other data files from a scan. Note that a BeadArray Reader with default settings does not write IBS files. Contact a local Illumina Field Application Scientist to configure the scanner to write them.

IBS files are composed of four comma-separated columns. Below is an example IBS file with the header row and three rows of data.

Illumicode,N,Mean GRN,Dev GRN
10008,26,222,47
10010,16,57,11
10014,16,56,13

The column content in an IBS file is described below.

  • Illumicode : A code corresponding to the Array_Address_Id in the Illumina Sentrix® Array binary manifest (BGX) file. Note that the Illumicode is a string (or integer) of varying length, whereas the Array_Address_Id is a fixed-length string of 10 characters that consists of the Illumicode left-padded with zeros.
  • N : The total number of beads used to calculate Mean GRN and Dev GRN.
  • Mean GRN : The mean intensity.
  • Dev GRN : Standard deviation of the mean intensity.
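
The column layout above can be read with a few lines of Python. This is only a minimal sketch for inspecting IBS files outside BASE, not the code used by the plug-in; the names BeadSummary and read_ibs are hypothetical, and the sample data is the example file shown above. The array_address_id property applies the zero-padding rule described for the Illumicode column.

```python
import csv
import io
from typing import NamedTuple


class BeadSummary(NamedTuple):
    """One data row of an IBS file (hypothetical helper type)."""
    illumicode: str   # column "Illumicode", variable length
    n_beads: int      # column "N"
    mean_grn: float   # column "Mean GRN"
    dev_grn: float    # column "Dev GRN"

    @property
    def array_address_id(self) -> str:
        # Illumicode left-padded with zeros to 10 characters,
        # matching the Array_Address_Id format in the BGX file.
        return self.illumicode.zfill(10)


def read_ibs(handle):
    """Yield one BeadSummary per data row of an open IBS file."""
    for row in csv.DictReader(handle):
        yield BeadSummary(
            illumicode=row["Illumicode"].strip(),
            n_beads=int(row["N"]),
            mean_grn=float(row["Mean GRN"]),
            dev_grn=float(row["Dev GRN"]),
        )


# The example IBS file from this README.
example = """Illumicode,N,Mean GRN,Dev GRN
10008,26,222,47
10010,16,57,11
10014,16,56,13
"""

rows = list(read_ibs(io.StringIO(example)))
print(rows[0].array_address_id)  # 0000010008
```

For a file on disk, pass an open file object instead of the StringIO, for example open(path, newline="").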

A new raw data type has been defined in illumina-raw-data-types.xml to hold this kind of data. The name of the raw data type is Illumina Bead Summary (IBS) and its unique ID is illumina_bead_summary.

Illumina Sentrix® Array binary manifest (BGX) files

In addition to IDAT files, BeadStudio requires Illumina Sentrix® Array binary manifest (BGX) files that contain information about the probes on a specific Illumina Sentrix® Array, including gene symbol, probe sequence, and so on. In BASE, the BGX files are used to create array designs that describe the probe content of a specific Illumina Sentrix® Array.

BGX files are tab-separated text files composed of three sections named Heading, Probes, and Controls. The first section is the Heading section. It is preceded by a row containing the text [Heading]. The Heading section holds general information, including the number of Probes and Controls described in the BGX file. See below for an example of the Heading section.

[Heading]
Date	1/3/2007
ContentVersion	1.0
FormatVersion	1.0.0
Number of Probes	48701
Number of Controls	1426

Following the Heading section is the Probes section, which is preceded by a row containing the text [Probes]. The first row of the Probes section, i.e., the row after [Probes], contains the header for the Probes section. Following the Probes section is the Controls section, which is preceded by a row containing the text [Controls]. The first row of the Controls section, i.e., the row after [Controls], contains the header for the Controls section. Note that the header row for the Controls section is completely different from the header row for the Probes section. The next section lists the Probes and Controls columns and shows how information in the BGX file is mapped to BASE.
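
The three-section layout described above can be split apart with a short script. The sketch below is an illustration only and assumes the BGX file is available as plain tab-separated text, as described in this README. The function name read_bgx is hypothetical, and the sample uses only a few of the columns found in a real file.

```python
import io


def read_bgx(handle):
    """Split a BGX text file into its Heading, Probes and Controls parts.

    Returns (heading, probes, controls): heading is a dict of key/value
    pairs, probes and controls are lists of dicts keyed by the header row
    of their own section.
    """
    heading = {}
    tables = {"Probes": [], "Controls": []}
    section = None
    header = None
    for raw in handle:
        line = raw.rstrip("\r\n")
        marker = line.strip()
        if not marker:
            continue
        if marker.startswith("[") and marker.endswith("]"):
            # A row such as [Probes] starts a new section; the row
            # that follows it is that section's header.
            section = marker[1:-1]
            header = None
            continue
        fields = line.split("\t")
        if section == "Heading":
            heading[fields[0]] = fields[1] if len(fields) > 1 else ""
        elif section in tables:
            if header is None:
                header = fields
            else:
                tables[section].append(dict(zip(header, fields)))
    return heading, tables["Probes"], tables["Controls"]


# Abbreviated sample for illustration; real files have many more columns.
sample = (
    "[Heading]\n"
    "Date\t1/3/2007\n"
    "Number of Probes\t48701\n"
    "Number of Controls\t1426\n"
    "[Probes]\n"
    "Probe_Id\tArray_Address_Id\tSymbol\n"
    "ILMN_1738027\t0003120095\tBRCA1\n"
    "[Controls]\n"
    "Probe_Id\tArray_Address_Id\tReporter_Group_Name\n"
    "ILMN_943471\t0004780609\thousekeeping\n"
)

heading, probes, controls = read_bgx(io.StringIO(sample))
print(heading["Number of Probes"], probes[0]["Symbol"])
```

Because each section keeps its own header row, the different column sets of the Probes and Controls sections do not interfere with each other.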

Mapping reporter/control annotations from BGX files to BASE

The table below shows how the columns in the [Probes] section of the BGX file are mapped to reporter annotations in BASE. Annotations in <brackets> are new annotations defined in the illumina-extended-properties.xml file. BGX columns marked with - are not mapped to BASE.

BGX column              BASE reporter annotation  Example value
Species                 Species                   Homo sapiens
Source                  <Source>                  RefSeq
Search_Key              <Search_Key>              ILMN_5998
Transcript              -                         ILMN_5998
ILMN_Gene               <ILMN_Gene>               BRCA1
Source_Reference_ID     <Source_Reference_ID>     NM_007301.2
RefSeq_ID               RefSeq                    NM_007301.2
Unigene_ID              Cluster ID
Entrez_Gene_ID          LocusLink                 672
GI                      -                         63252878
Accession               Accession                 NM_007301.2
Symbol                  Gene symbol               BRCA1
Protein_Product         -                         NP_009232.1
Probe_Id                External ID               ILMN_1738027
Array_Address_Id        Feature ID *              0003120095
Probe_Type              <Isoform_Type>            A
Probe_Start             -                         6438
Probe_Sequence          Sequence                  ATCCAGGACTGTTTATAGCTGTTGGAAGGACTAGGTCTTCCCTAGCCCCC
Chromosome              Chromosome                17
Probe_Chr_Orientation   <Probe_Chr_Orientation>
Probe_Coordinates       <Probe_Coordinates>       38449935-38449984
Definition              Description               Homo sapiens breast cancer 1, early onset (BRCA1), transcript variant BRCA1-delta15-17, mRNA.
Ontology_Component      GO cell location          ubiquitin ligase complex [goid 151] [pmid 14976165] [evidence NAS]; ...
Ontology_Process        GO biological process     protein ubiquitination [goid 16567] [pmid 15905410] [evidence NAS]; ...
Ontology_Function       GO molecular function     metal ion binding [goid 46872] [evidence IEA]; ...
Synonyms                <Synonyms>                IRIS; PSCP; BRCAI; BRCC1; RNF53

The table below shows how the columns in the [Controls] section of the BGX file are mapped to reporter annotations in BASE. Annotations in <brackets> are new annotations defined in the illumina-extended-properties.xml file. BGX columns marked with - are not mapped to BASE.

BGX column              BASE reporter annotation  Example value
Probe_Id                External ID               ILMN_943471
Array_Address_Id        Feature ID *              0004780609
Reporter_Group_Name     <Control_Group_Name>      housekeeping
Reporter_Group_id       <Control_Group_Id>        housekeeping
Reporter_Composite_map  <Control_Composite_map>   GI_34304116-S
Probe_Sequence          Sequence                  CGTGAAGACCCTGACTGGTAAGACCATCACTCTCGAAGTGGAGCCGAGTG

* The Feature ID is not a reporter annotation. It is used only to identify the probe on an array design.

The column mappings for the [Probes] section can be changed by modifying the existing import configuration or by creating a new configuration. The column mappings for the [Controls] section cannot be changed.
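
To make the default [Probes] mapping concrete, the sketch below restates it as a plain Python dictionary and applies it to one probe row. This is an illustration only: it is not the import configuration format used by BASE, and the names PROBE_COLUMN_MAP and map_probe_row are hypothetical. Columns marked with - in the table are left out, and the Feature ID is returned separately because it is not a reporter annotation.

```python
# BGX [Probes] column -> BASE reporter annotation, copied from the table.
PROBE_COLUMN_MAP = {
    "Species": "Species",
    "Source": "<Source>",
    "Search_Key": "<Search_Key>",
    "ILMN_Gene": "<ILMN_Gene>",
    "Source_Reference_ID": "<Source_Reference_ID>",
    "RefSeq_ID": "RefSeq",
    "Unigene_ID": "Cluster ID",
    "Entrez_Gene_ID": "LocusLink",
    "Accession": "Accession",
    "Symbol": "Gene symbol",
    "Probe_Id": "External ID",
    "Probe_Type": "<Isoform_Type>",
    "Probe_Sequence": "Sequence",
    "Chromosome": "Chromosome",
    "Probe_Chr_Orientation": "<Probe_Chr_Orientation>",
    "Probe_Coordinates": "<Probe_Coordinates>",
    "Definition": "Description",
    "Ontology_Component": "GO cell location",
    "Ontology_Process": "GO biological process",
    "Ontology_Function": "GO molecular function",
    "Synonyms": "<Synonyms>",
}


def map_probe_row(row):
    """Return (feature_id, annotations) for one [Probes] row.

    row is a dict keyed by BGX column name. Empty values and unmapped
    columns (Transcript, GI, Protein_Product, Probe_Start) are skipped.
    """
    annotations = {
        base_name: row[bgx_column]
        for bgx_column, base_name in PROBE_COLUMN_MAP.items()
        if row.get(bgx_column)
    }
    # The Feature ID identifies the probe on the array design only.
    feature_id = row.get("Array_Address_Id")
    return feature_id, annotations


# Values taken from the example column of the table above.
row = {
    "Probe_Id": "ILMN_1738027",
    "Array_Address_Id": "0003120095",
    "Symbol": "BRCA1",
    "GI": "63252878",
    "Unigene_ID": "",
}
feature_id, annotations = map_probe_row(row)
print(feature_id, annotations)
```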

Getting started

  1. Install this package as described by the instructions in the INSTALL file.
  2. Import reporter annotations. You will need one or more BGX files for this. BGX files can be downloaded from http://www.switchtoi.com/annotationfiles.ilmn.
    • Upload the BGX file(s) to BASE.
    • Go to the View -> Reporters menu.
    • Click on the Import button.
    • Use the auto-detect function or select the Illumina BGX reporter importer plug-in.
    • Select the BGX file.
    • Finish the job registration and wait for the plug-in to complete.
    • Repeat this one time for each BGX file.
  3. Create array designs. You will need one array design for each BGX file.
    • Go to the Array LIMS -> Array designs menu.
    • Click on the New button.
    • Choose the Illumina/Expression 1 or the Illumina/Expression 2 platform. The difference is that Expression 2 has two IBS files for each raw data set, whereas Expression 1 has only one.
    • We recommend that you give the array design the same name as the BGX file.
    • Switch to the Data files tab and select the BGX file.
    • Click on Save.
    • Click on the newly created array design.
    • Click on the Import button and select the Illumina BGX feature importer plug-in.
    • Click on Next and select the Duplicate feature=skip option.
    • Finish the job registration and wait for the plug-in to complete.
    • Repeat this for each BGX file.
  4. Import raw data. You will need one or two IBS files.
    • Upload the IBS file(s) to BASE.
    • Go to the View -> Raw bioassays menu.
    • Click on the New button.
    • Select the Illumina/Expression 1 or the Illumina/Expression 2 platform. The difference is that Expression 2 has two IBS files for each raw data set, whereas Expression 1 has only one.
    • Select one of the array designs created in step 3.
    • Switch to the Data files tab and select the IBS file(s).
    • Click on Save.
    • Click on the newly created raw bioassay.
    • Click on the Import button and select the Illumina Bead Summary Importer plug-in.
    • Finish the job registration and wait for the plug-in to complete.
    • Repeat this for each set of raw data files.
  5. Add your raw data sets to an experiment.

Tip! Steps 1-3 need to be done only once for a BASE installation. If more than one user is going to use the Illumina package, we recommend that the array designs created in step 3 are shared with the appropriate users, for example, the Everyone group.

Tip! The data import in step 4 above can be done for an entire experiment at once.
