$Id: README 2173 2013-12-12 09:30:30Z jari $
About Normalization package for BASE
The Normalization package for BASE (net.sf.basedb.normalization)
plug-in set is a compilation of normalizers for expression data. See
Documentation below for further information about the different
plug-ins in this package. Common to most of the plug-ins provided with
this package is that they work on bioassay sets with either 1-channel
or 2-channel data. The algorithms work on expression values; that is,
for 2-channel data the ratio ch1/ch2 is used.
Normalization package for BASE is free software. See the file
license.txt for copying conditions.
The package was created, and is maintained, by Martin Svensson and Jari Häkkinen.
Downloading
Normalization package for BASE
can be obtained from
Installation
Installation instructions can be found in the 'INSTALL' file.
Documentation
Average normalization
This plug-in scales the expression values of each assay by a factor S, where S is either i) the geometric mean of the expression values of all spots in the bioassay set divided by the assay average, or ii) a user-defined value divided by the assay average.
The new expression values are S times the original expression values. The user can choose between the geometric and the arithmetic mean when calculating the averages.
Background subtraction and proper filtration have to be done before running this plug-in.
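The scaling described above can be sketched in a few lines of Python. This is only an illustration, not the plug-in's code: the function names, the dict-of-lists data layout, and the 'reference' parameter (the user-defined value) are assumptions made for the example.

```python
import math


def mean(values, geometric=True):
    """Return the geometric or arithmetic mean of a list of positive values."""
    if geometric:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return sum(values) / len(values)


def average_normalize(assays, reference=None, geometric=True):
    """Scale every assay by S = reference / assay average.

    assays    -- dict mapping a (hypothetical) assay name to its expression values
    reference -- user-defined value; if None, the mean over all spots
                 in the whole set is used instead
    """
    if reference is None:
        all_spots = [v for values in assays.values() for v in values]
        reference = mean(all_spots, geometric)
    normalized = {}
    for name, values in assays.items():
        s = reference / mean(values, geometric)
        normalized[name] = [s * v for v in values]
    return normalized
```

After the call, every assay has the same average (of the chosen kind), namely the reference value.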
qQuantile normalization
The current implementation of qQuantile normalization supports only 1-channel arrays.
The qQuantile normalization is inspired by the 'Cubic Spline' normalization in Illumina Beadstudio and the work by Workman et al., http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12225587
In qQuantile normalization, the probe intensities of every assay (including the target) are sorted in order of increasing intensity. Each sorted list of probe intensities is partitioned into q groups, and each of these q groups is adjusted (normalized) against the corresponding target group. After normalization, the intensity distribution of each assay will be approximately the same as the target distribution. q is calculated as q=max(10,min(100,target_size/10)). The program will stop if the number of well-defined expression values in the target or in any of the assays in the set is smaller than q.
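As a small illustration of the formula for q and of the stop condition, consider the Python sketch below. The function names are made up for this example, and the use of integer division for target_size/10 is an assumption; the text above does not say how the quotient is rounded.

```python
def number_of_groups(target_size):
    """Return q = max(10, min(100, target_size/10)).

    Integer division is an assumption made for this sketch.
    """
    return max(10, min(100, target_size // 10))


def enough_values(q, target_size, assay_sizes):
    """Return True if the target and every assay have at least q
    well-defined expression values, that is, if normalization can proceed."""
    return target_size >= q and all(size >= q for size in assay_sizes)
```

With these definitions q is 10 for targets of up to 109 values, grows linearly after that, and is capped at 100 for targets of 1000 values or more.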
The target is defined by selecting a subset of the assays in the bioassay set, and the target expression values are the medians of probe intensities over the bioassay set. Probes with no well-defined measurements in the bioassay set are simply ignored in the target calculation.
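The target calculation can be pictured with the following Python sketch. It assumes, for illustration only, that the selected assays are given as equally long lists with one entry per probe and that a missing measurement is represented by None; neither the function name nor this layout comes from the plug-in itself.

```python
from statistics import median


def target_values(selected_assays):
    """Return the per-probe medians over the selected assays.

    Probes without any well-defined measurement are skipped, so the
    result may be shorter than the number of probes.
    """
    target = []
    for probe_values in zip(*selected_assays):
        defined = [v for v in probe_values if v is not None]
        if defined:
            target.append(median(defined))
    return target
```

For three assays with the values [1.0, None, 3.0], [2.0, None, 5.0] and [9.0, None, None], the second probe is ignored and the target becomes [2.0, 4.0].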
Since the normalization calculations are based on geometric means and performed in log space, the intensities must be larger than 0. Rather than expecting the user of qQuantile normalization to remove zero and negative intensities, the underlying algorithm silently ignores them.
The bioassay set to be normalized must contain non-logarithmic values, since this plug-in takes the logarithm of all values before performing the normalization.
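The handling of unusable intensities can be sketched as below. The function name, the use of None for missing values, and the choice of the natural logarithm are assumptions made for this example; the text does not state which base the plug-in uses.

```python
import math


def log_intensities(values):
    """Return the logarithms of all strictly positive intensities.

    Missing (None), zero and negative intensities are silently dropped.
    The natural logarithm is an assumption of this sketch.
    """
    return [math.log(v) for v in values if v is not None and v > 0]
```

Feeding already logged data into such a step would take the logarithm twice, which is why the input has to be non-logarithmic.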
Background subtraction and proper filtration should be done on the bioassay set before running this plug-in.
Quantile normalization
In quantile normalization, the data of each assay are sorted in ascending order of expression value and added to a matrix as a column. Each matrix row then contains a mix of probes (also known as reporters or genes) determined by their rank. For each row in the matrix, the expression values are replaced with the row average (geometric or arithmetic, selectable by the user). Finally, each assay is reordered into its original order to retain a standard expression matrix where each row represents one probe. Assays are not mixed.
Background subtraction and proper filtration should be done on the bioassay set before running this plug-in. The bioassay set must not contain any missing values.
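The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plug-in's implementation: assays are given as equally long lists without missing values, the function name is made up, and ties are left in their original order by Python's stable sort (the text does not describe tie handling).

```python
import math


def quantile_normalize(assays, geometric=False):
    """Quantile normalize a list of assays (columns of equal length).

    Returns new columns in the original probe order.
    """
    n_probes = len(assays[0])
    # orders[j][rank] is the probe index holding the rank-th smallest
    # value of assay j
    orders = [sorted(range(n_probes), key=col.__getitem__) for col in assays]
    result = [[0.0] * n_probes for _ in assays]
    for rank in range(n_probes):
        # One row of the sorted matrix: the rank-th smallest value of each assay
        row = [col[order[rank]] for col, order in zip(assays, orders)]
        if geometric:
            avg = math.exp(sum(math.log(v) for v in row) / len(row))
        else:
            avg = sum(row) / len(row)
        # Write the row average back at each probe's original position
        for j, order in enumerate(orders):
            result[j][order[rank]] = avg
    return result
```

For the two assays [5.0, 2.0, 3.0] and [4.0, 1.0, 6.0] with arithmetic averaging, the sorted rows are (2, 1), (3, 4) and (5, 6), so the result is [5.5, 1.5, 3.5] and [3.5, 1.5, 5.5]. Both assays end up with the same set of values, each in its own probe order.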
Copyright (C) 2008 Jari Häkkinen, Martin Svensson
Copyright (C) 2009 Jari Häkkinen

This file is part of the Normalizers plug-in package for BASE
(net.sf.basedb.normalizers). The package is available at
http://baseplugins.thep.lu.se/
BASE main site is http://base.thep.lu.se/

This is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

The software is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.