Last change on this file was 2173, checked in by Jari Häkkinen, 10 years ago

Refs #228 and #541. Improving doc and help string

$Id: README 2173 2013-12-12 09:30:30Z jari $

About Normalization package for BASE

The Normalization package for BASE (net.sf.basedb.normalizers) plug-in set is a collection of normalizers for expression data. See Documentation below for further information about the different plug-ins in this package. Common to most of the plug-ins in this package is that they work on bioassay sets with either 1-channel or 2-channel data. The algorithms operate on expression values; for 2-channel data, the ratio ch1/ch2 is used.

Normalization package for BASE is free software. See the file license.txt for copying conditions.

The package was created, and is maintained, by Martin Svensson and Jari Häkkinen.

Downloading

Normalization package for BASE can be obtained from

http://baseplugins.thep.lu.se/wiki/PluginDownload

Installation

Installation instructions can be found in the 'INSTALL' file.

Documentation

Average normalization

This plug-in scales the expression values of an assay by a factor, S, defined as either i) the geometric mean of the expression values of all spots in the bioassay set divided by the assay average, or ii) a user-defined value divided by the assay average.

Each new expression value is S times the original expression value. The user can choose between the geometric and the arithmetic mean when calculating the averages.
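
The scaling described above can be sketched in Python as follows. This is only an illustration, not the plug-in's code: the function names, the dictionary layout, and the choice to use the same kind of mean for both the reference and the assay average are assumptions made for the example.

```python
import math


def mean(values, geometric=True):
    """Geometric or arithmetic mean; the geometric mean needs values > 0."""
    if geometric:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return sum(values) / len(values)


def average_normalize(assays, reference=None, geometric=True):
    """Scale every assay by S = reference / assay average.

    assays    -- dict mapping a (hypothetical) assay name to its expression values
    reference -- user-defined value; if None, the mean over all spots in the set
    """
    if reference is None:
        all_spots = [v for values in assays.values() for v in values]
        reference = mean(all_spots, geometric)
    normalized = {}
    for name, values in assays.items():
        s = reference / mean(values, geometric)  # scale factor S for this assay
        normalized[name] = [s * v for v in values]
    return normalized
```

After scaling, every assay has the same average (the reference), which is the point of the method.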

Background subtraction and proper filtration have to be done before running this plug-in.

qQuantile normalization

The current implementation of qQuantile normalization supports only 1-channel arrays.

The qQuantile normalization is inspired by the 'Cubic Spline' normalization in Illumina BeadStudio and by the work of Workman et al., http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12225587

In qQuantile normalization, the probe intensities of every assay (including the target) are sorted in order of increasing intensity. Each sorted list of probe intensities is partitioned into q groups, and each of these q groups is adjusted (normalized) to the corresponding target group. After normalization, the intensity distribution of each assay will be approximately the same as the target distribution. q is calculated as q=max(10,min(100,target_size/10)). The program will stop if the number of well-defined expression values in the target or in any of the assays in the set is smaller than q.
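
As a small illustration of the formula for q and of the stop condition, consider the Python sketch below. It assumes that target_size/10 is an integer division, which the text does not state, and the function names are hypothetical.

```python
def number_of_groups(target_size):
    """q = max(10, min(100, target_size/10)); integer division is an assumption."""
    return max(10, min(100, target_size // 10))


def check_sizes(target_size, assay_sizes):
    """Return q, or raise if the target or any assay has fewer than q values.

    The sizes count well-defined expression values only.
    """
    q = number_of_groups(target_size)
    for size in [target_size, *assay_sizes]:
        if size < q:
            raise ValueError(f"only {size} well-defined values, need at least {q}")
    return q
```

With this reading, a target with 500 well-defined values gives q = 50, and q always stays between 10 and 100.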

The target is defined by selecting a subset of the assays in the bioassay set, and the target expression values are the medians of probe intensities over the bioassay set. Probes with no well-defined measurements in the bioassay set are simply ignored in the target calculation.
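
The target calculation can be sketched as a per-probe median in Python. The data layout (one list per selected assay, with None marking a value that is not well defined) and the function name are assumptions for the example, not the plug-in's actual representation.

```python
from statistics import median


def target_values(selected_assays):
    """Per-probe median over the selected assays.

    selected_assays -- list of equal-length lists, one per assay;
                       None marks a measurement that is not well defined.
    Probes without any well-defined measurement are left out of the target.
    """
    target = []
    for probe_values in zip(*selected_assays):
        defined = [v for v in probe_values if v is not None]
        if defined:
            target.append(median(defined))
    return target
```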

Since the normalization calculations are based on geometric means and performed in log space, the intensities must be strictly positive. Rather than expecting the user of qQuantile normalization to remove zero and negative intensities, the underlying algorithm silently ignores them.
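
The Python sketch below illustrates why non-positive intensities have to be dropped: a geometric mean computed in log space is only defined for values larger than 0. The use of the natural logarithm, the None marker for missing values, and the function names are assumptions for the example.

```python
import math


def log_intensities(intensities):
    """Log-transform the intensities, silently skipping missing,
    zero, and negative values."""
    return [math.log(v) for v in intensities if v is not None and v > 0]


def geometric_mean(intensities):
    """Geometric mean computed as exp(mean(log(v))) over the usable values."""
    logs = log_intensities(intensities)
    return math.exp(sum(logs) / len(logs))
```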

The bioassay set to be normalized must contain non-logarithmic values since this plug-in takes the logarithm of all values before performing the normalization.

Background subtraction and proper filtration should be done on the bioassay set before running this plug-in.

Quantile normalization

In quantile normalization, the data of each assay are sorted in ascending order of expression value and added to a matrix as a column. Each matrix row will then contain a mix of probes (also known as reporters or genes), determined by their rank. For each row in the matrix, the expression values are replaced with the row average (geometric or arithmetic, selectable by the user). Finally, each assay is reordered into its original order to restore a standard expression matrix where each row represents one probe. Assays are not mixed.
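
The procedure can be sketched in Python as shown below. This is a minimal illustration under stated assumptions, not the plug-in's implementation: the function name and the list-of-lists layout are made up for the example, all assays are assumed to have the same number of probes and no missing values, and ties are broken by original position.

```python
import math


def quantile_normalize(assays, geometric=False):
    """Quantile-normalize a list of assays (equal-length lists of values)."""
    n_probes = len(assays[0])
    # For each assay, orders[j][rank] is the original index of the probe
    # that has that rank after sorting by ascending expression value.
    orders = [sorted(range(n_probes), key=lambda i: column[i])
              for column in assays]
    # Average across assays for each rank (one row of the sorted matrix).
    row_means = []
    for rank in range(n_probes):
        row = [column[order[rank]] for column, order in zip(assays, orders)]
        if geometric:
            row_means.append(math.exp(sum(math.log(v) for v in row) / len(row)))
        else:
            row_means.append(sum(row) / len(row))
    # Put each rank average back at the probe's original position.
    normalized = []
    for order in orders:
        column = [0.0] * n_probes
        for rank, original_index in enumerate(order):
            column[original_index] = row_means[rank]
        normalized.append(column)
    return normalized
```

After this step, all assays contain exactly the same set of values and differ only in which probe carries which value.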

Background subtraction and proper filtration should be done on the bioassay set before running this plug-in. The bioassay set must not contain any missing values.


Copyright (C) 2008 Jari Häkkinen, Martin Svensson
Copyright (C) 2009 Jari Häkkinen
This file is part of the Normalizers plug-in package for BASE
(net.sf.basedb.normalizers). The package is available at
http://baseplugins.thep.lu.se/ and the BASE main site is
http://base.thep.lu.se/
This is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 3
of the License, or (at your option) any later version.
The software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.