source: plugins/base2/net.sf.basedb.normalizers/trunk/README @ 1064

Last change on this file since 1064 was 1064, checked in by Jari Häkkinen, 13 years ago

Addresses #118. Calculate target with median instead of mean.


$Id: README 1064 2009-05-14 10:26:03Z jari $

About Normalization package for BASE

The Normalization package for BASE (net.sf.basedb.normalizers) is a collection of normalization plug-ins for expression data. See Documentation below for further information about the individual plug-ins. All plug-ins in this package work on bioassay sets with either 1-channel or 2-channel data. The algorithms operate on expression values; for 2-channel data, the ratio ch1/ch2 is used.

Normalization package for BASE is free software. See the file license.txt for copying conditions.

The package was created, and is maintained, by Martin Svensson and Jari Häkkinen.

Downloading

Normalization package for BASE can be obtained from

http://baseplugins.thep.lu.se/wiki/PluginDownload

Installation

Installation instructions can be found in the 'INSTALL' file.

Documentation

Average normalization

This plug-in scales the expression values of each assay by a factor, S, defined as a target value divided by the assay average. The target is either i) the geometric mean of the expression values of all spots in the bioassay set, or ii) a user-defined value.

Each new expression value is S times the original expression value.
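
The scaling can be sketched in a few lines of Python. This is an illustration only, not the plug-in code: the function names and the dict-based data layout are made up for the example, and the assay average is assumed to be a geometric mean like the target.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def average_normalize(assays, target=None):
    """Scale every assay by S = target / assay average.

    assays: dict mapping an assay name to a list of positive expression values.
    target: user-defined value, or None to use the geometric mean of all
            spots in the bioassay set.
    Assumption: the assay average is computed as a geometric mean.
    """
    if target is None:
        all_spots = [v for values in assays.values() for v in values]
        target = geometric_mean(all_spots)
    normalized = {}
    for name, values in assays.items():
        s = target / geometric_mean(values)  # scale factor S for this assay
        normalized[name] = [s * v for v in values]
    return normalized
```

After the call, every assay has the target as its (geometric) average. With two assays [1, 4] and [4, 16] and no user-defined target, the scale factors are 2 and 0.5.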

Background subtraction and proper filtration have to be done before running this plug-in.

qQuantile normalization

qQN works with geometric means of non-logged values, so all expression values must be positive.

The assays are normalized against a selectable subset of the assays ... If a probe has no well-defined measurement (i.e., no assay in the reference subset has a well-defined value for the probe), it is simply excluded from the target distribution.

The intention is to adjust q to the maximum number of non-NaN values ... but, as noted, for the time being q == 100.

The current limitation is that each assay must contain a reasonable number of well-defined measurements (a few hundred per assay); otherwise the program will probably crash. The planned fix is to choose the number of bins in the distribution calculation as #bins = min(100, N_i/10), i = 1...#assays, and to require that #bins is at least 10. In other words, each assay must have at least 100 well-defined points. If this requirement is not met, the program stops with a reasonably friendly message.
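
A minimal sketch of the bin-count rule described above, read here as min(100, N_i/10) with integer division. This reading is an assumption, chosen because it is the one under which the lower bound of 10 bins corresponds to 100 well-defined points per assay. The function name and error text are hypothetical and not taken from the plug-in.

```python
def bins_for_assays(well_defined_counts, max_bins=100, min_bins=10):
    """Return the number of distribution bins for each assay.

    well_defined_counts: N_i, the number of well-defined values in assay i.
    Assumed rule: bins = min(max_bins, N_i // 10), and bins >= min_bins.
    """
    bins = []
    for i, n in enumerate(well_defined_counts, start=1):
        b = min(max_bins, n // 10)
        if b < min_bins:
            # Stop with a readable message instead of crashing later on.
            raise ValueError(
                f"assay {i} has only {n} well-defined values; "
                f"at least {min_bins * 10} are required"
            )
        bins.append(b)
    return bins
```

Under this reading an assay with 550 well-defined values gets 55 bins, an assay with 5000 values is capped at 100 bins, and an assay with 99 values is rejected.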

In q-quantile normalization the data for each assay are sorted in ascending order of expression value and added to a matrix as a column. Each matrix row therefore contains a mix of probes (also known as reporters or genes), determined by their rank. For each row in the matrix, the expression values are replaced with the row average. Finally, each assay is restored to its original order, giving a standard expression matrix where each row represents one probe. Assays are not mixed.

Background subtraction and proper filtration should be done on the bioassay set before running this plug-in. The bioassay set must not contain any missing values.

The qQuantile normalization is inspired by the 'Cubic Spline' normalization in Illumina Beadstudio and the work by Workman et al., http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12225587

Quantile normalization

In quantile normalization the data for each assay are sorted in ascending order of expression value and added to a matrix as a column. Each matrix row therefore contains a mix of probes (also known as reporters or genes), determined by their rank. For each row in the matrix, the expression values are replaced with the row average. Finally, each assay is restored to its original order, giving a standard expression matrix where each row represents one probe. Assays are not mixed.
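
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the plug-in implementation: the function name and the list-of-lists layout are invented for the example, the row average is taken to be the arithmetic mean, and tied values are ranked in their original order.

```python
def quantile_normalize(assays):
    """Quantile-normalize a list of assays.

    assays: list of equal-length lists of expression values, one list per
            assay, with index p holding the value for probe p. No missing
            values are allowed.
    Returns new lists in the original probe order.
    """
    n_probes = len(assays[0])
    if any(len(a) != n_probes for a in assays):
        raise ValueError("all assays must have the same number of probes")

    # For each assay: probe indices sorted by ascending expression value.
    orders = [sorted(range(n_probes), key=a.__getitem__) for a in assays]

    # Row average for each rank across the sorted columns
    # (assumption: arithmetic mean).
    rank_means = [
        sum(a[order[r]] for a, order in zip(assays, orders)) / len(assays)
        for r in range(n_probes)
    ]

    # Write each rank average back at the probe's original position.
    result = []
    for order in orders:
        column = [0.0] * n_probes
        for rank, probe in enumerate(order):
            column[probe] = rank_means[rank]
        result.append(column)
    return result
```

After normalization every assay contains the same set of values, and each probe keeps the rank it had within its own assay.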

Background subtraction and proper filtration should be done on the bioassay set before running this plug-in. The bioassay set must not contain any missing values.


Copyright (C) 2008 Jari Häkkinen, Martin Svensson
Copyright (C) 2009 Jari Häkkinen
This file is part of the Normalizers plug-in package for BASE
(net.sf.basedb.normalizers). The package is available at
http://baseplugins.thep.lu.se/ and the BASE main site is
http://base.thep.lu.se/
This is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 3
of the License, or (at your option) any later version.
The software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.