Timestamp:
Feb 20, 2009, 1:52:57 AM
Author:
Peter
Message:

working on documentation. refs #478

File:
1 edited

  • trunk/yat/normalizer/qQuantileNormalizer.h

    r1803 r1810  
    47 47     \brief Perform Q-quantile normalization
    48 48
    49      After a Q-quantile normalization each column has approximately
    50      the same distribution of data (the Q-quantiles are the
    51      same). Also, within each column the rank of an element is not
    52      changed.
    53 
    54      The normalization goes like this
    55      - Data is not assumed to be sorted.
    56      - Partition sorted target data in N parts. N must be 3 larger
     49     Perform a Q-quantile normalization on a \a source range, after
     50     which it will approximately have the same distribution of data as
     51     the \a target range (the Q-quantiles are the same). The rank of
     52     an element in the \a source range is not changed.
     53
     54     The class also works with unweighted ranges, and there is no
     55     restriction that weighted \a source range requires weighted \a
     56     target range or vice versa.
     57
     58     Normalization goes like this:
     59     - Data are not assumed to be sorted.
     60     - Partition sorted \a target data in N parts. N must be 3 or larger
    57 61       because of requirements from the underlying cspline fit
    58      - Calculate the arithmetic mean for each part, the mean is
     62     - Calculate the arithmetic (weighted) mean for each part, the mean is
    59 63       assigned to the mid point of each part.
    60      - Do the above for the data to be transformed (called source
     64     - Do the above for the data to be transformed (called \a source
    61 65       here).
    62      - For each part, calculate the difference between the target and
    63        the source. Now we have N differences d_i with associated rank
    64        (midpoint of each part).
    65      - Create a cubic spline fit to this difference vector d. The
    66        resulting curve is used to recalculate all column values.
     66     - For each part, calculate the difference between the \a target
     67       and the \a source. Now we have \a N differences \f$ d_i \f$
     68       with associated rank (midpoint of each part).
     69     - Create a cubic spline fit to this difference vector \a d. The
     70       resulting curve is used to recalculate all values in \a source.
    67 71       - Use the cubic spline fit for values within the cubic spline
    68 72         fit range [midpoint 1st part, midpoint last part].
    69 73       - For data outside the cubic spline fit use linear
    70          extrapolation, i.e., a constant shift. d_first for points
    71          below fit range, and d_last for points above fit range.
     74         extrapolation, i.e., a constant shift. \f$ d_{first} \f$ for
     75         points below fit range, and \f$ d_{last} \f$ for points above fit
     76         range.
    72 77
    73 78     \since New in yat 0.5
     
    77 82  public:
    78 83    /**
    79        \brief Documentation please.
     84       \brief Constructor
    80 85
    81 86       \a Q is the number of parts and must be within \f$ [3,N] \f$
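
The normalization steps described in the new comment can be sketched roughly as follows. This is an illustration only, not the yat implementation: the names part_means, shift_at and normalize_sketch are hypothetical, only unweighted data are handled, and the cubic spline fit is replaced by plain linear interpolation between part midpoints to keep the example short. Points outside the fit range get the constant shift of the first or last part, as the comment describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sort a copy of the data, split it into Q contiguous
// parts and return the arithmetic mean of each part.
std::vector<double> part_means(std::vector<double> data, std::size_t Q)
{
  assert(Q >= 3 && Q <= data.size());  // Q must be within [3, N]
  std::sort(data.begin(), data.end());
  const std::size_t n = data.size();
  std::vector<double> means(Q);
  for (std::size_t i = 0; i < Q; ++i) {
    const std::size_t first = i * n / Q;
    const std::size_t last = (i + 1) * n / Q;
    means[i] = std::accumulate(data.begin() + first, data.begin() + last, 0.0)
               / static_cast<double>(last - first);
  }
  return means;
}

// Hypothetical helper: shift for relative rank r in (0, 1). Part i has its
// midpoint at (i + 0.5) / Q. Linear interpolation stands in for the cubic
// spline here; outside the midpoint range the shift is constant.
double shift_at(const std::vector<double>& d, double r)
{
  const double Q = static_cast<double>(d.size());
  const double x = r * Q - 0.5;  // position measured in part midpoints
  if (x <= 0.0)
    return d.front();
  if (x >= Q - 1.0)
    return d.back();
  const std::size_t i = static_cast<std::size_t>(x);
  const double t = x - static_cast<double>(i);
  return (1.0 - t) * d[i] + t * d[i + 1];
}

// Returns a shifted copy of source; element order and ranks are kept.
std::vector<double> normalize_sketch(const std::vector<double>& target,
                                     const std::vector<double>& source,
                                     std::size_t Q)
{
  const std::vector<double> t = part_means(target, Q);
  const std::vector<double> s = part_means(source, Q);
  std::vector<double> d(Q);
  for (std::size_t i = 0; i < Q; ++i)
    d[i] = t[i] - s[i];  // difference d_i between target and source

  // Rank the source elements without reordering them.
  const std::size_t n = source.size();
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t(0));
  std::sort(order.begin(), order.end(),
            [&source](std::size_t a, std::size_t b)
            { return source[a] < source[b]; });

  std::vector<double> result(n);
  for (std::size_t rank = 0; rank < n; ++rank) {
    const std::size_t idx = order[rank];
    const double r = (static_cast<double>(rank) + 0.5) / static_cast<double>(n);
    result[idx] = source[idx] + shift_at(d, r);
  }
  return result;
}
```

Because the function takes the two ranges by const reference and returns a copy, the caller's data are left untouched; an in-place variant would presumably write through an output iterator instead.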