Opened 15 years ago
Last modified 14 years ago
#401 new discussion
Weighted Quantile Normalization
Reported by: | Peter | Owned by: | Peter |
---|---|---|---|
Priority: | minor | Milestone: | yat 0.x+ |
Component: | normalizer | Version: | trunk |
Keywords: | Cc: |
Description
related to ticket:478
Same as the unweighted case, but working on MatrixWeighted rather than Matrix.
However, it is not straightforward to generalize the algorithm. In the unweighted case, each column follows the same distribution after normalization. What does that post-condition imply in the weighted case?
Change History (3)
comment:1 Changed 15 years ago by
comment:2 Changed 15 years ago by
Component: | utility → normalizer |
---|---|
Owner: | changed from Jari Häkkinen to Peter |
A related question is how to take care of ties. Currently, two elements within a column that are equal prior to normalization are likely not equal after normalization.
The unweighted Quantile Normalization algorithm has three steps (see #288):
To generalize this algorithm I will reformulate it as an optimization problem that carries over to the weighted case. If we let x be the matrix prior to normalization, we can define the generalized rank as:
r_ij = sum_k w_kj * I(x_ij - x_kj)
where I(x)=1 if x>0, I(x)=1/2 if x=0, and I(x)=0 if x<0.
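To make the definition concrete, here is a minimal NumPy sketch of the generalized rank. The function name `generalized_rank` and the plain-array interface are assumptions made for illustration; this is not the yat API and does not use MatrixWeighted.

```python
import numpy as np

def generalized_rank(x, w):
    """Return r with r[i, j] = sum_k w[k, j] * I(x[i, j] - x[k, j]).

    I(d) is 1 for d > 0, 1/2 for d == 0 and 0 for d < 0, as defined above.
    x and w are arrays of the same shape (rows x columns).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    # diff[i, k, j] = x[i, j] - x[k, j]: every element against every
    # element of the same column.
    diff = x[:, None, :] - x[None, :, :]
    # Step function I applied elementwise.
    ind = np.where(diff > 0, 1.0, np.where(diff == 0, 0.5, 0.0))
    # Sum over k, weighting each comparison by w[k, j].
    return np.einsum("ikj,kj->ij", ind, w)

# One column with a tie, unit weights.
x = np.array([[1.0], [2.0], [2.0], [3.0]])
print(generalized_rank(x, np.ones_like(x))[:, 0])
```

With unit weights this is the usual average rank minus 1/2, and tied elements get the same generalized rank by construction. The sketch uses O(n^2) memory per column, so it is only meant for checking small cases.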
Then the quantile normalized matrix x' fulfills: x'_ij <= x'_kl iff r_ij <= r_kl
and Q = sum_ij (x_ij - x'_ij)^2 is minimized.
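For reference, with unit weights and no ties the order condition forces all elements of the same rank to share one value, and the least-squares choice for that value is their mean. That is the textbook quantile normalization: sort each column, average across each sorted row, and write the averages back by rank. A minimal NumPy sketch follows; the name `quantile_normalize` is an assumption and this is not the yat implementation.

```python
import numpy as np

def quantile_normalize(x):
    """Textbook unweighted quantile normalization of the columns of x."""
    x = np.asarray(x, dtype=float)
    # order[k, j] is the row index of the k-th smallest element in column j.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Mean of the k-th smallest values across all columns.
    row_means = sorted_x.mean(axis=1)
    out = np.empty_like(x)
    # Write row_means[k] at the position of the k-th smallest element
    # of every column.
    np.put_along_axis(out, order, row_means[:, None], axis=0)
    return out

x = np.array([[1.0, 10.0],
              [3.0, 20.0],
              [2.0, 60.0]])
print(quantile_normalize(x))
```

When a column contains ties, the stable sort in this sketch still gives the tied elements different ranks and therefore different values, which is the tie problem raised in comment:2.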
In other words, we can view QN as a minimization problem with a number of inequality conditions. This is straightforward to generalize to the weighted case:
The inequality conditions are good as they are, and the objective function to be minimized could be modified to
Q = sum_ij w_ij (x_ij - x'_ij)^2
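A candidate solution can be checked against this formulation with a few lines of NumPy. The sketch below only evaluates the weighted objective and tests the order condition for a given x'; it is not a solver. The names `weighted_objective` and `satisfies_rank_order` are assumptions made for illustration, and the ranks r are passed in as a plain array.

```python
import numpy as np

def weighted_objective(x, x_norm, w):
    """Q = sum_ij w_ij * (x_ij - x'_ij)^2."""
    x = np.asarray(x, dtype=float)
    x_norm = np.asarray(x_norm, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * (x - x_norm) ** 2))

def satisfies_rank_order(x_norm, r):
    """True if x'_ij <= x'_kl iff r_ij <= r_kl for every pair of elements."""
    v = np.ravel(np.asarray(x_norm, dtype=float))
    rr = np.ravel(np.asarray(r, dtype=float))
    # Compare the full pairwise "<=" relation of values and of ranks.
    same = np.array_equal(v[:, None] <= v[None, :], rr[:, None] <= rr[None, :])
    return bool(same)

x = np.array([[1.0, 2.0], [3.0, 5.0]])
w = np.array([[1.0, 1.0], [1.0, 0.5]])
r = np.array([[0.5, 0.5], [1.5, 1.5]])
candidate = np.array([[1.0, 1.0], [3.0, 3.0]])
print(weighted_objective(x, candidate, w), satisfies_rank_order(candidate, r))
```

The pairwise check is quadratic in the number of matrix elements, so it is suitable for validating a fast algorithm on small inputs rather than for production use.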
The remaining problem is to find an algorithm that can find the solution fast.