Opened 12 years ago

Last modified 12 years ago

#401 new discussion

Weighted Quantile Normalization

Reported by: Peter Owned by: Peter
Priority: minor Milestone: yat 0.x+
Component: normalizer Version: trunk
Keywords: Cc:

Description (last modified by Peter)

related to ticket:478

Same as the unweighted case, but working on MatrixWeighted rather than Matrix.

However, it is not straightforward to generalize the algorithm. In the unweighted case, each column follows the same distribution after normalization. What does that post-condition imply in the weighted case?

Change History (3)

comment:1 Changed 12 years ago by Peter

The unweighted Quantile Normalization algorithm has three steps (see #288):

  • sort each column
  • replace each element with the average of its row
  • undo the sort from step 1
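For illustration, the three steps can be sketched in plain Python as below. This is only a sketch under stated assumptions, not the yat implementation: the function name `quantile_normalize` and the list-of-rows matrix layout are hypothetical, and ties are broken by whatever order the sort happens to produce.

```python
def quantile_normalize(x):
    """Unweighted quantile normalization of a matrix given as a list of rows.

    Hypothetical helper for illustration; not part of yat.
    """
    n_rows, n_cols = len(x), len(x[0])
    # Step 1: for each column, the row indices in ascending order of value.
    order = [sorted(range(n_rows), key=lambda i, j=j: x[i][j])
             for j in range(n_cols)]
    # Step 2: average the sorted columns row by row.
    means = [sum(x[order[j][pos]][j] for j in range(n_cols)) / n_cols
             for pos in range(n_rows)]
    # Step 3: undo the sort, so each element receives the average that
    # belongs to its position in the sorted column.
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        for pos, i in enumerate(order[j]):
            out[i][j] = means[pos]
    return out


x = [[2.0, 9.0],
     [4.0, 5.0],
     [6.0, 7.0]]
normalized = quantile_normalize(x)
# normalized == [[3.5, 7.5], [5.5, 3.5], [7.5, 5.5]]
```

After the call, every column contains the same set of values (3.5, 5.5, 7.5), which is the post-condition mentioned in the description.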

To generalize this algorithm, I will reformulate it as an optimization problem that carries over to the weighted case. If we let x be the matrix prior to normalization, we can define the generalized rank as:

r_ij = sum_k w_kj * I(x_ij - x_kj)

where I(x)=1 if x>0, I(x)=1/2 if x=0, and I(x)=0 if x<0.
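A minimal sketch of this rank, assuming the data x and the weights w are both given as lists of rows of equal shape (the names `indicator` and `generalized_rank` are hypothetical). The sum over k includes k = i, so every element contributes half of its own weight to its rank.

```python
def indicator(d):
    """I(d) = 1 if d > 0, 1/2 if d == 0, 0 if d < 0."""
    if d > 0:
        return 1.0
    if d == 0:
        return 0.5
    return 0.0


def generalized_rank(x, w):
    """r_ij = sum_k w_kj * I(x_ij - x_kj), computed column by column."""
    n_rows, n_cols = len(x), len(x[0])
    return [[sum(w[k][j] * indicator(x[i][j] - x[k][j])
                 for k in range(n_rows))
             for j in range(n_cols)]
            for i in range(n_rows)]


x = [[3.0], [1.0], [2.0]]
unit = [[1.0], [1.0], [1.0]]
ranks = generalized_rank(x, unit)
# ranks == [[2.5], [0.5], [1.5]]

# Giving the smallest element zero weight removes it from the ranking.
w = [[1.0], [0.0], [1.0]]
weighted_ranks = generalized_rank(x, w)
# weighted_ranks == [[1.5], [0.0], [0.5]]
```

With unit weights and no ties, the rank is the usual 1-based rank minus 1/2, and two tied elements always get the same rank.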

Then the quantile normalized matrix x' fulfills: x'_ij <= x'_kl iff r_ij <= r_kl

and Q = sum_ij (x_ij - x'_ij)^2 is minimized.
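The inequality condition can be checked by brute force for small matrices. The sketch below assumes a candidate x' and the ranks r are given as lists of rows; `satisfies_rank_order` is a hypothetical helper that compares all pairs of elements, so it is quadratic in the number of elements and only meant for testing.

```python
from itertools import product


def satisfies_rank_order(xp, r):
    """True if x'_ij <= x'_kl holds exactly when r_ij <= r_kl, for all pairs."""
    cells = [(i, j) for i in range(len(xp)) for j in range(len(xp[0]))]
    for (i, j), (k, l) in product(cells, repeat=2):
        if (xp[i][j] <= xp[k][l]) != (r[i][j] <= r[k][l]):
            return False
    return True


# Ranks of x = [[1, 4], [3, 2]] with unit weights.
r = [[0.5, 1.5],
     [1.5, 0.5]]
ok = satisfies_rank_order([[1.5, 3.5], [3.5, 1.5]], r)    # True
bad = satisfies_rank_order([[1.0, 4.0], [3.0, 2.0]], r)   # False
```

The second call fails because the unnormalized elements 1 and 2 share rank 0.5 but differ in value, which the "iff" condition does not allow.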

In other words, we can view QN as a minimization problem with a number of inequality conditions. This is straightforward to generalize to the weighted case:

The inequality conditions are good as they are, and the objective function to be minimized could be modified to

Q = sum_ij w_ij (x_ij - x'_ij)^2
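As a small sketch, the weighted objective can be evaluated as follows, assuming x, x' and w are lists of rows of equal shape (`weighted_objective` is a hypothetical name). It only evaluates Q for a given candidate x'; it does not solve the minimization.

```python
def weighted_objective(x, xp, w):
    """Q = sum_ij w_ij * (x_ij - x'_ij)^2."""
    return sum(w[i][j] * (x[i][j] - xp[i][j]) ** 2
               for i in range(len(x))
               for j in range(len(x[0])))


x = [[1.0, 4.0], [3.0, 2.0]]
xp = [[1.5, 3.5], [3.5, 1.5]]

# Unit weights recover the unweighted objective.
q_unit = weighted_objective(x, xp, [[1.0, 1.0], [1.0, 1.0]])   # 1.0

# Elements with zero weight do not contribute to Q at all.
q_weighted = weighted_objective(x, xp, [[1.0, 0.0], [2.0, 0.0]])  # 0.75
```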

The remaining problem is to find an algorithm that can find the solution fast.

comment:2 Changed 12 years ago by Peter

Component: changed from utility to normalizer
Owner: changed from Jari Häkkinen to Peter

A related question is how to take care of ties. Currently, two elements within a column that are equal prior to normalization are likely not equal after normalization.
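One possible convention, shown here only as a sketch and not as current yat behaviour, is to give tied elements the mean of the reference values at the sorted positions they occupy. This is in the same spirit as I(0) = 1/2 in comment:1. The helper name `assign_with_ties` and the `reference` argument (the sorted row averages) are assumptions made for the example.

```python
def assign_with_ties(column, reference):
    """Map a column onto ascending reference values, sharing values among ties."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    out = [0.0] * n
    start = 0
    while start < n:
        # Extend the run [start, end] over all elements equal to the first one.
        end = start
        while end + 1 < n and column[order[end + 1]] == column[order[start]]:
            end += 1
        # Every element in the run gets the mean of the run's reference values.
        shared = sum(reference[start:end + 1]) / (end - start + 1)
        for pos in range(start, end + 1):
            out[order[pos]] = shared
        start = end + 1
    return out


column = [4.0, 1.0, 4.0, 2.0]
reference = [2.0, 3.0, 5.0, 6.0]
result = assign_with_ties(column, reference)
# result == [5.5, 2.0, 5.5, 3.0]
```

The two elements equal to 4 both become 5.5, whereas a plain sort-based assignment would give one of them 5 and the other 6. The price is that the column no longer contains exactly the reference values.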

comment:3 Changed 12 years ago by Peter

Description: modified (diff)

#478 was marked as related
