Changeset 2165


Timestamp:
Dec 9, 2013, 12:23:59 PM (9 years ago)
Author:
olle
Message:

Refs #228. Refs #541. Quantile normalization now untransforms logarithmic data before normalization and transforms the result back to logarithmic format before it is stored. The default averaging method is now always Formula.AverageMethod.GEOMETRIC_MEAN:

  1. Class/file QuantileNormalization.java in src/net/sf/basedb/plugins/ in package net.sf.basedb.normalizers updated:
    a. Private method RequestInformation getConfiguredJobParameters() updated to always set the default averaging method to Formula.AverageMethod.GEOMETRIC_MEAN.
    b. Private method BioAssaySet normalize(DbControl dc, BioAssaySet source, Job job, ProgressReporter progress) updated to call public method double transform(double value) in class IntensityTransform to transform the normalized result back before storing it. The error message shown when the number of spots differs between two bioassays now reports the names of both bioassays and the number of spots in each one. Also minor updates to make the code clearer.
  2. Class/file AbstractNormalizationPlugin.java in src/net/sf/basedb/plugins/ in package net.sf.basedb.normalizers updated: the help text for selecting averaging method no longer refers to the format the data is stored in, since averaging is performed on untransformed data.
  3. XML file extensions.xml in META-INF in package net.sf.basedb.normalizers updated: the help text for Quantile normalization now states that data stored in logarithmic format will be untransformed before averaging, and then transformed back to logarithmic format before results are stored.
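
The flow described in the message can be illustrated with a minimal, self-contained sketch. It is not the plug-in's code: the class QuantileSketch, the method normalizeLog2 and the in-memory double[][] layout are assumptions made for illustration, and a log2 transform is assumed. Each assay is sorted, the values at each rank are untransformed and averaged with a geometric mean, and the average is transformed back to log2 and written to the original spot positions.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch (not the plug-in's code): quantile normalization of log2 data.
final class QuantileSketch
{
  // log2Data[assay][spot]; every assay must have the same number of spots.
  static double[][] normalizeLog2(double[][] log2Data)
  {
    int assays = log2Data.length;
    int spots = log2Data[0].length;
    Integer[][] order = new Integer[assays][];
    double[] logSum = new double[spots]; // per-rank sum of ln(untransformed value)

    for (int a = 0; a < assays; a++)
    {
      final double[] column = log2Data[a];
      if (column.length != spots)
      {
        throw new IllegalArgumentException("Assay " + a + " has " + column.length +
          " spots, expected " + spots);
      }
      // Sort spot indices by value so that order[a][rank] is the spot holding that rank
      Integer[] idx = new Integer[spots];
      for (int i = 0; i < spots; i++) idx[i] = i;
      Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> column[i]));
      order[a] = idx;
      for (int rank = 0; rank < spots; rank++)
      {
        double untransformed = Math.pow(2, column[idx[rank]]);
        logSum[rank] += Math.log(untransformed);
      }
    }

    double[][] result = new double[assays][spots];
    for (int rank = 0; rank < spots; rank++)
    {
      double geometricMean = Math.exp(logSum[rank] / assays);
      double backToLog2 = Math.log(geometricMean) / Math.log(2);
      // Each assay keeps its original spot order; only the values are replaced
      for (int a = 0; a < assays; a++) result[a][order[a][rank]] = backToLog2;
    }
    return result;
  }

  public static void main(String[] args)
  {
    double[][] normalized = normalizeLog2(new double[][] {{1, 3, 2}, {4, 2, 6}});
    System.out.println(Arrays.deepToString(normalized));
  }
}
```

Since the geometric mean of untransformed values equals the antilog of the arithmetic mean of their logarithms, the sketch gives the same result as averaging the log2 values directly, up to rounding.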
Location:
plugins/base2/net.sf.basedb.normalizers/trunk
Files:
3 edited

  • plugins/base2/net.sf.basedb.normalizers/trunk/META-INF/extensions.xml

    r1454 r2165  
    @@ -11,5 +11,5 @@
         </description>
         <version>1.1-beta</version>
    -    <min-base-version>3.0.0</min-base-version>
    +    <min-base-version>3.2.4</min-base-version>
         <copyright>BASE development team</copyright>
         <email>basedb-users@lists.sourceforge.net</email>
    @@ -30,5 +30,5 @@
     
             The new expression values will become "S" times the original
    -        expression value. Background subtraction and proper filtration
    +        expression value. Background subtraction and proper filtration
             have to be done before running this plug-in.
     
    @@ -52,5 +52,5 @@
             values are replaced with the row average value. Finally, each assay
             is reordered into its original order to retain a standard
    -        expression matrix were each row represents one probe. Assays are
    +        expression matrix where each row represents one probe. Assays are
             not mixed.
     
    @@ -58,4 +58,7 @@
             bioassay set before running this plug-in. The bioassay set must not
             contain any missing values.
    +
    +        Data stored in logarithmic format will be untransformed before averaging,
    +        and then transformed back to logarithmic format before results are stored.
     
             This plug-in supports 1-channel and 2-channel data.
    @@ -111,5 +114,7 @@
           <name>Rank invariant normalization</name>
           <description>
    -        The development of this plug-in is still in progress
    +        The development of this plug-in is still in progress.
    +
    +        This plug-in currently only supports 1-channel data.
           </description>
         </about>
  • plugins/base2/net.sf.basedb.normalizers/trunk/src/net/sf/basedb/plugins/AbstractNormalizationPlugin.java

    r2156 r2165  
    @@ -71,8 +71,5 @@
         "averageMethod",
         "Average calculation method",
    -    "Select which method to use when calculating averages.\n" +
    -    geometricOption +
    -    " is default for non-logarithmic values and " + arithmeticOption +
    -    " is default for logarithmic values.",
    +    "Select which method to use when calculating averages.",
         null
       );
  • plugins/base2/net.sf.basedb.normalizers/trunk/src/net/sf/basedb/plugins/QuantileNormalization.java

    r2154 r2165  
    @@ -70,5 +70,5 @@
        values are replaced with the row average value. Finally, each assay
        is reordered into its original order to retain a standard
    -   expression matrix were each row represents one probe. Assays are
    +   expression matrix where each row represents one probe. Assays are
        not mixed.
     
    @@ -252,10 +252,6 @@
     
             // Average normalization options
    -        Formula.AverageMethod defaultAverageMethod = Formula.AverageMethod.ARITHMETIC_MEAN;
    -        IntensityTransform transform = bas.getIntensityTransform();
    -        if (transform == IntensityTransform.NONE)
    -        {
    -          defaultAverageMethod = Formula.AverageMethod.GEOMETRIC_MEAN;
    -        }
    +        // Default is always set to geometric mean, since averaging is performed on untransformed data
    +        Formula.AverageMethod defaultAverageMethod = Formula.AverageMethod.GEOMETRIC_MEAN;
             StringParameterType spt = new StringParameterType
             (
    @@ -301,4 +297,5 @@
         int noOfChannels = source.getRawDataType().getChannels();
         long normalizedSpots = 0;
    +    IntensityTransform transform = source.getIntensityTransform();
     
         // Get query to recieve the spot data.
    @@ -325,4 +322,5 @@
         String configuredAverageMethod = (String)job.getParameterValue(averageMethodParameter.getName());
         int spotsPerAssay = -1;
    +    String refAssayName = "";
     
         for (BioAssay assay : assays)
    @@ -332,8 +330,14 @@
           // Control that the number of spots per assay is the same in all assays
           if (spotsPerAssay > -1 && spotsPerAssay != assay.getNumSpots())
    -        throw new BaseException("The number of spots are not equal between the dispal\n " +
    +      {
    +        throw new BaseException("The number of spots for assay '" + assay.getName() + "' (" + assay.getNumSpots() + ") is not equal to that for assay '" + refAssayName + "' (" + spotsPerAssay + ")\n " +
                 "The normalization can not be done.");
    +      }
           else if (spotsPerAssay == -1)
    +      {
    +        // First BioAssay, save data for check and possible error message
             spotsPerAssay = assay.getNumSpots();
    +        refAssayName = assay.getName();
    +      }
     
           // Get spot data and sort it ascending
    @@ -342,5 +346,8 @@
     
           // Write spot data to file and calculate row summaries
    -      if (rowCalculators == null) rowCalculators = new AverageCalculator[data.size()];
    +      if (rowCalculators == null)
    +      {
    +        rowCalculators = new AverageCalculator[data.size()];
    +      }
           FileWriter fw = null;
           try
    @@ -352,5 +359,8 @@
             {
               AbstractSpotData spot = data.get(i);
    -          if (rowCalculators[i] == null) rowCalculators[i] = new AverageCalculator(configuredAverageMethod);
    +          if (rowCalculators[i] == null)
    +          {
    +            rowCalculators[i] = new AverageCalculator(configuredAverageMethod);
    +          }
               rowCalculators[i].addNumber(spot.getNormalizableData());
     
    @@ -413,8 +423,23 @@
               AbstractSpotData spot = getSpotDataFromString(lineInfo);
               spot.setNormalizableData(rowCalculators[rowIndex].getAverage());
    -          batcher.insert(columnNo, spot.getPosition(), spot.getChannelData());
    +          // Average is calculated for untransformed data, transform channel data before storing
    +          float[] channelDataArr = spot.getChannelData();
    +          float[] transformedChannelDataArr = new float[channelDataArr.length];
    +          for (int i=0; i < channelDataArr.length; i++)
    +          {
    +            // d1 = untransormed value, d2 = transformed value (0 if d1 is out of range)
    +            float d1 = channelDataArr[i];
    +            float d2 = 0F;
    +            if (!Float.isNaN(d1) && !Float.isInfinite(d1))
    +            {
    +              d2 = (float) transform.transform(d1);
    +            }
    +            transformedChannelDataArr[i] = d2;
    +          }
    +          batcher.insert(columnNo, spot.getPosition(), transformedChannelDataArr);
               normalizedSpots++;
               rowIndex++;
             }
    +        child.setIntensityTransform(transform);
             // Clean up
             input.close();
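
The guarded transform-back step added in the last hunk can be tried in isolation with the sketch below. It is an illustration only: TransformBackSketch and transformChannels are hypothetical names, and a java.util.function.DoubleUnaryOperator stands in for IntensityTransform.transform(double) so that the example runs without the BASE libraries.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical stand-alone version of the transform-back loop (not the plug-in's code).
final class TransformBackSketch
{
  // Applies the transform to each channel value; NaN and infinite inputs are stored as 0,
  // as in the changeset.
  static float[] transformChannels(float[] channels, DoubleUnaryOperator transform)
  {
    float[] transformed = new float[channels.length];
    for (int i = 0; i < channels.length; i++)
    {
      float untransformed = channels[i];
      float value = 0F;
      if (!Float.isNaN(untransformed) && !Float.isInfinite(untransformed))
      {
        value = (float) transform.applyAsDouble(untransformed);
      }
      transformed[i] = value;
    }
    return transformed;
  }

  public static void main(String[] args)
  {
    // Assumed log2 transform, standing in for the bioassay set's IntensityTransform
    DoubleUnaryOperator log2 = v -> Math.log(v) / Math.log(2);
    float[] channels = {8F, Float.NaN, Float.POSITIVE_INFINITY, 4F};
    System.out.println(Arrays.toString(transformChannels(channels, log2)));
  }
}
```

One consequence of this design is that a stored 0 may come either from an out-of-range input or from an untransformed value of 1 under a logarithmic transform, so the two cases cannot be told apart afterwards.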