Changeset 2026


Timestamp:
Sep 25, 2013, 4:09:06 PM
Author:
olle
Message:

Refs #79. Center plug-in updated to allow centering using different groupings of assays:

  1. BASE1 plug-in configuration file plugin_Transformation_Center.base in Center/misc/ updated with new options and description text.
  2. Java file Center.java in Center/src/center/ updated to implement the new functionality.
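
To illustrate what centering "using different groupings of assays" means for a single gene row, here is a minimal sketch and not the plug-in's code. The class `GroupCenteringSketch`, the method `centerRowByGroups` and the `median` helper are hypothetical names introduced for this example; the median convention for an even number of values (average of the two middle values) is an assumption, since the plug-in's own `median` is not shown in this changeset. Each group's center is computed from its non-NaN ratios and subtracted only from the assays in that group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; it is not part of Center.java.
class GroupCenteringSketch
{
  // Median of a non-empty list. Assumption: for an even number of
  // values, the two middle values are averaged.
  static float median(List<Float> values)
  {
    List<Float> sorted = new ArrayList<Float>(values);
    Collections.sort(sorted);
    int n = sorted.size();
    if (n % 2 == 1)
    {
      return sorted.get(n / 2);
    }
    return (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2f;
  }

  // Centers one row of log-ratios in place, one assay group at a time.
  // Each inner list holds the assay (column) indices of one group.
  static void centerRowByGroups(float[] ratios, List<List<Integer>> groups)
  {
    for (List<Integer> group : groups)
    {
      // Collect the non-missing values of this group only.
      List<Float> present = new ArrayList<Float>();
      for (int j : group)
      {
        if (!Float.isNaN(ratios[j]))
        {
          present.add(ratios[j]);
        }
      }
      if (present.isEmpty())
      {
        continue; // nothing to center on for this group
      }
      float center = median(present);
      // Subtract the group's center from the group's own assays only.
      for (int j : group)
      {
        ratios[j] -= center;
      }
    }
  }

  public static void main(String[] args)
  {
    float[] row = {1f, 3f, 10f, 20f, 30f};
    List<List<Integer>> groups = new ArrayList<List<Integer>>();
    groups.add(Arrays.asList(0, 1));
    groups.add(Arrays.asList(2, 3, 4));
    centerRowByGroups(row, groups);
    System.out.println(Arrays.toString(row)); // prints [-1.0, 1.0, -10.0, 0.0, 10.0]
  }
}
```

With every assay in one group this reduces to ordinary per-gene centering; with one group per assay it would zero every non-missing value, which is why the default for gene centering keeps all assays in a single group.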
Location:
plugins/base1/se.lu.onk.Center/trunk
Files:
2 edited

  • plugins/base1/se.lu.onk.Center/trunk/misc/plugin_Transformation_Center.base

    r897 r2026  
    44versionNumber @VERSION@
    55name  Transformation: median/mean centering
    6 descr To center your data means that you adjust your values to reflect their variation from some property of the data such as the mean or median. The center plugin allows the user to center the expression levels either per gene or per array. Consider a common experimental design where you are looking at a large number of samples all compared to a common reference. For each gene, you have a series of ration values that are relative to the expression level of that gene in the reference sample. Since the reference sample really has nothing  to do with your experiment, you want your analysis to be independent of the amount of a gene present in the reference sample. This is achieved by center your data on genes. Centering makes less sense in experiments where the reference sample is part of the experiment. Centering the data for arrays can also be used to remove certain types of biases and can be seen as a crude normalization. The results of many two-color fluorescent hybridization experiments are not corrected for systematic biases in ratios that are the result of differences in RNA amounts, labeling efficiency and image acquisition parameters. Such biases have the effect of multiplying ratios for all genes by a fixed scalar. Mean or median centering the data in log-space has the effect of correcting this bias, although it should be noted that an assumption is being made in correcting this bias, which is that the average gene in a given experiment is expected to have a ratio of 1.0 (or log-ratio of 0).\r\n\r\nIn general, I recommend the use of median rather than mean centering, as it is more robust against outliers. \r\n\r\n\r\nParameters:\r\n\r\nCenter on genes/arrays - If the centering should be done on genes, arrays or both. If both i choosen then the centering will first be done on genes then on arrays, this is called a cycle.\r\n\r\nNumber of centering cycles - How many cycles should be done. 
This value is only relevant if the centering should be done on both genes and arrays. \r\n\r\nCentering using median or mean - If median or mean should be used for the centering.
    7 execName  onk.lu.se/johane/center2/run
     6descr To center your data means that you adjust your values to reflect their variation from some property of the data such as the mean or median. The center plugin allows the user to center the expression levels either per gene or per array. Consider a common experimental design where you are looking at a large number of samples all compared to a common reference. For each gene, you have a series of ration values that are relative to the expression level of that gene in the reference sample. Since the reference sample really has nothing  to do with your experiment, you want your analysis to be independent of the amount of a gene present in the reference sample. This is achieved by center your data on genes. Centering makes less sense in experiments where the reference sample is part of the experiment. Centering the data for arrays can also be used to remove certain types of biases and can be seen as a crude normalization. The results of many two-color fluorescent hybridization experiments are not corrected for systematic biases in ratios that are the result of differences in RNA amounts, labeling efficiency and image acquisition parameters. Such biases have the effect of multiplying ratios for all genes by a fixed scalar. Mean or median centering the data in log-space has the effect of correcting this bias, although it should be noted that an assumption is being made in correcting this bias, which is that the average gene in a given experiment is expected to have a ratio of 1.0 (or log-ratio of 0).\r\n\r\nIn general, I recommend the use of median rather than mean centering, as it is more robust against outliers. \r\n\r\n\r\nParameters:\r\n\r\nCenter on genes/arrays - If the centering should be done on genes, arrays or both. If both are chosen then the centering will first be done on genes then on arrays, this is called a cycle. \r\n\r\nAssay groups for centering - If the centering should be based on data in one or more assay groups. 
If "Default" is selected, centering on arrays is made with each assay in its own group, while centering on genes is made with all assays in one group. "Single assay group" centers all data based on values in a single assay group. "Assay groups" centers each assay group separately. \r\n\r\nCenter group(s) assay names - Comma-separated list of names of assays in groups to use for centering, with each group separated by a '|' character. Only used if "Single assay group" or "Assay groups" has been selected under "Assay groups for centering". \r\n\r\nNumber of centering cycles - How many cycles should be done. This value is only relevant if the centering should be done on both genes and arrays. \r\n\r\nCentering using median or mean - If median or mean should be used for the centering. \r\n\r\nCreate debug files - If debug data should be stored in "data" directory.
     7execName  run
    88usedColumns position\treporter
    99usedFields  l2ratio1_2\tl10intgmean1_2
     
    2020%
    21211 h section   30  center settings   0
    22 2 e centerGeneAssay Center on genes/arrays  30  1 1\tBoth\t2\tArrays(columns)\t3\tGenes(rows) 0
    23 3 i centerCycles  Number of centering cycles  10  5   0
    24 4 e mm  Centering using median or mean  30  1 1\tMedian\t2\tMean  0
    25 5 e normGeneAssay Normalize on genes/assays 30  1 1\tBoth\t2\tAssays(columns)\t3\tGenes(rows) 1
    26 6 i normCycles  Number of normalization cycles  10  0   1
    27 
     222 e centerGeneAssay Center on genes/arrays  30  1 1\tBoth\t2\tArrays (columns)\t3\tGenes (rows) 0
     233 e centerAssayGroups Assay groups for centering  30  1 1\tDefault - Arrays: each assay in its own group, genes: all assays in one group\t2\tSingle assay group - Center all data based on values in single assay group\t3\tAssay groups - Center each assay group separately 0
     244 t centerGroupsAssayNames  Center group(s) assay names 30      0
     255 i centerCycles  Number of centering cycles  10  5   0
     266 e mm  Centering using median or mean  30  1 1\tMedian\t2\tMean  0
     277 e normGeneAssay Normalize on genes/assays 30  1 1\tBoth\t2\tAssays(columns)\t3\tGenes(rows) 1
     288 i normCycles  Number of normalization cycles  10  0   1
     299 e createDebugFiles  Create debug files  30  false true\tyes\tfalse\tno  0
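
The "Center group(s) assay names" option described above takes comma-separated assay names, with groups separated by a '|' character. The following is a minimal sketch of how such a value could be split into groups, under the assumption that surrounding whitespace should be ignored and empty entries skipped. The class `AssayGroupSpec` and its `parse` method are hypothetical names for this example and are not part of the plug-in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it is not part of Center.java.
class AssayGroupSpec
{
  // Splits a value such as "A1, A2 | B1,B2" into [[A1, A2], [B1, B2]].
  // Assumption: whitespace around names is ignored and empty names
  // or empty groups are dropped.
  static List<List<String>> parse(String spec)
  {
    List<List<String>> groups = new ArrayList<List<String>>();
    if (spec == null || spec.trim().isEmpty())
    {
      return groups;
    }
    // '|' is a regex metacharacter, so it must be escaped for split().
    for (String groupSpec : spec.split("\\|"))
    {
      List<String> names = new ArrayList<String>();
      for (String name : groupSpec.split(","))
      {
        // String.trim() returns a new string; the result must be kept.
        String trimmed = name.trim();
        if (!trimmed.isEmpty())
        {
          names.add(trimmed);
        }
      }
      if (!names.isEmpty())
      {
        groups.add(names);
      }
    }
    return groups;
  }

  public static void main(String[] args)
  {
    System.out.println(parse("A1, A2 | B1,B2,B3")); // prints [[A1, A2], [B1, B2, B3]]
  }
}
```

For "Single assay group" only the first (and only) group would be used, so a value without any '|' yields a one-element list.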
  • plugins/base1/se.lu.onk.Center/trunk/src/center/Center.java

    r2012 r2026  
    6161  private GeneAssay param_centerGeneAssay;
    6262
     63  private CenteringGroups param_centeringGroups;
     64 
     65  private String param_centerGroupsAssayNames = "";
     66
    6367  private int param_centerCycles = -1;
    6468
     69  private List<String> singleCenterGroupAssayNameList = null;
     70
     71  private List<Integer> singleCenterGroupAssayIndexList = null;
     72
     73  private HashMap<Integer,List<String>> centerGroupsAssayNameHashMap = null;
     74
     75  private HashMap<Integer,List<Integer>> centerGroupsAssayIndexHashMap = null;
     76 
     77  private boolean param_createDebugFiles = false;
     78
    6579  private ArrayList<AssayRow> data = new ArrayList<AssayRow>();
     80 
     81  private boolean debug = false;
    6682
    6783  public static void main(String[] args)
     
    118134    }
    119135  }
    120  
     136
     137  public Center()
     138  {
     139  }
     140
    121141  public void extractSettings(BASEFileSection section)
    122142  {
     
    125145      param_mm = CenterOn.fromValue(section.findIntOpt("mm"));
    126146      param_centerGeneAssay = GeneAssay.fromValue(section.findIntOpt("centerGeneAssay"));
     147      param_centeringGroups = CenteringGroups.fromValue(section.findIntOpt("centerAssayGroups"));
     148      param_centerGroupsAssayNames = section.findStringOpt("centerGroupsAssayNames");
    127149      param_centerCycles = section.findIntOpt("centerCycles");
     150      param_createDebugFiles = section.findBooleanOpt("createDebugFiles");
     151      if (param_createDebugFiles)
     152      {
     153        debug = true;
     154        File dataDir = new File("data");
     155        dataDir.mkdir();
     156      }
     157      PrintStream debugOut = null;
     158      if (debug)
     159      {
     160        debugOut = new PrintStream(new File("data", "debugExtractSettings.txt"));
     161        debugOut.println("Center::extractSettings(): param_centerGeneAssay = " + param_centerGeneAssay);
     162        debugOut.println("Center::extractSettings(): param_centeringGroups = " + param_centeringGroups);
     163        debugOut.println("Center::extractSettings(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     164        debugOut.println("Center::extractSettings(): param_centerCycles = " + param_centerCycles);
     165        debugOut.println("Center::extractSettings(): param_mm = " + param_mm);
     166        debugOut.println("Center::extractSettings(): param_createDebugFiles = " + param_createDebugFiles);
     167      }
     168      if (param_centeringGroups == CenteringGroups.ASSAYGROUPSINGLE)
     169      {
     170        singleCenterGroupAssayNameList = new ArrayList<String>();
     171        for (String s : param_centerGroupsAssayNames.trim().split(","))
     172        {
     173          s = s.trim();
     174          singleCenterGroupAssayNameList.add(s);
     175          if (debug)
     176          {
     177            debugOut.println("Center::extractSettings(): s = \"" + s + "\"");
     178          }
     179        }
     180      }
     181      if (param_centeringGroups == CenteringGroups.ASSAYGROUPS)
     182      {
     183        centerGroupsAssayNameHashMap = new HashMap<Integer,List<String>>();
     184        int groupIndex = 0;
     185        for (String assayNames : param_centerGroupsAssayNames.trim().split("\\|"))
     186        {
     187          List<String> groupAssayNameList = new ArrayList<String>();
     188          if (debug)
     189          {
     190            debugOut.println("Center::extractSettings(): assayNames = \"" + assayNames + "\"");
     191          }
     192          for (String s : assayNames.trim().split(","))
     193          {
     194            s = s.trim();
     195            groupAssayNameList.add(s);
     196            if (debug)
     197            {
     198              debugOut.println("Center::extractSettings(): s = \"" + s + "\"");
     199            }
     200          }
     201          centerGroupsAssayNameHashMap.put(groupIndex, groupAssayNameList);
     202          groupIndex++;
     203        }
     204      }
     205      if (debug)
     206      {
     207        debugOut.println("Center::extractSettings(): singleCenterGroupAssayNameList = " + singleCenterGroupAssayNameList);
     208        debugOut.println("Center::extractSettings(): centerGroupsAssayNameHashMap = " + centerGroupsAssayNameHashMap);
     209        debugOut.close();
     210      }
    128211    }
    129212    catch (NumberFormatException e)
     
    131214      e.printStackTrace();
    132215    }
    133 
     216    catch (Exception e1)
     217    {
     218      e1.printStackTrace();
     219    }
    134220    if (param_centerCycles < 0)
    135221    {
     
    156242      String count = section.findStringOpt("count");
    157243      String annotationColumns = section.findStringOpt("annotationColumns");
     244      int nameCol = section.findFieldList("columns").indexOf("name");
     245      int idCol = section.findFieldList("columns").indexOf("id");
     246      //
     247      PrintStream debugOut = null;
     248      if (debug)
     249      {
     250        debugOut = new PrintStream(new File("data", "debugExtractAssays.txt"));
     251        debugOut.println("Center::extractAssays(): param_centerGeneAssay = " + param_centerGeneAssay);
     252        debugOut.println("Center::extractAssays(): param_centeringGroups = " + param_centeringGroups);
     253        debugOut.println("Center::extractAssays(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     254        debugOut.println("Center::extractAssays(): param_centerCycles = " + param_centerCycles);
     255        debugOut.println("Center::extractAssays(): param_mm = " + param_mm);
     256        debugOut.println("Center::extractAssays(): singleCenterGroupAssayNameList = " + singleCenterGroupAssayNameList);
     257        debugOut.println("Center::extractAssays(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     258        debugOut.println("Center::extractAssays(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     259        debugOut.println("Center::extractAssays(): centerGroupsAssayNameHashMap = " + centerGroupsAssayNameHashMap);
     260        debugOut.println("Center::extractAssays(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     261        debugOut.println("Center::extractAssays(): section.findFieldList(\"columns\") = " + section.findFieldList("columns"));
     262        debugOut.println("Center::extractAssays(): nameCol = " + nameCol);
     263        debugOut.println("Center::extractAssays(): idCol = " + idCol);
     264        debugOut.println("Center::extractAssays(): columns = " + columns);
     265        debugOut.println("Center::extractAssays(): count = " + count);
     266        debugOut.println("Center::extractAssays(): annotationColumns = " + annotationColumns);
     267      }
    158268      if (columns == null || count == null || annotationColumns == null)
    159269      {
     
    162272      }
    163273      System.out.println(section);
     274      //
     275      if (param_centeringGroups == CenteringGroups.ASSAYGROUPSINGLE)
     276      {
     277        singleCenterGroupAssayIndexList = new ArrayList<Integer>();
     278      }
     279      if (param_centeringGroups == CenteringGroups.ASSAYGROUPS)
     280      {
     281        centerGroupsAssayIndexHashMap = new HashMap<Integer,List<Integer>>();
     282      }
    164283      String[] vals;
     284      int index = 0;
    165285      while ((vals = reader.readDataRow()) != null)
    166286      {
     287        if (debug)
     288        {
     289          debugOut.println("Center::extractAssays(): Data row: " + (index + 1) + " vals[idCol]: " + vals[idCol] + " vals[nameCol]: " + vals[nameCol]);
     290        }
     291        if (param_centeringGroups == CenteringGroups.ASSAYGROUPSINGLE)
     292        {
     293          String assayName = vals[nameCol];
     294          if (singleCenterGroupAssayNameList.contains(assayName))
     295          {
     296            // Add index value for assay
     297            singleCenterGroupAssayIndexList.add(index);
     298            if (debug)
     299            {
     300              debugOut.println("Center::extractAssays(): Assay \"" + assayName + "\" with index " + index + " included: yes");
     301            }         }
     302          else
     303          {
     304            if (debug)
     305            {
     306              debugOut.println("Center::extractAssays(): Assay \"" + assayName + "\" with index " + index + " included: no");
     307            }
     308          }           
     309        }
     310        if (param_centeringGroups == CenteringGroups.ASSAYGROUPS)
     311        {
     312          String assayName = vals[nameCol];
     313          for (int i=0; i < centerGroupsAssayNameHashMap.size(); i++)
     314          {
     315            List<String> assayNameList = (List<String>) centerGroupsAssayNameHashMap.get(i);
     316            if (assayNameList != null && assayNameList.contains(assayName))
     317            {
     318              // Add index value for assay
     319              if (centerGroupsAssayIndexHashMap.get(i) == null)
     320              {
     321                List<Integer> assayIndexList = new ArrayList<Integer>();
     322                centerGroupsAssayIndexHashMap.put(i, assayIndexList);
     323              }
     324              centerGroupsAssayIndexHashMap.get(i).add(index);
     325            }
     326          }
     327        }
     328        index++;
    167329        for (int i = 0; i < vals.length; i++)
    168330        {
     
    176338      }
    177339      System.out.println();
     340      if (debug)
     341      {
     342        debugOut.println("Center::extractAssays(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     343        debugOut.close();
     344      }
    178345    }
    179346    catch (IOException e)
     
    211378    nbrOfFields = assayFields.size();
    212379
     380    if (param_centeringGroups == CenteringGroups.DEFAULT)
     381    {
     382      if (param_centerGeneAssay == GeneAssay.ASSAYS)
     383      {
     384        // Add index value for each assay in its own group
     385        if (centerGroupsAssayIndexHashMap == null)
     386        {
     387          centerGroupsAssayIndexHashMap = new HashMap<Integer,List<Integer>>();
     388        }
     389        for (int i=0; i < nbrOfAssays; i++)
     390        {
     391          if (centerGroupsAssayIndexHashMap.get(i) == null)
     392          {
     393            List<Integer> assayIndexList = new ArrayList<Integer>();
     394            centerGroupsAssayIndexHashMap.put(i, assayIndexList);
     395          }
     396          centerGroupsAssayIndexHashMap.get(i).add(i);
     397        }
     398      }
     399      if (param_centerGeneAssay == GeneAssay.GENES)
     400      {
     401        // Add index values for all assays in one single group
     402        if (centerGroupsAssayIndexHashMap == null)
     403        {
     404          centerGroupsAssayIndexHashMap = new HashMap<Integer,List<Integer>>();
     405        }
     406        for (int i=0; i < nbrOfAssays; i++)
     407        {
     408          if (centerGroupsAssayIndexHashMap.get(0) == null)
     409          {
     410            List<Integer> assayIndexList = new ArrayList<Integer>();
     411            centerGroupsAssayIndexHashMap.put(0, assayIndexList);
     412          }
     413          centerGroupsAssayIndexHashMap.get(0).add(i);
     414        }
     415      }
     416    }
     417
    213418    int totNbrOfCol = nbrOfCol - 1 + nbrOfAssays * nbrOfFields;
    214419    try
    215420    {
     421      PrintStream debugOut = null;
     422      if (debug)
     423      {
     424        debugOut = new PrintStream(new File("data", "debugExtractSpots.txt"));
     425        debugOut.println("Center::extractSpots(): section.getHeaders() = " + section.getHeaders());       
     426        debugOut.println("Center::extractSpots(): param_centerGeneAssay = " + param_centerGeneAssay);
     427        debugOut.println("Center::extractSpots(): param_centeringGroups = " + param_centeringGroups);
     428        debugOut.println("Center::extractSpots(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     429        debugOut.println("Center::extractSpots(): param_centerCycles = " + param_centerCycles);
     430        debugOut.println("Center::extractSpots(): param_mm = " + param_mm);
     431        debugOut.println("Center::extractSpots(): singleCenterGroupAssayNameList = " + singleCenterGroupAssayNameList);
     432        debugOut.println("Center::extractSpots(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     433        debugOut.println("Center::extractSpots(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     434        debugOut.println("Center::extractSpots(): centerGroupsAssayNameHashMap = " + centerGroupsAssayNameHashMap);
     435        debugOut.println("Center::extractSpots(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     436        debugOut.println("Center::extractSpots(): nbrOfCol = " + nbrOfCol);
     437        debugOut.println("Center::extractSpots(): nbrOfAssays = " + nbrOfAssays);
     438        debugOut.println("Center::extractSpots(): nbrOfFields = " + nbrOfFields);
     439        debugOut.println("Center::extractSpots(): assays = " + assays);
     440        debugOut.println("Center::extractSpots(): columns = " + columns);
     441        debugOut.println("Center::extractSpots(): posCol = " + posCol);
     442        debugOut.println("Center::extractSpots(): repCol = " + repCol);
     443        debugOut.println("Center::extractSpots(): dataCol = " + dataCol);
     444        debugOut.println("Center::extractSpots(): mCol = " + mCol);
     445        debugOut.println("Center::extractSpots(): aCol = " + aCol);
     446      }
     447      int nRow = 0;
    216448      String[] vals = reader.readDataRow(totNbrOfCol);
    217449      while (vals != null)
     
    235467        data.add(assayRow);
    236468        vals = reader.readDataRow(totNbrOfCol);
     469        nRow++;
     470      }
     471      if (debug)
     472      {
     473        debugOut.println("Center::extractSpots(): Number of rows = " + nRow);
     474        debugOut.close();
    237475      }
    238476    }
     
    247485  public void center()
    248486  {
    249 
    250     for (int i = 0; i < param_centerCycles; i++)
    251     {
    252       if (param_centerGeneAssay == GeneAssay.GENES || param_centerGeneAssay == GeneAssay.BOTH)
    253       {
    254         centerRows(data);
    255       }
    256       if (param_centerGeneAssay == GeneAssay.ASSAYS || param_centerGeneAssay == GeneAssay.BOTH)
    257       {
    258         centerColumns(data);
    259       }
    260     }
    261 
    262     for (AssayRow ar : data)
    263     {
    264       System.out.println(ar);
    265     }
    266   }
    267 
    268 
     487    try {
     488      PrintStream debugOut = null;
     489      if (debug)
     490      {
     491        debugOut = new PrintStream(new File("data", "debugCenter.txt"));
     492        debugOut.println("Center::center(): param_centerGeneAssay = " + param_centerGeneAssay);
     493        debugOut.println("Center::center(): param_centeringGroups = " + param_centeringGroups);
     494        debugOut.println("Center::center(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     495        debugOut.println("Center::center(): param_centerCycles = " + param_centerCycles);
     496        debugOut.println("Center::center(): param_mm = " + param_mm);
     497        debugOut.println("Center::center(): singleCenterGroupAssayNameList = " + singleCenterGroupAssayNameList);
     498        debugOut.println("Center::center(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     499        debugOut.println("Center::center(): centerGroupsAssayNameHashMap = " + centerGroupsAssayNameHashMap);
     500        debugOut.println("Center::center(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     501      }
     502      //
     503      for (int i = 0; i < param_centerCycles; i++)
     504      {
     505        if (param_centerGeneAssay == GeneAssay.GENES || param_centerGeneAssay == GeneAssay.BOTH)
     506        {
     507          if (centerGroupsAssayIndexHashMap == null)
     508          {
     509            centerRows(data);
     510          }
     511          else
     512          {
     513            centerRowsForGroupsSeparately(data);
     514          }
     515        }
     516        if (param_centerGeneAssay == GeneAssay.ASSAYS || param_centerGeneAssay == GeneAssay.BOTH)
     517        {
     518          if (centerGroupsAssayIndexHashMap == null)
     519          {
     520            centerColumns(data);
     521          }
     522          else
     523          {
     524            centerColumnsForGroupsSeparately(data);
     525          }
     526        }
     527      }
     528
     529      for (AssayRow ar : data)
     530      {
     531        System.out.println(ar);
     532      }
     533      if (debug)
     534      {
     535        debugOut.close();
     536      }
     537    }
     538    catch (Exception e)
     539    {
     540      e.printStackTrace();
     541    }
     542  }
     543
     544
     545  /**
     546   * Center rows (genes) with same correction for each A-value (gene)
     547   * in M/A-plot for all assays.
     548   *
     549   * @param data_arr List<AssayRow> AssayRow list data to use.
     550   */
    269551  private void centerRows(List<AssayRow> data_arr)
    270552  {
     553    List<Integer> assayIndexList = singleCenterGroupAssayIndexList;
     554    int nAssayRows = 0;
     555    List<Float> rFitList = new ArrayList<Float>();
    271556    for (AssayRow ar : data_arr)
    272557    {
     
    277562        for (int j = 0; j < nbrOfAssays; j++)
    278563        {
     564          // Check if data for assay should be included when calculating center value
     565          boolean includeAssayData = true;
     566          if (assayIndexList != null && !assayIndexList.contains(j))
     567          {
     568            includeAssayData = false;
     569          }
     570          if (includeAssayData)
     571          {
     572            if (!Float.isNaN(ar.ratio[j]))
     573            {
     574              r.add(ar.ratio[j]);
     575            }
     576          }
     577        }
     578        if (param_mm == CenterOn.MEDIAN)
     579        {
     580          r_fit = median(r);
     581        }
     582        else
     583        {
     584          r_fit = mean(r);
     585        }
     586        // Center data for all groups
     587        for (int j = 0; j < nbrOfAssays; j++)
     588        {
     589          ar.ratio[j] -= r_fit;
     590        }
     591        rFitList.add(r_fit);
     592        nAssayRows++;
     593      }
     594    }
     595    try
     596    {
     597      if (debug)
     598      {
     599        PrintStream debugOut = new PrintStream(new File("data", "debugCenterRows.txt"));
     600        debugOut.println("Center::centerRows(): param_centerGeneAssay = " + param_centerGeneAssay);
     601        debugOut.println("Center::centerRows(): param_centeringGroups = " + param_centeringGroups);
     602        debugOut.println("Center::centerRows(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     603        debugOut.println("Center::centerRows(): param_centerCycles = " + param_centerCycles);
     604        debugOut.println("Center::centerRows(): param_mm = " + param_mm);
     605        debugOut.println("Center::centerRows(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     606        debugOut.println("Center::centerRows(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     607        debugOut.println("Center::centerRows(): nbrOfAssays = " + nbrOfAssays);
     608        debugOut.println("Center::centerRows(): nAssayRows = " + nAssayRows);
     609        debugOut.close();
     610        PrintStream debugOutRFitList = new PrintStream(new File("data", "debugCenterRowsRFitList.txt"));
     611        for (int nRow = 0; nRow < nAssayRows; nRow++)
     612        {
     613          Float r_fit = (Float) rFitList.get(nRow);
     614          debugOutRFitList.println("Center::centerRows(): row = " + nRow + " r_fit = " + r_fit);
     615        }
     616        debugOutRFitList.close();
     617      }
     618    }
     619    catch (Exception e)
     620    {
     621      e.printStackTrace();
     622    }
     623  }
     624
     625  /**
     626   * Center rows (genes) for groups separately.
     627   *
     628   * @param data_arr List<AssayRow> AssayRow list data to use.
     629   */
     630  private void centerRowsForGroupsSeparately(List<AssayRow> data_arr)
     631  {
     632    HashMap<Integer,List<Float>> rFitHashMap = new HashMap<Integer,List<Float>>();
     633    int nAssayRows = 0;
     634    for (AssayRow ar : data_arr)
     635    {
     636      if (ar.valid())
     637      {
     638        List<Float> rFitList = new ArrayList<Float>();
     639        for (int j = 0; j < nbrOfAssays; j++)
     640        {
     641          rFitList.add(0f);
     642        }
     643        // Center data for groups separately
     644        for (int i = 0; i < centerGroupsAssayIndexHashMap.size(); i++)
     645        {
     646          List<Integer> assayIndexList = (List<Integer>) centerGroupsAssayIndexHashMap.get(i);
     647          float r_fit = 0f;
     648          ArrayList<Float> r = new ArrayList<Float>();
     649          for (int j = 0; j < nbrOfAssays; j++)
     650          {
     651            // Check if data for assay should be included when calculating center value
     652            boolean includeAssayData = true;
     653            if (assayIndexList != null && !assayIndexList.contains(j))
     654            {
     655              includeAssayData = false;
     656            }
     657            if (includeAssayData)
     658            {
     659              if (!Float.isNaN(ar.ratio[j]))
     660              {
     661                r.add(ar.ratio[j]);
     662              }
     663            }
     664          }
     665          if (param_mm == CenterOn.MEDIAN)
     666          {
     667            r_fit = median(r);
     668          }
     669          else
     670          {
     671            r_fit = mean(r);
     672          }
     673          // Only center data for current group
     674          for (int j = 0; j < nbrOfAssays; j++)
     675          {
     676            if (assayIndexList == null || assayIndexList.contains(j))
     677            {
     678              ar.ratio[j] -= r_fit;
     679              rFitList.set(j, r_fit);
     680            }
     681          }
     682        }
     683        rFitHashMap.put(nAssayRows, rFitList);
     684        nAssayRows++;
     685      }
     686    }
     687    try
     688    {
     689      if (debug)
     690      {
     691        PrintStream debugOut = new PrintStream(new File("data", "debugCenterRowsForGroupsSeparately.txt"));
     692        debugOut.println("Center::centerRowsForGroupsSeparately(): param_centerGeneAssay = " + param_centerGeneAssay);
     693        debugOut.println("Center::centerRowsForGroupsSeparately(): param_centeringGroups = " + param_centeringGroups);
     694        debugOut.println("Center::centerRowsForGroupsSeparately(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     695        debugOut.println("Center::centerRowsForGroupsSeparately(): param_centerCycles = " + param_centerCycles);
     696        debugOut.println("Center::centerRowsForGroupsSeparately(): param_mm = " + param_mm);
     697        debugOut.println("Center::centerRowsForGroupsSeparately(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     698        debugOut.println("Center::centerRowsForGroupsSeparately(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     699        debugOut.println("Center::centerRowsForGroupsSeparately(): nbrOfAssays = " + nbrOfAssays);
     700        debugOut.println("Center::centerRowsForGroupsSeparately(): nAssayRows = " + nAssayRows);
     701        PrintStream debugOutRFitList = new PrintStream(new File("data", "debugCenterRowsForGroupsSeparatelyRFitList.txt"));
     702        for (int nRow = 0; nRow < nAssayRows; nRow++)
     703        {
     704          List<Float> rFitList = (List<Float>) rFitHashMap.get(nRow);
     705          for (int nAssay = 0; nAssay < nbrOfAssays; nAssay++)
     706          {
     707            // Get r_fit for gene nRow and assay nAssay
     708            Float r_fit = (Float) rFitList.get(nAssay);
     709            debugOutRFitList.println("Center::centerRowsForGroupsSeparately(): nRow = " + nRow + " nAssay = " + nAssay + " r_fit = " + r_fit);
     710          }
     711        }
     712        debugOutRFitList.close();
     713        debugOut.close();
     714      }
     715    }
     716    catch (Exception e)
     717    {
     718      e.printStackTrace();
     719    }
     720  }
     721
     722  /**
     723   * Center columns (assays) with constant correction over A in M/A-plot.
     724   *
     725   * @param data_arr List<AssayRow> AssayRow list data to use.
     726   */
     727  private void centerColumns(List<AssayRow> data_arr)
     728  {
     729    List<Integer> assayIndexList = singleCenterGroupAssayIndexList;
     730    // Store centering values for debug output
     731    List<Float> rFitList = new ArrayList<Float>();
     732    List<Integer> nAssayRowList = new ArrayList<Integer>();
     733    List<Float> r_single = null;
     734    if (assayIndexList != null)
     735    {
     736      // Collect data from single selected assay group
     737      r_single = new ArrayList<Float>();
     738      for (int j = 0; j < nbrOfAssays; j++)
     739      {
     740        if (assayIndexList.contains(j))
     741        {
     742          for (AssayRow ar : data_arr)
     743          {
     744            if (!Float.isNaN(ar.ratio[j]))
     745            {
     746              r_single.add(ar.ratio[j]);
     747            }
     748          }
     749        }
     750      }
     751    }
     752    for (int j = 0; j < nbrOfAssays; j++)
     753    {
     754      float r_fit;
     755      List<Float> r = new ArrayList<Float>();
     756      if (r_single != null)
     757      {
     758        // Use data from single assay group for centering
     759        r = r_single;
     760      }
     761      else
     762      {
     763        // Use data from each assay for centering
     764        for (AssayRow ar : data_arr)
     765        {
     279 766          if (!Float.isNaN(ar.ratio[j]))
     280 767          {
     
     282 769          }
     283 770        }
    284         if (param_mm == CenterOn.MEDIAN)
    285         {
    286           r_fit = median(r);
    287         }
    288         else
    289         {
    290           r_fit = mean(r);
    291         }
    292         for (int j = 0; j < nbrOfAssays; j++)
    293         {
    294           ar.ratio[j] -= r_fit;
    295         }
    296       }
    297     }
    298   }
    299 
    300   private void centerColumns(List<AssayRow> data_arr)
    301   {
    302     for (int j = 0; j < nbrOfAssays; j++)
    303     {
    304       float r_fit;
     771      }
     772      if (param_mm == CenterOn.MEDIAN)
     773      {
     774        r_fit = median(r);
     775      }
     776      else
     777      {
     778        r_fit = mean(r);
     779      }
     780      rFitList.add(r_fit);
     781      int nAssayRows = 0;
     782      for (AssayRow ar : data_arr)
     783      {
     784        ar.ratio[j] -= r_fit;
     785        nAssayRows++;
     786      }
     787      nAssayRowList.add(nAssayRows);
     788    }
     789    try
     790    {
     791      if (debug)
     792      {
     793        PrintStream debugOut = new PrintStream(new File("data", "debugCenterColumns.txt"));
     794        debugOut.println("Center::centerColumns(): param_centerGeneAssay = " + param_centerGeneAssay);
     795        debugOut.println("Center::centerColumns(): param_centeringGroups = " + param_centeringGroups);
     796        debugOut.println("Center::centerColumns(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     797        debugOut.println("Center::centerColumns(): param_centerCycles = " + param_centerCycles);
     798        debugOut.println("Center::centerColumns(): param_mm = " + param_mm);
     799        debugOut.println("Center::centerColumns(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     800        debugOut.println("Center::centerColumns(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     801        debugOut.println("Center::centerColumns(): nbrOfAssays = " + nbrOfAssays);
     802        for (int i = 0; i < nbrOfAssays; i++)
     803        {
     804          debugOut.println("Center::centerColumns(): assay = " + i + " nAssayRows = " + (Integer) nAssayRowList.get(i) + " r_fit = " + (Float) rFitList.get(i));
     805        }
     806        debugOut.close();
     807      }
     808    }
     809    catch (Exception e)
     810    {
     811      e.printStackTrace();
     812    }
     813  }
     814
     815  /**
     816   * Center columns (assays) for groups separately.
     817   *
     818   * @param data_arr List<AssayRow> AssayRow list data to use.
     819   */
     820  private void centerColumnsForGroupsSeparately(List<AssayRow> data_arr)
     821  {
     822    // Store centering values for debug output
     823    List<Float> rFitList = new ArrayList<Float>();
     824    List<Integer> nAssayRowList = new ArrayList<Integer>();
     825    for (int i = 0; i < centerGroupsAssayIndexHashMap.size(); i++)
     826    {
     827      List<Integer> assayIndexList = (List<Integer>) centerGroupsAssayIndexHashMap.get(i);
     828      // Step 1 - Calculate centering values for assay groups
     829      float r_fit = 0f;
     305 830      ArrayList<Float> r = new ArrayList<Float>();
    306       for (AssayRow ar : data_arr)
    307       {
    308         if (!Float.isNaN(ar.ratio[j]))
    309         {
    310           r.add(new Float(ar.ratio[j]));
     831      for (int j = 0; j < nbrOfAssays; j++)
     832      {
     833        // Check if data for assay should be included when calculating center value
     834        boolean includeAssayData = true;
     835        if (assayIndexList != null && !assayIndexList.contains(j))
     836        {
     837          includeAssayData = false;
     838        }
     839        if (includeAssayData)
     840        {
     841          for (AssayRow ar : data_arr)
     842          {
     843            if (!Float.isNaN(ar.ratio[j]))
     844            {
     845              r.add(ar.ratio[j]);
     846            }
     847          }
     311 848        }
     312 849      }
     
     319 856        r_fit = mean(r);
     320 857      }
    321       for (AssayRow ar : data_arr)
    322       {
    323         ar.ratio[j] -= r_fit;
    324       }
     858      rFitList.add(r_fit);
     859      // Only center data for current group
     860      for (int j = 0; j < nbrOfAssays; j++)
     861      {
     862        int nAssayRows = 0;
     863        if (assayIndexList == null || assayIndexList.contains(j))
     864        {
     865          for (AssayRow ar : data_arr)
     866          {
     867            if (!Float.isNaN(ar.ratio[j]))
     868            {
     869              ar.ratio[j] -= r_fit;
     870              nAssayRows++;
     871            }
     872          }
     873        }
     874        nAssayRowList.add(nAssayRows);
     875      }
     876    }
     877    try
     878    {
     879      if (debug)
     880      {
     881        PrintStream debugOut = new PrintStream(new File("data", "debugCenterColumnsForGroupsSeparately.txt"));
     882        debugOut.println("Center::centerColumnsForGroupsSeparately(): param_centerGeneAssay = " + param_centerGeneAssay);
     883        debugOut.println("Center::centerColumnsForGroupsSeparately(): param_centeringGroups = " + param_centeringGroups);
     884        debugOut.println("Center::centerColumnsForGroupsSeparately(): param_centerGroupsAssayNames = \"" + param_centerGroupsAssayNames + "\"");
     885        debugOut.println("Center::centerColumnsForGroupsSeparately(): param_centerCycles = " + param_centerCycles);
     886        debugOut.println("Center::centerColumnsForGroupsSeparately(): param_mm = " + param_mm);
     887        debugOut.println("Center::centerColumnsForGroupsSeparately(): singleCenterGroupAssayIndexList = " + singleCenterGroupAssayIndexList);
     888        debugOut.println("Center::centerColumnsForGroupsSeparately(): centerGroupsAssayIndexHashMap = " + centerGroupsAssayIndexHashMap);
     889        debugOut.println("Center::centerColumnsForGroupsSeparately(): nbrOfAssays = " + nbrOfAssays);
     890        for (int i = 0; i < centerGroupsAssayIndexHashMap.size(); i++)
     891        {
     892          debugOut.println("Center::centerColumnsForGroupsSeparately(): assay group = " + i + " nAssayRows = " + (Integer) nAssayRowList.get(i) + " r_fit = " + (Float) rFitList.get(i));
     893        }
     894        debugOut.close();
     895      }
     896    }
     897    catch (Exception e)
     898    {
     899      e.printStackTrace();
     325 900    }
     326 901  }
     
     372 947  private float calculatePercentile(List<Float> vec, float fraction)
     373 948  {
     949    if (vec == null || vec.size() == 0)
     950    {
     951      return 0f;
     952    }
     374 953    Float percentileValue = null;
     375 954    // Get ascending sorted list
     
     402 981  private float mean(List<Float> vec)
     403 982  {
     983    if (vec == null || vec.size() == 0)
     984    {
     985      return 0f;
     986    }
     404 987    float ret = 0;
     405 988    for (int i = 0; i < vec.size(); i++)
     
     475 1058  }
     476 1059
     1060  private enum CenteringGroups {
     1061    DEFAULT(1), ASSAYGROUPSINGLE(2), ASSAYGROUPS(3);
     1062
     1063    private int value;
     1064
     1065    private CenteringGroups(int value)
     1066    {
     1067      this.value = value;
     1068    }
     1069
     1070    private static final Map<Integer, CenteringGroups> valueMapping = new HashMap<Integer, CenteringGroups>();
     1071
     1072    static
     1073    {
     1074      for (CenteringGroups centeringGroups : CenteringGroups.values())
     1075      {
     1076        valueMapping.put(centeringGroups.getValue(), centeringGroups);
     1077      }
     1078    }
     1079
     1080    public static CenteringGroups fromValue(int value)
     1081    {
     1082      CenteringGroups centeringGroups = valueMapping.get(value);
     1083      return centeringGroups;
     1084    }
     1085
     1086    public int getValue()
     1087    {
     1088      return value;
     1089    }
     1090  }
     1091
     477 1092}
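
The group-wise column centering that this changeset adds in centerColumnsForGroupsSeparately() can be illustrated outside the plug-in with a minimal sketch. The class name GroupCenteringSketch, the float[][] layout (rows are reporters, columns are assays) and the median definition (average of the two middle values for an even count) are assumptions made for illustration only; the plug-in itself works on AssayRow objects and computes its median with a helper that is not fully shown in this changeset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-alone sketch; not part of Center.java.
public class GroupCenteringSketch
{
  // Mean of the values; returns 0 for a null or empty list,
  // mirroring the guard added to mean() in this changeset.
  static float mean(List<Float> vec)
  {
    if (vec == null || vec.isEmpty())
    {
      return 0f;
    }
    float sum = 0f;
    for (float v : vec)
    {
      sum += v;
    }
    return sum / vec.size();
  }

  // Median of the values (assumed definition: middle value, or the
  // average of the two middle values for an even count).
  static float median(List<Float> vec)
  {
    if (vec == null || vec.isEmpty())
    {
      return 0f;
    }
    List<Float> sorted = new ArrayList<Float>(vec);
    Collections.sort(sorted);
    int n = sorted.size();
    int mid = n / 2;
    return (n % 2 == 1) ? sorted.get(mid) : (sorted.get(mid - 1) + sorted.get(mid)) / 2f;
  }

  // For each group of assay (column) indexes: pool all non-NaN ratios of the
  // group, compute one centering value, and subtract it from the non-NaN
  // ratios of that group only. Assays outside the group are left untouched.
  static void centerColumnsByGroup(float[][] ratio, List<List<Integer>> groups, boolean useMedian)
  {
    for (List<Integer> group : groups)
    {
      List<Float> r = new ArrayList<Float>();
      for (int j : group)
      {
        for (float[] row : ratio)
        {
          if (!Float.isNaN(row[j]))
          {
            r.add(row[j]);
          }
        }
      }
      float rFit = useMedian ? median(r) : mean(r);
      for (int j : group)
      {
        for (float[] row : ratio)
        {
          if (!Float.isNaN(row[j]))
          {
            row[j] -= rFit;
          }
        }
      }
    }
  }

  public static void main(String[] args)
  {
    float[][] ratio = { { 1f, 2f, 10f }, { 3f, Float.NaN, 20f } };
    List<List<Integer>> groups = Arrays.asList(Arrays.asList(0, 1), Arrays.asList(2));
    centerColumnsByGroup(ratio, groups, false);
    for (float[] row : ratio)
    {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

In this sketch, assays 0 and 1 share one centering value while assay 2 gets its own, and missing values (NaN) neither contribute to the centering value nor get modified.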