Changeset 4002
- Timestamp:
- Nov 26, 2007, 1:36:38 PM (16 years ago)
- File:
-
- 1 edited
trunk/doc/src/docbook/appendix/raw_data_types.xml
r3944 r4002 31 31 32 32 <para> 33 Raw data can be stored either as files attached to items or in 34 the database. 35 The <classname docapi="net.sf.basedb.core">Platform</classname> item has information 36 about this. Configuration information for the database tables 37 and columns used to store raw data in the database is found in the 38 <filename>raw-data-types.xml</filename> file. For detailed information 39 see <xref linkend="core_api.data_in_files" />. 33 Raw data can be stored either as files attached to items and/or in 34 the database. The <classname docapi="net.sf.basedb.core">Platform</classname> 35 item has information about this. For more information see 36 <xref linkend="core_api.data_in_files" />. 40 37 </para> 41 38 … … 113 110 114 111 <para> 115 TODO 116 </para> 112 A given platform either supports importing data to the database or it 113 doesn't. If it supports import, it may be locked to a specific raw data type 114 or it may use any raw data type. Among the default platforms installed with 115 BASE, the Affymetrix platform doesn't support importing data. The Generic platform 116 supports importing to any raw data type. 117 </para> 118 119 <para> 120 Raw data types are defined in the <filename>raw-data-types.xml</filename> 121 file. This file is located in the <filename><basedir>/www/WEB-INF/classes</filename> 122 directory and contains information about the database tables and columns to 123 use for storing raw data. BASE ships with default raw data types for many 124 different microarray platforms, including Genepix, Agilent and Illumina. 125 </para> 126 127 <para> 128 If you want your BASE installation to be configured differently, we recommend that 129 you do it before the first initialisation of the database. 130 It is possible to change the configuration of an existing BASE installation, but it 131 requires manual updates to the database. 
Follow this procedure: 132 </para> 133 134 <orderedlist> 135 <listitem> 136 <para> 137 Shut down the BASE web server. If you have installed job agents you should shut 138 them down as well. 139 </para> 140 </listitem> 141 142 <listitem> 143 <para> 144 Modify the <filename>raw-data-types.xml</filename> file. If you have installed 145 job agents, make sure they all have the same version as the web server. 146 </para> 147 </listitem> 148 149 <listitem> 150 <para> 151 Run the <filename>updatedb.sh</filename> script. Tables for new raw data types 152 and new columns for existing raw data types will automatically be created, but the script 153 can't delete tables or columns that have been removed, or modify columns that have 154 changed datatype. You will have to make these kinds of changes by manually executing 155 SQL against your database. Check your database documentation for information about SQL syntax. 156 </para> 157 158 <tip> 159 <title>Create a parallel installation</title> 160 <para> 161 You can always create a new temporary parallel installation to check 162 what the table generated by the installation script looks like. Compare the 163 new table to the existing one and make sure they match. 164 </para> 165 </tip> 166 </listitem> 167 168 <listitem> 169 <para> 170 Start up the BASE web server and job agents, if any, again. 171 </para> 172 </listitem> 173 </orderedlist> 174 175 <tip> 176 <title>Start with few columns</title> 177 <para> 178 It is better to start with too few columns, since it is easier to add 179 more columns than it is to remove columns that are not needed. 180 </para> 181 </tip> 182 183 <bridgehead>Format of the raw-data-types.xml file</bridgehead> 184 <para> 185 The <filename>raw-data-types.xml</filename> is an XML file. 
186 The following example will serve as a description of the format: 187 </para> 188 189 190 <programlisting language="xml"> 191 <![CDATA[ 192 <?xml version="1.0" ?> 193 <?xml-stylesheet type="text/xsl" href="raw-data-types.xsl"?> 194 <!DOCTYPE raw-data-types SYSTEM "raw-data-types.dtd" > 195 <raw-data-types> 196 <raw-data-type 197 id="genepix" 198 name="GenePix" 199 channels="2" 200 table="RawDataGenePix" 201 > 202 <property 203 name="diameter" 204 title="Spot diameter" 205 description="The diameter of the spot in µm" 206 column="diameter" 207 type="float" 208 /> 209 <property 210 name="ch1FgMedian" 211 title="Channel 1 foreground median" 212 description="The median of the foreground intensity in channel 1" 213 column="ch1_fg_median" 214 type="float" 215 channel="1" 216 /> 217 <!-- skipped a lot of properties --> 218 <intensity-formula 219 name="mean" 220 title="Mean FG - Mean BG" 221 description="Subtract mean background from mean foreground" 222 > 223 <formula 224 channel="1" 225 expression="raw('ch1FgMean') - raw('ch1BgMean')" 226 /> 227 <formula 228 channel="2" 229 expression="raw('ch2FgMean') - raw('ch2BgMean')" 230 /> 231 </intensity-formula> 232 <!-- and a few more... --> 233 </raw-data-type> 234 </raw-data-types> 235 ]]> 236 </programlisting> 237 238 <para> 239 Each raw data type is represented by a <sgmltag class="starttag">raw-data-type</sgmltag> 240 tag. The following attributes can be used: 241 </para> 242 243 <table frame="all" id="appendix.rawdatatypes.tag"> 244 <title>Attributes for the <sgmltag class="starttag">raw-data-type</sgmltag> tag</title> 245 <tgroup cols="3" align="left"> 246 <colspec colname="attribute" align="left" /> 247 <colspec colname="required" /> 248 <colspec colname="comment" /> 249 <thead> 250 <row> 251 <entry>Attribute</entry> 252 <entry>Required</entry> 253 <entry>Comment</entry> 254 </row> 255 </thead> 256 <tbody> 257 <row> 258 <entry>id</entry> 259 <entry>yes</entry> 260 <entry> 261 A unique ID of the raw data type. 
It should contain only letters, 262 numbers and underscores and the first character must be a letter. 263 </entry> 264 </row> 265 <row> 266 <entry>name</entry> 267 <entry>yes</entry> 268 <entry> 269 A unique name of the raw data type. The name is usually used by client 270 applications for display. 271 </entry> 272 </row> 273 <row> 274 <entry>table</entry> 275 <entry>yes</entry> 276 <entry> 277 The name of the database table to store data in. The table name 278 must be unique and can only contain letters, 279 numbers and underscores. The first character must be a letter. 280 </entry> 281 </row> 282 <row> 283 <entry>channels</entry> 284 <entry>yes</entry> 285 <entry> 286 The number of channels used by this raw data type. It must be 287 a number > 0. 288 </entry> 289 </row> 290 <row> 291 <entry>description</entry> 292 <entry>no</entry> 293 <entry> 294 An optional (longer) description of the raw data type. 295 </entry> 296 </row> 297 </tbody> 298 </tgroup> 299 </table> 300 301 <para> 302 Following the <sgmltag class="starttag">raw-data-type</sgmltag> tag 303 is one or more <sgmltag class="starttag">property</sgmltag> tags. 304 Each one defines a column in the database that is designed to hold 305 data values of a particular type. The following attributes can be used 306 on this tag: 307 </para> 308 309 <table frame="all" id="appendix.rawdatatypes.property"> 310 <title>Attributes for the <sgmltag class="starttag">property</sgmltag> tag</title> 311 <tgroup cols="3" align="left"> 312 <colspec colname="attribute" align="left" /> 313 <colspec colname="required" /> 314 <colspec colname="comment" /> 315 <thead> 316 <row> 317 <entry>Attribute</entry> 318 <entry>Required</entry> 319 <entry>Comment</entry> 320 </row> 321 </thead> 322 <tbody> 323 <row> 324 <entry>*</entry> 325 <entry></entry> 326 <entry> 327 All attributes defined by the 328 <sgmltag class="starttag">property</sgmltag> tag in 329 <filename>extended-properties.xml</filename>. 
See 330 <xref linkend="appendix.extendedproperties.property" />. 331 </entry> 332 </row> 333 <row> 334 <entry>channel</entry> 335 <entry>no</entry> 336 <entry> 337 The channel number the property belongs to. Allowed values are 0 to 338 the number of channels specified for the raw data type. If the property 339 doesn't belong to any channel, set the value to 0 or leave it 340 unspecified. 341 </entry> 342 </row> 343 </tbody> 344 </tgroup> 345 </table> 346 347 <para> 348 Following the <sgmltag class="starttag">property</sgmltag> tags comes 0 349 or more <sgmltag class="starttag">intensity-formula</sgmltag> tags. 350 Each one defines mathematical formulas that can be used to 351 calculate the intensity values from the raw data. In the Genepix case, 352 there are several formulas which differ in the way background is 353 subtracted from foreground intensity values. For other raw data 354 types, the intensity formula may just copy one of the raw data values. 355 </para> 356 357 <para> 358 The intensity formulas are installed as <classname 359 docapi="net.sf.basedb.core">Formula</classname> items in the database. This 360 means that you can manually add, change or remove intensity formulas directly 361 from the web interface. The intensity formulas in the <filename>raw-data-types.xml</filename> 362 file are only used at installation time. 
363 </para> 364 365 <para> 366 The <sgmltag class="starttag">intensity-formula</sgmltag> tag has the following 367 attributes: 368 </para> 369 370 <table frame="all" id="appendix.rawdatatypes.intensity-formula"> 371 <title>Attributes for the <sgmltag class="starttag">intensity-formula</sgmltag> tag</title> 372 <tgroup cols="3" align="left"> 373 <colspec colname="attribute" align="left" /> 374 <colspec colname="required" /> 375 <colspec colname="comment" /> 376 <thead> 377 <row> 378 <entry>Attribute</entry> 379 <entry>Required</entry> 380 <entry>Comment</entry> 381 </row> 382 </thead> 383 <tbody> 384 <row> 385 <entry>name</entry> 386 <entry>yes</entry> 387 <entry> 388 A unique name for the formula. This is only used during installation. 389 </entry> 390 </row> 391 <row> 392 <entry>title</entry> 393 <entry>yes</entry> 394 <entry> 395 The title of the formula. This is used by client applications for 396 display. 397 </entry> 398 </row> 399 <row> 400 <entry>description</entry> 401 <entry>no</entry> 402 <entry> 403 An optional, longer, description of the formula. 404 </entry> 405 </row> 406 </tbody> 407 </tgroup> 408 </table> 409 410 <para> 411 The <sgmltag class="starttag">intensity-formula</sgmltag> must contain 412 one <sgmltag class="starttag">formula</sgmltag> tag for each channel 413 of the raw data type. The attributes of this tag are: 414 </para> 415 416 <table frame="all" id="appendix.rawdatatypes.formula"> 417 <title>Attributes for the <sgmltag class="starttag">formula</sgmltag> tag</title> 418 <tgroup cols="3" align="left"> 419 <colspec colname="attribute" align="left" /> 420 <colspec colname="required" /> 421 <colspec colname="comment" /> 422 <thead> 423 <row> 424 <entry>Attribute</entry> 425 <entry>Required</entry> 426 <entry>Comment</entry> 427 </row> 428 </thead> 429 <tbody> 430 <row> 431 <entry>channel</entry> 432 <entry>yes</entry> 433 <entry> 434 The channel number. One tag for each channel must be specified. No 435 duplicates are allowed. 
436 </entry> 437 </row> 438 <row> 439 <entry>expression</entry> 440 <entry>yes</entry> 441 <entry> 442 The mathematical expression used to calculate the intensities. 443 The expression is parsed with the <classname docapi="net.sf.basedb.util.jep">Jep</classname> 444 parser. It supports the common mathematical operations such as +, -, *, /, 445 and some mathematical functions like log2(), ln(), sqrt(), etc. See the API 446 documentation for Jep for more information. You can also use two special 447 functions developed specifically for this case: 448 <itemizedlist> 449 <listitem> 450 <para> 451 raw(name): Get the value from the raw data property with the given name, 452 for example: <code>raw('ch1FgMedian')</code>. 453 </para> 454 </listitem> 455 <listitem> 456 <para> 457 mean(name): Get the mean value of the raw data property with the given name, 458 for example: <code>mean('ch1BgMean')</code>. The mean is calculated from 459 all raw data spots in the raw bioassay. 460 </para> 461 </listitem> 462 </itemizedlist> 463 </entry> 464 </row> 465 </tbody> 466 </tgroup> 467 </table> 117 468 118 469 </sect1>
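To make the attribute tables in this changeset concrete, here is a minimal sketch of what a custom single-channel entry in raw-data-types.xml might look like. It is not taken from BASE: the id, name, table, column and property names are all hypothetical and chosen only for illustration, and "float" is used because it is the only property type shown in the example in the diff. The intensity formula simply copies one raw data value, which the text says may be enough for some raw data types.

```xml
<?xml version="1.0" ?>
<!DOCTYPE raw-data-types SYSTEM "raw-data-types.dtd" >
<raw-data-types>
  <!-- Hypothetical raw data type: every name below is illustrative only -->
  <raw-data-type
    id="example_onechannel"
    name="Example one-channel"
    channels="1"
    table="RawDataExampleOneChannel"
    description="Illustrative raw data type with a single channel"
    >
    <!-- No channel attribute: the property does not belong to any channel -->
    <property
      name="diameter"
      title="Spot diameter"
      description="The diameter of the spot"
      column="diameter"
      type="float"
    />
    <!-- channel must be between 0 and the channels value of the raw data type -->
    <property
      name="intensity"
      title="Intensity"
      description="The measured intensity of the spot"
      column="intensity"
      type="float"
      channel="1"
    />
    <!-- One formula per channel, no duplicates; this one copies a raw value -->
    <intensity-formula
      name="copy"
      title="Raw intensity"
      description="Use the raw intensity value unchanged"
      >
      <formula
        channel="1"
        expression="raw('intensity')"
      />
    </intensity-formula>
  </raw-data-type>
</raw-data-types>
```

If an entry like this were added to an existing installation, the procedure described in the diff would apply: shut down the web server and any job agents, edit the file, run updatedb.sh so the new table is created, then start everything again.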