1 | <?xml version="1.0" encoding="UTF-8"?> |
---|
2 | <!DOCTYPE sect1 PUBLIC |
---|
3 | "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN" |
---|
4 | "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd"> |
---|
5 | <!-- |
---|
6 | $Id: rawbioassays.xml 7640 2019-03-11 13:13:29Z nicklas $ |
---|
7 | |
---|
8 | Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson |
---|
9 | |
---|
10 | This file is part of BASE - BioArray Software Environment. |
---|
11 | Available at http://base.thep.lu.se/ |
---|
12 | |
---|
13 | BASE is free software; you can redistribute it and/or |
---|
14 | modify it under the terms of the GNU General Public License |
---|
15 | as published by the Free Software Foundation; either version 3 |
---|
16 | of the License, or (at your option) any later version. |
---|
17 | |
---|
18 | BASE is distributed in the hope that it will be useful, |
---|
19 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
---|
20 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
---|
21 | GNU General Public License for more details. |
---|
22 | |
---|
23 | You should have received a copy of the GNU General Public License |
---|
24 | along with BASE. If not, see <http://www.gnu.org/licenses/>. |
---|
25 | --> |
---|
26 | |
---|
27 | <sect1 id="experiments_analysis.rawbioassay"> |
---|
28 | <?dbhtml filename="rawbioassays.html" ?> |
---|
29 | <title>Raw bioassays</title> |
---|
30 | <para> |
---|
31 | A <guilabel>Raw bioassay</guilabel> is the representation |
---|
32 | of the result of analyzing data from the physical bioassay |
---|
33 | down to the point where we have a file or a set of files |
---|
34 | containing measurements per feature (e.g. spot, gene, etc.) |
---|
35 | for a single sample or extract. Further analysis is usually |
---|
36 | needed before we can say something about individual features |
---|
37 | or samples and how they relate to each other. This |
---|
38 | kind of analysis is done in <guilabel>Experiments</guilabel>. |
---|
39 | See <xref linkend="experiments_analysis.experiments" />. |
---|
40 | </para> |
---|
41 | |
---|
42 | <para> |
---|
43 | The term <guilabel>Raw bioassay</guilabel> is a bit misleading since the |
---|
44 | real "raw data" is actually the images from a microarray scan or the |
---|
45 | output from a sequencer. For historical reasons we have chosen to keep |
---|
46 | the term raw bioassay since this represents the first possibility for |
---|
47 | a transition between file-based data and database-stored data. Typically, |
---|
48 | all pre-rawbioassay analysis is done outside of BASE, and although |
---|
49 | we now have the possibility to track this in detail, it will |
---|
50 | probably remain so for some time in the future. See |
---|
51 | <xref linkend="experiments_analysis.derivedbioassays" />. |
---|
52 | |
---|
53 | </para> |
---|
54 | |
---|
55 | <sect2 id="experiments_analysis.rawbioassay.create"> |
---|
56 | <title>Create raw bioassays</title> |
---|
57 | <para> |
---|
58 | Creating a new raw bioassay is a two- or three-step process: |
---|
59 | </para> |
---|
60 | |
---|
61 | <orderedlist> |
---|
62 | <listitem> |
---|
63 | <para> |
---|
64 | Create a new raw bioassay item with the &gbNew; button in the raw bioassays list view. |
---|
65 | It is also possible to create raw bioassays from the derived bioassays |
---|
66 | list view and single-item view pages. |
---|
67 | </para> |
---|
68 | </listitem> |
---|
69 | <listitem> |
---|
70 | <para> |
---|
71 | Upload the file(s) with the raw data and attach them to the |
---|
72 | raw bioassay. |
---|
73 | </para> |
---|
74 | </listitem> |
---|
75 | <listitem> |
---|
76 | <para> |
---|
77 | The platform used may require that the data is imported into the database. |
---|
78 | See <xref linkend="import_data" />. If the platform is a |
---|
79 | file-only platform, this step can be skipped. |
---|
80 | </para> |
---|
81 | </listitem> |
---|
82 | </orderedlist> |
---|
83 | |
---|
84 | <note> |
---|
85 | <title>Supported file formats</title> |
---|
86 | BASE has built-in support for most file formats where the data comes |
---|
87 | in a tab-separated (or similar) form. Data for one raw bioassay |
---|
88 | must be in a single file. Support for other file formats |
---|
89 | may be added through plug-ins. |
---|
90 | </note> |
---|
91 | </sect2> |
---|
92 | |
---|
93 | <sect2 id="experiments_analysis.rawbioassay.properties"> |
---|
94 | <title>Raw bioassay properties</title> |
---|
95 | |
---|
96 | <figure |
---|
97 | id="experiments_analysis.figures.rawbioassay.edit"> |
---|
98 | <title>Raw bioassay properties</title> |
---|
99 | <screenshot> |
---|
100 | <mediaobject> |
---|
101 | <imageobject> |
---|
102 | <imagedata |
---|
103 | fileref="figures/rawbioassay_edit.png" format="PNG" /> |
---|
104 | </imageobject> |
---|
105 | </mediaobject> |
---|
106 | </screenshot> |
---|
107 | </figure> |
---|
108 | |
---|
109 | <helptext external_id="rawbioassay.edit" title="Edit raw bioassay"> |
---|
110 | |
---|
111 | <variablelist> |
---|
112 | <varlistentry> |
---|
113 | <term><guilabel>Name</guilabel></term> |
---|
114 | <listitem> |
---|
115 | <para> |
---|
116 | The name of the raw bioassay. |
---|
117 | </para> |
---|
118 | </listitem> |
---|
119 | </varlistentry> |
---|
120 | <varlistentry> |
---|
121 | <term> |
---|
122 | <guilabel>Platform</guilabel> |
---|
123 | </term> |
---|
124 | <listitem> |
---|
125 | <para> |
---|
126 | Select the platform / variant used for the |
---|
127 | raw bioassay. The selected option affects which |
---|
128 | files can be selected on the <guilabel>Data files</guilabel> |
---|
129 | tab. If the platform supports importing data to the database |
---|
130 | you must also select a <guilabel>Raw data type</guilabel>. |
---|
131 | </para> |
---|
132 | </listitem> |
---|
133 | </varlistentry> |
---|
134 | <varlistentry> |
---|
135 | <term><guilabel>Raw data type</guilabel></term> |
---|
136 | <listitem> |
---|
137 | <para> |
---|
138 | The type of raw data. This option is disabled for file-only |
---|
139 | platforms and for platforms that are locked to a specific |
---|
140 | raw data type. This cannot be changed after raw data has been |
---|
141 | imported. <nohelp>See |
---|
142 | <xref linkend="experiments_analysis.rawdatatypes" />.</nohelp> |
---|
143 | </para> |
---|
144 | </listitem> |
---|
145 | </varlistentry> |
---|
146 | <varlistentry> |
---|
147 | <term><guilabel>Parent bioassay</guilabel></term> |
---|
148 | <listitem> |
---|
149 | <para> |
---|
150 | The derived bioassay that is the parent of this |
---|
151 | raw bioassay. |
---|
152 | </para> |
---|
153 | </listitem> |
---|
154 | </varlistentry> |
---|
155 | <varlistentry> |
---|
156 | <term><guilabel>Parent extract</guilabel></term> |
---|
157 | <listitem> |
---|
158 | <para> |
---|
159 | The extract which this raw bioassay has measured. This is normally selected |
---|
160 | among the extracts that are linked with the physical bioassay that this |
---|
161 | raw bioassay is coming from. Selecting the correct extract is important if the |
---|
162 | physical bioassay contains more than one extract, since otherwise it may affect |
---|
163 | how annotations are inherited and used in downstream analysis. |
---|
164 | </para> |
---|
165 | </listitem> |
---|
166 | </varlistentry> |
---|
167 | <varlistentry> |
---|
168 | <term><guilabel>Array design</guilabel></term> |
---|
169 | <listitem> |
---|
170 | <para> |
---|
171 | The array design used on the array slide (optional). |
---|
172 | If an array design is specified |
---|
173 | the import will verify that the raw data has |
---|
174 | the same reporter on the same position. This |
---|
175 | prevents mistakes but also speeds up analysis |
---|
176 | since some optimizations can be used when assigning |
---|
177 | positions in bioassay sets. |
---|
178 | The array design can be changed after raw data has been |
---|
179 | imported, but this triggers a new validation. If the raw data |
---|
180 | is stored in the database, the features on the new array design must |
---|
181 | match the raw data. The verification can use three different methods: |
---|
182 | </para> |
---|
183 | |
---|
184 | <itemizedlist> |
---|
185 | <listitem> |
---|
186 | <para> |
---|
187 | Coordinates: Verify block, meta-grid, row and column coordinates. |
---|
188 | </para> |
---|
189 | </listitem> |
---|
190 | <listitem> |
---|
191 | <para>Position: Verify the position number.</para> |
---|
192 | </listitem> |
---|
193 | <listitem> |
---|
194 | <para> |
---|
195 | Feature ID: Verify the feature ID. This option can only be used |
---|
196 | if the raw bioassay is currently connected to an array design that |
---|
197 | has feature ID values already. |
---|
198 | </para> |
---|
199 | </listitem> |
---|
200 | </itemizedlist> |
---|
201 | <para> |
---|
202 | In all three cases it is also verified that the reporter of the raw |
---|
203 | data matches the reporter of the features. |
---|
204 | </para> |
---|
205 | |
---|
206 | <para> |
---|
207 | For Affymetrix data, the |
---|
208 | CEL file is validated against the CDF file of the new array design. |
---|
209 | If the validation fails, the array design is not changed. |
---|
210 | </para> |
---|
211 | </listitem> |
---|
212 | </varlistentry> |
---|
213 | <varlistentry> |
---|
214 | <term><guilabel>Software</guilabel></term> |
---|
215 | <listitem> |
---|
216 | <para> |
---|
217 | The software used to generate the raw data (optional). |
---|
218 | </para> |
---|
219 | </listitem> |
---|
220 | </varlistentry> |
---|
221 | <varlistentry> |
---|
222 | <term><guilabel>Protocol</guilabel></term> |
---|
223 | <listitem> |
---|
224 | <para> |
---|
225 | The protocol used when generating the raw data (optional). |
---|
226 | Software parameters may be registered as part of |
---|
227 | the protocol. |
---|
228 | </para> |
---|
229 | </listitem> |
---|
230 | </varlistentry> |
---|
231 | <varlistentry> |
---|
232 | <term><guilabel>Description</guilabel></term> |
---|
233 | <listitem> |
---|
234 | <para> |
---|
235 | A description of the raw bioassay (optional). |
---|
236 | </para> |
---|
237 | </listitem> |
---|
238 | </varlistentry> |
---|
239 | </variablelist> |
---|
240 | |
---|
241 | <seeother> |
---|
242 | <other external_id="datafiles.edit">Data files</other> |
---|
243 | <other external_id="annotations.edit">Annotations</other> |
---|
244 | <other external_id="annotations.edit.inerited">Inherit annotations</other> |
---|
245 | </seeother> |
---|
246 | </helptext> |
---|
247 | |
---|
248 | |
---|
249 | <para> |
---|
250 | The <guilabel>Data files</guilabel> tab allows BASE users to select |
---|
251 | files that contain data for the raw bioassay. |
---|
252 | Read more about this in <xref linkend="platforms.selectfiles" />. |
---|
253 | </para> |
---|
254 | |
---|
255 | <para> |
---|
256 | The <guilabel>Annotations</guilabel> tab allows BASE users to use |
---|
257 | annotation types to refine the description of the bioassay. Read more about |
---|
258 | annotating items in <xref linkend="annotations.annotating" />. |
---|
259 | </para> |
---|
260 | |
---|
261 | <para> |
---|
262 | The <guilabel>Inherited annotations</guilabel> tab contains a list of those annotations |
---|
263 | that are inherited from the bioassay's parents. Information about working with inherited |
---|
264 | annotations can be found in <xref linkend="annotations.inheriting" />. |
---|
265 | </para> |
---|
266 | |
---|
267 | </sect2> |
---|
268 | |
---|
269 | <sect2 id="experiments_analysis.rawbioassay.rawdata"> |
---|
270 | <title>Import raw data</title> |
---|
271 | <para> |
---|
272 | Depending on the platform, raw data may have to be imported after |
---|
273 | you have created the raw bioassay item. This section doesn't apply |
---|
274 | to file-only platforms. The import is handled by plug-ins. To start |
---|
275 | the import click on the <guibutton>Import…</guibutton> |
---|
276 | button on the single-item view for the raw bioassay. |
---|
277 | If this button does not appear it may be because no file |
---|
278 | format has been specified for the raw data type used by the |
---|
279 | raw bioassay or because the logged-in user does not have permission |
---|
280 | to use the import plug-in or file format. |
---|
281 | See <xref linkend="import_data" /> for more |
---|
282 | information. |
---|
283 | </para> |
---|
284 | |
---|
285 | <note> |
---|
286 | <title>File-only platforms</title> |
---|
287 | File-only platforms, such as Affymetrix, are handled differently and data is not |
---|
288 | imported into the database. |
---|
289 | </note> |
---|
290 | |
---|
291 | </sect2> |
---|
292 | |
---|
293 | <sect2 id="experiments_analysis.rawdatatypes"> |
---|
294 | <title>Raw data types</title> |
---|
295 | |
---|
296 | <para> |
---|
297 | A raw data type defines the types of measured values that can be stored |
---|
298 | for individual features in the database. Usually this includes some |
---|
299 | kind of foreground and background intensity values. The number and meaning |
---|
300 | of the values usually depends on the hardware and software used to analyze |
---|
301 | the data from the experiment. Many tools provide mean and median values, |
---|
302 | standard deviations, quality control information, etc. Since there are so |
---|
303 | many existing tools with many different data file formats, BASE uses a |
---|
304 | separate database table for each raw data type to store data. The raw data |
---|
305 | tables have been optimized for the type of raw data they can hold and only |
---|
306 | have the columns that are needed to store the data. BASE ships with a large |
---|
307 | number of pre-defined raw data types. An administrator may also define |
---|
308 | additional raw data types. See <xref linkend="appendix.rawdatatypes" /> |
---|
309 | for more information. |
---|
310 | </para> |
---|
311 | |
---|
312 | <sect3 id="experiments_analysis.fileonly"> |
---|
313 | <title>File-only platforms</title> |
---|
314 | <para> |
---|
315 | In some cases it doesn't make sense to import any data into the |
---|
316 | database. The main reason is that performance will suffer as the |
---|
317 | number of entries in the database gets higher. A typical Genepix file |
---|
318 | contains ~55K spots while an Affymetrix file may have millions. |
---|
319 | </para> |
---|
320 | <para> |
---|
321 | The drawback of keeping the data in files is that none of the generic |
---|
322 | tools in BASE can read it. Special plug-ins must be developed for each |
---|
323 | type of data file that can be used to analyze and visualize the data. |
---|
324 | For the Affymetrix platform there are implementations of the RMAExpress |
---|
325 | and Plier normalizations available on the BASE plug-ins web site. |
---|
326 | BASE also ships with built-in plug-ins for extracting metadata from |
---|
327 | Affymetrix CEL and CDF files (i.e. headers, number of spots, etc.). |
---|
328 | </para> |
---|
329 | <para> |
---|
330 | Users of other file-only platforms should check the BASE plug-ins |
---|
331 | website for plug-ins related to their platform. If they can't |
---|
332 | find any, we recommend that they try to find other users of the same |
---|
333 | platform and try to cooperate in developing the required tools and |
---|
334 | plug-ins. |
---|
335 | </para> |
---|
336 | </sect3> |
---|
337 | |
---|
338 | </sect2> |
---|
339 | |
---|
340 | </sect1> |
---|
341 | |
---|