1 | <?xml version="1.0" encoding="UTF-8"?> |
---|
2 | <!DOCTYPE chapter PUBLIC |
---|
3 | "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN" |
---|
4 | "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd"> |
---|
5 | |
---|
6 | <!-- |
---|
7 | $Id: features.xml 5845 2011-11-01 16:58:55Z jari $ |
---|
8 | |
---|
9 | Copyright (C) 2008, 2011 Jari Häkkinen |
---|
10 | |
---|
11 | This file is part of BASE - BioArray Software Environment. |
---|
12 | Available at http://base.thep.lu.se/ |
---|
13 | |
---|
14 | BASE is free software; you can redistribute it and/or |
---|
15 | modify it under the terms of the GNU General Public License |
---|
16 | as published by the Free Software Foundation; either version 3 |
---|
17 | of the License, or (at your option) any later version. |
---|
18 | |
---|
19 | BASE is distributed in the hope that it will be useful, |
---|
20 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
---|
21 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
---|
22 | GNU General Public License for more details. |
---|
23 | |
---|
24 | You should have received a copy of the GNU General Public License |
---|
25 | along with BASE. If not, see <http://www.gnu.org/licenses/>. |
---|
26 | --> |
---|
27 | |
---|
28 | <chapter id="features" chunked="0"> |
---|
29 | <?dbhtml filename="features.html"?> |
---|
30 | <title>BASE features</title> |
---|
31 | |
---|
32 | <para> |
---|
33 | The BASE application features many components: MIAME compliance, |
---|
34 | multi-user support, data sharing, data access management, array and |
---|
35 | biomaterial LIMS, multiple array platforms, RNAseq |
---|
36 | support, extensibility, configurable plug-ins, annotation |
---|
37 | customisation, streamlined access to analysis tools, integration |
---|
38 | of <ulink url='http://www.tm4.org/mev/'>MultiExperiment Viewer |
---|
39 | (MeV)</ulink>, web services API, and more. To support all |
---|
40 | components the underlying relational database has grown |
---|
41 | very large and complex, especially since BASE itself works with |
---|
42 | objects and needs additional database tables to keep track of objects |
---|
43 | stored in a relational database. Thus, rather than trying to |
---|
44 | describe every feature in detail here, we highlight some of the |
---|
45 | more important features. |
---|
46 | </para> |
---|
47 | |
---|
48 | <sect1 id="features.webinterface"> |
---|
49 | <title>Web interface</title> |
---|
50 | |
---|
51 | <para> |
---|
52 | The entire system is accessed through a web interface over the |
---|
53 | Internet using a standard web browser, such as Firefox, Safari, |
---|
54 | Opera, or Internet Explorer. Access privileges to a particular |
---|
55 | BASE installation are managed by personal accounts through the |
---|
56 | web interface. A local administrator creates new user accounts |
---|
57 | with specific roles and access privileges and has an overall |
---|
58 | managerial responsibility for an individual BASE |
---|
59 | installation. With the exception of the administrator, who has global |
---|
60 | data access, individual users have sole access to and control over |
---|
61 | the data they enter. Users can share data |
---|
62 | they own (or have share credentials for) with other users of the |
---|
63 | same BASE installation. |
---|
64 | </para> |
---|
65 | |
---|
66 | </sect1> |
---|
67 | |
---|
68 | <sect1 id="features.datamangement"> |
---|
69 | <title>Information and annotation management</title> |
---|
70 | |
---|
71 | <para> |
---|
72 | BASE features a biomaterial LIMS tracking biological material |
---|
73 | from its source to hybridisation/sequencing and ultimately to |
---|
74 | raw data and analysis. All events throughout sample handling are |
---|
75 | tracked and information on used and remaining quantities, |
---|
76 | physical sample locations, quality control information, and |
---|
77 | sample relations is stored in BASE. Racks or boxes holding |
---|
78 | biomaterials can be created as BioPlates and plate events are |
---|
79 | easily performed for extraction or labelling events. Although |
---|
80 | becoming less commonly used, the array production LIMS of |
---|
81 | previous BASE versions is retained to support researchers |
---|
82 | with spotting facilities, e.g., protein array production and |
---|
83 | BAC array printing that may not be commercially available. |
---|
84 | </para> |
---|
85 | |
---|
86 | <para> |
---|
87 | Events in biomaterial and array LIMS can be annotated with |
---|
88 | protocols and event dates, and most items can be annotated with |
---|
89 | customisable annotation types such as floats, integers, dates, |
---|
90 | and Boolean flags. Change history for biomaterial items is available |
---|
91 | if configured and can be used to track modifications in the database. |
---|
92 | Annotations are either free form or from a preset list of values, |
---|
93 | and can be marked as required for MIAME compliance. The annotation |
---|
94 | system is searchable and the user can select any annotation to be |
---|
95 | an experimental factor in analysis, whereby it becomes available to |
---|
96 | analysis plug-ins and plot tools. |
---|
97 | </para> |
---|
98 | |
---|
99 | </sect1> |
---|
100 | |
---|
101 | <sect1 id="features.sharingandprivacy"> |
---|
102 | <title>Data sharing and privacy</title> |
---|
103 | |
---|
104 | <para> |
---|
105 | One of the important features of BASE is its capabilities as a |
---|
106 | local data repository. The repository functionality is supplemented |
---|
107 | with data grouping, sharing, and privacy policies. A BASE |
---|
108 | project is used to group items (biomaterial, raw data, and |
---|
109 | experiments) into a logical entity, and a BASE experiment is a |
---|
110 | collection of bioassays, e.g., array data, grouped logically together |
---|
111 | for further analysis. All items can co-exist in several projects |
---|
112 | and experiments without any unnecessary copying of information. |
---|
113 | </para> |
---|
114 | |
---|
115 | <para> |
---|
116 | Data privacy is guarded by the data owner and BASE allows the |
---|
117 | owner to set data access rules. To this end, each item in BASE |
---|
118 | is owned by a user, who can then share the data with |
---|
119 | colleagues. The grouping of data in projects allows the data |
---|
120 | owner to simply include other users in a project in order to |
---|
121 | share data. Each item can have different access levels even |
---|
122 | within a project, and project members can have different |
---|
123 | privileges. The data access rules are very flexible and can be |
---|
124 | overwhelming since access levels on almost any item can be |
---|
125 | individually set. However, using projects, the proper access |
---|
126 | levels can be set at a single point of interaction. |
---|
127 | </para> |
---|
128 | |
---|
129 | </sect1> |
---|
130 | |
---|
131 | <sect1 id="features.directorystructure"> |
---|
132 | <title>File and directory structure</title> |
---|
133 | |
---|
134 | <para> |
---|
135 | BASE has an integrated file system to provide the possibility for |
---|
136 | researchers to collect all data files related to a project in |
---|
137 | one single storage location. Data files are uploaded using a web |
---|
138 | browser or an ftp client. The file storage is an integral part |
---|
139 | of a strategy to store all experiment relevant data in BASE, |
---|
140 | even data types not already supported in analysis. Collecting |
---|
141 | all data allows future reuse of the data as more data are |
---|
142 | produced, and new analysis tools become available. |
---|
143 | </para> |
---|
144 | |
---|
145 | </sect1> |
---|
146 | |
---|
147 | <sect1 id="features.plugininfrastructure"> |
---|
148 | <title>Plugin and extension infrastructure</title> |
---|
149 | |
---|
150 | <!-- |
---|
151 | Analysis, extensions, and plug-ins |
---|
152 | --> |
---|
153 | <para> |
---|
154 | BASE features a hierarchically organised analysis interface that |
---|
155 | allows data filtering, normalisation, transformation, and other |
---|
156 | analyses. Parameters and settings are automatically stored for |
---|
157 | each step in the analysis. The selection of analysis tools |
---|
158 | depends on array type and available plug-ins where a wide range |
---|
159 | of tools are pre-installed with BASE, and optional plug-ins can |
---|
160 | be downloaded from the <ulink |
---|
161 | url='http://baseplugins.thep.lu.se'>BASE plug-in site |
---|
162 | </ulink>. BASE capitalises on other software tools, such as |
---|
163 | MeV, by integrating them into the user interface. Such |
---|
164 | integration provides streamlined access to analysis modules in |
---|
165 | external tools. BASE even features a rudimentary manual |
---|
166 | transform creator that enables researchers to add analysis steps |
---|
167 | performed independently of BASE to the hierarchical overview of |
---|
168 | analyses. The transform creator enables storage of |
---|
169 | result files and parameter information for archival, tracking, |
---|
170 | and sharing purposes. |
---|
171 | </para> |
---|
172 | |
---|
173 | <para> |
---|
174 | The analysis of genomics data is continuously evolving with new |
---|
175 | methods and techniques. To this end BASE provides extension and |
---|
176 | plug-in programming interfaces (APIs) to enable straightforward |
---|
177 | additions of new analysis tools. The use of the APIs is well |
---|
178 | documented and there are numerous examples of how to create |
---|
179 | extensions. The MeV and ftp-server integrations both utilise the |
---|
180 | extension mechanism, and the automatically generated overview |
---|
181 | plots available in the experimental analysis view are also |
---|
182 | extensions. The plug-in API is used for all data imports and |
---|
183 | exports, and most analysis tools, providing new developers a lot |
---|
184 | of example code to examine when they create BASE plug-ins. |
---|
185 | </para> |
---|
186 | |
---|
187 | </sect1> |
---|
188 | |
---|
189 | <sect1 id="features.batchdata"> |
---|
190 | <title>Batch upload and download of data</title> |
---|
191 | |
---|
192 | <para> |
---|
193 | File, annotation, and item upload can be done asynchronously as |
---|
194 | data are generated or information becomes available. To relieve |
---|
195 | researchers from the tedious task of entering data one by one, a |
---|
196 | set of batch importers was created; the information generated |
---|
197 | throughout the experimental work is uploaded to BASE in plain |
---|
198 | tab-separated files. These files are supplied to batch importer |
---|
199 | plug-ins that parse the files and create items and associations |
---|
200 | according to the information in the files. The same plug-ins can |
---|
201 | be used to batch update many items. Similarly, annotating items |
---|
202 | is done by creating tab-separated files with annotation |
---|
203 | information, uploading these to BASE, and loading the file |
---|
204 | content into the database using annotation importers. If needed, |
---|
205 | annotations are easily updated with the same mechanism. |
---|
206 | </para> |
---|
207 | |
---|
208 | <para> |
---|
209 | Files uploaded to BASE are stored in the directory structure |
---|
210 | within BASE and multiple files are easily transferred to BASE |
---|
211 | either packaged in compressed files with a single upload action, |
---|
212 | or by using an ftp client supporting transfer of file |
---|
213 | structures. Similarly, downloading multiple files is |
---|
214 | straightforward either using an ftp client or by a single click |
---|
215 | in the BASE web interface. Download of items is done through |
---|
216 | item listing views enabling users to filter and select what |
---|
217 | information should be downloaded. |
---|
218 | </para> |
---|
219 | |
---|
220 | </sect1> |
---|
221 | |
---|
222 | <sect1 id="features.supportedarrays"> |
---|
223 | <title>Supported array platforms and raw data formats</title> |
---|
224 | |
---|
225 | <para> |
---|
226 | There are many types of microarrays, techniques, and brands |
---|
227 | available to researchers: one- or two-channel hybridizations, |
---|
228 | spotted cDNA/oligo arrays, Affymetrix (GeneChip), Illumina (SNP, |
---|
229 | DASL, WGEX, microRNA), aCGH, SNP, tiling arrays, and many |
---|
230 | more. Data are produced in different file formats that must be |
---|
231 | treated differently depending on type. |
---|
232 | </para> |
---|
233 | |
---|
234 | <para> |
---|
235 | Many platforms and experimental setups are supported in |
---|
236 | downstream analysis but some microarray techniques cannot |
---|
237 | currently be analysed within BASE simply because of a lack of support |
---|
238 | in available plug-ins. The problem is resolved by creating new, |
---|
239 | or extending available, plug-ins that add analysis capabilities |
---|
240 | for platforms and techniques not readily supported in |
---|
241 | analysis. Extending analysis capabilities to new technologies is |
---|
242 | only a matter of local needs and resources. We add support for |
---|
243 | platforms in use at the Lund University microarray facility and |
---|
244 | make our tools freely available to the community. |
---|
245 | </para> |
---|
246 | |
---|
247 | <para> |
---|
248 | For two-channel array platforms it is straightforward to |
---|
249 | customize BASE for a specific array platform: the platform |
---|
250 | simply needs to be adapted to the (BASE) Generic platform. The |
---|
251 | adaptation consists of creating a raw data format definition and |
---|
252 | configuring raw data importers, or making use of already available |
---|
253 | raw data formats. However, it is not always possible to make a |
---|
254 | natural mapping of a platform to the Generic platform. Platforms |
---|
255 | such as Affymetrix and Illumina cannot naturally be |
---|
256 | mapped onto the Generic two-channel platform. For Affymetrix, |
---|
257 | BASE comes with a specific Affymetrix platform and Illumina can |
---|
258 | be supported by customizing BASE (go to the <ulink |
---|
259 | url="http://baseplugins.thep.lu.se/wiki/net.sf.basedb.illumina"> |
---|
260 | Illumina package</ulink> web site for more information on adding |
---|
261 | Illumina support to BASE). |
---|
262 | </para> |
---|
263 | |
---|
264 | <para> |
---|
265 | How to adapt new array platforms to the Generic platform format |
---|
266 | or how to create a new platform type in BASE is described |
---|
267 | elsewhere in this document. Here we list different array |
---|
268 | platforms used in BASE and also list raw data types supported |
---|
269 | by BASE. However, not all platforms and raw data types listed |
---|
270 | below are available out of the box, and a BASE administrator must |
---|
271 | customize the local BASE installation for its specific |
---|
272 | needs. What comes pre-configured when BASE is installed is |
---|
273 | indicated in the lists below. |
---|
274 | </para> |
---|
275 | |
---|
276 | <sect2 id="features.supportedarrays.platforms"> |
---|
277 | <title>Vendor specific and custom printing array |
---|
278 | platforms</title> |
---|
279 | |
---|
280 | <para> |
---|
281 | Not all array platforms listed below are available by |
---|
282 | default. The comments to specific platforms explain how to |
---|
283 | enable the use of the array platform in BASE. In some cases |
---|
284 | there is no confirmed usage of a platform but we believe it |
---|
285 | has been tested by anonymous users. |
---|
286 | </para> |
---|
287 | |
---|
288 | <variablelist> |
---|
289 | <varlistentry> |
---|
290 | <term>Affymetrix</term> |
---|
291 | <listitem> |
---|
292 | <para> |
---|
293 | The Affymetrix platform comes pre-configured with a |
---|
294 | new BASE installation. The Affymetrix platform in this |
---|
295 | context refers to the Affymetrix expression arrays. So far |
---|
296 | there has been no reason for expanding the Array |
---|
297 | platform to other chip-types. In principle any |
---|
298 | Affymetrix chip type can be stored in BASE but current |
---|
299 | plug-ins will always assume that expression data is |
---|
300 | stored and analyzed. This can be resolved by adding |
---|
301 | variants of the Affymetrix platform but the Lund BASE |
---|
302 | team currently has no plans to create Affymetrix |
---|
303 | variants. |
---|
304 | </para> |
---|
305 | </listitem> |
---|
306 | </varlistentry> |
---|
307 | |
---|
308 | <varlistentry> |
---|
309 | <term>Agilent</term> |
---|
310 | <listitem> |
---|
311 | <para> |
---|
312 | </para> |
---|
313 | </listitem> |
---|
314 | </varlistentry> |
---|
315 | |
---|
316 | <varlistentry> |
---|
317 | <term>Custom printing</term> |
---|
318 | <listitem> |
---|
319 | <para> |
---|
320 | The array layout options are endless and imagination is |
---|
321 | the only limitation ... almost. BASE can import many |
---|
322 | in-house array designs and platforms. The custom arrays |
---|
323 | usually fall back on one of the raw data types already |
---|
324 | available such as GenePix. |
---|
325 | </para> |
---|
326 | </listitem> |
---|
327 | </varlistentry> |
---|
328 | |
---|
329 | <varlistentry> |
---|
330 | <term>Illumina</term> |
---|
331 | <listitem> |
---|
332 | <para> |
---|
333 | There are several variants of the Illumina platform. Using |
---|
334 | several variants allows BASE to adapt its handling of |
---|
335 | different Illumina chip types. Illumina platform support |
---|
336 | is not included in a standard BASE installation but there |
---|
337 | is |
---|
338 | a <ulink url="http://baseplugins.thep.lu.se/wiki/net.sf.basedb.illumina"> |
---|
339 | Illumina package</ulink> available for seamless |
---|
340 | integration of the Illumina array platform into BASE. |
---|
341 | </para> |
---|
342 | </listitem> |
---|
343 | </varlistentry> |
---|
344 | |
---|
345 | <varlistentry> |
---|
346 | <term>ImaGene</term> |
---|
347 | <listitem> |
---|
348 | <para> |
---|
349 | No successful use confirmed but ImaGene raw data is |
---|
350 | available in BASE. |
---|
351 | </para> |
---|
352 | </listitem> |
---|
353 | </varlistentry> |
---|
354 | |
---|
355 | <varlistentry> |
---|
356 | <term>Unlisted</term> |
---|
357 | <listitem> |
---|
358 | <para> |
---|
359 | In principle any platform generating a matrix of data can |
---|
360 | be imported into BASE. Simply utilize the available raw |
---|
361 | data formats and data importers. |
---|
362 | </para> |
---|
363 | </listitem> |
---|
364 | </varlistentry> |
---|
365 | </variablelist> |
---|
366 | |
---|
367 | </sect2> |
---|
368 | |
---|
369 | <sect2 id="features.supportedarrays.rawdatatypes"> |
---|
370 | <title>Available raw data types</title> |
---|
371 | |
---|
372 | <para> |
---|
373 | Raw data comes in many different formats. These formats are |
---|
374 | usually defined by scanner software vendors and BASE must keep |
---|
375 | track of the different formats for analysis and plotting. BASE |
---|
376 | supports many formats out of the box, but some formats need to |
---|
377 | be added manually by the BASE administrator (indicated in the |
---|
378 | list below). |
---|
379 | </para> |
---|
380 | |
---|
381 | <variablelist> |
---|
382 | |
---|
383 | <varlistentry> |
---|
384 | <term>Affymetrix</term> |
---|
385 | <listitem> |
---|
386 | <para> |
---|
387 | </para> |
---|
388 | </listitem> |
---|
389 | </varlistentry> |
---|
390 | |
---|
391 | <varlistentry> |
---|
392 | <term>AIDA</term> |
---|
393 | <listitem> |
---|
394 | <para> |
---|
395 | </para> |
---|
396 | </listitem> |
---|
397 | </varlistentry> |
---|
398 | |
---|
399 | <varlistentry> |
---|
400 | <term>Agilent</term> |
---|
401 | <listitem> |
---|
402 | <para> |
---|
403 | </para> |
---|
404 | </listitem> |
---|
405 | </varlistentry> |
---|
406 | |
---|
407 | <varlistentry> |
---|
408 | <term>BZScan</term> |
---|
409 | <listitem> |
---|
410 | <para> |
---|
411 | </para> |
---|
412 | </listitem> |
---|
413 | </varlistentry> |
---|
414 | |
---|
415 | <varlistentry> |
---|
416 | <term>ChipSkipper</term> |
---|
417 | <listitem> |
---|
418 | <para> |
---|
419 | </para> |
---|
420 | </listitem> |
---|
421 | </varlistentry> |
---|
422 | |
---|
423 | <varlistentry> |
---|
424 | <term>GenePix</term> |
---|
425 | <listitem> |
---|
426 | <para> |
---|
427 | </para> |
---|
428 | </listitem> |
---|
429 | </varlistentry> |
---|
430 | |
---|
431 | <varlistentry> |
---|
432 | <term>GeneTAC</term> |
---|
433 | <listitem> |
---|
434 | <para> |
---|
435 | </para> |
---|
436 | </listitem> |
---|
437 | </varlistentry> |
---|
438 | |
---|
439 | <varlistentry> |
---|
440 | <term>Illumina</term> |
---|
441 | <listitem> |
---|
442 | <para> |
---|
443 | It is recommended to base Illumina array platform usage |
---|
444 | on the <emphasis>Illumina Bead Summary |
---|
445 | (IBS)</emphasis> raw data format below. |
---|
446 | </para> |
---|
447 | </listitem> |
---|
448 | </varlistentry> |
---|
449 | |
---|
450 | <varlistentry> |
---|
451 | <term>Illumina Bead Summary (IBS)</term> |
---|
452 | <listitem> |
---|
453 | <para> |
---|
454 | Not available in BASE directly but it is added with |
---|
455 | the <ulink url="http://baseplugins.thep.lu.se/wiki/net.sf.basedb.illumina"> |
---|
456 | Illumina plug-in</ulink> that adds Illumina array |
---|
457 | platform support to BASE. |
---|
458 | </para> |
---|
459 | </listitem> |
---|
460 | </varlistentry> |
---|
461 | |
---|
462 | <varlistentry> |
---|
463 | <term>ImaGene</term> |
---|
464 | <listitem> |
---|
465 | <para> |
---|
466 | </para> |
---|
467 | </listitem> |
---|
468 | </varlistentry> |
---|
469 | |
---|
470 | <varlistentry> |
---|
471 | <term>QuantArray Biotin</term> |
---|
472 | <listitem> |
---|
473 | <para> |
---|
474 | </para> |
---|
475 | </listitem> |
---|
476 | </varlistentry> |
---|
477 | |
---|
478 | <varlistentry> |
---|
479 | <term>QuantArray Cy</term> |
---|
480 | <listitem> |
---|
481 | <para> |
---|
482 | </para> |
---|
483 | </listitem> |
---|
484 | </varlistentry> |
---|
485 | |
---|
486 | <varlistentry> |
---|
487 | <term>SpotFinder</term> |
---|
488 | <listitem> |
---|
489 | <para> |
---|
490 | </para> |
---|
491 | </listitem> |
---|
492 | </varlistentry> |
---|
493 | |
---|
494 | </variablelist> |
---|
495 | |
---|
496 | </sect2> |
---|
497 | |
---|
498 | </sect1> |
---|
499 | |
---|
500 | <sect1 id="features.repositoryandstandards"> |
---|
501 | <title>Repository and standards</title> |
---|
502 | |
---|
503 | <para> |
---|
504 | The Microarray Gene Expression Data Society (MGED) develops and |
---|
505 | maintains standards for data acquisition, representation, and |
---|
506 | interchange such as the MIAME guidelines, the MAGE-TAB |
---|
507 | interchange format, and the MGED Ontology for microarray |
---|
508 | experiments. BASE does not enforce the use of the MGED standards |
---|
509 | but supports storage of information required by MIAME. BASE has |
---|
510 | an experiment item overview functionality useful for validating |
---|
511 | information related to experiments. The validation level is user |
---|
512 | selectable, and the option regarding MIAME compliance is |
---|
513 | most relevant here. When users or server administrators create |
---|
514 | annotation types in BASE these annotation types can be marked |
---|
515 | as required by MIAME and optionally defined to be a list of |
---|
516 | pre-defined values from a controlled vocabulary. Validation will |
---|
517 | check for inconsistencies and report errors, and give the user |
---|
518 | an opportunity to fix issues immediately or later. After |
---|
519 | resolving the issues raised by the validation, data can be |
---|
520 | exported for submission to public repositories such as |
---|
521 | ArrayExpress, Gene Expression Omnibus (GEO), and CIBEX. |
---|
522 | </para> |
---|
523 | |
---|
524 | </sect1> |
---|
525 | |
---|
526 | </chapter> |
---|