source: trunk/doc/src/docbook/developerdoc/api_overview.xml @ 4702

Last change on this file since 4702 was 4702, checked in by Nicklas Nordborg, 14 years ago

References #1166: Store creation date for items

Fixed invalid xml in documentation.

1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
5<!--
6  $Id: api_overview.xml 4702 2008-12-11 13:23:41Z nicklas $
7
8  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
9
10  This file is part of BASE - BioArray Software Environment.
11  Available at http://base.thep.lu.se/
12
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 3
16  of the License, or (at your option) any later version.
17
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
20  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
21  GNU General Public License for more details.
22
23  You should have received a copy of the GNU General Public License
24  along with BASE. If not, see <http://www.gnu.org/licenses/>.
25-->
26
27<chapter id="api_overview">
28  <?dbhtml dir="api"?>
29  <title>API overview (how to use and code examples)</title>
30
31  <sect1 id="api_overview.public_api">
32    <title>The Public API of BASE</title>
33   
34    <para>
35      Not all public classes and methods in the <filename>BASE2Core.jar</filename>
36      and other JAR files shipped with BASE are considered as
37      <emphasis>Public API</emphasis>. This is important knowledge
38      since we will always try to maintain backwards compatibility
39      for classes that are part of the public API. For other
40      classes, changes may be introduced at any time without
41      notice or specific documentation. In other words:
42    </para>
43   
44    <note>
45      <title>Only use the public API when developing plug-ins</title>
46      <para>
47        This will maximize the chance that your plug-in will continue
48        to work with the next BASE release. If you use the non-public API
49        you do so at your own risk.
50      </para>
51    </note>
52   
53    <para>
54      See the <ulink url="http://base.thep.lu.se/chrome/site/doc/api/index.html"
55        >javadoc</ulink> for information about
56      which parts of the API belong to the public API.
57      Methods, classes and other elements that have been tagged as
58      <code>@deprecated</code> should be considered as part of the internal API
59      and may be removed in a subsequent release without warning.
60    </para>
61   
62    <para>
63      See <xref linkend="appendix.incompatible" /> to read more about
64      changes that have been introduced by each release.
65    </para>
66
67    <sect2 id="api_overview.compatibility">
68      <title>What is backwards compatibility?</title>
69     
70      <para>
71        There is a great article about this subject on <ulink 
72        url="http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs"
73          >http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs</ulink>.
74        This is what we will try to comply with. If you do not want to
75        read the entire article, here are some of the most important points:
76      </para>
77     
78     
79      <sect3 id="api_overview.compatibility.binary">
80        <title>Binary compatibility</title>
81        <para>
82        <blockquote>
83          Pre-existing Client binaries must link and run with new releases of the
84          Component without recompiling.
85        </blockquote>
86       
87        For example:
88        <itemizedlist>
89        <listitem>
90          <para>
91            We cannot change the number or types of parameters to a method
92            or constructor.
93          </para>
94        </listitem>
95        <listitem>
96          <para>
97            We cannot add or change methods to interfaces that are intended
98            to be implemented by plug-in or client code.
99          </para>
100        </listitem>
101        </itemizedlist>
102        </para>       
103      </sect3>
104     
105      <sect3 id="api_overview.compatibility.contract">
106        <title>Contract compatibility</title>
107        <para>
108          <blockquote>
109          API changes must not invalidate formerly legal Client code.
110          </blockquote>
111       
112          For example:
113          <itemizedlist>
114          <listitem>
115            <para>
116              We cannot change the implementation of a method to do
117              things differently than before. For example, allow <constant>null</constant>
118              as a return value when it was not allowed before.
119            </para>
120          </listitem>
121          </itemizedlist>
122       
123          <note>
124            <para>
125            Sometimes there is a very fine line between what is considered a
126            bug and what is considered a feature. For example, if the
127            actual implementation does not do what the javadoc says,
128            do we change the code or do we change the documentation?
129            This has to be considered from case to case and depends on
130            the age of the code and if we expect plug-ins and clients to be
131            affected by it or not.
132            </para>
133          </note>
134        </para>
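        <para>
          The <constant>null</constant> example above can be sketched as follows.
          This is only an illustration: <classname>NameLookup</classname>,
          <classname>LookupV1</classname> and <classname>LookupV2</classname> are
          hypothetical names and not part of the BASE API. Both versions have the
          same method signature, so binary compatibility is preserved, but client
          code that relied on the old contract fails at runtime.
        </para>
        <programlisting><![CDATA[
```java
import java.util.Locale;

public class ContractSketch {
    /** Hypothetical API: findName() is documented to never return null. */
    interface NameLookup {
        String findName(int id);
    }

    /** "Release 1": honours the contract, unknown ids give an empty string. */
    static class LookupV1 implements NameLookup {
        public String findName(int id) {
            return id == 1 ? "root" : "";
        }
    }

    /** "Release 2": same signature, but unknown ids now give null. */
    static class LookupV2 implements NameLookup {
        public String findName(int id) {
            return id == 1 ? "root" : null;
        }
    }

    /** Client code that was legal under the original contract. */
    static String describe(NameLookup lookup, int id) {
        return lookup.findName(id).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(describe(new LookupV1(), 1));
        try {
            describe(new LookupV2(), 2);
        } catch (NullPointerException e) {
            System.out.println("client broke without any signature change");
        }
    }
}
```
]]></programlisting>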
135      </sect3>
136     
137      <sect3 id="api_overview.compatibility.source">
138        <title>Source code compatibility</title>
139        <para>
140        This is not an important matter and is not always possible to
141        achieve. In most cases, the problems are easy to fix.
142        Example:
143       
144        <itemizedlist>
145        <listitem>
146          <para>
147          Adding a class may break a plug-in or client that imports
148          classes with <constant>.*</constant> if the same class name
149          exists in another package.
150          </para>
151        </listitem>
152        </itemizedlist>
153        </para>
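        <para>
          The import problem and its usual fix can be sketched with two JDK
          packages standing in for a BASE package and a plug-in package (an
          assumption made only to keep the example self-contained). Both
          <classname>java.util</classname> and <classname>java.sql</classname>
          contain a class named <classname>Date</classname>. With only the two
          <constant>.*</constant> imports the simple name is ambiguous and the
          file does not compile. An explicit single-type import resolves it.
        </para>
        <programlisting><![CDATA[
```java
import java.sql.*;
import java.util.*;
// Without the next line the simple name Date is ambiguous, because both
// on-demand imports above provide a class with that name.
import java.util.Date;

public class ImportSketch {
    /** Uses the simple name Date, which now always means java.util.Date. */
    static Date epoch() {
        return new Date(0L);
    }

    public static void main(String[] args) {
        System.out.println(epoch().getTime());
    }
}
```
]]></programlisting>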
154      </sect3>
155    </sect2>
156  </sect1>
157
158  <sect1 id="api_overview.data_api" chunked="1">
159    <title>The database schema and the Data Layer API</title>
160
161    <para>
162      This section gives an overview of the entire data layer API.
163      The figure below shows how different modules relate to each other.
164    </para>
165   
166    <note>
167      Not all information has been transferred from the old documentation yet.
168      The old documentation is available at
169      <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/data/index.html"
170        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/data/index.html</ulink>
171    </note>
172   
173    <figure id="data_api.figures.overview">
174      <title>Data layer overview</title>
175      <screenshot>
176        <mediaobject>
177          <imageobject>
178            <imagedata 
179              align="center"
180              scalefit="1" width="100%"
181              fileref="figures/uml/datalayer.overview.png" format="PNG" />
182          </imageobject>
183        </mediaobject>
184      </screenshot>
185    </figure>
186
187    <sect2 id="data_api.basic">
188      <title>Basic classes and interfaces</title>
189     
190      <para>
191        This document contains information about the basic classes and interfaces in this package.
192        They are important since all data-layer classes must inherit from one of the already
193        existing abstract base classes or implement one or more of the
194        existing interfaces. They contain code that is common to all classes,
195        for example implementations of the <methodname>equals()</methodname>
196        and <methodname>hashCode()</methodname> methods or how to link with the owner of an
197        item.
198      </para>
199     
200      <sect3 id="data_api.basic.uml">
201        <title>UML diagram</title>
202       
203        <figure id="data_api.figures.basic">
204          <title>Basic classes and interfaces</title>
205          <screenshot>
206            <mediaobject>
207              <imageobject>
208                <imagedata 
209                  align="center"
210                  fileref="figures/uml/datalayer.basic.png" format="PNG" />
211              </imageobject>
212            </mediaobject>
213          </screenshot>
214        </figure>
215      </sect3>
216     
217      <sect3 id="data_api.basic.classes">
218        <title>Classes</title>
219       
220        <variablelist>
221        <varlistentry>
222          <term><classname docapi="net.sf.basedb.core.data">BasicData</classname></term>
223          <listitem>
224            <para>
225            The root class. It overrides the <methodname>equals()</methodname>,
226            <methodname>hashCode()</methodname> and <methodname>toString()</methodname> methods
227            from the <classname>Object</classname> class. It also defines the
228            <varname>id</varname> and <varname>version</varname> properties.
229            All data layer classes must inherit from this class or one of its subclasses.
230            </para>
231          </listitem>
232        </varlistentry>
233       
234        <varlistentry>
235          <term><classname docapi="net.sf.basedb.core.data">OwnedData</classname></term>
236          <listitem>
237            <para>
238            Extends the <classname>BasicData</classname> class and adds
239            an <varname>owner</varname> property. The owner is a required link to a
240            <classname docapi="net.sf.basedb.core.data">UserData</classname> object, representing the user that
241            is the owner of the item.
242            </para>
243          </listitem>
244        </varlistentry>
245
246        <varlistentry>
247          <term><classname docapi="net.sf.basedb.core.data">SharedData</classname></term>
248          <listitem>
249            <para>
250            Extends the <classname>OwnedData</classname> class and adds
251            properties (<varname>itemKey</varname> and <varname>projectKey</varname>)
252            that hold access permission information for an item.
253            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
254            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects. These objects only exist if
255            the item has been shared.
256            </para>
257          </listitem>
258        </varlistentry>
259
260        <varlistentry>
261          <term><classname docapi="net.sf.basedb.core.data">CommonData</classname></term>
262          <listitem>
263            <para>
264            This is a convenience class for items that extend the <classname>SharedData</classname>
265            class and implement the <interfacename docapi="net.sf.basedb.core.data">NameableData</interfacename> and
266            <interfacename docapi="net.sf.basedb.core.data">RemovableData</interfacename> interfaces. This is one of
267            the most common situations.
268            </para>
269          </listitem>
270        </varlistentry>
271
272        <varlistentry>
273          <term><classname docapi="net.sf.basedb.core.data">AnnotatedData</classname></term>
274          <listitem>
275            <para>
276            This is a convenience class for items that can be annotated.
277            Annotations are held in <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> objects.
278            The annotation set only exists if annotations have been created for the item.
279            </para>
280          </listitem>
281        </varlistentry>
282        </variablelist>
283       
284      </sect3>
285     
286      <sect3 id="data_api.basic.interfaces">
287        <title>Interfaces</title>
288       
289        <variablelist>
290        <varlistentry>
291          <term><classname docapi="net.sf.basedb.core.data">IdentifiableData</classname></term>
292          <listitem>
293            <para>
294            All items are identifiable, which means that they have a unique <varname>id</varname>.
295            The id is unique for all items of a specific type (i.e. class). The id is a number
296            that is automatically generated by the database and has no other meaning
297            outside of the application. The <varname>version</varname> property is used for
298            detecting and preventing concurrent modifications to an item.
299            </para>
300          </listitem>
301        </varlistentry>
302       
303        <varlistentry>
304          <term><classname docapi="net.sf.basedb.core.data">OwnableData</classname></term>
305          <listitem>
306            <para>
307            An ownable item is an item which has an owner. The owner is represented as a
308            required link to a <classname docapi="net.sf.basedb.core.data">UserData</classname> object.
309            </para>
310          </listitem>
311        </varlistentry>       
312
313        <varlistentry>
314          <term><classname docapi="net.sf.basedb.core.data">ShareableData</classname></term>
315          <listitem>
316            <para>
317            A shareable item is an item which can be shared with other users, groups or projects.
318            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
319            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects.
320            </para>
321          </listitem>
322        </varlistentry>
323             
324        <varlistentry>
325          <term><classname docapi="net.sf.basedb.core.data">NameableData</classname></term>
326          <listitem>
327            <para>
328            A nameable item is an item that has a name (required) and a description
329            (optional). The name doesn't have to be unique, except in a few special
330            cases (for example, the name of a file).
331            </para>
332          </listitem>
333        </varlistentry>
334       
335        <varlistentry>
336          <term><classname docapi="net.sf.basedb.core.data">RemovableData</classname></term>
337          <listitem>
338            <para>
339            A removable item is an item that can be flagged as removed. This doesn't
340            remove the information about the item from the database, but can be used by
341            client applications to hide items that the user is not interested in.
342            A trashcan function can be used to either restore or permanently
343            remove items that have the flag set.
344            </para>
345          </listitem>
346        </varlistentry>
347               
348        <varlistentry>
349          <term><classname docapi="net.sf.basedb.core.data">SystemData</classname></term>
350          <listitem>
351            <para>
352            A system item is an item which has an additional id in the form of a string. A system id
353            is required when we need to make sure that we can get a specific item without
354            knowing the numeric id. Examples of such items are the root user and the everyone group.
355            A system id is generally constructed like:
356            <constant>net.sf.basedb.core.User.ROOT</constant>. The system ids are defined in the
357            core layer by each item class.
358            </para>
359          </listitem>
360        </varlistentry>
361
362        <varlistentry>
363          <term><classname docapi="net.sf.basedb.core.data">DiskConsumableData</classname></term>
364          <listitem>
365            <para>
366            This interface is used by items which occupy a lot of disk space and
367            should be part of the quota system, for example files. The required
368            <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname> contains information about the size,
369            location, owner etc. of the item.
370            </para>
371          </listitem>
372        </varlistentry>
373       
374        <varlistentry>
375          <term><classname docapi="net.sf.basedb.core.data">AnnotatableData</classname></term>
376          <listitem>
377            <para>
378            This interface is used by items which can be annotated. Annotations are name/value
379            pairs that are attached as extra information to an item. All annotations are
380            contained in an <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> object.
381            </para>
382          </listitem>
383        </varlistentry>
384       
385        <varlistentry>
386          <term><classname docapi="net.sf.basedb.core.data">ExtendableData</classname></term>
387          <listitem>
388            <para>
389            This interface is used by items which can have extra administrator-defined
390            columns. The functionality is similar to annotations. It is not as flexible,
391            since it is a global configuration, but has better performance. BASE will
392            generate extra database columns to store the data in the tables for items that
393            can be extended.
394            </para>
395          </listitem>
396        </varlistentry>
397       
398        <varlistentry>
399          <term><classname docapi="net.sf.basedb.core.data">BatchableData</classname></term>
400          <listitem>
401            <para>
402            This interface is a tagging interface which is used by items that need batch
403            functionality in the core.
404            </para>
405          </listitem>
406        </varlistentry>
407       
408        <varlistentry>
409          <term><classname docapi="net.sf.basedb.core.data">RegisteredData</classname></term>
410          <listitem>
411            <para>
412            This interface is used by items which register the date they were
413            created in the database. The registration date is set at creation time
414            and can't be modified later. Since this didn't exist prior to BASE 2.10
415            null values are allowed on all pre-existing items. Note! For backwards
416            compatibility reasons with existing code in
417            <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
418            the method name is <methodname>getEntryDate()</methodname>.
419            </para>
420          </listitem>
421        </varlistentry>
422        </variablelist>
423
424      </sect3>
425    </sect2>
426   
427    <sect2 id="data_api.authentication">
428      <title>User authentication and access control</title>
429     
430      <para>
431         This section gives an overview of user authentication and
432         how groups, roles and projects are used for access control
433         to items.
434      </para>
435     
436      <sect3 id="data_api.authentication.uml">
437        <title>UML diagram</title>
438       
439        <figure id="data_api.figures.authentication">
440          <title>User authentication and access control</title>
441          <screenshot>
442            <mediaobject>
443              <imageobject>
444                <imagedata 
445                  align="center"
446                  scalefit="1" width="100%"
447                  fileref="figures/uml/datalayer.authentication.png" format="PNG" />
448              </imageobject>
449            </mediaobject>
450          </screenshot>
451        </figure>
452      </sect3>
453     
454      <sect3 id="data_api.authentication.users">
455        <title>Users and passwords</title>     
456     
457        <para>
458          The <classname docapi="net.sf.basedb.core.data">UserData</classname> class holds information about users.
459          We keep the passwords in a separate table and use proxies to avoid loading
460          password data each time a user is loaded to minimize security risks. It is
461          only if the password needs to be changed that the <classname docapi="net.sf.basedb.core.data">PasswordData</classname>
462          object is loaded. The one-to-one mapping between user and password is controlled
463          by the password class, but a cascade attribute on the user class makes sure
464          that the password is deleted when a user is deleted.
465        </para>
466      </sect3>
467
468      <sect3 id="data_api.authentication.groups">
469        <title>Groups, roles and projects</title>     
470     
471        <para>
472          The <classname docapi="net.sf.basedb.core.data">GroupData</classname>, <classname docapi="net.sf.basedb.core.data">RoleData</classname> and
473          <classname docapi="net.sf.basedb.core.data">ProjectData</classname> classes hold information about groups, roles
474          and projects respectively. A user may be a member of any number of groups,
475          roles and/or projects. Membership in a project comes with an attached
476          permission value. This is the highest permission the user can have in the
477          project: regardless of the permission an item has been shared with, the
478          user never gets a higher permission. Groups may be members of other groups and
479          also of projects.
480        </para>
481       
482        <para>
483          Group membership is always accounted for, but the core only allows
484          one project at a time to be in use. This is the <emphasis>active project</emphasis>.
485          When a project is active, new items are automatically
486          added to that project with the permission given by the
487          <varname>autoPermission</varname> property.
488        </para>
489             
490      </sect3>
491     
492      <sect3 id="data_api.authentication.keys">
493        <title>Keys</title>     
494     
495        <para>
496          The <classname docapi="net.sf.basedb.core.data">KeyData</classname> class and its subclasses
497          <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname>, <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> and
498          <classname docapi="net.sf.basedb.core.data">RoleKeyData</classname>, are used to store information about access
499          permissions to items. To get permission to manipulate an item a user must have
500          access to a key giving that permission. There are three types of keys:
501        </para>
502       
503        <variablelist>
504        <varlistentry>
505          <term><classname docapi="net.sf.basedb.core.data">ItemKey</classname></term>
506          <listitem>
507            <para>
508            Is used to give a user or group access to a specific item. The item
509            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item.
510            The permissions are usually set by the owner of the item. Once created an
511            item key cannot be changed. This allows the core to reuse a key if the
512            permissions match exactly, i.e. for a given set of users/groups/permissions
513            there can be only one item key object.
514            </para>
515          </listitem>
516        </varlistentry>
517
518        <varlistentry>
519          <term><classname docapi="net.sf.basedb.core.data">ProjectKey</classname></term>
520          <listitem>
521            <para>
522            Is used to give members of a project access to a specific item. The item
523            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item. Once created a
524            project key cannot be changed. This allows the core to reuse a key if the
525            permissions match exactly, i.e. for a given set of projects/permissions
526            there can be only one project key object.
527            </para>
528          </listitem>
529        </varlistentry>
530
531        <varlistentry>
532          <term><classname docapi="net.sf.basedb.core.data">RoleKey</classname></term>
533          <listitem>
534            <para>
535            Is used to give a user access to all items of a specific type, for example
536            <constant>READ</constant> all <constant>SAMPLES</constant>. The installation
537            will make sure that there already exists a role key for each type of item, and
538            it is not possible to add new or delete existing keys. Unlike the other two types
539            this key can be modified.
540            </para>
541           
542            <para>
543            A role key is also used to assign permissions to plug-ins. If a plug-in has
544            been specified to use permissions the default is to deny everything.
545            The mapping to the role key is used to grant permissions to the plug-in.
546            The <varname>granted</varname> value gives the plug-in access to all items
547            of the related item type regardless of whether the user running the plug-in has the
548            permission or not. The <varname>denied</varname> value denies access to all
549            items of the related item type even if the logged-in user has the permission.
550            Permissions that are neither granted nor denied are checked against the
551            logged-in user's regular permissions. Permissions to items that are
552            not linked are always denied.
553            </para>
554          </listitem>
555        </varlistentry>
556        </variablelist>
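        <para>
          The plug-in rules described for role keys above can be summarized in a
          small decision function. This is a simplified sketch that looks at a
          single permission only. The <classname>Mapping</classname> enum and the
          <methodname>isAllowed()</methodname> method are hypothetical names
          chosen for the illustration and do not correspond to actual BASE
          classes or methods.
        </para>
        <programlisting><![CDATA[
```java
public class PluginPermissionSketch {
    /** Hypothetical result of looking up the plug-in's mapping to a role key. */
    enum Mapping {
        NOT_LINKED,   // the plug-in has no mapping to the role key at all
        GRANTED,      // the permission is in the "granted" value
        DENIED,       // the permission is in the "denied" value
        UNSPECIFIED   // linked, but the permission is neither granted nor denied
    }

    /**
     * Decides whether a plug-in that has been set to use permissions may
     * perform an operation, given what the logged-in user is allowed to do.
     */
    static boolean isAllowed(Mapping mapping, boolean userHasPermission) {
        switch (mapping) {
            case GRANTED:
                return true;               // allowed even if the user is not
            case DENIED:
                return false;              // denied even if the user is allowed
            case UNSPECIFIED:
                return userHasPermission;  // fall back to the user's permission
            default:
                return false;              // NOT_LINKED: always denied
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(Mapping.GRANTED, false));
        System.out.println(isAllowed(Mapping.UNSPECIFIED, true));
        System.out.println(isAllowed(Mapping.NOT_LINKED, true));
    }
}
```
]]></programlisting>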
557       
558      </sect3>
559
560      <sect3 id="data_api.authentication.permissions">
561        <title>Permissions</title>
562       
563        <para>
564          The <varname>permission</varname> property appearing in many classes is an
565          integer value describing the permission:
566        </para>
567       
568        <informaltable>
569        <tgroup cols="2">
570          <colspec colname="value" />
571          <colspec colname="permission" />
572          <thead>
573            <row>
574              <entry>Value</entry>
575              <entry>Permission</entry>
576            </row>
577          </thead>
578          <tbody>
579            <row>
580              <entry>1</entry>
581              <entry>Read</entry>
582            </row>
583            <row>
584              <entry>3</entry>
585              <entry>Use</entry>
586            </row>
587            <row>
588              <entry>7</entry>
589              <entry>Restricted write</entry>
590            </row>
591            <row>
592              <entry>15</entry>
593              <entry>Write</entry>
594            </row>
595            <row>
596              <entry>31</entry>
597              <entry>Delete</entry>
598            </row>
599            <row>
600              <entry>47 (=32+15)</entry>
601              <entry>Set owner</entry>
602            </row>
603            <row>
604              <entry>79 (=64+15)</entry>
605              <entry>Set permissions</entry>
606            </row>
607            <row>
608              <entry>128</entry>
609              <entry>Create</entry>
610            </row>
611            <row>
612              <entry>256</entry>
613              <entry>Denied</entry>
614            </row>
615          </tbody>
616        </tgroup>
617        </informaltable>
618       
619        <para>
620          The values are constructed so that
621          <constant>READ</constant> -&gt;
622          <constant>USE</constant> -&gt;
623          <constant>RESTRICTED_WRITE</constant> -&gt;
624          <constant>WRITE</constant> -&gt;
625          <constant>DELETE</constant>
626          are chained in the sense that a higher permission always implies the lower permissions
627          also. The <constant>SET_OWNER</constant> and <constant>SET_PERMISSION</constant>
628          both imply <constant>WRITE</constant> permission. The <constant>DENIED</constant>
629          permission is only valid for role keys, and if specified it overrides all
630          other permissions.               
631        </para>
632       
633        <para>
634          When combining permissions for a single item, the permission codes for the different
635          paths are OR-ed together. For example a user has a role key with <constant>READ</constant>
636          permission for <constant>SAMPLES</constant>, but also an item key with <constant>USE</constant>
637          permission for a specific sample. Of course, the resulting permission for that
638          sample is <constant>USE</constant>. For other samples the resulting permission is
639          <constant>READ</constant>.
640        </para>
641       
642        <para>
643          If the user is also a member of a project which has <constant>WRITE</constant>
644          permission for the same sample, the user will have <constant>WRITE</constant>
645          permission when working with that project.
646        </para>
647       
648        <para>
649          The <constant>RESTRICTED_WRITE</constant> permission is in most cases the same
650          as the <constant>WRITE</constant> permission. So far the <constant>RESTRICTED_WRITE</constant>
651          permission is only given to users for their own <classname docapi="net.sf.basedb.core.data">UserData</classname>
652          object so they can change their address and other contact information,
653          but not quota, expiration date and other administrative information.
654        </para>
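        <para>
          The arithmetic behind the permission codes can be tried out with a few
          lines of Java. The numeric values are taken from the table above. The
          class <classname>PermissionSketch</classname> and its
          <methodname>combine()</methodname> and <methodname>implies()</methodname>
          helpers are hypothetical and only illustrate the bit logic. The actual
          checks in the BASE core may be implemented differently.
        </para>
        <programlisting><![CDATA[
```java
public class PermissionSketch {
    // Values copied from the permission table.
    static final int READ = 1;
    static final int USE = 3;
    static final int RESTRICTED_WRITE = 7;
    static final int WRITE = 15;
    static final int DELETE = 31;
    static final int SET_OWNER = 47;       // 32 + WRITE
    static final int SET_PERMISSION = 79;  // 64 + WRITE
    static final int CREATE = 128;
    static final int DENIED = 256;

    /** ORs together the codes obtained through different paths (keys). */
    static int combine(int... codes) {
        int result = 0;
        for (int code : codes) {
            result |= code;
        }
        return result;
    }

    /** True if 'granted' contains every bit of 'required' and is not denied. */
    static boolean implies(int granted, int required) {
        if ((granted & DENIED) != 0) {
            return false; // DENIED overrides all other permissions
        }
        return (granted & required) == required;
    }

    public static void main(String[] args) {
        int viaRoleKey = READ;  // role key: READ all samples
        int viaItemKey = USE;   // item key: USE one specific sample
        System.out.println(combine(viaRoleKey, viaItemKey) == USE);
        System.out.println(implies(DELETE, WRITE));
        System.out.println(implies(SET_OWNER, DELETE));
    }
}
```
]]></programlisting>
        <para>
          Because each permission in the chain contains all bits of the one
          before it, OR-ing <constant>READ</constant> with <constant>USE</constant>
          gives <constant>USE</constant>, and <constant>SET_OWNER</constant>
          contains <constant>WRITE</constant> but not the extra bit that
          <constant>DELETE</constant> adds.
        </para>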
655
656      </sect3>
657    </sect2>
658
659    <sect2 id="data_api.wares">
660      <title>Hardware and software</title>
661      <para>
662         This section gives an overview of hardware and software in BASE.
663      </para>
664     
665      <sect3 id="data_api.wares.uml">
666        <title>UML diagram</title>
667       
668        <figure id="data_api.figures.wares">
669          <title>Hardware and software</title>
670          <screenshot>
671            <mediaobject>
672              <imageobject>
673                <imagedata 
674                  align="center"
675                  fileref="figures/uml/datalayer.wares.png" format="PNG" />
676              </imageobject>
677            </mediaobject>
678          </screenshot>
679        </figure>
680      </sect3>
681     
682      <sect3 id="data_api.wares.description">
683        <title>Hardware and software</title>
684        <para>
685          BASE is pre-installed with a set of hardware and software types.
686          They are typically used to filter the registered hardware and software
687          depending on what a user is doing. For example, when adding raw data
688          to BASE a user can select a scanner. The GUI will display the hardware
689          that has been registered as <emphasis>scanner</emphasis> hardware types.
690          Other hardware types are <emphasis>hybridization station</emphasis>
691          and <emphasis>print robot</emphasis>. An administrator may register more
692          hardware and software types.
693        </para>
694      </sect3>
695    </sect2>
696   
697    <sect2 id="data_api.reporters">
698      <title>Reporters</title>
699      <para>
700         This section gives an overview of reporters in BASE.
701      </para>
702     
703      <sect3 id="data_api.reporters.uml">
704        <title>UML diagram</title>
705       
706        <figure id="data_api.figures.reporters">
707          <title>Reporters</title>
708          <screenshot>
709            <mediaobject>
710              <imageobject>
711                <imagedata 
712                  align="center"
713                  fileref="figures/uml/datalayer.reporters.png" format="PNG" />
714              </imageobject>
715            </mediaobject>
716          </screenshot>
717        </figure>
718      </sect3>
719     
720      <sect3 id="data_api.reporters.description">
721        <title>Reporters</title>
722        <para>
723          The <classname docapi="net.sf.basedb.core.data">ReporterData</classname> class holds information about reporters.
724          The <property>externalId</property> is a required property that must be unique
725          among all reporters. The external ID is the value BASE uses to match
726          reporters when importing data from files.
727        </para>
728       
729        <para>
730          The <classname>ReporterData</classname> is an <emphasis>extendable</emphasis>
731          class, which means that the server administrator can define additional
732          columns (=annotations) in the reporters table. These are accessed with
733          the <methodname>ReporterData.getExtended()</methodname> and
734          <methodname>ReporterData.setExtended()</methodname> methods.
735          See <xref linkend="appendix.extendedproperties" /> for more information about
736          this.
737        </para>
738       
739        <para>
740          The <classname>ReporterData</classname> is also a <emphasis>batchable</emphasis>
741          class, which means that there is no corresponding class in the core
742          layer. Client applications and plug-ins should work directly with
743          the <classname>ReporterData</classname> class. To help manage the reporters
744          there are the <classname docapi="net.sf.basedb.core">Reporter</classname> and <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>
745          classes. The main reason for this design
746          is to increase performance and lower memory usage by bypassing
747          internal caching in the core and Hibernate. Performance is also
748          increased by the batchers, which use more efficient SQL against the
749          database than Hibernate.
750        </para>
751       
752        <para>
753          The
754          <property>lastUpdate</property>
755          property holds the date and time the reporter information was last updated. The
756          value is managed automatically by the
757          <classname>ReporterBatcher</classname>
758          class. The same goes for the
759          <property>lastSource</property>
760          property, which holds information about where the last update came from. By
761          default this is set to the name of the logged in user, but it can be changed by
762          calling
763          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>
764          before the batcher commits the updates to the database. The source string
765          should have the format <synopsis>[ITEM_TYPE]:[ITEM_NAME]</synopsis> where, in
766          the file case, ITEM_TYPE is File and ITEM_NAME is the file's name.
767        </para>
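
        <para>
          As a minimal sketch of the source-string format described above, the
          following helper builds a value of the form
          <synopsis>[ITEM_TYPE]:[ITEM_NAME]</synopsis>. The
          <classname>UpdateSourceExample</classname> class and its
          <methodname>format()</methodname> method are hypothetical names used
          for illustration only; they are not part of the BASE API. The resulting
          string is the kind of value that would be passed to
          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>.
        </para>

        <programlisting language="java">
// Hypothetical helper, not part of BASE: builds an update-source string.
public class UpdateSourceExample
{
   /**
      Format a source string as [ITEM_TYPE]:[ITEM_NAME].
   */
   public static String format(String itemType, String itemName)
   {
      if (itemType == null || itemName == null)
      {
         throw new NullPointerException("itemType and itemName are required");
      }
      return itemType + ":" + itemName;
   }

   public static void main(String[] args)
   {
      // In the file case, ITEM_TYPE is "File" and ITEM_NAME is the file's name.
      // The file name used here is a made-up example.
      System.out.println(format("File", "reporters.txt"));
   }
}
</programlisting>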
768      </sect3>
769     
770      <sect3 id="data_api.reporters.lists">
771        <title>Reporter lists</title>
772       
773        <para>
774          Reporter lists can be used to group reporters that are somehow related
775          to each other. This could for example be a list of interesting reporters
776          found in the analysis of an experiment. Each reporter in the list may
777          optionally be assigned a score. The meaning of the score value is not
778          interpreted by BASE.
779        </para>
780       
781      </sect3>
782     
783     
784    </sect2>
785
786    <sect2 id="data_api.quota">
787      <title>Quota and disk usage</title>
788      <para>
789         This section gives an overview of the quota system in BASE
790         and how disk usage is tracked.
791      </para>
792     
793      <sect3 id="data_api.quota.uml">
794        <title>UML diagram</title>
795       
796        <figure id="data_api.figures.quota">
797          <title>Quota and disk usage</title>
798          <screenshot>
799            <mediaobject>
800              <imageobject>
801                <imagedata 
802                  align="center"
803                  fileref="figures/uml/datalayer.quota.png" format="PNG" />
804              </imageobject>
805            </mediaobject>
806          </screenshot>
807        </figure>
808      </sect3>
809     
810      <sect3 id="data_api.quota.description">
811        <title>Quota</title>
812       
813        <para>
814          The <classname docapi="net.sf.basedb.core.data">QuotaData</classname> holds information about a
815          single quota registration. The same quota may be used by many different users
816          and groups. This object encapsulates the allowed
817          quota values for different quota types and locations.
818          BASE defines several quota types (file, raw data and experiment)
819          and locations (primary, secondary and offline).
820        </para>
821       
822        <para>
823          The <property>quotaValues</property> property is a map from
824          <classname docapi="net.sf.basedb.core.data">QuotaIndex</classname> to maximum byte values.
825          This map must contain at least one entry for the total
826          quota at the primary location.
827        </para>
828       
829      </sect3>
830     
831      <sect3 id="data_api.quota.diskusage">
832        <title>Disk usage</title>
833       
834        <para>
835          A <interfacename docapi="net.sf.basedb.core.data">DiskConsumableData</interfacename> (for example a file)
836          item is automatically linked to a <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname>
837          item. This holds information about the number of bytes,
838          the location and quota type the item uses. It also holds information
839          about which user and, optionally, which group should be charged for the disk usage.
840          The user is always the owner of the item.
841        </para>
842
843      </sect3>
844     
845    </sect2>
846
847    <sect2 id="data_api.clients">
848      <title>Client, session and settings</title>
849      <para>
850         This section gives an overview of client applications, sessions and settings in BASE.
851      </para>
852     
853      <sect3 id="data_api.clients.uml">
854        <title>UML diagram</title>
855       
856        <figure id="data_api.figures.clients">
857          <title>Client, sessions and settings</title>
858          <screenshot>
859            <mediaobject>
860              <imageobject>
861                <imagedata 
862                  align="center"
863                  scalefit="1" width="100%"
864                  fileref="figures/uml/datalayer.clients.png" format="PNG" />
865              </imageobject>
866            </mediaobject>
867          </screenshot>
868        </figure>
869      </sect3>
870     
871      <sect3 id="data_api.clients.description">
872        <title>Clients</title>
873        <para>
874          The <classname docapi="net.sf.basedb.core.data">ClientData</classname> class holds information
875          about a client application. The <property>externalId</property>
876          property is a unique identifier for the application. To avoid ID clashes the ID
877          should be constructed in the same way as Java packages, for example
878          <constant>net.sf.basedb.clients.web</constant> is the ID for the
879          web client application.
880        </para>
881       
882        <para>
883          A client application doesn't have to be registered with BASE
884          before it can be used, but we recommend registering it since:
885        </para>
886       
887        <itemizedlist>
888        <listitem>
889          <para>
890            The permission system allows an admin to specify exactly
891            which users may use a specific application.
892          </para>
893        </listitem>
894       
895        <listitem>
896          <para>
897          The application can't store any context-sensitive or application-specific
898          settings unless it is registered.
899          </para>
900        </listitem>
901       
902        <listitem>
903          <para>
904          The application can store context-sensitive help in the BASE
905          database.
906          </para>
907        </listitem>
908        </itemizedlist>
909      </sect3>
910     
911      <sect3 id="data_api.clients.sessions">
912        <title>Sessions</title>
913       
914        <para>
915          A session represents the time between login and logout for a single
916          user. The <classname docapi="net.sf.basedb.core.data">SessionData</classname> object is entirely
917          managed by the BASE core, and should be considered read-only
918          for client applications.
919        </para>
920           
921      </sect3>
922     
923      <sect3 id="data_api.clients.settings">
924        <title>Settings</title>
925       
926        <para>
927          There are two types of settings: context-sensitive settings and regular
928          settings. The regular settings are simple key-value pairs of strings
929          and can be used for almost anything. There are four subtypes:
930        </para>
931       
932        <itemizedlist>
933        <listitem>
934          <para>
935          Global default settings: Settings that are used by all users
936          and client applications on the BASE server. These settings
937          are read-only except for administrators. BASE has not yet defined
938          any settings of this type.
939          </para>
940        </listitem>
941       
942        <listitem>
943          <para>
944          User default settings: Settings that are valid for a single user
945          for any client application. BASE has not yet defined
946          any settings of this type.
947          </para>
948        </listitem>
949       
950        <listitem>
951          <para>
952          Client default settings: Settings that are valid for all users using
953          a specific client application. Each client application is responsible
954          for defining its own settings. Settings are read-only except
955          for administrators.
956          </para>
957        </listitem>
958       
959        <listitem>
960          <para>
961          User client settings: Settings that are valid for a single user using
962          a specific client application. Each client application is responsible
963          for defining its own settings.
964          </para>
965        </listitem>
966       
967        </itemizedlist>
968       
969        <para>
970          The context-sensitive settings are designed to hold information
971          about the current status of options related to the listing of items
972          of a specific type. This includes:
973        </para>
974       
975        <itemizedlist>
976        <listitem>
977          <para>
978          Current filtering options (as one or more <classname docapi="net.sf.basedb.core.data">PropertyFilterData</classname>
979          objects).
980          </para>
981        </listitem>
982       
983        <listitem>
984          <para>
985          Which columns and direction to use for sorting.
986          </para>
987        </listitem>
988       
989        <listitem>
990          <para>
991          The number of items to display on each page, and which page
992          is the current page.
993          </para>
994        </listitem>
995       
996        <listitem>
997          <para>
998          Simple key-value settings related to a given context.
999          </para>
1000        </listitem>
1001        </itemizedlist>
1002       
1003        <para>
1004          Context-sensitive settings are only accessible if a client
1005          application has been registered. The settings may be
1006          named to make it possible to store several presets and to
1007          quickly switch between them. In any case, BASE maintains a
1008          current default setting with an empty name. An administrator
1009          may mark a named setting as public to allow other users to
1010          use it.
1011        </para>
1012       
1013      </sect3>
1014     
1015     
1016    </sect2>
1017
1018    <sect2 id="data_api.files">
1019      <title>Files and directories</title>
1020
1021      <para>
1022        This section covers the details of the BASE file
1023        system.
1024      </para>
1025
1026      <sect3 id="data_api.files.uml">
1027      <title>UML diagram</title>
1028     
1029        <figure id="data_api.figures.files">
1030          <title>Files and directories</title>
1031          <screenshot>
1032            <mediaobject>
1033              <imageobject>
1034                <imagedata 
1035                  align="center"
1036                  fileref="figures/uml/datalayer.files.png" format="PNG" />
1037              </imageobject>
1038            </mediaobject>
1039          </screenshot>
1040        </figure>
1041      </sect3>
1042     
1043      <sect3 id="data_api.files.description">
1044        <title>Description</title>
1045       
1046        <para>
1047          The <classname docapi="net.sf.basedb.core.data">DirectoryData</classname> class holds
1048          information about directories. Directories are organised in the
1049          usual way as a tree structure. All directories must have
1050          a parent directory, except the system-defined root directory.
1051        </para>
1052       
1053        <para>
1054          The <classname docapi="net.sf.basedb.core.data">FileData</classname> class holds information about
1055          a file. The actual file contents are stored on disk in the directory
1056          specified by the <varname>userfiles</varname> setting in
1057          <filename>base.config</filename>. The <varname>internalName</varname>
1058          property is the name of the file on disk, but this is never exposed to
1059          client applications. The filenames and directories
1060          on the disk don't correspond to the filenames and directories in
1061          BASE.
1062        </para>
1063       
1064        <para>
1065          The <varname>location</varname> property can take three values:
1066        </para>
1067       
1068        <itemizedlist>
1069        <listitem>
1070          <para>
1071          0 = The file is offline, i.e. there is no file on the disk.
1072          </para>
1073        </listitem>
1074        <listitem>
1075          <para>
1076          1 = The file is in primary storage, i.e. it is located on the disk
1077          and can be used by BASE.
1078          </para>
1079        </listitem>
1080        <listitem>
1081          <para>
1082          2 = The file is in secondary storage, i.e. it has been moved to some
1083          other place and can't be used by BASE immediately.
1084          </para>
1085        </listitem>
1086        </itemizedlist>
1087       
1088        <para>
1089          The <varname>action</varname> property controls how a file is
1090          moved between primary and secondary storage. It can have the following
1091          values:
1092        </para>
1093       
1094        <itemizedlist>
1095        <listitem>
1096          <para>
1097          0 = Do nothing
1098          </para>
1099        </listitem>
1100        <listitem>
1101          <para>
1102          1 = If the file is in secondary storage, move it back to the primary storage
1103          </para>
1104        </listitem>
1105        <listitem>
1106          <para>
1107          2 = If the file is in primary storage, move it to the secondary storage
1108          </para>
1109        </listitem>
1110        </itemizedlist>
1111       
1112        <para>
1113          The actual moving between primary and secondary storage is done by an
1114          external program. See
1115          <xref linkend="appendix.base.config.secondary" /> and
1116          <xref linkend="plugin_developer.other.secondary" /> for more information.
1117        </para>
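
        <para>
          To illustrate how the <varname>location</varname> and
          <varname>action</varname> codes listed above relate to each other,
          here is a minimal sketch. Only the numeric codes are taken from this
          section. The <classname>FileMoveExample</classname> class, its constant
          names and the <methodname>needsMove()</methodname> method are
          hypothetical and not part of BASE; an external program performing the
          move might use a similar check.
        </para>

        <programlisting language="java">
// Hypothetical sketch, not part of BASE. The numeric codes come from the
// lists in this section; all names are assumptions made for illustration.
public class FileMoveExample
{
   public static final int LOCATION_OFFLINE = 0;
   public static final int LOCATION_PRIMARY = 1;
   public static final int LOCATION_SECONDARY = 2;

   public static final int ACTION_NOTHING = 0;
   public static final int ACTION_MOVE_TO_PRIMARY = 1;
   public static final int ACTION_MOVE_TO_SECONDARY = 2;

   /**
      Check if a file at the given location has to be moved
      to fulfil the requested action.
   */
   public static boolean needsMove(int location, int action)
   {
      if (action == ACTION_MOVE_TO_PRIMARY)
      {
         // Only files in secondary storage can be moved back
         return location == LOCATION_SECONDARY;
      }
      if (action == ACTION_MOVE_TO_SECONDARY)
      {
         // Only files in primary storage can be moved away
         return location == LOCATION_PRIMARY;
      }
      // ACTION_NOTHING or an unknown code: leave the file where it is
      return false;
   }

   public static void main(String[] args)
   {
      System.out.println(needsMove(LOCATION_SECONDARY, ACTION_MOVE_TO_PRIMARY));
      System.out.println(needsMove(LOCATION_OFFLINE, ACTION_MOVE_TO_SECONDARY));
   }
}
</programlisting>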
1118     
1119        <para>
1120          The <varname>md5</varname> property can be used to check for file
1121          corruption when it is moved between primary and secondary storage or
1122          when a user re-uploads a file that has been offline.
1123        </para>
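
        <para>
          The kind of corruption check described above can be sketched with the
          standard <classname>java.security.MessageDigest</classname> class. This
          is not the BASE implementation. The <classname>Md5Example</classname>
          class is hypothetical, and the sketch assumes that the checksum is
          compared as a hexadecimal string, which this section does not specify.
        </para>

        <programlisting language="java">
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not part of BASE: verifies data against an MD5 checksum.
public class Md5Example
{
   /**
      Calculate the MD5 checksum of the data as a lower-case hex string.
   */
   public static String md5Hex(byte[] data)
   {
      MessageDigest md;
      try
      {
         md = MessageDigest.getInstance("MD5");
      }
      catch (NoSuchAlgorithmException ex)
      {
         // MD5 is one of the algorithms every Java platform must provide
         throw new IllegalStateException(ex);
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : md.digest(data))
      {
         hex.append(String.format("%02x", b));
      }
      return hex.toString();
   }

   /**
      Check if the data matches a previously stored checksum.
   */
   public static boolean matches(byte[] data, String expectedMd5)
   {
      return md5Hex(data).equalsIgnoreCase(expectedMd5);
   }

   public static void main(String[] args)
   {
      byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
      System.out.println(md5Hex(data));
   }
}
</programlisting>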
1124       
1125        <para>
1126          BASE can store files in a compressed format. This is handled internally
1127          and is not visible to client applications. The <varname>compressed</varname>
1128          and <varname>diskSize</varname> properties are used to store information
1129          about this. A file may always be compressed if the user says so, but
1130          BASE can also do this automatically if the file is uploaded
1131          to a directory with the <varname>autoCompress</varname> flag set
1132          or if the file has a MIME type with the <varname>autoCompress</varname>
1133          flag set.
1134        </para>
1135       
1136        <para>
1137          The <classname docapi="net.sf.basedb.core.data">FileTypeData</classname> class holds information about
1138          file types. It is used only to make it easier for users to organise
1139          their files.
1140        </para>
1141       
1142        <para>
1143          The <classname docapi="net.sf.basedb.core.data">MimeTypeData</classname> is used to register MIME types and
1144          map them to file extensions. The information is only used to look up values
1145          when needed. Given the filename we can set the <varname>File.mimeType</varname>
1146          and <varname>File.fileType</varname> properties. The MIME type is also
1147          used to decide if a file should be stored in a compressed format or not.
1148          The extension of a MIME type must be unique. Extensions should be registered
1149          without a dot, i.e. <emphasis>html</emphasis>, not <emphasis>.html</emphasis>.
1150        </para>
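
        <para>
          As a small illustration of the "no dot" rule for extensions, the sketch
          below extracts the extension from a filename and strips a leading dot
          from a user-supplied extension. The
          <classname>ExtensionExample</classname> class and its methods are
          hypothetical and not part of BASE; how BASE itself derives the
          extension is not covered in this section.
        </para>

        <programlisting language="java">
// Hypothetical sketch, not part of BASE: handles extensions without a dot.
public class ExtensionExample
{
   /**
      Get the extension of a filename without the dot,
      or null if the filename has no extension.
   */
   public static String getExtension(String filename)
   {
      int dot = filename.lastIndexOf('.');
      if (dot == -1 || dot == filename.length() - 1)
      {
         return null;
      }
      return filename.substring(dot + 1);
   }

   /**
      Remove a leading dot so that ".html" and "html"
      are both registered as "html".
   */
   public static String withoutDot(String extension)
   {
      return extension.startsWith(".") ? extension.substring(1) : extension;
   }

   public static void main(String[] args)
   {
      System.out.println(getExtension("index.html"));
      System.out.println(withoutDot(".html"));
   }
}
</programlisting>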
1151       
1152      </sect3>
1153     
1154     
1155    </sect2>
1156   
1157    <sect2 id="data_api.platforms">
1158      <title>Experimental platforms</title>
1159
1160      <para>
1161         This section gives an overview of experimental platforms
1162         and how they are used to enable data storage in files
1163         instead of in the database.
1164      </para>
1165     
1166      <itemizedlist>
1167        <title>See also</title>
1168        <listitem><xref linkend="core_api.data_in_files" /></listitem>
1169        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
1170        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
1171      </itemizedlist>
1172         
1173      <sect3 id="data_api.platforms.uml">
1174        <title>UML diagram</title>
1175       
1176        <figure id="data_api.figures.platforms">
1177          <title>Experimental platforms</title>
1178          <screenshot>
1179            <mediaobject>
1180              <imageobject>
1181                <imagedata 
1182                  align="center"
1183                  fileref="figures/uml/datalayer.platforms.png" format="PNG" />
1184              </imageobject>
1185            </mediaobject>
1186          </screenshot>
1187        </figure>
1188      </sect3>
1189     
1190      <sect3 id="data_api.platforms.platforms">
1191        <title>Platforms</title>
1192       
1193        <para>
1194          The <classname docapi="net.sf.basedb.core.data">PlatformData</classname> holds information about a
1195          platform. A platform can have one or more <classname docapi="net.sf.basedb.core.data">PlatformVariant</classname> items.
1196          Both the platform and variant are identified by an external ID that
1197          is fixed and can't be changed. <emphasis>Affymetrix</emphasis>
1198          is an example of a platform.
1199          If the <varname>fileOnly</varname> flag is set, data for the platform
1200          can only be stored in files and not imported into the database. If
1201          the flag is not set, data can be imported into the database.
1202          In the latter case, the <varname>rawDataType</varname> property
1203          can be used to lock the platform
1204          to a specific raw data type. If the value is <constant>null</constant>
1205          the platform can use any raw data type.
1206        </para>
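
        <para>
          The interplay between the <varname>fileOnly</varname> flag and the
          <varname>rawDataType</varname> property can be sketched as follows.
          The <classname>PlatformExample</classname> class and its
          <methodname>canImport()</methodname> method are hypothetical and not
          part of BASE. Representing a raw data type as a plain string, and the
          raw data type names used in <methodname>main()</methodname>, are
          assumptions made only to keep the example self-contained.
        </para>

        <programlisting language="java">
// Hypothetical sketch, not part of BASE. Only the meaning of the fileOnly
// flag and the rawDataType property is taken from the text above.
public class PlatformExample
{
   /**
      Check if data of the requested raw data type can be imported
      into the database for a platform.
      @param fileOnly The platform's fileOnly flag
      @param lockedRawDataType The platform's rawDataType property, or null
      @param requestedRawDataType The raw data type we would like to use
   */
   public static boolean canImport(boolean fileOnly, String lockedRawDataType,
      String requestedRawDataType)
   {
      // File-only platforms never import data into the database
      if (fileOnly) return false;
      // A null value means that any raw data type can be used
      if (lockedRawDataType == null) return true;
      // Otherwise the platform is locked to a single raw data type
      return lockedRawDataType.equals(requestedRawDataType);
   }

   public static void main(String[] args)
   {
      // "typeA" and "typeB" are placeholder raw data type names
      System.out.println(canImport(true, null, "typeA"));
      System.out.println(canImport(false, null, "typeA"));
      System.out.println(canImport(false, "typeA", "typeB"));
   }
}
</programlisting>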
1207       
1208        <para>
1209          Each platform and its variants can be connected to one or more
1210          <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname> items. This item
1211          describes the kind of files that are used to hold data for
1212          the platform and/or variant. The file types are re-usable between
1213          different platforms and variants. Note that a file type may be attached
1214          to either only a platform or to a platform with a variant. File
1215          types attached to platforms are inherited by the variants. The variants
1216          can only define additional file types, not remove or redefine file types
1217          that have been attached to the platform.
1218        </para>
1219        <para>
1220          The file type is also identified
1221          by a fixed, non-changeable external ID. The <varname>itemType</varname>
1222          property tells us what type of item the file holds data for (i.e.
1223          array design or raw bioassay). It also links to a <classname docapi="net.sf.basedb.core.data">FileType</classname>
1224          which is the generic type of data in the file. This allows us to query
1225          the database for, as an example, files with the generic type
1226          <constant>FileType.RAW_DATA</constant>. If we are in an Affymetrix
1227          experiment we will get the CEL file, for another platform we will
1228          get another file.
1229        </para>
1230        <para>
1231          The <varname>required</varname> flag in <classname docapi="net.sf.basedb.core.data">PlatformFileTypeData</classname>
1232          is used to signal that the file is a required file. This is not
1233          enforced by the core. It is intended to be used by client applications
1234          for creating a better GUI and for validation of an experiment.
1235        </para>
1236
1237      </sect3>
1238     
1239      <sect3 id="data_api.platforms.files">
1240        <title>FileStoreEnabled items and data files</title>
1241       
1242        <para>
1243          An item must implement the <interfacename docapi="net.sf.basedb.core">FileStoreEnabledData</interfacename>
1244          interface to be able to store data in files instead of in the database.
1245          The interface creates a link to a <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object,
1246          which can hold several <classname docapi="net.sf.basedb.core.data">FileSetMemberData</classname> items.
1247          Each member points to a specific <classname docapi="net.sf.basedb.core.data">FileData</classname> item.
1248          A file set can only store one file of each <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname>.
1249        </para>
1250       
1251      </sect3>
1252    </sect2>
1253
1254    <sect2 id="data_api.parameters">
1255      <title>Parameters</title>
1256     
1257      <para>
1258        This section gives an overview of the generic parameter
1259        system in BASE that is used to store annotation values,
1260        plugin configuration values, job parameter values, etc.
1261      </para>
1262     
1263      <sect3 id="data_api.parameters.uml">
1264        <title>UML diagram</title>
1265       
1266        <figure id="data_api.figures.parameters">
1267          <title>Parameters</title>
1268          <screenshot>
1269            <mediaobject>
1270              <imageobject>
1271                <imagedata 
1272                  align="center"
1273                  fileref="figures/uml/datalayer.parameters.png" format="PNG" />
1274              </imageobject>
1275            </mediaobject>
1276          </screenshot>
1277        </figure>
1278      </sect3>
1279     
1280      <sect3 id="data_api.parameters.description">
1281        <title>Parameters</title>
1282       
1283        <para>
1284          The parameter system is a generic system that can store almost
1285          any kind of simple values (string, numbers, dates, etc.) and
1286          also links to other items. The <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> 
1287          class is an abstract base class that can hold multiple values (all must be of the
1288          same type). Unless only a specific type of values should be stored, this is
1289          the class that should be used when creating references for storing parameter
1290          values. It makes it possible for a single relation to use any kind of
1291          values or for a collection reference to mix multiple types of values.
1292          A typical use case is a <classname>Map</classname> with the
1293          parameter name as the key:
1294        </para>
1295       
1296        <programlisting language="java">
private Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues;
/**
   Link parameter name with its values.
   @hibernate.map table="`PluginConfigurationValues`" lazy="true" cascade="all"
   @hibernate.collection-key column="`pluginconfiguration_id`"
   @hibernate.collection-index column="`name`" type="string" length="255"
   @hibernate.collection-many-to-many column="`value_id`"
      class="net.sf.basedb.core.data.ParameterValueData"
*/
public Map&lt;String, ParameterValueData&lt;?&gt;&gt; getConfigurationValues()
{
   return configurationValues;
}
void setConfigurationValues(Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues)
{
   this.configurationValues = configurationValues;
}
1314</programlisting>
1315       
1316      <para>
1317      Now it is possible for the collection to store all types of values:
1318      </para>
1319     
1320      <programlisting language="java">
Map&lt;String, ParameterValueData&lt;?&gt;&gt; config = ...
config.put("names", new StringParameterValueData("A", "B", "C"));
config.put("sizes", new IntegerParameterValueData(10, 20, 30));

// When you later load those values again you have to cast
// them to the correct class.
List&lt;String&gt; names = (List&lt;String&gt;)config.get("names").getValues();
List&lt;Integer&gt; sizes = (List&lt;Integer&gt;)config.get("sizes").getValues();
1329</programlisting>
1330
1331      </sect3>
1332     
1333    </sect2>
1334
1335    <sect2 id="data_api.annotations">
1336      <title>Annotations</title>
1337     
1338      <para>
1339        This section gives an overview of how the BASE annotation
1340        system works.
1341      </para>
1342     
1343      <sect3 id="data_api.annotations.uml">
1344        <title>UML diagram</title>
1345       
1346        <figure id="data_api.figures.annotations">
1347          <title>Annotations</title>
1348          <screenshot>
1349            <mediaobject>
1350              <imageobject>
1351                <imagedata 
1352                  align="center"
1353                  fileref="figures/uml/datalayer.annotations.png" format="PNG" />
1354              </imageobject>
1355            </mediaobject>
1356          </screenshot>
1357        </figure>
1358      </sect3>
1359     
1360      <sect3 id="data_api.annotations.description">
1361        <title>Annotations</title>
1362       
1363        <para>
1364        An item must implement the <interfacename docapi="net.sf.basedb.core.data">AnnotatableData</interfacename>
1365        interface to be able to use the annotation system. This interface gives
1366        a link to an <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> item. This class
1367        encapsulates all annotations for the item. There are two types of
1368        annotations:
1369        </para>
1370       
1371        <itemizedlist>
1372        <listitem>
1373          <para>
1374          <emphasis>Primary annotations</emphasis> are annotations that
1375          explicitly belong to the item. An annotation set can contain
1376          only one primary annotation of each annotation type. The primary
1377          annotations are linked via the <property>annotations</property>
1378          property. This property is a map with an
1379          <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname> as the key.
1380          </para>
1381        </listitem>
1382       
1383        <listitem>
1384          <para>
1385          <emphasis>Inherited annotations</emphasis> are annotations
1386          that belong to a parent item, but that we want to use on
1387          another item as well. Inherited annotations are saved as
1388          references to either a single annotation or to another
1389          annotation set. Thus, it is possible for an item to inherit
1390          multiple annotations of the same annotation type.
1391          </para>
1392        </listitem>
1393        </itemizedlist>
1394       
1395        <para>
1396          The <classname docapi="net.sf.basedb.core.data">AnnotationData</classname> class is also
1397          just a placeholder. It connects the annotation set and
1398          annotation type with a <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname>
1399          object. This is the object that holds the actual annotation
1400          values.
1401        </para>
1402       
1403      </sect3>
1404     
1405      <sect3 id="data_api.annotations.types">
1406        <title>Annotation types</title>
1407       
1408        <para>
1409        Instances of the <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname> class
1410        define the various annotations. Each annotation type must have a <property>valueType</property>
1411        property, which cannot be changed. The value of this property controls
1412        which <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> subclass is used to store
1413        the annotation values, i.e. <classname docapi="net.sf.basedb.core.data">IntegerParameterValueData</classname>,
1414        <classname docapi="net.sf.basedb.core.data">StringParameterValueData</classname>, etc.
1415        The <property>multiplicity</property> property holds the maximum allowed
1416        number of values for an annotation, or 0 if an unlimited number is
1417        allowed.
1418        </para>
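
        <para>
          The <property>multiplicity</property> rule can be expressed as a
          one-line check. This is a minimal sketch; the
          <classname>MultiplicityExample</classname> class and its
          <methodname>isAllowed()</methodname> method are hypothetical and not
          part of BASE.
        </para>

        <programlisting language="java">
// Hypothetical sketch, not part of BASE: checks a number of values
// against the multiplicity of an annotation type.
public class MultiplicityExample
{
   /**
      Check if an annotation may hold the given number of values.
      A multiplicity of 0 means that an unlimited number is allowed.
   */
   public static boolean isAllowed(int multiplicity, int numValues)
   {
      return multiplicity == 0 || multiplicity >= numValues;
   }

   public static void main(String[] args)
   {
      System.out.println(isAllowed(0, 100));
      System.out.println(isAllowed(1, 2));
   }
}
</programlisting>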
1419       
1420        <para>
1421        The <property>itemTypes</property> collection holds the codes for
1422        the types of items the annotation type can be used on. This is
1423        checked when new annotations are created but already existing
1424        annotations are not affected if the collection is modified.
1425        </para>
1426       
1427        <para>
1428        Annotation types with the <property>protocolParameter</property> flag set
1429        are treated a bit differently. They will not show up as annotations
1430        to items with a type found in the <property>itemTypes</property> collection.
1431        A protocol parameter should be attached to a protocol. Then, when an item
1432        is using that protocol it becomes possible to add annotation values for
1433        the annotation types specified as protocol parameters. It doesn't matter
1434        if the item's type is found in the <property>itemTypes</property> 
1435        collection or not.
1436        </para>
1437       
1438        <para>
1439        The <property>options</property> collection is used to store additional
1440        options required by some of the value types, for example a max string
1441        length for string annotations or the max and min allowed value for
1442        integer annotations.
1443        </para>
1444       
1445        <para>
        The <property>enumeration</property> property is a boolean flag
        indicating if the allowed values are predefined as an enumeration.
        In that case those values are found in the <property>enumerationValues</property>
        property. The actual <classname>ParameterValueData</classname> subclass used to
        hold them is determined by the <property>valueType</property> property.
1451        </para>
1452       
1453        <para>
        Most of the other properties are hints to client applications about how
        to render the input field for the annotation.
1456        </para>
1457       
1458      </sect3>
1459     
1460      <sect3 id="data_api.annotations.units">
1461        <title>Units</title>
1462        <para>
        Numerical annotation values can have units. A unit is described by
        a <classname docapi="net.sf.basedb.core.data">UnitData</classname> object.
        Each unit belongs to a <classname docapi="net.sf.basedb.core.data">QuantityData</classname>
        object which defines the class of units. For example, if the quantity is
        <emphasis>weight</emphasis>, we can have the units <emphasis>kg</emphasis>,
        <emphasis>mg</emphasis>, <emphasis>µg</emphasis>, etc. The <classname>UnitData</classname>
        contains a factor and an offset that relate the unit to a common reference
        unit defined by the <classname>QuantityData</classname> class. For example,
        <emphasis>1 meter</emphasis> is the reference unit for distance, and we
        have <code>1 millimeter = 0.001 * 1 meter</code>. In this case, the factor is
        <emphasis>0.001</emphasis> and the offset 0. Another example is the relationship between
        kelvin and Celsius, where a temperature in °Celsius is converted to kelvin
        by adding 273.15, for example <code>0 °Celsius = 273.15 kelvin</code>.
        Here, the factor is 1 and the offset is <emphasis>+273.15</emphasis>.
        The <classname docapi="net.sf.basedb.core.data">UnitSymbolData</classname>
        is used to make it possible to assign alternative symbols to a single unit.
        This is needed to simplify input where it may be hard to know what to
        type to get <emphasis>m²</emphasis> or <emphasis>°C</emphasis>. Instead,
        <emphasis>m2</emphasis> and <emphasis>C</emphasis> can be used as
        alternative symbols.
1483        </para>
1484       
1485        <para>
1486        The creator of an annotation type may select a
1487        <classname>QuantityData</classname>, which can't be changed later, and
1488        a default <classname>UnitData</classname>. When entering annotation values
        a user may select any unit for the selected quantity (unless the owner of the
        annotation type has limited this by selecting <varname>usableUnits</varname>). Before
1491        the values are stored in the database, they are converted to the default
1492        unit. This makes it possible to compare and filter on annotation values
1493        using different units. For example, filtering with <emphasis>&gt;5mg</emphasis> 
1494        also finds items that are annotated with <emphasis>2g</emphasis>.
1495        </para>
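        <para>
        The conversion can be sketched as follows, assuming that a value is
        converted to the reference unit as <code>value * factor + offset</code>.
        The <classname>UnitSketch</classname> class, its methods and the choice
        of gram as the reference unit for weight are assumptions made for this
        example and do not describe the actual BASE implementation.
        </para>
        <programlisting language="java"><![CDATA[
```java
// Hypothetical, simplified unit; the field names mirror the text, not the BASE API.
class UnitSketch
{
    final String symbol;
    final double factor;
    final double offset;

    UnitSketch(String symbol, double factor, double offset)
    {
        this.symbol = symbol;
        this.factor = factor;
        this.offset = offset;
    }

    /** Convert a value in this unit to the reference unit (assumed formula). */
    double toReference(double value)
    {
        return value * factor + offset;
    }

    /** Convert a value in the reference unit to this unit. */
    double fromReference(double value)
    {
        return (value - offset) / factor;
    }

    /** Convert a value in this unit to another unit of the same quantity. */
    double convertTo(UnitSketch target, double value)
    {
        return target.fromReference(toReference(value));
    }

    public static void main(String[] args)
    {
        // Weight, with gram as the assumed reference unit
        UnitSketch g = new UnitSketch("g", 1, 0);
        UnitSketch mg = new UnitSketch("mg", 0.001, 0);
        // Temperature, with kelvin as the reference unit
        UnitSketch kelvin = new UnitSketch("K", 1, 0);
        UnitSketch celsius = new UnitSketch("C", 1, 273.15);

        // A value entered as 2 g is converted to the default unit (mg here)
        // before it is stored, so it can be compared with a '>5mg' filter
        double stored = g.convertTo(mg, 2);
        System.out.println(stored + " mg, matches '>5mg': " + (stored > 5));
        System.out.println(celsius.convertTo(kelvin, 25) + " K");
    }
}
```
]]></programlisting>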
1496       
1497        <para>
1498        The core should automatically update the stored annotation values if
1499        the default unit is changed for an annotation type, or if the reference
1500        factor for a unit is changed.
1501        </para>
1502      </sect3>
1503     
1504      <sect3 id="data_api.annotations.categories">
1505        <title>Categories</title>
1506       
1507        <para>
        The <classname docapi="net.sf.basedb.core.data">AnnotationTypeCategoryData</classname> class defines
        categories that are used to group annotation types that are related to
        each other. This information is mainly useful for client applications
        that display forms for annotating items and wish to provide a
        clearer interface when there are many (say 50+) annotation types for
        an item. An annotation type can belong to more than one category.
1514        </para>
1515       
1516      </sect3>
1517     
1518    </sect2>
1519
1520    <sect2 id="data_api.protocols">
1521      <title>Protocols</title>
1522
1523      <para>
1524        This section gives an overview of how protocols that describe various
1525        processes, such as sampling, extraction and scanning, are used in BASE.
1526      </para>
1527     
1528      <sect3 id="data_api.protocols.uml">
1529        <title>UML diagram</title>
1530       
1531        <figure id="data_api.figures.protocols">
1532          <title>Protocols</title>
1533          <screenshot>
1534            <mediaobject>
1535              <imageobject>
1536                <imagedata 
1537                  align="center"
1538                  fileref="figures/uml/datalayer.protocols.png" format="PNG" />
1539              </imageobject>
1540            </mediaobject>
1541          </screenshot>
1542        </figure>
1543      </sect3>
1544     
1545      <sect3 id="data_api.protocols.description">
1546        <title>Protocols</title>
1547       
1548        <para>
1549        A protocol is something that defines a procedure or recipe for some
1550        kind of action, such as sampling, extraction and scanning. In BASE we only
1551        store a short name and description. It is possible to attach a file
1552        that provides a longer description of the procedure.
1553        </para>
1554     
1555      </sect3>
1556     
1557      <sect3 id="data_api.protocols.parameters">
1558        <title>Parameters</title>
1559       
1560        <para>
        The procedure described by the protocol may have parameters
        that are set independently each time the protocol is used. It
        could for example be a temperature, a time or something else.
        The definition of parameters is done by creating annotation
        types and attaching them to the protocol. It is only possible
        to attach annotation types that have the <property>protocolParameter</property>
        property set to <constant>true</constant>. The same annotation type
        can be used for more than one protocol, but only do this if the
        parameters actually have the same meaning.
1570        </para>
1571     
1572      </sect3>
1573     
1574    </sect2>
1575
1576    <sect2 id="data_api.plugins">
1577      <title>Plug-ins, jobs and job agents</title>
1578     
1579      <para>
1580         This section gives an overview of plug-ins, jobs and job agents.
1581      </para>
1582     
1583      <itemizedlist>
1584        <title>See also</title>
1585        <listitem><xref linkend="plugins.installation" /></listitem>
1586        <listitem><xref linkend="installation_upgrade.jobagents" /></listitem>
1587      </itemizedlist>
1588     
1589      <sect3 id="data_api.plugins.uml">
1590        <title>UML diagram</title>
1591       
1592        <figure id="data_api.figures.plugins">
1593          <title>Plug-ins, jobs and job agents</title>
1594          <screenshot>
1595            <mediaobject>
1596              <imageobject>
1597                <imagedata 
1598                  align="center"
1599                  scalefit="1" width="100%"
1600                  fileref="figures/uml/datalayer.plugins.png" format="PNG" />
1601              </imageobject>
1602            </mediaobject>
1603          </screenshot>
1604        </figure>
1605      </sect3>
1606
1607      <sect3 id="data_api.plugins.plugins">
1608        <title>Plug-ins</title>
1609       
1610        <para>
          The <classname docapi="net.sf.basedb.core.data">PluginDefinitionData</classname> holds information about the
          installed plug-in classes. Much of the information is copied from the
          plug-in itself, from its <classname docapi="net.sf.basedb.core.plugin">About</classname> object and by checking
          which interfaces it implements.
1615        </para>
1616       
1617        <para>
1618          There are five main types of plug-ins:
1619        </para>
1620       
1621        <itemizedlist>
1622        <listitem>
1623          <para>
1624          IMPORT (mainType = 1): A plug-in that imports data to BASE.
1625          </para>
1626        </listitem>
1627        <listitem>
1628          <para>
1629          EXPORT (mainType = 2): A plug-in that exports data from BASE.
1630          </para>
1631        </listitem>
1632        <listitem>
1633          <para>
1634          INTENSITY (mainType = 3): A plug-in that calculates intensity values
1635          from raw data.
1636          </para>
1637        </listitem>
1638        <listitem>
1639          <para>
1640          ANALYZE (mainType = 4): A plug-in that analyses data.
1641          </para>
1642        </listitem>
1643        <listitem>
1644          <para>
1645          OTHER (mainType = 5): Any other plug-in.
1646          </para>
1647        </listitem>
1648        </itemizedlist>
1649       
1650        <para>
          A plug-in may have different configurations. The flags <property>supportsConfigurations</property>
          and <property>requiresConfiguration</property> are used to specify if a plug-in
          can't have any configurations or must have one. Configuration parameter values are
          versioned. Each time anyone updates a configuration, the version number
          is increased and the parameter values are stored as a new entity.
          This is required because we want to be able to know exactly which
          parameters a job was using when it was executed. When a job is
          created we also store the parameter version number
          (<property>JobData.parameterVersion</property>). This means that even if
          someone changes the configuration later we will always know which
          parameters the job used.
1662        </para>
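        <para>
          The versioning of configuration parameters can be sketched as below.
          The <classname>VersionedConfiguration</classname> class, its methods and
          the <code>separator</code> parameter are hypothetical and only illustrate
          the idea. The actual storage in BASE is handled by the core and the database.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of versioned parameter values; not the BASE implementation.
class VersionedConfiguration
{
    private int parameterVersion = 0;

    // Version number -> a separate copy of the parameter values for that version
    private final Map<Integer, Map<String, String>> versions =
        new HashMap<Integer, Map<String, String>>();

    /** Store the values as a new version and return the new version number. */
    int update(Map<String, String> values)
    {
        parameterVersion++;
        versions.put(parameterVersion, new HashMap<String, String>(values));
        return parameterVersion;
    }

    int getParameterVersion()
    {
        return parameterVersion;
    }

    /** Get the parameter values as they were in the given version. */
    Map<String, String> getValues(int version)
    {
        return versions.get(version);
    }

    public static void main(String[] args)
    {
        VersionedConfiguration config = new VersionedConfiguration();
        Map<String, String> values = new HashMap<String, String>();
        values.put("separator", "tab");
        config.update(values);

        // A job remembers the version that was current when it was created
        int jobParameterVersion = config.getParameterVersion();

        // Someone changes the configuration later
        values.put("separator", "comma");
        config.update(values);

        // The job still finds the parameters it was created with
        System.out.println(config.getValues(jobParameterVersion).get("separator"));
    }
}
```
]]></programlisting>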
1663       
1664        <para>
          The <classname docapi="net.sf.basedb.core.data">PluginTypeData</classname> class is used to group
          plug-ins that share some common functionality, by implementing
          additional (optional) interfaces. For example, the
          <interfacename docapi="net.sf.basedb.core.plugin">AutoDetectingImporter</interfacename> should be implemented
          by import plug-ins that support automatic detection of file formats.
          Another example is the <interfacename docapi="net.sf.basedb.core.plugin">AnalysisFilterPlugin</interfacename>
          interface which should be implemented by all analysis plug-ins that
          only filter data.
1673        </para>
1674
1675      </sect3>
1676     
1677      <sect3 id="data_api.plugins.jobs">
1678        <title>Jobs</title>
1679       
1680        <para>
          A job represents a single invocation of a plug-in to do some work.
          The <classname docapi="net.sf.basedb.core.data">JobData</classname> class holds information about this.
          A job is usually executed by a plug-in, but doesn't have to be. The
          <property>status</property> property holds the current state of a job.
1685        </para>
1686       
1687        <itemizedlist>
1688        <listitem>
1689          <para>
1690            UNCONFIGURED (status = 0): The job is not yet ready to be executed.
1691          </para>
1692        </listitem>
1693        <listitem>
1694          <para>
1695            WAITING (status = 1): The job is waiting to be executed.
1696          </para>
1697        </listitem>
1698        <listitem>
1699          <para>
1700            PREPARING (status = 5): The job is about to be executed but hasn't started yet.
1701          </para>
1702        </listitem>
1703        <listitem>
1704          <para>
1705            EXECUTING (status = 2): The job is currently executing.
1706          </para>
1707        </listitem>
1708        <listitem>
1709          <para>
1710            DONE (status = 3): The job finished successfully.
1711          </para>
1712        </listitem>
1713        <listitem>
1714          <para>
1715            ERROR (status = 4): The job finished with an error.
1716          </para>
1717        </listitem>
1718        </itemizedlist>
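        <para>
          Note that the numeric codes do not follow the order in which a job
          passes through the states, since PREPARING has the value 5. Code that
          reads a stored status should therefore look it up by value. A minimal
          sketch, where the <classname>JobStatusSketch</classname> enumeration and
          its methods are hypothetical and only the names and codes are taken
          from the list above:
        </para>
        <programlisting language="java"><![CDATA[
```java
// Hypothetical enumeration; only the names and numeric codes come from the list above.
enum JobStatusSketch
{
    UNCONFIGURED(0),
    WAITING(1),
    PREPARING(5),
    EXECUTING(2),
    DONE(3),
    ERROR(4);

    private final int value;

    JobStatusSketch(int value)
    {
        this.value = value;
    }

    int getValue()
    {
        return value;
    }

    /** Find the status with the given numeric code. */
    static JobStatusSketch fromValue(int value)
    {
        for (JobStatusSketch status : values())
        {
            if (status.value == value) return status;
        }
        throw new IllegalArgumentException("Unknown status: " + value);
    }

    /** A job has finished when it is DONE or has ended with an ERROR. */
    boolean isFinished()
    {
        return this == DONE || this == ERROR;
    }

    public static void main(String[] args)
    {
        System.out.println(fromValue(5));
        System.out.println(fromValue(3).isFinished());
    }
}
```
]]></programlisting>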
1719      </sect3>
1720
1721      <sect3 id="data_api.plugins.agents">
1722        <title>Job agents</title>
1723       
1724        <para>
          A job agent is a program running on the same or a different server that
          regularly checks for jobs that are waiting to be executed. The
          <classname docapi="net.sf.basedb.core.data">JobAgentData</classname> holds information about a job agent
          and the <classname docapi="net.sf.basedb.core.data">JobAgentSettingsData</classname> links the agent
          with the plug-ins the agent is able to execute. The job agent will only
          execute jobs that are owned by users or projects that the job agent has
          been shared to with at least use permission. The <property>priorityBoost</property>
          property can be used to give specific plug-ins higher priority.
          Thus, for a job agent it is possible to:
1734        </para>
1735       
1736        <itemizedlist>
1737        <listitem>
1738          <para>
1739          Specify exactly which plug-ins it will execute. For example, it is possible
1740          to dedicate one agent to only run one plug-in.
1741          </para>
1742        </listitem>
1743        <listitem>
1744          <para>
          Give some plug-ins higher priority. For example, a job agent that is mainly
          used for importing data should give higher priority to all import plug-ins.
          Other types of jobs will have to wait until there is no more data to be
          imported.
1749          </para>
1750        </listitem>
1751        <listitem>
1752          <para>
          Specify exactly which users/groups/projects may use the agent. For
1754          example, it is possible to dedicate one agent to only run jobs for a certain
1755          project.
1756          </para>
1757        </listitem>
1758        </itemizedlist>
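        <para>
          The first two points can be sketched as follows. The example assumes
          that a lower priority number means that a job is executed earlier and
          that the priority boost is subtracted from the priority of the job.
          This is an assumption made for the example, as are the
          <classname>JobAgentSketch</classname> class and the plug-in names.
          The permission check from the third point is left out.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of how an agent could select jobs; not the BASE implementation.
class JobAgentSketch
{
    static class Job
    {
        final String name;
        final String plugin;
        final int priority;

        Job(String name, String plugin, int priority)
        {
            this.name = name;
            this.plugin = plugin;
            this.priority = priority;
        }
    }

    // Plug-in name -> priority boost. Only plug-ins in this map are executed by the agent.
    private final Map<String, Integer> settings = new HashMap<String, Integer>();

    void addPlugin(String plugin, int priorityBoost)
    {
        settings.put(plugin, priorityBoost);
    }

    /** Assumption: lower numbers are executed first and the boost is subtracted. */
    int effectivePriority(Job job)
    {
        return job.priority - settings.get(job.plugin);
    }

    /** Keep the jobs this agent can execute and sort them by effective priority. */
    List<Job> executionOrder(List<Job> waiting)
    {
        List<Job> accepted = new ArrayList<Job>();
        for (Job job : waiting)
        {
            if (settings.containsKey(job.plugin)) accepted.add(job);
        }
        Collections.sort(accepted, new Comparator<Job>()
        {
            public int compare(Job a, Job b)
            {
                return Integer.compare(effectivePriority(a), effectivePriority(b));
            }
        });
        return accepted;
    }

    public static void main(String[] args)
    {
        JobAgentSketch agent = new JobAgentSketch();
        agent.addPlugin("importer", 3);
        agent.addPlugin("analyzer", 0);

        List<Job> waiting = new ArrayList<Job>();
        waiting.add(new Job("analysis", "analyzer", 4));
        waiting.add(new Job("import", "importer", 5));
        waiting.add(new Job("export", "exporter", 1)); // Not handled by this agent

        for (Job job : agent.executionOrder(waiting))
        {
            System.out.println(job.name);
        }
    }
}
```
]]></programlisting>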
1759       
1760      </sect3>
1761
1762
1763    </sect2>
1764   
1765    <sect2 id="data_api.biomaterials">
1766      <title>Biomaterials</title>
1767     
1768      <sect3 id="data_api.biomaterials.uml">
1769        <title>UML diagram</title>
1770       
1771        <figure id="data_api.figures.biomaterials">
1772          <title>Biomaterials</title>
1773          <screenshot>
1774            <mediaobject>
1775              <imageobject>
1776                <imagedata 
1777                  align="center"
1778                  fileref="figures/uml/datalayer.biomaterials.png" format="PNG" />
1779              </imageobject>
1780            </mediaobject>
1781          </screenshot>
1782        </figure>
1783      </sect3>
1784     
1785      <sect3 id="data_api.biomaterials.description">
1786        <title>Biomaterials</title>
1787       
1788        <para>
1789          There are four types of biomaterials: <classname docapi="net.sf.basedb.core.data">BioSourceData</classname>,
1790          <classname docapi="net.sf.basedb.core.data">SampleData</classname>, <classname docapi="net.sf.basedb.core.data">ExtractData</classname> and
1791          <classname docapi="net.sf.basedb.core.data">LabeledExtractData</classname>.
          All four types are derived from the base class <classname docapi="net.sf.basedb.core.data">BioMaterialData</classname>.
1793          The reason for this is that they all share common functionality such as pooling
1794          and events. By using a common base class we do not have to create duplicate
1795          classes for keeping track of events and parents.
1796        </para>
1797       
1798        <para>
1799          The <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> is the simplest of the biomaterials.
1800          It cannot have parents and can't participate in events. It's only used as a
1801          (non-required) parent for samples.
1802        </para>
1803       
1804        <para>
1805          The <classname docapi="net.sf.basedb.core.data">MeasuredBioMaterialData</classname> class is used as a base
1806          class for the other three biomaterial types. It introduces quantity
1807          measurements and can store original and remaining quantities. They are
1808          both optional. If an original quantity has been specified the core
1809          automatically calculates the remaining quantity based on the events a
1810          biomaterial participates in.
1811        </para>
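        <para>
          A minimal sketch of the quantity calculation, assuming that the remaining
          quantity is the original quantity minus the quantities used by all events.
          The <classname>MeasuredBioMaterialSketch</classname> class and its methods
          are hypothetical and not part of the BASE API.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the remaining quantity calculation; not the BASE implementation.
class MeasuredBioMaterialSketch
{
    private final Double originalQuantity; // null = not specified
    private final List<Double> usedQuantities = new ArrayList<Double>();

    MeasuredBioMaterialSketch(Double originalQuantity)
    {
        this.originalQuantity = originalQuantity;
    }

    /** Register that an event used the given quantity of this biomaterial. */
    void addEvent(double usedQuantity)
    {
        usedQuantities.add(usedQuantity);
    }

    /** Get the remaining quantity, or null if no original quantity has been specified. */
    Double getRemainingQuantity()
    {
        if (originalQuantity == null) return null;
        double remaining = originalQuantity;
        for (double used : usedQuantities)
        {
            remaining -= used;
        }
        return remaining;
    }

    public static void main(String[] args)
    {
        MeasuredBioMaterialSketch sample = new MeasuredBioMaterialSketch(10.0);
        sample.addEvent(2.5); // For example, used when creating an extract
        sample.addEvent(1.5); // For example, used by some other event
        System.out.println(sample.getRemainingQuantity());

        // Both quantities are optional
        System.out.println(new MeasuredBioMaterialSketch(null).getRemainingQuantity());
    }
}
```
]]></programlisting>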
1812       
1813        <para>
          All measured biomaterials have at least one event associated with them,
1815          the creation event, which holds information about the creation of the
1816          biomaterial. A measured biomaterial can be created in three ways:
1817        </para>
1818       
1819        <itemizedlist>
1820        <listitem>
1821          <para>
1822          From a single item of the parent type. Biosource is the parent type of
1823          samples, sample is the parent type of extracts, and extract is the
1824          parent type of labeled extracts. In this case the
1825          <property>pooled</property> property is <constant>false</constant>
1826          and the parent is specified in the <property>parent</property> property.
1827          If the parent is not a <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> this information
1828          is duplicated, with the addition of an optional used quantity value, in the
1829          <property>sources</property> collection of the <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1830          object representing the creation event. It is the responsibility of the
1831          core to make sure that everything is properly synchronized and that
1832          remaining quantities are calculated.
1833          </para>
1834        </listitem>
1835       
1836        <listitem>
1837          <para>
          From one or more items of the same type, i.e. pooling.
          In this case the <property>pooled</property> property is <constant>true</constant>
          and the <property>parent</property> property is null. All source
          biomaterials are contained in the <property>sources</property> collection.
          The core is still responsible for keeping everything synchronized and for
          updating remaining quantities.
1844          </para>
1845        </listitem>
1846       
1847        <listitem>
1848          <para>
1849          As a standalone biomaterial without parents.
1850          </para>
1851        </listitem>
1852        </itemizedlist>
1853
1854      </sect3>
1855     
1856      <sect3 id="data_api.biomaterials.events">
1857        <title>Biomaterial events</title>
1858       
1859        <para>
1860          An event represents something that happened to one or more biomaterials, for example
1861          the creation of another biomaterial. The <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1862          holds information about entry and event dates, protocols used, the user who is
1863          responsible, etc. There are three types of events represented by the <property>eventType</property>
1864          property.
1865        </para>
1866       
1867        <orderedlist>
1868        <listitem>
1869          <para>
1870          <emphasis>Creation event</emphasis>: This event represents the creation of a (measured)
1871          biomaterial. The <property>sources</property> collection contains
1872          information about the biomaterials that were used to create the new
1873          biomaterial. If the biomaterial is a pooled biomaterial all sources must
1874          be of the same type. Otherwise there can only be one source of the parent
1875          type. These rules are maintained by the core.
1876          </para>
1877        </listitem>
1878       
1879        <listitem>
1880          <para>
1881          <emphasis>Hybridization event</emphasis>: This event represents the creation
1882          of a hybridization. This event type is needed because we want to keep track
1883          of quantities for labeled extracts. This event has a hybridization as a
1884          product instead of a biomaterial. The sources collection can only contain
1885          labeled extracts.
1886          </para>
1887        </listitem>
1888
1889        <listitem>
1890          <para>
1891          <emphasis>Other event</emphasis>: This event represents some other important
1892          information about a single biomaterial that affected the remaining quantity.
1893          This event type doesn't have any sources.
1894          </para>
1895        </listitem>
1896        </orderedlist>
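        <para>
          The rules for the sources of a creation event can be sketched as below.
          The <classname>CreationEventRules</classname> class and its
          <classname>Type</classname> enumeration are hypothetical. The sketch
          assumes that the sources of a pooled biomaterial must be of the same type
          as the new biomaterial and that a biosource is never listed as a source,
          since it can't participate in events.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the creation event rules; not the BASE implementation.
class CreationEventRules
{
    enum Type
    {
        BIOSOURCE(null), SAMPLE(BIOSOURCE), EXTRACT(SAMPLE), LABELED_EXTRACT(EXTRACT);

        final Type parentType;

        Type(Type parentType)
        {
            this.parentType = parentType;
        }
    }

    /** Check the sources of the creation event for a new biomaterial of the given type. */
    static boolean isValidSources(Type product, boolean pooled, List<Type> sources)
    {
        // Biosources can't participate in events
        if (sources.contains(Type.BIOSOURCE)) return false;
        if (pooled)
        {
            // Pooling: one or more sources, all of the same type as the product
            if (sources.isEmpty()) return false;
            for (Type source : sources)
            {
                if (source != product) return false;
            }
            return true;
        }
        // Not pooled: no sources at all or a single source of the parent type
        if (sources.isEmpty()) return true;
        return sources.size() == 1 && sources.get(0) == product.parentType;
    }

    public static void main(String[] args)
    {
        List<Type> none = Collections.<Type>emptyList();
        System.out.println(isValidSources(Type.EXTRACT, false, Arrays.asList(Type.SAMPLE)));
        System.out.println(isValidSources(Type.EXTRACT, true, Arrays.asList(Type.EXTRACT, Type.EXTRACT)));
        System.out.println(isValidSources(Type.EXTRACT, true, Arrays.asList(Type.EXTRACT, Type.SAMPLE)));
        System.out.println(isValidSources(Type.SAMPLE, false, none));
    }
}
```
]]></programlisting>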
1897      </sect3>
1898 
1899    </sect2>
1900
1901    <sect2 id="data_api.plates">
1902      <title>Array LIMS - plates</title>
1903
1904      <sect3 id="data_api.plates.uml">
1905        <title>UML diagram</title>
1906       
1907        <figure id="data_api.figures.plates">
1908          <title>Array LIMS - plates</title>
1909          <screenshot>
1910            <mediaobject>
1911              <imageobject>
1912                <imagedata 
1913                  align="center"
1914                  scalefit="1" width="100%"
1915                  fileref="figures/uml/datalayer.plates.png" format="PNG" />
1916              </imageobject>
1917            </mediaobject>
1918          </screenshot>
1919        </figure>
1920      </sect3>
1921
1922      <sect3 id="data_api.plates.description">
1923        <title>Plates</title>
1924       
1925        <para>
1926          The <classname docapi="net.sf.basedb.core.data">PlateData</classname> is the main class holding information
1927          about a single plate. The associated <classname docapi="net.sf.basedb.core.data">PlateGeometryData</classname>
1928          defines how many rows and columns there are on a plate. Since this
          information is used to create wells and for various other checks, it is
1930          not possible to change the number of rows or columns once a geometry has
1931          been created.
1932        </para>
1933         
1934        <para>
1935          All plates must have a <classname docapi="net.sf.basedb.core.data">PlateTypeData</classname> which defines
1936          the geometry and a set of event types (see below).
1937        </para>
1938       
1939        <para>
          If the destroyed flag of a plate is set, the plate cannot be used
          for a plate mapping or to create array designs. However, it
          is possible to change the flag back to not destroyed.
1943        </para>
1944
1945        <para>
          The barcode is intended to be used as an external identifier of the plate.
          However, the core doesn't care about the value or whether it is unique.
1948        </para>
1949      </sect3>
1950     
1951      <sect3 id="data_api.plates.events">
1952        <title>Plate events</title>
1953
1954        <para>
          The plate type defines a set of <classname docapi="net.sf.basedb.core.data">PlateEventTypeData</classname>
          objects, each one representing a particular event a plate of this type
          usually goes through. For a plate of a certain type, it is possible to
          attach exactly one event of each event type. The event type defines an
          optional protocol type, which can be used by client applications to
          filter a list of protocols for the event. The core doesn't check that
          the selected protocol for an event is of the same protocol type as
          defined by the event type.
1963        </para>
1964
1965        <para>
          The ordinal value can be used as a hint to client applications about
          the order in which the events are actually performed in the lab. The core doesn't
          care about this value or if several event types have the same value.
1969        </para>
1970      </sect3>
1971
1972      <sect3 id="data_api.plates.mappings">
1973        <title>Plate mappings</title>
1974       
1975        <para>
          A plate can be created either from scratch or, with the help of the information
          in a <classname docapi="net.sf.basedb.core.data">PlateMappingData</classname>, from a set of parent plates.
          In the first case it is possible to specify a reporter for each well on the
          plate. In the second case the mapping code creates all the wells and links
          them to the parent wells on the parent plates. Once the plate has been saved
          to the database, the wells cannot be modified (because they are used
          downstream for various validation, etc.).
1983        </para>
1984       
1985        <para>
1986          The details in a plate mapping are simply coordinates that for each
1987          destination plate, row and column define a source plate, row and column.
1988          It is possible for a single source well to be mapped to multiple destination
1989          wells, but for each destination well only a single source well can be
1990          used.
1991        </para>
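        <para>
          One way to picture the details of a plate mapping is as a map that is
          keyed by the destination well. The <classname>PlateMappingSketch</classname>
          and <classname>Well</classname> classes below are hypothetical and only
          illustrate the data structure. They are not the BASE classes.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the details in a plate mapping; not the BASE classes.
class PlateMappingSketch
{
    /** A (plate, row, column) coordinate that can be used as a map key. */
    static class Well
    {
        final int plate;
        final int row;
        final int column;

        Well(int plate, int row, int column)
        {
            this.plate = plate;
            this.row = row;
            this.column = column;
        }

        @Override
        public boolean equals(Object o)
        {
            if (!(o instanceof Well)) return false;
            Well other = (Well)o;
            return plate == other.plate && row == other.row && column == other.column;
        }

        @Override
        public int hashCode()
        {
            return 31 * (31 * plate + row) + column;
        }

        @Override
        public String toString()
        {
            return "[" + plate + "," + row + "," + column + "]";
        }
    }

    // Destination well -> source well. Since a key can only occur once, each
    // destination well has a single source, but many destinations can share a source.
    private final Map<Well, Well> details = new HashMap<Well, Well>();

    /** Map a source well to a destination well, replacing any earlier source. */
    void map(Well source, Well destination)
    {
        details.put(destination, source);
    }

    Well getSource(Well destination)
    {
        return details.get(destination);
    }

    public static void main(String[] args)
    {
        PlateMappingSketch mapping = new PlateMappingSketch();
        Well source = new Well(0, 0, 0);
        // The same source well is used for two destination wells
        mapping.map(source, new Well(0, 0, 0));
        mapping.map(source, new Well(0, 0, 1));
        System.out.println(mapping.getSource(new Well(0, 0, 1)));
    }
}
```
]]></programlisting>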
1992       
1993      </sect3>
1994
1995    </sect2>
1996
1997    <sect2 id="data_api.arrays">
1998      <title>Array LIMS - arrays</title>
1999     
2000      <sect3 id="data_api.arrays.uml">
2001        <title>UML diagram</title>
2002       
2003        <figure id="data_api.figures.arrays">
2004          <title>Array LIMS - arrays</title>
2005          <screenshot>
2006            <mediaobject>
2007              <imageobject>
2008                <imagedata 
2009                  align="center"
2010                  fileref="figures/uml/datalayer.arrays.png" format="PNG" />
2011              </imageobject>
2012            </mediaobject>
2013          </screenshot>
2014        </figure>
2015      </sect3>
2016     
2017      <sect3 id="data_api.arrays.designs">
2018        <title>Array designs</title>
2019       
2020        <para>
2021          Array designs are stored in <classname docapi="net.sf.basedb.core.data">ArrayDesignData</classname> objects
2022          and can be created either as standalone designs or
2023          from plates. In the first case the features on an array design
2024          are described by a reporter map. A reporter map is a file
2025          that maps a coordinate (block, meta-grid, row, column),
2026          position or an external ID on an array design to a
2027          reporter. Which method to use is given by the
2028          <property>ArrayDesign.featureIdentificationMethod</property> property.
2029          The coordinate system on an array design is divided into blocks.
2030          Each block can be identified either by a <property>blockNumber</property>
2031          or by meta coordinates. This information is stored in
2032          <classname docapi="net.sf.basedb.core.data">ArrayDesignBlockData</classname> items. Each block
2033          contains several <classname docapi="net.sf.basedb.core.data">FeatureData</classname> items, each
          one identified by a row and column coordinate. Platforms that don't
          divide the array design into blocks or don't use the coordinate system at all
          must still create a single super-block that holds all features.
2037        </para>
2038       
2039        <para>
2040          Array designs that are created from plates use a print map file
2041          instead of a reporter map. A print map is similar to a plate mapping
          but maps features (instead of wells) to wells. The file should
          specify which plate and well a feature is created from. Reporter
2044          information will automatically be copied by BASE from the well.
2045        </para>
2046       
2047        <para>
2048          It is also possible to skip the importing of features into the
          database and just keep the data in the original files instead.
2050          This is typically done for Affymetrix CDF files.
2051        </para>
2052       
2053      </sect3>
2054     
2055      <sect3 id="data_api.arrays.slides">
2056        <title>Array slides</title>
2057       
2058        <para>
          The <classname docapi="net.sf.basedb.core.data">ArraySlideData</classname> represents a single
          array. Arrays are usually printed in batches of several hundred,
          each batch represented by an <classname docapi="net.sf.basedb.core.data">ArrayBatchData</classname> item.
          The <property>batchIndex</property> is the ordinal number of the
          array in the batch. The <property>barcode</property> can be used
          as a means for external programs to identify the array. BASE doesn't
          care if a value is given or if the values are unique or not. If the
          <property>destroyed</property> flag is set it prevents a slide from
          being used by a hybridization.
2068        </para>
2069
2070      </sect3>
2071    </sect2>
2072
2073    <sect2 id="data_api.rawdata">
2074      <title>Hybridizations and raw data</title>
2075     
2076      <sect3 id="data_api.rawdata.uml">
2077        <title>UML diagram</title>
2078       
2079        <figure id="data_api.figures.rawdata">
2080          <title>Hybridizations and raw data</title>
2081          <screenshot>
2082            <mediaobject>
2083              <imageobject>
2084                <imagedata 
2085                  align="center"
2086                  scalefit="1" width="100%"
2087                  fileref="figures/uml/datalayer.rawdata.png" format="PNG" />
2088              </imageobject>
2089            </mediaobject>
2090          </screenshot>
2091        </figure>
2092      </sect3>
2093     
2094      <sect3 id="data_api.rawdata.hybridizations">
2095        <title>Hybridizations</title>
2096       
2097        <para>
        Hybridizations connect the slides from the Array LIMS part
        with labeled extracts from the biomaterials part. The <property>creationEvent</property>
        is used to register which labeled extracts were used on the hybridization.
        The relation to slides is a one-to-one relation. A slide can only be used on
        a single hybridization and a hybridization can only use a single slide. The relation
        is optional from both sides.
2104        </para>
2105
2106        <para>
2107        The scanning of the hybridized slide is registered as separate scan events.
2108        One or more images can optionally be attached to each scan.
2109        The images are not used by BASE.
2110        </para>
2111       
2112      </sect3>
2113     
2114      <sect3 id="data_api.rawdata.description">
2115        <title>Raw data</title>
2116       
2117        <para>
2118        A <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname> object represents
2119        the raw data that is produced by analysing the image(s) from a
        single scan. You may register which software was used, the
2121        protocol and any parameters (through the annotation system).
2122        </para>
2123
2124        <para>
2125        Files with the analysed data values can be attached to the
2126        associated <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object. The platform
        and, optionally, the variant have information about the file types
2128        that can be used for that platform. If the platform file types support
2129        metadata extraction, headers, the number of spots, and other
2130        information may be automatically extracted from the raw data file(s).
2131        </para>
2132       
2133        <para>
2134        If the platform supports it, raw data can also be imported into the database.
2135        This is handled by batchers and <classname docapi="net.sf.basedb.core.data">RawData</classname> objects.
2136        Which table to store the data in depends on the <property>rawDataType</property>
2137        property. The properties shown for the <classname>RawData</classname> class
2138        in the diagram are the mandatory properties. Each raw data type defines additional
2139        properties that are specific to that raw data type.
2140        </para>
2141       
2142      </sect3>
2143     
2144      <sect3 id="data_api.rawdata.spotimages">
2145        <title>Spot images</title>
2146       
2147        <para>
2148        Spot images can be created if you have the original image
2149        files. BASE can use 1-3 images as sources for the red, green
2150        and blue channel respectively. The creation of spot images requires
2151        that x and y coordinates are given for each raw data spot. The scaling
2152        and offset values are used to convert the coordinates to pixel
2153        coordinates. With this information BASE is able to cut out a square
2154        from the source images that, theoretically, contains a specific spot and
2155        nothing else. The spot images are gamma-corrected independently and then
2156        put together into PNG images that are stored in a zip file.
2157        </para>
2158      </sect3>
2159     
2160    </sect2>
2161
2162    <sect2 id="data_api.experiments">
2163      <title>Experiments and analysis</title>
2164     
2165     
2166      <sect3 id="data_api.experiments.uml">
2167        <title>UML diagram</title>
2168       
2169        <figure id="data_api.figures.experiments">
2170          <title>Experiments</title>
2171          <screenshot>
2172            <mediaobject>
2173              <imageobject>
2174                <imagedata 
2175                  align="center"
2176                  scalefit="1" width="75%"
2177                  fileref="figures/uml/datalayer.experiments.png" format="PNG" />
2178              </imageobject>
2179            </mediaobject>
2180          </screenshot>
2181        </figure>
2182      </sect3>
2183     
2184      <sect3 id="data_api.experiments.description">
2185        <title>Experiments</title>
2186       
2187        <para>
2188          The <classname docapi="net.sf.basedb.core.data">ExperimentData</classname> 
2189          class is used to collect information about a single experiment. It
2190          links to any number of <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname>
2191          items, which must all be of the same <classname 
2192          docapi="net.sf.basedb.core">RawDataType</classname>.
2193        </para>
2194       
2195        <para>
2196          Annotation types that are needed in the analysis must be connected to
2197          the experiment as experimental factors and the annotation values should
2198          be set on or inherited by each raw bioassay that is part of the
2199          experiment.
2200        </para>
2201       
2202        <para>
2203          The directory connected to the experiment is the default directory
2204          where plugins that generate files should store them.
2205        </para>
2206      </sect3>
2207           
2208      <sect3 id="data_api.experiments.bioassays">
2209        <title>Bioassay sets, bioassays and transformations</title>
2210       
2211        <para>
2212          Each line of analysis starts with the creation of a <emphasis>root</emphasis>
2213          <classname docapi="net.sf.basedb.core.data">BioAssaySetData</classname>,
2214          which holds the intensities calculated from the raw data.
2215          A bioassayset can hold one intensity for each channel. The number of
2216          channels is defined by the raw data type. For each raw bioassay used a
2217          <classname docapi="net.sf.basedb.core.data">BioAssayData</classname>
2218          is created.
2219        </para>
2220       
2221        <para>
2222          Information about the process that calculated the intensities is
2223          stored in a <classname docapi="net.sf.basedb.core.data">TransformationData</classname>
2224          object. The root transformation links to the raw bioassays that are used
2225          in this line of analysis and to a <classname 
2226          docapi="net.sf.basedb.core.data">JobData</classname> which has information
2227          about which plug-in and parameters were used in the calculation.
2228        </para>
2229     
2230        <para>
2231          Once the root bioassayset has been created it is possible to
2232          again apply a transformation to it. This time the transformation
2233          links to a single source bioassayset instead of the raw bioassays.
2234          As before, it still links to a job with information about the plug-in and
2235          parameters that do the actual work. The transformation must make sure
2236          that new bioassays are created and linked to the bioassays in the
2237          source bioassayset. The above process may be repeated as many times
2238          as needed.
2239        </para>
2240       
2241        <para>
2242          Data can only be added to a bioassay set before it has been
2243          committed to the database. Once the transaction has been committed
2244          it is no longer possible to add more data or to modify existing
2245          data.
2246        </para>
2247     
2248      </sect3>
2249
2250      <sect3 id="data_api.experiments.virtualdb">
2251        <title>Virtual databases, datacubes, etc.</title>
2252       
2253        <para>
2254          The above processes require a flexible storage solution for the data.
2255          Each experiment is related to a <classname docapi="net.sf.basedb.core.data">VirtualDb</classname>
2256          object. This object represents the set of tables that are needed to store
2257          data for the experiment. All tables are created in a special part of the
2258          BASE database that we call the <emphasis>dynamic database</emphasis>.
2259          In MySQL the dynamic database is a separate database, in Postgres it is
2260          a separate schema.
2261        </para>
2262       
2263        <para>
2264          A virtual database is divided into data cubes. A data cube can be seen
2265          as a three-dimensional object where each point can hold data that in
2266          most cases can be interpreted as data for a single spot from an
2267          array. The coordinates of a point are given by <emphasis>layer</emphasis>,
2268          <emphasis>column</emphasis> and <emphasis>position</emphasis>. The
2269          layer and column coordinates are represented by the
2270          <classname docapi="net.sf.basedb.core.data">DataCubeLayerData</classname>
2271          and <classname docapi="net.sf.basedb.core.data">DataCubeColumnData</classname>
2272          objects. The position coordinate has no separate object associated with
2273          it.
2274        </para>
2275       
2276        <para>
2277          Data for a single bioassay set is always stored in a single layer. It
2278          is possible for more than one bioassay set to use the same layer. This
2279          usually happens for filtering transformations that don't modify the
2280          data.  The filtered bioassay set is then linked to a
2281          <classname docapi="net.sf.basedb.core.data">DataCubeFilterData</classname>
2282          object, which has information about which data points
2283          passed the filter.
2284        </para>
2285       
2286        <para>
2287          All data for a bioassay is stored in a single column.
2288          Two bioassays in different bioassaysets (layers) can only have the same
2289          column if one is the parent of the other.
2290        </para>
2291       
2292        <para>
2293          The position coordinate is tied to a reporter.
2294        </para>
2295       
2296        <para>
2297          A child bioassay set may use the same data cube as its parent
2298          bioassay set if all of the following conditions are true:
2299        </para>
2300       
2301        <itemizedlist>
2302        <listitem>
2303          <para>
2304          All positions are linked to the same reporter as the positions
2305          in the parent bioassay set.
2306          </para>
2307        </listitem>
2308       
2309        <listitem>
2310          <para>
2311          All data points are linked to the same (possibly many) raw data
2312          spots as the corresponding data points in the parent bioassay set.
2313          </para>
2314        </listitem>
2315       
2316        <listitem>
2317          <para>
2318          The bioassays in the child bioassay set each have exactly one
2319          parent in the parent bioassay set. No parent bioassay may be the
2320          parent of more than one child bioassay.
2321          </para>
2322        </listitem>
2323        </itemizedlist>
2324       
2325        <para>
2326          If any of the above conditions is not true, a new data cube
2327          must be created for the child bioassay set.
2328        </para>
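        <para>
          The three conditions can be sketched as a single check. This is only an
          illustration and not code from the BASE core: the class
          <code>CubeReuseCheck</code>, the method <code>canShareCube()</code> and the
          use of plain maps keyed by integer ids are assumptions made for the example.
          The second condition is passed in as a flag since the mapping to raw data
          spots is not modelled here.
        </para>

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-alone model; none of these names are BASE classes. */
final class CubeReuseCheck
{
   /**
    * @param parentReporters position -> reporter id in the parent bioassay set
    * @param childReporters position -> reporter id in the child bioassay set
    * @param sameRawDataMapping true if every child data point maps to the
    *    same raw data spots as the corresponding parent data point
    * @param parentsOfChild child bioassay id -> ids of its parent bioassays
    * @return true if the child bioassay set may reuse the parent's data cube
    */
   static boolean canShareCube(
      Map<Integer, Integer> parentReporters,
      Map<Integer, Integer> childReporters,
      boolean sameRawDataMapping,
      Map<Integer, List<Integer>> parentsOfChild)
   {
      // Condition 1: each child position has the same reporter as in the parent
      for (Map.Entry<Integer, Integer> e : childReporters.entrySet())
      {
         Integer parentReporter = parentReporters.get(e.getKey());
         if (!Objects.equals(parentReporter, e.getValue())) return false;
      }

      // Condition 2: same mapping to raw data spots (checked by the caller)
      if (!sameRawDataMapping) return false;

      // Condition 3: exactly one parent per child, and no parent used twice
      Set<Integer> usedParents = new HashSet<>();
      for (List<Integer> parents : parentsOfChild.values())
      {
         if (parents.size() != 1) return false;
         if (!usedParents.add(parents.get(0))) return false;
      }
      return true;
   }
}
```

        <para>
          If the method returns <constant>false</constant>, a transformation following
          this sketch would have to create a new data cube for the child bioassay set.
        </para>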
2329      </sect3>
2330     
2331      <sect3 id="data_api.dynamic.description">
2332        <title>The dynamic database</title>
2333
2334        <figure id="data_api.figures.dynamic">
2335          <title>The dynamic database</title>
2336          <screenshot>
2337            <mediaobject>
2338              <imageobject>
2339                <imagedata 
2340                  align="center"
2341                  fileref="figures/uml/datalayer.dynamic.png" format="PNG" />
2342              </imageobject>
2343            </mediaobject>
2344          </screenshot>
2345        </figure>
2346       
2347        <para>
2348          Each virtual database consists of several tables. The tables
2349          are dynamically created when needed. For each table shown in the diagram
2350          the # sign is replaced by the id of the virtual database object at run
2351          time.
2352        </para>
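        <para>
          As a small illustration of the naming rule, the sketch below substitutes
          the id into a table name template. The class <code>DynamicTableNames</code>
          and its method are hypothetical and are not part of the BASE API. The sketch
          only shows the substitution, not how the core actually creates the tables.
        </para>

```java
/** Hypothetical helper; not a BASE class. */
final class DynamicTableNames
{
   /**
    * Replaces the # sign in a table name template, for example "D#Spot",
    * with the id of the virtual database.
    */
   static String tableName(String template, int virtualDbId)
   {
      return template.replace("#", Integer.toString(virtualDbId));
   }
}
```

        <para>
          With this helper, a virtual database with id 12 would get the tables
          <code>D12Spot</code>, <code>D12Pos</code>, <code>D12Filter</code> and so on.
        </para>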
2353       
2354        <para>
2355          There are no classes in the data layer for these tables and they
2356          are not mapped with Hibernate. When we work with these tables we
2357          are always using batcher classes and queries that work with integers,
2358          floats and strings.
2359        </para>
2360       
2361        <bridgehead>The D#Spot table</bridgehead>
2362        <para>
2363          This is the main table which keeps the intensities for a single spot
2364          in the data cube. Extra values attached to the spot are kept in separate
2365          tables, one for each type of value (D#SpotInt, D#SpotFloat and D#SpotString).
2366        </para>
2367       
2368        <bridgehead>The D#Pos table</bridgehead>
2369        <para>
2370          This table stores the reporter id for each position in a cube.
2371          Extra values attached to the position are kept in separate tables,
2372          one for each type of value (D#PosInt, D#PosFloat and D#PosString).
2373        </para>
2374       
2375        <bridgehead>The D#Filter table</bridgehead>
2376        <para>
2377          This table stores the coordinates for the spots that remain after
2378          filtering. Note that each filter is related to a bioassayset which
2379          gives the cube and layer values. Each row in the filter table then
2380          adds the column and position allowing us to find the spots in the
2381          D#Spot table.
2382        </para>
2383       
2384        <bridgehead>The D#RawParents table</bridgehead>
2385        <para>
2386          This table holds mappings for a spot to the raw data it is calculated
2387          from. We don't need the layer coordinate since all layers in a cube
2388          must have the same mapping to raw data.
2389        </para>
2390       
2391      </sect3>     
2392
2393     
2394    </sect2>
2395   
2396    <sect2 id="data_api.misc">
2397      <title>Other classes</title>
2398     
2399      <sect3 id="data_api.misc.uml">
2400        <title>UML diagram</title>
2401       
2402        <figure id="data_api.figures.misc">
2403          <title>Other classes</title>
2404          <screenshot>
2405            <mediaobject>
2406              <imageobject>
2407                <imagedata 
2408                  align="center"
2409                  fileref="figures/uml/datalayer.misc.png" format="PNG" />
2410              </imageobject>
2411            </mediaobject>
2412          </screenshot>
2413        </figure>
2414      </sect3>
2415     
2416    </sect2>
2417
2418  </sect1>
2419 
2420  <sect1 id="api_overview.core_api" chunked="1">
2421    <title>The Core API</title>
2422   
2423    <para>
2424      This section gives an overview of various parts of the core API.
2425    </para>
2426   
2427    <sect2 id="core_api.data_in_files">
2428      <title>Using files to store data</title>
2429     
2430      <para>
2431        BASE 2.5 introduced the possibility to use files to store data instead
2432        of importing it into the database. Files can be attached
2433        to any item that implements the <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2434        interface. Currently this is <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2435        and <classname docapi="net.sf.basedb.core">ArrayDesign</classname>. The
2436        ability to store data in files is not a replacement for storing data in the
2437        database. It is possible (for some platforms/raw data types) to have data in
2438        files and in the database at the same time. We would have liked to enforce
2439        that (raw) data is always present in files, but this would not be backwards compatible
2440        with older installations, so there are three cases:
2441      </para>
2442     
2443      <itemizedlist>
2444      <listitem>
2445        <para>
2446        Data in files only
2447        </para>
2448      </listitem>
2449      <listitem>
2450        <para>
2451        Data in the database only
2452        </para>
2453      </listitem>
2454      <listitem>
2455        <para>
2456        Data in both files and in the database
2457        </para>
2458      </listitem>
2459      </itemizedlist>
2460     
2461      <para>
2462        Not all three cases are supported for all types of data. This is controlled
2463        by the <classname docapi="net.sf.basedb.core">Platform</classname> class, which may disallow
2464        storing data in the database. To check this, call
2465        <methodname>Platform.isFileOnly()</methodname> and/or
2466        <methodname>Platform.getRawDataType()</methodname>. If the <methodname>isFileOnly()</methodname>
2467        method returns <constant>true</constant>, the platform can't store data in
2468        the database. If the value is <constant>false</constant> more information
2469        can be obtained by calling <methodname>getRawDataType()</methodname>,
2470        which may return:
2471      </para>
2472     
2473      <itemizedlist>
2474      <listitem>
2475        <para>
2476          <constant>null</constant>: The platform can store data with any
2477          raw data type in the database.
2478        </para>
2479      </listitem>
2480      <listitem>
2481        <para>
2482        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == true</code>:
2483        The platform can store data in the database but only data with the specified raw
2484        data type.
2485        </para>
2486      </listitem>
2487      <listitem>
2488        <para>
2489        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == false</code>:
2490        The platform can't store data in the database.
2491        </para>
2492      </listitem>
2493      </itemizedlist>
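      <para>
        The decision described above can be summarised in a few lines of code.
        The sketch below does not use the real BASE classes. The interfaces
        <code>PlatformInfo</code> and <code>RawDataTypeInfo</code> are hypothetical
        stand-ins that only mirror the three methods mentioned in the text, and the
        <code>DbStorage</code> enum is introduced purely for the example.
      </para>

```java
/** Hypothetical stand-alone model of the checks; not BASE classes. */
final class DbStorageCheck
{
   enum DbStorage { NOT_ALLOWED, ANY_RAW_DATA_TYPE, ONE_RAW_DATA_TYPE }

   /** Stand-in for the isStoredInDb() check on a raw data type. */
   interface RawDataTypeInfo
   {
      boolean isStoredInDb();
   }

   /** Stand-in for the two checks on a platform. */
   interface PlatformInfo
   {
      boolean isFileOnly();
      RawDataTypeInfo getRawDataType();
   }

   /** Convenience factory used to build example platforms. */
   static PlatformInfo platform(final boolean fileOnly, final RawDataTypeInfo rawDataType)
   {
      return new PlatformInfo()
      {
         public boolean isFileOnly() { return fileOnly; }
         public RawDataTypeInfo getRawDataType() { return rawDataType; }
      };
   }

   static DbStorage dbStorage(PlatformInfo platform)
   {
      // A file-only platform can never store data in the database
      if (platform.isFileOnly()) return DbStorage.NOT_ALLOWED;

      RawDataTypeInfo rdt = platform.getRawDataType();
      // null: data with any raw data type can be stored in the database
      if (rdt == null) return DbStorage.ANY_RAW_DATA_TYPE;

      // Otherwise the raw data type itself decides
      return rdt.isStoredInDb() ? DbStorage.ONE_RAW_DATA_TYPE : DbStorage.NOT_ALLOWED;
   }
}
```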
2494
2495      <para>
2496        One major change from earlier BASE versions is that the registration of raw data types
2497        has changed. The <filename>raw-data-types.xml</filename> file should
2498        only be used for raw data types that are stored in the database. The
2499        <sgmltag>storage</sgmltag> tag has been deprecated and BASE will refuse
2500        to start if it finds a raw data type definition with <code>storage="file"</code>.
2501      </para>
2502     
2503      <para>
2504        For backwards compatibility reasons, each <classname docapi="net.sf.basedb.core">Platform</classname>
2505        that can only store data in files will create "virtual" raw data type
2506        objects internally. These raw data types all return <constant>false</constant>
2507        from the <methodname>RawDataType.isStoredInDb()</methodname>
2508        method. They also have a back-link to the platform/variant that
2509        created them: <methodname>RawDataType.getPlatform()</methodname>
2510        and <methodname>RawDataType.getVariant()</methodname>. These two methods
2511        will always return <constant>null</constant> when called on a raw data type
2512        that can be stored in the database.
2513      </para>
2514     
2515      <itemizedlist>
2516        <title>See also</title>
2517        <listitem><xref linkend="data_api.platforms" /></listitem>
2518        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
2519        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
2520        <listitem>
2521          <xref linkend="appendix.incompatible.2.5" /> in
2522          <xref linkend="appendix.incompatible" />
2523        </listitem>
2524      </itemizedlist>
2525     
2526      <sect3 id="core_api.data_in_files.diagram">
2527        <title>Diagram of classes and methods</title>
2528        <figure id="core_api.figures.data_in_files">
2529          <title>Store data in files</title>
2530          <screenshot>
2531            <mediaobject>
2532              <imageobject>
2533                <imagedata 
2534                  align="center"
2535                  scalefit="1" width="100%"
2536                  fileref="figures/uml/corelayer.datainfiles.png" format="PNG" />
2537              </imageobject>
2538            </mediaobject>
2539          </screenshot>
2540        </figure>
2541       
2542        <para>
2543          This is a rather large set of classes and methods. The ultimate goal
2544          is to be able to create links between a <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2545          / <classname docapi="net.sf.basedb.core">ArrayDesign</classname> and <classname docapi="net.sf.basedb.core">File</classname>
2546          items and to provide some metadata about the files.
2547          The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname> class is one of the most
2548          important ones. It is intended to make it easy for plug-in (and other)
2549          developers to access the files without having to mess with platform
2550          or file type objects. The API is best described
2551          by a set of use-case examples.
2552        </para>
2553       
2554      </sect3>
2555     
2556      <sect3 id="core_api.data_in_files.ask">
2557        <title>Use case: Asking the user for files for a given item</title>
2558
2559        <para>
2560          A client application must know what types of files it makes sense
2561          to ask the user for. In some cases, data may be split into more than
2562          one file so we need a generic way to select files.
2563        </para>
2564       
2565        <para>
2566          Given that we have a <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2567          item we want to find out which <classname docapi="net.sf.basedb.core">DataFileType</classname>
2568          items can be used for that item. The
2569          <methodname>DataFileType.getQuery(FileStoreEnabled)</methodname>
2570          method can be used for this. Internally, it uses the results of the
2571          <methodname>FileStoreEnabled.getPlatform()</methodname>
2572          and <methodname>FileStoreEnabled.getVariant()</methodname>
2573          methods to restrict the query to only return file types for
2574          a given platform and/or variant. If the item doesn't have
2575          a platform or variant the query will return all file types
2576          that are associated with the given item type. In any case, we get a list
2577          of <classname>DataFileType</classname> items, each one representing a
2578          specific file type that we should ask the user about. Examples:
2579        </para>
2580
2581        <orderedlist>
2582        <listitem>
2583          <para>
2584          The <constant>Affymetrix</constant> platform defines <constant>CEL</constant>
2585          as a raw data file and <constant>CDF</constant> as an array design (reporter map)
2586          file. If we have a <classname docapi="net.sf.basedb.core">RawBioAssay</classname> the query will only return
2587          the CEL file type and the client can ask the user for a CEL file.
2588          </para>
2589        </listitem>
2590        <listitem>
2591          <para>
2592          The <constant>Generic</constant> platform defines <constant>PRINT_MAP</constant>
2593          and <constant>REPORTER_MAP</constant> for array designs. If we have
2594          an <classname docapi="net.sf.basedb.core">ArrayDesign</classname> the query will return those two
2595          items.
2596          </para>
2597        </listitem>
2598        </orderedlist>
2599     
2600        <para>
2601          It might also be interesting to know the currently selected file
2602          for each file type and if the platform has set the <varname>required</varname>
2603          flag for a particular file type. Here is a simple code example
2604          that may be useful to start from:
2605        </para>
2606     
2607        <programlisting language="java">
2608DbControl dc = ...
2609FileStoreEnabled item = ...
2610Platform platform = item.getPlatform();
2611PlatformVariant variant = item.getVariant();
2612
2613// Get list of DataFileTypes used by the platform
2614ItemQuery&lt;DataFileType&gt; query =
2615   DataFileType.getQuery(item);
2616List&lt;DataFileType&gt; types = query.list(dc);
2617
2618// Always check hasFileSet() method first to avoid
2619// creating the file set if it doesn't exist
2620FileSet fileSet = item.hasFileSet() ?
2621   item.getFileSet() : null;
2622   
2623for (DataFileType type : types)
2624{
2625   // Get the current file, if any
2626   FileSetMember member = fileSet == null || !fileSet.hasMember(type) ?
2627      null : fileSet.getMember(type);
2628   File current = member == null ?
2629      null : member.getFile();
2630   
2631   // Check if a file is required by the platform
2632   PlatformFileType pft = platform == null ?
2633      null : platform.getFileType(type, variant);
2634   boolean isRequired = pft == null ?
2635      false : pft.isRequired();
2636     
2637   // Now we can do something with this information to
2638   // let the user select a file ...
2639}
2640</programlisting>
2641     
2642        <note>
2643          <title>Also remember to catch PermissionDeniedException</title>
2644          <para>
2645            The above code may look complicated, but this is mostly because
2646            of all checks for <constant>null</constant> values. Remember
2647            that many things are optional and may return <constant>null</constant>.
2648            Another thing to look out for is
2649            <exceptionname>PermissionDeniedException</exceptionname>. The logged in
2650            user may not have access to all items. The above example doesn't include
2651            any code for this since it would have made it too complex.
2652          </para>
2653        </note>
2654      </sect3>
2655     
2656      <sect3 id="core_api.data_in_files.link">
2657        <title>Use case: Link, validate and extract metadata from the selected files</title>
2658        <para>
2659          When the user has selected the file(s) we must store the links
2660          to them in the database. This is done with a <classname docapi="net.sf.basedb.core">FileSet</classname>
2661          object. A file set can contain any number of files. The only limitation
2662          is that it can only contain one file for each file type.
2663          Call <methodname>FileSet.setMember()</methodname> to store
2664          a file in the file set. If a file already exists for the given file type
2665          it is replaced, otherwise a new entry is created. The following
2666          program example assumes that we have a map from <classname docapi="net.sf.basedb.core">DataFileType</classname>
2667          items to <classname docapi="net.sf.basedb.core">File</classname> items. When all files
2668          have been added we call <methodname>FileSet.validate()</methodname>
2669          to validate the files and extract metadata.
2670        </para>
2671       
2672        <programlisting language="java">
2673DbControl dc = ...
2674FileStoreEnabled item = ...
2675Map&lt;DataFileType, File&gt; files = ...
2676
2677// Store the selected files in the fileset
2678FileSet fileSet = item.getFileSet();
2679for (Map.Entry&lt;DataFileType, File&gt; entry : files.entrySet())
2680{
2681   DataFileType type = entry.getKey();
2682   File file = entry.getValue();
2683   fileSet.setMember(type, file);
2684}
2685
2686// Validate the files and extract metadata
2687fileSet.validate(dc, true);
2688</programlisting>
2689
2690        <para>
2691          Validation and extraction of metadata is important since we want
2692          data in files to be equivalent to data in the database. The validation
2693          and metadata extraction is done by the core when the
2694          <methodname>FileSet.validate()</methodname> method is called.
2695          The process is partly pluggable since each <classname docapi="net.sf.basedb.core">DataFileType</classname> 
2696          can name a class that should do the validation and/or metadata extraction.
2697        </para>
2698
2699        <note>
2700          <para>
2701          The <methodname>FileSet.validate()</methodname> method only validates
2702          the files where the file types have specified plug-ins that can
2703          do the validation and metadata extraction. The method doesn't
2704          throw any exceptions. Instead, all validation errors
2705          are returned as a list of <classname>Throwable</classname> objects. The
2706          validation result is also stored for each file and can be accessed
2707          with <methodname>FileSetMember.isValid()</methodname> and
2708          <methodname>FileSetMember.getErrorMessage()</methodname>.
2709          </para>
2710        </note>
2711
2712        <para>
2713          Here is the general outline of what is going on in the core:
2714        </para>
2715
2716        <orderedlist>
2717        <listitem>
2718          <para>
2719          The core checks the <classname docapi="net.sf.basedb.core">DataFileType</classname> of all
2720          members in the file set and creates <classname docapi="net.sf.basedb.core.filehandler">DataFileValidator</classname>
2721          and <classname docapi="net.sf.basedb.core.filehandler">DataFileMetadataReader</classname> objects. Only one instance
2722          of each class is created. If the file set contains members that have the
2723          same validator or metadata reader, they will all share the same instance.
2724          </para>
2725        </listitem>
2726       
2727        <listitem>
2728          <para>
2729          Each validator/metadata reader class is initialised with calls to
2730          <methodname>DataFileHandler.setItem()</methodname> and
2731          <methodname>DataFileHandler.setFile()</methodname>.
2732          </para>
2733        </listitem>
2734       
2735        <listitem>
2736          <para>
2737          Each validator is called. The result of the validation is saved for each
2738          file and can be retrieved by <methodname>FileSetMember.isValid()</methodname>
2739          and <methodname>FileSetMember.getErrorMessage()</methodname>.
2740          </para>
2741        </listitem>
2742       
2743        <listitem>
2744          <para>
2745          Each metadata reader is called, unless the metadata reader is the same class
2746          as the validator and the validation failed. If the metadata reader is a
2747          different class, it is called even if the validation failed.
2748          </para>
2749        </listitem>
2750        </orderedlist>
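        <para>
          The rule in the last step is easy to get backwards, so here it is as
          a one-line predicate. The class <code>MetadataReaderRule</code> and its
          method are hypothetical and only restate the rule above. They are not
          taken from the core implementation.
        </para>

```java
/** Hypothetical helper that restates step 4; not a BASE class. */
final class MetadataReaderRule
{
   /**
    * @param validatorClass class that validated the file (may be null)
    * @param readerClass class that should extract the metadata
    * @param validationOk true if the validation succeeded
    * @return true if the metadata reader should be called
    */
   static boolean shouldCallMetadataReader(
      Class<?> validatorClass, Class<?> readerClass, boolean validationOk)
   {
      boolean sameClass = validatorClass != null && validatorClass.equals(readerClass);
      // Skip only when the reader is the same class as a validator that failed
      return validationOk || !sameClass;
   }
}
```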
2751
2752        <note>
2753          <title>Only one instance of each validator class is created</title>
2754          <para>
2755          The validation/metadata extraction is not done until all files have been
2756          added to the fileset. If the same validator/metadata reader is
2757          used for more than one file, the same instance is reused. That is,
2758          the <methodname>setFile()</methodname> method is called once
2759          for each file/file type pair. The <methodname>validate()</methodname>
2760          and <methodname>extractMetadata()</methodname> methods are only
2761          called once.
2762          </para>
2763        </note>
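        <para>
          The sharing of handler instances can be modelled with a map keyed by
          the handler class name. The sketch below is an assumption about the
          general shape of such a loop and is not the actual core code:
          <code>SharedHandlerSketch</code>, its nested <code>Handler</code> class
          and the string arrays used to describe file set members are all
          invented for the example.
        </para>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of instance sharing; not BASE classes. */
final class SharedHandlerSketch
{
   /** Records the calls it receives so the behaviour can be inspected. */
   static final class Handler
   {
      final List<String> files = new ArrayList<>();
      int validateCalls = 0;

      void setFile(String fileType, String fileName)
      {
         files.add(fileType + "=" + fileName);
      }

      void validate()
      {
         validateCalls++;
      }
   }

   /**
    * @param members one entry per file set member:
    *    { file type, file name, validator class name }
    * @return the handlers that were created, keyed by class name
    */
   static Map<String, Handler> run(List<String[]> members)
   {
      Map<String, Handler> handlers = new LinkedHashMap<>();

      // One instance per class name; setFile() once per file/file type pair
      for (String[] m : members)
      {
         handlers.computeIfAbsent(m[2], k -> new Handler()).setFile(m[0], m[1]);
      }

      // validate() is called only once per instance, after all files are set
      for (Handler h : handlers.values())
      {
         h.validate();
      }
      return handlers;
   }
}
```

        <para>
          A consequence of this design is that a validator written for several
          file types must keep track of every file it receives through
          <methodname>setFile()</methodname> and check all of them in its single
          <methodname>validate()</methodname> call.
        </para>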
2764       
2765        <para>
2766          All validators and metadata extractors should extend
2767          the <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> class. The reason
2768          is that we may want to add more methods to the <interfacename docapi="net.sf.basedb.core.filehandler">DataFileHandler</interfacename>
2769          interface in the future. The <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> will
2770          be used to provide default implementations for backwards compatibility.
2771        </para>
2772       
2773      </sect3>
2774     
2775      <sect3 id="core_api.data_in_files.import">
2776        <title>Use case: Import data into the database</title>
2777       
2778        <para>
2779          This should be done by existing plug-ins in the same way as before.
2780          A slight modification is needed since it is good if the importers
2781          are made aware of already selected files in the <classname docapi="net.sf.basedb.core">FileSet</classname>
2782          to provide good default values. The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname>
2783          class is very useful in cases like this:
2784        </para>
2785       
2786        <programlisting language="java">
2787RawBioAssay rba = ...
2788DbControl dc = ...
2789
2790// Get the current raw data file, if any
2791List&lt;File&gt; rawDataFiles =
2792   FileStoreUtil.getGenericDataFiles(dc, rba, FileType.RAW_DATA);
2793File defaultFile = rawDataFiles.size() > 0 ?
2794   rawDataFiles.get(0) : null;
2795   
2796// Create parameter asking for input file - use current as default
2797PluginParameter&lt;File&gt; fileParameter = new PluginParameter&lt;File&gt;(
2798   "file",
2799   "Raw data file",
2800   "The file that contains the raw data that you want to import",
2801   new FileParameterType(defaultFile, true, 1)
2802);
2803</programlisting>
2804
2805      <para>
2806        An import plug-in should also save the file that was used to the file set:
2807      </para>
2808     
2809      <programlisting language="java">
2810RawBioAssay rba = ...
2811// The file the user selected to import from
2812File rawDataFile = (File)job.getValue("file");
2813
2814// Save the file to the fileset. The method will check which file
2815// type the platform uses as the raw data type. As a fallback the
2816// GENERIC_RAW_DATA type is used
2817FileStoreUtil.setGenericDataFile(dc, rba, FileType.RAW_DATA,
2818   DataFileType.GENERIC_RAW_DATA, rawDataFile);
2819</programlisting>
2820
2821      </sect3>
2822     
2823      <sect3 id="core_api.data_in_files.experiments">
2824        <title>Use case: Using raw data from files in an experiment</title>
2825       
2826        <para>
2827          Just as before, an experiment is still locked to a single
2828          <classname docapi="net.sf.basedb.core">RawDataType</classname>. This is a design issue that
2829          would break too many things if changed. If data is stored in files
2830          the experiment is also locked to a single <classname docapi="net.sf.basedb.core">Platform</classname>.
2831          This has been designed to have as little impact on existing
2832          plug-ins as possible. In most cases, the plug-ins will continue
2833          to work as before.
2834        </para>
2835       
2836        <para>
2837          A plug-in (using data from the database) that needs to check if it can
2838          be used within an experiment can still do:
2839        </para>
2840       
2841        <programlisting language="java">
2842Experiment e = ...
2843RawDataType rdt = e.getRawDataType();
2844if (rdt.isStoredInDb())
2845{
2846   // Check number of channels, etc...
2847   // ... run plug-in code ...
2848}
2849</programlisting>
2850       
2851        <para>
2852          A newer plug-in which uses data from files should do:
2853        </para>
2854       
2855        <programlisting language="java">
2856Experiment e = ...
2857DbControl dc = ...
2858RawDataType rdt = e.getRawDataType();
2859if (!rdt.isStoredInDb())
2860{
2861   // Check that platform/variant is supported
2862   Platform p = rdt.getPlatform(dc);
2863   PlatformVariant v = rdt.getVariant(dc);
2864   // ...
2865
2866   // Get data files
2867   File aFile = FileStoreUtil.getDataFile(dc, ...);
2868   
2869   // ... run plug-in code ...
2870}
2871</programlisting>
2872       
2873      </sect3>
2874     
2875    </sect2>
2876   
2877    <sect2 id="core_api.signals">
2878      <title>Sending signals (to plug-ins)</title>
2879   
2880      <para>
2881        BASE has a simple system for sending signals between different parts of
2882        a system. This signalling system was initially developed to be able to
2883        kill plug-ins that a user for some reason wanted to abort. The signalling
2884        system as such is not limited to this and it can be used for other purposes
2885        as well. Signals can of course be handled internally in a single JVM but
2886        also sent externally to other JVMs running on the same or a different
2887        computer. The transport mechanism for signals is decoupled from the actual
2888        handling of them. If you want to, you could implement a signal transporter
2889        that sends signals as emails and the target plug-in would never know.
2890      </para>
2891     
2892      <para>
2893        The remainder of this section will focus mainly on the sending and
2894        transportation of signals. For more information about handling signals
2895        on the receiving end, see <xref linkend="plugin_developer.signals" />.
2896      </para>
2897     
2898      <sect3 id="core_api.signals.diagram">
2899        <title>Diagram of classes and methods</title>
2900        <figure id="core_api.figures.signals">
2901          <title>The signalling system</title>
2902          <screenshot>
2903            <mediaobject>
2904              <imageobject>
2905                <imagedata 
2906                  align="center"
2907                  scalefit="1" width="100%"
2908                  fileref="figures/uml/corelayer.signals.png" format="PNG" />
2909              </imageobject>
2910            </mediaobject>
2911          </screenshot>
2912        </figure>
2913     
2914        <para>
2915          The signalling system is rather simple. An object that wishes
2916          to receive signals must implement the
2917          <interfacename docapi="net.sf.basedb.core.signal"
2918          >SignalTarget</interfacename> interface. Its only method
2919          is <methodname>getSignalHandler()</methodname>. A
2920          <interfacename docapi="net.sf.basedb.core.signal"
2921          >SignalHandler</interfacename> is an object that
2922          knows what to do when a signal is delivered to it. The target object
2923          may implement the <interfacename>SignalHandler</interfacename> itself
2924          or use one of the existing handlers.
2925        </para>
2926       
2927        <para>
2928          The difficult part here is to be aware that a signal is usually
2929          delivered by a separate thread. The target object must be aware
2930          of this and know how to handle multiple threads. As an example we
2931          can use the <classname docapi="net.sf.basedb.core.signal"
2932          >ThreadSignalHandler</classname> which simply
2933          calls <code>Thread.interrupt()</code> to deliver a signal. The target
2934          object that uses this signal handler must know that it should check
2935          <code>Thread.interrupted()</code> at regular intervals from the main
2936          thread. If that method returns true, it means that the <constant>ABORT</constant>
2937          signal has been delivered and the main thread should clean up and exit as
2938          soon as possible.
2939        </para>
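
        <para>
          As a minimal, self-contained sketch of the polling pattern described
          above, the following plain Java example lets a worker thread check its
          interrupt flag between units of work and stop when the flag is set.
          The class <classname>AbortSignalSketch</classname> and the method
          <methodname>runAndAbort()</methodname> are hypothetical names used for
          illustration only and are not part of BASE. The call to
          <code>worker.interrupt()</code> stands in for what a thread-based
          signal handler is described as doing:
        </para>

        <programlisting language="java">
public class AbortSignalSketch
{
   // Starts a worker that polls its interrupt flag, interrupts it from the
   // calling thread and reports whether the worker noticed the interrupt.
   // Hypothetical helper, not part of the BASE API.
   static boolean runAndAbort()
   {
      final boolean[] sawAbort = { false };
      Thread worker = new Thread(() -> {
         while (true)
         {
            // ... do one small unit of work here ...
            if (Thread.interrupted())
            {
               // The interrupt flag was set: clean up and stop
               sawAbort[0] = true;
               return;
            }
         }
      });
      worker.start();
      worker.interrupt(); // stand-in for the delivery of an abort signal
      try
      {
         worker.join(); // wait until the worker has stopped
      }
      catch (InterruptedException ex)
      {
         Thread.currentThread().interrupt();
         return false;
      }
      return sawAbort[0];
   }

   public static void main(String[] args)
   {
      System.out.println("worker saw abort: " + runAndAbort());
   }
}
</programlisting>

        <para>
          Note that <code>Thread.interrupted()</code> clears the interrupt flag
          when it is read, so a real plug-in should act on the first
          <constant>true</constant> result instead of expecting to see it again.
        </para>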
2940       
2941        <para>
2942          Even if a signal handler could be given directly to the party
2943          that may be interested in sending a signal to the target this
2944          is not recommended. This would only work when sending signals
2945          within the same virtual machine. The signalling system includes
2946          <interfacename docapi="net.sf.basedb.core.signal"
2947          >SignalTransporter</interfacename> and
2948          <interfacename docapi="net.sf.basedb.core.signal"
2949          >SignalReceiver</interfacename> objects that are used
2950          to decouple the sending of signals from the handling of signals. The
2951          implementations usually come in pairs, for example
2952          <classname docapi="net.sf.basedb.core.signal"
2953          >SocketSignalTransporter</classname> and <classname 
2954          docapi="net.sf.basedb.core.signal">SocketSignalReceiver</classname>.
2955        </para>
2956       
2957        <para>
2958          Setting up the transport mechanism is usually a system responsibility.
2959          Only the system knows what kind of transport is appropriate for its current
2960          setup. For example, should signals be delivered by TCP/IP sockets, only internally, or
2961          should a delivery mechanism based on web services be implemented?
2962          If a system wants to receive signals it must create an appropriate
2963          <interfacename>SignalReceiver</interfacename> object. Within BASE the
2964          internal job queue sets up its own signalling system that can be used to
2965          send signals (e.g. kill) to running jobs. The job agents do the same but use
2966          a different implementation. See <xref linkend="appendix.base.config.jobqueue" />
2967          for more information about how to configure the internal job queue's
2968          signal receiver. In both cases, there is only one signal receiver instance
2969          active in the system.
2970        </para>
2971       
2972        <para>
2973          Let's take the internal job queue as an example. Here is how it works:
2974        </para>
2975       
2976        <itemizedlist>
2977        <listitem>
2978          <para>
2979          When the internal job queue is started, it will also create a signal
2980          receiver instance according to the settings in <filename>base.config</filename>.
2981          The default is to create <classname docapi="net.sf.basedb.core.signal"
2982          >LocalSignalReceiver</classname>
2983          which can only be used inside the same JVM. If needed, this can
2984          be changed to a <classname docapi="net.sf.basedb.core.signal"
2985          >SocketSignalReceiver</classname> or any other
2986          user-provided implementation.
2987          </para>
2988        </listitem>
2989       
2990        <listitem>
2991          <para>
2992          When the job queue has found a plug-in to execute it will check if
2993          it also implements the <interfacename docapi="net.sf.basedb.core.signal"
2994          >SignalTarget</interfacename>
2995          interface. If it does, a signal handler is created and registered
2996          with the signal receiver. This is actually done by the BASE core
2997          by calling <methodname>PluginExecutionRequest.registerSignalReceiver()</methodname>
2998          which also makes sure that the ID returned from the registration is
2999          stored in the database together with the job item representing the
3000          plug-in to execute.
3001          </para>
3002        </listitem>
3003       
3004        <listitem>
3005          <para>
3006          Now, when the web client sees a running job which has a non-empty
3007          signal transporter property, the <guilabel>Abort</guilabel>
3008          button is activated. If the user clicks this button the BASE core
3009          uses the information in the database to create a
3010          <interfacename docapi="net.sf.basedb.core.signal"
3011          >SignalTransporter</interfacename> object. This
3012          is simply done by calling <code>Job.getSignalTransporter()</code>.
3013          The created signal transporter knows how to send a signal
3014          to the signal receiver it was first registered with. When the
3015          signal arrives at the receiver it will find the handler for it
3016          and call <code>SignalHandler.handleSignal()</code>. This will in turn
3017          trigger some action in the signal target, which will soon abort what
3018          it is doing and exit.
3019          </para>
3020        </listitem>
3021        </itemizedlist>
3022       
3023       
3024      </sect3>
3025   
3026    </sect2>
3027   
3028  </sect1>
3029
3030  <sect1 id="api_overview.query_api">
3031    <title>The Query API</title>
3032    <para>
3033      This documentation is only available in the old format.
3034      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html"
3035        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html</ulink>.
3036    </para>
3037   
3038  </sect1>
3039 
3040  <sect1 id="api_overview.dynamic_and_batch_api">
3041    <title>Analysis and the Dynamic and Batch APIs</title>
3042    <para>
3043      This documentation is only available in the old format.
3044      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html"
3045        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html</ulink>.
3046    </para>
3047  </sect1>
3048
3049  <sect1 id="api_overview.extensions">
3050    <title>Extensions API</title>
3051   
3052    <sect2 id="api_overview.extensions.core">
3053      <title>The core part</title>
3054   
3055      <para>
3056        The <emphasis>Extensions API</emphasis> is divided into two parts: a core
3057        part and a web client specific part. The core part can be found in the
3058        <package>net.sf.basedb.util.extensions</package> package and its sub-packages,
3059        and consists of three sub-parts:
3060      </para>
3061     
3062      <itemizedlist>
3063      <listitem>
3064        <para>
3065        A set of interface definitions which form the core of the Extensions API.
3066        The interfaces define, for example, what an <interfacename 
3067        docapi="net.sf.basedb.util.extensions">Extension</interfacename> is and
3068        what an <interfacename 
3069        docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename> should do.
3070        </para>
3071      </listitem>
3072     
3073      <listitem>
3074        <para>
3075        A <classname docapi="net.sf.basedb.util.extensions">Registry</classname> that is
3076        used to keep track of installed extensions. The registry also provides
3077        functionality for invoking and using the extensions.
3078        </para>
3079      </listitem>
3080     
3081      <listitem>
3082        <para>
3083        Utility classes that are useful when implementing a client application
3084        that should be extendable. The most useful example is the <classname
3085        docapi="net.sf.basedb.util.extensions.xml">XmlLoader</classname> which can
3086        read extension definitions from XML files and create the proper factories,
3087        etc.
3088        </para>
3089      </listitem>
3090      </itemizedlist>
3091     
3092      <figure id="core_api.figures.extensions_core">
3093        <title>The core part of the Extensions API</title>
3094        <screenshot>
3095          <mediaobject>
3096            <imageobject>
3097              <imagedata 
3098                align="center"
3099                fileref="figures/uml/corelayer.extensions_core.png" format="PNG" />
3100            </imageobject>
3101          </mediaobject>
3102        </screenshot>
3103      </figure>
3104     
3105      <para>
3106        The <classname docapi="net.sf.basedb.util.extensions">Registry</classname> 
3107        is one of the main classes in the extension system. All extension points and
3108        extensions must be registered before they can be used. Typically, you will
3109        first register extension points and then extensions, because an extension
3110        can't be registered until the extension point it is extending has been
3111        registered.
3112      </para>
3113     
3114      <para>
3115        An <interfacename docapi="net.sf.basedb.util.extensions">ExtensionPoint</interfacename>
3116        is an ID and a definition of an <interfacename docapi="net.sf.basedb.util.extensions">Action</interfacename>
3117        class. The other options (name, description, renderer factory, etc.) are optional.
3118        An <interfacename docapi="net.sf.basedb.util.extensions">Extension</interfacename>
3119        that extends a specific extension point must provide an
3120        <interfacename docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename>
3121        instance that can create actions of the type the extension point requires.
3122      </para>
3123     
3124      <example id="core_api.example.extensions_core">
3125        <title>The menu extension point</title>
3126        <para>
3127        The <code>net.sf.basedb.clients.web.menu.extensions</code> extension point
3128        requires <interfacename 
3129        docapi="net.sf.basedb.clients.web.extensions.menu">MenuItemAction</interfacename>
3130        objects. An extension for this extension point must provide a factory that
3131        can create <interfacename>MenuItemAction</interfacename> objects. BASE ships with default
3132        factory implementations, for example the <classname 
3133        docapi="net.sf.basedb.clients.web.extensions.menu">FixedMenuItemFactory</classname>
3134        class, but an extension may provide its own factory implementation if it wants to.
3135        </para>
3136      </example>
3137     
3138      <para>
3139        Call the <methodname>Registry.useExtensions()</methodname> method
3140        to use extensions from one or several extension points. This method will
3141        find all extensions for the given extension points. If a filter is given,
3142        it checks if any of the extensions or extension points have been disabled.
3143        It will then call <methodname>ActionFactory.prepareContext()</methodname>
3144        for all remaining extensions. This gives the action factory a chance to
3145        also disable the extension, for example, if the logged in user doesn't
3146        have a required permission. The action factory may also set attributes
3147        on the context. The attributes can be anything that the extension point
3148        may make use of. Check the documentation for the specific extension point
3149        for information about which attributes it supports. If there are
3150        any renderer factories, their <methodname>RendererFactory.prepareContext()</methodname>
3151        is also called. They have the same possibility of setting attributes
3152        on the context, but can't disable an extension.
3153      </para>
3154     
3155      <para>
3156        After this, an <classname 
3157        docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname>
3158        object is created and returned to the extension point. Note that
3159        the <methodname>ActionFactory.getActions()</methodname> has not been
3160        called yet, so we don't know if the extensions are actually
3161        going to generate any actions. The <methodname>ActionFactory.getActions()</methodname>
3162        is not called until we have obtained an
3163        <classname docapi="net.sf.basedb.util.extensions">ActionIterator</classname>
3164        from the <methodname>ExtensionsInvoker.iterate()</methodname> method and
3165        start to iterate. The call to <methodname>ActionIterator.hasNext()</methodname>
3166        will propagate down to <methodname>ActionFactory.getActions()</methodname>
3167        and the generated actions are then available with the
3168        <methodname>ActionIterator.next()</methodname> method.
3169      </para>
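
      <para>
        The deferred call to <methodname>getActions()</methodname> described
        above can be illustrated with a small, self-contained sketch. The
        classes <classname>CountingFactory</classname> and
        <classname>LazyIterator</classname> below are hypothetical stand-ins
        written for this example. They are not the BASE classes, they handle
        only a single factory, and they use plain strings in place of action
        objects. The point is only to show that creating the iterator does not
        ask the factory for anything until <methodname>hasNext()</methodname>
        is called:
      </para>

      <programlisting language="java">
public class LazyActionsSketch
{
   // Hypothetical stand-in for an action factory that counts how many
   // times it has been asked for actions
   static class CountingFactory
   {
      int calls = 0;

      String[] getActions()
      {
         calls++;
         return new String[] { "menu-item-1", "menu-item-2" };
      }
   }

   // Hypothetical stand-in for an action iterator: the factory is not
   // asked for actions until hasNext() is called for the first time
   static class LazyIterator
   {
      private final CountingFactory factory;
      private String[] actions;
      private int index = 0;

      LazyIterator(CountingFactory factory)
      {
         this.factory = factory;
      }

      boolean hasNext()
      {
         if (actions == null) actions = factory.getActions();
         return index != actions.length;
      }

      String next()
      {
         if (!hasNext()) throw new java.util.NoSuchElementException();
         return actions[index++];
      }
   }

   public static void main(String[] args)
   {
      CountingFactory factory = new CountingFactory();
      LazyIterator it = new LazyIterator(factory);
      System.out.println("calls before iterating: " + factory.calls);
      while (it.hasNext())
      {
         System.out.println(it.next());
      }
      System.out.println("calls after iterating: " + factory.calls);
   }
}
</programlisting>

      <para>
        In this sketch the factory is called exactly once, on the first
        <methodname>hasNext()</methodname>, no matter how many actions are
        then returned by <methodname>next()</methodname>.
      </para>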
3170     
3171      <para>
3172        The <methodname>ExtensionsInvoker.renderDefault()</methodname>
3173        and <methodname>ExtensionsInvoker.render()</methodname> are
3174        just convenience methods that make it easier to render
3175        the actions. The first method will of course only work if the
3176        extension point provides a renderer factory that can
3177        create the default renderer.
3178      </para>
3179     
3180      <note>
3181        <title>Be aware of multi-threading issues</title>
3182        <para>
3183          When you are creating extensions you must be aware that
3184          multiple threads may access the same objects at the same time.
3185          In particular, any action factory or renderer factory has to be
3186          thread-safe, since only one exists for each extension.
3187          Action and renderer objects should be thread-safe if the
3188          factories re-use the same objects.
3189        </para>
3190      </note>
3191   
3192    </sect2>
3193   
3194    <sect2 id="api_overview.extensions.web">
3195      <title>The web client part</title>
3196   
3197      <para>
3198        The web client specific parts of the Extensions API can be found
3199        in the <package>net.sf.basedb.clients.web.extensions</package> package
3200        and its sub-packages. The top-level package contains classes used to
3201        administer the extension system. Here you will find, for example, the
3202        <classname docapi="net.sf.basedb.clients.web.extensions">ExtensionsControl</classname> 
3203        class, which is the master controller for the web client extensions. It:
3204      </para>
3205     
3206      <itemizedlist>
3207      <listitem>
3208        <para>
3209        Keeps track of installed extensions and which JAR or XML file they are
3210        installed from.
3211        </para>
3212      </listitem>
3213     
3214      <listitem>
3215        <para>
3216        Can, manually or automatically, find and install new or
3217        updated extensions and uninstall deleted extensions.
3218        </para>
3219      </listitem>
3220     
3221      <listitem>
3222        <para>
3223        Adds permission control to the extension system, so that only an
3224        administrator is allowed to change settings, enable/disable extensions,
3225        etc.
3226        </para>
3227      </listitem>
3228      </itemizedlist>
3229     
3230      <para>
3231        In the top-level package there are also some abstract classes that may
3232        be useful to extend for developers creating their own extensions.
3233        For example, we recommend that all action factories extend the <classname 
3234        docapi="net.sf.basedb.clients.web.extensions">AbstractJspActionFactory</classname>
3235        class.
3236      </para>
3237     
3238      <para>
3239        The sub-packages to <package>net.sf.basedb.clients.web.extensions</package>
3240        are mostly specific to a single extension point or to a specific type of
3241        extension point. The <package>net.sf.basedb.clients.web.extensions.menu</package>
3242        package, for example, contains classes that can be used by extensions
3243        adding menu items to the <menuchoice><guimenu>Extensions</guimenu></menuchoice>
3244        menu.
3245      </para>
3246     
3247      <figure id="core_api.figures.extensions_web">
3248        <title>The web client part of the Extensions API</title>
3249        <screenshot>
3250          <mediaobject>
3251            <imageobject>
3252              <imagedata 
3253                align="center"
3254                fileref="figures/uml/corelayer.extensions_web.png" format="PNG" />
3255            </imageobject>
3256          </mediaobject>
3257        </screenshot>
3258      </figure>
3259   
3260      <para>
3261        When the Tomcat web server is starting up, the <classname 
3262        docapi="net.sf.basedb.clients.web.extensions">ExtensionsServlet</classname>
3263        is automatically loaded. This servlet has two purposes:
3264      </para>
3265     
3266      <itemizedlist>
3267      <listitem>
3268        <para>
3269        Initialise the extensions system by calling
3270        <methodname>ExtensionsControl.init()</methodname>. This will result in
3271        an initial scan for installed extensions, which is equivalent to doing
3272        a manual scan with the force update setting to false. This means that
3273        the extension system is up and running as soon as the first user logs
3274        in to BASE.
3275        </para>
3276      </listitem>
3277     
3278      <listitem>
3279        <para>
3280        Act as a proxy for custom servlets defined by the extensions. URLs
3281        ending with <code>.servlet</code> have been mapped to the
3282        <classname>ExtensionsServlet</classname>. When a request is made it
3283        will extract the name of the extension's JAR file from the
3284        URL, get the corresponding <classname 
3285        docapi="net.sf.basedb.clients.web.extensions">ExtensionsFile</classname>
3286        and <classname docapi="net.sf.basedb.clients.web.extensions">ServletWrapper</classname>
3287        and then invoke the custom servlet. More information can be found in
3288        <xref linkend="extensions_developer.servlets" />.
3289        </para>
3290      </listitem>
3291     
3292      </itemizedlist>
3293     
3294      <para>
3295        Using extensions only involves calling the
3296        <methodname>ExtensionsControl.createContext()</methodname> and
3297        <methodname>ExtensionsControl.useExtensions()</methodname> methods. This
3298        returns an <classname docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname> 
3299        object as described in the previous section.
3300      </para>
3301     
3302      <para>
3303        To render the actions you can either use the
3304        <methodname>ExtensionsInvoker.iterate()</methodname> method
3305        and generate HTML from the information in each action or,
3306        preferably, use a renderer together with the
3307        <classname docapi="net.sf.basedb.clients.web.taglib.extensions">Render</classname>
3308        taglib.
3309      </para>
3310     
3311      <para>
3312        To get information about the installed extensions, 
3313        change settings, enable/disable extensions, perform a manual
3314        scan, etc. use the <methodname>ExtensionsControl.get()</methodname>
3315        method. This will create a permission-controlled object. All
3316        users have read permission, administrators have write permission.
3317      </para>
3318     
3319      <note>
3320        <para>
3321          The permission we check for is WRITE permission on the
3322          web client item. This means it is possible to give a user
3323          permissions to manage the extension system by assigning
3324          WRITE permission to the web client entry in the database.
3325          Do this from <menuchoice>
3326            <guimenu>Administrate</guimenu>
3327            <guimenuitem>Clients</guimenuitem>
3328          </menuchoice>.
3329        </para>
3330      </note>
3331   
3332      <para>
3333        The <classname docapi="net.sf.basedb.clients.web.extensions">XJspCompiler</classname>
3334        is mapped to handle the compilation <code>.xjsp</code> files
3335        which are regular JSP files with a different extension. This feature is
3336        experimental and requires installing an extra JAR into Tomcat's lib
3337        directory. See <xref linkend="admin.extensions.xjspcompiler" /> for
3338        more information.
3339      </para>
3340   
3341    </sect2>
3342   
3343  </sect1>
3344
3345  <sect1 id="api_overview.other_api">
3346    <title>Other useful classes and methods</title>
3347    <para>
3348      TODO
3349    </para>
3350  </sect1>
3351 
3352</chapter>