source: trunk/doc/src/docbook/developerdoc/api_overview.xml @ 4902

Last change on this file since 4902 was 4902, checked in by Nicklas Nordborg, 14 years ago

Fixes #550: Write "Overview of BASE" section in Developer documentation part

  • Property svn:eol-style set to native
  • Property svn:keywords set to Id
File size: 124.9 KB
Line 
1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
5<!--
6  $Id: api_overview.xml 4902 2009-04-24 10:56:19Z nicklas $
7
8  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
9
10  This file is part of BASE - BioArray Software Environment.
11  Available at http://base.thep.lu.se/
12
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 3
16  of the License, or (at your option) any later version.
17
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
20  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
21  GNU General Public License for more details.
22
23  You should have received a copy of the GNU General Public License
24  along with BASE. If not, see <http://www.gnu.org/licenses/>.
25-->
26
27<chapter id="api_overview">
28  <?dbhtml dir="api"?>
29  <title>API overview (how to use and code examples)</title>
30
31  <sect1 id="api_overview.public_api">
32    <title>The Public API of BASE</title>
33   
34    <para>
35      Not all public classes and methods in the <filename>BASE2Core.jar</filename>
36      and other JAR files shipped with BASE are considered as
37      <emphasis>Public API</emphasis>. This is important knowledge
38      since we will always try to maintain backwards compatibility
39      for classes that are part of the public API. For other
40      classes, changes may be introduced at any time without
41      notice or specific documentation. In other words:
42    </para>
43   
44    <note>
45      <title>Only use the public API when developing plug-ins</title>
46      <para>
47        This will maximize the chance that you plug-in will continue
48        to work with the next BASE release. If you use the non-public API
49        you do so at your own risk.
50      </para>
51    </note>
52   
53    <para>
54      See the <ulink url="http://base.thep.lu.se/chrome/site/doc/api/index.html"
55        >javadoc</ulink> for information about
56      what parts of the API that contributes to the public API.
57      Methods, classes and other elements that have been tagged as
58      <code>@deprecated</code> should be considered as part of the internal API
59      and may be removed in a subsequent release without warning.
60    </para>
61   
62    <para>
63      See <xref linkend="appendix.incompatible" /> to read more about
64      changes that have been introduced by each release.
65    </para>
66
67    <sect2 id="api_overview.compatibility">
68      <title>What is backwards compatibility?</title>
69     
70      <para>
71        There is a great article about this subject on <ulink 
72        url="http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs"
73          >http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs</ulink>.
74        This is what we will try to comply with. If you do not want to
75        read the entire article, here are some of the most important points:
76      </para>
77     
78     
79      <sect3 id="api_overview.compatibility.binary">
80        <title>Binary compatibility</title>
81        <para>
82        <blockquote>
83          Pre-existing Client binaries must link and run with new releases of the
84          Component without recompiling.
85        </blockquote>
86       
87        For example:
88        <itemizedlist>
89        <listitem>
90          <para>
91            We cannot change the number or types of parameters to a method
92            or constructor.
93          </para>
94        </listitem>
95        <listitem>
96          <para>
97            We cannot add or change methods to interfaces that are intended
98            to be implemented by plug-in or client code.
99          </para>
100        </listitem>
101        </itemizedlist>
102        </para>       
103      </sect3>
104     
105      <sect3 id="api_overview.compatibility.contract">
106        <title>Contract compatibility</title>
107        <para>
108          <blockquote>
109          API changes must not invalidate formerly legal Client code.
110          </blockquote>
111       
112          For example:
113          <itemizedlist>
114          <listitem>
115            <para>
116              We cannot change the implementation of a method to do
117              things differently than before. For example, allow <constant>null</constant>
118              as a return value when it was not allowed before.
119            </para>
120          </listitem>
121          </itemizedlist>
122       
123          <note>
124            <para>
125            Sometimes there is a very fine line between what is considered a
126            bug and what is considered a feature. For example, if the
127            actual implementation does not do what the javadoc says,
128            do we change the code or do we change the documentation?
129            This has to be considered from case to case and depends on
130            the age of the code and if we expect plug-ins and clients to be
131            affected by it or not.
132            </para>
133          </note>
134        </para>
135      </sect3>
136     
137      <sect3 id="api_overview.compatibility.source">
138        <title>Source code compatibility</title>
139        <para>
140        This is not an important matter and is not always possible to
141        achieve. In most cases, the problems are easy to fix.
142        Example:
143       
144        <itemizedlist>
145        <listitem>
146          <para>
147          Adding a class may break a plug-in or client that import
148          classes with <constant>.*</constant> if the same class name
149          exists in another package.
150          </para>
151        </listitem>
152        </itemizedlist>
153        </para>
154      </sect3>
155    </sect2>
156  </sect1>
157
158  <sect1 id="api_overview.data_api" chunked="1">
159    <title>The database schema and the Data Layer API</title>
160
161    <para>
162      This section gives an overview of the entire data layer API.
163      The figure below show how different modules relate to each other.
164    </para>
165 
166    <figure id="data_api.figures.overview">
167      <title>Data layer overview</title>
168      <screenshot>
169        <mediaobject>
170          <imageobject>
171            <imagedata 
172              align="center"
173              scalefit="1" width="100%"
174              fileref="figures/uml/datalayer.overview.png" format="PNG" />
175          </imageobject>
176        </mediaobject>
177      </screenshot>
178    </figure>
179
180    <sect2 id="data_api.basic">
181      <title>Basic classes and interfaces</title>
182     
183      <para>
184        This document contains information about the basic classes and interfaces in this package.
185        They are important since all data-layer classes must inherit from one of the already
186        existing abstract base classes or implement one or more of the
187        existing interfaces. They contain code that is common to all classes,
188        for example implementations of the <methodname>equals()</methodname>
189        and <methodname>hashCode()</methodname> methods or how to link with the owner of an
190        item.
191      </para>
192     
193      <sect3 id="data_api.basic.uml">
194        <title>UML diagram</title>
195       
196        <figure id="data_api.figures.basic">
197          <title>Basic classes and interfaces</title>
198          <screenshot>
199            <mediaobject>
200              <imageobject>
201                <imagedata 
202                  align="center"
203                  fileref="figures/uml/datalayer.basic.png" format="PNG" />
204              </imageobject>
205            </mediaobject>
206          </screenshot>
207        </figure>
208      </sect3>
209     
210      <sect3 id="data_api.basic.classes">
211        <title>Classes</title>
212       
213        <variablelist>
214        <varlistentry>
215          <term><classname docapi="net.sf.basedb.core.data">BasicData</classname></term>
216          <listitem>
217            <para>
218            The root class. It overrides the <methodname>equals()</methodname>,
219            <methodname>hashCode()</methodname> and <methodname>toString()</methodname> methods
220            from the <classname>Object</classname> class. It also defines the
221            <varname>id</varname> and <varname>version</varname> properties.
222            All data layer classes must inherit from this class or one of it's subclasses.
223            </para>
224          </listitem>
225        </varlistentry>
226       
227        <varlistentry>
228          <term><classname docapi="net.sf.basedb.core.data">OwnedData</classname></term>
229          <listitem>
230            <para>
231            Extends the <classname>BasicData</classname> class and adds
232            an <varname>owner</varname> property. The owner is a required link to a
233            <classname docapi="net.sf.basedb.core.data">UserData</classname> object, representing the user that
234            is the owner of the item.
235            </para>
236          </listitem>
237        </varlistentry>
238
239        <varlistentry>
240          <term><classname docapi="net.sf.basedb.core.data">SharedData</classname></term>
241          <listitem>
242            <para>
243            Extends the <classname>OwnedData</classname> class and adds
244            properties (<varname>itemKey</varname> and <varname>projectKey</varname>)
245            that holds access permission information for an item.
246            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
247            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects. These objects only exists if
248            the item has been shared.
249            </para>
250          </listitem>
251        </varlistentry>
252
253        <varlistentry>
254          <term><classname docapi="net.sf.basedb.core.data">CommonData</classname></term>
255          <listitem>
256            <para>
257            This is a convenience class for items that extends the <classname>SharedData</classname>
258            class and implements the <interfacename docapi="net.sf.basedb.core.data">NameableData</interfacename> and
259            <interfacename docapi="net.sf.basedb.core.data">RemoveableData</interfacename> interfaces. This is one of
260            the most common situations.
261            </para>
262          </listitem>
263        </varlistentry>
264
265        <varlistentry>
266          <term><classname docapi="net.sf.basedb.core.data">AnnotatedData</classname></term>
267          <listitem>
268            <para>
269            This is a convenience class for items that can be annotated.
270            Annotations are held in <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> objects.
271            The annotation set only exists if annotations has been created for the item.
272            </para>
273          </listitem>
274        </varlistentry>
275        </variablelist>
276       
277      </sect3>
278     
279      <sect3 id="data_api.basic.interfaces">
280        <title>Interfaces</title>
281       
282        <variablelist>
283        <varlistentry>
284          <term><classname docapi="net.sf.basedb.core.data">IdentifiableData</classname></term>
285          <listitem>
286            <para>
287            All items are identifiable, which means that they have a unique <varname>id</varname>.
288            The id is unique for all items of a specific type (ie. class). The id is number
289            that is automatically generated by the database and has no other meaning
290            outside of the application. The <varname>version</varname> property is used for
291            detecting and preventing concurrent modifications to an item.
292            </para>
293          </listitem>
294        </varlistentry>
295       
296        <varlistentry>
297          <term><classname docapi="net.sf.basedb.core.data">OwnableData</classname></term>
298          <listitem>
299            <para>
300            An ownable item is an item which has an owner. The owner is represented as a
301            required link to a <classname docapi="net.sf.basedb.core.data">UserData</classname> object.
302            </para>
303          </listitem>
304        </varlistentry>       
305
306        <varlistentry>
307          <term><classname docapi="net.sf.basedb.core.data">ShareableData</classname></term>
308          <listitem>
309            <para>
310            A shareable item is an item which can be shared to other users, groups or projects.
311            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
312            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects.
313            </para>
314          </listitem>
315        </varlistentry>
316             
317        <varlistentry>
318          <term><classname docapi="net.sf.basedb.core.data">NameableData</classname></term>
319          <listitem>
320            <para>
321            A nameable item is an item that has a name (required) and a description
322            (optional). The name doesn't have to be unique, except in a few special
323            cases (for example, the name of a file).
324            </para>
325          </listitem>
326        </varlistentry>
327       
328        <varlistentry>
329          <term><classname docapi="net.sf.basedb.core.data">RemovableData</classname></term>
330          <listitem>
331            <para>
332            A removable item is an item that can be flagged as removed. This doesn't
333            remove the information about the item from the database, but can be used by
334            client applications to hide items that the user is not interested in.
335            A trashcan function can be used to either restore or permanently
336            remove items that has the flag set.
337            </para>
338          </listitem>
339        </varlistentry>
340               
341        <varlistentry>
342          <term><classname docapi="net.sf.basedb.core.data">SystemData</classname></term>
343          <listitem>
344            <para>
345            A system item is an item which has an additional id in the form of string. A system id
346            is required when we need to make sure that we can get a specific item without
347            knowing the numeric id. Example of such items are the root user and the everyone group.
348            A system id is generally constructed like:
349            <constant>net.sf.basedb.core.User.ROOT</constant>. The system id:s are defined in the
350            core layer by each item class.
351            </para>
352          </listitem>
353        </varlistentry>
354
355        <varlistentry>
356          <term><classname docapi="net.sf.basedb.core.data">DiskConsumableData</classname></term>
357          <listitem>
358            <para>
359            This interface is used by items which occupies a lot of disk space and
360            should be part of the quota system, for example files. The required
361            <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname> contains information about the size,
362            location, owner etc. of the item.
363            </para>
364          </listitem>
365        </varlistentry>
366       
367        <varlistentry>
368          <term><classname docapi="net.sf.basedb.core.data">AnnotatableData</classname></term>
369          <listitem>
370            <para>
371            This interface is used by items which can be annotated. Annotations are name/value
372            pairs that are attached as extra information to an item. All annotations are
373            contained in an <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> object.
374            </para>
375          </listitem>
376        </varlistentry>
377       
378        <varlistentry>
379          <term><classname docapi="net.sf.basedb.core.data">ExtendableData</classname></term>
380          <listitem>
381            <para>
382            This interface is used by items which can have extra administrator-defined
383            columns. The functionality is similar to annotations. It is not as flexible,
384            since it is a global configuration, but has better performance. BASE will
385            generate extra database columns to store the data in the tables for items that
386            can be extended.
387            </para>
388          </listitem>
389        </varlistentry>
390       
391        <varlistentry>
392          <term><classname docapi="net.sf.basedb.core.data">BatchableData</classname></term>
393          <listitem>
394            <para>
395            This interface is a tagging interface which is used by items that needs batch
396            functionality in the core.
397            </para>
398          </listitem>
399        </varlistentry>
400       
401        <varlistentry>
402          <term><classname docapi="net.sf.basedb.core.data">RegisteredData</classname></term>
403          <listitem>
404            <para>
405            This interface is used by items which registered the date they were
406            created in the database. The registration date is set at creation time
407            and can't be modified later. Since this didn't exist prior to BASE 2.10
408            null values are allowed on all pre-existing items. Note! For backwards
409            compatibility reasons with existing code in
410            <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
411            the method name is <methodname>getEntryDate()</methodname>.
412            </para>
413          </listitem>
414        </varlistentry>
415        </variablelist>
416
417      </sect3>
418    </sect2>
419   
420    <sect2 id="data_api.authentication">
421      <title>User authentication and access control</title>
422     
423      <para>
424         This section gives an overview of user authentication and
425         how groups, roles and projects are used for access control
426         to items.
427      </para>
428     
429      <sect3 id="data_api.authentication.uml">
430        <title>UML diagram</title>
431       
432        <figure id="data_api.figures.authentication">
433          <title>User authentication and access control</title>
434          <screenshot>
435            <mediaobject>
436              <imageobject>
437                <imagedata 
438                  align="center"
439                  scalefit="1" width="100%"
440                  fileref="figures/uml/datalayer.authentication.png" format="PNG" />
441              </imageobject>
442            </mediaobject>
443          </screenshot>
444        </figure>
445      </sect3>
446     
447      <sect3 id="data_api.authentication.users">
448        <title>Users and passwords</title>     
449     
450        <para>
451          The <classname docapi="net.sf.basedb.core.data">UserData</classname> class holds information about users.
452          We keep the passwords in a separate table and use proxies to avoid loading
453          password data each time a user is loaded to minimize security risks. It is
454          only if the password needs to be changed that the <classname docapi="net.sf.basedb.core.data">PasswordData</classname>
455          object is loaded. The one-to-one mapping between user and password is controlled
456          by the password class, but a cascade attribute on the user class makes sure
457          that the password is deleted when a user is deleted.
458        </para>
459      </sect3>
460
461      <sect3 id="data_api.authentication.groups">
462        <title>Groups, roles and projects</title>     
463     
464        <para>
465          The <classname docapi="net.sf.basedb.core.data">GroupData</classname>, <classname docapi="net.sf.basedb.core.data">RoleData</classname> and
466          <classname docapi="net.sf.basedb.core.data">ProjectData</classname> classes holds information about groups, roles
467          and projects respectively. A user may be a member of any number of groups,
468          roles and/or projects. The membership in a project comes with an attached
469          permission values. This is the highest permission the user has in the
470          project. No matter what permission an item has been shared with the
471          user will not get higher permission. Groups may be members of other groups and
472          also in projects.
473        </para>
474       
475        <para>
476          Group membership is always accounted for, but the core only allows
477          one project at a time to be use, this is the <emphasis>active project</emphasis>.
478          When a project is active new items that are created are automatically
479          added to that project with the permission given by the
480          <varname>autoPermission</varname> property.
481        </para>
482             
483      </sect3>
484     
485      <sect3 id="data_api.authentication.keys">
486        <title>Keys</title>     
487     
488        <para>
489          The <classname docapi="net.sf.basedb.core.data">KeyData</classname> class and it's subclasses
490          <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname>, <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> and
491          <classname docapi="net.sf.basedb.core.data">RoleKeyData</classname>, are used to store information about access
492          permissions to items. To get permission to manipulate an item a user must have
493          access to a key giving that permission. There are three types of keys:
494        </para>
495       
496        <variablelist>
497        <varlistentry>
498          <term><classname docapi="net.sf.basedb.core.data">ItemKey</classname></term>
499          <listitem>
500            <para>
501            Is used to give a user or group access to a specific item. The item
502            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item.
503            The permissions are usually set by the owner of the item. Once created an
504            item key cannot be changed. This allows the core to reuse a key if the
505            permissions match exactly, ie. for a given set of users/groups/permissions
506            there can be only one item key object.
507            </para>
508          </listitem>
509        </varlistentry>
510
511        <varlistentry>
512          <term><classname docapi="net.sf.basedb.core.data">ProjectKey</classname></term>
513          <listitem>
514            <para>
515            Is used to give members of a project access to a specific item. The item
516            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item. Once created a
517            project key cannot be changed. This allows the core to reuse a key if the
518            permissions match exactly, ie. for a given set of projects/permissions
519            there can be only one project key object.
520            </para>
521          </listitem>
522        </varlistentry>
523
524        <varlistentry>
525          <term><classname docapi="net.sf.basedb.core.data">RoleKey</classname></term>
526          <listitem>
527            <para>
528            Is used to give a user access to all items of a specific type, ie.
529            <constant>READ</constant> all <constant>SAMPLES</constant>. The installation
530            will make sure that there already exists a role key for each type of item, and
531            it is not possible to add new or delete existing keys. Unlike the other two types
532            this key can be modified.
533            </para>
534           
535            <para>
536            A role key is also used to assign permissions to plug-ins. If a plug-in has
537            been specified to use permissions the default is to deny everything.
538            The mapping to the role key is used to grant permissions to the plugin.
539            The <varname>granted</varname> value gives the plugin access to all items
540            of the related item type regardless of if the user that is running the plug-in has the
541            permission or not. The <varname>denied</varname> values denies access to all
542            items of the related item type even if the logged in user has the permission.
543            Permissions that are not granted nor denied are checked against the
544            logged in users regular permissions. Permissions to items that are
545            not linked are always denied.
546            </para>
547          </listitem>
548        </varlistentry>
549        </variablelist>
550       
551      </sect3>
552
553      <sect3 id="data_api.authentication.permissions">
554        <title>Permissions</title>
555       
556        <para>
557          The <varname>permission</varname> property appearing in many classes is an
558          integer values describing the permission:
559        </para>
560       
561        <informaltable>
562        <tgroup cols="2">
563          <colspec colname="value" />
564          <colspec colname="permission" />
565          <thead>
566            <row>
567              <entry>Value</entry>
568              <entry>Permission</entry>
569            </row>
570          </thead>
571          <tbody>
572            <row>
573              <entry>1</entry>
574              <entry>Read</entry>
575            </row>
576            <row>
577              <entry>3</entry>
578              <entry>Use</entry>
579            </row>
580            <row>
581              <entry>7</entry>
582              <entry>Restricted write</entry>
583            </row>
584            <row>
585              <entry>15</entry>
586              <entry>Write</entry>
587            </row>
588            <row>
589              <entry>31</entry>
590              <entry>Delete</entry>
591            </row>
592            <row>
593              <entry>47 (=32+15)</entry>
594              <entry>Set owner</entry>
595            </row>
596            <row>
597              <entry>79 (=64+15)</entry>
598              <entry>Set permissions</entry>
599            </row>
600            <row>
601              <entry>128</entry>
602              <entry>Create</entry>
603            </row>
604            <row>
605              <entry>256</entry>
606              <entry>Denied</entry>
607            </row>
608          </tbody>
609        </tgroup>
610        </informaltable>
611       
612        <para>
613          The values are constructed so that
614          <constant>READ</constant> -&gt;
615          <constant>USE</constant> -&gt;
616          <constant>RESTRICTED_WRITE</constant> -&gt;
617          <constant>WRITE</constant> -&gt;
618          <constant>DELETE</constant>
619          are chained in the sense that a higher permission always implies the lower permissions
620          also. The <constant>SET_OWNER</constant> and <constant>SET_PERMISSION</constant>
621          both implies <constant>WRITE</constant> permission. The <constant>DENIED</constant>
622          permission is only valid for role keys, and if specified it overrides all
623          other permissions.               
624        </para>
625       
626        <para>
627          When combining permission for a single item the permission codes for the different
628          paths are OR-ed together. For example a user has a role key with <constant>READ</constant>
629          permission for <constant>SAMPLES</constant>, but also an item key with <constant>USE</constant>
630          permission for a specific sample. Of course, the resulting permission for that
631          sample is <constant>USE</constant>. For other samples the resulting permission is
632          <constant>READ</constant>.
633        </para>
634       
635        <para>
636          If the user is also a member of a project which has <constant>WRITE</constant>
637          permission for the same sample, the user will have <constant>WRITE</constant>
638          permission when working with that project.
639        </para>
640       
641        <para>
642          The <constant>RESTRICTED_WRITE</constant> permission is in most cases the same
643          as the <constant>WRITE</constant> permission. So far the <constant>RESTRICTED_WRITE</constant>
644          permission is only given to users to their own <classname docapi="net.sf.basedb.core.data">UserData</classname>
645          object so they can change their address and other contact information,
646          but not quota, expiration date and other administrative information.
647        </para>
648
649      </sect3>
650    </sect2>
651
652    <sect2 id="data_api.wares">
653      <title>Hardware and software</title>
654      <para>
655         This section gives an overview of hardware and software in BASE.
656      </para>
657     
658      <sect3 id="data_api.wares.uml">
659        <title>UML diagram</title>
660       
661        <figure id="data_api.figures.wares">
662          <title>Hardware and software</title>
663          <screenshot>
664            <mediaobject>
665              <imageobject>
666                <imagedata 
667                  align="center"
668                  fileref="figures/uml/datalayer.wares.png" format="PNG" />
669              </imageobject>
670            </mediaobject>
671          </screenshot>
672        </figure>
673      </sect3>
674     
675      <sect3 id="data_api.wares.description">
676        <title>Hardware and software</title>
677        <para>
678          BASE is pre-installed with a set of hardware and software types.
679          They are typically used to filter the registered hardware and software
680          depending on what a user is doing. For example, when adding raw data
681          to BASE a user can select a scanner. The GUI will display the hardware
682          that has been registered as <emphasis>scanner</emphasis> hardware types.
683          Other hardware types are <emphasis>hybridization station</emphasis>
684          and <emphasis>print robot</emphasis>. An administrator may register more
685          hardware and software types.
686        </para>
687      </sect3>
688    </sect2>
689   
690    <sect2 id="data_api.reporters">
691      <title>Reporters</title>
692      <para>
693         This section gives an overview of hardware and software in BASE.
694      </para>
695     
696      <sect3 id="data_api.reporters.uml">
697        <title>UML diagram</title>
698       
699        <figure id="data_api.figures.reporters">
700          <title>Reporters</title>
701          <screenshot>
702            <mediaobject>
703              <imageobject>
704                <imagedata 
705                  align="center"
706                  fileref="figures/uml/datalayer.reporters.png" format="PNG" />
707              </imageobject>
708            </mediaobject>
709          </screenshot>
710        </figure>
711      </sect3>
712     
713      <sect3 id="data_api.reporters.description">
714        <title>Reporters</title>
715        <para>
716          The <classname docapi="net.sf.basedb.core.data">ReporterData</classname> class holds information about reporters.
717          The <property>externalId</property> is a required property that must be unique
718          among all reporters. The external ID is the value BASE uses to match
719          reporters when importing data from files.
720        </para>
721       
722        <para>
723          The <classname>ReporterData</classname> is an <emphasis>extendable</emphasis>
724          class, which means that the server administrator can define additional
725          columns (=annotations) in the reporters table. These are accessed with
726          the <methodname>ReporterData.getExtended()</methodname> and
727          <methodname>ReporterData.setExtended()</methodname> methods.
728          See <xref linkend="appendix.extendedproperties" /> for more information about
729          this.
730        </para>
731       
732        <para>
733          The <classname>ReporterData</classname> is also a <emphasis>batchable</emphasis>
734          class which means that there is no corresponding class in the core
735          layer. Client applications and plug-ins should work directly with
736          the <classname>ReporterData</classname> class. To help manage the reporters
737          there is the <classname docapi="net.sf.basedb.core">Reporter</classname> and <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>
738          classes. The main reason for this
739          is to increase the performance and lower the memory usage by bypassing
740          internal caching in the core and Hibernate. Performance is also
741          increased by the batchers which uses more efficient SQL against the
742          database than Hibernate.
743        </para>
744       
745        <para>
746          The
747          <property>lastUpdate</property>
748          property holds the data and time the reporter information was last updated. The
749          value is managed automatically by the
750          <classname>ReporterBatcher</classname>
751          class. That goes for
752          <property>lastSource</property>
753          property too, which holds information about where the last update comes from. By
754          default this is set to the name of the logged in user, but it can be changed by
755          calling
756          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>
757          before the batcher commits the updates to the database. The source-string
758          should have the format: <synopsis>[ITEM_TYPE]:[ITEM_NAME]</synopsis> where,in
759          the file-case, ITEM_TYPE is File and ITEM_NAME is the file's name.
760        </para>
761      </sect3>
762     
763      <sect3 id="data_api.reporters.lists">
764        <title>Reporter lists</title>
765       
766        <para>
767          Reporter lists can be used to group reporters that are somehow related
768          to each other. This could for example be a list of interesting reporters
769          found in the analysis of an experiment. Each reporter in the list may
770          optionally be assigned a score. The meaning of the score value is not
771          interpreted by BASE.
772        </para>
773       
774      </sect3>
775     
776     
777    </sect2>
778
779    <sect2 id="data_api.quota">
780      <title>Quota and disk usage</title>
781      <para>
782         This section gives an overview of quota system in BASE
783         and how the disk usage is kept track of.
784      </para>
785     
786      <sect3 id="data_api.quota.uml">
787        <title>UML diagram</title>
788       
789        <figure id="data_api.figures.quota">
790          <title>Quota and disk usage</title>
791          <screenshot>
792            <mediaobject>
793              <imageobject>
794                <imagedata 
795                  align="center"
796                  fileref="figures/uml/datalayer.quota.png" format="PNG" />
797              </imageobject>
798            </mediaobject>
799          </screenshot>
800        </figure>
801      </sect3>
802     
803      <sect3 id="data_api.quota.description">
804        <title>Quota</title>
805       
806        <para>
807          The <classname docapi="net.sf.basedb.core.data">QuotaData</classname> holds information about a
808          single quota registration. The same quota may be used by many different users
809          and groups. This object encapsulates allowed
810          quota values for different types of quota types and locations.
811          BASE defines several quota types (file, raw data and experiment),
812          and locations (primary, secondary and offline).
813        </para>
814       
815        <para>
816          The <property>quotaValues</property> property is a map from
817          <classname docapi="net.sf.basedb.core.data">QuotaIndex</classname> to maximum byte values.
818          This map must contain at least one entry for the total
819          quota at the primary location.
820        </para>
821       
822      </sect3>
823     
824      <sect3 id="data_api.quota.diskusage">
825        <title>Disk usage</title>
826       
827        <para>
828          A <interfacename docapi="net.sf.basedb.core.data">DiskConsumableData</interfacename> (for example a file)
829          item is automatically linked to a <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname>
830          item. This holds information about the number of bytes,
831          the location and quota type the item uses. It also holds information
832          about which user and group (optional) that should be charged for the disk usage.
833          The user is always the owner of the item.
834        </para>
835
836      </sect3>
837     
838    </sect2>
839
840    <sect2 id="data_api.clients">
841      <title>Client, session and settings</title>
842      <para>
843         This section gives an overview of hardware and software in BASE.
844      </para>
845     
846      <sect3 id="data_api.clients.uml">
847        <title>UML diagram</title>
848       
849        <figure id="data_api.figures.clients">
850          <title>Client, sessions and settings</title>
851          <screenshot>
852            <mediaobject>
853              <imageobject>
854                <imagedata 
855                  align="center"
856                  scalefit="1" width="100%"
857                  fileref="figures/uml/datalayer.clients.png" format="PNG" />
858              </imageobject>
859            </mediaobject>
860          </screenshot>
861        </figure>
862      </sect3>
863     
864      <sect3 id="data_api.clients.description">
865        <title>Clients</title>
866        <para>
867          The <classname docapi="net.sf.basedb.core.data">ClientData</classname> class holds information
868          about a client application. The <property>externalId</property>
869          property is a unique identifier for the application. To avoid ID clashes the ID
870          should be constructed in the same way as Java packages, for example
871          <constant>net.sf.basedb.clients.web</constant> is the ID for the
872          web client application.
873        </para>
874       
875        <para>
876          A client application doesn't have to be registered with BASE
877          to be able to use it. But we recommend it since:
878        </para>
879       
880        <itemizedlist>
881        <listitem>
882          <para>
883            The permission system allows an admin to specify exactly
884            which users that may use a specific application.
885          </para>
886        </listitem>
887       
888        <listitem>
889          <para>
890          The application can't store any context-sensitive or application-specific
891          settings unless it is registered.
892          </para>
893        </listitem>
894       
895        <listitem>
896          <para>
897          The application can store context-sensitive help in the BASE
898          database.
899          </para>
900        </listitem>
901        </itemizedlist>
902      </sect3>
903     
904      <sect3 id="data_api.clients.sessions">
905        <title>Sessions</title>
906       
907        <para>
908          A session represents the time between login and logout for a single
909          user. The <classname docapi="net.sf.basedb.core.data">SessionData</classname> object is entirely
910          managed by the BASE core, and should be considered read-only
911          for client applications.
912        </para>
913           
914      </sect3>
915     
916      <sect3 id="data_api.clients.settings">
917        <title>Settings</title>
918       
919        <para>
920          There are two types of settings: context-sensitive settings and regular
921          settings. The regular settings are simple key-value pairs of strings
922          and can be used for almost anything. There are four subtypes:
923        </para>
924       
925        <itemizedlist>
926        <listitem>
927          <para>
928          Global default settings: Settings that are used by all users
929          and client applications on the BASE server. These settings
930          are read-only except for administrators. BASE has not yet defined
931          any settings of this type.
932          </para>
933        </listitem>
934       
935        <listitem>
936          <para>
937          User default settings: Settings that are valid for a single user
938          for any client application. BASE has not yet defined
939          any settings of this type.
940          </para>
941        </listitem>
942       
943        <listitem>
944          <para>
945          Client default settings: Settings that are valid for all users using
946          a specific client application. Each client application is responsible
947          for defining it's own settings. Settings are read-only except
948          for administrators.
949          </para>
950        </listitem>
951       
952        <listitem>
953          <para>
954          User client settings: Settings that are valid for a single user using
955          a specific client application. Each client application is responsible
956          for defining it's own settings.
957          </para>
958        </listitem>
959       
960        </itemizedlist>
961       
962        <para>
963          The context-sensitive settings are designed to hold information
964          about the current status of options related to the listing of items
965          of a specific type. This includes:
966        </para>
967       
968        <itemizedlist>
969        <listitem>
970          <para>
971          Current filtering options (as 1 or more <classname docapi="net.sf.basedb.core.data">PropertyFilterData</classname>
972          objects).
973          </para>
974        </listitem>
975       
976        <listitem>
977          <para>
978          Which columns and direction to use for sorting.
979          </para>
980        </listitem>
981       
982        <listitem>
983          <para>
984          The number of items to display on each page, and which page that
985          is the current page.
986          </para>
987        </listitem>
988       
989        <listitem>
990          <para>
991          Simple key-value settings related to a given context.
992          </para>
993        </listitem>
994        </itemizedlist>
995       
996        <para>
997          Context-sensitive settings are only accessible if a client
998          application has been registered. The settings may be
999          named to make it possible to store several presets and to
1000          quickly switch between them. In any case, BASE maintains a
1001          current default setting with an empty name. An administrator
1002          may mark a named setting as public to allow other users to
1003          use it.
1004        </para>
1005       
1006      </sect3>
1007     
1008     
1009    </sect2>
1010
1011    <sect2 id="data_api.files">
1012      <title>Files and directories</title>
1013
1014      <para>
1015        This section covers the details of the BASE file
1016        system.
1017      </para>
1018
1019      <sect3 id="data_api.files.uml">
1020      <title>UML diagram</title>
1021     
1022        <figure id="data_api.figures.files">
1023          <title>Files and directories</title>
1024          <screenshot>
1025            <mediaobject>
1026              <imageobject>
1027                <imagedata 
1028                  align="center"
1029                  fileref="figures/uml/datalayer.files.png" format="PNG" />
1030              </imageobject>
1031            </mediaobject>
1032          </screenshot>
1033        </figure>
1034      </sect3>
1035     
1036      <sect3 id="data_api.files.description">
1037        <title>Description</title>
1038       
1039        <para>
1040          The <classname docapi="net.sf.basedb.core.data">DirectoryData</classname> class holds
1041          information about directories. Directories are organised in the
1042          ususal way as as tree structure. All directories must have
1043          a parent directory, except the system-defined root directory.
1044        </para>
1045       
1046        <para>
1047          The <classname docapi="net.sf.basedb.core.data">FileData</classname> class holds information about
1048          a file. The actual file contents is stored on disk in the directory
1049          specified by the <varname>userfiles</varname> setting in
1050          <filename>base.config</filename>. The <varname>internalName</varname>
1051          property is the name of the file on disk, but this is never exposed to
1052          client applications. The filenames and directories
1053          on the disk doesn't correspond to the the filenames and directories in
1054          BASE.
1055        </para>
1056       
1057        <para>
1058          The <varname>location</varname> property can take three values:
1059        </para>
1060       
1061        <itemizedlist>
1062        <listitem>
1063          <para>
1064          0 = The file is offline, ie. there is no file on the disk
1065          </para>
1066        </listitem>
1067        <listitem>
1068          <para>
1069          1 = The file is in primary storage, ie. it is located on the disk
1070          and can be used by BASE
1071          </para>
1072        </listitem>
1073        <listitem>
1074          <para>
1075          2 = The file is in secondary storage, ie. it has been moved to some
1076          other place and can't be used by BASE immediately.
1077          </para>
1078        </listitem>
1079        </itemizedlist>
1080       
1081        <para>
1082          The <varname>action</varname> property controls how a file is
1083          moved between primary and seconday storage. It can have the following
1084          values:
1085        </para>
1086       
1087        <itemizedlist>
1088        <listitem>
1089          <para>
1090          0 = Do nothing
1091          </para>
1092        </listitem>
1093        <listitem>
1094          <para>
1095          1 = If the file is in secondary storage, move it back to the primary storage
1096          </para>
1097        </listitem>
1098        <listitem>
1099          <para>
1100          2 = If the file is in primary storage, move it to the secondary storage
1101          </para>
1102        </listitem>
1103        </itemizedlist>
1104       
1105        <para>
1106          The actual moving between primary and secondary storage is done by an
1107          external program. See
1108          <xref linkend="appendix.base.config.secondary" /> and
1109          <xref linkend="plugin_developer.other.secondary" /> for more information.
1110        </para>
1111     
1112        <para>
1113          The <varname>md5</varname> property can be used to check for file
1114          corruption when it is moved between primary and secondary storage or
1115          when a user re-uploads a file that has been offline.
1116        </para>
1117       
1118        <para>
1119          BASE can store files in a compressed format. This is handled internally
1120          and is not visible to client applications. The <varname>compressed</varname>
1121          and <varname>diskSize</varname> properties are used to store information
1122          about this. A file may always be compressed if the users says so, but
1123          BASE can also do this automatically if the file is uploaded
1124          to a directory with the <varname>autoCompress</varname> flag set
1125          or if the file has MIME type with the <varname>autoCompress</varname>
1126          flag set.
1127        </para>
1128       
1129        <para>
1130          The <classname docapi="net.sf.basedb.core.data">FileTypeData</classname> class holds information about
1131          file types. It is used only to make it easier for users to organise
1132          their files.
1133        </para>
1134       
1135        <para>
1136          The <classname docapi="net.sf.basedb.core.data">MimeTypeData</classname> is used to register mime types and
1137          map them to file extensions. The information is only used to lookup values
1138          when needed. Given the filename we can set the <varname>File.mimeType</varname>
1139          and <varname>File.fileType</varname> properties. The MIME type is also
1140          used to decide if a file should be stored in a compressed format or not.
1141          The extension of a MIME type must be unique. Extensions should be registered
1142          without a dot, ie <emphasis>html</emphasis>, not <emphasis>.html</emphasis>
1143        </para>
1144       
1145      </sect3>
1146     
1147     
1148    </sect2>
1149   
1150    <sect2 id="data_api.platforms">
1151      <title>Experimental platforms</title>
1152
1153      <para>
1154         This section gives an overview of experimental platforms
1155         and how they are used to enable data storage in files
1156         instead of in the database.
1157      </para>
1158     
1159      <itemizedlist>
1160        <title>See also</title>
1161        <listitem><xref linkend="core_api.data_in_files" /></listitem>
1162        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
1163        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
1164      </itemizedlist>
1165         
1166      <sect3 id="data_api.platforms.uml">
1167        <title>UML diagram</title>
1168       
1169        <figure id="data_api.figures.platforms">
1170          <title>Experimental platforms</title>
1171          <screenshot>
1172            <mediaobject>
1173              <imageobject>
1174                <imagedata 
1175                  align="center"
1176                  fileref="figures/uml/datalayer.platforms.png" format="PNG" />
1177              </imageobject>
1178            </mediaobject>
1179          </screenshot>
1180        </figure>
1181      </sect3>
1182     
1183      <sect3 id="data_api.platforms.platforms">
1184        <title>Platforms</title>
1185       
1186        <para>
1187          The <classname docapi="net.sf.basedb.core.data">PlatformData</classname> holds information about a
1188          platform. A platform can have one or more <classname docapi="net.sf.basedb.core.data">PlatformVariant</classname>:s.
1189          Both the platform and variant are identified by an external ID that
1190          is fixed and can't be changed. <emphasis>Affymetrix</emphasis>
1191          is an example of a platform.
1192          If the <varname>fileOnly</varname> flag is set data for the platform
1193          can only be stored in files and not imported into the database. If
1194          the flag is not set data can be imported into the database.
1195          In the latter case, the <varname>rawDataType</varname> property
1196          can be used to lock the platform
1197          to a specific raw data type. If the value is <constant>null</constant>
1198          the platform can use any raw data type.
1199        </para>
1200       
1201        <para>
1202          Each platform and it's variant can be connected to one or more
1203          <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname> items. This item
1204          describes the kind of files that are used to hold data for
1205          the platform and/or variant. The file types are re-usable between
1206          different platforms and variants. Note that a file type may be attached
1207          to either only a platform or to a platform with a variant. File
1208          types attached to platforms are inherited by the variants. The variants
1209          can only define additional file types, not remove or redefine file types
1210          that has been attached to the platform.
1211        </para>
1212        <para>
1213          The file type is also identified
1214          by a fixed, non-changable external ID. The <varname>itemType</varname>
1215          property tells us what type of item the file holds data for (ie.
1216          array design or raw bioassay). It also links to a <classname docapi="net.sf.basedb.core.data">FileType</classname>
1217          which is the generic type of data in the file. This allows us to query
1218          the database for, as an example, files with the generic type
1219          <constant>FileType.RAW_DATA</constant>. If we are in an Affymetrix
1220          experiment we will get the CEL file, for another platform we will
1221          get another file.
1222        </para>
1223        <para>
1224          The <varname>required</varname> flag in <classname docapi="net.sf.basedb.core.data">PlatformFileTypeData</classname>
1225          is used to signal that the file is a required file. This is not
1226          enforeced by the core. It is intended to be used by client applications
1227          for creating a better GUI and for validation of an experiment.
1228        </para>
1229
1230      </sect3>
1231     
1232      <sect3 id="data_api.platforms.files">
1233        <title>FileStoreEnabled items and data files</title>
1234       
1235        <para>
1236          An item must implement the <interfacename docapi="net.sf.basedb.core">FileStoreEnabledData</interfacename>
1237          interface to be able to store data in files instead of in the database.
1238          The interface creates a link to a <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object,
1239          which can hold several <classname docapi="net.sf.basedb.core.data">FileSetMemberData</classname> items.
1240          Each member points to specific <classname docapi="net.sf.basedb.core.data">FileData</classname> item.
1241          A file set can only store one file of each <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname>.
1242        </para>
1243       
1244      </sect3>
1245    </sect2>
1246
1247    <sect2 id="data_api.parameters">
1248      <title>Parameters</title>
1249     
1250      <para>
1251        This section gives an overview the generic parameter
1252        system in BASE that is used to store annotation values,
1253        plugin configuration values, job parameter values, etc.
1254      </para>
1255     
1256      <sect3 id="data_api.parameters.uml">
1257        <title>UML diagram</title>
1258       
1259        <figure id="data_api.figures.parameters">
1260          <title>Parameters</title>
1261          <screenshot>
1262            <mediaobject>
1263              <imageobject>
1264                <imagedata 
1265                  align="center"
1266                  fileref="figures/uml/datalayer.parameters.png" format="PNG" />
1267              </imageobject>
1268            </mediaobject>
1269          </screenshot>
1270        </figure>
1271      </sect3>
1272     
1273      <sect3 id="data_api.parameters.description">
1274        <title>Parameters</title>
1275       
1276        <para>
1277          The parameter system is a generic system that can store almost
1278          any kind of simple values (string, numbers, dates, etc.) and
1279          also links to other items. The <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> 
1280          class is an abstract base class that can hold multiple values (all must be of the
1281          same type). Unless only a specific type of values should be stored, this is
1282          the class that should be used when creating references for storing parameter
1283          values. It makes it possible for a single relaltion to use any kind of
1284          values or for a collection reference to mix multiple types of values.
1285          A typical use case maps a <classname>Map</classname> with the
1286          parameter name as the key:
1287        </para>
1288       
1289        <programlisting language="java">
1290private Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues;
1291/**
1292   Link parameter name with it's values.
1293   @hibernate.map table="`PluginConfigurationValues`" lazy="true" cascade="all"
1294   @hibernate.collection-key column="`pluginconfiguration_id`"
1295   @hibernate.collection-index column="`name`" type="string" length="255"
1296   @hibernate.collection-many-to-many column="`value_id`"
1297      class="net.sf.basedb.core.data.ParameterValueData"
1298*/
1299public Map&lt;String, ParameterValueData&lt;?&gt;&gt; getConfigurationValues()
1300{
1301   return configurationValues;
1302}
1303void setConfigurationValues(Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues)
1304{
1305   this.configurationValues = configurationValues;
1306}
1307</programlisting>
1308       
1309      <para>
1310      Now it is possible for the collection to store all types of values:
1311      </para>
1312     
1313      <programlisting language="java">
1314Map&lt;String, ParameterValueData&lt;?&gt;&gt; config = ...
1315config.put("names", new StringParameterValueData("A", "B", "C"));
1316config.put("sizes", new IntegerParameterValueData(10, 20, 30));
1317
1318// When you later load those values again you have to cast
1319// them to the correct class.
1320List&lt;String&gt; names = (List&lt;String&gt;)config.get("names").getValues();
1321List&lt;Integer&gt; sizes = (List&lt;Integer&gt;)config.get("sizes").getValues();
1322</programlisting>
1323
1324      </sect3>
1325     
1326    </sect2>
1327
1328    <sect2 id="data_api.annotations">
1329      <title>Annotations</title>
1330     
1331      <para>
1332        This section gives an overview of how the BASE annotation
1333        system works.
1334      </para>
1335     
1336      <sect3 id="data_api.annotations.uml">
1337        <title>UML diagram</title>
1338       
1339        <figure id="data_api.figures.annotations">
1340          <title>Annotations</title>
1341          <screenshot>
1342            <mediaobject>
1343              <imageobject>
1344                <imagedata 
1345                  align="center"
1346                  fileref="figures/uml/datalayer.annotations.png" format="PNG" />
1347              </imageobject>
1348            </mediaobject>
1349          </screenshot>
1350        </figure>
1351      </sect3>
1352     
1353      <sect3 id="data_api.annotations.description">
1354        <title>Annotations</title>
1355       
1356        <para>
1357        An item must implement the <interfacename docapi="net.sf.basedb.core.data">AnnotatableData</interfacename>
1358        interface to be able to use the annotation system. This interface gives
1359        a link to a <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> item. This class
1360        encapsulates all annotations for the item. There are two types of
1361        annotations:
1362        </para>
1363       
1364        <itemizedlist>
1365        <listitem>
1366          <para>
1367          <emphasis>Primary annotations</emphasis> are annotations that
1368          explicitely belong to the item. An annotation set can contain
1369          only one primary annotation of each annotation type. The primary
1370          annotation are linked with the <property>annotations</property>
1371          property. This property is a map with an
1372          <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname>  as the key.
1373          </para>
1374        </listitem>
1375       
1376        <listitem>
1377          <para>
1378          <emphasis>Inherited annotations</emphasis> are annotations
1379          that belong to a parent item, but that we want to use on
1380          another item as well. Inherited annotations are saved as
1381          references to either a single annotation or to another
1382          annotation set. Thus, it is possible for an item to inherit
1383          multiple annotations of the same annotation type.
1384          </para>
1385        </listitem>
1386        </itemizedlist>
1387       
1388        <para>
1389          The <classname docapi="net.sf.basedb.core.data">AnnotationData</classname> class is also
1390          just a placeholder. It connects the annotation set and
1391          annotation type with a <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname>
1392          object. This is the object that holds the actual annotation
1393          values.
1394        </para>
1395       
1396      </sect3>
1397     
1398      <sect3 id="data_api.annotations.types">
1399        <title>Annotation types</title>
1400       
1401        <para>
1402        Instances of the <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname> class
1403        defines the various annotations. It must have a <property>valueType</property> 
1404        property which cannot be changed. The value of this property controls
1405        which <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> subclass is used to store
1406        the annotation values, ie. <classname docapi="net.sf.basedb.core.data">IntegerParameterValueData</classname>,
1407        <classname docapi="net.sf.basedb.core.data">StringParameterValueData</classname>, etc.
1408        The <property>multiplicity</property> property holds the maximum allowed
1409        number of values for an annotation, or 0 if an unlimited number is
1410        allowed.
1411        </para>
1412       
1413        <para>
1414        The <property>itemTypes</property> collection holds the codes for
1415        the types of items the annotation type can be used on. This is
1416        checked when new annotations are created but already existing
1417        annotations are not affected if the collection is modified.
1418        </para>
1419       
1420        <para>
1421        Annotation types with the <property>protocolParameter</property> flag set
1422        are treated a bit differently. They will not show up as annotations
1423        to items with a type found in the <property>itemTypes</property> collection.
1424        A protocol parameter should be attached to a protocol. Then, when an item
1425        is using that protocol it becomes possible to add annotation values for
1426        the annotation types specified as protocol parameters. It doesn't matter
1427        if the item's type is found in the <property>itemTypes</property> 
1428        collection or not.
1429        </para>
1430       
1431        <para>
1432        The <property>options</property> collection is used to store additional
1433        options required by some of the value types, for example a max string
1434        length for string annotations or the max and min allowed value for
1435        integer annotations.
1436        </para>
1437       
1438        <para>
1439        The <property>enumeration</property> property is a boolean flag
1440        indicating if the allowed values are predefined as an enumeration.
1441        In that case those values are found in the <property>enumerationValues</property>
1442        property. The actual subclass is determined by the <property>valueType</property>
1443        property.
1444        </para>
1445       
1446        <para>
1447        Most of the other properties are hints to client applications how
1448        to render the input field for the annotation.
1449        </para>
1450       
1451      </sect3>
1452     
1453      <sect3 id="data_api.annotations.units">
1454        <title>Units</title>
1455        <para>
1456        Numerical annotation values can have units. A unit is described by
1457        a <classname docapi="net.sf.basedb.core.data">UnitData</classname> object.
1458        Each unit belongs to a <classname docapi="net.sf.basedb.core.data">QuantityData</classname> 
1459        object which defines the class of units. For example, if the quantity is
1460        <emphasis>weight</emphasis>, we can have units, <emphasis>kg</emphasis>,
1461        <emphasis>mg</emphasis>, <emphasis>µg</emphasis>, etc. The <classname>UnitData</classname>
1462        contains a factor and offset that relates all units to a common reference
1463        defined by the <classname>QuantityData</classname> class. For example,
1464        <emphasis>1 meter</emphasis> is the reference unit for distance, and we
1465        have <code>1 meter * 0.001 = 1 millimeter</code>. In this case, the factor is
1466        <emphasis>0.001</emphasis> and the offset 0. Another example is the relationship between
1467        kelvin and Celsius, which is <code>1 kelvin + 273.15 = 1 °Celsius</code>.
1468        Here, the factor is 1 and the offset is <emphasis>+273.15</emphasis>.
1469        The <classname
1470        docapi="net.sf.basedb.core.data">UnitSymbolData</classname>
1471        is used to make it possible to assign alternative symbols to a single unit.
1472        This is needed to simplify input where it may be hard to know what to
1473        type to get <emphasis></emphasis> or <emphasis>°C</emphasis>. Instead,
1474        <emphasis>m2</emphasis> and <emphasis>C</emphasis> can be used as
1475        alternative symbols.
1476        </para>
1477       
1478        <para>
1479        The creator of an annotation type may select a
1480        <classname>QuantityData</classname>, which can't be changed later, and
1481        a default <classname>UnitData</classname>. When entering annotation values
1482        a user may select any unit for the selected quantity (unless annotation type
1483        owner has limited this by selecting <varname>usableUnits</varname>). Before
1484        the values are stored in the database, they are converted to the default
1485        unit. This makes it possible to compare and filter on annotation values
1486        using different units. For example, filtering with <emphasis>&gt;5mg</emphasis> 
1487        also finds items that are annotated with <emphasis>2g</emphasis>.
1488        </para>
1489       
1490        <para>
1491        The core should automatically update the stored annotation values if
1492        the default unit is changed for an annotation type, or if the reference
1493        factor for a unit is changed.
1494        </para>
1495      </sect3>
1496     
1497      <sect3 id="data_api.annotations.categories">
1498        <title>Categories</title>
1499       
1500        <para>
1501        The <classname docapi="net.sf.basedb.core.data">AnnotationTypeCategoryData</classname> class defines
1502        categories that are used to group annotation types that are related to
1503        each other. This information is mainly useful for client applications
1504        when displaying forms for annotating items, that wish to provide a
1505        clearer interface when there are many (say 50+) annotations type for
1506        an item. An annotation type can belong to more than one category.
1507        </para>
1508       
1509      </sect3>
1510     
1511    </sect2>
1512
1513    <sect2 id="data_api.protocols">
1514      <title>Protocols</title>
1515
1516      <para>
1517        This section gives an overview of how protocols that describe various
1518        processes, such as sampling, extraction and scanning, are used in BASE.
1519      </para>
1520     
1521      <sect3 id="data_api.protocols.uml">
1522        <title>UML diagram</title>
1523       
1524        <figure id="data_api.figures.protocols">
1525          <title>Protocols</title>
1526          <screenshot>
1527            <mediaobject>
1528              <imageobject>
1529                <imagedata 
1530                  align="center"
1531                  fileref="figures/uml/datalayer.protocols.png" format="PNG" />
1532              </imageobject>
1533            </mediaobject>
1534          </screenshot>
1535        </figure>
1536      </sect3>
1537     
1538      <sect3 id="data_api.protocols.description">
1539        <title>Protocols</title>
1540       
1541        <para>
1542        A protocol is something that defines a procedure or recipe for some
1543        kind of action, such as sampling, extraction and scanning. In BASE we only
1544        store a short name and description. It is possible to attach a file
1545        that provides a longer description of the procedure.
1546        </para>
1547     
1548      </sect3>
1549     
1550      <sect3 id="data_api.protocols.parameters">
1551        <title>Parameters</title>
1552       
1553        <para>
1554        The procedure described by the protocol may have parameters
1555        that are set indepentently each time the protocol is used. It
1556        could for example be a temperature, a time or something else.
1557        The definition of parameters is done by creating annotation
1558        types and attaching them to the protocol. It is only possible
1559        to attach annotation types which has the <property>protocolParameter</property>
1560        property set to <constant>true</constant>. The same annotation type
1561        can be used for more than one protocol, but only do this if the
1562        parameters actually has the same meaning.
1563        </para>
1564     
1565      </sect3>
1566     
1567    </sect2>
1568
1569    <sect2 id="data_api.plugins">
1570      <title>Plug-ins, jobs and job agents</title>
1571     
1572      <para>
1573         This section gives an overview of plug-ins, jobs and job agents.
1574      </para>
1575     
1576      <itemizedlist>
1577        <title>See also</title>
1578        <listitem><xref linkend="plugins.installation" /></listitem>
1579        <listitem><xref linkend="installation_upgrade.jobagents" /></listitem>
1580      </itemizedlist>
1581     
1582      <sect3 id="data_api.plugins.uml">
1583        <title>UML diagram</title>
1584       
1585        <figure id="data_api.figures.plugins">
1586          <title>Plug-ins, jobs and job agents</title>
1587          <screenshot>
1588            <mediaobject>
1589              <imageobject>
1590                <imagedata 
1591                  align="center"
1592                  scalefit="1" width="100%"
1593                  fileref="figures/uml/datalayer.plugins.png" format="PNG" />
1594              </imageobject>
1595            </mediaobject>
1596          </screenshot>
1597        </figure>
1598      </sect3>
1599
1600      <sect3 id="data_api.plugins.plugins">
1601        <title>Plug-ins</title>
1602       
1603        <para>
1604          The <classname docapi="net.sf.basedb.core.data">PluginDefinitionData</classname> holds information of the
1605          installed plugin classes. Much of the information is copied from the
1606          plug-in itself from the <classname docapi="net.sf.basedb.core.plugin">About</classname> object and by checking
1607          which interfaces it implements.
1608        </para>
1609       
1610        <para>
1611          There are five main types of plug-ins:
1612        </para>
1613       
1614        <itemizedlist>
1615        <listitem>
1616          <para>
1617          IMPORT (mainType = 1): A plug-in that imports data to BASE.
1618          </para>
1619        </listitem>
1620        <listitem>
1621          <para>
1622          EXPORT (mainType = 2): A plug-in that exports data from BASE.
1623          </para>
1624        </listitem>
1625        <listitem>
1626          <para>
1627          INTENSITY (mainType = 3): A plug-in that calculates intensity values
1628          from raw data.
1629          </para>
1630        </listitem>
1631        <listitem>
1632          <para>
1633          ANALYZE (mainType = 4): A plug-in that analyses data.
1634          </para>
1635        </listitem>
1636        <listitem>
1637          <para>
1638          OTHER (mainType = 5): Any other plug-in.
1639          </para>
1640        </listitem>
1641        </itemizedlist>
1642       
1643        <para>
1644          A plug-in may have different configurations. The flags <property>supportsConfigurations</property>
1645          and <property>requiresConfiguration</property> are used to specify if a plug-in
1646          must have or can't have any configurations. Configuration parameter values are
1647          versioned. Each time anyone updates a configuration the version number
1648          is increased and the parameter values are stored as a new entity.
1649          This is required because we want to be able to know exactly which
1650          parameters a job were using when it was executed. When a job is
1651          created we also store the parameter version number
1652          (<property>JobData.parameterVersion</property>). This means that even if
1653          someone changes the configuration later we will always know which
1654          parameters the job used.
1655        </para>
1656       
1657        <para>
1658          The <classname docapi="net.sf.basedb.core.data">PluginTypeData</classname> class is ued to group
1659          plug-ins that share some common functionality, by implementing
1660          additional (optional) interfaces. For example, the
1661          <interfacename docapi="net.sf.basedb.core.plugin">AutoDetectingImporter</interfacename> should be implemented
1662          by import plug-ins that supports automatic detection of file formats.
1663          Another example is the <interfacename docapi="net.sf.basedb.core.plugin">AnalysisFilterPlugin</interfacename>
1664          interface which should be implemented by all analysis plug-ins that
1665          only filters data.
1666        </para>
1667
1668      </sect3>
1669     
1670      <sect3 id="data_api.plugins.jobs">
1671        <title>Jobs</title>
1672       
1673        <para>
1674          A job represents a single invokation of a plug-in to do some work.
1675          The <classname docapi="net.sf.basedb.core.data">JobData</classname> class holds information about this.
1676          A job is usuallu executed by a plug-in, but doesn't have to be. The
1677          <property>status</property> property holds the current state of a job.
1678        </para>
1679       
1680        <itemizedlist>
1681        <listitem>
1682          <para>
1683            UNCONFIGURED (status = 0): The job is not yet ready to be executed.
1684          </para>
1685        </listitem>
1686        <listitem>
1687          <para>
1688            WAITING (status = 1): The job is waiting to be executed.
1689          </para>
1690        </listitem>
1691        <listitem>
1692          <para>
1693            PREPARING (status = 5): The job is about to be executed but hasn't started yet.
1694          </para>
1695        </listitem>
1696        <listitem>
1697          <para>
1698            EXECUTING (status = 2): The job is currently executing.
1699          </para>
1700        </listitem>
1701        <listitem>
1702          <para>
1703            DONE (status = 3): The job finished successfully.
1704          </para>
1705        </listitem>
1706        <listitem>
1707          <para>
1708            ERROR (status = 4): The job finished with an error.
1709          </para>
1710        </listitem>
1711        </itemizedlist>
1712      </sect3>
1713
1714      <sect3 id="data_api.plugins.agents">
1715        <title>Job agents</title>
1716       
1717        <para>
1718          A job agent is a program running on the same or a different server that
1719          is regularly checking for jobs that are waiting to be executed. The
1720          <classname docapi="net.sf.basedb.core.data">JobAgentData</classname> holds information about a job agent
1721          and the <classname docapi="net.sf.basedb.core.data">JobAgentSettingsData</classname> links the agent
1722          with the plug-ins the agent is able to execute. The job agent will only
1723          execute jobs that are owner by users or projects that the job agent has
1724          been shared to with at least use permission. The <property>priorityBoost</property>
1725          property can be used to give specific plug-ins higher priority.
1726          Thus, for a job agent it is possible to:
1727        </para>
1728       
1729        <itemizedlist>
1730        <listitem>
1731          <para>
1732          Specify exactly which plug-ins it will execute. For example, it is possible
1733          to dedicate one agent to only run one plug-in.
1734          </para>
1735        </listitem>
1736        <listitem>
1737          <para>
1738          Give some plug-ins higher priority. For example a job agent that is mainly
1739          used for importing data should give higher priority to all import plug-ins.
1740          Other types of jobs will have to wait until there are no more data to be
1741          imported.
1742          </para>
1743        </listitem>
1744        <listitem>
1745          <para>
1746          Specify exactly which users/groups/projects that may use the agent. For
1747          example, it is possible to dedicate one agent to only run jobs for a certain
1748          project.
1749          </para>
1750        </listitem>
1751        </itemizedlist>
1752       
1753      </sect3>
1754
1755
1756    </sect2>
1757   
1758    <sect2 id="data_api.biomaterials">
1759      <title>Biomaterials</title>
1760     
1761      <sect3 id="data_api.biomaterials.uml">
1762        <title>UML diagram</title>
1763       
1764        <figure id="data_api.figures.biomaterials">
1765          <title>Biomaterials</title>
1766          <screenshot>
1767            <mediaobject>
1768              <imageobject>
1769                <imagedata 
1770                  align="center"
1771                  fileref="figures/uml/datalayer.biomaterials.png" format="PNG" />
1772              </imageobject>
1773            </mediaobject>
1774          </screenshot>
1775        </figure>
1776      </sect3>
1777     
1778      <sect3 id="data_api.biomaterials.description">
1779        <title>Biomaterials</title>
1780       
1781        <para>
1782          There are four types of biomaterials: <classname docapi="net.sf.basedb.core.data">BioSourceData</classname>,
1783          <classname docapi="net.sf.basedb.core.data">SampleData</classname>, <classname docapi="net.sf.basedb.core.data">ExtractData</classname> and
1784          <classname docapi="net.sf.basedb.core.data">LabeledExtractData</classname>.
1785          All four types of are derived from the base class <classname docapi="net.sf.basedb.core.data">BioMaterialData</classname>.
1786          The reason for this is that they all share common functionality such as pooling
1787          and events. By using a common base class we do not have to create duplicate
1788          classes for keeping track of events and parents.
1789        </para>
1790       
1791        <para>
1792          The <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> is the simplest of the biomaterials.
1793          It cannot have parents and can't participate in events. It's only used as a
1794          (non-required) parent for samples.
1795        </para>
1796       
1797        <para>
1798          The <classname docapi="net.sf.basedb.core.data">MeasuredBioMaterialData</classname> class is used as a base
1799          class for the other three biomaterial types. It introduces quantity
1800          measurements and can store original and remaining quantities. They are
1801          both optional. If an original quantity has been specified the core
1802          automatically calculates the remaining quantity based on the events a
1803          biomaterial participates in.
1804        </para>
1805       
1806        <para>
1807          All measured biomaterial have at least one event associated with them,
1808          the creation event, which holds information about the creation of the
1809          biomaterial. A measured biomaterial can be created in three ways:
1810        </para>
1811       
1812        <itemizedlist>
1813        <listitem>
1814          <para>
1815          From a single item of the parent type. Biosource is the parent type of
1816          samples, sample is the parent type of extracts, and extract is the
1817          parent type of labeled extracts. In this case the
1818          <property>pooled</property> property is <constant>false</constant>
1819          and the parent is specified in the <property>parent</property> property.
1820          If the parent is not a <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> this information
1821          is duplicated, with the addition of an optional used quantity value, in the
1822          <property>sources</property> collection of the <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1823          object representing the creation event. It is the responsibility of the
1824          core to make sure that everything is properly synchronized and that
1825          remaining quantities are calculated.
1826          </para>
1827        </listitem>
1828       
1829        <listitem>
1830          <para>
1831          From one or more items of the same type, i.e pooling.
1832          In this case the <property>pooled</property> property is <constant>true</constant> 
1833          and the <property>parent</property> property is null. All source
1834          biomaterials are contained in the <property>sources</property> collection.
1835          The core is still responsible for keeping everything synchronized and to
1836          update remaining quantities.
1837          </para>
1838        </listitem>
1839       
1840        <listitem>
1841          <para>
1842          As a standalone biomaterial without parents.
1843          </para>
1844        </listitem>
1845        </itemizedlist>
1846
1847      </sect3>
1848     
1849      <sect3 id="data_api.biomaterials.events">
1850        <title>Biomaterial events</title>
1851       
1852        <para>
1853          An event represents something that happened to one or more biomaterials, for example
1854          the creation of another biomaterial. The <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1855          holds information about entry and event dates, protocols used, the user who is
1856          responsible, etc. There are three types of events represented by the <property>eventType</property>
1857          property.
1858        </para>
1859       
1860        <orderedlist>
1861        <listitem>
1862          <para>
1863          <emphasis>Creation event</emphasis>: This event represents the creation of a (measured)
1864          biomaterial. The <property>sources</property> collection contains
1865          information about the biomaterials that were used to create the new
1866          biomaterial. If the biomaterial is a pooled biomaterial all sources must
1867          be of the same type. Otherwise there can only be one source of the parent
1868          type. These rules are maintained by the core.
1869          </para>
1870        </listitem>
1871       
1872        <listitem>
1873          <para>
1874          <emphasis>Hybridization event</emphasis>: This event represents the creation
1875          of a hybridization. This event type is needed because we want to keep track
1876          of quantities for labeled extracts. This event has a hybridization as a
1877          product instead of a biomaterial. The sources collection can only contain
1878          labeled extracts.
1879          </para>
1880        </listitem>
1881
1882        <listitem>
1883          <para>
1884          <emphasis>Other event</emphasis>: This event represents some other important
1885          information about a single biomaterial that affected the remaining quantity.
1886          This event type doesn't have any sources.
1887          </para>
1888        </listitem>
1889        </orderedlist>
1890      </sect3>
1891 
1892    </sect2>
1893
1894    <sect2 id="data_api.plates">
1895      <title>Array LIMS - plates</title>
1896
1897      <sect3 id="data_api.plates.uml">
1898        <title>UML diagram</title>
1899       
1900        <figure id="data_api.figures.plates">
1901          <title>Array LIMS - plates</title>
1902          <screenshot>
1903            <mediaobject>
1904              <imageobject>
1905                <imagedata 
1906                  align="center"
1907                  scalefit="1" width="100%"
1908                  fileref="figures/uml/datalayer.plates.png" format="PNG" />
1909              </imageobject>
1910            </mediaobject>
1911          </screenshot>
1912        </figure>
1913      </sect3>
1914
1915      <sect3 id="data_api.plates.description">
1916        <title>Plates</title>
1917       
1918        <para>
1919          The <classname docapi="net.sf.basedb.core.data">PlateData</classname> is the main class holding information
1920          about a single plate. The associated <classname docapi="net.sf.basedb.core.data">PlateGeometryData</classname>
1921          defines how many rows and columns there are on a plate. Since this
1922          information is used to create wells, and for various other checks it is
1923          not possible to change the number of rows or columns once a geometry has
1924          been created.
1925        </para>
1926         
1927        <para>
1928          All plates must have a <classname docapi="net.sf.basedb.core.data">PlateTypeData</classname> which defines
1929          the geometry and a set of event types (see below).
1930        </para>
1931       
1932        <para>
1933          If the destroyed flag of a plate is set it is not allowed to use the
1934          plate for a plate mapping or to create array designs. However, it
1935          is possible to change the flag to not destroyed.
1936        </para>
1937
1938        <para>
1939          The barcode is intended to be used as an external identifier of the plate.
1940          But, the core doesn't care about the value or if it is unique or not.
1941        </para>
1942      </sect3>
1943     
1944      <sect3 id="data_api.plates.events">
1945        <title>Plate events</title>
1946
1947        <para>
1948          The plate type defines a set of <classname docapi="net.sf.basedb.core.data">PlateEventTypeData</classname>
1949          objects, each one represening a particular event a plate of this type
1950          usually goes trough. For a plate of a certain type, it is possible to
1951          attach exactly one event of each event type. The event type defines an
1952          optional protocol type, which can be used by client applications to
1953          filter a list of protocols for the event. The core doesn't check that
1954          the selected protocol for an event is of the same protocol type as
1955          defined by the event type.
1956        </para>
1957
1958        <para>
1959          The ordinal value can be used as a hint to client applications in
1960          which order the events actually are performed in the lab. The core doesn't
1961          care about this value or if several event types have the same value.
1962        </para>
1963      </sect3>
1964
1965      <sect3 id="data_api.plates.mappings">
1966        <title>Plate mappings</title>
1967       
1968        <para>
1969          A plate can be created either from scratch, with the help of the information
1970          in a <classname docapi="net.sf.basedb.core.data">PlateMappingData</classname>, from a set of parent plates.
1971          In the first case it is possible to specify a reporter for each well on the
1972          plate. In the second case the mapping code creates all the wells and links
1973          them to the parent wells on the parent plates. Once the plate has been saved
1974          to the database, the wells cannot be modified (because they are used
1975          downstream for various validation, etc.)
1976        </para>
1977       
1978        <para>
1979          The details in a plate mapping are simply coordinates that for each
1980          destination plate, row and column define a source plate, row and column.
1981          It is possible for a single source well to be mapped to multiple destination
1982          wells, but for each destination well only a single source well can be
1983          used.
1984        </para>
1985       
1986      </sect3>
1987
1988    </sect2>
1989
1990    <sect2 id="data_api.arrays">
1991      <title>Array LIMS - arrays</title>
1992     
1993      <sect3 id="data_api.arrays.uml">
1994        <title>UML diagram</title>
1995       
1996        <figure id="data_api.figures.arrays">
1997          <title>Array LIMS - arrays</title>
1998          <screenshot>
1999            <mediaobject>
2000              <imageobject>
2001                <imagedata 
2002                  align="center"
2003                  fileref="figures/uml/datalayer.arrays.png" format="PNG" />
2004              </imageobject>
2005            </mediaobject>
2006          </screenshot>
2007        </figure>
2008      </sect3>
2009     
2010      <sect3 id="data_api.arrays.designs">
2011        <title>Array designs</title>
2012       
2013        <para>
2014          Array designs are stored in <classname docapi="net.sf.basedb.core.data">ArrayDesignData</classname> objects
2015          and can be created either as standalone designs or
2016          from plates. In the first case the features on an array design
2017          are described by a reporter map. A reporter map is a file
2018          that maps a coordinate (block, meta-grid, row, column),
2019          position or an external ID on an array design to a
2020          reporter. Which method to use is given by the
2021          <property>ArrayDesign.featureIdentificationMethod</property> property.
2022          The coordinate system on an array design is divided into blocks.
2023          Each block can be identified either by a <property>blockNumber</property>
2024          or by meta coordinates. This information is stored in
2025          <classname docapi="net.sf.basedb.core.data">ArrayDesignBlockData</classname> items. Each block
2026          contains several <classname docapi="net.sf.basedb.core.data">FeatureData</classname> items, each
2027          one identified by a row and column coordinate. Platforms that doesn't
2028          divide the array design into blocks or doesn't use the coordinate system at all
2029          must still create a single super-block that holds all features.
2030        </para>
2031       
2032        <para>
2033          Array designs that are created from plates use a print map file
2034          instead of a reporter map. A print map is similar to a plate mapping
2035          but maps features (instead of wells) to wells. The file should
2036          specifify which plate and well a feature is created from. Reporter
2037          information will automatically be copied by BASE from the well.
2038        </para>
2039       
2040        <para>
2041          It is also possible to skip the importing of features into the
2042          database and just keep the data in the orginal files instead.
2043          This is typically done for Affymetrix CDF files.
2044        </para>
2045       
2046      </sect3>
2047     
2048      <sect3 id="data_api.arrays.slides">
2049        <title>Array slides</title>
2050       
2051        <para>
2052          The <classname docapi="net.sf.basedb.core.data">ArraySlideData</classname> represents a single
2053          array. Arrays are usually printed several hundreds in a batch,
2054          represented by a <classname docapi="net.sf.basedb.core.data">ArrayBatchData</classname> item.
2055          The <property>batchIndex</property> is the ordinal number of the
2056          array in the batch. The <property>barcode</property> can be used
2057          as a means for external programs to identify the array. BASE doesn't
2058          care if a value is given or if they are unique or not. If the
2059          <property>destroyed</property> flag is set it prevents a slide from
2060          beeing used by a hybridization.
2061        </para>
2062
2063      </sect3>
2064    </sect2>
2065
2066    <sect2 id="data_api.rawdata">
2067      <title>Hybridizations and raw data</title>
2068     
2069      <sect3 id="data_api.rawdata.uml">
2070        <title>UML diagram</title>
2071       
2072        <figure id="data_api.figures.rawdata">
2073          <title>Hybridizations and raw data</title>
2074          <screenshot>
2075            <mediaobject>
2076              <imageobject>
2077                <imagedata 
2078                  align="center"
2079                  scalefit="1" width="100%"
2080                  fileref="figures/uml/datalayer.rawdata.png" format="PNG" />
2081              </imageobject>
2082            </mediaobject>
2083          </screenshot>
2084        </figure>
2085      </sect3>
2086     
2087      <sect3 id="data_api.rawdata.hybridizations">
2088        <title>Hybridizations</title>
2089       
2090        <para>
2091        Hybridizations connects the slides from the Array LIMS part
2092        with labeled extracts from the biomaterials part. The <property>creationEvent</property>
2093        is used to register which labeled extracts that were used on the hybridization.
2094        The relation to slides is a one-to-one relation. A slide can only be used on
2095        a single hybridization and a hybridization can only use a single slide. The relation
2096        is optional from both sides.
2097        </para>
2098
2099        <para>
2100        The scanning of the hybridized slide is registered as separate scan events.
2101        One or more images can optionally be attached to each scan.
2102        The images are not used by BASE.
2103        </para>
2104       
2105      </sect3>
2106     
2107      <sect3 id="data_api.rawdata.description">
2108        <title>Raw data</title>
2109       
2110        <para>
2111        A <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname> object represents
2112        the raw data that is produced by analysing the image(s) from a
2113        single scan. You may register which software that was used, the
2114        protocol and any parameters (through the annotation system).
2115        </para>
2116
2117        <para>
2118        Files with the analysed data values can be attached to the
2119        associated <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object. The platform
2120        and, optionally, the variant has information about the file types
2121        that can be used for that platform. If the platform file types support
2122        metadata extraction, headers, the number of spots, and other
2123        information may be automatically extracted from the raw data file(s).
2124        </para>
2125       
2126        <para>
2127        If the platform support it, raw data can also be imported into the database.
2128        This is handled by batchers and <classname docapi="net.sf.basedb.core.data">RawData</classname> objects.
2129        Which table to store the data in depends on the <property>rawDataType</property>
2130        property. The properties shown for the <classname>RawData</classname> class
2131        in the diagram are the mandatory properties. Each raw data type defines additional
2132        properties that are specific to that raw data type.
2133        </para>
2134       
2135      </sect3>
2136     
2137      <sect3 id="data_api.rawdata.spotimages">
2138        <title>Spot images</title>
2139       
2140        <para>
2141        Spot images can be created if you have the original image
2142        files. BASE can use 1-3 images as sources for the red, green
2143        and blue channel respectively. The creation of spotimages requires
2144        that x and y coordinates are given for each raw data spot. The scaling
2145        and offset values are used to convert the coordinates to pixel
2146        coordinates. With this information BASE is able to cut out a square
2147        from the source images that, theoretically, contains a specific spot and
2148        nothing else. The spot images are gamma-corrected independently and then
2149        put together into PNG images that are stored in a zip file.
2150        </para>
2151      </sect3>
2152     
2153    </sect2>
2154
2155    <sect2 id="data_api.experiments">
2156      <title>Experiments and analysis</title>
2157     
2158     
2159      <sect3 id="data_api.experiments.uml">
2160        <title>UML diagram</title>
2161       
2162        <figure id="data_api.figures.experiments">
2163          <title>Experiments</title>
2164          <screenshot>
2165            <mediaobject>
2166              <imageobject>
2167                <imagedata 
2168                  align="center"
2169                  scalefit="1" width="75%"
2170                  fileref="figures/uml/datalayer.experiments.png" format="PNG" />
2171              </imageobject>
2172            </mediaobject>
2173          </screenshot>
2174        </figure>
2175      </sect3>
2176     
2177      <sect3 id="data_api.experiments.description">
2178        <title>Experiments</title>
2179       
2180        <para>
2181          The <classname docapi="net.sf.basedb.core.data">ExperimentData</classname> 
2182          class is used to collect information about a single experiment. It
2183          links to any number of <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname>
2184          items, which must all be of the same <classname 
2185          docapi="net.sf.basedb.core">RawDataType</classname>.
2186        </para>
2187       
2188        <para>
2189          Annotation types that are needed in the analysis must connected to
2190          the experiment as experimental factors and the annotation values should
2191          be set on or inherited by each raw bioassay that is part of the
2192          experiment.
2193        </para>
2194       
2195        <para>
2196          The directory connected to the experiment is the default directory
2197          where plugins that generate files should store them.
2198        </para>
2199      </sect3>
2200           
2201      <sect3 id="data_api.experiments.bioassays">
2202        <title>Bioassay sets, bioassays and transformations</title>
2203       
2204        <para>
2205          Each line of analysis starts with the creation of a <emphasis>root</emphasis>
2206          <classname docapi="net.sf.basedb.core.data">BioAssaySetData</classname>,
2207          which holds the intensities calculated from the raw data.
2208          A bioassayset can hold one intensity for each channel. The number of
2209          channels is defined by the raw data type. For each raw bioassay used a
2210          <classname docapi="net.sf.basedb.core.data">BioAssayData</classname>
2211          is created.
2212        </para>
2213       
2214        <para>
2215          Information about the process that calculated the intensities are
2216          stored in a <classname docapi="net.sf.basedb.core.data">TransformationData</classname>
2217          object. The root transformation links with the raw bioassays that are used
2218          in this line of analysis and to a <classname 
2219          docapi="net.sf.basedb.core.data">JobData</classname> which has information
2220          about which plug-in and parameters that was used in the calculation.
2221        </para>
2222     
2223        <para>
2224          Once the root bioassayset has been created it is possible to
2225          again apply a transformation to it. This time the transformation
2226          links to a single source bioassayset instead of the raw bioassays.
2227          As before, it still links to a job with information about the plug-in and
2228          parameters that does the actual work. The transformation must make sure
2229          that new bioassays are created and linked to the bioassays in the
2230          source bioassayset. This above process may be repeated as many times
2231          as needed.
2232        </para>
2233       
2234        <para>
2235          Data to a bioassay set can only be added to it before it has been
2236          committed to the database. Once the transaction has been committed
2237          it is no longed possible to add more data or to modify existing
2238          data.
2239        </para>
2240     
2241      </sect3>
2242
2243      <sect3 id="data_api.experiments.virtualdb">
2244        <title>Virtual databases, datacubes, etc.</title>
2245       
2246        <para>
2247          The above processes requires a flexible storage solution for the data.
2248          Each experiment is related to a <classname docapi="net.sf.basedb.core.data">VirtualDb</classname>
2249          object. This object represents the set of tables that are needed to store
2250          data for the experiment. All tables are created in a special part of the
2251          BASE database that we call the <emphasis>dynamic database</emphasis>.
2252          In MySQL the dynamic database is a separate database, in Postgres it is
2253          a separate schema.
2254        </para>
2255       
2256        <para>
2257          A virual database is divided into data cubes. A data cube can be seen
2258          as a three-dimensional object where each point can hold data that in
2259          most cases can be interpreted as data for a single spot from an
2260          array. The coordinates to a point is given by <emphasis>layer</emphasis>,
2261          <emphasis>column</emphasis> and <emphasis>position</emphasis>. The
2262          layer and column coordinates are represented by the
2263          <classname docapi="net.sf.basedb.core.data">DataCubeLayerData</classname>
2264          and <classname docapi="net.sf.basedb.core.data">DataCubeColumnData</classname>
2265          objects. The position coordinate has no separate object associated with
2266          it.
2267        </para>
2268       
2269        <para>
2270          Data for a single bioassay set is always stored in a single layer. It
2271          is possible for more than one bioassay set to use the same layer. This
2272          usually happens for filtering transformations that doesn't modify the
2273          data.  The filtered bioassay set is then linked to a
2274          <classname docapi="net.sf.basedb.core.data">DataCubeFilterData</classname>
2275          object, which has information about which data points that
2276          passed the filter.
2277        </para>
2278       
2279        <para>
2280          All data for a bioassay is stored in a single column.
2281          Two bioassays in different bioassaysets (layers) can only have the same
2282          column if one is the parent of the other.
2283        </para>
2284       
2285        <para>
2286          The position coordinate is tied to a reporter.
2287        </para>
2288       
2289        <para>
2290          A child bioassay set may use the same data cube as it's parent
2291          bioassay set if all of the following conditions are true:
2292        </para>
2293       
2294        <itemizedlist>
2295        <listitem>
2296          <para>
2297          All positions are linked to the same reporter as the positions
2298          in the parent bioassay set.
2299          </para>
2300        </listitem>
2301       
2302        <listitem>
2303          <para>
2304          All data points are linked to the same (possible many) raw data
2305          spots as the corresponding data points in the parent bioassay set.
2306          </para>
2307        </listitem>
2308       
2309        <listitem>
2310          <para>
2311          The bioassays in the child bioassay set each have exactly one
2312          parent in the parent bioassay set. No parent bioassay may be the
2313          parent of more than one child bioassay.
2314          </para>
2315        </listitem>
2316        </itemizedlist>
2317       
2318        <para>
2319          If any of the above conditions are not true, a new data cube
2320          must be created for the child bioassay set.
2321        </para>
2322      </sect3>
2323     
2324      <sect3 id="data_api.dynamic.description">
2325        <title>The dynamic database</title>
2326
2327        <figure id="data_api.figures.dynamic">
2328          <title>The dynamic database</title>
2329          <screenshot>
2330            <mediaobject>
2331              <imageobject>
2332                <imagedata 
2333                  align="center"
2334                  fileref="figures/uml/datalayer.dynamic.png" format="PNG" />
2335              </imageobject>
2336            </mediaobject>
2337          </screenshot>
2338        </figure>
2339       
2340        <para>
2341          Each virtual database consists of several tables. The tables
2342          are dynamically created when needed. For each table shown in the diagram
2343          the # sign is replaced by the id of the virtual database object at run
2344          time.
2345        </para>
2346       
2347        <para>
2348          There are no classes in the data layer for these tables and they
2349          are not mapped with Hibernate. When we work with these tables we
2350          are always using batcher classes and queries that works with integer,
2351          floats and strings.
2352        </para>
2353       
2354        <bridgehead>The D#Spot table</bridgehead>
2355        <para>
2356          This is the main table which keeps the intensities for a single spot
2357          in the data cube. Extra values attached to the spot are kept in separate
2358          tables, one for each type of value (D#SpotInt, D#SpotFloat and D#SpotString).
2359        </para>
2360       
2361        <bridgehead>The D#Pos table</bridgehead>
2362        <para>
2363          This table stores the reporter id for each position in a cube.
2364          Extra values attached to the position are kept in separate tables,
2365          one for each type of value (D#PosInt, D#PosFloat and D#PosString).
2366        </para>
2367       
2368        <bridgehead>The D#Filter table</bridgehead>
2369        <para>
2370          This table stores the coordinates for the spots that remain after
2371          filtering. Note that each filter is related to a bioassayset which
2372          gives the cube and layer values. Each row in the filter table then
2373          adds the column and position allowing us to find the spots in the
2374          D#Spot table.
2375        </para>
2376       
2377        <bridgehead>The D#RawParents table</bridgehead>
2378        <para>
2379          This table holds mappings for a spot to the raw data it is calculated
2380          from. We don't need the layer coordinate since all layers in a cube
2381          must have the same mapping to raw data.
2382        </para>
2383       
2384      </sect3>     
2385
2386     
2387    </sect2>
2388   
2389    <sect2 id="data_api.misc">
2390      <title>Other classes</title>
2391     
2392      <sect3 id="data_api.misc.uml">
2393        <title>UML diagram</title>
2394       
2395        <figure id="data_api.figures.misc">
2396          <title>Other classes</title>
2397          <screenshot>
2398            <mediaobject>
2399              <imageobject>
2400                <imagedata 
2401                  align="center"
2402                  fileref="figures/uml/datalayer.misc.png" format="PNG" />
2403              </imageobject>
2404            </mediaobject>
2405          </screenshot>
2406        </figure>
2407      </sect3>
2408     
2409    </sect2>
2410
2411  </sect1>
2412 
2413  <sect1 id="api_overview.core_api" chunked="1">
2414    <title>The Core API</title>
2415   
2416    <para>
2417      This section gives an overview of various parts of the core API.
2418    </para>
2419   
2420    <sect2 id="core_api.data_in_files">
2421      <title>Using files to store data</title>
2422     
2423      <para>
2424        BASE 2.5 introduced the possibility to use files to store data instead
2425        of importing it into the database. Files can be attached
2426        to any item that implements the <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2427        interface. Currently this is <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2428        and <classname docapi="net.sf.basedb.core">ArrayDesign</classname>. The
2429        ability to store data in files is not a replacement for storing data in the
2430        database. It is possible (for some platforms/raw data types) to have data in
2431        files and in the database at the same time. We would have liked to enforce
2432        that (raw) data is always present in files, but this will not be backwards compatible
2433        with older installations, so there are three cases:
2434      </para>
2435     
2436      <itemizedlist>
2437      <listitem>
2438        <para>
2439        Data in files only
2440        </para>
2441      </listitem>
2442      <listitem>
2443        <para>
2444        Data in the database only
2445        </para>
2446      </listitem>
2447      <listitem>
2448        <para>
2449        Data in both files and in the database
2450        </para>
2451      </listitem>
2452      </itemizedlist>
2453     
2454      <para>
2455        Not all three cases are supported for all types of data. This is controlled
2456        by the <classname docapi="net.sf.basedb.core">Platform</classname> class, which may disallow
2457        that data is stored in the database. To check this call
2458        <methodname>Platform.isFileOnly()</methodname> and/or
2459        <methodname>Platform.getRawDataType()</methodname>. If the <methodname>isFileOnly()</methodname>
2460        method returns <constant>true</constant>, the platform can't store data in
2461        the database. If the value is <constant>false</constant> more information
2462        can be obtained by calling <methodname>getRawDataType()</methodname>,
2463        which may return:
2464      </para>
2465     
2466      <itemizedlist>
2467      <listitem>
2468        <para>
2469          <constant>null</constant>: The platform can store data with any
2470          raw data type in the database.
2471        </para>
2472      </listitem>
2473      <listitem>
2474        <para>
2475        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == true</code>:
2476        The platform can store data in the database but only data with the specified raw
2477        data type.
2478        </para>
2479      </listitem>
2480      <listitem>
2481        <para>
2482        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == false</code>:
2483        The platform can't store data in the database.
2484        </para>
2485      </listitem>
2486      </itemizedlist>
2487
2488      <para>
2489        One major change from earlier BASE versions is that the registration of raw data types
2490        has changed. The <filename>raw-data-types.xml</filename> file should
2491        only be used for raw data types that are stored in the database. The
2492        <sgmltag>storage</sgmltag> tag has been deprecated and BASE will refuse
2493        to start if it finds a raw data type definitions with <code>storage="file"</code>.
2494      </para>
2495     
2496      <para>
2497        For backwards compatibility reasons, each <classname docapi="net.sf.basedb.core">Platform</classname>
2498        that can only store data in files will create "virtual" raw data type
2499        objects internally. These raw data types all return <constant>false</constant>
2500        from the <methodname>RawDataType.isStoredInDb()</methodname>
2501        method. They also have a back-link to the platform/variant that
2502        created it: <methodname>RawDataType.getPlatform()</methodname>
2503        and <methodname>RawDataType.getVariant()</methodname>. These two methods
2504        will always return <constant>null</constant> when called on a raw data type
2505        that can be stored in the database.
2506      </para>
2507     
2508      <itemizedlist>
2509        <title>See also</title>
2510        <listitem><xref linkend="data_api.platforms" /></listitem>
2511        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
2512        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
2513        <listitem>
2514          <xref linkend="appendix.incompatible.2.5" /> in
2515          <xref linkend="appendix.incompatible" />
2516        </listitem>
2517      </itemizedlist>
2518     
2519      <sect3 id="core_api.data_in_files.diagram">
2520        <title>Diagram of classes and methods</title>
2521        <figure id="core_api.figures.data_in_files">
2522          <title>Store data in files</title>
2523          <screenshot>
2524            <mediaobject>
2525              <imageobject>
2526                <imagedata 
2527                  align="center"
2528                  scalefit="1" width="100%"
2529                  fileref="figures/uml/corelayer.datainfiles.png" format="PNG" />
2530              </imageobject>
2531            </mediaobject>
2532          </screenshot>
2533        </figure>
2534       
2535        <para>
2536          This is rather large set of classes and methods. The ultimate goal
2537          is to be able to create links between a <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2538          / <classname docapi="net.sf.basedb.core">ArrayDesign</classname> and <classname docapi="net.sf.basedb.core">File</classname>
2539          items and to provide some metadata about the files.
2540          The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname> class is one of the most
2541          important ones. It is intended to make it easy for plug-in (and other)
2542          developers to access the files without having to mess with platform
2543          or file type objects. The API is best described
2544          by a set of use-case examples.
2545        </para>
2546       
2547      </sect3>
2548     
2549      <sect3 id="core_api.data_in_files.ask">
2550        <title>Use case: Asking the user for files for a given item</title>
2551
2552        <para>
2553          A client application must know what types of files it makes sense
2554          to ask the user for. In some cases, data may be split into more than
2555          one file so we need a generic way to select files.
2556        </para>
2557       
2558        <para>
2559          Given that we have a <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2560          item we want to find out which <classname docapi="net.sf.basedb.core">DataFileType</classname>
2561          items that can be used for that item. The
2562          <methodname>DataFileType.getQuery(FileStoreEnabled)</methodname>
2563          can be used for this. Internally, the method uses the result from
2564          <methodname>FileStoreEnabled.getPlatform()</methodname>
2565          and <methodname>FileStoreEnabled.getVariant()</methodname>
2566          methods to restrict the query to only return file types for
2567          a given platform and/or variant. If the item doesn't have
2568          a platform or variant the query will return all file types
2569          that are associated with the given item type. In any case, we get a list
2570          of <classname>DataFileType</classname> items, each one representing a
2571          specific file type that we should ask the user about. Examples:
2572        </para>
2573
2574        <orderedlist>
2575        <listitem>
2576          <para>
2577          The <constant>Affymetrix</constant> platform defines <constant>CEL</constant>
2578          as a raw data file and <constant>CDF</constant> as an array design (reporter map)
2579          file. If we have a <classname docapi="net.sf.basedb.core">RawBioAssay</classname> the query will only return
2580          the CEL file type and the client can ask the user for a CEL file.
2581          </para>
2582        </listitem>
2583        <listitem>
2584          <para>
2585          The <constant>Generic</constant> platform defines <constant>PRINT_MAP</constant>
2586          and <constant>REPORTER_MAP</constant> for array designs. If we have
2587          an <classname docapi="net.sf.basedb.core">ArrayDesign</classname> the query will return those two
2588          items.
2589          </para>
2590        </listitem>
2591        </orderedlist>
2592     
2593        <para>
2594          It might also be interesting to know the currently selected file
2595          for each file type and if the platform has set the <varname>required</varname>
2596          flag for a particular file type. Here is a simple code example
2597          that may be useful to start from:
2598        </para>
2599     
2600        <programlisting language="java">
2601DbControl dc = ...
2602FileStoreEnabled item = ...
2603Platform platform = item.getPlatform();
2604PlatformVariant variant = item.getVariant();
2605
2606// Get list of DataFileTypes used by the platform
2607ItemQuery&lt;DataFileType&gt; query =
2608   DataFileType.getQuery(item);
2609List&lt;DataFileType&gt; types = query.list(dc);
2610
2611// Always check hasFileSet() method first to avoid
2612// creating the file set if it doesn't exists
2613FileSet fileSet = item.hasFileSet() ?
2614   null : item.getFileSet();
2615   
2616for (DataFileType type : types)
2617{
2618   // Get the current file, if any
2619   FileSetMember member = fileSet == null || !fileSet.hasMember(type) ?
2620      null : fileSet.getMember(type);
2621   File current = member == null ?
2622      null : member.getFile();
2623   
2624   // Check if a file is required by the platform
2625   PlatformFileType pft = platform == null ?
2626      null : platform.getFileType(type, variant);
2627   boolean isRequired = pft == null ?
2628      false : pft.isRequired();
2629     
2630   // Now we can do something with this information to
2631   // let the user select a file ...
2632}
2633</programlisting>
2634     
2635        <note>
2636          <title>Also remember to catch PermissionDeniedException</title>
2637          <para>
2638            The above code may look complicated, but this is mostly because
2639            of all checks for <constant>null</constant> values. Remember
2640            that many things are optional and may return <constant>null</constant>.
2641            Another thing to look out for is
2642            <exceptionname>PermissionDeniedException</exceptionname>:s. The logged in
2643            user may not have access to all items. The above example doesn't include
2644            any code for this since it would have made it too complex.
2645          </para>
2646        </note>
2647      </sect3>
2648     
2649      <sect3 id="core_api.data_in_files.link">
2650        <title>Use case: Link, validate and extract metadata from the selected files</title>
2651        <para>
2652          When the user has selected the file(s) we must store the links
2653          to them in the database. This is done with a <classname docapi="net.sf.basedb.core">FileSet</classname>
2654          object. A file set can contain any number of files. The only limitation
2655          is that it can only contain one file for each file type.
2656          Call <methodname>FileSet.setMember()</methodname> to store
2657          a file in the file set. If a file already exists for the given file type
2658          it is replaced, otherwise a new entry is created. The following
2659          program example assumes that we have a map where <classname docapi="net.sf.basedb.core">File</classname>:s
2660          are related to <classname docapi="net.sf.basedb.core">DataFileType</classname>:s. When all files
2661          have been added we call <methodname>FileSet.validate()</methodname>
2662          to validate the files and extract metadata.
2663        </para>
2664       
2665        <programlisting language="java">
2666DbControl dc = ...
2667FileStoreEnabled item = ...
2668Map&lt;DataFileType, File&gt; files = ...
2669
2670// Store the selected files in the fileset
2671FileSet fileSet = item.getFileSet();
2672for (Map.Entry&lt;DataFileType, File&gt; entry : files)
2673{
2674   DataFileType type = entry.getKey();
2675   File file = entry.getValue();
2676   fileSet.setMember(type, file);
2677}
2678
2679// Validate the files and extract metadata
2680fileSet.validate(dc, true);
2681</programlisting>
2682
2683        <para>
2684          Validation and extraction of metadata is important since we want
2685          data in files to be equivalent to data in the database. The validation
2686          and metadata extraction is done by the core when the
2687          <methodname>FileSet.validate()</methodname> is called.
2688          The process is partly pluggable since each <classname docapi="net.sf.basedb.core">DataFileType</classname> 
2689          can name a class that should do the validation and/or metadata extraction.
2690        </para>
2691
2692        <note>
2693          <para>
2694          The <methodname>FileSet.validate()</methodname> only validates
2695          the files where the file types have specified plug-ins that can
2696          do the validation and metadata extraction. The method doesn't
2697          throw any exceptions. Instead, all validation errors
2698          are returned a list of <classname>Throwable</classname>:s. The
2699          validation result is also stored for each file and can be access
2700          with <methodname>FileSetMember.isValid()</methodname> and
2701          <methodname>FileSetMember.getErrorMessage()</methodname>.
2702          </para>
2703        </note>
2704
2705        <para>
2706          Here is the general outline of what is going on in the core:
2707        </para>
2708
2709        <orderedlist>
2710        <listitem>
2711          <para>
2712          The core checks the <classname docapi="net.sf.basedb.core">DataFileType</classname> of all
2713          members in the file set and creates <classname docapi="net.sf.basedb.core.filehandler">DataFileValidator</classname>
2714          and <classname docapi="net.sf.basedb.core.filehandler">DataFileMetadataReader</classname> objects. Only one instance
2715          of each class is created. If the file set contains members which has the
2716          same validator or metadata reader, they will all share the same instance.
2717          </para>
2718        </listitem>
2719       
2720        <listitem>
2721          <para>
2722          Each validator/metadata reader class is initialised with calls to
2723          <methodname>DataFileHandler.setItem()</methodname> and
2724          <methodname>DataFileHandler.setFile()</methodname>.
2725          </para>
2726        </listitem>
2727       
2728        <listitem>
2729          <para>
2730          Each validator is called. The result of the validation is saved for each
2731          file and can be retreieved by <methodname>FileSetMember.isValid()</methodname>
2732          and <methodname>FileSetMember.getErrorMessage()</methodname>.
2733          </para>
2734        </listitem>
2735       
2736        <listitem>
2737          <para>
2738          Each metadata reader is called, unless the metadata reader is the same class
2739          as the validator and the validation failed. If the metadata reader is a
2740          different class, it is called even if the validation failed.
2741          </para>
2742        </listitem>
2743        </orderedlist>
2744
2745        <note>
2746          <title>Only one instance of each validator class is created</title>
2747          <para>
2748          The validation/metadata extraction is not done until all files have been
2749          added to the fileset. If the same validator/meta data reader is
2750          used for more than one file, the same instance is reused. Ie.
2751          the <methodname>setFile()</methodname> is called one time
2752          for each file/file type pair. The <methodname>validate()</methodname>
2753          and <methodname>extractMetadata()</methodname> methods are only
2754          called once.
2755          </para>
2756        </note>
2757       
2758        <para>
2759          All validators and meta data extractors should extend
2760          the <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> class. The reason
2761          is that we may want to add more methods to the <interfacename docapi="net.sf.basedb.core.filehandler">DataFileHandler</interfacename>
2762          interface in the future. The <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> will
2763          be used to provide default implementations for backwards compatibility.
2764        </para>
2765       
2766      </sect3>
2767     
2768      <sect3 id="core_api.data_in_files.import">
2769        <title>Use case: Import data into the database</title>
2770       
2771        <para>
2772          This should be done by existing plug-ins in the same way as before.
2773          A slight modification is needed since it is good if the importers
2774          are made aware of already selected files in the <classname docapi="net.sf.basedb.core">FileSet</classname>
2775          to provide good default values. The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname>
2776          class is very useful in cases like this:
2777        </para>
2778       
2779        <programlisting language="java">
2780RawBioAssay rba = ...
2781DbControl dc = ...
2782
2783// Get the current raw data file, if any
2784List&lt;File&gt; rawDataFiles =
2785   FileStoreUtil.getGenericDataFiles(dc, rba, FileType.RAW_DATA);
2786File defaultFile = rawDataFiles.size() > 0 ?
2787   rawDataFiles.get(0) : null;
2788   
2789// Create parameter asking for input file - use current as default
2790PluginParameter&lt;File&gt; fileParameter = new PluginParameter&lt;File&gt;(
2791   "file",
2792   "Raw data file",
2793   "The file that contains the raw data that you want to import",
2794   new FileParameterType(defaultFile, true, 1)
2795);
2796</programlisting>
2797
2798      <para>
2799        An import plug-in should also save the file that was used to the file set:
2800      </para>
2801     
2802      <programlisting language="java">
2803RawBioassay rba = ...
2804// The file the user selected to import from
2805File rawDataFile = (File)job.getValue("file");
2806
2807// Save the file to the fileset. The method will check which file
2808// type the platform uses as the raw data type. As a fallback the
2809// GENERIC_RAW_DATA type is used
2810FileStoreUtil.setGenericDataFile(dc, rba, FileType.RAW_DATA,
2811   DataFileType.GENERIC_RAW_DATA, rawDataFile);
2812</programlisting>
2813
2814      </sect3>
2815     
2816      <sect3 id="core_api.data_in_files.experiments">
2817        <title>Use case: Using raw data from files in an experiment</title>
2818       
2819        <para>
2820          Just as before, an experiment is still locked to a single
2821          <classname docapi="net.sf.basedb.core">RawDataType</classname>. This is a design issue that
2822          would break too many things if changed. If data is stored in files
2823          the experiment is also locked to a single <classname docapi="net.sf.basedb.core">Platform</classname>.
2824          This has been designed to have as little impact on existing
2825          plug-ins as possible. In most cases, the plug-ins will continue
2826          to work as before.
2827        </para>
2828       
2829        <para>
2830          A plug-in (using data from the database that needs to check if it can
2831          be used within an experiment can still do:
2832        </para>
2833       
2834        <programlisting language="java">
2835Experiment e = ...
2836RawDataType rdt = e.getRawDataType();
2837if (rdt.isStoredInDb())
2838{
2839   // Check number of channels, etc...
2840   // ... run plug-in code ...
2841}
2842</programlisting>
2843       
2844        <para>
2845          A newer plug-in which uses data from files should do:
2846        </para>
2847       
2848        <programlisting language="java">
2849Experiment e = ...
2850DbControl dc = ...
2851RawDataType rdt = e.getRawDataType();
2852if (!rdt.isStoredInDb())
2853{
2854   // Check that platform/variant is supported
2855   Platform p = rdt.getPlatform(dc);
2856   PlatformVariant v = rdt.getVariant(dc);
2857   // ...
2858
2859   // Get data files
2860   File aFile = FileStoreUtil.getDataFile(dc, ...);
2861   
2862   // ... run plug-in code ...
2863}
2864</programlisting>
2865       
2866      </sect3>
2867     
2868    </sect2>
2869   
2870    <sect2 id="core_api.signals">
2871      <title>Sending signals (to plug-ins)</title>
2872   
2873      <para>
2874        BASE has a simple system for sending signals between different parts of
2875        a system. This signalling system was initially developed to be able to
2876        kill plug-ins that a user for some reason wanted to abort. The signalling
2877        system as such is not limited to this and it can be used for other purposes
2878        as well. Signals can of course be handled internally in a single JVM but
2879        also sent externally to other JVM:s running on the same or a different
2880        computer. The transport mechanism for signals is decoupled from the actual
2881        handling of them. If you want to, you could implement a signal transporter
2882        that sends signal as emails and the target plug-in would never know.
2883      </para>
2884     
2885      <para>
2886        The remainder of this section will focus mainly on the sending and
2887        transportation of signals. For more information about handling signals
2888        on the receiving end, see <xref linkend="plugin_developer.signals" />.
2889      </para>
2890     
2891      <sect3 id="core_api.signals.diagram">
2892        <title>Diagram of classes and methods</title>
2893        <figure id="core_api.figures.signals">
2894          <title>The signalling system</title>
2895          <screenshot>
2896            <mediaobject>
2897              <imageobject>
2898                <imagedata 
2899                  align="center"
2900                  scalefit="1" width="100%"
2901                  fileref="figures/uml/corelayer.signals.png" format="PNG" />
2902              </imageobject>
2903            </mediaobject>
2904          </screenshot>
2905        </figure>
2906     
2907        <para>
2908          The signalling system is rather simple. An object that wish
2909          to receieve signals must implement the
2910          <interfacename docapi="net.sf.basedb.core.signal"
2911          >SignalTarget</interfacename>. It's only method
2912          is <methodname>getSignalHandler()</methodname>. A
2913          <interfacename docapi="net.sf.basedb.core.signal"
2914          >SignalHandler</interfacename> is an object that
2915          knows what to do when a signal is delivered to it. The target object
2916          may implement the <interfacename>SignalHandler</interfacename> itself
2917          or use one of the existing handlers.
2918        </para>
2919       
2920        <para>
2921          The difficult part here is to be aware that a signal is usually
2922          delivered by a separate thread. The target object must be aware
2923          of this and know how to handle multiple threads. As an example we
2924          can use the <classname docapi="net.sf.basedb.core.signal"
2925          >ThreadSignalHandler</classname> which simply
2926          calls <code>Thread.interrupt()</code> to deliver a signal. The target
2927          object that uses this signal handler it must know that it should check
2928          <code>Thread.interrupted()</code> at regular intervals from the main
2929          thread. If that method returns true, it means that the <constant>ABORT</constant>
2930          signal has been delivered and the main thread should clean up and exit as
2931          soon as possible.
2932        </para>
2933       
2934        <para>
2935          Even if a signal handler could be given directly to the party
2936          that may be interested in sending a signal to the target this
2937          is not recommended. This would only work when sending signals
2938          within the same virtual machine. The signalling system includes
2939          <interfacename docapi="net.sf.basedb.core.signal"
2940          >SignalTransporter</interfacename> and
2941          <interfacename docapi="net.sf.basedb.core.signal"
2942          >SignalReceiver</interfacename> objects that are used
2943          to decouple the sending of signals with the handling of signals. The
2944          implementation usually comes in pairs, for example
2945          <classname docapi="net.sf.basedb.core.signal"
2946          >SocketSignalTransporters</classname> and <classname 
2947          docapi="net.sf.basedb.core.signal">SocketSignalReceiver</classname>.
2948        </para>
2949       
2950        <para>
2951          Setting up the transport mechanism is usually a system responsibility.
2952          Only the system know what kind of transport that is appropriate for it's current
2953          setup. Ie. should signals be delievered by TCP/IP sockets, only internally, or
2954          should a delivery mechanism based on web services be implemented?
2955          If a system wants to receive signals it must create an appropriate
2956          <interfacename>SignalReceiver</interfacename> object. Within BASE the
2957          internal job queue set up it's own signalling system that can be used to
2958          send signals (eg. kill) running jobs. The job agents do the same but uses
2959          a different implementation. See <xref linkend="appendix.base.config.jobqueue" />
2960          for more information about how to configure the internal job queue's
2961          signal receiver. In both cases, there is only one signal receiver instance
2962          active in the system.
2963        </para>
2964       
2965        <para>
2966          Let's take the internal job queue as an example. Here is how it works:
2967        </para>
2968       
2969        <itemizedlist>
2970        <listitem>
2971          <para>
2972          When the internal job queue is started, it will also create a signal
2973          receiver instance according to the settings in <filename>base.config</filename>.
2974          The default is to create <classname docapi="net.sf.basedb.core.signal"
2975          >LocalSignalReceiver</classname>
2976          which can only be used inside the same JVM. If needed, this can
2977          be changed to a <classname docapi="net.sf.basedb.core.signal"
2978          >SocketSignalReceiver</classname> or any other
2979          user-provided implementation.
2980          </para>
2981        </listitem>
2982       
2983        <listitem>
2984          <para>
2985          When the job queue has found a plug-in to execute it will check if
2986          it also implements the <interfacename docapi="net.sf.basedb.core.signal"
2987          >SignalTarget</interfacename>
2988          interface. If it does, a signal handler is created and registered
2989          with the signal receiver. This is actually done by the BASE core
2990          by calling <methodname>PluginExecutionRequest.registerSignalReceiver()</methodname>
2991          which also makes sure that the the ID returned from the registration is
2992          stored in the database together with the job item representing the
2993          plug-in to execute.
2994          </para>
2995        </listitem>
2996       
2997        <listitem>
2998          <para>
2999          Now, when the web client see's a running job which has a non-empty
3000          signal transporter property, the <guilabel>Abort</guilabel>
3001          button is activated. If the user clicks this button the BASE core
3002          uses the information in the database to create
3003          <interfacename docapi="net.sf.basedb.core.signal"
3004          >SignalTransporter</interfacename> object. This
3005          is simply done by calling <code>Job.getSignalTransporter()</code>.
3006          The created signal transporter knows how to send a signal
3007          to the signal receiver it was first registered with. When the
3008          signal arrives at the receiver it will find the handler for it
3009          and call <code>SignalHandler.handleSignal()</code>. This will in it's turn
3010          trigger some action in the signal target which soon will abort what
3011          it is doing and exit.
3012          </para>
3013        </listitem>
3014        </itemizedlist>
3015       
3016       
3017      </sect3>
3018   
3019    </sect2>
3020   
3021  </sect1>
3022
3023  <sect1 id="api_overview.query_api">
3024    <title>The Query API</title>
3025    <para>
3026      This documentation is only available in the old format.
3027      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html"
3028        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html</ulink>
3029    </para>
3030   
3031  </sect1>
3032 
3033  <sect1 id="api_overview.dynamic_and_batch_api">
3034    <title>Analysis and the Dynamic and Batch API:s</title>
3035    <para>
3036      This documentation is only available in the old format.
3037      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html"
3038        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html</ulink>
3039    </para>
3040  </sect1>
3041
3042  <sect1 id="api_overview.extensions">
3043    <title>Extensions API</title>
3044   
3045    <sect2 id="api_overview.extensions.core">
3046      <title>The core part</title>
3047   
3048      <para>
3049        The <emphasis>Extensions API</emphasis> is divided into two parts. A core
3050        part and a web client specific part. The core part can be found in the
3051        <package>net.sf.basedb.util.extensions</package> package and it's sub-packages,
3052        and consists of three sub-parts:
3053      </para>
3054     
3055      <itemizedlist>
3056      <listitem>
3057        <para>
3058        A set of interface definitions which forms the core of the Extensions API.
3059        The interfaces defines, for example, what an <interfacename 
3060        docapi="net.sf.basedb.util.extensions">Extension</interfacename> is and
3061        what an <interfacename 
3062        docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename> should do.
3063        </para>
3064      </listitem>
3065     
3066      <listitem>
3067        <para>
3068        A <classname docapi="net.sf.basedb.util.extensions">Registry</classname> that is
3069        used to keep track of installed extensions. The registry also provides
3070        functionality for invoking and using the extensions.
3071        </para>
3072      </listitem>
3073     
3074      <listitem>
3075        <para>
3076        Utility classes that are useful when implementation a client application
3077        that can be extendable. The most useful example is the <classname
3078        docapi="net.sf.basedb.util.extensions.xml">XmlLoader</classname> which can
3079        read extension definitions from XML files and create the proper factories,
3080        etc.
3081        </para>
3082      </listitem>
3083      </itemizedlist>
3084     
3085      <figure id="core_api.figures.extensions_core">
3086        <title>The core part of the Extensions API</title>
3087        <screenshot>
3088          <mediaobject>
3089            <imageobject>
3090              <imagedata 
3091                align="center"
3092                fileref="figures/uml/corelayer.extensions_core.png" format="PNG" />
3093            </imageobject>
3094          </mediaobject>
3095        </screenshot>
3096      </figure>
3097     
3098      <para>
3099        The <classname docapi="net.sf.basedb.util.extensions">Registry</classname> 
3100        is one of the main classes in the extension system. All extension points and
3101        extensions must be registered before they can be used. Typically, you will
3102        first register extension points and then extensions, beacuse an extension
3103        can't be registered until the extension point it is extending has been
3104        registered.
3105      </para>
3106     
3107      <para>
3108        An <interfacename docapi="net.sf.basedb.util.extensions">ExtensionPoint</interfacename>
3109        is an ID and a definition of an <interfacename docapi="net.sf.basedb.util.extensions">Action</interfacename>
3110        class. The other options (name, description, renderer factory, etc.) are optional.
3111        An <interfacename docapi="net.sf.basedb.util.extensions">Extension</interfacename>
3112        that extends a specific extension point must provide an
3113        <interfacename docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename>
3114        instance that can create actions of the type the extension point requires.
3115      </para>
3116     
3117      <example id="core_api.example.extensions_core">
3118        <title>The menu extensions point</title>
3119        <para>
3120        The <code>net.sf.basedb.clients.web.menu.extensions</code> extension point
3121        requires <interfacename 
3122        docapi="net.sf.basedb.clients.web.extensions.menu">MenuItemAction</interfacename>
3123        objects. An extension for this extension point must provide a factory that
3124        can create <classname>MenuItemAction</classname>:s. BASE ships with default
3125        factory implementations, for example the <classname 
3126        docapi="net.sf.basedb.clients.web.extensions.menu">FixedMenuItemFactory</classname>
3127        class, but an extension may provide it's own factory implementation if it wants to.
3128        </para>
3129      </example>
3130     
3131      <para>
3132        Call the <methodname>Registry.useExtensions()</methodname> method
3133        to use extensions from one or several extension points. This method will
3134        find all extensions for the given extension points. If a filter is given,
3135        it checks if any of the extensions or extension points has been disabled.
3136        It will then call <methodname>ActionFactory.prepareContext()</methodname>
3137        for all remaining extensions. This gives the action factory a chance to
3138        also disable the extension, for example, if the logged in user doesn't
3139        have a required permission. The action factory may also set attributes
3140        on the context. The attributes can be anything that the extension point
3141        may make use of. Check the documentation for the specific extension point
3142        for information about which attributes it supports. If there are
3143        any renderer factories, their <methodname>RendererFactory.prepareContext()</methodname>
3144        is also called. They have the same possibility of setting attributes
3145        on the context, but can't disable an extension.
3146      </para>
3147     
3148      <para>
3149        After this, an <classname 
3150        docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname>
3151        object is created and returned to the extension point. Note that
3152        the <methodname>ActionFactory.getActions()</methodname> has not been
3153        called yet, so we don't know if the extensions are actually
3154        going to generate any actions. The <methodname>ActionFactory.getActions()</methodname>
3155        is not called until we have got ourselves an
3156        <classname docapi="net.sf.basedb.util.extensions">ActionIterator</classname>
3157        from the <methodname>ExtensionsInvoker.iterate()</methodname> method and
3158        starts to iterate. The call to <methodname>ActionIterator.hasNext()</methodname>
3159        will propagate down to <methodname>ActionFactory.getActions()</methodname>
3160        and the generated actions are then available with the
3161        <methodname>ActionIterator.next()</methodname> method.
3162      </para>
3163     
3164      <para>
3165        The <methodname>ExtensionsInvoker.renderDefault()</methodname>
3166        and <methodname>ExtensionsInvoker.render()</methodname> are
3167        just convenience methods that will make it easer to render
3168        the actions. The first method will of course only work if the
3169        extension point is providing a renderer factory, that can
3170        create the default renderer.
3171      </para>
3172     
3173      <note>
3174        <title>Be aware of multi-threading issues</title>
3175        <para>
3176          When you are creating extensions you must be aware that
3177          multiple threads may access the same objects at the same time.
3178          In particular, any action factory or renderer factory has to be
3179          thread-safe, since only one exists for each extension.
3180          Action and renderer objects should be thread-safe if the
3181          factories re-use the same objects.
3182        </para>
3183      </note>
3184   
3185    </sect2>
3186   
3187    <sect2 id="api_overview.extensions.web">
3188      <title>The web client part</title>
3189   
3190      <para>
3191        The web client specific parts of the Extensions API can be found
3192        in the <package>net.sf.basedb.client.web.extensions</package> package
3193        and it's subpackages. The top-level package contains classes used to
3194        administrate the extension system. Here is for example the
3195        <classname docapi="net.sf.basedb.client.web.extensions">ExtensionsControl</classname> 
3196        class which is the master controller for the web client extensions. It:
3197      </para>
3198     
3199      <itemizedlist>
3200      <listitem>
3201        <para>
3202        Keeps track of installed extensions and which JAR or XML file they are
3203        installed from.
3204        </para>
3205      </listitem>
3206     
3207      <listitem>
3208        <para>
3209        Can, manually or automatically, find and install new or
3210        updated extensions and uninstall deleted extensions.
3211        </para>
3212      </listitem>
3213     
3214      <listitem>
3215        <para>
3216        Adds permission control to the extension system, so that only an
3217        administrator is allowed to change settings, enable/disable extensions,
3218        etc.
3219        </para>
3220      </listitem>
3221      </itemizedlist>
3222     
3223      <para>
3224        In the top-level package there are also some abstract classes that may
3225        be useful to extend for developers creating their own extensions.
3226        For example, we recommend that all action factories extend the <classname 
3227        docapi="net.sf.basedb.client.web.extensions">AbstractJspActionFactory</classname>
3228        class.
3229      </para>
3230     
3231      <para>
3232        The sub-packages to <package>net.sf.basedb.client.web.extensions</package>
3233        are mostly specific to a single extension point or to a specific type of
3234        extension point. The <package>net.sf.basedb.client.web.extensions.menu</package>
3235        package, for example, contains classes that are/can be used for extensions
3236        adding menu items to the <menuchoice><guimenu>Extensions</guimenu></menuchoice>
3237        menu.
3238      </para>
3239     
3240      <figure id="core_api.figures.extensions_web">
3241        <title>The web client part of the Extensions API</title>
3242        <screenshot>
3243          <mediaobject>
3244            <imageobject>
3245              <imagedata 
3246                align="center"
3247                fileref="figures/uml/corelayer.extensions_web.png" format="PNG" />
3248            </imageobject>
3249          </mediaobject>
3250        </screenshot>
3251      </figure>
3252   
3253      <para>
3254        When the Tomcat web server is starting up, the <classname 
3255        docapi="net.sf.basedb.client.web.extensions">ExtensionsServlet</classname>
3256        is automatically loaded. This servlet has as two purposes:
3257      </para>
3258     
3259      <itemizedlist>
3260      <listitem>
3261        <para>
3262        Initialise the extensions system by calling
3263        <methodname>ExtensionsControl.init()</methodname>. This will result in
3264        an initial scan for installed extensions, which is equivalent to doing
3265        a manual scan with the force update setting to false. This means that
3266        the extension system is up an running as soon as the first user log's
3267        in to BASE.
3268        </para>
3269      </listitem>
3270     
3271      <listitem>
3272        <para>
3273        Act as a proxy for custom servlets defined by the extensions. URL:s
3274        ending with <code>.servlet</code> has been mapped to the
3275        <classname>ExtensionsServlet</classname>. When a request is made it
3276        will extract the name of the extension's JAR file from the
3277        URL, get the corresponding <classname 
3278        docapi="net.sf.basedb.client.web.extensions">ExtensionsFile</classname>
3279        and <classname docapi="net.sf.basedb.client.web.extensions">ServletWrapper</classname>
3280        and then invoke the custom servlet. More information can be found in
3281        <xref linkend="extensions_developer.servlets" />.
3282        </para>
3283      </listitem>
3284     
3285      </itemizedlist>
3286     
3287      <para>
3288        Using extensions only involves calling the
3289        <methodname>ExtensionsControl.createContext()</methodname> and
3290        <methodname>ExtensionsControl.useExtensions()</methodname> methods. This
3291        returns an <classname docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname> 
3292        object as described in the previous section.
3293      </para>
3294     
3295      <para>
3296        To render the actions it is possible to either use the
3297        <methodname>ExtensionsInvoker.iterate()</methodname> method
3298        and generate HTML from the information in each action. Or
3299        (the better way) is to use a renderer together with the
3300        <classname docapi="net.sf.basedb.clients.web.taglib.extensions">Render</classname>
3301        taglib.
3302      </para>
3303     
3304      <para>
3305        To get information about the installed extensions, 
3306        change settings, enabled/disable extensions, performing a manual
3307        scan, etc. use the <methodname>ExtensionsControl.get()</methodname>
3308        method. This will create a permission-controlled object. All
3309        users has read permission, administrators has write permission.
3310      </para>
3311     
3312      <note>
3313        <para>
3314          The permission we check for is WRITE permission on the
3315          web client item. This means it is possible to give a user
3316          permissions to manage the extension system by assigning
3317          WRITE permission to the web client entry in the database.
3318          Do this from <menuchoice>
3319            <guimenu>Administrate</guimenu>
3320            <guimenuitem>Clients</guimenuitem>
3321          </menuchoice>.
3322        </para>
3323      </note>
3324   
3325      <para>
3326        The <classname docapi="net.sf.basedb.clients.web.extensions">XJspCompiler</classname>
3327        is mapped to handle the compilation <code>.xjsp</code> files
3328        which are regular JSP files with a different extension. This feature is
3329        experimental and requires installing an extra JAR into Tomcat's lib
3330        directory. See <xref linkend="admin.extensions.xjspcompiler" /> for
3331        more information.
3332      </para>
3333   
3334    </sect2>
3335   
3336  </sect1>
3337
3338  <sect1 id="api_overview.other_api">
3339    <title>Other useful classes and methods</title>
3340    <para>
3341      TODO
3342    </para>
3343  </sect1>
3344 
3345</chapter>
Note: See TracBrowser for help on using the repository browser.