source: trunk/doc/src/docbook/developer/base_api.xml @ 5780

Last change on this file since 5780 was 5780, checked in by Nicklas Nordborg, 10 years ago

References #1590: Documentation cleanup

Updated developer documentation "API overview". The chapter has been renamed to "The BASE API" and I intend to merge some information from the "Core developer reference" with this chapter.

1<?xml version="1.0" encoding="UTF-8"?>
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
6  $Id: base_api.xml 5780 2011-10-04 11:01:20Z nicklas $
8  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
10  This file is part of BASE - BioArray Software Environment.
11  Available at
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 3
16  of the License, or (at your option) any later version.
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
21  GNU General Public License for more details.
23  You should have received a copy of the GNU General Public License
24  along with BASE. If not, see <>.
27<chapter id="base_api">
28  <title>The BASE API</title>
30  <sect1 id="base_api.public">
31    <title>The Public API of BASE</title>
33    <para>
34      Not all public classes and methods in the <filename>base-*.jar</filename>
35      files and other JAR files shipped with BASE are considered part of the
36      <emphasis>Public API</emphasis>. This is important to know
37      since we will always try to maintain backwards compatibility
38      for classes that are part of the public API. For other
39      classes, changes may be introduced at any time without
40      notice or specific documentation. In other words:
41    </para>
43    <note>
44      <title>Only use the public API when developing plug-ins and extensions</title>
45      <para>
46        This will maximize the chance that your code will continue
47        to work with the next BASE release. If you use the non-public API
48        you do so at your own risk.
49      </para>
50    </note>
52    <para>
53      See the <ulink url="../../api/index.html"
54        >BASE API javadoc</ulink> for information about
55      which parts of the API belong to the public API.
56      Methods, classes and other elements that have been tagged as
57      <code>@deprecated</code> should be considered as part of the internal API
58      and may be removed in a subsequent release without warning.
59    </para>
61    <para>
62      Backwards compatibility is a goal, not a guarantee. It may not
63      always be possible. See <xref linkend="appendix.incompatible" /> to
64      read more about changes that have been introduced by each release
65      that may affect existing code.
66    </para>
68    <sect2 id="base_api.compatibility">
69      <title>What is backwards compatibility?</title>
71      <para>
72        There is a great article about this subject on <ulink 
73        url=""
74          ></ulink>.
75        This is what we will try to comply with. If you do not want to
76        read the entire article, here are some of the most important points:
77      </para>
80      <sect3 id="core_api.compatibility.binary">
81        <title>Binary compatibility</title>
82        <para>
83        <blockquote>
84          Pre-existing Client binaries must link and run with new releases of the
85          Component without recompiling.
86        </blockquote>
88        For example:
89        <itemizedlist>
90        <listitem>
91          <para>
92            We cannot change the number or types of parameters to a method
93            or constructor.
94          </para>
95        </listitem>
96        <listitem>
97          <para>
98            We cannot add methods to, or change methods in, interfaces that are intended
99            to be implemented by plug-in or client code.
100          </para>
101        </listitem>
102        </itemizedlist>
103        </para>       
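        <para>
          As a minimal sketch of the first point, new functionality can be
          added as an overload while the original method signature is left
          untouched. The <classname>ReportExporter</classname> class and its
          methods are hypothetical and are not part of BASE.
        </para>
        <programlisting language="java"><![CDATA[
```java
// Hypothetical public API class, used for illustration only.
class ReportExporter
{
    // Original method that released plug-ins were compiled against.
    // Its parameter list must stay exactly as it is.
    static String export(String name)
    {
        return export(name, "txt");
    }

    // Added in a later release as an overload. Old binaries still
    // link against export(String) without recompiling.
    static String export(String name, String format)
    {
        return name + "." + format;
    }
}
```
]]></programlisting>
        <para>
          Changing <methodname>export(String)</methodname> itself to take two
          parameters would instead cause a linkage error in plug-ins that were
          compiled against the old signature.
        </para>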
104      </sect3>
106      <sect3 id="base_api.compatibility.contract">
107        <title>Contract compatibility</title>
108        <para>
109          <blockquote>
110          API changes must not invalidate formerly legal Client code.
111          </blockquote>
113          For example:
114          <itemizedlist>
115          <listitem>
116            <para>
117              We cannot change the implementation of a method to do
118              things differently than before. For example, we cannot start returning
119              <constant>null</constant> from a method where it was not allowed before.
120            </para>
121          </listitem>
122          </itemizedlist>
124          <note>
125            <para>
126            Sometimes there is a very fine line between what is considered a
127            bug and what is considered a feature. For example, if the
128            actual implementation does not do what the javadoc says,
129            do we change the code or do we change the documentation?
130            This has to be considered from case to case and depends on
131            the age of the code and if we expect plug-ins and clients to be
132            affected by it or not.
133            </para>
134          </note>
135        </para>
136      </sect3>
138      <sect3 id="base_api.compatibility.source">
139        <title>Source code compatibility</title>
140        <para>
141        Source code compatibility is less important and is not always possible to
142        achieve. In most cases, the problems are easy to fix.
143        Example:
145        <itemizedlist>
146        <listitem>
147          <para>
148          Adding a class may break a plug-in or client that imports
149          classes with <constant>.*</constant> if the same class name
150          exists in another package.
151          </para>
152        </listitem>
153        </itemizedlist>
154        </para>
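        <para>
          The same kind of name clash can be seen in the Java standard library,
          which is used here instead of BASE classes: both
          <classname>java.util</classname> and <classname>java.sql</classname>
          contain a class named <classname>Date</classname>. The sketch below
          shows the usual fix, which is to add a single-type import for the
          class that is actually wanted. The <classname>ImportDemo</classname>
          class is hypothetical.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.*;
import java.sql.*;
// With only the two on-demand imports above, the simple name Date
// is ambiguous and the code does not compile. A single-type import
// takes precedence over on-demand imports and resolves the clash.
import java.util.Date;

class ImportDemo
{
    // Returns the fully qualified name of the class that the
    // simple name Date resolves to.
    static String describe()
    {
        Date d = new Date(0L);
        return d.getClass().getName();
    }
}
```
]]></programlisting>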
155      </sect3>
156    </sect2>
157  </sect1>
159  <sect1 id="" chunked="1">
160    <title>The Data Layer API</title>
162    <para>
163      This section gives an overview of the entire data layer API.
164      The figure below shows how different modules relate to each other.
165    </para>
167    <figure id="data_api.figures.overview">
168      <title>Data layer overview</title>
169      <screenshot>
170        <mediaobject>
171          <imageobject>
172            <imagedata 
173              align="center"
174              scalefit="1" width="100%"
175              fileref="figures/uml/datalayer.overview.png" format="PNG" />
176          </imageobject>
177        </mediaobject>
178      </screenshot>
179    </figure>
181    <sect2 id="data_api.basic">
182      <title>Basic classes and interfaces</title>
184      <para>
185        This document contains information about the basic classes and interfaces in this package.
186        They are important since all data-layer classes must inherit from one of the already
187        existing abstract base classes or implement one or more of the
188        existing interfaces. They contain code that is common to all classes,
189        for example implementations of the <methodname>equals()</methodname>
190        and <methodname>hashCode()</methodname> methods or how to link with the owner of an
191        item.
192      </para>
194        <figure id="data_api.figures.basic">
195          <title>Basic classes and interfaces</title>
196          <screenshot>
197            <mediaobject>
198              <imageobject>
199                <imagedata 
200                  align="center"
201                  fileref="figures/uml/datalayer.basic.png" format="PNG" />
202              </imageobject>
203            </mediaobject>
204          </screenshot>
205        </figure>
207      <sect3 id="data_api.basic.classes">
208        <title>Classes</title>
210        <variablelist>
211        <varlistentry>
212          <term><classname docapi="">BasicData</classname></term>
213          <listitem>
214            <para>
215            The root class. It overrides the <methodname>equals()</methodname>,
216            <methodname>hashCode()</methodname> and <methodname>toString()</methodname> methods
217            from the <classname>Object</classname> class. It also defines the
218            <varname>id</varname> and <varname>version</varname> properties.
219            All data layer classes must inherit from this class or one of its subclasses.
220            </para>
221          </listitem>
222        </varlistentry>
224        <varlistentry>
225          <term><classname docapi="">OwnedData</classname></term>
226          <listitem>
227            <para>
228            Extends the <classname>BasicData</classname> class and adds
229            an <varname>owner</varname> property. The owner is a required link to a
230            <classname docapi="">UserData</classname> object, representing the user that
231            is the owner of the item.
232            </para>
233          </listitem>
234        </varlistentry>
236        <varlistentry>
237          <term><classname docapi="">SharedData</classname></term>
238          <listitem>
239            <para>
240            Extends the <classname>OwnedData</classname> class and adds
241            properties (<varname>itemKey</varname> and <varname>projectKey</varname>)
242            that hold access permission information for an item.
243            Access permissions are held in <classname docapi="">ItemKeyData</classname> and/or
244            <classname docapi="">ProjectKeyData</classname> objects. These objects only exist if
245            the item has been shared.
246            </para>
247          </listitem>
248        </varlistentry>
250        <varlistentry>
251          <term><classname docapi="">CommonData</classname></term>
252          <listitem>
253            <para>
254            This is a convenience class for items that extend the <classname>SharedData</classname>
255            class and implements the <interfacename docapi="">NameableData</interfacename> and
256            <interfacename docapi="">RemovableData</interfacename> interfaces. This is one of
257            the most common situations.
258            </para>
259          </listitem>
260        </varlistentry>
262        <varlistentry>
263          <term><classname docapi="">AnnotatedData</classname></term>
264          <listitem>
265            <para>
266            This is a convenience class for items that can be annotated.
267            Annotations are held in <classname docapi="">AnnotationSetData</classname> objects.
268            The annotation set only exists if annotations have been created for the item.
269            </para>
270          </listitem>
271        </varlistentry>
272        </variablelist>
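        <para>
          The inheritance chain described above can be sketched as follows.
          The classes are simplified stand-ins with a <code>Sketch</code>
          suffix and are not the real data layer classes. In particular, the
          rule that two items are equal when they have the same class and the
          same id is an assumption made for this example, and the owner is
          represented by a plain string instead of a
          <classname>UserData</classname> object.
        </para>
        <programlisting language="java"><![CDATA[
```java
// Simplified stand-in for BasicData: id, version and the Object overrides.
abstract class BasicDataSketch
{
    private final int id;
    private int version;

    BasicDataSketch(int id) { this.id = id; }

    int getId() { return id; }
    int getVersion() { return version; }

    // Assumption: equality is based on the class and the id.
    @Override
    public boolean equals(Object other)
    {
        if (this == other) return true;
        if (other == null || getClass() != other.getClass()) return false;
        return id == ((BasicDataSketch)other).id;
    }

    @Override
    public int hashCode() { return id; }

    @Override
    public String toString() { return getClass().getSimpleName() + "[id=" + id + "]"; }
}

// Stand-in for OwnedData: adds a required owner.
abstract class OwnedDataSketch extends BasicDataSketch
{
    private final String owner;

    OwnedDataSketch(int id, String owner)
    {
        super(id);
        if (owner == null) throw new IllegalArgumentException("owner is required");
        this.owner = owner;
    }

    String getOwner() { return owner; }
}

// Stand-in for SharedData: the keys stay null until the item is shared.
abstract class SharedDataSketch extends OwnedDataSketch
{
    private Object itemKey;
    private Object projectKey;

    SharedDataSketch(int id, String owner) { super(id, owner); }

    void setItemKey(Object itemKey) { this.itemKey = itemKey; }
    void setProjectKey(Object projectKey) { this.projectKey = projectKey; }
    boolean isShared() { return itemKey != null || projectKey != null; }
}

// A concrete item type at the end of the chain.
class SampleSketch extends SharedDataSketch
{
    SampleSketch(int id, String owner) { super(id, owner); }
}
```
]]></programlisting>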
274      </sect3>
276      <sect3 id="data_api.basic.interfaces">
277        <title>Interfaces</title>
279        <variablelist>
280        <varlistentry>
281          <term><classname docapi="">IdentifiableData</classname></term>
282          <listitem>
283            <para>
284            All items are identifiable, which means that they have a unique <varname>id</varname>.
285            The id is unique for all items of a specific type (i.e. class). The id is a number
286            that is automatically generated by the database and has no meaning
287            outside of the application. The <varname>version</varname> property is used for
288            detecting and preventing concurrent modifications to an item.
289            </para>
290          </listitem>
291        </varlistentry>
293        <varlistentry>
294          <term><classname docapi="">OwnableData</classname></term>
295          <listitem>
296            <para>
297            An ownable item is an item which has an owner. The owner is represented as a
298            required link to a <classname docapi="">UserData</classname> object.
299            </para>
300          </listitem>
301        </varlistentry>       
303        <varlistentry>
304          <term><classname docapi="">ShareableData</classname></term>
305          <listitem>
306            <para>
307            A shareable item is an item which can be shared to other users, groups or projects.
308            Access permissions are held in <classname docapi="">ItemKeyData</classname> and/or
309            <classname docapi="">ProjectKeyData</classname> objects.
310            </para>
311          </listitem>
312        </varlistentry>
314        <varlistentry>
315          <term><classname docapi="">NameableData</classname></term>
316          <listitem>
317            <para>
318            A nameable item is an item that has a name (required) and a description
319            (optional). The name doesn't have to be unique, except in a few special
320            cases (for example, the name of a file).
321            </para>
322          </listitem>
323        </varlistentry>
325        <varlistentry>
326          <term><classname docapi="">RemovableData</classname></term>
327          <listitem>
328            <para>
329            A removable item is an item that can be flagged as removed. This doesn't
330            remove the information about the item from the database, but can be used by
331            client applications to hide items that the user is not interested in.
332            A trashcan function can be used to either restore or permanently
333            remove items that have the flag set.
334            </para>
335          </listitem>
336        </varlistentry>
338        <varlistentry>
339          <term><classname docapi="">SystemData</classname></term>
340          <listitem>
341            <para>
342            A system item is an item which has an additional id in the form of a string. A system id
343            is required when we need to make sure that we can get a specific item without
344            knowing the numeric id. Examples of such items are the root user and the everyone group.
345            A system id is generally constructed like:
346            <constant>net.sf.basedb.core.User.ROOT</constant>. The system ids are defined in the
347            core layer by each item class.
348            </para>
349          </listitem>
350        </varlistentry>
352        <varlistentry>
353          <term><classname docapi="">DiskConsumableData</classname></term>
354          <listitem>
355            <para>
356            This interface is used by items which occupy a lot of disk space and
357            should be part of the quota system, for example files. The required
358            <classname docapi="">DiskUsageData</classname> contains information about the size,
359            location, owner etc. of the item.
360            </para>
361          </listitem>
362        </varlistentry>
364        <varlistentry>
365          <term><classname docapi="">AnnotatableData</classname></term>
366          <listitem>
367            <para>
368            This interface is used by items which can be annotated. Annotations are name/value
369            pairs that are attached as extra information to an item. All annotations are
370            contained in an <classname docapi="">AnnotationSetData</classname> object.
371            </para>
372          </listitem>
373        </varlistentry>
375        <varlistentry>
376          <term><classname docapi="">ExtendableData</classname></term>
377          <listitem>
378            <para>
379            This interface is used by items which can have extra administrator-defined
380            columns. The functionality is similar to annotations. It is not as flexible,
381            since it is a global configuration, but has better performance. BASE will
382            generate extra database columns to store the data in the tables for items that
383            can be extended.
384            </para>
385          </listitem>
386        </varlistentry>
388        <varlistentry>
389          <term><classname docapi="">BatchableData</classname></term>
390          <listitem>
391            <para>
392            This interface is a tagging interface which is used by items that need batch
393            functionality in the core.
394            </para>
395          </listitem>
396        </varlistentry>
398        <varlistentry>
399          <term><interfacename docapi="">RegisteredData</interfacename></term>
400          <listitem>
401            <para>
402            This interface is used by items that register the date they were
403            created in the database. The registration date is set at creation time
404            and can't be modified later. Since this didn't exist prior to BASE 2.10,
405            null values are allowed on all pre-existing items. Note! For backwards
406            compatibility reasons with existing code in
407            <classname docapi="">BioMaterialEventData</classname>
408            the method name is <methodname>getEntryDate()</methodname>.
409            </para>
410          </listitem>
411        </varlistentry>
413        <varlistentry>
414          <term><interfacename docapi="">LoggableData</interfacename></term>
415          <listitem>
416            <para>
417            This is a tagging interface that indicates that the <classname 
418            docapi="net.sf.basedb.core.log.db">DbLogManagerFactory</classname> logging
419            implementation should log changes made to items that implement it.
420            </para>
421          </listitem>
422        </varlistentry>
424        <varlistentry>
425          <term><interfacename docapi="">FileStoreEnabledData</interfacename></term>
426          <listitem>
427            <para>
428            This interface is implemented by all items that can have files with related data
429            attached to them. The file types that can be used for a specific item are usually
430            determined by the main type, the subtype or platform.
431            </para>
432          </listitem>
433        </varlistentry>
435        <varlistentry>
436          <term><interfacename docapi="">SubtypableData</interfacename></term>
437          <listitem>
438            <para>
439            This interface should be implemented by all items that can be subtyped.
440            Unless otherwise noted the subtype is always an optional link to
441            an <classname docapi="">ItemSubtypeData</classname>
442            item. In the simplest form, the subtype is a kind of annotation, but
443            for items that also implement the <interfacename 
444            docapi="">FileStoreEnabledData</interfacename>
445            interface, the subtype can be used to specify the file types that
446            are applicable for each item.
447            </para>
448          </listitem>
449        </varlistentry>
450        </variablelist>
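        <para>
          The role of the <varname>version</varname> property mentioned for
          <classname>IdentifiableData</classname> can be illustrated with a
          version check at update time. This is a minimal sketch under the
          assumption that an update is accepted only if the caller loaded the
          latest version. The <classname>VersionedStore</classname> class is
          hypothetical and stands in for the database. It is not how BASE or
          Hibernate implement the check.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store that maps an item id to its current version.
class VersionedStore
{
    private final Map<Integer, Integer> versions = new HashMap<>();

    // A newly inserted item starts at version 0.
    void insert(int id) { versions.put(id, 0); }

    // Accepts the update only if loadedVersion is still the current
    // version, then increments the version and returns the new value.
    int update(int id, int loadedVersion)
    {
        int current = versions.get(id);
        if (current != loadedVersion)
        {
            throw new IllegalStateException("Item " + id + " was modified by someone else");
        }
        versions.put(id, current + 1);
        return current + 1;
    }
}
```
]]></programlisting>
        <para>
          Two clients that both loaded version 0 cannot both save: the first
          update moves the item to version 1 and the second one is rejected.
        </para>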
452      </sect3>
453    </sect2>
455    <sect2 id="data_api.authentication">
456      <title>User authentication and access control</title>
458      <para>
459         This section gives an overview of user authentication and
460         how groups, roles and projects are used for access control
461         to items.
462      </para>
464        <figure id="data_api.figures.authentication">
465          <title>User authentication and access control</title>
466          <screenshot>
467            <mediaobject>
468              <imageobject>
469                <imagedata 
470                  align="center"
471                  scalefit="1" width="100%"
472                  fileref="figures/uml/datalayer.authentication.png" format="PNG" />
473              </imageobject>
474            </mediaobject>
475          </screenshot>
476        </figure>
478      <sect3 id="data_api.authentication.users">
479        <title>Users and passwords</title>     
481        <para>
482          The <classname docapi="">UserData</classname> class holds information about users.
483          We keep the passwords in a separate table and use proxies to avoid loading
484          password data each time a user is loaded to minimize security risks. It is
485          only if the password needs to be changed that the <classname docapi="">PasswordData</classname>
486          object is loaded. The one-to-one mapping between user and password is controlled
487          by the password class, but a cascade attribute on the user class makes sure
488          that the password is deleted when a user is deleted.
489        </para>
490      </sect3>
492      <sect3 id="data_api.authentication.groups">
493        <title>Groups, roles, projects and permission template</title>     
495        <para>
496          The <classname docapi="">GroupData</classname>,
497          <classname docapi="">RoleData</classname> and
498          <classname docapi="">ProjectData</classname> classes hold
499          information about groups, roles
500          and projects respectively. A user may be a member of any number of groups,
501          roles and/or projects. New users are automatically added as members of all
502          groups and roles that have the <varname>default</varname> property set.
503        </para>
505        <para>
506          Membership in a project comes with an attached
507          permission value. This is the highest permission the user has in the
508          project. No matter what permission an item has been shared with, the
509          user will not get a higher permission. Groups may be members of other groups and
510          of projects. A <classname docapi="">PermissionTemplateData</classname>
511          is just a holder for permissions that users can use when sharing items. The
512          template is never part of the actual permission control mechanism.
513        </para>
515        <para>
516          Group membership is always accounted for, but the core only allows
517          one project at a time to be used. This is the <emphasis>active project</emphasis>.
518          When a project is active new items that are created are automatically
519          shared according to the settings for the project. There are two cases.
520          If the project has a permission template, the new item is given the same
521          permissions as the template has. If the project doesn't have a permission
522          template, the new item is shared to the active project with the permission
523          given by the <varname>autoPermission</varname> property. Note that in the
524          first case the new item may or may not be shared to the active project
525          depending on if the template is shared to the project or not.
526        </para>
528        <para>
529          Note that the permission template is only used (by the core) when creating
530          new items. The permissions held by the template are copied and when the new item
531          has been saved to the database there is no longer any reference back to
532          the template that was used to create it. This means that changes to the
533          template do not affect already existing items and that the template
534          can be deleted without problems.
535        </para>
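        <para>
          The two cases for sharing a new item, and the fact that template
          permissions are copied, can be sketched as follows. The
          <classname>NewItemSharing</classname> class is hypothetical, and
          permissions are simplified to a map from a recipient name to an
          integer permission code. The real core works with key objects.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, used for illustration only.
class NewItemSharing
{
    // Returns the initial permissions (recipient name to permission code)
    // for an item created while activeProject is the active project.
    // template is null when the project has no permission template.
    static Map<String, Integer> initialPermissions(String activeProject,
        Map<String, Integer> template, int autoPermission)
    {
        Map<String, Integer> result = new HashMap<>();
        if (template != null)
        {
            // Case 1: copy the template. The active project is only
            // included if the template itself is shared to it.
            result.putAll(template);
        }
        else
        {
            // Case 2: share to the active project with autoPermission.
            result.put(activeProject, autoPermission);
        }
        return result;
    }
}
```
]]></programlisting>
        <para>
          Since the result is a copy, later changes to the template map do not
          affect items that have already been created.
        </para>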
536      </sect3>
538      <sect3 id="data_api.authentication.keys">
539        <title>Keys</title>     
541        <para>
542          The <classname docapi="">KeyData</classname> class and its subclasses
543          <classname docapi="">ItemKeyData</classname>, <classname docapi="">ProjectKeyData</classname> and
544          <classname docapi="">RoleKeyData</classname>, are used to store information about access
545          permissions to items. To get permission to manipulate an item a user must have
546          access to a key giving that permission. There are three types of keys:
547        </para>
549        <variablelist>
550        <varlistentry>
551          <term><classname docapi="">ItemKey</classname></term>
552          <listitem>
553            <para>
554            Is used to give a user or group access to a specific item. The item
555            must be a <interfacename docapi="">ShareableData</interfacename> item.
556            The permissions are usually set by the owner of the item. Once created an
557            item key cannot be changed. This allows the core to reuse a key if the
558            permissions match exactly, i.e. for a given set of users/groups/permissions
559            there can be only one item key object.
560            </para>
561          </listitem>
562        </varlistentry>
564        <varlistentry>
565          <term><classname docapi="">ProjectKey</classname></term>
566          <listitem>
567            <para>
568            Is used to give members of a project access to a specific item. The item
569            must be a <interfacename docapi="">ShareableData</interfacename> item. Once created a
570            project key cannot be changed. This allows the core to reuse a key if the
571            permissions match exactly, i.e. for a given set of projects/permissions
572            there can be only one project key object.
573            </para>
574          </listitem>
575        </varlistentry>
577        <varlistentry>
578          <term><classname docapi="">RoleKey</classname></term>
579          <listitem>
580            <para>
581            Is used to give a user access to all items of a specific type, for example
582            <constant>READ</constant> all <constant>SAMPLES</constant>. The installation
583            will make sure that there already exists a role key for each type of item, and
584            it is not possible to add new or delete existing keys. Unlike the other two types
585            this key can be modified.
586            </para>
588            <para>
589            A role key is also used to assign permissions to plug-ins. If a plug-in has
590            been specified to use permissions the default is to deny everything.
591            The mapping to the role key is used to grant permissions to the plug-in.
592            The <varname>granted</varname> value gives the plug-in access to all items
593            of the related item type regardless of whether the user that is running the plug-in has the
594            permission or not. The <varname>denied</varname> value denies access to all
595            items of the related item type even if the logged-in user has the permission.
596            Permissions that are neither granted nor denied are checked against the
597            logged-in user's regular permissions. Permissions to items that are
598            not linked are always denied.
599            </para>
600          </listitem>
601        </varlistentry>
602        </variablelist>
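        <para>
          The rules for plug-in permissions can be summarized in a small
          decision function. This is a sketch only: the
          <classname>PluginPermissionSketch</classname> class is hypothetical,
          <varname>granted</varname> and <varname>denied</varname> are modelled
          as sets of permissions, and checking <varname>denied</varname> before
          <varname>granted</varname> is an assumption since the text does not
          say what happens if a permission appears in both.
        </para>
        <programlisting language="java"><![CDATA[
```java
import java.util.Set;

// Hypothetical decision function for a plug-in that uses permissions.
class PluginPermissionSketch
{
    enum Permission { READ, USE, WRITE, DELETE, CREATE }

    // linked: true if the plug-in is mapped to the role key of the item type.
    // userHasPermission: result of the logged-in user's regular check.
    static boolean isAllowed(Permission requested, boolean linked,
        Set<Permission> granted, Set<Permission> denied, boolean userHasPermission)
    {
        // Item types that are not linked are always denied.
        if (!linked) return false;
        // Assumption: an explicit denial wins over a grant.
        if (denied.contains(requested)) return false;
        // A grant applies even if the user lacks the permission.
        if (granted.contains(requested)) return true;
        // Neither granted nor denied: fall back to the user's permission.
        return userHasPermission;
    }
}
```
]]></programlisting>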
604      </sect3>
606      <sect3 id="data_api.authentication.permissions">
607        <title>Permissions</title>
609        <para>
610          The <varname>permission</varname> property appearing in many classes is an
611          integer value describing the permission:
612        </para>
614        <informaltable>
615        <tgroup cols="2">
616          <colspec colname="value" />
617          <colspec colname="permission" />
618          <thead>
619            <row>
620              <entry>Value</entry>
621              <entry>Permission</entry>
622            </row>
623          </thead>
624          <tbody>
625            <row>
626              <entry>1</entry>
627              <entry>Read</entry>
628            </row>
629            <row>
630              <entry>3</entry>
631              <entry>Use</entry>
632            </row>
633            <row>
634              <entry>7</entry>
635              <entry>Restricted write</entry>
636            </row>
637            <row>
638              <entry>15</entry>
639              <entry>Write</entry>
640            </row>
641            <row>
642              <entry>31</entry>
643              <entry>Delete</entry>
644            </row>
645            <row>
646              <entry>47 (=32+15)</entry>
647              <entry>Set owner</entry>
648            </row>
649            <row>
650              <entry>79 (=64+15)</entry>
651              <entry>Set permissions</entry>
652            </row>
653            <row>
654              <entry>128</entry>
655              <entry>Create</entry>
656            </row>
657            <row>
658              <entry>256</entry>
659              <entry>Denied</entry>
660            </row>
661          </tbody>
662        </tgroup>
663        </informaltable>
665        <para>
666          The values are constructed so that
667          <constant>READ</constant> -&gt;
668          <constant>USE</constant> -&gt;
669          <constant>RESTRICTED_WRITE</constant> -&gt;
670          <constant>WRITE</constant> -&gt;
671          <constant>DELETE</constant>
672          are chained in the sense that a higher permission always implies the lower permissions.
673          The <constant>SET_OWNER</constant> and <constant>SET_PERMISSION</constant> permissions
674          both imply <constant>WRITE</constant> permission. The <constant>DENIED</constant>
675          permission is only valid for role keys, and if specified it overrides all
676          other permissions.               
677        </para>
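        <para>
          Because of how the values in the table are constructed, checking
          whether one permission implies another can be done with a bit mask
          test. The sketch below uses the integer values from the table. The
          <classname>PermissionCodes</classname> class and its
          <methodname>implies()</methodname> method are hypothetical and only
          illustrate the arithmetic.
        </para>
        <programlisting language="java"><![CDATA[
```java
// Integer codes copied from the table above. The class is hypothetical.
class PermissionCodes
{
    static final int READ = 1;
    static final int USE = 3;
    static final int RESTRICTED_WRITE = 7;
    static final int WRITE = 15;
    static final int DELETE = 31;
    static final int SET_OWNER = 47;       // 32 + 15
    static final int SET_PERMISSION = 79;  // 64 + 15
    static final int CREATE = 128;
    static final int DENIED = 256;

    // True if the granted code contains every bit of the required code.
    // DENIED overrides all other permissions.
    static boolean implies(int granted, int required)
    {
        if ((granted & DENIED) != 0) return false;
        return (granted & required) == required;
    }
}
```
]]></programlisting>
        <para>
          For example, <constant>DELETE</constant> (31) contains all bits of
          <constant>READ</constant> (1), and <constant>SET_OWNER</constant>
          (47) contains all bits of <constant>WRITE</constant> (15) but not the
          extra bit that <constant>DELETE</constant> adds.
        </para>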
679        <para>
680          When combining permissions for a single item the permission codes for the different
681          paths are OR-ed together. For example a user has a role key with <constant>READ</constant>
682          permission for <constant>SAMPLES</constant>, but also an item key with <constant>USE</constant>
683          permission for a specific sample. Of course, the resulting permission for that
684          sample is <constant>USE</constant>. For other samples the resulting permission is
685          <constant>READ</constant>.
686        </para>
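        <para>
          The combination of permissions from several paths is a plain bitwise
          OR of the integer codes. The sketch below repeats the example with a
          role key and an item key. The
          <classname>CombinedPermissions</classname> class is hypothetical.
        </para>
        <programlisting language="java"><![CDATA[
```java
// Hypothetical helper that OR-s permission codes from different paths.
class CombinedPermissions
{
    static final int READ = 1;
    static final int USE = 3;

    // Combines the codes from, for example, a role key and an item key.
    static int combine(int... codes)
    {
        int result = 0;
        for (int code : codes)
        {
            result |= code;
        }
        return result;
    }
}
```
]]></programlisting>
        <para>
          With a role key giving <constant>READ</constant> and an item key
          giving <constant>USE</constant>,
          <code>combine(READ, USE)</code> evaluates to 3, which is
          <constant>USE</constant>. For a sample reached only through the role
          key, <code>combine(READ)</code> stays at 1.
        </para>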
688        <para>
689          If the user is also a member of a project which has <constant>WRITE</constant>
690          permission for the same sample, the user will have <constant>WRITE</constant>
691          permission when working with that project.
692        </para>
694        <para>
695          The <constant>RESTRICTED_WRITE</constant> permission is in most cases the same
696          as the <constant>WRITE</constant> permission. So far the <constant>RESTRICTED_WRITE</constant>
697          permission is only given to users for their own <classname docapi="">UserData</classname>
698          object so they can change their address and other contact information,
699          but not quota, expiration date and other administrative information.
700        </para>
702      </sect3>
703    </sect2>
705    <sect2 id="data_api.reporters">
706      <title>Reporters</title>
707      <para>
708         This section gives an overview of reporters in BASE.
709      </para>
711        <figure id="data_api.figures.reporters">
712          <title>Reporters</title>
713          <screenshot>
714            <mediaobject>
715              <imageobject>
716                <imagedata 
717                  align="center"
718                  fileref="figures/uml/datalayer.reporters.png" format="PNG" />
719              </imageobject>
720            </mediaobject>
721          </screenshot>
722        </figure>
724      <sect3 id="data_api.reporters.description">
725        <title>Reporters</title>
726        <para>
727          The <classname docapi="">ReporterData</classname> class holds information about reporters.
728          The <property>externalId</property> is a required property that must be unique
729          among all reporters. The external ID is the value BASE uses to match
730          reporters when importing data from files.
731        </para>
733        <para>
734          The <classname>ReporterData</classname> is an <emphasis>extendable</emphasis>
735          class, which means that the server administrator can define additional
736          columns (=annotations) in the reporters table. These are accessed with
737          the <methodname>ReporterData.getExtended()</methodname> and
738          <methodname>ReporterData.setExtended()</methodname> methods.
739          See <xref linkend="appendix.extendedproperties" /> for more information about
740          this.
741        </para>
743        <para>
          The <classname>ReporterData</classname> is also a <emphasis>batchable</emphasis>
          class, which means that there is no corresponding class in the core
          layer. Client applications and plug-ins should work directly with
          the <classname>ReporterData</classname> class. To help manage the reporters
          there are the <classname docapi="net.sf.basedb.core">Reporter</classname> and <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>
          classes. The main reason for this design
          is to increase performance and lower memory usage by bypassing
          internal caching in the core and Hibernate. Performance is also
          increased by the batchers, which use more efficient SQL against the
          database than Hibernate.
754        </para>
756        <para>
          The
          <property>lastUpdate</property>
          property holds the date and time the reporter information was last updated. The
          value is managed automatically by the
          <classname>ReporterBatcher</classname>
          class. The same goes for the
          <property>lastSource</property>
          property, which holds information about where the last update came from. By
          default this is set to the name of the logged in user, but it can be changed by
          calling
          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>
          before the batcher commits the updates to the database. The source string
          should have the format <synopsis>[ITEM_TYPE]:[ITEM_NAME]</synopsis> where, if the
          source is a file, ITEM_TYPE is File and ITEM_NAME is the file's name.
771        </para>
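        <para>
          As an illustration of the source string format, the following minimal
          sketch builds the string for a file. The <classname>UpdateSourceFormat</classname>
          class and its <methodname>format()</methodname> method are hypothetical helpers
          that are not part of BASE. Only the
          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>
          method and the format itself are taken from the description above.
        </para>
        <programlisting language="java">
// Hypothetical helper (not part of the BASE API) that builds an update
// source string in the [ITEM_TYPE]:[ITEM_NAME] format.
final class UpdateSourceFormat
{
   private UpdateSourceFormat()
   {}

   static String format(String itemType, String itemName)
   {
      if (itemType == null || itemType.isEmpty())
      {
         throw new IllegalArgumentException("itemType is required");
      }
      if (itemName == null || itemName.isEmpty())
      {
         throw new IllegalArgumentException("itemName is required");
      }
      return itemType + ":" + itemName;
   }

   public static void main(String[] args)
   {
      // The result is the kind of value one would pass to
      // ReporterBatcher.setUpdateSource(String) before the commit.
      System.out.println(format("File", "reporters.txt"));
   }
}
</programlisting>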
772      </sect3>
774      <sect3 id="data_api.reporters.lists">
775        <title>Reporter lists</title>
777        <para>
778          Reporter lists can be used to group reporters that are somehow related
779          to each other. This could for example be a list of interesting reporters
780          found in the analysis of an experiment. Each reporter in the list may
781          optionally be assigned a score. The meaning of the score value is not
782          interpreted by BASE.
783        </para>
785      </sect3>
788    </sect2>
790    <sect2 id="data_api.quota">
791      <title>Quota and disk usage</title>
792      <para>
         This section gives an overview of the quota system in BASE
         and how disk usage is tracked.
795      </para>
797        <figure id="data_api.figures.quota">
798          <title>Quota and disk usage</title>
799          <screenshot>
800            <mediaobject>
801              <imageobject>
802                <imagedata 
803                  align="center"
804                  fileref="figures/uml/datalayer.quota.png" format="PNG" />
805              </imageobject>
806            </mediaobject>
807          </screenshot>
808        </figure>
810      <sect3 id="data_api.quota.description">
811        <title>Quota</title>
813        <para>
          The <classname docapi="">QuotaData</classname> class holds information about a
          single quota registration. The same quota may be used by many different users
          and groups. This object encapsulates the allowed
          quota values for different quota types and locations.
          BASE defines several quota types (file, raw data and experiment),
          and locations (primary, secondary and offline).
820        </para>
822        <para>
823          The <property>quotaValues</property> property is a map from
824          <classname docapi="">QuotaIndex</classname> to maximum byte values.
825          This map must contain at least one entry for the total
826          quota at the primary location.
827        </para>
829      </sect3>
831      <sect3 id="data_api.quota.diskusage">
832        <title>Disk usage</title>
834        <para>
          A <interfacename docapi="">DiskConsumableData</interfacename> item (for example a file)
          is automatically linked to a <classname docapi="">DiskUsageData</classname>
          item. This holds information about the number of bytes,
          the location and the quota type the item uses. It also holds information
          about which user and, optionally, which group should be charged for the disk usage.
          The user is always the owner of the item.
841        </para>
843      </sect3>
845    </sect2>
847    <sect2 id="data_api.clients">
848      <title>Client, session and settings</title>
849      <para>
         This section gives an overview of client applications, sessions and settings in BASE.
851      </para>
853        <figure id="data_api.figures.clients">
854          <title>Client, sessions and settings</title>
855          <screenshot>
856            <mediaobject>
857              <imageobject>
858                <imagedata 
859                  align="center"
860                  scalefit="1" width="100%"
861                  fileref="figures/uml/datalayer.clients.png" format="PNG" />
862              </imageobject>
863            </mediaobject>
864          </screenshot>
865        </figure>
867      <sect3 id="data_api.clients.description">
868        <title>Clients</title>
869        <para>
870          The <classname docapi="">ClientData</classname> class holds information
871          about a client application. The <property>externalId</property>
872          property is a unique identifier for the application. To avoid ID clashes the ID
873          should be constructed in the same way as Java packages, for example
874          <constant>net.sf.basedb.clients.web</constant> is the ID for the
875          web client application.
876        </para>
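        <para>
          A client application could check its own ID before registering. The
          sketch below is one possible check under stated assumptions: the
          <classname>ClientIdCheck</classname> class is hypothetical and not part
          of BASE, and the rule it applies (lower-case segments, at least two of
          them) is only our reading of "constructed in the same way as Java packages".
        </para>
        <programlisting language="java">
// Hypothetical check (not part of BASE): does an external ID look like a
// Java package name, for example net.sf.basedb.clients.web?
final class ClientIdCheck
{
   // Two or more dot-separated segments, each a lower-case identifier.
   // Both restrictions are assumptions made for this sketch.
   private static final String PACKAGE_STYLE =
      "[a-z_][a-z0-9_]*(\\.[a-z_][a-z0-9_]*)+";

   private ClientIdCheck()
   {}

   static boolean isPackageStyle(String externalId)
   {
      if (externalId == null) return false;
      return externalId.matches(PACKAGE_STYLE);
   }

   public static void main(String[] args)
   {
      System.out.println(isPackageStyle("net.sf.basedb.clients.web"));
   }
}
</programlisting>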
878        <para>
          A client application doesn't have to be registered with BASE
          to be used, but we recommend it since:
881        </para>
883        <itemizedlist>
884        <listitem>
885          <para>
            The permission system allows an admin to specify exactly
            which users may use a specific application.
888          </para>
889        </listitem>
891        <listitem>
892          <para>
893          The application can't store any context-sensitive or application-specific
894          settings unless it is registered.
895          </para>
896        </listitem>
898        <listitem>
899          <para>
900          The application can store context-sensitive help in the BASE
901          database.
902          </para>
903        </listitem>
904        </itemizedlist>
905      </sect3>
907      <sect3 id="data_api.clients.sessions">
908        <title>Sessions</title>
910        <para>
911          A session represents the time between login and logout for a single
912          user. The <classname docapi="">SessionData</classname> object is entirely
913          managed by the BASE core, and should be considered read-only
914          for client applications.
915        </para>
917      </sect3>
919      <sect3 id="data_api.clients.settings">
920        <title>Settings</title>
922        <para>
923          There are two types of settings: context-sensitive settings and regular
924          settings. The regular settings are simple key-value pairs of strings
925          and can be used for almost anything. There are four subtypes:
926        </para>
928        <itemizedlist>
929        <listitem>
930          <para>
931          Global default settings: Settings that are used by all users
932          and client applications on the BASE server. These settings
933          are read-only except for administrators. BASE has not yet defined
934          any settings of this type.
935          </para>
936        </listitem>
938        <listitem>
939          <para>
940          User default settings: Settings that are valid for a single user
941          for any client application. BASE has not yet defined
942          any settings of this type.
943          </para>
944        </listitem>
946        <listitem>
947          <para>
948          Client default settings: Settings that are valid for all users using
949          a specific client application. Each client application is responsible
          for defining its own settings. Settings are read-only except
951          for administrators.
952          </para>
953        </listitem>
955        <listitem>
956          <para>
957          User client settings: Settings that are valid for a single user using
958          a specific client application. Each client application is responsible
          for defining its own settings.
960          </para>
961        </listitem>
963        </itemizedlist>
965        <para>
966          The context-sensitive settings are designed to hold information
967          about the current status of options related to the listing of items
968          of a specific type. This includes:
969        </para>
971        <itemizedlist>
972        <listitem>
973          <para>
          Current filtering options (as one or more <classname docapi="">PropertyFilterData</classname>
975          objects).
976          </para>
977        </listitem>
979        <listitem>
980          <para>
981          Which columns and direction to use for sorting.
982          </para>
983        </listitem>
985        <listitem>
986          <para>
          The number of items to display on each page, and which page
          is the current page.
989          </para>
990        </listitem>
992        <listitem>
993          <para>
994          Simple key-value settings related to a given context.
995          </para>
996        </listitem>
997        </itemizedlist>
999        <para>
1000          Context-sensitive settings are only accessible if a client
1001          application has been registered. The settings may be
1002          named to make it possible to store several presets and to
1003          quickly switch between them. In any case, BASE maintains a
1004          current default setting with an empty name. An administrator
1005          may mark a named setting as public to allow other users to
1006          use it.
1007        </para>
1009      </sect3>
1012    </sect2>
1014    <sect2 id="data_api.files">
1015      <title>Files and directories</title>
1017      <para>
1018        This section covers the details of the BASE file
1019        system.
1020      </para>
1022        <figure id="data_api.figures.files">
1023          <title>Files and directories</title>
1024          <screenshot>
1025            <mediaobject>
1026              <imageobject>
1027                <imagedata 
1028                  align="center"
1029                  fileref="figures/uml/datalayer.files.png" format="PNG" />
1030              </imageobject>
1031            </mediaobject>
1032          </screenshot>
1033        </figure>
1035        <para>
1036          The <classname docapi="">DirectoryData</classname> class holds
1037          information about directories. Directories are organised in the
          usual way as a tree structure. All directories must have
1039          a parent directory, except the system-defined root directory.
1040        </para>
1042        <para>
          The <classname docapi="">FileData</classname> class holds information about
          a file. The actual file content is stored on disk in the directory
          specified by the <varname>userfiles</varname> setting in
          <filename>base.config</filename>. The <varname>internalName</varname>
          property is the name of the file on disk, but this is never exposed to
          client applications. The filenames and directories
          on disk don't correspond to the filenames and directories in
          BASE.
1051        </para>
1053        <para>
1054          The <varname>url</varname> property is used for file items which are stored in
1055          an external location. In this case there is no local file data on the
1056          BASE server.
1057        </para>
1059        <para>
          The <varname>location</varname> property can take four values:
1061        </para>
1063        <itemizedlist>
1064        <listitem>
1065          <para>
          0 = The file is offline, i.e. there is no file on the disk
          </para>
        </listitem>
        <listitem>
          <para>
          1 = The file is in primary storage, i.e. it is located on the disk
          and can be used by BASE
          </para>
        </listitem>
        <listitem>
          <para>
          2 = The file is in secondary storage, i.e. it has been moved to some
          other place and can't be used by BASE immediately.
1079          </para>
1080        </listitem>
1081        <listitem>
1082          <para>
1083          3 = The file is an external file whose location is referenced by the
1084          <varname>url</varname> property. If the file is protected by passwords
1085          or certificates the file item may reference a
1086          <classname docapi="">FileServerData</classname>
1087          object. Note that an external file in most cases can be used by client
1088          applications/plug-ins as if the file was stored locally on the BASE
1089          server.
1090          </para>
1091        </listitem>
1092        </itemizedlist>
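        <para>
          Client code that handles these numeric codes may find it convenient to
          wrap them in an enumeration. The <classname>FileLocation</classname> enum
          below is a hypothetical sketch. Its name, its constants and its methods
          are assumptions made for illustration. Only the codes 0 to 3 and their
          meaning come from the list above.
        </para>
        <programlisting language="java">
// Hypothetical enum (names are assumptions, not the BASE API) that mirrors
// the four values of the location property.
enum FileLocation
{
   OFFLINE(0),    // there is no file on the disk
   PRIMARY(1),    // on the disk and usable by BASE
   SECONDARY(2),  // moved elsewhere, not immediately usable
   EXTERNAL(3);   // referenced by the url property

   private final int code;

   FileLocation(int code)
   {
      this.code = code;
   }

   int getCode()
   {
      return code;
   }

   // Look up the constant that corresponds to a stored code.
   static FileLocation fromCode(int code)
   {
      for (FileLocation location : values())
      {
         if (location.code == code) return location;
      }
      throw new IllegalArgumentException("Unknown location code: " + code);
   }

   // Primary files can be used directly. External files can be used as if
   // they were local in most cases, so this is an optimistic answer for them.
   boolean isNormallyUsable()
   {
      return this == PRIMARY || this == EXTERNAL;
   }
}
</programlisting>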
1094        <para>
1095          The <varname>action</varname> property controls how a file is
          moved between primary and secondary storage. It can have the following
1097          values:
1098        </para>
1100        <itemizedlist>
1101        <listitem>
1102          <para>
1103          0 = Do nothing
1104          </para>
1105        </listitem>
1106        <listitem>
1107          <para>
1108          1 = If the file is in secondary storage, move it back to the primary storage
1109          </para>
1110        </listitem>
1111        <listitem>
1112          <para>
1113          2 = If the file is in primary storage, move it to the secondary storage
1114          </para>
1115        </listitem>
1116        </itemizedlist>
1118        <para>
1119          The actual moving between primary and secondary storage is done by an
1120          external program. See
1121          <xref linkend="appendix.base.config.secondary" /> and
1122          <xref linkend="plugin_developer.other.secondary" /> for more information.
1123        </para>
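        <para>
          The rules for the <varname>action</varname> property can be summarised
          in a few lines of code. The sketch below is not the external program
          mentioned above and not part of BASE. The <classname>FileMove</classname>
          class, its constant names and the <methodname>locationAfter()</methodname>
          method are assumptions. It only computes which location code a file would
          have after the requested action, using the numeric codes of the
          <varname>location</varname> and <varname>action</varname> properties.
        </para>
        <programlisting language="java">
// Hypothetical constants and helper. The names are assumptions, only the
// numeric codes come from the description of the two properties.
final class FileMove
{
   static final int PRIMARY = 1;
   static final int SECONDARY = 2;

   static final int ACTION_NOTHING = 0;
   static final int ACTION_MOVE_TO_PRIMARY = 1;
   static final int ACTION_MOVE_TO_SECONDARY = 2;

   private FileMove()
   {}

   // Returns the location code the file should have once the requested
   // action has been carried out, or the unchanged code if the action
   // does not apply to the current location.
   static int locationAfter(int location, int action)
   {
      switch (action)
      {
         case ACTION_MOVE_TO_PRIMARY:
            return location == SECONDARY ? PRIMARY : location;
         case ACTION_MOVE_TO_SECONDARY:
            return location == PRIMARY ? SECONDARY : location;
         default:
            return location;
      }
   }
}
</programlisting>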
1125        <para>
1126          The <varname>md5</varname> property can be used to check for file
1127          corruption when it is moved between primary and secondary storage or
1128          when a user re-uploads a file that has been offline.
1129        </para>
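        <para>
          A corruption check of this kind amounts to recomputing the MD5 sum and
          comparing it with the stored value. The sketch below uses the standard
          <classname>java.security.MessageDigest</classname> class. The
          <classname>Md5Check</classname> class is hypothetical, and it assumes that
          the <varname>md5</varname> property holds the digest as a hexadecimal
          string, which is not stated above.
        </para>
        <programlisting language="java">
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of BASE.
final class Md5Check
{
   private Md5Check()
   {}

   // Compute the MD5 sum of the content as a lower-case hex string.
   static String md5Hex(byte[] content)
   {
      MessageDigest digest;
      try
      {
         digest = MessageDigest.getInstance("MD5");
      }
      catch (NoSuchAlgorithmException ex)
      {
         throw new IllegalStateException("MD5 is not available", ex);
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest(content))
      {
         hex.append(String.format("%02x", b));
      }
      return hex.toString();
   }

   // Assumption: the stored value is a hex string, compared ignoring case.
   static boolean matches(String storedMd5, byte[] content)
   {
      return md5Hex(content).equalsIgnoreCase(storedMd5);
   }

   public static void main(String[] args)
   {
      byte[] content = "abc".getBytes(StandardCharsets.UTF_8);
      System.out.println(md5Hex(content));
   }
}
</programlisting>
        <para>
          For large files it would be better to feed the digest from a stream
          instead of loading the whole content into memory.
        </para>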
1131        <para>
1132          BASE can store files in a compressed format. This is handled internally
1133          and is not visible to client applications. The <varname>compressed</varname>
1134          and <varname>compressedSize</varname> properties are used to store information
          about this. A file can always be compressed if the user says so, but
          BASE can also do this automatically if the file is uploaded
          to a directory with the <varname>autoCompress</varname> flag set
          or if the file has a MIME type with the <varname>autoCompress</varname>
          flag set.
1140        </para>
1142        <para>
1143          The <classname docapi="">FileServerData</classname> class
1144          holds information about an external file server. If the <varname>connectionManagerFactory</varname>
1145          isn't set BASE automatically selects a factory based on the URL of the file. There is
          built-in support for HTTP and HTTPS, but it is possible to install extensions that
          add support for other protocols. The <varname>host</varname> property can be set
1148          to override the host part of the URL from the file. See <xref 
1149          linkend="extensions_developer.connection_manager" /> for more
1150          information about connection managers.
1151        </para>
1153        <para>
1154          The <varname>username</varname> and <varname>password</varname> properties are used if
1155          the server requires the user to be logged in. BASE has built-in support for Basic and
1156          Digest authentication. The <varname>serverCertificate</varname> can be used with HTTPS
          servers that use a non-trusted certificate to tell BASE to trust the server anyway.
          In most cases, this is only needed if the server uses a self-signed certificate, but could, for
          example, also be used if a trusted site has forgotten to renew an expired certificate.
1160          The server certificate should be an X.509 certificate in either binary or text format.
1161          The <varname>clientCertificate</varname> and <varname>clientPassword</varname>
1162          properties are used for servers that require that users present a valid client
1163          certificate before they are allowed access. The client certificate is usually issued
1164          by the server operator and must be in PKCS #12 format.
1165        </para>
1167        <para>
1168          The <classname docapi="">FileTypeData</classname> class holds information about
1169          file types. It is used only to make it easier for users to organise
1170          their files.
1171        </para>
1173        <para>
          The <classname docapi="">MimeTypeData</classname> class is used to register MIME types and
          map them to file extensions. The information is only used to look up values
          when needed. Given the filename we can set the <varname>File.mimeType</varname>
          and <varname>File.fileType</varname> properties. The MIME type is also
          used to decide if a file should be stored in a compressed format or not.
          The extension of a MIME type must be unique. Extensions should be registered
          without a dot, i.e. <emphasis>html</emphasis>, not <emphasis>.html</emphasis>.
1181        </para> 
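        <para>
          Looking up a MIME type from a filename requires the extension in the
          same dot-less form. The following sketch shows one way to extract it.
          The <classname>FileExtension</classname> class is hypothetical and not
          part of BASE, and converting the extension to lower case is an assumption
          made here, not something the description above requires.
        </para>
        <programlisting language="java">
import java.util.Locale;

// Hypothetical helper, not part of BASE.
final class FileExtension
{
   private FileExtension()
   {}

   // Returns the extension without the dot ("html", not ".html"),
   // or null if the filename has no extension.
   static String of(String filename)
   {
      int dot = filename.lastIndexOf('.');
      // No dot at all, or a dot as the very last character.
      if (dot == -1 || dot == filename.length() - 1) return null;
      // Lower-casing is an assumption made for this sketch.
      return filename.substring(dot + 1).toLowerCase(Locale.ROOT);
   }

   public static void main(String[] args)
   {
      System.out.println(of("report.HTML"));
   }
}
</programlisting>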
1183    </sect2>
1185    <sect2 id="data_api.platforms">
1186      <title>Experimental platforms and item subtypes</title>
1188      <para>
1189         This section gives an overview of experimental platforms
1190         and how they are used to enable data storage in files
1191         instead of in the database. In some senses item subtypes
1192         are related to platforms so they are also included here.
1193      </para>
1195      <itemizedlist>
1196        <title>See also</title>
1197        <listitem><xref linkend="core_api.data_in_files" /></listitem>
1198        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
1199        <listitem><xref linkend="extensions_developer.fileset_validator" /></listitem>
1200      </itemizedlist>
1202        <figure id="data_api.figures.platforms">
1203          <title>Experimental platforms and item subtypes</title>
1204          <screenshot>
1205            <mediaobject>
1206              <imageobject>
1207                <imagedata 
1208                  align="center"
1209                  fileref="figures/uml/datalayer.platforms.png" format="PNG" />
1210              </imageobject>
1211            </mediaobject>
1212          </screenshot>
1213        </figure>
1215      <sect3 id="data_api.platforms.platforms">
1216        <title>Platforms</title>
1218        <para>
          The <classname docapi="">PlatformData</classname> class holds information about a
          platform. A platform can have one or more <classname docapi="">PlatformVariant</classname> items.
          Both the platform and variant are identified by an external ID that
          is fixed and can't be changed. <emphasis>Affymetrix</emphasis>
          is an example of a platform.
          If the <varname>fileOnly</varname> flag is set, data for the platform
          can only be stored in files and not imported into the database. If
          the flag is not set, data can be imported into the database.
          In the latter case, the <varname>rawDataType</varname> property
          can be used to lock the platform
          to a specific raw data type. If the value is <constant>null</constant>
          the platform can use any raw data type.
1231        </para>
1233        <para>
          Each platform and its variants can be connected to one or more
          <classname docapi="">DataFileTypeData</classname> items. This item
          describes the kind of files that are used to hold data for
          the platform and/or variant. The file types are re-usable between
          different platforms and variants. Note that a file type may be attached
          either to a platform only or to a platform with a variant. File
          types attached to platforms are inherited by the variants. The variants
          can only define additional file types, not remove or redefine file types
          that have been attached to the platform.
1243        </para>
1244        <para>
          The file type is also identified
          by a fixed, non-changeable external ID. The <varname>itemType</varname>
          property tells us what type of item the file holds data for (i.e.
          array design or raw bioassay). It also links to an
          <classname docapi="">ItemSubtype</classname>,
          which is the generic type of data in the file. This allows us to query
          the database for, as an example, files with the generic type
          <constant>FileType.RAW_DATA</constant>. If we are in an Affymetrix
          experiment we will get the CEL file; for another platform we will
          get another file.
1255        </para>
1256        <para>
1257          The <varname>required</varname> flag in <classname docapi="">PlatformFileTypeData</classname>
1258          is used to signal that the file is a required file. This is not
1259          enforced by the core. It is intended to be used by client applications
1260          for creating a better GUI and for validation of an experiment.
1261        </para>
1262        <para>
          The <varname>allowMultiple</varname> flag in <classname docapi="">PlatformFileTypeData</classname>
          controls if it should be possible to store more than one file of
          the given type in a file set. Again, this is not enforced by the core,
          but is only a recommendation to client applications. The setting is
          also used for validation of an experiment.
1269        </para>
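        <para>
          Since neither flag is enforced by the core, a client application has to
          do the check itself. The sketch below shows what such a check could look
          like. The <classname>FileTypeRule</classname> class, its
          <methodname>validate()</methodname> method and the messages are
          hypothetical. Only the meaning of the <varname>required</varname> and
          <varname>allowMultiple</varname> flags is taken from the text above.
        </para>
        <programlisting language="java">
// Hypothetical client-side check, not part of BASE.
final class FileTypeRule
{
   private final boolean required;
   private final boolean allowMultiple;

   FileTypeRule(boolean required, boolean allowMultiple)
   {
      this.required = required;
      this.allowMultiple = allowMultiple;
   }

   // Returns null if the number of files of this type is acceptable,
   // otherwise a short message that a GUI could display.
   String validate(int fileCount)
   {
      if (fileCount == 0)
      {
         return required ? "A file of this type is required" : null;
      }
      if (fileCount > 1)
      {
         return allowMultiple ? null : "Only one file of this type is allowed";
      }
      return null;
   }
}
</programlisting>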
1271      </sect3>
1273      <sect3 id="data_api.platforms.subtypes">
1274        <title>Item subtypes</title>
1276        <para>
1277          The <classname docapi="">ItemSubtypeData</classname> 
1278          class describes a subtype for a main <varname>itemType</varname>. In the simplest
1279          form the subtype is a kind of annotation that is used mainly for creating a
          better user experience. If the main item type also implements the
          <interfacename docapi="">FileStoreEnabledData</interfacename> 
          interface, it is possible to
          register associations to the file types that can be used together with a given
          item subtype. The <varname>required</varname> and <varname>allowMultiple</varname>
          flags are used in the same way as in the <classname>PlatformFileTypeData</classname>
1286          class.
1287        </para>
1289        <para>
1290          A subtype can be related to other subtypes. This is used to "chain" together
1291          groups of item subtypes. For example, <constant>Hybridization</constant>
1292          is a subtype for <constant>PHYSICALBIOASSAY</constant>, which is related to
1293          the <constant>Labeled extract (EXTRACT)</constant> subtype which is related to
1294          the <constant>Label (TAG)</constant> subtype. In addition, there are also
          several protocol and hardware subtypes mixed into this. The relationship between
          subtypes makes it possible for client applications to filter out unrelated items,
1297          and to validate experiments.
1298        </para>
1300      </sect3>
1302      <sect3 id="data_api.platforms.files">
1303        <title>FileStoreEnabled items and data files</title>
1305        <para>
1306          An item must implement the <interfacename docapi="">FileStoreEnabledData</interfacename>
1307          interface to be able to store data in files instead of in the database.
1308          The interface creates a link to a <classname docapi="">FileSetData</classname> object,
1309          which can hold several <classname docapi="">FileSetMemberData</classname> items.
          Each member points to a specific <classname docapi="">FileData</classname> item.
1311        </para>
1313      </sect3>
1314    </sect2>
1316    <sect2 id="data_api.parameters">
1317      <title>Parameters</title>
1319      <para>
        This section gives an overview of the generic parameter
1321        system in BASE that is used to store annotation values,
1322        plugin configuration values, job parameter values, etc.
1323      </para>
1325        <figure id="data_api.figures.parameters">
1326          <title>Parameters</title>
1327          <screenshot>
1328            <mediaobject>
1329              <imageobject>
1330                <imagedata 
1331                  align="center"
1332                  fileref="figures/uml/datalayer.parameters.png" format="PNG" />
1333              </imageobject>
1334            </mediaobject>
1335          </screenshot>
1336        </figure>
1338        <para>
1339          The parameter system is a generic system that can store almost
1340          any kind of simple values (string, numbers, dates, etc.) and
1341          also links to other items. It is, for example, used to store configuration
1342          parameters to plug-ins and jobs as well as annotation values to annotatable items.
1343          The <classname docapi="">ParameterValueData</classname> 
1344          class is an abstract base class that can hold multiple values (all must be of the
1345          same type). Unless only a specific type of values should be stored, this is
1346          the class that should be used when creating references for storing parameter
1347          values. It makes it possible for a single relation to use any kind of
1348          values or for a collection reference to mix multiple types of values.
1349          A typical use case maps a <classname>Map</classname> with the
1350          parameter name as the key:
1351        </para>
        <programlisting language="java">
private Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues;
/**
   Link parameter name with its values.
   @hibernate.map table="`PluginConfigurationValues`" lazy="true" cascade="all"
   @hibernate.collection-key column="`pluginconfiguration_id`"
   @hibernate.collection-index column="`name`" type="string" length="255"
   @hibernate.collection-many-to-many column="`value_id`"
      class=""
*/
public Map&lt;String, ParameterValueData&lt;?&gt;&gt; getConfigurationValues()
{
   return configurationValues;
}
void setConfigurationValues(Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues)
{
   this.configurationValues = configurationValues;
}
</programlisting>
1373      <para>
1374      Now it is possible for the collection to store all types of values:
1375      </para>
      <programlisting language="java">
Map&lt;String, ParameterValueData&lt;?&gt;&gt; config = ...
config.put("names", new StringParameterValueData("A", "B", "C"));
config.put("sizes", new IntegerParameterValueData(10, 20, 30));

// When you later load those values again you have to cast
// them to the correct class.
List&lt;String&gt; names = (List&lt;String&gt;)config.get("names").getValues();
List&lt;Integer&gt; sizes = (List&lt;Integer&gt;)config.get("sizes").getValues();
</programlisting>
1388    </sect2>
1390    <sect2 id="data_api.annotations">
1391      <title>Annotations</title>
1393      <para>
1394        This section gives an overview of how the BASE annotation
1395        system works.
1396      </para>
1398        <figure id="data_api.figures.annotations">
1399          <title>Annotations</title>
1400          <screenshot>
1401            <mediaobject>
1402              <imageobject>
1403                <imagedata 
1404                  align="center"
1405                  fileref="figures/uml/datalayer.annotations.png" format="PNG" />
1406              </imageobject>
1407            </mediaobject>
1408          </screenshot>
1409        </figure>
1411      <sect3 id="data_api.annotations.description">
1412        <title>Annotations</title>
1414        <para>
1415        An item must implement the <interfacename docapi="">AnnotatableData</interfacename>
1416        interface to be able to use the annotation system. This interface gives
        a link to an <classname docapi="">AnnotationSetData</classname> item. This class
1418        encapsulates all annotations for the item. There are two types of
1419        annotations:
1420        </para>
1422        <itemizedlist>
1423        <listitem>
1424          <para>
          <emphasis>Primary annotations</emphasis> are annotations that
          explicitly belong to the item. An annotation set can contain
          only one primary annotation of each annotation type. The primary
          annotations are linked with the <property>annotations</property>
          property. This property is a map with an
          <classname docapi="">AnnotationTypeData</classname> as the key.
1431          </para>
1432        </listitem>
1434        <listitem>
1435          <para>
1436          <emphasis>Inherited annotations</emphasis> are annotations
1437          that belong to a parent item, but that we want to use on
1438          another item as well. Inherited annotations are saved as
1439          references to either a single annotation or to another
1440          annotation set. Thus, it is possible for an item to inherit
1441          multiple annotations of the same annotation type.
1442          </para>
1443        </listitem>
1444        </itemizedlist>
1446        <para>
1447          The <classname docapi="">AnnotationData</classname> class is also
1448          just a placeholder. It connects the annotation set and
1449          annotation type with a <classname docapi="">ParameterValueData</classname>
1450          object. This is the object that holds the actual annotation
1451          values.
1452        </para>
1454      </sect3>
1456      <sect3 id="data_api.annotations.types">
1457        <title>Annotation types</title>
1459        <para>
        Instances of the <classname docapi="">AnnotationTypeData</classname> class
        define the various annotations. Each instance must have a <property>valueType</property> 
        property, which cannot be changed. The value of this property controls
        which <classname docapi="">ParameterValueData</classname> subclass is used to store
        the annotation values, i.e. <classname docapi="">IntegerParameterValueData</classname>,
1465        <classname docapi="">StringParameterValueData</classname>, etc.
1466        The <property>multiplicity</property> property holds the maximum allowed
1467        number of values for an annotation, or 0 if an unlimited number is
1468        allowed.
1469        </para>
1471        <para>
1472        The <property>itemTypes</property> collection holds the codes for
1473        the types of items the annotation type can be used on. This is
1474        checked when new annotations are created but already existing
1475        annotations are not affected if the collection is modified.
1476        </para>
1478        <para>
1479        Annotation types with the <property>protocolParameter</property> flag set
1480        are treated a bit differently. They will not show up as annotations
1481        to items with a type found in the <property>itemTypes</property> collection.
1482        Instead, a protocol parameter should be attached to a protocol. Then, when an item
1483        is using that protocol it becomes possible to add annotation values for
1484        the annotation types specified as protocol parameters. It doesn't matter
1485        if the item's type is found in the <property>itemTypes</property> 
1486        collection or not.
1487        </para>
1489        <para>
1490        The <property>options</property> collection is used to store additional
1491        options required by some of the value types, for example a max string
1492        length for string annotations or the max and min allowed value for
1493        integer annotations.
1494        </para>
1496        <para>
1497        The <property>enumeration</property> property is a boolean flag
1498        indicating if the allowed values are predefined as an enumeration.
1499        In that case those values are found in the <property>enumerationValues</property>
1500        property. The actual <classname>ParameterValueData</classname> subclass used to
1501        hold them is determined by the <property>valueType</property> property.
1502        </para>
1504        <para>
1505        Most of the other properties are hints to client applications about how
1506        to render the input field for the annotation.
1507        </para>
1509      </sect3>
1511      <sect3 id="data_api.annotations.units">
1512        <title>Units</title>
1513        <para>
1514        Numerical annotation values can have units. A unit is described by
1515        a <classname docapi="">UnitData</classname> object.
1516        Each unit belongs to a <classname docapi="">QuantityData</classname> 
1517        object which defines the class of units. For example, if the quantity is
1518        <emphasis>weight</emphasis>, we can have units such as <emphasis>kg</emphasis>,
1519        <emphasis>mg</emphasis>, <emphasis>µg</emphasis>, etc. The <classname>UnitData</classname>
1520        contains a factor and offset that relate the unit to a common reference unit
1521        defined by the <classname>QuantityData</classname> class. For example,
1522        <emphasis>meter</emphasis> is the reference unit for distance, and we
1523        have <code>value in millimeters * 0.001 = value in meters</code>. In this case, the factor is
1524        <emphasis>0.001</emphasis> and the offset 0. Another example is the relationship between
1525        kelvin and Celsius, which is <code>value in °Celsius + 273.15 = value in kelvin</code>.
1526        Here, the factor is 1 and the offset is <emphasis>+273.15</emphasis>.
1527        The <classname
1528        docapi="">UnitSymbolData</classname>
1529        is used to make it possible to assign alternative symbols to a single unit.
1530        This is needed to simplify input where it may be hard to know what to
1531        type to get <emphasis>m²</emphasis> or <emphasis>°C</emphasis>. Instead,
1532        <emphasis>m2</emphasis> and <emphasis>C</emphasis> can be used as
1533        alternative symbols.
1534        </para>
1536        <para>
1537        The creator of an annotation type may select a
1538        <classname>QuantityData</classname>, which can't be changed later, and
1539        a default <classname>UnitData</classname>. When entering annotation values
1540        a user may select any unit for the selected quantity (unless the owner of the
1541        annotation type has limited this by selecting <varname>usableUnits</varname>). Before
1542        the values are stored in the database, they are converted to the default
1543        unit. This makes it possible to compare and filter on annotation values
1544        using different units. For example, filtering with <emphasis>&gt;5mg</emphasis> 
1545        also finds items that are annotated with <emphasis>2g</emphasis>.
1546        </para>
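        <para>
        The factor and offset relationship, and the conversion to the default
        unit before storage, can be sketched as follows. This is only an
        illustration and not BASE code: the <classname>UnitSketch</classname>
        class, its methods and the choice of kg as the reference unit for
        weight are assumptions made for the example.
        </para>
        <programlisting language="java"><![CDATA[
// Illustrative sketch only; these classes are NOT part of the BASE API.
final class UnitSketch
{
  /** A unit described by a factor and an offset relative to a reference unit. */
  static final class Unit
  {
    final String symbol;
    final double factor;
    final double offset;

    Unit(String symbol, double factor, double offset)
    {
      this.symbol = symbol;
      this.factor = factor;
      this.offset = offset;
    }

    /** Convert a value in this unit to the reference unit. */
    double toReference(double value)
    {
      return value * factor + offset;
    }

    /** Convert a value in the reference unit to this unit. */
    double fromReference(double referenceValue)
    {
      return (referenceValue - offset) / factor;
    }
  }

  /** Convert a value between two units of the same quantity. */
  static double convert(double value, Unit from, Unit to)
  {
    return to.fromReference(from.toReference(value));
  }

  public static void main(String[] args)
  {
    // Assumption: kg is the reference unit for weight
    Unit g = new Unit("g", 1e-3, 0);
    Unit mg = new Unit("mg", 1e-6, 0);

    // A value entered as 2 g is converted to the default unit (mg)
    // before it is stored, so a filter such as "> 5 mg" can match it
    double stored = convert(2, g, mg);
    System.out.println(stored > 5);
  }
}
]]></programlisting>
        <para>
        Since every value passes through the reference unit, each unit only
        needs to know its own factor and offset, not how to convert to every
        other unit of the same quantity.
        </para>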
1548        <para>
1549        The core should automatically update the stored annotation values if
1550        the default unit is changed for an annotation type, or if the reference
1551        factor for a unit is changed.
1552        </para>
1553      </sect3>
1555      <sect3 id="data_api.annotations.categories">
1556        <title>Categories</title>
1558        <para>
1559        The <classname docapi="">AnnotationTypeCategoryData</classname> class defines
1560        categories that are used to group annotation types that are related to
1561        each other. This information is mainly useful for client applications
1562        that wish to provide a clearer interface when displaying forms for
1563        annotating items that have many (say 50+) annotation types.
1564        An annotation type can belong to more than one category.
1565        </para>
1567      </sect3>
1569    </sect2>
1571    <sect2 id="data_api.protocols">
1572      <title>Protocols, hardware and software</title>
1574      <para>
1575        This section gives an overview of how protocols that describe various
1576        processes, such as sampling, extraction and scanning, are used in BASE.
1577      </para>
1579        <figure id="data_api.figures.protocols">
1580          <title>Protocols, hardware and software</title>
1581          <screenshot>
1582            <mediaobject>
1583              <imageobject>
1584                <imagedata 
1585                  align="center"
1586                  fileref="figures/uml/datalayer.protocols.png" format="PNG" />
1587              </imageobject>
1588            </mediaobject>
1589          </screenshot>
1590        </figure>
1592      <sect3 id="data_api.protocols.description">
1593        <title>Protocols</title>
1595        <para>
1596        A protocol is something that defines a procedure or recipe for some
1597        kind of action, such as sampling, extraction and scanning. The subtype
1598        of the protocol is used to determine what the protocol is used for.
1599        In BASE we only store a short name and description. It is possible to
1600        attach a file that provides a longer description of the procedure.
1601        </para>
1603      </sect3>
1605      <sect3 id="data_api.protocols.parameters">
1606        <title>Parameters</title>
1608        <para>
1609        The procedure described by the protocol may have parameters
1610        that are set independently each time the protocol is used. It
1611        could for example be a temperature, a time or something else.
1612        The definition of parameters is done by creating annotation
1613        types and attaching them to the protocol. It is only possible
1614        to attach annotation types that have the <property>protocolParameter</property>
1615        property set to <constant>true</constant>. The same annotation type
1616        can be used for more than one protocol, but only do this if the
1617        parameters actually have the same meaning.
1618        </para>
1620      </sect3>
1622      <sect3 id="data_api.wares.description">
1623        <title>Hardware and software</title>
1624        <para>
1625          BASE is pre-installed with a set of subtypes for hardware and software.
1626          They are typically used to filter the registered hardware and software
1627          depending on what a user is doing. For example, when adding raw data
1628          to BASE a user can select a scanner. The GUI will display the hardware
1629          that has been registered with the <emphasis>scanner</emphasis> subtype.
1630          Other subtypes are <emphasis>hybridization station</emphasis>
1631          and <emphasis>print robot</emphasis>. An administrator may register more
1632          subtypes.
1633        </para>
1634      </sect3>
1636    </sect2>
1638    <sect2 id="data_api.plugins">
1639      <title>Plug-ins, jobs and job agents</title>
1641      <para>
1642         This section gives an overview of plug-ins, jobs and job agents.
1643      </para>
1645      <itemizedlist>
1646        <title>See also</title>
1647        <listitem><xref linkend="plugins.installation" /></listitem>
1648        <listitem><xref linkend="installation.jobagents" /></listitem>
1649      </itemizedlist>
1651        <figure id="data_api.figures.plugins">
1652          <title>Plug-ins, jobs and job agents</title>
1653          <screenshot>
1654            <mediaobject>
1655              <imageobject>
1656                <imagedata 
1657                  align="center"
1658                  scalefit="1" width="100%"
1659                  fileref="figures/uml/datalayer.plugins.png" format="PNG" />
1660              </imageobject>
1661            </mediaobject>
1662          </screenshot>
1663        </figure>
1665      <sect3 id="data_api.plugins.plugins">
1666        <title>Plug-ins</title>
1668        <para>
1669          The <classname docapi="">PluginDefinitionData</classname> holds information about the
1670          installed plug-in classes. Much of the information is copied from the
1671          plug-in itself, either from its <classname docapi="net.sf.basedb.core.plugin">About</classname> object or by checking
1672          which interfaces it implements.
1673        </para>
1675        <para>
1676          There are five main types of plug-ins:
1677        </para>
1679        <itemizedlist>
1680        <listitem>
1681          <para>
1682          IMPORT (mainType = 1): A plug-in that imports data to BASE.
1683          </para>
1684        </listitem>
1685        <listitem>
1686          <para>
1687          EXPORT (mainType = 2): A plug-in that exports data from BASE.
1688          </para>
1689        </listitem>
1690        <listitem>
1691          <para>
1692          INTENSITY (mainType = 3): A plug-in that calculates intensity values
1693          from raw data.
1694          </para>
1695        </listitem>
1696        <listitem>
1697          <para>
1698          ANALYZE (mainType = 4): A plug-in that analyses data.
1699          </para>
1700        </listitem>
1701        <listitem>
1702          <para>
1703          OTHER (mainType = 5): Any other plug-in.
1704          </para>
1705        </listitem>
1706        </itemizedlist>
1708        <para>
1709          A plug-in may have different configurations. The flags <property>supportsConfigurations</property>
1710          and <property>requiresConfiguration</property> are used to specify if a plug-in
1711          can have configurations and if it must have one. Configuration parameter values are
1712          versioned. Each time anyone updates a configuration the version number
1713          is increased and the parameter values are stored as a new entity.
1714          This is required because we want to be able to know exactly which
1715          parameters a job was using when it was executed. When a job is
1716          created we also store the parameter version number
1717          (<property>JobData.parameterVersion</property>). This means that even if
1718          someone changes the configuration later we will always know which
1719          parameters the job used.
1720        </para>
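        <para>
          The versioning scheme described above can be sketched as follows.
          This is a minimal illustration under stated assumptions, not the BASE
          implementation: the <classname>ConfigVersionSketch</classname> class
          and all of its members are hypothetical, and only the idea of storing
          the version number with the job is taken from the text.
        </para>
        <programlisting language="java"><![CDATA[
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these classes are NOT part of the BASE API.
final class ConfigVersionSketch
{
  /** Keeps every saved set of parameter values; version n is at index n-1. */
  static final class Configuration
  {
    private final List<Map<String, String>> versions =
      new ArrayList<Map<String, String>>();

    /** Store the values as a new entity and return the new version number. */
    int update(Map<String, String> values)
    {
      Map<String, String> copy = new HashMap<String, String>(values);
      versions.add(Collections.unmodifiableMap(copy));
      return versions.size();
    }

    int currentVersion()
    {
      return versions.size();
    }

    Map<String, String> getValues(int version)
    {
      return versions.get(version - 1);
    }
  }

  /** A job remembers which parameter version was current when it was created. */
  static final class Job
  {
    final Configuration configuration;
    final int parameterVersion;

    Job(Configuration configuration)
    {
      this.configuration = configuration;
      this.parameterVersion = configuration.currentVersion();
    }

    /** The parameter values the job used, even if the configuration changed later. */
    Map<String, String> parametersUsed()
    {
      return configuration.getValues(parameterVersion);
    }
  }
}
]]></programlisting>
        <para>
          Because old versions are never overwritten, a job created against
          version 1 still resolves to the version 1 values after someone has
          saved version 2.
        </para>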
1722        <para>
1723          The <classname docapi="">PluginTypeData</classname> class is used to group
1724          plug-ins that share some common functionality, by implementing
1725          additional (optional) interfaces. For example, the
1726          <interfacename docapi="net.sf.basedb.core.plugin">AutoDetectingImporter</interfacename> should be implemented
1727          by import plug-ins that support automatic detection of file formats.
1728          Another example is the <interfacename docapi="net.sf.basedb.core.plugin">AnalysisFilterPlugin</interfacename>
1729          interface which should be implemented by all analysis plug-ins that
1730          only filter data.
1731        </para>
1733      </sect3>
1735      <sect3 id="">
1736        <title>Jobs</title>
1738        <para>
1739          A job represents a single invocation of a plug-in to do some work.
1740          The <classname docapi="">JobData</classname> class holds information about this.
1741          A job is usually executed by a plug-in, but doesn't have to be. The
1742          <property>status</property> property holds the current state of a job.
1743        </para>
1745        <itemizedlist>
1746        <listitem>
1747          <para>
1748            UNCONFIGURED (status = 0): The job is not yet ready to be executed.
1749          </para>
1750        </listitem>
1751        <listitem>
1752          <para>
1753            WAITING (status = 1): The job is waiting to be executed.
1754          </para>
1755        </listitem>
1756        <listitem>
1757          <para>
1758            PREPARING (status = 5): The job is about to be executed but hasn't started yet.
1759          </para>
1760        </listitem>
1761        <listitem>
1762          <para>
1763            EXECUTING (status = 2): The job is currently executing.
1764          </para>
1765        </listitem>
1766        <listitem>
1767          <para>
1768            ABORTING (status = 6): The job is executing but an ABORT signal has been sent
1769            requesting it to abort and finish.
1770          </para>
1771        </listitem>
1772        <listitem>
1773          <para>
1774            DONE (status = 3): The job finished successfully.
1775          </para>
1776        </listitem>
1777        <listitem>
1778          <para>
1779            ERROR (status = 4): The job finished with an error.
1780          </para>
1781        </listitem>
1782        </itemizedlist>
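        <para>
          Note that the numeric codes do not follow the order in which a job
          passes through the states. A sketch of how a client might map the
          stored codes to named constants is shown below. The
          <classname>JobStatusSketch</classname> enum and its methods are
          hypothetical and not part of the BASE API; only the names and codes
          are taken from the list above.
        </para>
        <programlisting language="java"><![CDATA[
// Illustrative sketch only; this enum is NOT part of the BASE API.
enum JobStatusSketch
{
  UNCONFIGURED(0),
  WAITING(1),
  EXECUTING(2),
  DONE(3),
  ERROR(4),
  PREPARING(5),
  ABORTING(6);

  private final int code;

  JobStatusSketch(int code)
  {
    this.code = code;
  }

  /** The numeric value of the status property. */
  int getCode()
  {
    return code;
  }

  /** True once the job has finished, successfully or with an error. */
  boolean isFinished()
  {
    return this == DONE || this == ERROR;
  }

  /** Look up a status from its numeric code. */
  static JobStatusSketch fromCode(int code)
  {
    for (JobStatusSketch status : values())
    {
      if (status.code == code) return status;
    }
    throw new IllegalArgumentException("Unknown status code: " + code);
  }
}
]]></programlisting>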
1783      </sect3>
1785      <sect3 id="data_api.plugins.agents">
1786        <title>Job agents</title>
1788        <para>
1789          A job agent is a program running on the same or a different server that
1790          is regularly checking for jobs that are waiting to be executed. The
1791          <classname docapi="">JobAgentData</classname> holds information about a job agent
1792          and the <classname docapi="">JobAgentSettingsData</classname> links the agent
1793          with the plug-ins the agent is able to execute. The job agent will only
1794          execute jobs that are owned by users or projects that the job agent has
1795          been shared with, with at least use permission. The <property>priorityBoost</property>
1796          property can be used to give specific plug-ins higher priority.
1797          Thus, for a job agent it is possible to:
1798        </para>
1800        <itemizedlist>
1801        <listitem>
1802          <para>
1803          Specify exactly which plug-ins it will execute. For example, it is possible
1804          to dedicate one agent to only run one plug-in.
1805          </para>
1806        </listitem>
1807        <listitem>
1808          <para>
1809          Give some plug-ins higher priority. For example a job agent that is mainly
1810          used for importing data should give higher priority to all import plug-ins.
1811          Other types of jobs will have to wait until there are no more data to be
1812          imported.
1813          </para>
1814        </listitem>
1815        <listitem>
1816          <para>
1817          Specify exactly which users/groups/projects may use the agent. For
1818          example, it is possible to dedicate one agent to only run jobs for a certain
1819          project.
1820          </para>
1821        </listitem>
1822        </itemizedlist>
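        <para>
          How an agent could combine the plug-in restriction with the priority
          boost is sketched below. This is not the algorithm used by BASE. The
          <classname>JobAgentSketch</classname> class is hypothetical, and the
          sketch assumes that a lower priority number means a more urgent job
          and that the boost is subtracted from the job's priority. Permission
          checks are left out.
        </para>
        <programlisting language="java"><![CDATA[
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these classes are NOT part of the BASE API.
final class JobAgentSketch
{
  static final class Job
  {
    final String plugin;
    final int priority; // Assumption: a lower number is more urgent

    Job(String plugin, int priority)
    {
      this.plugin = plugin;
      this.priority = priority;
    }
  }

  /**
    Pick the next job to execute.
    @param waiting The jobs that are waiting to be executed
    @param boosts Maps each plug-in the agent can execute to its priority
      boost; plug-ins that are not in the map are never executed
    @return The most urgent eligible job, or null if there is none
  */
  static Job next(List<Job> waiting, Map<String, Integer> boosts)
  {
    Job best = null;
    int bestPriority = Integer.MAX_VALUE;
    for (Job job : waiting)
    {
      Integer boost = boosts.get(job.plugin);
      if (boost == null) continue; // The agent can't execute this plug-in
      int effective = job.priority - boost;
      if (effective < bestPriority)
      {
        best = job;
        bestPriority = effective;
      }
    }
    return best;
  }
}
]]></programlisting>
        <para>
          With a boost of 4 for an import plug-in and 0 for an export plug-in,
          an import job with priority 5 is picked before an export job with
          priority 3, and jobs for plug-ins that are not listed are ignored.
        </para>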
1824      </sect3>
1827    </sect2>
1829    <sect2 id="data_api.biomaterials">
1830      <title>Biomaterial LIMS</title>
1832        <figure id="data_api.figures.biomaterials">
1833          <title>Biomaterial LIMS</title>
1834          <screenshot>
1835            <mediaobject>
1836              <imageobject>
1837                <imagedata 
1838                  align="center"
1839                  fileref="figures/uml/datalayer.biomaterials.png" format="PNG" />
1840              </imageobject>
1841            </mediaobject>
1842          </screenshot>
1843        </figure>
1845      <sect3 id="data_api.biomaterials.description">
1846        <title>Biomaterials</title>
1848        <para>
1849          There are three main types of biomaterials: <classname docapi="">BioSourceData</classname>,
1850          <classname docapi="">SampleData</classname> and
1851          <classname docapi="">ExtractData</classname>.
1852          All of them are derived from the base class <classname docapi="">BioMaterialData</classname>.
1853          The reason for this is that they all share common functionality such as pooling
1854          and events. By using a common base class we do not have to create duplicate
1855          classes for keeping track of events and parents.
1856        </para>
1858        <para>
1859          The <classname docapi="">BioSourceData</classname> is the simplest of the biomaterials.
1860          It cannot have parents and can't participate in events. It's only used as a
1861          (non-required) parent for samples.
1862        </para>
1864        <para>
1865          The <classname docapi="">MeasuredBioMaterialData</classname> class is used as a base
1866          class for the other biomaterial types. It introduces quantity
1867          measurements and can store original and remaining quantities. They are
1868          both optional. If an original quantity has been specified the core
1869          automatically calculates the remaining quantity based on the events a
1870          biomaterial participates in.
1871        </para>
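        <para>
          The calculation of the remaining quantity can be sketched as below.
          This is not the BASE implementation. The
          <classname>RemainingQuantitySketch</classname> class is hypothetical,
          and the sketch assumes that each event records the quantity it used.
        </para>
        <programlisting language="java"><![CDATA[
// Illustrative sketch only; not the BASE implementation.
final class RemainingQuantitySketch
{
  /**
    @param originalQuantity The original quantity, or null if not specified
    @param usedQuantities The quantity used by each event the biomaterial
      participates in (assumed to be recorded per event)
    @return The remaining quantity, or null if no original quantity is known
  */
  static Double remaining(Double originalQuantity, double... usedQuantities)
  {
    if (originalQuantity == null) return null;
    double remaining = originalQuantity;
    for (double used : usedQuantities)
    {
      remaining -= used;
    }
    return remaining;
  }
}
]]></programlisting>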
1873        <para>
1874          All measured biomaterials have at least one event associated with them,
1875          the <emphasis>creation event</emphasis>, which holds information about the creation of the
1876          biomaterial. A measured biomaterial can be created in three ways:
1877        </para>
1879        <itemizedlist>
1880        <listitem>
1881          <para>
1882          From a single item of the same type or the parent type. Biosource is the parent type of
1883          samples and sample is the parent type of extracts. The <property>parentType</property> 
1884          property must be set to the correct parent type and the <property>parent</property> property
1885          is set to point to the parent item. The parent information
1886          is also always duplicated in the <property>sources</property> collection of the <classname docapi="">BioMaterialEventData</classname>
1887          object representing the creation event. It is the responsibility of the
1888          core to make sure that everything is properly synchronized and that
1889          remaining quantities are calculated.
1890          </para>
1891        </listitem>
1893        <listitem>
1894          <para>
1895          From multiple items of the same type, i.e. pooling.
1896          In this case the <property>parentType</property> property is set, but
1897          the <property>parent</property> property is null. All source
1898          biomaterials are contained in the <property>sources</property> collection.
1899          The core is still responsible for keeping everything synchronized and to
1900          update remaining quantities.
1901          </para>
1902        </listitem>
1904        <listitem>
1905          <para>
1906          As a standalone biomaterial without parents. The <property>parentType</property>
1907          property should be null, as should the <property>parent</property> property
1908          and the <property>sources</property> collection.
1909          </para>
1910        </listitem>
1911        </itemizedlist>
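        <para>
          The three cases can be told apart by looking at which of the three
          properties are set, as the sketch below shows. The
          <classname>CreationSketch</classname> class is hypothetical and not
          part of the BASE API. Treating every other combination as an error
          is an interpretation made for the example.
        </para>
        <programlisting language="java"><![CDATA[
// Illustrative sketch only; these types are NOT part of the BASE API.
final class CreationSketch
{
  enum Kind { SINGLE_PARENT, POOLED, STANDALONE }

  /**
    @param parentType The value of the parentType property, or null
    @param parent The value of the parent property, or null
    @param sourceCount The number of entries in the sources collection
  */
  static Kind classify(String parentType, Object parent, int sourceCount)
  {
    if (parentType == null)
    {
      if (parent != null || sourceCount != 0)
      {
        throw new IllegalArgumentException("Parent or sources without a parent type");
      }
      return Kind.STANDALONE;
    }
    if (parent != null)
    {
      // The single parent is duplicated in the sources collection
      if (sourceCount != 1)
      {
        throw new IllegalArgumentException("A single parent must be the only source");
      }
      return Kind.SINGLE_PARENT;
    }
    if (sourceCount == 0)
    {
      throw new IllegalArgumentException("Pooling requires at least one source");
    }
    return Kind.POOLED;
  }
}
]]></programlisting>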
1913      </sect3>
1915      <sect3 id="data_api.biomaterials.plates">
1916        <title>Bioplates and plate types</title>
1918        <para>
1919          Biomaterials (except biosources) may optionally be placed on
1920          <classname docapi="">BioPlateData</classname> items. A bioplate is something
1921          that collects multiple biomaterials as a unit. A bioplate typically has a
1922          <classname docapi="">PlateGeometryData</classname> that
1923          determines the number of locations on the plate (<classname docapi="">BioWellData</classname>).
1924          A well can hold only one biomaterial at a time.
1925        </para>
1927        <para>
1928          The bioplate must be of a specific <classname docapi="">BioPlateTypeData</classname>.
1929          The type can be used to put limitations on how the plate can be used. For example,
1930          it can be limited to a single type of biomaterial. It is also possible to lock wells
1931          so that the biomaterial in them can't be changed. Supported lock modes are:
1932        </para>
1934        <itemizedlist>
1935        <listitem>
1936          <para>
1937          <emphasis>Unlocked</emphasis>: Wells are unlocked and the biomaterial may be changed
1938          any number of times.
1939          </para>
1940        </listitem>
1941        <listitem>
1942          <para>
1943          <emphasis>Locked-after-move</emphasis>: The well is locked after it has been used one
1944          time and the biomaterial that was put in it has been moved to another plate.
1945          </para>
1946        </listitem>
1947        <listitem>
1948          <para>
1949          <emphasis>Locked-after-add</emphasis>: The well is locked after biomaterial has been
1950          put into it. It is not possible to remove the biomaterial.
1951          </para>
1952        </listitem>
1953        <listitem>
1954          <para>
1955          <emphasis>Locked-after-create</emphasis>: The well is locked once it has been created.
1956          Biomaterial must be put into wells before the plate is saved to the database.
1957          </para>
1958        </listitem>
1959        </itemizedlist>
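        <para>
          One way to read the four lock modes is as rules for when a
          biomaterial may be added to or removed from a well. The sketch below
          follows that reading. The <classname>WellLockSketch</classname> class
          is hypothetical and not part of the BASE API, and details such as
          allowing removal from a locked-after-create well before the plate is
          saved are assumptions.
        </para>
        <programlisting language="java"><![CDATA[
// Illustrative sketch only; these types are NOT part of the BASE API.
final class WellLockSketch
{
  enum LockMode { UNLOCKED, LOCKED_AFTER_MOVE, LOCKED_AFTER_ADD, LOCKED_AFTER_CREATE }

  static final class Well
  {
    private final LockMode mode;
    private String biomaterial; // null = the well is empty
    private boolean saved;      // true once the plate is in the database
    private boolean moved;      // true once a biomaterial has been moved away

    Well(LockMode mode)
    {
      this.mode = mode;
    }

    /** Call when the plate has been saved to the database. */
    void markSaved()
    {
      saved = true;
    }

    boolean canAdd()
    {
      if (biomaterial != null) return false; // One biomaterial at a time
      switch (mode)
      {
        case LOCKED_AFTER_MOVE: return !moved;
        case LOCKED_AFTER_CREATE: return !saved;
        default: return true;
      }
    }

    boolean canRemove()
    {
      if (biomaterial == null) return false;
      switch (mode)
      {
        case LOCKED_AFTER_ADD: return false;
        case LOCKED_AFTER_CREATE: return !saved;
        default: return true;
      }
    }

    void add(String name)
    {
      if (!canAdd()) throw new IllegalStateException("Well is locked or occupied");
      biomaterial = name;
    }

    String remove()
    {
      if (!canRemove()) throw new IllegalStateException("Well is locked or empty");
      String removed = biomaterial;
      biomaterial = null;
      moved = true;
      return removed;
    }
  }
}
]]></programlisting>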
1961      </sect3>
1963      <sect3 id="">
1964        <title>Biomaterial and plate events</title>
1966        <para>
1967          An event represents something that happened to one or more biomaterials, for example
1968          the creation of another biomaterial. The <classname docapi="">BioMaterialEventData</classname>
1969          holds information about entry and event dates, protocols used, the user who is
1970          responsible, etc. There are three types of events represented by the <property>eventType</property>
1971          property.
1972        </para>
1974        <orderedlist>
1975        <listitem>
1976          <para>
1977          <emphasis>Creation event</emphasis>: This event represents the creation of a (measured)
1978          biomaterial. The <property>sources</property> collection contains
1979          information about the biomaterials that were used to create the new
1980          biomaterial. All sources must be of the same type. There can only be one
1981          source of the parent type. These rules are maintained by the core.
1982          </para>
1983        </listitem>
1985        <listitem>
1986          <para>
1987          <emphasis>Bioassay event</emphasis>: This event represents the creation
1988          of a bioassay. This event type is needed because we want to keep track
1989          of quantities for extracts. This event has a <classname docapi="">PhysicalBioAssayData</classname> 
1990          as a product instead of a biomaterial. The sources collection can only contain
1991          extracts. If the bioassay can hold extracts in multiple positions the
1992          <property>position</property> property in <classname docapi="">BioMaterialEventSourceData</classname> 
1993          can be used to track which extract was put in each position. It is allowed
1994          to put multiple extracts in the same position, but then they usually need
1995          to use different <classname docapi="">TagData</classname> 
1996          items. However, this is not enforced by the core.
1997          </para>
1998        </listitem>
2000        <listitem>
2001          <para>
2002          <emphasis>Other event</emphasis>: This event represents some other important
2003          information about a single biomaterial that affected the remaining quantity.
2004          This event type doesn't have any sources.
2005          </para>
2006        </listitem>
2007        </orderedlist>
2009        <para>
2010          It is also possible to register events that apply to one or more
2011          bioplates using the <classname docapi="">BioPlateEventData</classname>
2012          class. The <classname docapi="">BioPlateEventParticipantData</classname>
2013          class holds information about each plate that is part of the event. The <property>role</property>
2014          property is a textual description of what happened to the plate. For example, a move event may have one
2015          <emphasis>source</emphasis> plate and one <emphasis>destination</emphasis> plate. It is
2016          recommended (but not required) that all biomaterial that are affected by the plate event
2017          are linked via a <code>BioMaterialEventData</code> to a <code>BioPlateEventParticipantData</code>.
2018          This will make it easier to keep track of the history of individual biomaterial items.
2019          Biomaterial events that are linked in this way are also automatically updated if the
2020          bioplate event is modified (e.g. selecting a protocol, event date, etc.).
2021        </para>
2023      </sect3> 
2024    </sect2>
2026    <sect2 id="data_api.plates">
2027      <title>Array LIMS - plates</title>
2029        <figure id="data_api.figures.plates">
2030          <title>Array LIMS - plates</title>
2031          <screenshot>
2032            <mediaobject>
2033              <imageobject>
2034                <imagedata 
2035                  align="center"
2036                  scalefit="1" width="100%"
2037                  fileref="figures/uml/datalayer.plates.png" format="PNG" />
2038              </imageobject>
2039            </mediaobject>
2040          </screenshot>
2041        </figure>
2043      <sect3 id="data_api.plates.description">
2044        <title>Plates</title>
2046        <para>
2047          The <classname docapi="">PlateData</classname> is the main class holding information
2048          about a single plate. The associated <classname docapi="">PlateGeometryData</classname>
2049          defines how many rows and columns there are on a plate. Since this
2050          information is used to create wells and for various other checks, it is
2051          not possible to change the number of rows or columns once a geometry has
2052          been created.
2053        </para>
2055        <para>
2056          All plates must have a <classname docapi="">PlateTypeData</classname> which defines
2057          the geometry and a set of event types (see below).
2058        </para>
2060        <para>
2061          If the destroyed flag of a plate is set it is not allowed to use the
2062          plate for a plate mapping or to create array designs. However, it
2063          is possible to change the flag to not destroyed.
2064        </para>
2066        <para>
2067          The barcode is intended to be used as an external identifier of the plate.
2068          However, the core doesn't check the value or whether it is unique.
2069        </para>
2070      </sect3>
2072      <sect3 id="">
2073        <title>Plate events</title>
2075        <para>
2076          The plate type defines a set of <classname docapi="">PlateEventTypeData</classname>
2077          objects, each one representing a particular event a plate of this type
2078          usually goes through. For a plate of a certain type, it is possible to
2079          attach exactly one event of each event type. The event type defines an
2080          optional protocol type, which can be used by client applications to
2081          filter a list of protocols for the event. The core doesn't check that
2082          the selected protocol for an event is of the same protocol type as
2083          defined by the event type.
2084        </para>
2086        <para>
2087          The ordinal value can be used as a hint to client applications in
2088          which order the events actually are performed in the lab. The core doesn't
2089          care about this value or if several event types have the same value.
2090        </para>
2091      </sect3>
2093      <sect3 id="data_api.plates.mappings">
2094        <title>Plate mappings</title>
2096        <para>
2097          A plate can be created either from scratch or, with the help of the information
2098          in a <classname docapi="">PlateMappingData</classname>, from a set of parent plates.
2099          In the first case it is possible to specify a reporter for each well on the
2100          plate. In the second case the mapping code creates all the wells and links
2101          them to the parent wells on the parent plates. Once the plate has been saved
2102          to the database, the wells cannot be modified (because they are used
2103          downstream for various validations, etc.).
2104        </para>
2106        <para>
2107          The details in a plate mapping are simply coordinates that for each
2108          destination plate, row and column define a source plate, row and column.
2109          It is possible for a single source well to be mapped to multiple destination
2110          wells, but for each destination well only a single source well can be
2111          used.
2112        </para>
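        <para>
          The rule that a destination well has exactly one source, while a
          source well may feed several destinations, falls out naturally if
          the mapping is keyed on the destination coordinate. The sketch below
          illustrates this. The <classname>PlateMappingSketch</classname> class
          and its <classname>Coordinate</classname> type are hypothetical and
          not part of the BASE API.
        </para>
        <programlisting language="java"><![CDATA[
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only; these classes are NOT part of the BASE API.
final class PlateMappingSketch
{
  /** A plate index plus a row and column on that plate. */
  static final class Coordinate
  {
    final int plate;
    final int row;
    final int column;

    Coordinate(int plate, int row, int column)
    {
      this.plate = plate;
      this.row = row;
      this.column = column;
    }

    @Override
    public boolean equals(Object o)
    {
      if (!(o instanceof Coordinate)) return false;
      Coordinate c = (Coordinate)o;
      return plate == c.plate && row == c.row && column == c.column;
    }

    @Override
    public int hashCode()
    {
      return Objects.hash(plate, row, column);
    }
  }

  // Keyed on destination: each destination well has at most one source
  private final Map<Coordinate, Coordinate> destinationToSource =
    new HashMap<Coordinate, Coordinate>();

  /** Map a source well to a destination well. */
  void map(Coordinate source, Coordinate destination)
  {
    if (destinationToSource.containsKey(destination))
    {
      throw new IllegalArgumentException("Destination well is already mapped");
    }
    destinationToSource.put(destination, source);
  }

  /** The source well for a destination well, or null if it is not mapped. */
  Coordinate getSource(Coordinate destination)
  {
    return destinationToSource.get(destination);
  }
}
]]></programlisting>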
2114      </sect3>
2116    </sect2>
2118    <sect2 id="data_api.arrays">
2119      <title>Array LIMS - arrays</title>
2121        <figure id="data_api.figures.arrays">
2122          <title>Array LIMS - arrays</title>
2123          <screenshot>
2124            <mediaobject>
2125              <imageobject>
2126                <imagedata 
2127                  align="center"
2128                  fileref="figures/uml/datalayer.arrays.png" format="PNG" />
2129              </imageobject>
2130            </mediaobject>
2131          </screenshot>
2132        </figure>
2134      <sect3 id="data_api.arrays.designs">
2135        <title>Array designs</title>
2137        <para>
2138          Array designs are stored in <classname docapi="">ArrayDesignData</classname> objects
2139          and can be created either as standalone designs or
2140          from plates. In the first case the features on an array design
2141          are described by a reporter map. A reporter map is a file
2142          that maps a coordinate (block, meta-grid, row, column),
2143          position or an external ID on an array design to a
2144          reporter. Which method to use is given by the
2145          <property>ArrayDesign.featureIdentificationMethod</property> property.
2146          The coordinate system on an array design is divided into blocks.
2147          Each block can be identified either by a <property>blockNumber</property>
2148          or by meta coordinates. This information is stored in
2149          <classname docapi="">ArrayDesignBlockData</classname> items. Each block
2150          contains several <classname docapi="">FeatureData</classname> items, each
2151          one identified by a row and column coordinate. Platforms that don't
2152          divide the array design into blocks or don't use the coordinate system at all
2153          must still create a single super-block that holds all features.
2154        </para>
2156        <para>
2157          Array designs that are created from plates use a print map file
2158          instead of a reporter map. A print map is similar to a plate mapping
          but maps features (instead of wells) to wells. The file should
          specify which plate and well a feature is created from. Reporter
2161          information will automatically be copied by BASE from the well.
2162        </para>
2164        <para>
2165          It is also possible to skip the importing of features into the
          database and just keep the data in the original files instead.
2167          This is typically done for Affymetrix CDF files.
2168        </para>
2170      </sect3>
2172      <sect3 id="data_api.arrays.slides">
2173        <title>Array slides</title>
2175        <para>
          The <classname docapi="">ArraySlideData</classname> represents a single
          array. Arrays are usually printed in batches of several hundred,
          each batch represented by an <classname docapi="">ArrayBatchData</classname> item.
          The <property>batchIndex</property> is the ordinal number of the
          array in the batch. The <property>barcode</property> can be used
          as a means for external programs to identify the array. BASE doesn't
          care whether a value is given or whether the values are unique. If the
          <property>destroyed</property> flag is set it prevents a slide from
          being used by a hybridization.
2185        </para>
2187      </sect3>
2188    </sect2>
2190    <sect2 id="data_api.bioassays">
2191      <title>Bioassays and raw data</title>
2193        <figure id="data_api.figures.rawdata">
2194          <title>Bioassays and raw data</title>
2195          <screenshot>
2196            <mediaobject>
2197              <imageobject>
2198                <imagedata 
2199                  align="center"
2200                  scalefit="1" width="100%"
2201                  fileref="figures/uml/datalayer.bioassays.png" format="PNG" />
2202              </imageobject>
2203            </mediaobject>
2204          </screenshot>
2205        </figure>
2207      <sect3 id="data_api.bioassays.physical">
2208        <title>Physical bioassays</title>
2210        <para>
        A <classname docapi="">PhysicalBioAssayData</classname>
        item connects the array slides from the Array LIMS part
        with extracts from the biomaterials part. The <property>creationEvent</property>
        is used to register which extracts were used on the bioassay.
2215        The relation to slides is a one-to-one relation. A slide can only be used on
2216        a single physical bioassay and a bioassay can only use a single slide. The relation
2217        is optional from both sides.
2218        </para>
2220        <para>
2221        Further processing of the bioassay is registered as a series
2222        of <classname docapi="">DerivedBioAssayData</classname>
2223        items. For microarray experiments the first step is typically a scanning
2224        of the hybridization. Information about the software/hardware and protocol
2225        used can be registered. Any data files generated by the process can be
2226        registered with the <classname docapi="">FileSetData</classname>
        item. If more than one processing step is required, child derived
        bioassays can be created that describe each additional step.
2229        </para>
2231        <para>
2232        If the root physical bioassay has multiple extracts in multiple positions, the
2233        <property>extract</property> property of a derived bioassay is used to link
2234        with the extract that the specific derived bioassay represents. If the
2235        link is null the derived bioassay represents all extracts on the
2236        physical bioassay.
2237        </para>
2239      </sect3>
2241      <sect3 id="data_api.bioassays.rawdata">
2242        <title>Raw data</title>
2244        <para>
2245        A <classname docapi="">RawBioAssayData</classname> object
2246        represents the raw data that is produced by analysing the data from the physical
        bioassay. You may register which software was used, the
2248        protocol and any parameters (through the annotation system).
2249        </para>
2251        <para>
2252        Files with the analysed data values can be attached to the
2253        associated <classname docapi="">FileSetData</classname> object.
        The platform and, optionally, the variant have information about the file types
2255        that can be used for that platform. If the platform file types support
2256        metadata extraction, headers, the number of spots, and other
2257        information may be automatically extracted from the raw data file(s).
2258        </para>
2260        <para>
        If the platform supports it, raw data can also be imported into the database.
2262        This is handled by batchers and <classname docapi="">RawData</classname> objects.
2263        Which table to store the data in depends on the <property>rawDataType</property>
2264        property. The properties shown for the <classname>RawData</classname> class
2265        in the diagram are the mandatory properties. Each raw data type defines additional
2266        properties that are specific to that raw data type.
2267        </para>
2269      </sect3>
2271      <sect3 id="data_api.rawdata.spotimages">
2272        <title>Spot images</title>
2274        <para>
2275        Spot images can be created if you have the original image
        files. BASE can use 1-3 images as sources for the red, green
        and blue channels respectively. The creation of spot images requires
2278        that x and y coordinates are given for each raw data spot. The scaling
2279        and offset values are used to convert the coordinates to pixel
2280        coordinates. With this information BASE is able to cut out a square
2281        from the source images that, theoretically, contains a specific spot and
2282        nothing else. The spot images are gamma-corrected independently and then
2283        put together into PNG images that are stored in a zip file.
2284        </para>
2285      </sect3>
2287    </sect2>
2289    <sect2 id="data_api.experiments">
2290      <title>Experiments and analysis</title>
2293        <figure id="data_api.figures.experiments">
2294          <title>Experiments</title>
2295          <screenshot>
2296            <mediaobject>
2297              <imageobject>
2298                <imagedata 
2299                  align="center"
2300                  scalefit="1" width="75%"
2301                  fileref="figures/uml/datalayer.experiments.png" format="PNG" />
2302              </imageobject>
2303            </mediaobject>
2304          </screenshot>
2305        </figure>
2307      <sect3 id="data_api.experiments.description">
2308        <title>Experiments</title>
2310        <para>
2311          The <classname docapi="">ExperimentData</classname> 
2312          class is used to collect information about a single experiment. It
2313          links to any number of <classname docapi="">RawBioAssayData</classname>
2314          items, which must all be of the same <classname 
2315          docapi="net.sf.basedb.core">RawDataType</classname>.
2316        </para>
2318        <para>
          Annotation types that are needed in the analysis must be connected to
2320          the experiment as experimental factors and the annotation values should
2321          be set on or inherited by each raw bioassay that is part of the
2322          experiment.
2323        </para>
2325        <para>
2326          The directory connected to the experiment is the default directory
2327          where plugins that generate files should store them.
2328        </para>
2329      </sect3>
2331      <sect3 id="data_api.experiments.bioassays">
2332        <title>Bioassay sets, bioassays and transformations</title>
2334        <para>
2335          Each line of analysis starts with the creation of a <emphasis>root</emphasis>
2336          <classname docapi="">BioAssaySetData</classname>,
2337          which holds the intensities calculated from the raw data.
2338          A bioassayset can hold one intensity for each channel. The number of
2339          channels is defined by the raw data type. For each raw bioassay used a
2340          <classname docapi="">BioAssayData</classname>
2341          is created.
2342        </para>
2344        <para>
          Information about the process that calculated the intensities is
          stored in a <classname docapi="">TransformationData</classname>
          object. The root transformation links to the raw bioassays that are used
          in this line of analysis and to a <classname 
          docapi="">JobData</classname> which has information
          about which plug-in and parameters were used in the calculation.
2351        </para>
2353        <para>
2354          Once the root bioassayset has been created it is possible to
2355          again apply a transformation to it. This time the transformation
2356          links to a single source bioassayset instead of the raw bioassays.
          As before, it still links to a job with information about the plug-in
          that does the actual work and the parameters it used. The transformation must make sure
          that new bioassays are created and linked to the bioassays in the
          source bioassayset. This process may be repeated as many times
          as needed.
2362        </para>
2364        <para>
          Data can only be added to a bioassay set before it has been
          committed to the database. Once the transaction has been committed
          it is no longer possible to add more data or to modify existing
          data.
2369        </para>
2371      </sect3>
2373      <sect3 id="data_api.experiments.virtualdb">
2374        <title>Virtual databases, datacubes, etc.</title>
2376        <para>
          The above processes require a flexible storage solution for the data.
2378          Each experiment is related to a <classname docapi="">VirtualDb</classname>
2379          object. This object represents the set of tables that are needed to store
2380          data for the experiment. All tables are created in a special part of the
2381          BASE database that we call the <emphasis>dynamic database</emphasis>.
2382          In MySQL the dynamic database is a separate database, in Postgres it is
2383          a separate schema.
2384        </para>
2386        <para>
          A virtual database is divided into data cubes. A data cube can be seen
          as a three-dimensional object where each point can hold data that in
          most cases can be interpreted as data for a single spot from an
          array. The coordinates of a point are given by <emphasis>layer</emphasis>,
2391          <emphasis>column</emphasis> and <emphasis>position</emphasis>. The
2392          layer and column coordinates are represented by the
2393          <classname docapi="">DataCubeLayerData</classname>
2394          and <classname docapi="">DataCubeColumnData</classname>
2395          objects. The position coordinate has no separate object associated with
2396          it.
2397        </para>
2399        <para>
2400          Data for a single bioassay set is always stored in a single layer. It
2401          is possible for more than one bioassay set to use the same layer. This
          usually happens for filtering transformations that don't modify the
          data. The filtered bioassay set is then linked to a
          <classname docapi="">DataCubeFilterData</classname>
          object, which has information about which data points
          passed the filter.
2407        </para>
2409        <para>
2410          All data for a bioassay is stored in a single column.
2411          Two bioassays in different bioassaysets (layers) can only have the same
2412          column if one is the parent of the other.
2413        </para>
2415        <para>
2416          The position coordinate is tied to a reporter.
2417        </para>
2419        <para>
          A child bioassay set may use the same data cube as its parent
2421          bioassay set if all of the following conditions are true:
2422        </para>
2424        <itemizedlist>
2425        <listitem>
2426          <para>
2427          All positions are linked to the same reporter as the positions
2428          in the parent bioassay set.
2429          </para>
2430        </listitem>
2432        <listitem>
2433          <para>
          All data points are linked to the same (possibly many) raw data
2435          spots as the corresponding data points in the parent bioassay set.
2436          </para>
2437        </listitem>
2439        <listitem>
2440          <para>
2441          The bioassays in the child bioassay set each have exactly one
2442          parent in the parent bioassay set. No parent bioassay may be the
2443          parent of more than one child bioassay.
2444          </para>
2445        </listitem>
2446        </itemizedlist>
2448        <para>
          If any of the above conditions is not true, a new data cube
2450          must be created for the child bioassay set.
2451        </para>
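        <para>
          The three conditions above can be read as a single yes/no check.
          The following is only a minimal sketch of that check: the
          <classname>CubeReuseCheck</classname> class and its fields are
          hypothetical stand-ins made up for this illustration and are not
          part of the BASE API. It assumes that the first two conditions
          have already been evaluated elsewhere and that each child bioassay
          is described by the list of ids of its parent bioassays.
        </para>

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in (not a BASE class) that summarises how a child
// bioassay set relates to its parent bioassay set.
class CubeReuseCheck
{
   // Condition 1: every position maps to the same reporter as in the parent
   private final boolean sameReporterPerPosition;
   // Condition 2: every data point maps to the same raw data spots as in the parent
   private final boolean sameRawDataPerDataPoint;
   // For each child bioassay: the ids of its parent bioassays
   private final List<List<Integer>> parentIdsPerChild;

   CubeReuseCheck(boolean sameReporterPerPosition,
      boolean sameRawDataPerDataPoint, List<List<Integer>> parentIdsPerChild)
   {
      this.sameReporterPerPosition = sameReporterPerPosition;
      this.sameRawDataPerDataPoint = sameRawDataPerDataPoint;
      this.parentIdsPerChild = parentIdsPerChild;
   }

   // Condition 3: each child has exactly one parent and
   // no parent is used by more than one child
   boolean hasOneToOneParents()
   {
      Set<Integer> usedParents = new HashSet<>();
      for (List<Integer> parents : parentIdsPerChild)
      {
         if (parents.size() != 1) return false;
         if (!usedParents.add(parents.get(0))) return false;
      }
      return true;
   }

   // The parent's data cube can be reused only if all three conditions hold
   boolean canReuseParentCube()
   {
      return sameReporterPerPosition
         && sameRawDataPerDataPoint
         && hasOneToOneParents();
   }
}
```

        <para>
          If <methodname>canReuseParentCube()</methodname> in this sketch
          returns <constant>false</constant>, the situation corresponds to
          the case where a new data cube must be created.
        </para>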
2452      </sect3>
2454      <sect3 id="data_api.dynamic.description">
2455        <title>The dynamic database</title>
2457        <figure id="data_api.figures.dynamic">
2458          <title>The dynamic database</title>
2459          <screenshot>
2460            <mediaobject>
2461              <imageobject>
2462                <imagedata 
2463                  align="center"
2464                  fileref="figures/uml/datalayer.dynamic.png" format="PNG" />
2465              </imageobject>
2466            </mediaobject>
2467          </screenshot>
2468        </figure>
2470        <para>
2471          Each virtual database consists of several tables. The tables
2472          are dynamically created when needed. For each table shown in the diagram
2473          the # sign is replaced by the id of the virtual database object at run
2474          time.
2475        </para>
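        <para>
          As a small illustration of the naming rule, the sketch below
          expands a table name template from the diagram for a given
          virtual database id. The <classname>DynamicTableNames</classname>
          helper is hypothetical and exists only for this example. BASE
          builds the names internally, and the exact mechanism it uses is
          not shown here.
        </para>

```java
// Hypothetical helper, not part of the BASE API.
class DynamicTableNames
{
   // Replace the '#' placeholder used in the diagram with the id
   // of the virtual database object.
   static String resolve(String template, int virtualDbId)
   {
      return template.replace("#", Integer.toString(virtualDbId));
   }

   public static void main(String[] args)
   {
      // Table templates taken from the descriptions in this section
      String[] templates = { "D#Spot", "D#Pos", "D#Filter", "D#RawParents" };
      for (String template : templates)
      {
         System.out.println(template + " -> " + resolve(template, 12));
      }
   }
}
```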
2477        <para>
2478          There are no classes in the data layer for these tables and they
          are not mapped with Hibernate. When we work with these tables we
          always use batcher classes and queries that work with integers,
          floats and strings.
2482        </para>
2484        <bridgehead>The D#Spot table</bridgehead>
2485        <para>
2486          This is the main table which keeps the intensities for a single spot
2487          in the data cube. Extra values attached to the spot are kept in separate
2488          tables, one for each type of value (D#SpotInt, D#SpotFloat and D#SpotString).
2489        </para>
2491        <bridgehead>The D#Pos table</bridgehead>
2492        <para>
2493          This table stores the reporter id for each position in a cube.
2494          Extra values attached to the position are kept in separate tables,
2495          one for each type of value (D#PosInt, D#PosFloat and D#PosString).
2496        </para>
2498        <bridgehead>The D#Filter table</bridgehead>
2499        <para>
2500          This table stores the coordinates for the spots that remain after
2501          filtering. Note that each filter is related to a bioassayset which
2502          gives the cube and layer values. Each row in the filter table then
2503          adds the column and position allowing us to find the spots in the
2504          D#Spot table.
2505        </para>
2507        <bridgehead>The D#RawParents table</bridgehead>
2508        <para>
2509          This table holds mappings for a spot to the raw data it is calculated
2510          from. We don't need the layer coordinate since all layers in a cube
2511          must have the same mapping to raw data.
2512        </para>
2514      </sect3>     
2517    </sect2>
2519    <sect2 id="data_api.misc">
2520      <title>Other classes</title>
2522        <figure id="data_api.figures.misc">
2523          <title>Other classes</title>
2524          <screenshot>
2525            <mediaobject>
2526              <imageobject>
2527                <imagedata 
2528                  align="center"
2529                  fileref="figures/uml/datalayer.misc.png" format="PNG" />
2530              </imageobject>
2531            </mediaobject>
2532          </screenshot>
2533        </figure>
2535    </sect2>
2537  </sect1>
2539  <sect1 id="api_overview.core_api" chunked="1">
2540    <title>The Core API</title>
2542    <para>
2543      This section gives an overview of various parts of the core API.
2544    </para>
2546    <sect2 id="core_api.data_in_files">
2547      <title>Using files to store data</title>
2549      <para>
2550        BASE 2.5 introduced the possibility to use files to store data instead
2551        of importing it into the database. Files can be attached
2552        to any item that implements the <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2553        interface. Currently this is <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2554        and <classname docapi="net.sf.basedb.core">ArrayDesign</classname>. The
2555        ability to store data in files is not a replacement for storing data in the
2556        database. It is possible (for some platforms/raw data types) to have data in
        files and in the database at the same time. We would have liked to enforce
        that (raw) data is always present in files, but this would not be backwards compatible
        with older installations, so there are three cases:
2560      </para>
2562      <itemizedlist>
2563      <listitem>
2564        <para>
2565        Data in files only
2566        </para>
2567      </listitem>
2568      <listitem>
2569        <para>
2570        Data in the database only
2571        </para>
2572      </listitem>
2573      <listitem>
2574        <para>
2575        Data in both files and in the database
2576        </para>
2577      </listitem>
2578      </itemizedlist>
2580      <para>
2581        Not all three cases are supported for all types of data. This is controlled
        by the <classname docapi="net.sf.basedb.core">Platform</classname> class, which may disallow
        storing data in the database. To check this, call
2584        <methodname>Platform.isFileOnly()</methodname> and/or
2585        <methodname>Platform.getRawDataType()</methodname>. If the <methodname>isFileOnly()</methodname>
2586        method returns <constant>true</constant>, the platform can't store data in
2587        the database. If the value is <constant>false</constant> more information
2588        can be obtained by calling <methodname>getRawDataType()</methodname>,
2589        which may return:
2590      </para>
2592      <itemizedlist>
2593      <listitem>
2594        <para>
2595          <constant>null</constant>: The platform can store data with any
2596          raw data type in the database.
2597        </para>
2598      </listitem>
2599      <listitem>
2600        <para>
2601        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == true</code>:
2602        The platform can store data in the database but only data with the specified raw
2603        data type.
2604        </para>
2605      </listitem>
2606      <listitem>
2607        <para>
2608        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == false</code>:
2609        The platform can't store data in the database.
2610        </para>
2611      </listitem>
2612      </itemizedlist>
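      <para>
        The decision rules above can be summarised as a small lookup. The
        sketch below is deliberately detached from the real classes: it takes
        the result of <methodname>Platform.isFileOnly()</methodname> as a
        boolean and represents the result of
        <methodname>Platform.getRawDataType()</methodname> as a
        <classname>Boolean</classname> that is <constant>null</constant> when
        no raw data type is returned and otherwise holds the value of
        <methodname>isStoredInDb()</methodname>. The
        <classname>StorageCheck</classname> class and its
        <classname>DbStorage</classname> enum are hypothetical names used
        only for this illustration.
      </para>

```java
// Hypothetical illustration, not part of the BASE API.
class StorageCheck
{
   enum DbStorage
   {
      NOT_ALLOWED,              // data can only be kept in files
      ANY_RAW_DATA_TYPE,        // any raw data type may be stored in the database
      ONLY_GIVEN_RAW_DATA_TYPE  // only the platform's raw data type may be stored
   }

   // fileOnly: the value returned by Platform.isFileOnly()
   // storedInDb: null if Platform.getRawDataType() returned null,
   //    otherwise the value of isStoredInDb() on the returned raw data type
   static DbStorage classify(boolean fileOnly, Boolean storedInDb)
   {
      if (fileOnly) return DbStorage.NOT_ALLOWED;
      if (storedInDb == null) return DbStorage.ANY_RAW_DATA_TYPE;
      return storedInDb ?
         DbStorage.ONLY_GIVEN_RAW_DATA_TYPE : DbStorage.NOT_ALLOWED;
   }
}
```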
2614      <para>
2615        One major change from earlier BASE versions is that the registration of raw data types
2616        has changed. The <filename>raw-data-types.xml</filename> file should
2617        only be used for raw data types that are stored in the database. The
2618        <sgmltag>storage</sgmltag> tag has been deprecated and BASE will refuse
        to start if it finds a raw data type definition with <code>storage="file"</code>.
2620      </para>
2622      <para>
2623        For backwards compatibility reasons, each <classname docapi="net.sf.basedb.core">Platform</classname>
2624        that can only store data in files will create "virtual" raw data type
2625        objects internally. These raw data types all return <constant>false</constant>
2626        from the <methodname>RawDataType.isStoredInDb()</methodname>
        method. Each one also has a back-link to the platform/variant that
2628        created it: <methodname>RawDataType.getPlatform()</methodname>
2629        and <methodname>RawDataType.getVariant()</methodname>. These two methods
2630        will always return <constant>null</constant> when called on a raw data type
2631        that can be stored in the database.
2632      </para>
2634      <itemizedlist>
2635        <title>See also</title>
2636        <listitem><xref linkend="data_api.platforms" /></listitem>
2637        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
2638        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
2639        <listitem>
2640          <xref linkend="appendix.incompatible.2.5" /> in
2641          <xref linkend="appendix.incompatible" />
2642        </listitem>
2643      </itemizedlist>
2645      <sect3 id="core_api.data_in_files.diagram">
2646        <title>Diagram of classes and methods</title>
2647        <figure id="core_api.figures.data_in_files">
2648          <title>Store data in files</title>
2649          <screenshot>
2650            <mediaobject>
2651              <imageobject>
2652                <imagedata 
2653                  align="center"
2654                  scalefit="1" width="100%"
2655                  fileref="figures/uml/corelayer.datainfiles.png" format="PNG" />
2656              </imageobject>
2657            </mediaobject>
2658          </screenshot>
2659        </figure>
2661        <para>
          This is a rather large set of classes and methods. The ultimate goal
2663          is to be able to create links between a <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2664          / <classname docapi="net.sf.basedb.core">ArrayDesign</classname> and <classname docapi="net.sf.basedb.core">File</classname>
2665          items and to provide some metadata about the files.
2666          The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname> class is one of the most
2667          important ones. It is intended to make it easy for plug-in (and other)
2668          developers to access the files without having to mess with platform
2669          or file type objects. The API is best described
2670          by a set of use-case examples.
2671        </para>
2673      </sect3>
2675      <sect3 id="core_api.data_in_files.ask">
2676        <title>Use case: Asking the user for files for a given item</title>
2678        <para>
2679          A client application must know what types of files it makes sense
2680          to ask the user for. In some cases, data may be split into more than
2681          one file so we need a generic way to select files.
2682        </para>
2684        <para>
          Given that we have a <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
          item we want to find out which <classname docapi="net.sf.basedb.core">DataFileType</classname>
          items can be used for that item. The
          <methodname>DataFileType.getQuery(FileStoreEnabled)</methodname>
          method can be used for this. Internally, the method uses the results of the
          <methodname>FileStoreEnabled.getPlatform()</methodname>
          and <methodname>FileStoreEnabled.getVariant()</methodname>
          methods to restrict the query to only return file types for
2693          a given platform and/or variant. If the item doesn't have
2694          a platform or variant the query will return all file types
2695          that are associated with the given item type. In any case, we get a list
2696          of <classname>DataFileType</classname> items, each one representing a
2697          specific file type that we should ask the user about. Examples:
2698        </para>
2700        <orderedlist>
2701        <listitem>
2702          <para>
2703          The <constant>Affymetrix</constant> platform defines <constant>CEL</constant>
2704          as a raw data file and <constant>CDF</constant> as an array design (reporter map)
2705          file. If we have a <classname docapi="net.sf.basedb.core">RawBioAssay</classname> the query will only return
2706          the CEL file type and the client can ask the user for a CEL file.
2707          </para>
2708        </listitem>
2709        <listitem>
2710          <para>
2711          The <constant>Generic</constant> platform defines <constant>PRINT_MAP</constant>
2712          and <constant>REPORTER_MAP</constant> for array designs. If we have
2713          an <classname docapi="net.sf.basedb.core">ArrayDesign</classname> the query will return those two
2714          items.
2715          </para>
2716        </listitem>
2717        </orderedlist>
2719        <para>
2720          It might also be interesting to know the currently selected file
2721          for each file type and if the platform has set the <varname>required</varname>
2722          flag for a particular file type. Here is a simple code example
2723          that may be useful to start from:
2724        </para>
        <programlisting language="java">
DbControl dc = ...
FileStoreEnabled item = ...
Platform platform = item.getPlatform();
PlatformVariant variant = item.getVariant();

// Get list of DataFileTypes used by the platform
ItemQuery&lt;DataFileType&gt; query =
   DataFileType.getQuery(item);
List&lt;DataFileType&gt; types = query.list(dc);

// Always check hasFileSet() method first to avoid
// creating the file set if it doesn't exist
FileSet fileSet = item.hasFileSet() ?
   item.getFileSet() : null;

for (DataFileType type : types)
{
   // Get the current file, if any
   FileSetMember member = fileSet == null || !fileSet.hasMember(type) ?
      null : fileSet.getMember(type);
   File current = member == null ?
      null : member.getFile();

   // Check if a file is required by the platform
   PlatformFileType pft = platform == null ?
      null : platform.getFileType(type, variant);
   boolean isRequired = pft == null ?
      false : pft.isRequired();

   // Now we can do something with this information to
   // let the user select a file ...
}
</programlisting>
2761        <note>
2762          <title>Also remember to catch PermissionDeniedException</title>
2763          <para>
            The above code may look complicated, but this is mostly because
            of all the checks for <constant>null</constant> values. Remember
            that many things are optional and may return <constant>null</constant>.
            Another thing to look out for is
            <exceptionname>PermissionDeniedException</exceptionname>. The logged-in
            user may not have access to all items. The above example doesn't include
            any code for this since it would have made it too complex.
2771          </para>
2772        </note>
2773      </sect3>
2775      <sect3 id="">
2776        <title>Use case: Link, validate and extract metadata from the selected files</title>
2777        <para>
2778          When the user has selected the file(s) we must store the links
2779          to them in the database. This is done with a <classname docapi="net.sf.basedb.core">FileSet</classname>
2780          object. A file set can contain any number of files. The only limitation
2781          is that it can only contain one file for each file type.
2782          Call <methodname>FileSet.setMember()</methodname> to store
2783          a file in the file set. If a file already exists for the given file type
2784          it is replaced, otherwise a new entry is created. The following
2785          program example assumes that we have a map where <classname docapi="net.sf.basedb.core">File</classname>:s
2786          are related to <classname docapi="net.sf.basedb.core">DataFileType</classname>:s. When all files
2787          have been added we call <methodname>FileSet.validate()</methodname>
2788          to validate the files and extract metadata.
2789        </para>
        <programlisting language="java">
DbControl dc = ...
FileStoreEnabled item = ...
Map&lt;DataFileType, File&gt; files = ...

// Store the selected files in the fileset
FileSet fileSet = item.getFileSet();
for (Map.Entry&lt;DataFileType, File&gt; entry : files.entrySet())
{
   DataFileType type = entry.getKey();
   File file = entry.getValue();
   fileSet.setMember(type, file);
}

// Validate the files and extract metadata
fileSet.validate(dc, true);
</programlisting>
2809        <para>
2810          Validation and extraction of metadata is important since we want
2811          data in files to be equivalent to data in the database. The validation
          and metadata extraction is done by the core when
          <methodname>FileSet.validate()</methodname> is called.
2814          The process is partly pluggable since each <classname docapi="net.sf.basedb.core">DataFileType</classname> 
2815          can name a class that should do the validation and/or metadata extraction.
2816        </para>
2818        <note>
2819          <para>
          The <methodname>FileSet.validate()</methodname> method only validates
          the files where the file types have specified plug-ins that can
          do the validation and metadata extraction. The method doesn't
          throw any exceptions. Instead, all validation errors
          are returned as a list of <classname>Throwable</classname> objects. The
          validation result is also stored for each file and can be accessed
          with <methodname>FileSetMember.isValid()</methodname> and
          <methodname>FileSetMember.getErrorMessage()</methodname>.
2828          </para>
2829        </note>
2831        <para>
2832          Here is the general outline of what is going on in the core:
2833        </para>
2835        <orderedlist>
2836        <listitem>
2837          <para>
2838          The core checks the <classname docapi="net.sf.basedb.core">DataFileType</classname> of all
2839          members in the file set and creates <classname docapi="net.sf.basedb.core.filehandler">DataFileValidator</classname>
2840          and <classname docapi="net.sf.basedb.core.filehandler">DataFileMetadataReader</classname> objects. Only one instance
          of each class is created. If the file set contains members that have the
2842          same validator or metadata reader, they will all share the same instance.
2843          </para>
2844        </listitem>
2846        <listitem>
2847          <para>
2848          Each validator/metadata reader class is initialised with calls to
2849          <methodname>DataFileHandler.setItem()</methodname> and
2850          <methodname>DataFileHandler.setFile()</methodname>.
2851          </para>
2852        </listitem>
2854        <listitem>
2855          <para>
2856          Each validator is called. The result of the validation is saved for each
          file and can be retrieved by <methodname>FileSetMember.isValid()</methodname>
2858          and <methodname>FileSetMember.getErrorMessage()</methodname>.
2859          </para>
2860        </listitem>
2862        <listitem>
2863          <para>
2864          Each metadata reader is called, unless the metadata reader is the same class
2865          as the validator and the validation failed. If the metadata reader is a
2866          different class, it is called even if the validation failed.
2867          </para>
2868        </listitem>
2869        </orderedlist>
2871        <note>
2872          <title>Only one instance of each validator class is created</title>
2873          <para>
          The validation/metadata extraction is not done until all files have been
          added to the file set. If the same validator/metadata reader is
          used for more than one file, the same instance is reused. That is,
          the <methodname>setFile()</methodname> method is called once
          for each file/file type pair. The <methodname>validate()</methodname>
          and <methodname>extractMetadata()</methodname> methods are only
          called once.
2881          </para>
2882        </note>
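        <para>
          The call order described in the note can be illustrated with a small,
          self-contained sketch. The <classname>SketchHandler</classname> class
          below is a hypothetical stand-in and not part of the BASE API. It only
          mimics the life cycle: one shared instance, one
          <methodname>setFile()</methodname> call per file and a single
          <methodname>validate()</methodname> call at the end. The file names
          are made up for the example.
        </para>

        <programlisting language="java">
class HandlerLifecycleSketch
{
   /** Hypothetical stand-in for a validator; not a BASE class. */
   static class SketchHandler
   {
      private final StringBuilder files = new StringBuilder();
      int setFileCalls = 0;
      int validateCalls = 0;

      // Called once for each file that uses this handler
      void setFile(String fileName)
      {
         setFileCalls++;
         files.append(fileName).append(';');
      }

      // Called once, after all files have been registered
      String validate()
      {
         validateCalls++;
         return files.toString();
      }
   }

   public static void main(String[] args)
   {
      // Two files use the same validator class, so one instance is shared
      SketchHandler shared = new SketchHandler();
      for (String name : new String[] { "first.cel", "second.cel" })
      {
         shared.setFile(name);
      }
      String seen = shared.validate();
      System.out.println(shared.setFileCalls + " setFile() calls, "
         + shared.validateCalls + " validate() call, files: " + seen);
   }
}
</programlisting>

        <para>
          A consequence of this design is that a handler has to remember every
          file it was given in <methodname>setFile()</methodname>, since it will
          not get a second chance to see them before validation starts.
        </para>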
2884        <para>
          All validators and metadata readers should extend
          the <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> class. The reason
          is that we may want to add more methods to the <interfacename docapi="net.sf.basedb.core.filehandler">DataFileHandler</interfacename>
          interface in the future. The <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> class will
          then provide default implementations for backwards compatibility.
2890        </para>
2892      </sect3>
2894      <sect3 id="core_api.data_in_files.import">
2895        <title>Use case: Import data into the database</title>
2897        <para>
2898          This should be done by existing plug-ins in the same way as before.
2899          A slight modification is needed since it is good if the importers
2900          are made aware of already selected files in the <classname docapi="net.sf.basedb.core">FileSet</classname>
2901          to provide good default values. The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname>
2902          class is very useful in cases like this:
2903        </para>
        <programlisting language="java">
RawBioAssay rba = ...
DbControl dc = ...

// Get the current raw data file, if any
List&lt;File&gt; rawDataFiles =
   FileStoreUtil.getGenericDataFiles(dc, rba, FileType.RAW_DATA);
File defaultFile = rawDataFiles.size() > 0 ?
   rawDataFiles.get(0) : null;

// Create parameter asking for input file - use current as default
PluginParameter&lt;File&gt; fileParameter = new PluginParameter&lt;File&gt;(
   "file",
   "Raw data file",
   "The file that contains the raw data that you want to import",
   new FileParameterType(defaultFile, true, 1)
);
</programlisting>

2924      <para>
2925        An import plug-in should also save the file that was used to the file set:
2926      </para>
      <programlisting language="java">
RawBioAssay rba = ...
DbControl dc = ...

// The file the user selected to import from
File rawDataFile = (File)job.getValue("file");

// Save the file to the file set. The method will check which file
// type the platform uses as the raw data type. As a fallback the
// GENERIC_RAW_DATA type is used
FileStoreUtil.setGenericDataFile(dc, rba, FileType.RAW_DATA,
   DataFileType.GENERIC_RAW_DATA, rawDataFile);
</programlisting>

2940      </sect3>
2942      <sect3 id="core_api.data_in_files.experiments">
2943        <title>Use case: Using raw data from files in an experiment</title>
2945        <para>
2946          Just as before, an experiment is still locked to a single
2947          <classname docapi="net.sf.basedb.core">RawDataType</classname>. This is a design issue that
2948          would break too many things if changed. If data is stored in files
2949          the experiment is also locked to a single <classname docapi="net.sf.basedb.core">Platform</classname>.
2950          This has been designed to have as little impact on existing
2951          plug-ins as possible. In most cases, the plug-ins will continue
2952          to work as before.
2953        </para>
2955        <para>
          A plug-in that uses data from the database and needs to check if it can
          be used within an experiment can still do:
2958        </para>
        <programlisting language="java">
Experiment e = ...
RawDataType rdt = e.getRawDataType();
if (rdt.isStoredInDb())
{
   // Check number of channels, etc...
   // ... run plug-in code ...
}
</programlisting>

2970        <para>
2971          A newer plug-in which uses data from files should do:
2972        </para>
        <programlisting language="java">
Experiment e = ...
DbControl dc = ...
RawDataType rdt = e.getRawDataType();
if (!rdt.isStoredInDb())
{
   // Check that platform/variant is supported
   Platform p = rdt.getPlatform(dc);
   PlatformVariant v = rdt.getVariant(dc);
   // ...

   // Get data files
   File aFile = FileStoreUtil.getDataFile(dc, ...);

   // ... run plug-in code ...
}
</programlisting>

2992      </sect3>
2994    </sect2>
2996    <sect2 id="core_api.signals">
2997      <title>Sending signals (to plug-ins)</title>
2999      <para>
3000        BASE has a simple system for sending signals between different parts of
3001        a system. This signalling system was initially developed to be able to
3002        kill plug-ins that a user for some reason wanted to abort. The signalling
3003        system as such is not limited to this and it can be used for other purposes
3004        as well. Signals can of course be handled internally in a single JVM but
3005        also sent externally to other JVM:s running on the same or a different
3006        computer. The transport mechanism for signals is decoupled from the actual
        handling of them. If you want to, you could implement a signal transporter
        that sends signals as emails and the target plug-in would never know.
3009      </para>
3011      <para>
3012        The remainder of this section will focus mainly on the sending and
3013        transportation of signals. For more information about handling signals
3014        on the receiving end, see <xref linkend="plugin_developer.signals" />.
3015      </para>
3017      <sect3 id="core_api.signals.diagram">
3018        <title>Diagram of classes and methods</title>
3019        <figure id="core_api.figures.signals">
3020          <title>The signalling system</title>
3021          <screenshot>
3022            <mediaobject>
3023              <imageobject>
3024                <imagedata 
3025                  align="center"
3026                  scalefit="1" width="100%"
3027                  fileref="figures/uml/corelayer.signals.png" format="PNG" />
3028              </imageobject>
3029            </mediaobject>
3030          </screenshot>
3031        </figure>
3033        <para>
          The signalling system is rather simple. An object that wishes
          to receive signals must implement the
          <interfacename docapi="net.sf.basedb.core.signal"
          >SignalTarget</interfacename> interface. Its only method
          is <methodname>getSignalHandler()</methodname>. A
3039          <interfacename docapi="net.sf.basedb.core.signal"
3040          >SignalHandler</interfacename> is an object that
3041          knows what to do when a signal is delivered to it. The target object
3042          may implement the <interfacename>SignalHandler</interfacename> itself
3043          or use one of the existing handlers.
3044        </para>
3046        <para>
3047          The difficult part here is to be aware that a signal is usually
3048          delivered by a separate thread. The target object must be aware
3049          of this and know how to handle multiple threads. As an example we
3050          can use the <classname docapi="net.sf.basedb.core.signal"
3051          >ThreadSignalHandler</classname> which simply
3052          calls <code>Thread.interrupt()</code> to deliver a signal. The target
3053          object that uses this signal handler it must know that it should check
3054          <code>Thread.interrupted()</code> at regular intervals from the main
3055          thread. If that method returns true, it means that the <constant>ABORT</constant>
3056          signal has been delivered and the main thread should clean up and exit as
3057          soon as possible.
3058        </para>
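        <para>
          The polling pattern can be sketched with nothing but the standard
          <classname>java.lang.Thread</classname> class. The example below does
          not use any BASE classes. The <classname>AbortableTask</classname>
          class and its method names are made up for the illustration, and the
          call to <code>interrupt()</code> only stands in for what a
          thread-interrupting signal handler would do. Note that
          <code>Thread.interrupted()</code> clears the interrupt flag when it
          returns true, so the result must be acted upon immediately.
        </para>

        <programlisting language="java">
class AbortableTask
{
   // Starts a worker thread, interrupts it and reports whether the
   // worker noticed the interrupt and left its work loop
   static boolean runAndAbort() throws InterruptedException
   {
      boolean[] aborted = { false };
      Thread worker = new Thread(() ->
      {
         // Poll the interrupt flag between units of work
         while (!Thread.interrupted())
         {
            Thread.yield(); // placeholder for one unit of real work
         }
         // The flag was set: clean up and exit as soon as possible
         aborted[0] = true;
      });
      worker.start();
      worker.interrupt(); // stands in for the delivery of an abort signal
      worker.join();      // after join() the worker's write is visible here
      return aborted[0];
   }

   public static void main(String[] args) throws InterruptedException
   {
      System.out.println("Worker aborted: " + runAndAbort());
   }
}
</programlisting>

        <para>
          A real plug-in that also calls blocking methods, for example
          <code>Thread.sleep()</code>, must in addition be prepared to handle
          the <classname>InterruptedException</classname> those methods throw
          when the thread is interrupted.
        </para>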
3060        <para>
3061          Even if a signal handler could be given directly to the party
3062          that may be interested in sending a signal to the target this
3063          is not recommended. This would only work when sending signals
3064          within the same virtual machine. The signalling system includes
3065          <interfacename docapi="net.sf.basedb.core.signal"
3066          >SignalTransporter</interfacename> and
3067          <interfacename docapi="net.sf.basedb.core.signal"
3068          >SignalReceiver</interfacename> objects that are used
          to decouple the sending of signals from the handling of signals. The
          implementations usually come in pairs, for example
          <classname docapi="net.sf.basedb.core.signal"
          >SocketSignalTransporter</classname> and <classname 
          docapi="net.sf.basedb.core.signal">SocketSignalReceiver</classname>.
3074        </para>
3076        <para>
3077          Setting up the transport mechanism is usually a system responsibility.
          Only the system knows what kind of transport is appropriate for its current
          setup. For example, should signals be delivered by TCP/IP sockets, only internally, or
          should a delivery mechanism based on web services be implemented?
          If a system wants to receive signals it must create an appropriate
          <interfacename>SignalReceiver</interfacename> object. Within BASE the
          internal job queue sets up its own signalling system that can be used to
          send signals (eg. kill) to running jobs. The job agents do the same but use
          a different implementation. See <xref linkend="appendix.base.config.jobqueue" />
          for more information about how to configure the internal job queue's
          signal receiver. In both cases, there is only one signal receiver instance
          active in the system.
3089        </para>
3091        <para>
3092          Let's take the internal job queue as an example. Here is how it works:
3093        </para>
3095        <itemizedlist>
3096        <listitem>
3097          <para>
3098          When the internal job queue is started, it will also create a signal
3099          receiver instance according to the settings in <filename>base.config</filename>.
          The default is to create a <classname docapi="net.sf.basedb.core.signal"
3101          >LocalSignalReceiver</classname>
3102          which can only be used inside the same JVM. If needed, this can
3103          be changed to a <classname docapi="net.sf.basedb.core.signal"
3104          >SocketSignalReceiver</classname> or any other
3105          user-provided implementation.
3106          </para>
3107        </listitem>
3109        <listitem>
3110          <para>
3111          When the job queue has found a plug-in to execute it will check if
3112          it also implements the <interfacename docapi="net.sf.basedb.core.signal"
3113          >SignalTarget</interfacename>
3114          interface. If it does, a signal handler is created and registered
3115          with the signal receiver. This is actually done by the BASE core
3116          by calling <methodname>PluginExecutionRequest.registerSignalReceiver()</methodname>
          which also makes sure that the ID returned from the registration is
3118          stored in the database together with the job item representing the
3119          plug-in to execute.
3120          </para>
3121        </listitem>
3123        <listitem>
3124          <para>
          Now, when the web client sees a running job which has a non-empty
          signal transporter property, the <guilabel>Abort</guilabel>
          button is activated. If the user clicks this button the BASE core
          uses the information in the database to create a
          <interfacename docapi="net.sf.basedb.core.signal"
          >SignalTransporter</interfacename> object. This
          is simply done by calling <code>Job.getSignalTransporter()</code>.
          The created signal transporter knows how to send a signal
          to the signal receiver it was first registered with. When the
          signal arrives at the receiver it will find the handler for it
          and call <code>SignalHandler.handleSignal()</code>. This will in turn
          trigger some action in the signal target, which will soon abort what
          it is doing and exit.
3138          </para>
3139        </listitem>
3140        </itemizedlist>
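        <para>
          The three steps above can be condensed into a self-contained sketch.
          All types in it (<classname>Handler</classname>,
          <classname>Receiver</classname> and <classname>Transporter</classname>)
          are simplified, hypothetical stand-ins and not the BASE interfaces.
          The integer ID and the direct method call used as "transport" are
          assumptions made to keep the example short. The point is only that
          the sender needs nothing but the stored ID to reach the handler.
        </para>

        <programlisting language="java">
class SignalRoutingSketch
{
   /** Hypothetical handler; knows what to do when a signal arrives. */
   interface Handler
   {
      void handleSignal(String signal);
   }

   /** Hypothetical receiver that hands out an ID for each handler. */
   static class Receiver
   {
      // A fixed-size array keeps the sketch short
      private final Handler[] handlers = new Handler[16];
      private int count = 0;

      // Registers a handler and returns the ID a sender needs to reach it
      synchronized int register(Handler handler)
      {
         handlers[count] = handler;
         return count++;
      }

      // Finds the handler for the ID and delivers the signal to it
      synchronized void deliver(int id, String signal)
      {
         handlers[id].handleSignal(signal);
      }
   }

   /** Hypothetical transporter; a direct call here, a socket elsewhere. */
   static class Transporter
   {
      private final Receiver receiver;
      private final int id;

      Transporter(Receiver receiver, int id)
      {
         this.receiver = receiver;
         this.id = id;
      }

      void send(String signal)
      {
         receiver.deliver(id, signal);
      }
   }

   // Registers a target, sends ABORT through a transporter and returns
   // the signal that the target's handler received
   static String sendAbort()
   {
      String[] received = { null };
      Receiver receiver = new Receiver();
      int storedId = receiver.register(signal -> received[0] = signal);

      // Later, the sender only needs the stored ID, never the handler
      Transporter transporter = new Transporter(receiver, storedId);
      transporter.send("ABORT");
      return received[0];
   }

   public static void main(String[] args)
   {
      System.out.println("Target received: " + sendAbort());
   }
}
</programlisting>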
3143      </sect3>
3145    </sect2>
3147  </sect1>
3149  <sect1 id="api_overview.query_api">
3150    <title>The Query API</title>
3151    <para>
3152      This documentation is only available in the old format.
3153      See <ulink url=""
3154        ></ulink>
3155    </para>
3157  </sect1>
3159  <sect1 id="api_overview.dynamic_and_batch_api">
3160    <title>Analysis and the Dynamic and Batch API:s</title>
3161    <para>
3162      This documentation is only available in the old format.
3163      See <ulink url=""
3164        ></ulink>
3165    </para>
3166  </sect1>
3168  <sect1 id="api_overview.extensions">
3169    <title>Extensions API</title>
3171    <sect2 id="api_overview.extensions.core">
3172      <title>The core part</title>
3174      <para>
        The <emphasis>Extensions API</emphasis> is divided into two parts: a core
        part and a web client specific part. The core part can be found in the
        <package>net.sf.basedb.util.extensions</package> package and its sub-packages,
        and consists of three sub-parts:
3179      </para>
3181      <itemizedlist>
3182      <listitem>
3183        <para>
        A set of interface definitions which form the core of the Extensions API.
        The interfaces define, for example, what an <interfacename 
3186        docapi="net.sf.basedb.util.extensions">Extension</interfacename> is and
3187        what an <interfacename 
3188        docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename> should do.
3189        </para>
3190      </listitem>
3192      <listitem>
3193        <para>
3194        A <classname docapi="net.sf.basedb.util.extensions">Registry</classname> that is
3195        used to keep track of installed extensions. The registry also provides
3196        functionality for invoking and using the extensions.
3197        </para>
3198      </listitem>
3200      <listitem>
3201        <para>
        Utility classes that are useful when implementing a client application
        that should be extensible. The most useful example is the <classname
3204        docapi="net.sf.basedb.util.extensions.xml">XmlLoader</classname> which can
3205        read extension definitions from XML files and create the proper factories,
3206        etc.
3207        </para>
3208      </listitem>
3209      </itemizedlist>
3211      <figure id="core_api.figures.extensions_core">
3212        <title>The core part of the Extensions API</title>
3213        <screenshot>
3214          <mediaobject>
3215            <imageobject>
3216              <imagedata 
3217                align="center"
3218                fileref="figures/uml/corelayer.extensions_core.png" format="PNG" />
3219            </imageobject>
3220          </mediaobject>
3221        </screenshot>
3222      </figure>
3224      <para>
3225        The <classname docapi="net.sf.basedb.util.extensions">Registry</classname> 
3226        is one of the main classes in the extension system. All extension points and
3227        extensions must be registered before they can be used. Typically, you will
        first register extension points and then extensions, because an extension
3229        can't be registered until the extension point it is extending has been
3230        registered.
3231      </para>
3233      <para>
3234        An <interfacename docapi="net.sf.basedb.util.extensions">ExtensionPoint</interfacename>
3235        is an ID and a definition of an <interfacename docapi="net.sf.basedb.util.extensions">Action</interfacename>
3236        class. The other options (name, description, renderer factory, etc.) are optional.
3237        An <interfacename docapi="net.sf.basedb.util.extensions">Extension</interfacename>
3238        that extends a specific extension point must provide an
3239        <interfacename docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename>
3240        instance that can create actions of the type the extension point requires.
3241      </para>
3243      <example id="core_api.example.extensions_core">
        <title>The menu extension point</title>
3245        <para>
3246        The <code></code> extension point
3247        requires <interfacename 
3248        docapi="">MenuItemAction</interfacename>
3249        objects. An extension for this extension point must provide a factory that
3250        can create <classname>MenuItemAction</classname>:s. BASE ships with default
3251        factory implementations, for example the <classname 
3252        docapi="">FixedMenuItemFactory</classname>
        class, but an extension may provide its own factory implementation if it wants to.
3254        </para>
3255      </example>
3257      <para>
3258        Call the <methodname>Registry.useExtensions()</methodname> method
3259        to use extensions from one or several extension points. This method will
3260        find all extensions for the given extension points. If a filter is given,
        it checks if any of the extensions or extension points have been disabled.
3262        It will then call <methodname>ActionFactory.prepareContext()</methodname>
3263        for all remaining extensions. This gives the action factory a chance to
3264        also disable the extension, for example, if the logged in user doesn't
3265        have a required permission. The action factory may also set attributes
3266        on the context. The attributes can be anything that the extension point
3267        may make use of. Check the documentation for the specific extension point
3268        for information about which attributes it supports. If there are
3269        any renderer factories, their <methodname>RendererFactory.prepareContext()</methodname>
3270        is also called. They have the same possibility of setting attributes
3271        on the context, but can't disable an extension.
3272      </para>
3274      <para>
3275        After this, an <classname 
3276        docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname>
3277        object is created and returned to the extension point. Note that
3278        the <methodname>ActionFactory.getActions()</methodname> has not been
3279        called yet, so we don't know if the extensions are actually
3280        going to generate any actions. The <methodname>ActionFactory.getActions()</methodname>
3281        is not called until we have got ourselves an
3282        <classname docapi="net.sf.basedb.util.extensions">ActionIterator</classname>
        from the <methodname>ExtensionsInvoker.iterate()</methodname> method and
        start to iterate. The call to <methodname>ActionIterator.hasNext()</methodname>
3285        will propagate down to <methodname>ActionFactory.getActions()</methodname>
3286        and the generated actions are then available with the
3287        <methodname></methodname> method.
3288      </para>
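      <para>
        The deferred call can be illustrated with a self-contained sketch.
        The <classname>CountingFactory</classname> and
        <classname>LazyIterator</classname> classes below are hypothetical
        stand-ins, not the BASE classes, and actions are represented by plain
        strings. The sketch only shows the principle: creating the iterator
        costs nothing, and the factory is not asked for actions until
        <methodname>hasNext()</methodname> is called for the first time.
      </para>

      <programlisting language="java">
import java.util.NoSuchElementException;

class LazyActionsSketch
{
   /** Hypothetical factory that counts how often actions are requested. */
   static class CountingFactory
   {
      int getActionsCalls = 0;

      String[] getActions()
      {
         getActionsCalls++;
         return new String[] { "menu item A", "menu item B" };
      }
   }

   /** Hypothetical iterator that defers getActions() until first use. */
   static class LazyIterator
   {
      private final CountingFactory factory;
      private String[] actions; // null until hasNext() is first called
      private int index = 0;

      LazyIterator(CountingFactory factory)
      {
         this.factory = factory;
      }

      boolean hasNext()
      {
         if (actions == null) actions = factory.getActions();
         return actions.length > index;
      }

      String next()
      {
         if (!hasNext()) throw new NoSuchElementException();
         return actions[index++];
      }
   }

   public static void main(String[] args)
   {
      CountingFactory factory = new CountingFactory();
      LazyIterator it = new LazyIterator(factory);
      System.out.println("Calls before iterating: " + factory.getActionsCalls);
      while (it.hasNext())
      {
         System.out.println(it.next());
      }
      System.out.println("Calls after iterating: " + factory.getActionsCalls);
   }
}
</programlisting>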
3290      <para>
        The <methodname>ExtensionsInvoker.renderDefault()</methodname>
        and <methodname>ExtensionsInvoker.render()</methodname> methods are
        just convenience methods that make it easier to render
        the actions. The first method will of course only work if the
        extension point provides a renderer factory that can
        create the default renderer.
3297      </para>
3299      <note>
3300        <title>Be aware of multi-threading issues</title>
3301        <para>
3302          When you are creating extensions you must be aware that
3303          multiple threads may access the same objects at the same time.
3304          In particular, any action factory or renderer factory has to be
3305          thread-safe, since only one exists for each extension.
3306          Action and renderer objects should be thread-safe if the
3307          factories re-use the same objects.
3308        </para>
3309      </note>
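      <para>
        As a rough illustration of what thread-safe means for a shared
        factory, consider the sketch below. The
        <classname>CountingFactory</classname> class is hypothetical and does
        not implement any BASE interface. It keeps its configuration in an
        immutable field, keeps its only mutable state in an
        <classname>AtomicInteger</classname> and returns a new action object
        (here just a string array) on every call, so that nothing mutable is
        shared between threads without protection.
      </para>

      <programlisting language="java">
import java.util.concurrent.atomic.AtomicInteger;

class ThreadSafeFactorySketch
{
   /** Hypothetical factory; one instance is shared by all threads. */
   static class CountingFactory
   {
      // Immutable state can be shared freely between threads
      private final String title;
      // Mutable state must use a thread-safe type or synchronization
      private final AtomicInteger created = new AtomicInteger();

      CountingFactory(String title)
      {
         this.title = title;
      }

      // Returns a new array on each call, so results are never shared
      String[] getActions()
      {
         created.incrementAndGet();
         return new String[] { title };
      }

      int getCreatedCount()
      {
         return created.get();
      }
   }

   // Calls one shared factory from two threads and returns the total
   // number of getActions() calls that the factory counted
   static int callFromTwoThreads(int callsPerThread) throws InterruptedException
   {
      CountingFactory factory = new CountingFactory("My menu item");
      Runnable work = () ->
      {
         for (int i = callsPerThread; i > 0; i--)
         {
            factory.getActions();
         }
      };
      Thread t1 = new Thread(work);
      Thread t2 = new Thread(work);
      t1.start();
      t2.start();
      t1.join();
      t2.join();
      return factory.getCreatedCount();
   }

   public static void main(String[] args) throws InterruptedException
   {
      System.out.println("Actions created: " + callFromTwoThreads(1000));
   }
}
</programlisting>

      <para>
        With a plain <code>int</code> counter instead of the
        <classname>AtomicInteger</classname>, concurrent increments could be
        lost and the reported total could end up lower than the real number
        of calls.
      </para>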
3311      <para>
        Any errors that happen during usage of an extension are handled by an
3313        <interfacename docapi="net.sf.basedb.util.extensions">ErrorHandler</interfacename>.
3314        The core provides two implementations. We usually don't want the
3315        errors to show up in the gui so the <classname 
3316        docapi="net.sf.basedb.util.extensions">LoggingErrorHandlerFactory</classname> 
3317        is the default implementation that only writes to the log file. The
3318        <classname 
3319        docapi="net.sf.basedb.util.extensions">RethrowErrorHandlerFactory</classname>
3320        error handler can be used to re-throw exceptions which usually means that
3321        they trickle up to the gui and are shown to the user. It is also
3322        possible for an extension point to provide its own implementation of
3323        an <interfacename docapi="net.sf.basedb.util.extensions">ErrorHandlerFactory</interfacename>.
3324      </para>
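      <para>
        The difference between the two strategies can be sketched as follows.
        The <classname>Handler</classname>, <classname>LoggingHandler</classname>
        and <classname>RethrowHandler</classname> types are simplified,
        hypothetical stand-ins and not the BASE classes, and the
        <methodname>invoke()</methodname> helper is an assumption about how a
        caller might route errors through a handler.
      </para>

      <programlisting language="java">
class ErrorHandlingSketch
{
   /** Hypothetical handler; decides what happens to an error. */
   interface Handler
   {
      void handleError(RuntimeException ex);
   }

   /** Writes the error to a log and lets the caller continue. */
   static class LoggingHandler implements Handler
   {
      int logged = 0;

      public void handleError(RuntimeException ex)
      {
         logged++;
         System.err.println("Extension failed: " + ex.getMessage());
      }
   }

   /** Passes the error on so that it reaches the caller, eg. the gui. */
   static class RethrowHandler implements Handler
   {
      public void handleError(RuntimeException ex)
      {
         throw ex;
      }
   }

   // Runs one extension and routes any failure through the handler.
   // Returns true if the extension completed without errors.
   static boolean invoke(Runnable extension, Handler handler)
   {
      try
      {
         extension.run();
         return true;
      }
      catch (RuntimeException ex)
      {
         handler.handleError(ex);
         return false;
      }
   }

   public static void main(String[] args)
   {
      Runnable broken = () -> { throw new IllegalStateException("no permission"); };

      // Logging strategy: the failure is recorded and execution continues
      LoggingHandler logging = new LoggingHandler();
      System.out.println("Completed: " + invoke(broken, logging));

      // Rethrow strategy: the failure reaches the caller
      try
      {
         invoke(broken, new RethrowHandler());
      }
      catch (IllegalStateException ex)
      {
         System.out.println("Caller saw: " + ex.getMessage());
      }
   }
}
</programlisting>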
3326    </sect2>
3328    <sect2 id="api_overview.extensions.web">
3329      <title>The web client part</title>
3331      <para>
3332        The web client specific parts of the Extensions API can be found
3333        in the <package>net.sf.basedb.client.web.extensions</package> package
        and its subpackages. The top-level package contains classes used to
        administer the extension system. One example is the
        <classname docapi="net.sf.basedb.client.web.extensions">ExtensionsControl</classname> 
        class, which is the master controller for the web client extensions. It:
3338      </para>
3340      <itemizedlist>
3341      <listitem>
3342        <para>
3343        Keeps track of installed extensions and which JAR or XML file they are
3344        installed from.
3345        </para>
3346      </listitem>
3348      <listitem>
3349        <para>
3350        Can, manually or automatically, find and install new or
3351        updated extensions and uninstall deleted extensions.
3352        </para>
3353      </listitem>
3355      <listitem>
3356        <para>
3357        Adds permission control to the extension system, so that only an
3358        administrator is allowed to change settings, enable/disable extensions,
3359        etc.
3360        </para>
3361      </listitem>
3362      </itemizedlist>
3364      <para>
3365        In the top-level package there are also some abstract classes that may
3366        be useful to extend for developers creating their own extensions.
3367        For example, we recommend that all action factories extend the <classname 
3368        docapi="net.sf.basedb.client.web.extensions">AbstractJspActionFactory</classname>
3369        class.
3370      </para>
3372      <para>
3373        The sub-packages to <package>net.sf.basedb.client.web.extensions</package>
3374        are mostly specific to a single extension point or to a specific type of
3375        extension point. The <package></package>
3376        package, for example, contains classes that are/can be used for extensions
3377        adding menu items to the <menuchoice><guimenu>Extensions</guimenu></menuchoice>
3378        menu.
3379      </para>
3381      <figure id="core_api.figures.extensions_web">
3382        <title>The web client part of the Extensions API</title>
3383        <screenshot>
3384          <mediaobject>
3385            <imageobject>
3386              <imagedata 
3387                align="center"
3388                fileref="figures/uml/corelayer.extensions_web.png" format="PNG" />
3389            </imageobject>
3390          </mediaobject>
3391        </screenshot>
3392      </figure>
3394      <para>
3395        When the Tomcat web server is starting up, the <classname 
3396        docapi="net.sf.basedb.client.web.extensions">ExtensionsServlet</classname>
        is automatically loaded. This servlet has two purposes:
3398      </para>
3400      <itemizedlist>
3401      <listitem>
3402        <para>
3403        Initialise the extensions system by calling
3404        <methodname>ExtensionsControl.init()</methodname>. This will result in
3405        an initial scan for installed extensions, which is equivalent to doing
        a manual scan with the force update setting set to false. This means that
        the extension system is up and running as soon as the first user logs
        in to BASE.
3409        </para>
3410      </listitem>
3412      <listitem>
3413        <para>
3414        Act as a proxy for custom servlets defined by the extensions. URL:s
3415        ending with <code>.servlet</code> has been mapped to the
3416        <classname>ExtensionsServlet</classname>. When a request is made it
3417        will extract the name of the extension's JAR file from the
3418        URL, get the corresponding <classname 
3419        docapi="net.sf.basedb.client.web.extensions">ExtensionsFile</classname>
3420        and <classname docapi="net.sf.basedb.client.web.extensions">ServletWrapper</classname>
3421        and then invoke the custom servlet. More information can be found in
3422        <xref linkend="extensions_developer.servlets" />.
3423        </para>
3424      </listitem>
3426      </itemizedlist>
3428      <para>
3429        Using extensions only involves calling the
3430        <methodname>ExtensionsControl.createContext()</methodname> and
3431        <methodname>ExtensionsControl.useExtensions()</methodname> methods. This
3432        returns an <classname docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname> 
3433        object as described in the previous section.
3434      </para>
3436      <para>
        To render the actions, either use the
        <methodname>ExtensionsInvoker.iterate()</methodname> method
        and generate HTML from the information in each action, or
        (the better way) use a renderer together with the
        <classname docapi="net.sf.basedb.clients.web.taglib.extensions">Render</classname>
        taglib.
3443      </para>
3445      <para>
        To get information about the installed extensions, 
        change settings, enable/disable extensions, perform a manual
        scan, etc. use the <methodname>ExtensionsControl.get()</methodname>
        method. This will create a permission-controlled object. All
        users have read permission, administrators have write permission.
3451      </para>
3453      <note>
3454        <para>
3455          The permission we check for is WRITE permission on the
3456          web client item. This means it is possible to give a user
3457          permissions to manage the extension system by assigning
3458          WRITE permission to the web client entry in the database.
3459          Do this from <menuchoice>
3460            <guimenu>Administrate</guimenu>
3461            <guimenuitem>Clients</guimenuitem>
3462          </menuchoice>.
3463        </para>
3464      </note>
3466      <para>
3467        The <classname docapi="net.sf.basedb.clients.web.extensions">XJspCompiler</classname>
        is mapped to handle the compilation of <code>.xjsp</code> files
3469        which are regular JSP files with a different extension. This feature is
3470        experimental and requires installing an extra JAR into Tomcat's lib
3471        directory. See <xref linkend="plugins.installation.xjspcompiler" /> for
3472        more information.
3473      </para>
3475    </sect2>
3477  </sect1>
3479  <sect1 id="api_overview.other_api">
3480    <title>Other useful classes and methods</title>
3481    <para>
3482      TODO
3483    </para>
3484  </sect1>