source: trunk/doc/src/docbook/developerdoc/api_overview.xml @ 5071

Last change on this file since 5071 was 5071, checked in by Nicklas Nordborg, 14 years ago

References #108: Logging the change history of an item

  • Added the new data classes to UML diagram and made updates to all other affected diagrams
  • Documented the logging feature for plug-in developers
  • Made the new data classes immutable
  • Property svn:eol-style set to native
  • Property svn:keywords set to Id
File size: 125.3 KB
Line 
1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
5<!--
6  $Id: api_overview.xml 5071 2009-08-21 06:17:57Z nicklas $
7
8  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
9
10  This file is part of BASE - BioArray Software Environment.
11  Available at http://base.thep.lu.se/
12
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 3
16  of the License, or (at your option) any later version.
17
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
20  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
21  GNU General Public License for more details.
22
23  You should have received a copy of the GNU General Public License
24  along with BASE. If not, see <http://www.gnu.org/licenses/>.
25-->
26
27<chapter id="api_overview">
28  <?dbhtml dir="api"?>
29  <title>API overview (how to use and code examples)</title>
30
31  <sect1 id="api_overview.public_api">
32    <title>The Public API of BASE</title>
33   
34    <para>
35      Not all public classes and methods in the <filename>BASE2Core.jar</filename>
36      and other JAR files shipped with BASE are considered as
37      <emphasis>Public API</emphasis>. This is important knowledge
38      since we will always try to maintain backwards compatibility
39      for classes that are part of the public API. For other
40      classes, changes may be introduced at any time without
41      notice or specific documentation. In other words:
42    </para>
43   
44    <note>
45      <title>Only use the public API when developing plug-ins</title>
46      <para>
47        This will maximize the chance that you plug-in will continue
48        to work with the next BASE release. If you use the non-public API
49        you do so at your own risk.
50      </para>
51    </note>
52   
53    <para>
54      See the <ulink url="http://base.thep.lu.se/chrome/site/doc/api/index.html"
55        >javadoc</ulink> for information about
56      what parts of the API that contributes to the public API.
57      Methods, classes and other elements that have been tagged as
58      <code>@deprecated</code> should be considered as part of the internal API
59      and may be removed in a subsequent release without warning.
60    </para>
61   
62    <para>
63      See <xref linkend="appendix.incompatible" /> to read more about
64      changes that have been introduced by each release.
65    </para>
66
67    <sect2 id="api_overview.compatibility">
68      <title>What is backwards compatibility?</title>
69     
70      <para>
71        There is a great article about this subject on <ulink 
72        url="http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs"
73          >http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs</ulink>.
74        This is what we will try to comply with. If you do not want to
75        read the entire article, here are some of the most important points:
76      </para>
77     
78     
79      <sect3 id="api_overview.compatibility.binary">
80        <title>Binary compatibility</title>
81        <para>
82        <blockquote>
83          Pre-existing Client binaries must link and run with new releases of the
84          Component without recompiling.
85        </blockquote>
86       
87        For example:
88        <itemizedlist>
89        <listitem>
90          <para>
91            We cannot change the number or types of parameters to a method
92            or constructor.
93          </para>
94        </listitem>
95        <listitem>
96          <para>
97            We cannot add or change methods to interfaces that are intended
98            to be implemented by plug-in or client code.
99          </para>
100        </listitem>
101        </itemizedlist>
102        </para>       
103      </sect3>
104     
105      <sect3 id="api_overview.compatibility.contract">
106        <title>Contract compatibility</title>
107        <para>
108          <blockquote>
109          API changes must not invalidate formerly legal Client code.
110          </blockquote>
111       
112          For example:
113          <itemizedlist>
114          <listitem>
115            <para>
116              We cannot change the implementation of a method to do
117              things differently than before. For example, allow <constant>null</constant>
118              as a return value when it was not allowed before.
119            </para>
120          </listitem>
121          </itemizedlist>
122       
123          <note>
124            <para>
125            Sometimes there is a very fine line between what is considered a
126            bug and what is considered a feature. For example, if the
127            actual implementation does not do what the javadoc says,
128            do we change the code or do we change the documentation?
129            This has to be considered from case to case and depends on
130            the age of the code and if we expect plug-ins and clients to be
131            affected by it or not.
132            </para>
133          </note>
134        </para>
135      </sect3>
136     
137      <sect3 id="api_overview.compatibility.source">
138        <title>Source code compatibility</title>
139        <para>
140        This is not an important matter and is not always possible to
141        achieve. In most cases, the problems are easy to fix.
142        Example:
143       
144        <itemizedlist>
145        <listitem>
146          <para>
147          Adding a class may break a plug-in or client that import
148          classes with <constant>.*</constant> if the same class name
149          exists in another package.
150          </para>
151        </listitem>
152        </itemizedlist>
153        </para>
154      </sect3>
155    </sect2>
156  </sect1>
157
158  <sect1 id="api_overview.data_api" chunked="1">
159    <title>The database schema and the Data Layer API</title>
160
161    <para>
162      This section gives an overview of the entire data layer API.
163      The figure below show how different modules relate to each other.
164    </para>
165 
166    <figure id="data_api.figures.overview">
167      <title>Data layer overview</title>
168      <screenshot>
169        <mediaobject>
170          <imageobject>
171            <imagedata 
172              align="center"
173              scalefit="1" width="100%"
174              fileref="figures/uml/datalayer.overview.png" format="PNG" />
175          </imageobject>
176        </mediaobject>
177      </screenshot>
178    </figure>
179
180    <sect2 id="data_api.basic">
181      <title>Basic classes and interfaces</title>
182     
183      <para>
184        This document contains information about the basic classes and interfaces in this package.
185        They are important since all data-layer classes must inherit from one of the already
186        existing abstract base classes or implement one or more of the
187        existing interfaces. They contain code that is common to all classes,
188        for example implementations of the <methodname>equals()</methodname>
189        and <methodname>hashCode()</methodname> methods or how to link with the owner of an
190        item.
191      </para>
192     
193      <sect3 id="data_api.basic.uml">
194        <title>UML diagram</title>
195       
196        <figure id="data_api.figures.basic">
197          <title>Basic classes and interfaces</title>
198          <screenshot>
199            <mediaobject>
200              <imageobject>
201                <imagedata 
202                  align="center"
203                  fileref="figures/uml/datalayer.basic.png" format="PNG" />
204              </imageobject>
205            </mediaobject>
206          </screenshot>
207        </figure>
208      </sect3>
209     
210      <sect3 id="data_api.basic.classes">
211        <title>Classes</title>
212       
213        <variablelist>
214        <varlistentry>
215          <term><classname docapi="net.sf.basedb.core.data">BasicData</classname></term>
216          <listitem>
217            <para>
218            The root class. It overrides the <methodname>equals()</methodname>,
219            <methodname>hashCode()</methodname> and <methodname>toString()</methodname> methods
220            from the <classname>Object</classname> class. It also defines the
221            <varname>id</varname> and <varname>version</varname> properties.
222            All data layer classes must inherit from this class or one of it's subclasses.
223            </para>
224          </listitem>
225        </varlistentry>
226       
227        <varlistentry>
228          <term><classname docapi="net.sf.basedb.core.data">OwnedData</classname></term>
229          <listitem>
230            <para>
231            Extends the <classname>BasicData</classname> class and adds
232            an <varname>owner</varname> property. The owner is a required link to a
233            <classname docapi="net.sf.basedb.core.data">UserData</classname> object, representing the user that
234            is the owner of the item.
235            </para>
236          </listitem>
237        </varlistentry>
238
239        <varlistentry>
240          <term><classname docapi="net.sf.basedb.core.data">SharedData</classname></term>
241          <listitem>
242            <para>
243            Extends the <classname>OwnedData</classname> class and adds
244            properties (<varname>itemKey</varname> and <varname>projectKey</varname>)
245            that holds access permission information for an item.
246            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
247            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects. These objects only exists if
248            the item has been shared.
249            </para>
250          </listitem>
251        </varlistentry>
252
253        <varlistentry>
254          <term><classname docapi="net.sf.basedb.core.data">CommonData</classname></term>
255          <listitem>
256            <para>
257            This is a convenience class for items that extends the <classname>SharedData</classname>
258            class and implements the <interfacename docapi="net.sf.basedb.core.data">NameableData</interfacename> and
259            <interfacename docapi="net.sf.basedb.core.data">RemoveableData</interfacename> interfaces. This is one of
260            the most common situations.
261            </para>
262          </listitem>
263        </varlistentry>
264
265        <varlistentry>
266          <term><classname docapi="net.sf.basedb.core.data">AnnotatedData</classname></term>
267          <listitem>
268            <para>
269            This is a convenience class for items that can be annotated.
270            Annotations are held in <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> objects.
271            The annotation set only exists if annotations has been created for the item.
272            </para>
273          </listitem>
274        </varlistentry>
275        </variablelist>
276       
277      </sect3>
278     
279      <sect3 id="data_api.basic.interfaces">
280        <title>Interfaces</title>
281       
282        <variablelist>
283        <varlistentry>
284          <term><classname docapi="net.sf.basedb.core.data">IdentifiableData</classname></term>
285          <listitem>
286            <para>
287            All items are identifiable, which means that they have a unique <varname>id</varname>.
288            The id is unique for all items of a specific type (ie. class). The id is number
289            that is automatically generated by the database and has no other meaning
290            outside of the application. The <varname>version</varname> property is used for
291            detecting and preventing concurrent modifications to an item.
292            </para>
293          </listitem>
294        </varlistentry>
295       
296        <varlistentry>
297          <term><classname docapi="net.sf.basedb.core.data">OwnableData</classname></term>
298          <listitem>
299            <para>
300            An ownable item is an item which has an owner. The owner is represented as a
301            required link to a <classname docapi="net.sf.basedb.core.data">UserData</classname> object.
302            </para>
303          </listitem>
304        </varlistentry>       
305
306        <varlistentry>
307          <term><classname docapi="net.sf.basedb.core.data">ShareableData</classname></term>
308          <listitem>
309            <para>
310            A shareable item is an item which can be shared to other users, groups or projects.
311            Access permissions are held in <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname> and/or
312            <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> objects.
313            </para>
314          </listitem>
315        </varlistentry>
316             
317        <varlistentry>
318          <term><classname docapi="net.sf.basedb.core.data">NameableData</classname></term>
319          <listitem>
320            <para>
321            A nameable item is an item that has a name (required) and a description
322            (optional). The name doesn't have to be unique, except in a few special
323            cases (for example, the name of a file).
324            </para>
325          </listitem>
326        </varlistentry>
327       
328        <varlistentry>
329          <term><classname docapi="net.sf.basedb.core.data">RemovableData</classname></term>
330          <listitem>
331            <para>
332            A removable item is an item that can be flagged as removed. This doesn't
333            remove the information about the item from the database, but can be used by
334            client applications to hide items that the user is not interested in.
335            A trashcan function can be used to either restore or permanently
336            remove items that has the flag set.
337            </para>
338          </listitem>
339        </varlistentry>
340               
341        <varlistentry>
342          <term><classname docapi="net.sf.basedb.core.data">SystemData</classname></term>
343          <listitem>
344            <para>
345            A system item is an item which has an additional id in the form of string. A system id
346            is required when we need to make sure that we can get a specific item without
347            knowing the numeric id. Example of such items are the root user and the everyone group.
348            A system id is generally constructed like:
349            <constant>net.sf.basedb.core.User.ROOT</constant>. The system id:s are defined in the
350            core layer by each item class.
351            </para>
352          </listitem>
353        </varlistentry>
354
355        <varlistentry>
356          <term><classname docapi="net.sf.basedb.core.data">DiskConsumableData</classname></term>
357          <listitem>
358            <para>
359            This interface is used by items which occupies a lot of disk space and
360            should be part of the quota system, for example files. The required
361            <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname> contains information about the size,
362            location, owner etc. of the item.
363            </para>
364          </listitem>
365        </varlistentry>
366       
367        <varlistentry>
368          <term><classname docapi="net.sf.basedb.core.data">AnnotatableData</classname></term>
369          <listitem>
370            <para>
371            This interface is used by items which can be annotated. Annotations are name/value
372            pairs that are attached as extra information to an item. All annotations are
373            contained in an <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> object.
374            </para>
375          </listitem>
376        </varlistentry>
377       
378        <varlistentry>
379          <term><classname docapi="net.sf.basedb.core.data">ExtendableData</classname></term>
380          <listitem>
381            <para>
382            This interface is used by items which can have extra administrator-defined
383            columns. The functionality is similar to annotations. It is not as flexible,
384            since it is a global configuration, but has better performance. BASE will
385            generate extra database columns to store the data in the tables for items that
386            can be extended.
387            </para>
388          </listitem>
389        </varlistentry>
390       
391        <varlistentry>
392          <term><classname docapi="net.sf.basedb.core.data">BatchableData</classname></term>
393          <listitem>
394            <para>
395            This interface is a tagging interface which is used by items that needs batch
396            functionality in the core.
397            </para>
398          </listitem>
399        </varlistentry>
400       
401        <varlistentry>
402          <term><classname docapi="net.sf.basedb.core.data">RegisteredData</classname></term>
403          <listitem>
404            <para>
405            This interface is used by items which registered the date they were
406            created in the database. The registration date is set at creation time
407            and can't be modified later. Since this didn't exist prior to BASE 2.10,
408            null values are allowed on all pre-existing items. Note! For backwards
409            compatibility reasons with existing code in
410            <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
411            the method name is <methodname>getEntryDate()</methodname>.
412            </para>
413          </listitem>
414        </varlistentry>
415       
416        <varlistentry>
417          <term><interfacename docapi="net.sf.basedb.core.data">LoggableData</interfacename></term>
418          <listitem>
419            <para>
420            This is a tagging interface that indicates that the <classname 
421            docapi="net.sf.basedb.core.log.db">DbLogManagerFactory</classname> logging
422            implementation should log changes made to items that implements it.
423            </para>
424          </listitem>
425        </varlistentry>
426        </variablelist>
427
428      </sect3>
429    </sect2>
430   
431    <sect2 id="data_api.authentication">
432      <title>User authentication and access control</title>
433     
434      <para>
435         This section gives an overview of user authentication and
436         how groups, roles and projects are used for access control
437         to items.
438      </para>
439     
440      <sect3 id="data_api.authentication.uml">
441        <title>UML diagram</title>
442       
443        <figure id="data_api.figures.authentication">
444          <title>User authentication and access control</title>
445          <screenshot>
446            <mediaobject>
447              <imageobject>
448                <imagedata 
449                  align="center"
450                  scalefit="1" width="100%"
451                  fileref="figures/uml/datalayer.authentication.png" format="PNG" />
452              </imageobject>
453            </mediaobject>
454          </screenshot>
455        </figure>
456      </sect3>
457     
458      <sect3 id="data_api.authentication.users">
459        <title>Users and passwords</title>     
460     
461        <para>
462          The <classname docapi="net.sf.basedb.core.data">UserData</classname> class holds information about users.
463          We keep the passwords in a separate table and use proxies to avoid loading
464          password data each time a user is loaded to minimize security risks. It is
465          only if the password needs to be changed that the <classname docapi="net.sf.basedb.core.data">PasswordData</classname>
466          object is loaded. The one-to-one mapping between user and password is controlled
467          by the password class, but a cascade attribute on the user class makes sure
468          that the password is deleted when a user is deleted.
469        </para>
470      </sect3>
471
472      <sect3 id="data_api.authentication.groups">
473        <title>Groups, roles and projects</title>     
474     
475        <para>
476          The <classname docapi="net.sf.basedb.core.data">GroupData</classname>, <classname docapi="net.sf.basedb.core.data">RoleData</classname> and
477          <classname docapi="net.sf.basedb.core.data">ProjectData</classname> classes holds information about groups, roles
478          and projects respectively. A user may be a member of any number of groups,
479          roles and/or projects. The membership in a project comes with an attached
480          permission values. This is the highest permission the user has in the
481          project. No matter what permission an item has been shared with the
482          user will not get higher permission. Groups may be members of other groups and
483          also in projects.
484        </para>
485       
486        <para>
487          Group membership is always accounted for, but the core only allows
488          one project at a time to be use, this is the <emphasis>active project</emphasis>.
489          When a project is active new items that are created are automatically
490          added to that project with the permission given by the
491          <varname>autoPermission</varname> property.
492        </para>
493             
494      </sect3>
495     
496      <sect3 id="data_api.authentication.keys">
497        <title>Keys</title>     
498     
499        <para>
500          The <classname docapi="net.sf.basedb.core.data">KeyData</classname> class and it's subclasses
501          <classname docapi="net.sf.basedb.core.data">ItemKeyData</classname>, <classname docapi="net.sf.basedb.core.data">ProjectKeyData</classname> and
502          <classname docapi="net.sf.basedb.core.data">RoleKeyData</classname>, are used to store information about access
503          permissions to items. To get permission to manipulate an item a user must have
504          access to a key giving that permission. There are three types of keys:
505        </para>
506       
507        <variablelist>
508        <varlistentry>
509          <term><classname docapi="net.sf.basedb.core.data">ItemKey</classname></term>
510          <listitem>
511            <para>
512            Is used to give a user or group access to a specific item. The item
513            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item.
514            The permissions are usually set by the owner of the item. Once created an
515            item key cannot be changed. This allows the core to reuse a key if the
516            permissions match exactly, ie. for a given set of users/groups/permissions
517            there can be only one item key object.
518            </para>
519          </listitem>
520        </varlistentry>
521
522        <varlistentry>
523          <term><classname docapi="net.sf.basedb.core.data">ProjectKey</classname></term>
524          <listitem>
525            <para>
526            Is used to give members of a project access to a specific item. The item
527            must be a <interfacename docapi="net.sf.basedb.core.data">ShareableData</interfacename> item. Once created a
528            project key cannot be changed. This allows the core to reuse a key if the
529            permissions match exactly, ie. for a given set of projects/permissions
530            there can be only one project key object.
531            </para>
532          </listitem>
533        </varlistentry>
534
535        <varlistentry>
536          <term><classname docapi="net.sf.basedb.core.data">RoleKey</classname></term>
537          <listitem>
538            <para>
539            Is used to give a user access to all items of a specific type, ie.
540            <constant>READ</constant> all <constant>SAMPLES</constant>. The installation
541            will make sure that there already exists a role key for each type of item, and
542            it is not possible to add new or delete existing keys. Unlike the other two types
543            this key can be modified.
544            </para>
545           
546            <para>
547            A role key is also used to assign permissions to plug-ins. If a plug-in has
548            been specified to use permissions the default is to deny everything.
549            The mapping to the role key is used to grant permissions to the plugin.
550            The <varname>granted</varname> value gives the plugin access to all items
551            of the related item type regardless of if the user that is running the plug-in has the
552            permission or not. The <varname>denied</varname> values denies access to all
553            items of the related item type even if the logged in user has the permission.
554            Permissions that are not granted nor denied are checked against the
555            logged in users regular permissions. Permissions to items that are
556            not linked are always denied.
557            </para>
558          </listitem>
559        </varlistentry>
560        </variablelist>
561       
562      </sect3>
563
564      <sect3 id="data_api.authentication.permissions">
565        <title>Permissions</title>
566       
567        <para>
568          The <varname>permission</varname> property appearing in many classes is an
569          integer values describing the permission:
570        </para>
571       
572        <informaltable>
573        <tgroup cols="2">
574          <colspec colname="value" />
575          <colspec colname="permission" />
576          <thead>
577            <row>
578              <entry>Value</entry>
579              <entry>Permission</entry>
580            </row>
581          </thead>
582          <tbody>
583            <row>
584              <entry>1</entry>
585              <entry>Read</entry>
586            </row>
587            <row>
588              <entry>3</entry>
589              <entry>Use</entry>
590            </row>
591            <row>
592              <entry>7</entry>
593              <entry>Restricted write</entry>
594            </row>
595            <row>
596              <entry>15</entry>
597              <entry>Write</entry>
598            </row>
599            <row>
600              <entry>31</entry>
601              <entry>Delete</entry>
602            </row>
603            <row>
604              <entry>47 (=32+15)</entry>
605              <entry>Set owner</entry>
606            </row>
607            <row>
608              <entry>79 (=64+15)</entry>
609              <entry>Set permissions</entry>
610            </row>
611            <row>
612              <entry>128</entry>
613              <entry>Create</entry>
614            </row>
615            <row>
616              <entry>256</entry>
617              <entry>Denied</entry>
618            </row>
619          </tbody>
620        </tgroup>
621        </informaltable>
622       
623        <para>
624          The values are constructed so that
625          <constant>READ</constant> -&gt;
626          <constant>USE</constant> -&gt;
627          <constant>RESTRICTED_WRITE</constant> -&gt;
628          <constant>WRITE</constant> -&gt;
629          <constant>DELETE</constant>
630          are chained in the sense that a higher permission always implies the lower permissions
631          also. The <constant>SET_OWNER</constant> and <constant>SET_PERMISSION</constant>
632          both implies <constant>WRITE</constant> permission. The <constant>DENIED</constant>
633          permission is only valid for role keys, and if specified it overrides all
634          other permissions.               
635        </para>
636       
637        <para>
638          When combining permission for a single item the permission codes for the different
639          paths are OR-ed together. For example a user has a role key with <constant>READ</constant>
640          permission for <constant>SAMPLES</constant>, but also an item key with <constant>USE</constant>
641          permission for a specific sample. Of course, the resulting permission for that
642          sample is <constant>USE</constant>. For other samples the resulting permission is
643          <constant>READ</constant>.
644        </para>
645       
646        <para>
647          If the user is also a member of a project which has <constant>WRITE</constant>
648          permission for the same sample, the user will have <constant>WRITE</constant>
649          permission when working with that project.
650        </para>
651       
652        <para>
653          The <constant>RESTRICTED_WRITE</constant> permission is in most cases the same
654          as the <constant>WRITE</constant> permission. So far the <constant>RESTRICTED_WRITE</constant>
655          permission is only given to users to their own <classname docapi="net.sf.basedb.core.data">UserData</classname>
656          object so they can change their address and other contact information,
657          but not quota, expiration date and other administrative information.
658        </para>
659
660      </sect3>
661    </sect2>
662
663    <sect2 id="data_api.wares">
664      <title>Hardware and software</title>
665      <para>
666         This section gives an overview of hardware and software in BASE.
667      </para>
668     
669      <sect3 id="data_api.wares.uml">
670        <title>UML diagram</title>
671       
672        <figure id="data_api.figures.wares">
673          <title>Hardware and software</title>
674          <screenshot>
675            <mediaobject>
676              <imageobject>
677                <imagedata 
678                  align="center"
679                  fileref="figures/uml/datalayer.wares.png" format="PNG" />
680              </imageobject>
681            </mediaobject>
682          </screenshot>
683        </figure>
684      </sect3>
685     
686      <sect3 id="data_api.wares.description">
687        <title>Hardware and software</title>
688        <para>
689          BASE is pre-installed with a set of hardware and software types.
690          They are typically used to filter the registered hardware and software
691          depending on what a user is doing. For example, when adding raw data
692          to BASE a user can select a scanner. The GUI will display the hardware
693          that has been registered as <emphasis>scanner</emphasis> hardware types.
694          Other hardware types are <emphasis>hybridization station</emphasis>
695          and <emphasis>print robot</emphasis>. An administrator may register more
696          hardware and software types.
697        </para>
698      </sect3>
699    </sect2>
700   
701    <sect2 id="data_api.reporters">
702      <title>Reporters</title>
703      <para>
704         This section gives an overview of hardware and software in BASE.
705      </para>
706     
707      <sect3 id="data_api.reporters.uml">
708        <title>UML diagram</title>
709       
710        <figure id="data_api.figures.reporters">
711          <title>Reporters</title>
712          <screenshot>
713            <mediaobject>
714              <imageobject>
715                <imagedata 
716                  align="center"
717                  fileref="figures/uml/datalayer.reporters.png" format="PNG" />
718              </imageobject>
719            </mediaobject>
720          </screenshot>
721        </figure>
722      </sect3>
723     
724      <sect3 id="data_api.reporters.description">
725        <title>Reporters</title>
726        <para>
727          The <classname docapi="net.sf.basedb.core.data">ReporterData</classname> class holds information about reporters.
728          The <property>externalId</property> is a required property that must be unique
729          among all reporters. The external ID is the value BASE uses to match
730          reporters when importing data from files.
731        </para>
732       
733        <para>
734          The <classname>ReporterData</classname> is an <emphasis>extendable</emphasis>
735          class, which means that the server administrator can define additional
736          columns (=annotations) in the reporters table. These are accessed with
737          the <methodname>ReporterData.getExtended()</methodname> and
738          <methodname>ReporterData.setExtended()</methodname> methods.
739          See <xref linkend="appendix.extendedproperties" /> for more information about
740          this.
741        </para>
742       
743        <para>
744          The <classname>ReporterData</classname> is also a <emphasis>batchable</emphasis>
745          class which means that there is no corresponding class in the core
746          layer. Client applications and plug-ins should work directly with
747          the <classname>ReporterData</classname> class. To help manage the reporters
748          there is the <classname docapi="net.sf.basedb.core">Reporter</classname> and <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>
749          classes. The main reason for this
750          is to increase the performance and lower the memory usage by bypassing
751          internal caching in the core and Hibernate. Performance is also
752          increased by the batchers which uses more efficient SQL against the
753          database than Hibernate.
754        </para>
755       
756        <para>
757          The
758          <property>lastUpdate</property>
759          property holds the data and time the reporter information was last updated. The
760          value is managed automatically by the
761          <classname>ReporterBatcher</classname>
762          class. That goes for
763          <property>lastSource</property>
764          property too, which holds information about where the last update comes from. By
765          default this is set to the name of the logged in user, but it can be changed by
766          calling
767          <methodname>ReporterBatcher.setUpdateSource(String source)</methodname>
768          before the batcher commits the updates to the database. The source-string
769          should have the format: <synopsis>[ITEM_TYPE]:[ITEM_NAME]</synopsis> where,in
770          the file-case, ITEM_TYPE is File and ITEM_NAME is the file's name.
771        </para>
772      </sect3>
773     
774      <sect3 id="data_api.reporters.lists">
775        <title>Reporter lists</title>
776       
777        <para>
778          Reporter lists can be used to group reporters that are somehow related
779          to each other. This could for example be a list of interesting reporters
780          found in the analysis of an experiment. Each reporter in the list may
781          optionally be assigned a score. The meaning of the score value is not
782          interpreted by BASE.
783        </para>
784       
785      </sect3>
786     
787     
788    </sect2>
789
790    <sect2 id="data_api.quota">
791      <title>Quota and disk usage</title>
792      <para>
793         This section gives an overview of quota system in BASE
794         and how the disk usage is kept track of.
795      </para>
796     
797      <sect3 id="data_api.quota.uml">
798        <title>UML diagram</title>
799       
800        <figure id="data_api.figures.quota">
801          <title>Quota and disk usage</title>
802          <screenshot>
803            <mediaobject>
804              <imageobject>
805                <imagedata 
806                  align="center"
807                  fileref="figures/uml/datalayer.quota.png" format="PNG" />
808              </imageobject>
809            </mediaobject>
810          </screenshot>
811        </figure>
812      </sect3>
813     
814      <sect3 id="data_api.quota.description">
815        <title>Quota</title>
816       
817        <para>
818          The <classname docapi="net.sf.basedb.core.data">QuotaData</classname> holds information about a
819          single quota registration. The same quota may be used by many different users
820          and groups. This object encapsulates allowed
821          quota values for different types of quota types and locations.
822          BASE defines several quota types (file, raw data and experiment),
823          and locations (primary, secondary and offline).
824        </para>
825       
826        <para>
827          The <property>quotaValues</property> property is a map from
828          <classname docapi="net.sf.basedb.core.data">QuotaIndex</classname> to maximum byte values.
829          This map must contain at least one entry for the total
830          quota at the primary location.
831        </para>
832       
833      </sect3>
834     
835      <sect3 id="data_api.quota.diskusage">
836        <title>Disk usage</title>
837       
838        <para>
839          A <interfacename docapi="net.sf.basedb.core.data">DiskConsumableData</interfacename> (for example a file)
840          item is automatically linked to a <classname docapi="net.sf.basedb.core.data">DiskUsageData</classname>
841          item. This holds information about the number of bytes,
842          the location and quota type the item uses. It also holds information
843          about which user and group (optional) that should be charged for the disk usage.
844          The user is always the owner of the item.
845        </para>
846
847      </sect3>
848     
849    </sect2>
850
851    <sect2 id="data_api.clients">
852      <title>Client, session and settings</title>
853      <para>
854         This section gives an overview of hardware and software in BASE.
855      </para>
856     
857      <sect3 id="data_api.clients.uml">
858        <title>UML diagram</title>
859       
860        <figure id="data_api.figures.clients">
861          <title>Client, sessions and settings</title>
862          <screenshot>
863            <mediaobject>
864              <imageobject>
865                <imagedata 
866                  align="center"
867                  scalefit="1" width="100%"
868                  fileref="figures/uml/datalayer.clients.png" format="PNG" />
869              </imageobject>
870            </mediaobject>
871          </screenshot>
872        </figure>
873      </sect3>
874     
875      <sect3 id="data_api.clients.description">
876        <title>Clients</title>
877        <para>
878          The <classname docapi="net.sf.basedb.core.data">ClientData</classname> class holds information
879          about a client application. The <property>externalId</property>
880          property is a unique identifier for the application. To avoid ID clashes the ID
881          should be constructed in the same way as Java packages, for example
882          <constant>net.sf.basedb.clients.web</constant> is the ID for the
883          web client application.
884        </para>
885       
886        <para>
887          A client application doesn't have to be registered with BASE
888          to be able to use it. But we recommend it since:
889        </para>
890       
891        <itemizedlist>
892        <listitem>
893          <para>
894            The permission system allows an admin to specify exactly
895            which users that may use a specific application.
896          </para>
897        </listitem>
898       
899        <listitem>
900          <para>
901          The application can't store any context-sensitive or application-specific
902          settings unless it is registered.
903          </para>
904        </listitem>
905       
906        <listitem>
907          <para>
908          The application can store context-sensitive help in the BASE
909          database.
910          </para>
911        </listitem>
912        </itemizedlist>
913      </sect3>
914     
915      <sect3 id="data_api.clients.sessions">
916        <title>Sessions</title>
917       
918        <para>
919          A session represents the time between login and logout for a single
920          user. The <classname docapi="net.sf.basedb.core.data">SessionData</classname> object is entirely
921          managed by the BASE core, and should be considered read-only
922          for client applications.
923        </para>
924           
925      </sect3>
926     
927      <sect3 id="data_api.clients.settings">
928        <title>Settings</title>
929       
930        <para>
931          There are two types of settings: context-sensitive settings and regular
932          settings. The regular settings are simple key-value pairs of strings
933          and can be used for almost anything. There are four subtypes:
934        </para>
935       
936        <itemizedlist>
937        <listitem>
938          <para>
939          Global default settings: Settings that are used by all users
940          and client applications on the BASE server. These settings
941          are read-only except for administrators. BASE has not yet defined
942          any settings of this type.
943          </para>
944        </listitem>
945       
946        <listitem>
947          <para>
948          User default settings: Settings that are valid for a single user
949          for any client application. BASE has not yet defined
950          any settings of this type.
951          </para>
952        </listitem>
953       
954        <listitem>
955          <para>
956          Client default settings: Settings that are valid for all users using
957          a specific client application. Each client application is responsible
958          for defining it's own settings. Settings are read-only except
959          for administrators.
960          </para>
961        </listitem>
962       
963        <listitem>
964          <para>
965          User client settings: Settings that are valid for a single user using
966          a specific client application. Each client application is responsible
967          for defining it's own settings.
968          </para>
969        </listitem>
970       
971        </itemizedlist>
972       
973        <para>
974          The context-sensitive settings are designed to hold information
975          about the current status of options related to the listing of items
976          of a specific type. This includes:
977        </para>
978       
979        <itemizedlist>
980        <listitem>
981          <para>
982          Current filtering options (as 1 or more <classname docapi="net.sf.basedb.core.data">PropertyFilterData</classname>
983          objects).
984          </para>
985        </listitem>
986       
987        <listitem>
988          <para>
989          Which columns and direction to use for sorting.
990          </para>
991        </listitem>
992       
993        <listitem>
994          <para>
995          The number of items to display on each page, and which page that
996          is the current page.
997          </para>
998        </listitem>
999       
1000        <listitem>
1001          <para>
1002          Simple key-value settings related to a given context.
1003          </para>
1004        </listitem>
1005        </itemizedlist>
1006       
1007        <para>
1008          Context-sensitive settings are only accessible if a client
1009          application has been registered. The settings may be
1010          named to make it possible to store several presets and to
1011          quickly switch between them. In any case, BASE maintains a
1012          current default setting with an empty name. An administrator
1013          may mark a named setting as public to allow other users to
1014          use it.
1015        </para>
1016       
1017      </sect3>
1018     
1019     
1020    </sect2>
1021
1022    <sect2 id="data_api.files">
1023      <title>Files and directories</title>
1024
1025      <para>
1026        This section covers the details of the BASE file
1027        system.
1028      </para>
1029
1030      <sect3 id="data_api.files.uml">
1031      <title>UML diagram</title>
1032     
1033        <figure id="data_api.figures.files">
1034          <title>Files and directories</title>
1035          <screenshot>
1036            <mediaobject>
1037              <imageobject>
1038                <imagedata 
1039                  align="center"
1040                  fileref="figures/uml/datalayer.files.png" format="PNG" />
1041              </imageobject>
1042            </mediaobject>
1043          </screenshot>
1044        </figure>
1045      </sect3>
1046     
1047      <sect3 id="data_api.files.description">
1048        <title>Description</title>
1049       
1050        <para>
1051          The <classname docapi="net.sf.basedb.core.data">DirectoryData</classname> class holds
1052          information about directories. Directories are organised in the
1053          ususal way as as tree structure. All directories must have
1054          a parent directory, except the system-defined root directory.
1055        </para>
1056       
1057        <para>
1058          The <classname docapi="net.sf.basedb.core.data">FileData</classname> class holds information about
1059          a file. The actual file contents is stored on disk in the directory
1060          specified by the <varname>userfiles</varname> setting in
1061          <filename>base.config</filename>. The <varname>internalName</varname>
1062          property is the name of the file on disk, but this is never exposed to
1063          client applications. The filenames and directories
1064          on the disk doesn't correspond to the the filenames and directories in
1065          BASE.
1066        </para>
1067       
1068        <para>
1069          The <varname>location</varname> property can take three values:
1070        </para>
1071       
1072        <itemizedlist>
1073        <listitem>
1074          <para>
1075          0 = The file is offline, ie. there is no file on the disk
1076          </para>
1077        </listitem>
1078        <listitem>
1079          <para>
1080          1 = The file is in primary storage, ie. it is located on the disk
1081          and can be used by BASE
1082          </para>
1083        </listitem>
1084        <listitem>
1085          <para>
1086          2 = The file is in secondary storage, ie. it has been moved to some
1087          other place and can't be used by BASE immediately.
1088          </para>
1089        </listitem>
1090        </itemizedlist>
1091       
1092        <para>
1093          The <varname>action</varname> property controls how a file is
1094          moved between primary and seconday storage. It can have the following
1095          values:
1096        </para>
1097       
1098        <itemizedlist>
1099        <listitem>
1100          <para>
1101          0 = Do nothing
1102          </para>
1103        </listitem>
1104        <listitem>
1105          <para>
1106          1 = If the file is in secondary storage, move it back to the primary storage
1107          </para>
1108        </listitem>
1109        <listitem>
1110          <para>
1111          2 = If the file is in primary storage, move it to the secondary storage
1112          </para>
1113        </listitem>
1114        </itemizedlist>
1115       
1116        <para>
1117          The actual moving between primary and secondary storage is done by an
1118          external program. See
1119          <xref linkend="appendix.base.config.secondary" /> and
1120          <xref linkend="plugin_developer.other.secondary" /> for more information.
1121        </para>
1122     
1123        <para>
1124          The <varname>md5</varname> property can be used to check for file
1125          corruption when it is moved between primary and secondary storage or
1126          when a user re-uploads a file that has been offline.
1127        </para>
1128       
1129        <para>
1130          BASE can store files in a compressed format. This is handled internally
1131          and is not visible to client applications. The <varname>compressed</varname>
1132          and <varname>diskSize</varname> properties are used to store information
1133          about this. A file may always be compressed if the users says so, but
1134          BASE can also do this automatically if the file is uploaded
1135          to a directory with the <varname>autoCompress</varname> flag set
1136          or if the file has MIME type with the <varname>autoCompress</varname>
1137          flag set.
1138        </para>
1139       
1140        <para>
1141          The <classname docapi="net.sf.basedb.core.data">FileTypeData</classname> class holds information about
1142          file types. It is used only to make it easier for users to organise
1143          their files.
1144        </para>
1145       
1146        <para>
1147          The <classname docapi="net.sf.basedb.core.data">MimeTypeData</classname> is used to register mime types and
1148          map them to file extensions. The information is only used to lookup values
1149          when needed. Given the filename we can set the <varname>File.mimeType</varname>
1150          and <varname>File.fileType</varname> properties. The MIME type is also
1151          used to decide if a file should be stored in a compressed format or not.
1152          The extension of a MIME type must be unique. Extensions should be registered
1153          without a dot, ie <emphasis>html</emphasis>, not <emphasis>.html</emphasis>
1154        </para>
1155       
1156      </sect3>
1157     
1158     
1159    </sect2>
1160   
1161    <sect2 id="data_api.platforms">
1162      <title>Experimental platforms</title>
1163
1164      <para>
1165         This section gives an overview of experimental platforms
1166         and how they are used to enable data storage in files
1167         instead of in the database.
1168      </para>
1169     
1170      <itemizedlist>
1171        <title>See also</title>
1172        <listitem><xref linkend="core_api.data_in_files" /></listitem>
1173        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
1174        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
1175      </itemizedlist>
1176         
1177      <sect3 id="data_api.platforms.uml">
1178        <title>UML diagram</title>
1179       
1180        <figure id="data_api.figures.platforms">
1181          <title>Experimental platforms</title>
1182          <screenshot>
1183            <mediaobject>
1184              <imageobject>
1185                <imagedata 
1186                  align="center"
1187                  fileref="figures/uml/datalayer.platforms.png" format="PNG" />
1188              </imageobject>
1189            </mediaobject>
1190          </screenshot>
1191        </figure>
1192      </sect3>
1193     
1194      <sect3 id="data_api.platforms.platforms">
1195        <title>Platforms</title>
1196       
1197        <para>
1198          The <classname docapi="net.sf.basedb.core.data">PlatformData</classname> holds information about a
1199          platform. A platform can have one or more <classname docapi="net.sf.basedb.core.data">PlatformVariant</classname>:s.
1200          Both the platform and variant are identified by an external ID that
1201          is fixed and can't be changed. <emphasis>Affymetrix</emphasis>
1202          is an example of a platform.
1203          If the <varname>fileOnly</varname> flag is set data for the platform
1204          can only be stored in files and not imported into the database. If
1205          the flag is not set data can be imported into the database.
1206          In the latter case, the <varname>rawDataType</varname> property
1207          can be used to lock the platform
1208          to a specific raw data type. If the value is <constant>null</constant>
1209          the platform can use any raw data type.
1210        </para>
1211       
1212        <para>
1213          Each platform and it's variant can be connected to one or more
1214          <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname> items. This item
1215          describes the kind of files that are used to hold data for
1216          the platform and/or variant. The file types are re-usable between
1217          different platforms and variants. Note that a file type may be attached
1218          to either only a platform or to a platform with a variant. File
1219          types attached to platforms are inherited by the variants. The variants
1220          can only define additional file types, not remove or redefine file types
1221          that has been attached to the platform.
1222        </para>
1223        <para>
1224          The file type is also identified
1225          by a fixed, non-changable external ID. The <varname>itemType</varname>
1226          property tells us what type of item the file holds data for (ie.
1227          array design or raw bioassay). It also links to a <classname docapi="net.sf.basedb.core.data">FileType</classname>
1228          which is the generic type of data in the file. This allows us to query
1229          the database for, as an example, files with the generic type
1230          <constant>FileType.RAW_DATA</constant>. If we are in an Affymetrix
1231          experiment we will get the CEL file, for another platform we will
1232          get another file.
1233        </para>
1234        <para>
1235          The <varname>required</varname> flag in <classname docapi="net.sf.basedb.core.data">PlatformFileTypeData</classname>
1236          is used to signal that the file is a required file. This is not
1237          enforeced by the core. It is intended to be used by client applications
1238          for creating a better GUI and for validation of an experiment.
1239        </para>
1240
1241      </sect3>
1242     
1243      <sect3 id="data_api.platforms.files">
1244        <title>FileStoreEnabled items and data files</title>
1245       
1246        <para>
1247          An item must implement the <interfacename docapi="net.sf.basedb.core">FileStoreEnabledData</interfacename>
1248          interface to be able to store data in files instead of in the database.
1249          The interface creates a link to a <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object,
1250          which can hold several <classname docapi="net.sf.basedb.core.data">FileSetMemberData</classname> items.
1251          Each member points to specific <classname docapi="net.sf.basedb.core.data">FileData</classname> item.
1252          A file set can only store one file of each <classname docapi="net.sf.basedb.core.data">DataFileTypeData</classname>.
1253        </para>
1254       
1255      </sect3>
1256    </sect2>
1257
1258    <sect2 id="data_api.parameters">
1259      <title>Parameters</title>
1260     
1261      <para>
1262        This section gives an overview the generic parameter
1263        system in BASE that is used to store annotation values,
1264        plugin configuration values, job parameter values, etc.
1265      </para>
1266     
1267      <sect3 id="data_api.parameters.uml">
1268        <title>UML diagram</title>
1269       
1270        <figure id="data_api.figures.parameters">
1271          <title>Parameters</title>
1272          <screenshot>
1273            <mediaobject>
1274              <imageobject>
1275                <imagedata 
1276                  align="center"
1277                  fileref="figures/uml/datalayer.parameters.png" format="PNG" />
1278              </imageobject>
1279            </mediaobject>
1280          </screenshot>
1281        </figure>
1282      </sect3>
1283     
1284      <sect3 id="data_api.parameters.description">
1285        <title>Parameters</title>
1286       
1287        <para>
1288          The parameter system is a generic system that can store almost
1289          any kind of simple values (string, numbers, dates, etc.) and
1290          also links to other items. The <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> 
1291          class is an abstract base class that can hold multiple values (all must be of the
1292          same type). Unless only a specific type of values should be stored, this is
1293          the class that should be used when creating references for storing parameter
1294          values. It makes it possible for a single relaltion to use any kind of
1295          values or for a collection reference to mix multiple types of values.
1296          A typical use case maps a <classname>Map</classname> with the
1297          parameter name as the key:
1298        </para>
1299       
1300        <programlisting language="java">
1301private Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues;
1302/**
1303   Link parameter name with it's values.
1304   @hibernate.map table="`PluginConfigurationValues`" lazy="true" cascade="all"
1305   @hibernate.collection-key column="`pluginconfiguration_id`"
1306   @hibernate.collection-index column="`name`" type="string" length="255"
1307   @hibernate.collection-many-to-many column="`value_id`"
1308      class="net.sf.basedb.core.data.ParameterValueData"
1309*/
1310public Map&lt;String, ParameterValueData&lt;?&gt;&gt; getConfigurationValues()
1311{
1312   return configurationValues;
1313}
1314void setConfigurationValues(Map&lt;String, ParameterValueData&lt;?&gt;&gt; configurationValues)
1315{
1316   this.configurationValues = configurationValues;
1317}
1318</programlisting>
1319       
1320      <para>
1321      Now it is possible for the collection to store all types of values:
1322      </para>
1323     
1324      <programlisting language="java">
1325Map&lt;String, ParameterValueData&lt;?&gt;&gt; config = ...
1326config.put("names", new StringParameterValueData("A", "B", "C"));
1327config.put("sizes", new IntegerParameterValueData(10, 20, 30));
1328
1329// When you later load those values again you have to cast
1330// them to the correct class.
1331List&lt;String&gt; names = (List&lt;String&gt;)config.get("names").getValues();
1332List&lt;Integer&gt; sizes = (List&lt;Integer&gt;)config.get("sizes").getValues();
1333</programlisting>
1334
1335      </sect3>
1336     
1337    </sect2>
1338
1339    <sect2 id="data_api.annotations">
1340      <title>Annotations</title>
1341     
1342      <para>
1343        This section gives an overview of how the BASE annotation
1344        system works.
1345      </para>
1346     
1347      <sect3 id="data_api.annotations.uml">
1348        <title>UML diagram</title>
1349       
1350        <figure id="data_api.figures.annotations">
1351          <title>Annotations</title>
1352          <screenshot>
1353            <mediaobject>
1354              <imageobject>
1355                <imagedata 
1356                  align="center"
1357                  fileref="figures/uml/datalayer.annotations.png" format="PNG" />
1358              </imageobject>
1359            </mediaobject>
1360          </screenshot>
1361        </figure>
1362      </sect3>
1363     
1364      <sect3 id="data_api.annotations.description">
1365        <title>Annotations</title>
1366       
1367        <para>
1368        An item must implement the <interfacename docapi="net.sf.basedb.core.data">AnnotatableData</interfacename>
1369        interface to be able to use the annotation system. This interface gives
1370        a link to a <classname docapi="net.sf.basedb.core.data">AnnotationSetData</classname> item. This class
1371        encapsulates all annotations for the item. There are two types of
1372        annotations:
1373        </para>
1374       
1375        <itemizedlist>
1376        <listitem>
1377          <para>
1378          <emphasis>Primary annotations</emphasis> are annotations that
1379          explicitely belong to the item. An annotation set can contain
1380          only one primary annotation of each annotation type. The primary
1381          annotation are linked with the <property>annotations</property>
1382          property. This property is a map with an
1383          <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname>  as the key.
1384          </para>
1385        </listitem>
1386       
1387        <listitem>
1388          <para>
1389          <emphasis>Inherited annotations</emphasis> are annotations
1390          that belong to a parent item, but that we want to use on
1391          another item as well. Inherited annotations are saved as
1392          references to either a single annotation or to another
1393          annotation set. Thus, it is possible for an item to inherit
1394          multiple annotations of the same annotation type.
1395          </para>
1396        </listitem>
1397        </itemizedlist>
1398       
1399        <para>
1400          The <classname docapi="net.sf.basedb.core.data">AnnotationData</classname> class is also
1401          just a placeholder. It connects the annotation set and
1402          annotation type with a <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname>
1403          object. This is the object that holds the actual annotation
1404          values.
1405        </para>
1406       
1407      </sect3>
1408     
1409      <sect3 id="data_api.annotations.types">
1410        <title>Annotation types</title>
1411       
1412        <para>
1413        Instances of the <classname docapi="net.sf.basedb.core.data">AnnotationTypeData</classname> class
1414        defines the various annotations. It must have a <property>valueType</property> 
1415        property which cannot be changed. The value of this property controls
1416        which <classname docapi="net.sf.basedb.core.data">ParameterValueData</classname> subclass is used to store
1417        the annotation values, ie. <classname docapi="net.sf.basedb.core.data">IntegerParameterValueData</classname>,
1418        <classname docapi="net.sf.basedb.core.data">StringParameterValueData</classname>, etc.
1419        The <property>multiplicity</property> property holds the maximum allowed
1420        number of values for an annotation, or 0 if an unlimited number is
1421        allowed.
1422        </para>
1423       
1424        <para>
1425        The <property>itemTypes</property> collection holds the codes for
1426        the types of items the annotation type can be used on. This is
1427        checked when new annotations are created but already existing
1428        annotations are not affected if the collection is modified.
1429        </para>
1430       
1431        <para>
1432        Annotation types with the <property>protocolParameter</property> flag set
1433        are treated a bit differently. They will not show up as annotations
1434        to items with a type found in the <property>itemTypes</property> collection.
1435        A protocol parameter should be attached to a protocol. Then, when an item
1436        is using that protocol it becomes possible to add annotation values for
1437        the annotation types specified as protocol parameters. It doesn't matter
1438        if the item's type is found in the <property>itemTypes</property> 
1439        collection or not.
1440        </para>
1441       
1442        <para>
1443        The <property>options</property> collection is used to store additional
1444        options required by some of the value types, for example a max string
1445        length for string annotations or the max and min allowed value for
1446        integer annotations.
1447        </para>
1448       
1449        <para>
1450        The <property>enumeration</property> property is a boolean flag
1451        indicating if the allowed values are predefined as an enumeration.
1452        In that case those values are found in the <property>enumerationValues</property>
1453        property. The actual subclass is determined by the <property>valueType</property>
1454        property.
1455        </para>
1456       
1457        <para>
1458        Most of the other properties are hints to client applications how
1459        to render the input field for the annotation.
1460        </para>
1461       
1462      </sect3>
1463     
1464      <sect3 id="data_api.annotations.units">
1465        <title>Units</title>
1466        <para>
1467        Numerical annotation values can have units. A unit is described by
1468        a <classname docapi="net.sf.basedb.core.data">UnitData</classname> object.
1469        Each unit belongs to a <classname docapi="net.sf.basedb.core.data">QuantityData</classname> 
1470        object which defines the class of units. For example, if the quantity is
1471        <emphasis>weight</emphasis>, we can have units, <emphasis>kg</emphasis>,
1472        <emphasis>mg</emphasis>, <emphasis>µg</emphasis>, etc. The <classname>UnitData</classname>
1473        contains a factor and offset that relates all units to a common reference
1474        defined by the <classname>QuantityData</classname> class. For example,
1475        <emphasis>1 meter</emphasis> is the reference unit for distance, and we
1476        have <code>1 meter * 0.001 = 1 millimeter</code>. In this case, the factor is
1477        <emphasis>0.001</emphasis> and the offset 0. Another example is the relationship between
1478        kelvin and Celsius, which is <code>1 kelvin + 273.15 = 1 °Celsius</code>.
1479        Here, the factor is 1 and the offset is <emphasis>+273.15</emphasis>.
1480        The <classname
1481        docapi="net.sf.basedb.core.data">UnitSymbolData</classname>
1482        is used to make it possible to assign alternative symbols to a single unit.
1483        This is needed to simplify input where it may be hard to know what to
1484        type to get <emphasis></emphasis> or <emphasis>°C</emphasis>. Instead,
1485        <emphasis>m2</emphasis> and <emphasis>C</emphasis> can be used as
1486        alternative symbols.
1487        </para>
1488       
1489        <para>
1490        The creator of an annotation type may select a
1491        <classname>QuantityData</classname>, which can't be changed later, and
1492        a default <classname>UnitData</classname>. When entering annotation values
1493        a user may select any unit for the selected quantity (unless annotation type
1494        owner has limited this by selecting <varname>usableUnits</varname>). Before
1495        the values are stored in the database, they are converted to the default
1496        unit. This makes it possible to compare and filter on annotation values
1497        using different units. For example, filtering with <emphasis>&gt;5mg</emphasis> 
1498        also finds items that are annotated with <emphasis>2g</emphasis>.
1499        </para>
1500       
1501        <para>
1502        The core should automatically update the stored annotation values if
1503        the default unit is changed for an annotation type, or if the reference
1504        factor for a unit is changed.
1505        </para>
1506      </sect3>
1507     
1508      <sect3 id="data_api.annotations.categories">
1509        <title>Categories</title>
1510       
1511        <para>
1512        The <classname docapi="net.sf.basedb.core.data">AnnotationTypeCategoryData</classname> class defines
1513        categories that are used to group annotation types that are related to
1514        each other. This information is mainly useful for client applications
1515        when displaying forms for annotating items, that wish to provide a
1516        clearer interface when there are many (say 50+) annotations type for
1517        an item. An annotation type can belong to more than one category.
1518        </para>
1519       
1520      </sect3>
1521     
1522    </sect2>
1523
1524    <sect2 id="data_api.protocols">
1525      <title>Protocols</title>
1526
1527      <para>
1528        This section gives an overview of how protocols that describe various
1529        processes, such as sampling, extraction and scanning, are used in BASE.
1530      </para>
1531     
1532      <sect3 id="data_api.protocols.uml">
1533        <title>UML diagram</title>
1534       
1535        <figure id="data_api.figures.protocols">
1536          <title>Protocols</title>
1537          <screenshot>
1538            <mediaobject>
1539              <imageobject>
1540                <imagedata 
1541                  align="center"
1542                  fileref="figures/uml/datalayer.protocols.png" format="PNG" />
1543              </imageobject>
1544            </mediaobject>
1545          </screenshot>
1546        </figure>
1547      </sect3>
1548     
1549      <sect3 id="data_api.protocols.description">
1550        <title>Protocols</title>
1551       
1552        <para>
1553        A protocol is something that defines a procedure or recipe for some
1554        kind of action, such as sampling, extraction and scanning. In BASE we only
1555        store a short name and description. It is possible to attach a file
1556        that provides a longer description of the procedure.
1557        </para>
1558     
1559      </sect3>
1560     
1561      <sect3 id="data_api.protocols.parameters">
1562        <title>Parameters</title>
1563       
1564        <para>
1565        The procedure described by the protocol may have parameters
1566        that are set indepentently each time the protocol is used. It
1567        could for example be a temperature, a time or something else.
1568        The definition of parameters is done by creating annotation
1569        types and attaching them to the protocol. It is only possible
1570        to attach annotation types which has the <property>protocolParameter</property>
1571        property set to <constant>true</constant>. The same annotation type
1572        can be used for more than one protocol, but only do this if the
1573        parameters actually has the same meaning.
1574        </para>
1575     
1576      </sect3>
1577     
1578    </sect2>
1579
1580    <sect2 id="data_api.plugins">
1581      <title>Plug-ins, jobs and job agents</title>
1582     
1583      <para>
1584         This section gives an overview of plug-ins, jobs and job agents.
1585      </para>
1586     
1587      <itemizedlist>
1588        <title>See also</title>
1589        <listitem><xref linkend="plugins.installation" /></listitem>
1590        <listitem><xref linkend="installation_upgrade.jobagents" /></listitem>
1591      </itemizedlist>
1592     
1593      <sect3 id="data_api.plugins.uml">
1594        <title>UML diagram</title>
1595       
1596        <figure id="data_api.figures.plugins">
1597          <title>Plug-ins, jobs and job agents</title>
1598          <screenshot>
1599            <mediaobject>
1600              <imageobject>
1601                <imagedata 
1602                  align="center"
1603                  scalefit="1" width="100%"
1604                  fileref="figures/uml/datalayer.plugins.png" format="PNG" />
1605              </imageobject>
1606            </mediaobject>
1607          </screenshot>
1608        </figure>
1609      </sect3>
1610
1611      <sect3 id="data_api.plugins.plugins">
1612        <title>Plug-ins</title>
1613       
1614        <para>
1615          The <classname docapi="net.sf.basedb.core.data">PluginDefinitionData</classname> holds information of the
1616          installed plugin classes. Much of the information is copied from the
1617          plug-in itself from the <classname docapi="net.sf.basedb.core.plugin">About</classname> object and by checking
1618          which interfaces it implements.
1619        </para>
1620       
1621        <para>
1622          There are five main types of plug-ins:
1623        </para>
1624       
1625        <itemizedlist>
1626        <listitem>
1627          <para>
1628          IMPORT (mainType = 1): A plug-in that imports data to BASE.
1629          </para>
1630        </listitem>
1631        <listitem>
1632          <para>
1633          EXPORT (mainType = 2): A plug-in that exports data from BASE.
1634          </para>
1635        </listitem>
1636        <listitem>
1637          <para>
1638          INTENSITY (mainType = 3): A plug-in that calculates intensity values
1639          from raw data.
1640          </para>
1641        </listitem>
1642        <listitem>
1643          <para>
1644          ANALYZE (mainType = 4): A plug-in that analyses data.
1645          </para>
1646        </listitem>
1647        <listitem>
1648          <para>
1649          OTHER (mainType = 5): Any other plug-in.
1650          </para>
1651        </listitem>
1652        </itemizedlist>
1653       
1654        <para>
1655          A plug-in may have different configurations. The flags <property>supportsConfigurations</property>
1656          and <property>requiresConfiguration</property> are used to specify if a plug-in
1657          must have or can't have any configurations. Configuration parameter values are
1658          versioned. Each time anyone updates a configuration the version number
1659          is increased and the parameter values are stored as a new entity.
1660          This is required because we want to be able to know exactly which
1661          parameters a job were using when it was executed. When a job is
1662          created we also store the parameter version number
1663          (<property>JobData.parameterVersion</property>). This means that even if
1664          someone changes the configuration later we will always know which
1665          parameters the job used.
1666        </para>
1667       
1668        <para>
1669          The <classname docapi="net.sf.basedb.core.data">PluginTypeData</classname> class is ued to group
1670          plug-ins that share some common functionality, by implementing
1671          additional (optional) interfaces. For example, the
1672          <interfacename docapi="net.sf.basedb.core.plugin">AutoDetectingImporter</interfacename> should be implemented
1673          by import plug-ins that supports automatic detection of file formats.
1674          Another example is the <interfacename docapi="net.sf.basedb.core.plugin">AnalysisFilterPlugin</interfacename>
1675          interface which should be implemented by all analysis plug-ins that
1676          only filters data.
1677        </para>
1678
1679      </sect3>
1680     
1681      <sect3 id="data_api.plugins.jobs">
1682        <title>Jobs</title>
1683       
1684        <para>
1685          A job represents a single invokation of a plug-in to do some work.
1686          The <classname docapi="net.sf.basedb.core.data">JobData</classname> class holds information about this.
1687          A job is usuallu executed by a plug-in, but doesn't have to be. The
1688          <property>status</property> property holds the current state of a job.
1689        </para>
1690       
1691        <itemizedlist>
1692        <listitem>
1693          <para>
1694            UNCONFIGURED (status = 0): The job is not yet ready to be executed.
1695          </para>
1696        </listitem>
1697        <listitem>
1698          <para>
1699            WAITING (status = 1): The job is waiting to be executed.
1700          </para>
1701        </listitem>
1702        <listitem>
1703          <para>
1704            PREPARING (status = 5): The job is about to be executed but hasn't started yet.
1705          </para>
1706        </listitem>
1707        <listitem>
1708          <para>
1709            EXECUTING (status = 2): The job is currently executing.
1710          </para>
1711        </listitem>
1712        <listitem>
1713          <para>
1714            DONE (status = 3): The job finished successfully.
1715          </para>
1716        </listitem>
1717        <listitem>
1718          <para>
1719            ERROR (status = 4): The job finished with an error.
1720          </para>
1721        </listitem>
1722        </itemizedlist>
1723      </sect3>
1724
1725      <sect3 id="data_api.plugins.agents">
1726        <title>Job agents</title>
1727       
1728        <para>
1729          A job agent is a program running on the same or a different server that
1730          is regularly checking for jobs that are waiting to be executed. The
1731          <classname docapi="net.sf.basedb.core.data">JobAgentData</classname> holds information about a job agent
1732          and the <classname docapi="net.sf.basedb.core.data">JobAgentSettingsData</classname> links the agent
1733          with the plug-ins the agent is able to execute. The job agent will only
1734          execute jobs that are owner by users or projects that the job agent has
1735          been shared to with at least use permission. The <property>priorityBoost</property>
1736          property can be used to give specific plug-ins higher priority.
1737          Thus, for a job agent it is possible to:
1738        </para>
1739       
1740        <itemizedlist>
1741        <listitem>
1742          <para>
1743          Specify exactly which plug-ins it will execute. For example, it is possible
1744          to dedicate one agent to only run one plug-in.
1745          </para>
1746        </listitem>
1747        <listitem>
1748          <para>
1749          Give some plug-ins higher priority. For example a job agent that is mainly
1750          used for importing data should give higher priority to all import plug-ins.
1751          Other types of jobs will have to wait until there are no more data to be
1752          imported.
1753          </para>
1754        </listitem>
1755        <listitem>
1756          <para>
1757          Specify exactly which users/groups/projects that may use the agent. For
1758          example, it is possible to dedicate one agent to only run jobs for a certain
1759          project.
1760          </para>
1761        </listitem>
1762        </itemizedlist>
1763       
1764      </sect3>
1765
1766
1767    </sect2>
1768   
1769    <sect2 id="data_api.biomaterials">
1770      <title>Biomaterials</title>
1771     
1772      <sect3 id="data_api.biomaterials.uml">
1773        <title>UML diagram</title>
1774       
1775        <figure id="data_api.figures.biomaterials">
1776          <title>Biomaterials</title>
1777          <screenshot>
1778            <mediaobject>
1779              <imageobject>
1780                <imagedata 
1781                  align="center"
1782                  fileref="figures/uml/datalayer.biomaterials.png" format="PNG" />
1783              </imageobject>
1784            </mediaobject>
1785          </screenshot>
1786        </figure>
1787      </sect3>
1788     
1789      <sect3 id="data_api.biomaterials.description">
1790        <title>Biomaterials</title>
1791       
1792        <para>
1793          There are four types of biomaterials: <classname docapi="net.sf.basedb.core.data">BioSourceData</classname>,
1794          <classname docapi="net.sf.basedb.core.data">SampleData</classname>, <classname docapi="net.sf.basedb.core.data">ExtractData</classname> and
1795          <classname docapi="net.sf.basedb.core.data">LabeledExtractData</classname>.
1796          All four types of are derived from the base class <classname docapi="net.sf.basedb.core.data">BioMaterialData</classname>.
1797          The reason for this is that they all share common functionality such as pooling
1798          and events. By using a common base class we do not have to create duplicate
1799          classes for keeping track of events and parents.
1800        </para>
1801       
1802        <para>
1803          The <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> is the simplest of the biomaterials.
1804          It cannot have parents and can't participate in events. It's only used as a
1805          (non-required) parent for samples.
1806        </para>
1807       
1808        <para>
1809          The <classname docapi="net.sf.basedb.core.data">MeasuredBioMaterialData</classname> class is used as a base
1810          class for the other three biomaterial types. It introduces quantity
1811          measurements and can store original and remaining quantities. They are
1812          both optional. If an original quantity has been specified the core
1813          automatically calculates the remaining quantity based on the events a
1814          biomaterial participates in.
1815        </para>
1816       
1817        <para>
1818          All measured biomaterial have at least one event associated with them,
1819          the creation event, which holds information about the creation of the
1820          biomaterial. A measured biomaterial can be created in three ways:
1821        </para>
1822       
1823        <itemizedlist>
1824        <listitem>
1825          <para>
1826          From a single item of the parent type. Biosource is the parent type of
1827          samples, sample is the parent type of extracts, and extract is the
1828          parent type of labeled extracts. In this case the
1829          <property>pooled</property> property is <constant>false</constant>
1830          and the parent is specified in the <property>parent</property> property.
1831          If the parent is not a <classname docapi="net.sf.basedb.core.data">BioSourceData</classname> this information
1832          is duplicated, with the addition of an optional used quantity value, in the
1833          <property>sources</property> collection of the <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1834          object representing the creation event. It is the responsibility of the
1835          core to make sure that everything is properly synchronized and that
1836          remaining quantities are calculated.
1837          </para>
1838        </listitem>
1839       
1840        <listitem>
1841          <para>
1842          From one or more items of the same type, i.e pooling.
1843          In this case the <property>pooled</property> property is <constant>true</constant> 
1844          and the <property>parent</property> property is null. All source
1845          biomaterials are contained in the <property>sources</property> collection.
1846          The core is still responsible for keeping everything synchronized and to
1847          update remaining quantities.
1848          </para>
1849        </listitem>
1850       
1851        <listitem>
1852          <para>
1853          As a standalone biomaterial without parents.
1854          </para>
1855        </listitem>
1856        </itemizedlist>
1857
1858      </sect3>
1859     
1860      <sect3 id="data_api.biomaterials.events">
1861        <title>Biomaterial events</title>
1862       
1863        <para>
1864          An event represents something that happened to one or more biomaterials, for example
1865          the creation of another biomaterial. The <classname docapi="net.sf.basedb.core.data">BioMaterialEventData</classname>
1866          holds information about entry and event dates, protocols used, the user who is
1867          responsible, etc. There are three types of events represented by the <property>eventType</property>
1868          property.
1869        </para>
1870       
1871        <orderedlist>
1872        <listitem>
1873          <para>
1874          <emphasis>Creation event</emphasis>: This event represents the creation of a (measured)
1875          biomaterial. The <property>sources</property> collection contains
1876          information about the biomaterials that were used to create the new
1877          biomaterial. If the biomaterial is a pooled biomaterial all sources must
1878          be of the same type. Otherwise there can only be one source of the parent
1879          type. These rules are maintained by the core.
1880          </para>
1881        </listitem>
1882       
1883        <listitem>
1884          <para>
1885          <emphasis>Hybridization event</emphasis>: This event represents the creation
1886          of a hybridization. This event type is needed because we want to keep track
1887          of quantities for labeled extracts. This event has a hybridization as a
1888          product instead of a biomaterial. The sources collection can only contain
1889          labeled extracts.
1890          </para>
1891        </listitem>
1892
1893        <listitem>
1894          <para>
1895          <emphasis>Other event</emphasis>: This event represents some other important
1896          information about a single biomaterial that affected the remaining quantity.
1897          This event type doesn't have any sources.
1898          </para>
1899        </listitem>
1900        </orderedlist>
1901      </sect3>
1902 
1903    </sect2>
1904
1905    <sect2 id="data_api.plates">
1906      <title>Array LIMS - plates</title>
1907
1908      <sect3 id="data_api.plates.uml">
1909        <title>UML diagram</title>
1910       
1911        <figure id="data_api.figures.plates">
1912          <title>Array LIMS - plates</title>
1913          <screenshot>
1914            <mediaobject>
1915              <imageobject>
1916                <imagedata 
1917                  align="center"
1918                  scalefit="1" width="100%"
1919                  fileref="figures/uml/datalayer.plates.png" format="PNG" />
1920              </imageobject>
1921            </mediaobject>
1922          </screenshot>
1923        </figure>
1924      </sect3>
1925
1926      <sect3 id="data_api.plates.description">
1927        <title>Plates</title>
1928       
1929        <para>
1930          The <classname docapi="net.sf.basedb.core.data">PlateData</classname> is the main class holding information
1931          about a single plate. The associated <classname docapi="net.sf.basedb.core.data">PlateGeometryData</classname>
1932          defines how many rows and columns there are on a plate. Since this
1933          information is used to create wells, and for various other checks it is
1934          not possible to change the number of rows or columns once a geometry has
1935          been created.
1936        </para>
1937         
1938        <para>
1939          All plates must have a <classname docapi="net.sf.basedb.core.data">PlateTypeData</classname> which defines
1940          the geometry and a set of event types (see below).
1941        </para>
1942       
1943        <para>
1944          If the destroyed flag of a plate is set it is not allowed to use the
1945          plate for a plate mapping or to create array designs. However, it
1946          is possible to change the flag to not destroyed.
1947        </para>
1948
1949        <para>
1950          The barcode is intended to be used as an external identifier of the plate.
1951          But, the core doesn't care about the value or if it is unique or not.
1952        </para>
1953      </sect3>
1954     
1955      <sect3 id="data_api.plates.events">
1956        <title>Plate events</title>
1957
1958        <para>
1959          The plate type defines a set of <classname docapi="net.sf.basedb.core.data">PlateEventTypeData</classname>
1960          objects, each one represening a particular event a plate of this type
1961          usually goes trough. For a plate of a certain type, it is possible to
1962          attach exactly one event of each event type. The event type defines an
1963          optional protocol type, which can be used by client applications to
1964          filter a list of protocols for the event. The core doesn't check that
1965          the selected protocol for an event is of the same protocol type as
1966          defined by the event type.
1967        </para>
1968
1969        <para>
1970          The ordinal value can be used as a hint to client applications in
1971          which order the events actually are performed in the lab. The core doesn't
1972          care about this value or if several event types have the same value.
1973        </para>
1974      </sect3>
1975
1976      <sect3 id="data_api.plates.mappings">
1977        <title>Plate mappings</title>
1978       
1979        <para>
1980          A plate can be created either from scratch, with the help of the information
1981          in a <classname docapi="net.sf.basedb.core.data">PlateMappingData</classname>, from a set of parent plates.
1982          In the first case it is possible to specify a reporter for each well on the
1983          plate. In the second case the mapping code creates all the wells and links
1984          them to the parent wells on the parent plates. Once the plate has been saved
1985          to the database, the wells cannot be modified (because they are used
1986          downstream for various validation, etc.)
1987        </para>
1988       
1989        <para>
1990          The details in a plate mapping are simply coordinates that for each
1991          destination plate, row and column define a source plate, row and column.
1992          It is possible for a single source well to be mapped to multiple destination
1993          wells, but for each destination well only a single source well can be
1994          used.
1995        </para>
1996       
1997      </sect3>
1998
1999    </sect2>
2000
2001    <sect2 id="data_api.arrays">
2002      <title>Array LIMS - arrays</title>
2003     
2004      <sect3 id="data_api.arrays.uml">
2005        <title>UML diagram</title>
2006       
2007        <figure id="data_api.figures.arrays">
2008          <title>Array LIMS - arrays</title>
2009          <screenshot>
2010            <mediaobject>
2011              <imageobject>
2012                <imagedata 
2013                  align="center"
2014                  fileref="figures/uml/datalayer.arrays.png" format="PNG" />
2015              </imageobject>
2016            </mediaobject>
2017          </screenshot>
2018        </figure>
2019      </sect3>
2020     
2021      <sect3 id="data_api.arrays.designs">
2022        <title>Array designs</title>
2023       
2024        <para>
2025          Array designs are stored in <classname docapi="net.sf.basedb.core.data">ArrayDesignData</classname> objects
2026          and can be created either as standalone designs or
2027          from plates. In the first case the features on an array design
2028          are described by a reporter map. A reporter map is a file
2029          that maps a coordinate (block, meta-grid, row, column),
2030          position or an external ID on an array design to a
2031          reporter. Which method to use is given by the
2032          <property>ArrayDesign.featureIdentificationMethod</property> property.
2033          The coordinate system on an array design is divided into blocks.
2034          Each block can be identified either by a <property>blockNumber</property>
2035          or by meta coordinates. This information is stored in
2036          <classname docapi="net.sf.basedb.core.data">ArrayDesignBlockData</classname> items. Each block
2037          contains several <classname docapi="net.sf.basedb.core.data">FeatureData</classname> items, each
2038          one identified by a row and column coordinate. Platforms that doesn't
2039          divide the array design into blocks or doesn't use the coordinate system at all
2040          must still create a single super-block that holds all features.
2041        </para>
2042       
2043        <para>
2044          Array designs that are created from plates use a print map file
2045          instead of a reporter map. A print map is similar to a plate mapping
2046          but maps features (instead of wells) to wells. The file should
2047          specifify which plate and well a feature is created from. Reporter
2048          information will automatically be copied by BASE from the well.
2049        </para>
2050       
2051        <para>
2052          It is also possible to skip the importing of features into the
2053          database and just keep the data in the orginal files instead.
2054          This is typically done for Affymetrix CDF files.
2055        </para>
2056       
2057      </sect3>
2058     
2059      <sect3 id="data_api.arrays.slides">
2060        <title>Array slides</title>
2061       
2062        <para>
2063          The <classname docapi="net.sf.basedb.core.data">ArraySlideData</classname> represents a single
2064          array. Arrays are usually printed several hundreds in a batch,
2065          represented by a <classname docapi="net.sf.basedb.core.data">ArrayBatchData</classname> item.
2066          The <property>batchIndex</property> is the ordinal number of the
2067          array in the batch. The <property>barcode</property> can be used
2068          as a means for external programs to identify the array. BASE doesn't
2069          care if a value is given or if they are unique or not. If the
2070          <property>destroyed</property> flag is set it prevents a slide from
2071          beeing used by a hybridization.
2072        </para>
2073
2074      </sect3>
2075    </sect2>
2076
2077    <sect2 id="data_api.rawdata">
2078      <title>Hybridizations and raw data</title>
2079     
2080      <sect3 id="data_api.rawdata.uml">
2081        <title>UML diagram</title>
2082       
2083        <figure id="data_api.figures.rawdata">
2084          <title>Hybridizations and raw data</title>
2085          <screenshot>
2086            <mediaobject>
2087              <imageobject>
2088                <imagedata 
2089                  align="center"
2090                  scalefit="1" width="100%"
2091                  fileref="figures/uml/datalayer.rawdata.png" format="PNG" />
2092              </imageobject>
2093            </mediaobject>
2094          </screenshot>
2095        </figure>
2096      </sect3>
2097     
2098      <sect3 id="data_api.rawdata.hybridizations">
2099        <title>Hybridizations</title>
2100       
2101        <para>
2102        Hybridizations connects the slides from the Array LIMS part
2103        with labeled extracts from the biomaterials part. The <property>creationEvent</property>
2104        is used to register which labeled extracts that were used on the hybridization.
2105        The relation to slides is a one-to-one relation. A slide can only be used on
2106        a single hybridization and a hybridization can only use a single slide. The relation
2107        is optional from both sides.
2108        </para>
2109
2110        <para>
2111        The scanning of the hybridized slide is registered as separate scan events.
2112        One or more images can optionally be attached to each scan.
2113        The images are not used by BASE.
2114        </para>
2115       
2116      </sect3>
2117     
2118      <sect3 id="data_api.rawdata.description">
2119        <title>Raw data</title>
2120       
2121        <para>
2122        A <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname> object represents
2123        the raw data that is produced by analysing the image(s) from a
2124        single scan. You may register which software that was used, the
2125        protocol and any parameters (through the annotation system).
2126        </para>
2127
2128        <para>
2129        Files with the analysed data values can be attached to the
2130        associated <classname docapi="net.sf.basedb.core.data">FileSetData</classname> object. The platform
2131        and, optionally, the variant has information about the file types
2132        that can be used for that platform. If the platform file types support
2133        metadata extraction, headers, the number of spots, and other
2134        information may be automatically extracted from the raw data file(s).
2135        </para>
2136       
2137        <para>
2138        If the platform support it, raw data can also be imported into the database.
2139        This is handled by batchers and <classname docapi="net.sf.basedb.core.data">RawData</classname> objects.
2140        Which table to store the data in depends on the <property>rawDataType</property>
2141        property. The properties shown for the <classname>RawData</classname> class
2142        in the diagram are the mandatory properties. Each raw data type defines additional
2143        properties that are specific to that raw data type.
2144        </para>
2145       
2146      </sect3>
2147     
2148      <sect3 id="data_api.rawdata.spotimages">
2149        <title>Spot images</title>
2150       
2151        <para>
2152        Spot images can be created if you have the original image
2153        files. BASE can use 1-3 images as sources for the red, green
2154        and blue channel respectively. The creation of spotimages requires
2155        that x and y coordinates are given for each raw data spot. The scaling
2156        and offset values are used to convert the coordinates to pixel
2157        coordinates. With this information BASE is able to cut out a square
2158        from the source images that, theoretically, contains a specific spot and
2159        nothing else. The spot images are gamma-corrected independently and then
2160        put together into PNG images that are stored in a zip file.
2161        </para>
2162      </sect3>
2163     
2164    </sect2>
2165
2166    <sect2 id="data_api.experiments">
2167      <title>Experiments and analysis</title>
2168     
2169     
2170      <sect3 id="data_api.experiments.uml">
2171        <title>UML diagram</title>
2172       
2173        <figure id="data_api.figures.experiments">
2174          <title>Experiments</title>
2175          <screenshot>
2176            <mediaobject>
2177              <imageobject>
2178                <imagedata 
2179                  align="center"
2180                  scalefit="1" width="75%"
2181                  fileref="figures/uml/datalayer.experiments.png" format="PNG" />
2182              </imageobject>
2183            </mediaobject>
2184          </screenshot>
2185        </figure>
2186      </sect3>
2187     
2188      <sect3 id="data_api.experiments.description">
2189        <title>Experiments</title>
2190       
2191        <para>
2192          The <classname docapi="net.sf.basedb.core.data">ExperimentData</classname> 
2193          class is used to collect information about a single experiment. It
2194          links to any number of <classname docapi="net.sf.basedb.core.data">RawBioAssayData</classname>
2195          items, which must all be of the same <classname 
2196          docapi="net.sf.basedb.core">RawDataType</classname>.
2197        </para>
2198       
2199        <para>
2200          Annotation types that are needed in the analysis must connected to
2201          the experiment as experimental factors and the annotation values should
2202          be set on or inherited by each raw bioassay that is part of the
2203          experiment.
2204        </para>
2205       
2206        <para>
2207          The directory connected to the experiment is the default directory
2208          where plugins that generate files should store them.
2209        </para>
2210      </sect3>
2211           
2212      <sect3 id="data_api.experiments.bioassays">
2213        <title>Bioassay sets, bioassays and transformations</title>
2214       
2215        <para>
2216          Each line of analysis starts with the creation of a <emphasis>root</emphasis>
2217          <classname docapi="net.sf.basedb.core.data">BioAssaySetData</classname>,
2218          which holds the intensities calculated from the raw data.
2219          A bioassayset can hold one intensity for each channel. The number of
2220          channels is defined by the raw data type. For each raw bioassay used a
2221          <classname docapi="net.sf.basedb.core.data">BioAssayData</classname>
2222          is created.
2223        </para>
2224       
2225        <para>
2226          Information about the process that calculated the intensities are
2227          stored in a <classname docapi="net.sf.basedb.core.data">TransformationData</classname>
2228          object. The root transformation links with the raw bioassays that are used
2229          in this line of analysis and to a <classname 
2230          docapi="net.sf.basedb.core.data">JobData</classname> which has information
2231          about which plug-in and parameters that was used in the calculation.
2232        </para>
2233     
2234        <para>
2235          Once the root bioassayset has been created it is possible to
2236          again apply a transformation to it. This time the transformation
2237          links to a single source bioassayset instead of the raw bioassays.
2238          As before, it still links to a job with information about the plug-in and
2239          parameters that does the actual work. The transformation must make sure
2240          that new bioassays are created and linked to the bioassays in the
2241          source bioassayset. This above process may be repeated as many times
2242          as needed.
2243        </para>
2244       
2245        <para>
2246          Data to a bioassay set can only be added to it before it has been
2247          committed to the database. Once the transaction has been committed
2248          it is no longed possible to add more data or to modify existing
2249          data.
2250        </para>
2251     
2252      </sect3>
2253
2254      <sect3 id="data_api.experiments.virtualdb">
2255        <title>Virtual databases, datacubes, etc.</title>
2256       
2257        <para>
2258          The above processes requires a flexible storage solution for the data.
2259          Each experiment is related to a <classname docapi="net.sf.basedb.core.data">VirtualDb</classname>
2260          object. This object represents the set of tables that are needed to store
2261          data for the experiment. All tables are created in a special part of the
2262          BASE database that we call the <emphasis>dynamic database</emphasis>.
2263          In MySQL the dynamic database is a separate database, in Postgres it is
2264          a separate schema.
2265        </para>
2266       
2267        <para>
2268          A virual database is divided into data cubes. A data cube can be seen
2269          as a three-dimensional object where each point can hold data that in
2270          most cases can be interpreted as data for a single spot from an
2271          array. The coordinates to a point is given by <emphasis>layer</emphasis>,
2272          <emphasis>column</emphasis> and <emphasis>position</emphasis>. The
2273          layer and column coordinates are represented by the
2274          <classname docapi="net.sf.basedb.core.data">DataCubeLayerData</classname>
2275          and <classname docapi="net.sf.basedb.core.data">DataCubeColumnData</classname>
2276          objects. The position coordinate has no separate object associated with
2277          it.
2278        </para>
2279       
2280        <para>
2281          Data for a single bioassay set is always stored in a single layer. It
2282          is possible for more than one bioassay set to use the same layer. This
2283          usually happens for filtering transformations that doesn't modify the
2284          data.  The filtered bioassay set is then linked to a
2285          <classname docapi="net.sf.basedb.core.data">DataCubeFilterData</classname>
2286          object, which has information about which data points that
2287          passed the filter.
2288        </para>
2289       
2290        <para>
2291          All data for a bioassay is stored in a single column.
2292          Two bioassays in different bioassaysets (layers) can only have the same
2293          column if one is the parent of the other.
2294        </para>
2295       
2296        <para>
2297          The position coordinate is tied to a reporter.
2298        </para>
2299       
2300        <para>
2301          A child bioassay set may use the same data cube as it's parent
2302          bioassay set if all of the following conditions are true:
2303        </para>
2304       
2305        <itemizedlist>
2306        <listitem>
2307          <para>
2308          All positions are linked to the same reporter as the positions
2309          in the parent bioassay set.
2310          </para>
2311        </listitem>
2312       
2313        <listitem>
2314          <para>
2315          All data points are linked to the same (possible many) raw data
2316          spots as the corresponding data points in the parent bioassay set.
2317          </para>
2318        </listitem>
2319       
2320        <listitem>
2321          <para>
2322          The bioassays in the child bioassay set each have exactly one
2323          parent in the parent bioassay set. No parent bioassay may be the
2324          parent of more than one child bioassay.
2325          </para>
2326        </listitem>
2327        </itemizedlist>
2328       
2329        <para>
2330          If any of the above conditions are not true, a new data cube
2331          must be created for the child bioassay set.
2332        </para>
2333      </sect3>
2334     
2335      <sect3 id="data_api.dynamic.description">
2336        <title>The dynamic database</title>
2337
2338        <figure id="data_api.figures.dynamic">
2339          <title>The dynamic database</title>
2340          <screenshot>
2341            <mediaobject>
2342              <imageobject>
2343                <imagedata 
2344                  align="center"
2345                  fileref="figures/uml/datalayer.dynamic.png" format="PNG" />
2346              </imageobject>
2347            </mediaobject>
2348          </screenshot>
2349        </figure>
2350       
2351        <para>
2352          Each virtual database consists of several tables. The tables
2353          are dynamically created when needed. For each table shown in the diagram
2354          the # sign is replaced by the id of the virtual database object at run
2355          time.
2356        </para>
2357       
2358        <para>
2359          There are no classes in the data layer for these tables and they
2360          are not mapped with Hibernate. When we work with these tables we
2361          are always using batcher classes and queries that works with integer,
2362          floats and strings.
2363        </para>
2364       
2365        <bridgehead>The D#Spot table</bridgehead>
2366        <para>
2367          This is the main table which keeps the intensities for a single spot
2368          in the data cube. Extra values attached to the spot are kept in separate
2369          tables, one for each type of value (D#SpotInt, D#SpotFloat and D#SpotString).
2370        </para>
2371       
2372        <bridgehead>The D#Pos table</bridgehead>
2373        <para>
2374          This table stores the reporter id for each position in a cube.
2375          Extra values attached to the position are kept in separate tables,
2376          one for each type of value (D#PosInt, D#PosFloat and D#PosString).
2377        </para>
2378       
2379        <bridgehead>The D#Filter table</bridgehead>
2380        <para>
2381          This table stores the coordinates for the spots that remain after
2382          filtering. Note that each filter is related to a bioassayset which
2383          gives the cube and layer values. Each row in the filter table then
2384          adds the column and position allowing us to find the spots in the
2385          D#Spot table.
2386        </para>
2387       
2388        <bridgehead>The D#RawParents table</bridgehead>
2389        <para>
2390          This table holds mappings for a spot to the raw data it is calculated
2391          from. We don't need the layer coordinate since all layers in a cube
2392          must have the same mapping to raw data.
2393        </para>
2394       
2395      </sect3>     
2396
2397     
2398    </sect2>
2399   
2400    <sect2 id="data_api.misc">
2401      <title>Other classes</title>
2402     
2403      <sect3 id="data_api.misc.uml">
2404        <title>UML diagram</title>
2405       
2406        <figure id="data_api.figures.misc">
2407          <title>Other classes</title>
2408          <screenshot>
2409            <mediaobject>
2410              <imageobject>
2411                <imagedata 
2412                  align="center"
2413                  fileref="figures/uml/datalayer.misc.png" format="PNG" />
2414              </imageobject>
2415            </mediaobject>
2416          </screenshot>
2417        </figure>
2418      </sect3>
2419     
2420    </sect2>
2421
2422  </sect1>
2423 
2424  <sect1 id="api_overview.core_api" chunked="1">
2425    <title>The Core API</title>
2426   
2427    <para>
2428      This section gives an overview of various parts of the core API.
2429    </para>
2430   
2431    <sect2 id="core_api.data_in_files">
2432      <title>Using files to store data</title>
2433     
2434      <para>
2435        BASE 2.5 introduced the possibility to use files to store data instead
2436        of importing it into the database. Files can be attached
2437        to any item that implements the <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2438        interface. Currently this is <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2439        and <classname docapi="net.sf.basedb.core">ArrayDesign</classname>. The
2440        ability to store data in files is not a replacement for storing data in the
2441        database. It is possible (for some platforms/raw data types) to have data in
2442        files and in the database at the same time. We would have liked to enforce
2443        that (raw) data is always present in files, but this will not be backwards compatible
2444        with older installations, so there are three cases:
2445      </para>
2446     
2447      <itemizedlist>
2448      <listitem>
2449        <para>
2450        Data in files only
2451        </para>
2452      </listitem>
2453      <listitem>
2454        <para>
2455        Data in the database only
2456        </para>
2457      </listitem>
2458      <listitem>
2459        <para>
2460        Data in both files and in the database
2461        </para>
2462      </listitem>
2463      </itemizedlist>
2464     
2465      <para>
2466        Not all three cases are supported for all types of data. This is controlled
2467        by the <classname docapi="net.sf.basedb.core">Platform</classname> class, which may disallow
2468        that data is stored in the database. To check this call
2469        <methodname>Platform.isFileOnly()</methodname> and/or
2470        <methodname>Platform.getRawDataType()</methodname>. If the <methodname>isFileOnly()</methodname>
2471        method returns <constant>true</constant>, the platform can't store data in
2472        the database. If the value is <constant>false</constant> more information
2473        can be obtained by calling <methodname>getRawDataType()</methodname>,
2474        which may return:
2475      </para>
2476     
2477      <itemizedlist>
2478      <listitem>
2479        <para>
2480          <constant>null</constant>: The platform can store data with any
2481          raw data type in the database.
2482        </para>
2483      </listitem>
2484      <listitem>
2485        <para>
2486        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == true</code>:
2487        The platform can store data in the database but only data with the specified raw
2488        data type.
2489        </para>
2490      </listitem>
2491      <listitem>
2492        <para>
2493        A <classname docapi="net.sf.basedb.core">RawDataType</classname> that has <code>isStoredInDb() == false</code>:
2494        The platform can't store data in the database.
2495        </para>
2496      </listitem>
2497      </itemizedlist>
2498
2499      <para>
2500        One major change from earlier BASE versions is that the registration of raw data types
2501        has changed. The <filename>raw-data-types.xml</filename> file should
2502        only be used for raw data types that are stored in the database. The
2503        <sgmltag>storage</sgmltag> tag has been deprecated and BASE will refuse
2504        to start if it finds a raw data type definitions with <code>storage="file"</code>.
2505      </para>
2506     
2507      <para>
2508        For backwards compatibility reasons, each <classname docapi="net.sf.basedb.core">Platform</classname>
2509        that can only store data in files will create "virtual" raw data type
2510        objects internally. These raw data types all return <constant>false</constant>
2511        from the <methodname>RawDataType.isStoredInDb()</methodname>
2512        method. They also have a back-link to the platform/variant that
2513        created it: <methodname>RawDataType.getPlatform()</methodname>
2514        and <methodname>RawDataType.getVariant()</methodname>. These two methods
2515        will always return <constant>null</constant> when called on a raw data type
2516        that can be stored in the database.
2517      </para>
2518     
2519      <itemizedlist>
2520        <title>See also</title>
2521        <listitem><xref linkend="data_api.platforms" /></listitem>
2522        <listitem><xref linkend="plugin_developer.other.datafiles" /></listitem>
2523        <listitem><xref linkend="appendix.rawdatatypes.platforms" /></listitem>
2524        <listitem>
2525          <xref linkend="appendix.incompatible.2.5" /> in
2526          <xref linkend="appendix.incompatible" />
2527        </listitem>
2528      </itemizedlist>
2529     
2530      <sect3 id="core_api.data_in_files.diagram">
2531        <title>Diagram of classes and methods</title>
2532        <figure id="core_api.figures.data_in_files">
2533          <title>Store data in files</title>
2534          <screenshot>
2535            <mediaobject>
2536              <imageobject>
2537                <imagedata 
2538                  align="center"
2539                  scalefit="1" width="100%"
2540                  fileref="figures/uml/corelayer.datainfiles.png" format="PNG" />
2541              </imageobject>
2542            </mediaobject>
2543          </screenshot>
2544        </figure>
2545       
2546        <para>
2547          This is rather large set of classes and methods. The ultimate goal
2548          is to be able to create links between a <classname docapi="net.sf.basedb.core">RawBioAssay</classname>
2549          / <classname docapi="net.sf.basedb.core">ArrayDesign</classname> and <classname docapi="net.sf.basedb.core">File</classname>
2550          items and to provide some metadata about the files.
2551          The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname> class is one of the most
2552          important ones. It is intended to make it easy for plug-in (and other)
2553          developers to access the files without having to mess with platform
2554          or file type objects. The API is best described
2555          by a set of use-case examples.
2556        </para>
2557       
2558      </sect3>
2559     
2560      <sect3 id="core_api.data_in_files.ask">
2561        <title>Use case: Asking the user for files for a given item</title>
2562
2563        <para>
2564          A client application must know what types of files it makes sense
2565          to ask the user for. In some cases, data may be split into more than
2566          one file so we need a generic way to select files.
2567        </para>
2568       
2569        <para>
2570          Given that we have a <interfacename docapi="net.sf.basedb.core">FileStoreEnabled</interfacename>
2571          item we want to find out which <classname docapi="net.sf.basedb.core">DataFileType</classname>
2572          items that can be used for that item. The
2573          <methodname>DataFileType.getQuery(FileStoreEnabled)</methodname>
2574          can be used for this. Internally, the method uses the result from
2575          <methodname>FileStoreEnabled.getPlatform()</methodname>
2576          and <methodname>FileStoreEnabled.getVariant()</methodname>
2577          methods to restrict the query to only return file types for
2578          a given platform and/or variant. If the item doesn't have
2579          a platform or variant the query will return all file types
2580          that are associated with the given item type. In any case, we get a list
2581          of <classname>DataFileType</classname> items, each one representing a
2582          specific file type that we should ask the user about. Examples:
2583        </para>
2584
2585        <orderedlist>
2586        <listitem>
2587          <para>
2588          The <constant>Affymetrix</constant> platform defines <constant>CEL</constant>
2589          as a raw data file and <constant>CDF</constant> as an array design (reporter map)
2590          file. If we have a <classname docapi="net.sf.basedb.core">RawBioAssay</classname> the query will only return
2591          the CEL file type and the client can ask the user for a CEL file.
2592          </para>
2593        </listitem>
2594        <listitem>
2595          <para>
2596          The <constant>Generic</constant> platform defines <constant>PRINT_MAP</constant>
2597          and <constant>REPORTER_MAP</constant> for array designs. If we have
2598          an <classname docapi="net.sf.basedb.core">ArrayDesign</classname> the query will return those two
2599          items.
2600          </para>
2601        </listitem>
2602        </orderedlist>
2603     
2604        <para>
2605          It might also be interesting to know the currently selected file
2606          for each file type and if the platform has set the <varname>required</varname>
2607          flag for a particular file type. Here is a simple code example
2608          that may be useful to start from:
2609        </para>
2610     
2611        <programlisting language="java">
2612DbControl dc = ...
2613FileStoreEnabled item = ...
2614Platform platform = item.getPlatform();
2615PlatformVariant variant = item.getVariant();
2616
2617// Get list of DataFileTypes used by the platform
2618ItemQuery&lt;DataFileType&gt; query =
2619   DataFileType.getQuery(item);
2620List&lt;DataFileType&gt; types = query.list(dc);
2621
2622// Always check hasFileSet() method first to avoid
2623// creating the file set if it doesn't exists
2624FileSet fileSet = item.hasFileSet() ?
2625   null : item.getFileSet();
2626   
2627for (DataFileType type : types)
2628{
2629   // Get the current file, if any
2630   FileSetMember member = fileSet == null || !fileSet.hasMember(type) ?
2631      null : fileSet.getMember(type);
2632   File current = member == null ?
2633      null : member.getFile();
2634   
2635   // Check if a file is required by the platform
2636   PlatformFileType pft = platform == null ?
2637      null : platform.getFileType(type, variant);
2638   boolean isRequired = pft == null ?
2639      false : pft.isRequired();
2640     
2641   // Now we can do something with this information to
2642   // let the user select a file ...
2643}
2644</programlisting>
2645     
2646        <note>
2647          <title>Also remember to catch PermissionDeniedException</title>
2648          <para>
2649            The above code may look complicated, but this is mostly because
2650            of all checks for <constant>null</constant> values. Remember
2651            that many things are optional and may return <constant>null</constant>.
2652            Another thing to look out for is
2653            <exceptionname>PermissionDeniedException</exceptionname>:s. The logged in
2654            user may not have access to all items. The above example doesn't include
2655            any code for this since it would have made it too complex.
2656          </para>
2657        </note>
2658      </sect3>
2659     
2660      <sect3 id="core_api.data_in_files.link">
2661        <title>Use case: Link, validate and extract metadata from the selected files</title>
2662        <para>
2663          When the user has selected the file(s) we must store the links
2664          to them in the database. This is done with a <classname docapi="net.sf.basedb.core">FileSet</classname>
2665          object. A file set can contain any number of files. The only limitation
2666          is that it can only contain one file for each file type.
2667          Call <methodname>FileSet.setMember()</methodname> to store
2668          a file in the file set. If a file already exists for the given file type
2669          it is replaced, otherwise a new entry is created. The following
2670          program example assumes that we have a map where <classname docapi="net.sf.basedb.core">File</classname>:s
2671          are related to <classname docapi="net.sf.basedb.core">DataFileType</classname>:s. When all files
2672          have been added we call <methodname>FileSet.validate()</methodname>
2673          to validate the files and extract metadata.
2674        </para>
2675       
2676        <programlisting language="java">
2677DbControl dc = ...
2678FileStoreEnabled item = ...
2679Map&lt;DataFileType, File&gt; files = ...
2680
2681// Store the selected files in the fileset
2682FileSet fileSet = item.getFileSet();
2683for (Map.Entry&lt;DataFileType, File&gt; entry : files)
2684{
2685   DataFileType type = entry.getKey();
2686   File file = entry.getValue();
2687   fileSet.setMember(type, file);
2688}
2689
2690// Validate the files and extract metadata
2691fileSet.validate(dc, true);
2692</programlisting>
2693
2694        <para>
2695          Validation and extraction of metadata is important since we want
2696          data in files to be equivalent to data in the database. The validation
2697          and metadata extraction is done by the core when the
2698          <methodname>FileSet.validate()</methodname> is called.
2699          The process is partly pluggable since each <classname docapi="net.sf.basedb.core">DataFileType</classname> 
2700          can name a class that should do the validation and/or metadata extraction.
2701        </para>
2702
2703        <note>
2704          <para>
2705          The <methodname>FileSet.validate()</methodname> only validates
2706          the files where the file types have specified plug-ins that can
2707          do the validation and metadata extraction. The method doesn't
2708          throw any exceptions. Instead, all validation errors
2709          are returned a list of <classname>Throwable</classname>:s. The
2710          validation result is also stored for each file and can be access
2711          with <methodname>FileSetMember.isValid()</methodname> and
2712          <methodname>FileSetMember.getErrorMessage()</methodname>.
2713          </para>
2714        </note>
2715
2716        <para>
2717          Here is the general outline of what is going on in the core:
2718        </para>
2719
2720        <orderedlist>
2721        <listitem>
2722          <para>
2723          The core checks the <classname docapi="net.sf.basedb.core">DataFileType</classname> of all
2724          members in the file set and creates <classname docapi="net.sf.basedb.core.filehandler">DataFileValidator</classname>
2725          and <classname docapi="net.sf.basedb.core.filehandler">DataFileMetadataReader</classname> objects. Only one instance
2726          of each class is created. If the file set contains members which has the
2727          same validator or metadata reader, they will all share the same instance.
2728          </para>
2729        </listitem>
2730       
2731        <listitem>
2732          <para>
2733          Each validator/metadata reader class is initialised with calls to
2734          <methodname>DataFileHandler.setItem()</methodname> and
2735          <methodname>DataFileHandler.setFile()</methodname>.
2736          </para>
2737        </listitem>
2738       
2739        <listitem>
2740          <para>
2741          Each validator is called. The result of the validation is saved for each
2742          file and can be retreieved by <methodname>FileSetMember.isValid()</methodname>
2743          and <methodname>FileSetMember.getErrorMessage()</methodname>.
2744          </para>
2745        </listitem>
2746       
2747        <listitem>
2748          <para>
2749          Each metadata reader is called, unless the metadata reader is the same class
2750          as the validator and the validation failed. If the metadata reader is a
2751          different class, it is called even if the validation failed.
2752          </para>
2753        </listitem>
2754        </orderedlist>
2755
2756        <note>
2757          <title>Only one instance of each validator class is created</title>
2758          <para>
2759          The validation/metadata extraction is not done until all files have been
2760          added to the fileset. If the same validator/meta data reader is
2761          used for more than one file, the same instance is reused. Ie.
2762          the <methodname>setFile()</methodname> is called one time
2763          for each file/file type pair. The <methodname>validate()</methodname>
2764          and <methodname>extractMetadata()</methodname> methods are only
2765          called once.
2766          </para>
2767        </note>
2768       
2769        <para>
2770          All validators and meta data extractors should extend
2771          the <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> class. The reason
2772          is that we may want to add more methods to the <interfacename docapi="net.sf.basedb.core.filehandler">DataFileHandler</interfacename>
2773          interface in the future. The <classname docapi="net.sf.basedb.core.filehandler">AbstractDataFileHandler</classname> will
2774          be used to provide default implementations for backwards compatibility.
2775        </para>
2776       
2777      </sect3>
2778     
2779      <sect3 id="core_api.data_in_files.import">
2780        <title>Use case: Import data into the database</title>
2781       
2782        <para>
2783          This should be done by existing plug-ins in the same way as before.
2784          A slight modification is needed since it is good if the importers
2785          are made aware of already selected files in the <classname docapi="net.sf.basedb.core">FileSet</classname>
2786          to provide good default values. The <classname docapi="net.sf.basedb.core">FileStoreUtil</classname>
2787          class is very useful in cases like this:
2788        </para>
2789       
2790        <programlisting language="java">
2791RawBioAssay rba = ...
2792DbControl dc = ...
2793
2794// Get the current raw data file, if any
2795List&lt;File&gt; rawDataFiles =
2796   FileStoreUtil.getGenericDataFiles(dc, rba, FileType.RAW_DATA);
2797File defaultFile = rawDataFiles.size() > 0 ?
2798   rawDataFiles.get(0) : null;
2799   
2800// Create parameter asking for input file - use current as default
2801PluginParameter&lt;File&gt; fileParameter = new PluginParameter&lt;File&gt;(
2802   "file",
2803   "Raw data file",
2804   "The file that contains the raw data that you want to import",
2805   new FileParameterType(defaultFile, true, 1)
2806);
2807</programlisting>
2808
2809      <para>
2810        An import plug-in should also save the file that was used to the file set:
2811      </para>
2812     
2813      <programlisting language="java">
2814RawBioassay rba = ...
2815// The file the user selected to import from
2816File rawDataFile = (File)job.getValue("file");
2817
2818// Save the file to the fileset. The method will check which file
2819// type the platform uses as the raw data type. As a fallback the
2820// GENERIC_RAW_DATA type is used
2821FileStoreUtil.setGenericDataFile(dc, rba, FileType.RAW_DATA,
2822   DataFileType.GENERIC_RAW_DATA, rawDataFile);
2823</programlisting>
2824
2825      </sect3>
2826     
2827      <sect3 id="core_api.data_in_files.experiments">
2828        <title>Use case: Using raw data from files in an experiment</title>
2829       
2830        <para>
2831          Just as before, an experiment is still locked to a single
2832          <classname docapi="net.sf.basedb.core">RawDataType</classname>. This is a design issue that
2833          would break too many things if changed. If data is stored in files
2834          the experiment is also locked to a single <classname docapi="net.sf.basedb.core">Platform</classname>.
2835          This has been designed to have as little impact on existing
2836          plug-ins as possible. In most cases, the plug-ins will continue
2837          to work as before.
2838        </para>
2839       
2840        <para>
2841          A plug-in (using data from the database that needs to check if it can
2842          be used within an experiment can still do:
2843        </para>
2844       
2845        <programlisting language="java">
2846Experiment e = ...
2847RawDataType rdt = e.getRawDataType();
2848if (rdt.isStoredInDb())
2849{
2850   // Check number of channels, etc...
2851   // ... run plug-in code ...
2852}
2853</programlisting>
2854       
2855        <para>
2856          A newer plug-in which uses data from files should do:
2857        </para>
2858       
2859        <programlisting language="java">
2860Experiment e = ...
2861DbControl dc = ...
2862RawDataType rdt = e.getRawDataType();
2863if (!rdt.isStoredInDb())
2864{
2865   // Check that platform/variant is supported
2866   Platform p = rdt.getPlatform(dc);
2867   PlatformVariant v = rdt.getVariant(dc);
2868   // ...
2869
2870   // Get data files
2871   File aFile = FileStoreUtil.getDataFile(dc, ...);
2872   
2873   // ... run plug-in code ...
2874}
2875</programlisting>
2876       
2877      </sect3>
2878     
2879    </sect2>
2880   
2881    <sect2 id="core_api.signals">
2882      <title>Sending signals (to plug-ins)</title>
2883   
2884      <para>
2885        BASE has a simple system for sending signals between different parts of
2886        a system. This signalling system was initially developed to be able to
2887        kill plug-ins that a user for some reason wanted to abort. The signalling
2888        system as such is not limited to this and it can be used for other purposes
2889        as well. Signals can of course be handled internally in a single JVM but
2890        also sent externally to other JVM:s running on the same or a different
2891        computer. The transport mechanism for signals is decoupled from the actual
2892        handling of them. If you want to, you could implement a signal transporter
2893        that sends signal as emails and the target plug-in would never know.
2894      </para>
2895     
2896      <para>
2897        The remainder of this section will focus mainly on the sending and
2898        transportation of signals. For more information about handling signals
2899        on the receiving end, see <xref linkend="plugin_developer.signals" />.
2900      </para>
2901     
2902      <sect3 id="core_api.signals.diagram">
2903        <title>Diagram of classes and methods</title>
2904        <figure id="core_api.figures.signals">
2905          <title>The signalling system</title>
2906          <screenshot>
2907            <mediaobject>
2908              <imageobject>
2909                <imagedata 
2910                  align="center"
2911                  scalefit="1" width="100%"
2912                  fileref="figures/uml/corelayer.signals.png" format="PNG" />
2913              </imageobject>
2914            </mediaobject>
2915          </screenshot>
2916        </figure>
2917     
2918        <para>
2919          The signalling system is rather simple. An object that wish
2920          to receieve signals must implement the
2921          <interfacename docapi="net.sf.basedb.core.signal"
2922          >SignalTarget</interfacename>. It's only method
2923          is <methodname>getSignalHandler()</methodname>. A
2924          <interfacename docapi="net.sf.basedb.core.signal"
2925          >SignalHandler</interfacename> is an object that
2926          knows what to do when a signal is delivered to it. The target object
2927          may implement the <interfacename>SignalHandler</interfacename> itself
2928          or use one of the existing handlers.
2929        </para>
2930       
2931        <para>
2932          The difficult part here is to be aware that a signal is usually
2933          delivered by a separate thread. The target object must be aware
2934          of this and know how to handle multiple threads. As an example we
2935          can use the <classname docapi="net.sf.basedb.core.signal"
2936          >ThreadSignalHandler</classname> which simply
2937          calls <code>Thread.interrupt()</code> to deliver a signal. The target
2938          object that uses this signal handler it must know that it should check
2939          <code>Thread.interrupted()</code> at regular intervals from the main
2940          thread. If that method returns true, it means that the <constant>ABORT</constant>
2941          signal has been delivered and the main thread should clean up and exit as
2942          soon as possible.
2943        </para>
2944       
2945        <para>
2946          Even if a signal handler could be given directly to the party
2947          that may be interested in sending a signal to the target this
2948          is not recommended. This would only work when sending signals
2949          within the same virtual machine. The signalling system includes
2950          <interfacename docapi="net.sf.basedb.core.signal"
2951          >SignalTransporter</interfacename> and
2952          <interfacename docapi="net.sf.basedb.core.signal"
2953          >SignalReceiver</interfacename> objects that are used
2954          to decouple the sending of signals with the handling of signals. The
2955          implementation usually comes in pairs, for example
2956          <classname docapi="net.sf.basedb.core.signal"
2957          >SocketSignalTransporters</classname> and <classname 
2958          docapi="net.sf.basedb.core.signal">SocketSignalReceiver</classname>.
2959        </para>
2960       
2961        <para>
2962          Setting up the transport mechanism is usually a system responsibility.
2963          Only the system know what kind of transport that is appropriate for it's current
2964          setup. Ie. should signals be delievered by TCP/IP sockets, only internally, or
2965          should a delivery mechanism based on web services be implemented?
2966          If a system wants to receive signals it must create an appropriate
2967          <interfacename>SignalReceiver</interfacename> object. Within BASE the
2968          internal job queue set up it's own signalling system that can be used to
2969          send signals (eg. kill) running jobs. The job agents do the same but uses
2970          a different implementation. See <xref linkend="appendix.base.config.jobqueue" />
2971          for more information about how to configure the internal job queue's
2972          signal receiver. In both cases, there is only one signal receiver instance
2973          active in the system.
2974        </para>
2975       
2976        <para>
2977          Let's take the internal job queue as an example. Here is how it works:
2978        </para>
2979       
2980        <itemizedlist>
2981        <listitem>
2982          <para>
2983          When the internal job queue is started, it will also create a signal
2984          receiver instance according to the settings in <filename>base.config</filename>.
2985          The default is to create <classname docapi="net.sf.basedb.core.signal"
2986          >LocalSignalReceiver</classname>
2987          which can only be used inside the same JVM. If needed, this can
2988          be changed to a <classname docapi="net.sf.basedb.core.signal"
2989          >SocketSignalReceiver</classname> or any other
2990          user-provided implementation.
2991          </para>
2992        </listitem>
2993       
2994        <listitem>
2995          <para>
2996          When the job queue has found a plug-in to execute it will check if
2997          it also implements the <interfacename docapi="net.sf.basedb.core.signal"
2998          >SignalTarget</interfacename>
2999          interface. If it does, a signal handler is created and registered
3000          with the signal receiver. This is actually done by the BASE core
3001          by calling <methodname>PluginExecutionRequest.registerSignalReceiver()</methodname>
3002          which also makes sure that the the ID returned from the registration is
3003          stored in the database together with the job item representing the
3004          plug-in to execute.
3005          </para>
3006        </listitem>
3007       
3008        <listitem>
3009          <para>
3010          Now, when the web client see's a running job which has a non-empty
3011          signal transporter property, the <guilabel>Abort</guilabel>
3012          button is activated. If the user clicks this button the BASE core
3013          uses the information in the database to create
3014          <interfacename docapi="net.sf.basedb.core.signal"
3015          >SignalTransporter</interfacename> object. This
3016          is simply done by calling <code>Job.getSignalTransporter()</code>.
3017          The created signal transporter knows how to send a signal
3018          to the signal receiver it was first registered with. When the
3019          signal arrives at the receiver it will find the handler for it
3020          and call <code>SignalHandler.handleSignal()</code>. This will in it's turn
3021          trigger some action in the signal target which soon will abort what
3022          it is doing and exit.
3023          </para>
3024        </listitem>
3025        </itemizedlist>
3026       
3027       
3028      </sect3>
3029   
3030    </sect2>
3031   
3032  </sect1>
3033
3034  <sect1 id="api_overview.query_api">
3035    <title>The Query API</title>
3036    <para>
3037      This documentation is only available in the old format.
3038      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html"
3039        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/query/index.html</ulink>
3040    </para>
3041   
3042  </sect1>
3043 
3044  <sect1 id="api_overview.dynamic_and_batch_api">
3045    <title>Analysis and the Dynamic and Batch API:s</title>
3046    <para>
3047      This documentation is only available in the old format.
3048      See <ulink url="http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html"
3049        >http://base.thep.lu.se/chrome/site/doc/historical/development/overview/dynamic/index.html</ulink>
3050    </para>
3051  </sect1>
3052
3053  <sect1 id="api_overview.extensions">
3054    <title>Extensions API</title>
3055   
3056    <sect2 id="api_overview.extensions.core">
3057      <title>The core part</title>
3058   
3059      <para>
3060        The <emphasis>Extensions API</emphasis> is divided into two parts. A core
3061        part and a web client specific part. The core part can be found in the
3062        <package>net.sf.basedb.util.extensions</package> package and it's sub-packages,
3063        and consists of three sub-parts:
3064      </para>
3065     
3066      <itemizedlist>
3067      <listitem>
3068        <para>
3069        A set of interface definitions which forms the core of the Extensions API.
3070        The interfaces defines, for example, what an <interfacename 
3071        docapi="net.sf.basedb.util.extensions">Extension</interfacename> is and
3072        what an <interfacename 
3073        docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename> should do.
3074        </para>
3075      </listitem>
3076     
3077      <listitem>
3078        <para>
3079        A <classname docapi="net.sf.basedb.util.extensions">Registry</classname> that is
3080        used to keep track of installed extensions. The registry also provides
3081        functionality for invoking and using the extensions.
3082        </para>
3083      </listitem>
3084     
3085      <listitem>
3086        <para>
3087        Utility classes that are useful when implementation a client application
3088        that can be extendable. The most useful example is the <classname
3089        docapi="net.sf.basedb.util.extensions.xml">XmlLoader</classname> which can
3090        read extension definitions from XML files and create the proper factories,
3091        etc.
3092        </para>
3093      </listitem>
3094      </itemizedlist>
3095     
3096      <figure id="core_api.figures.extensions_core">
3097        <title>The core part of the Extensions API</title>
3098        <screenshot>
3099          <mediaobject>
3100            <imageobject>
3101              <imagedata 
3102                align="center"
3103                fileref="figures/uml/corelayer.extensions_core.png" format="PNG" />
3104            </imageobject>
3105          </mediaobject>
3106        </screenshot>
3107      </figure>
3108     
3109      <para>
3110        The <classname docapi="net.sf.basedb.util.extensions">Registry</classname> 
3111        is one of the main classes in the extension system. All extension points and
3112        extensions must be registered before they can be used. Typically, you will
3113        first register extension points and then extensions, beacuse an extension
3114        can't be registered until the extension point it is extending has been
3115        registered.
3116      </para>
3117     
3118      <para>
3119        An <interfacename docapi="net.sf.basedb.util.extensions">ExtensionPoint</interfacename>
3120        is an ID and a definition of an <interfacename docapi="net.sf.basedb.util.extensions">Action</interfacename>
3121        class. The other options (name, description, renderer factory, etc.) are optional.
3122        An <interfacename docapi="net.sf.basedb.util.extensions">Extension</interfacename>
3123        that extends a specific extension point must provide an
3124        <interfacename docapi="net.sf.basedb.util.extensions">ActionFactory</interfacename>
3125        instance that can create actions of the type the extension point requires.
3126      </para>
3127     
3128      <example id="core_api.example.extensions_core">
3129        <title>The menu extensions point</title>
3130        <para>
3131        The <code>net.sf.basedb.clients.web.menu.extensions</code> extension point
3132        requires <interfacename 
3133        docapi="net.sf.basedb.clients.web.extensions.menu">MenuItemAction</interfacename>
3134        objects. An extension for this extension point must provide a factory that
3135        can create <classname>MenuItemAction</classname>:s. BASE ships with default
3136        factory implementations, for example the <classname 
3137        docapi="net.sf.basedb.clients.web.extensions.menu">FixedMenuItemFactory</classname>
3138        class, but an extension may provide it's own factory implementation if it wants to.
3139        </para>
3140      </example>
3141     
3142      <para>
3143        Call the <methodname>Registry.useExtensions()</methodname> method
3144        to use extensions from one or several extension points. This method will
3145        find all extensions for the given extension points. If a filter is given,
3146        it checks if any of the extensions or extension points has been disabled.
3147        It will then call <methodname>ActionFactory.prepareContext()</methodname>
3148        for all remaining extensions. This gives the action factory a chance to
3149        also disable the extension, for example, if the logged in user doesn't
3150        have a required permission. The action factory may also set attributes
3151        on the context. The attributes can be anything that the extension point
3152        may make use of. Check the documentation for the specific extension point
3153        for information about which attributes it supports. If there are
3154        any renderer factories, their <methodname>RendererFactory.prepareContext()</methodname>
3155        is also called. They have the same possibility of setting attributes
3156        on the context, but can't disable an extension.
3157      </para>
3158     
3159      <para>
3160        After this, an <classname 
3161        docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname>
3162        object is created and returned to the extension point. Note that
3163        the <methodname>ActionFactory.getActions()</methodname> has not been
3164        called yet, so we don't know if the extensions are actually
3165        going to generate any actions. The <methodname>ActionFactory.getActions()</methodname>
3166        is not called until we have got ourselves an
3167        <classname docapi="net.sf.basedb.util.extensions">ActionIterator</classname>
3168        from the <methodname>ExtensionsInvoker.iterate()</methodname> method and
3169        starts to iterate. The call to <methodname>ActionIterator.hasNext()</methodname>
3170        will propagate down to <methodname>ActionFactory.getActions()</methodname>
3171        and the generated actions are then available with the
3172        <methodname>ActionIterator.next()</methodname> method.
3173      </para>
3174     
3175      <para>
3176        The <methodname>ExtensionsInvoker.renderDefault()</methodname>
3177        and <methodname>ExtensionsInvoker.render()</methodname> are
3178        just convenience methods that will make it easer to render
3179        the actions. The first method will of course only work if the
3180        extension point is providing a renderer factory, that can
3181        create the default renderer.
3182      </para>
3183     
3184      <note>
3185        <title>Be aware of multi-threading issues</title>
3186        <para>
3187          When you are creating extensions you must be aware that
3188          multiple threads may access the same objects at the same time.
3189          In particular, any action factory or renderer factory has to be
3190          thread-safe, since only one exists for each extension.
3191          Action and renderer objects should be thread-safe if the
3192          factories re-use the same objects.
3193        </para>
3194      </note>
3195   
3196    </sect2>
3197   
3198    <sect2 id="api_overview.extensions.web">
3199      <title>The web client part</title>
3200   
3201      <para>
3202        The web client specific parts of the Extensions API can be found
3203        in the <package>net.sf.basedb.client.web.extensions</package> package
3204        and it's subpackages. The top-level package contains classes used to
3205        administrate the extension system. Here is for example the
3206        <classname docapi="net.sf.basedb.client.web.extensions">ExtensionsControl</classname> 
3207        class which is the master controller for the web client extensions. It:
3208      </para>
3209     
3210      <itemizedlist>
3211      <listitem>
3212        <para>
3213        Keeps track of installed extensions and which JAR or XML file they are
3214        installed from.
3215        </para>
3216      </listitem>
3217     
3218      <listitem>
3219        <para>
3220        Can, manually or automatically, find and install new or
3221        updated extensions and uninstall deleted extensions.
3222        </para>
3223      </listitem>
3224     
3225      <listitem>
3226        <para>
3227        Adds permission control to the extension system, so that only an
3228        administrator is allowed to change settings, enable/disable extensions,
3229        etc.
3230        </para>
3231      </listitem>
3232      </itemizedlist>
3233     
3234      <para>
3235        In the top-level package there are also some abstract classes that may
3236        be useful to extend for developers creating their own extensions.
3237        For example, we recommend that all action factories extend the <classname 
3238        docapi="net.sf.basedb.client.web.extensions">AbstractJspActionFactory</classname>
3239        class.
3240      </para>
3241     
3242      <para>
3243        The sub-packages to <package>net.sf.basedb.client.web.extensions</package>
3244        are mostly specific to a single extension point or to a specific type of
3245        extension point. The <package>net.sf.basedb.client.web.extensions.menu</package>
3246        package, for example, contains classes that are/can be used for extensions
3247        adding menu items to the <menuchoice><guimenu>Extensions</guimenu></menuchoice>
3248        menu.
3249      </para>
3250     
3251      <figure id="core_api.figures.extensions_web">
3252        <title>The web client part of the Extensions API</title>
3253        <screenshot>
3254          <mediaobject>
3255            <imageobject>
3256              <imagedata 
3257                align="center"
3258                fileref="figures/uml/corelayer.extensions_web.png" format="PNG" />
3259            </imageobject>
3260          </mediaobject>
3261        </screenshot>
3262      </figure>
3263   
3264      <para>
3265        When the Tomcat web server is starting up, the <classname 
3266        docapi="net.sf.basedb.client.web.extensions">ExtensionsServlet</classname>
3267        is automatically loaded. This servlet has as two purposes:
3268      </para>
3269     
3270      <itemizedlist>
3271      <listitem>
3272        <para>
3273        Initialise the extensions system by calling
3274        <methodname>ExtensionsControl.init()</methodname>. This will result in
3275        an initial scan for installed extensions, which is equivalent to doing
3276        a manual scan with the force update setting to false. This means that
3277        the extension system is up an running as soon as the first user log's
3278        in to BASE.
3279        </para>
3280      </listitem>
3281     
3282      <listitem>
3283        <para>
3284        Act as a proxy for custom servlets defined by the extensions. URL:s
3285        ending with <code>.servlet</code> has been mapped to the
3286        <classname>ExtensionsServlet</classname>. When a request is made it
3287        will extract the name of the extension's JAR file from the
3288        URL, get the corresponding <classname 
3289        docapi="net.sf.basedb.client.web.extensions">ExtensionsFile</classname>
3290        and <classname docapi="net.sf.basedb.client.web.extensions">ServletWrapper</classname>
3291        and then invoke the custom servlet. More information can be found in
3292        <xref linkend="extensions_developer.servlets" />.
3293        </para>
3294      </listitem>
3295     
3296      </itemizedlist>
3297     
3298      <para>
3299        Using extensions only involves calling the
3300        <methodname>ExtensionsControl.createContext()</methodname> and
3301        <methodname>ExtensionsControl.useExtensions()</methodname> methods. This
3302        returns an <classname docapi="net.sf.basedb.util.extensions">ExtensionsInvoker</classname> 
3303        object as described in the previous section.
3304      </para>
3305     
3306      <para>
3307        To render the actions it is possible to either use the
3308        <methodname>ExtensionsInvoker.iterate()</methodname> method
3309        and generate HTML from the information in each action. Or
3310        (the better way) is to use a renderer together with the
3311        <classname docapi="net.sf.basedb.clients.web.taglib.extensions">Render</classname>
3312        taglib.
3313      </para>
3314     
3315      <para>
3316        To get information about the installed extensions, 
3317        change settings, enabled/disable extensions, performing a manual
3318        scan, etc. use the <methodname>ExtensionsControl.get()</methodname>
3319        method. This will create a permission-controlled object. All
3320        users has read permission, administrators has write permission.
3321      </para>
3322     
3323      <note>
3324        <para>
3325          The permission we check for is WRITE permission on the
3326          web client item. This means it is possible to give a user
3327          permissions to manage the extension system by assigning
3328          WRITE permission to the web client entry in the database.
3329          Do this from <menuchoice>
3330            <guimenu>Administrate</guimenu>
3331            <guimenuitem>Clients</guimenuitem>
3332          </menuchoice>.
3333        </para>
3334      </note>
3335   
3336      <para>
3337        The <classname docapi="net.sf.basedb.clients.web.extensions">XJspCompiler</classname>
3338        is mapped to handle the compilation <code>.xjsp</code> files
3339        which are regular JSP files with a different extension. This feature is
3340        experimental and requires installing an extra JAR into Tomcat's lib
3341        directory. See <xref linkend="admin.extensions.xjspcompiler" /> for
3342        more information.
3343      </para>
3344   
3345    </sect2>
3346   
3347  </sect1>
3348
3349  <sect1 id="api_overview.other_api">
3350    <title>Other useful classes and methods</title>
3351    <para>
3352      TODO
3353    </para>
3354  </sect1>
3355 
3356</chapter>
Note: See TracBrowser for help on using the repository browser.