source: trunk/doc/src/docbook/developerdoc/api_overview.xml @ 3757

Last change on this file since 3757 was 3757, checked in by Nicklas Nordborg, 14 years ago

References #721: Store data in files instead of in the database

Updated specification and documentation.

1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
5<!--
6  $Id: api_overview.xml 3757 2007-09-20 07:50:22Z nicklas $
7
8  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
9
10  This file is part of BASE - BioArray Software Environment.
11  Available at http://base.thep.lu.se/
12
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 2
16  of the License, or (at your option) any later version.
17
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
20  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
21  GNU General Public License for more details.
22
23  You should have received a copy of the GNU General Public License
24  along with this program; if not, write to the Free Software
25  Foundation, Inc., 59 Temple Place - Suite 330,
26  Boston, MA  02111-1307, USA.
27-->
28
29<chapter id="api_overview">
30  <?dbhtml dir="api"?>
31  <title>API overview (how to use and code examples)</title>
32
33  <sect1 id="api_overview.public_api">
34    <title>The Public API of BASE</title>
35   
36    <para>
37      Not all public classes and methods in the <filename>BASE2Core.jar</filename>
38      and other JAR files shipped with BASE are considered as
39      <emphasis>Public API</emphasis>. This is important knowledge
40      since we will always try to maintain backwards compatibility
41      for classes that are part of the public API. For other
42      classes, changes may be introduced at any time without
43      notice or specific documentation. In other words:
44    </para>
45   
46    <note>
47      <title>Only use the public API when developing plug-ins</title>
48      <para>
49        This will maximize the chance that your plug-in will continue
50        to work with the next BASE release. If you use the non-public API
51        you do so at your own risk.
52      </para>
53    </note>
54   
55    <para>
56      See the <ulink url="http://base.thep.lu.se/chrome/site/doc/api/index.html"
57        >javadoc</ulink> for information about
58      which parts of the API belong to the public API.
59      Methods, classes and other elements that have been tagged as
60      <code>@deprecated</code> should be considered as part of the internal API
61      and may be removed in a subsequent release without warning.
62    </para>
63   
64    <para>
65      See <xref linkend="appendix.incompatible" /> to read more about
66      changes that have been introduced by each release.
67    </para>
68
69    <sect2 id="api_overview.compatibility">
70      <title>What is backwards compatibility?</title>
71     
72      <para>
73        There is a great article about this subject on <ulink 
74        url="http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs"
75          >http://wiki.eclipse.org/index.php/Evolving_Java-based_APIs</ulink>.
76        This is what we will try to comply with. If you do not want to
77        read the entire article, here are some of the most important points:
78      </para>
79     
80     
81      <sect3 id="api_overview.compatibility.binary">
82        <title>Binary compatibility</title>
83        <para>
84        <blockquote>
85          Pre-existing Client binaries must link and run with new releases of the
86          Component without recompiling.
87        </blockquote>
88       
89        For example:
90        <itemizedlist>
91        <listitem>
92          <para>
93            We cannot change the number or types of parameters to a method
94            or constructor.
95          </para>
96        </listitem>
97        <listitem>
98          <para>
99            We cannot add or change methods to interfaces that are intended
100            to be implemented by plug-in or client code.
101          </para>
102        </listitem>
103        </itemizedlist>
104        </para>       
105      </sect3>
106     
107      <sect3 id="api_overview.compatibility.contract">
108        <title>Contract compatibility</title>
109        <para>
110          <blockquote>
111          API changes must not invalidate formerly legal Client code.
112          </blockquote>
113       
114          For example:
115          <itemizedlist>
116          <listitem>
117            <para>
118              We cannot change the implementation of a method to do
119              things differently than before. For example, allow <constant>null</constant>
120              as a return value when it was not allowed before.
121            </para>
122          </listitem>
123          </itemizedlist>
124       
125          <note>
126            <para>
127            Sometimes there is a very fine line between what is considered a
128            bug and what is considered a feature. For example, if the
129            actual implementation does not do what the javadoc says,
130            do we change the code or do we change the documentation?
131            This has to be considered from case to case and depends on
132            the age of the code and if we expect plug-ins and clients to be
133            affected by it or not.
134            </para>
135          </note>
136        </para>
137      </sect3>
138     
139      <sect3 id="api_overview.compatibility.source">
140        <title>Source code compatibility</title>
141        <para>
142        This is not an important matter and is not always possible to
143        achieve. In most cases, the problems are easy to fix.
144        Example:
145       
146        <itemizedlist>
147        <listitem>
148          <para>
149          Adding a class may break a plug-in or client that import
150          classes with <constant>.*</constant> if the same class name
151          exists in another package.
152          </para>
153        </listitem>
154        </itemizedlist>
155        </para>
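        <para>
        As a rough illustration of why this kind of breakage is easy to fix,
        the sketch below uses two standard Java packages as stand-ins: both
        <classname>java.awt</classname> and <classname>java.util</classname>
        contain a class named <classname>List</classname>, which mirrors the
        situation where a new BASE class gets the same name as a class in
        another package that the plug-in also imports with <constant>.*</constant>.
        The class name <classname>ImportClash</classname> is only an example.
        Adding an explicit single-type import resolves the ambiguity.
        </para>
        <programlisting><![CDATA[
import java.awt.*;      // on-demand import, makes java.awt.List visible
import java.util.*;     // on-demand import, makes java.util.List visible
import java.util.List;  // single-type import, takes precedence over both

final class ImportClash
{
  // Without the single-type import above, 'List' would be ambiguous
  // and this class would no longer compile.
  public static List<String> names()
  {
    List<String> result = new ArrayList<String>();
    result.add("plug-in");
    return result;
  }

  public static void main(String[] args)
  {
    System.out.println(names());
  }
}
]]></programlisting>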
156      </sect3>
157    </sect2>
158  </sect1>
159
160  <sect1 id="api_overview.data_api" chunked="1">
161    <title>The database schema and the Data Layer API</title>
162
163    <para>
164      This section gives an overview of the entire data layer API.
165      The figure below shows how different modules relate to each other.
166    </para>
167   
168    <note>
169      Not all information has been transferred from the old documentation yet.
170      The old documentation is available at
171      <ulink url="http://base.thep.lu.se/chrome/site/doc/development/overview/data/index.html"
172        >http://base.thep.lu.se/chrome/site/doc/development/overview/data/index.html</ulink>
173    </note>
174   
175    <figure id="data_api.figures.overview">
176      <title>Data layer overview</title>
177      <screenshot>
178        <mediaobject>
179          <imageobject>
180            <imagedata 
181              fileref="figures/uml/datalayer.overview.png" format="PNG" />
182          </imageobject>
183        </mediaobject>
184      </screenshot>
185    </figure>
186
187    <sect2 id="data_api.basic">
188      <title>Basic classes and interfaces</title>
189     
190      <para>
191        This document contains information about the basic classes and interfaces in this package.
192        They are important since all data-layer classes must inherit from one of the already
193        existing abstract base classes or implement one or more of the
194        existing interfaces. They contain code that is common to all classes,
195        for example implementations of the <methodname>equals()</methodname>
196        and <methodname>hashCode()</methodname> methods or how to link with the owner of an
197        item.
198      </para>
199     
200      <sect3 id="data_api.basic.uml">
201        <title>UML diagram</title>
202       
203        <figure id="data_api.figures.basic">
204          <title>Basic classes and interfaces</title>
205          <screenshot>
206            <mediaobject>
207              <imageobject>
208                <imagedata 
209                  fileref="figures/uml/datalayer.basic.png" format="PNG" />
210              </imageobject>
211            </mediaobject>
212          </screenshot>
213        </figure>
214      </sect3>
215     
216      <sect3 id="data_api.basic.classes">
217        <title>Classes</title>
218       
219        <variablelist>
220        <varlistentry>
221          <term><classname>BasicData</classname></term>
222          <listitem>
223            <para>
224            The root class. It overrides the <methodname>equals()</methodname>,
225            <methodname>hashCode()</methodname> and <methodname>toString()</methodname> methods
226            from the <classname>Object</classname> class. It also defines the
227            <varname>id</varname> and <varname>version</varname> properties.
228            All data layer classes must inherit from this class or one of its subclasses.
229            </para>
230          </listitem>
231        </varlistentry>
232       
233        <varlistentry>
234          <term><classname>OwnedData</classname></term>
235          <listitem>
236            <para>
237            Extends the <classname>BasicData</classname> class and adds
238            an <varname>owner</varname> property. The owner is a required link to a
239            <classname>UserData</classname> object, representing the user that
240            is the owner of the item.
241            </para>
242          </listitem>
243        </varlistentry>
244
245        <varlistentry>
246          <term><classname>SharedData</classname></term>
247          <listitem>
248            <para>
249            Extends the <classname>OwnedData</classname> class and adds
250            properties (<varname>itemKey</varname> and <varname>projectKey</varname>)
251            that hold access permission information for an item.
252            Access permissions are held in <classname>ItemKeyData</classname> and/or
253            <classname>ProjectKeyData</classname> objects. These objects only exist if
254            the item has been shared.
255            </para>
256          </listitem>
257        </varlistentry>
258
259        <varlistentry>
260          <term><classname>CommonData</classname></term>
261          <listitem>
262            <para>
263            This is a convenience class for items that extend the <classname>SharedData</classname>
264            class and implement the <interfacename>NameableData</interfacename> and
265            <interfacename>RemovableData</interfacename> interfaces. This is one of
266            the most common situations.
267            </para>
268          </listitem>
269        </varlistentry>
270
271        <varlistentry>
272          <term><classname>AnnotatedData</classname></term>
273          <listitem>
274            <para>
275            This is a convenience class for items that can be annotated.
276            Annotations are held in <classname>AnnotationSetData</classname> objects.
277            The annotation set only exists if annotations have been created for the item.
278            </para>
279          </listitem>
280        </varlistentry>
281        </variablelist>
282       
283      </sect3>
284     
285      <sect3 id="data_api.basic.interfaces">
286        <title>Interfaces</title>
287       
288        <variablelist>
289        <varlistentry>
290          <term><classname>IdentifiableData</classname></term>
291          <listitem>
292            <para>
293            All items are identifiable, which means that they have a unique <varname>id</varname>.
294            The id is unique for all items of a specific type (i.e. class). The id is a number
295            that is automatically generated by the database and has no other meaning
296            outside of the application. The <varname>version</varname> property is used for
297            detecting and preventing concurrent modifications to an item.
298            </para>
299          </listitem>
300        </varlistentry>
301       
302        <varlistentry>
303          <term><classname>OwnableData</classname></term>
304          <listitem>
305            <para>
306            An ownable item is an item which has an owner. The owner is represented as a
307            required link to a <classname>UserData</classname> object.
308            </para>
309          </listitem>
310        </varlistentry>       
311
312        <varlistentry>
313          <term><classname>ShareableData</classname></term>
314          <listitem>
315            <para>
316            A shareable item is an item which can be shared with other users, groups or projects.
317            Access permissions are held in <classname>ItemKeyData</classname> and/or
318            <classname>ProjectKeyData</classname> objects.
319            </para>
320          </listitem>
321        </varlistentry>
322             
323        <varlistentry>
324          <term><classname>NameableData</classname></term>
325          <listitem>
326            <para>
327            A nameable item is an item that has a name (required) and a description
328            (optional). The name doesn't have to be unique, except in a few special
329            cases (for example, the name of a file).
330            </para>
331          </listitem>
332        </varlistentry>
333       
334        <varlistentry>
335          <term><classname>RemovableData</classname></term>
336          <listitem>
337            <para>
338            A removable item is an item that can be flagged as removed. This doesn't
339            remove the information about the item from the database, but can be used by
340            client applications to hide items that the user is not interested in.
341            A trashcan function can be used to either restore or permanently
342            remove items that have the flag set.
343            </para>
344          </listitem>
345        </varlistentry>
346               
347        <varlistentry>
348          <term><classname>SystemData</classname></term>
349          <listitem>
350            <para>
351            A system item is an item which has an additional id in the form of a string. A system id
352            is required when we need to make sure that we can get a specific item without
353            knowing the numeric id. Examples of such items are the root user and the everyone group.
354            A system id is generally constructed like:
355            <constant>net.sf.basedb.core.User.ROOT</constant>. The system ids are defined in the
356            core layer by each item class.
357            </para>
358          </listitem>
359        </varlistentry>
360
361        <varlistentry>
362          <term><classname>DiskConsumableData</classname></term>
363          <listitem>
364            <para>
365            This interface is used by items which occupy a lot of disk space and
366            should be part of the quota system, for example files. The required
367            <classname>DiskUsageData</classname> contains information about the size,
368            location, owner etc. of the item.
369            </para>
370          </listitem>
371        </varlistentry>
372       
373        <varlistentry>
374          <term><classname>AnnotatableData</classname></term>
375          <listitem>
376            <para>
377            This interface is used by items which can be annotated. Annotations are name/value
378            pairs that are attached as extra information to an item. All annotations are
379            contained in an <classname>AnnotationSetData</classname> object.
380            </para>
381          </listitem>
382        </varlistentry>
383       
384        <varlistentry>
385          <term><classname>ExtendableData</classname></term>
386          <listitem>
387            <para>
388            This interface is used by items which can have extra administrator-defined
389            columns. The functionality is similar to annotations. It is not as flexible,
390            since it is a global configuration, but has better performance. BASE will
391            generate extra database columns to store the data in the tables for items that
392            can be extended.
393            </para>
394          </listitem>
395        </varlistentry>
396       
397        <varlistentry>
398          <term><classname>BatchableData</classname></term>
399          <listitem>
400            <para>
401            This interface is a tagging interface which is used by items that need batch
402            functionality in the core.
403            </para>
404          </listitem>
405        </varlistentry>
406        </variablelist>
407
408      </sect3>
409    </sect2>
410   
411    <sect2 id="data_api.authentication">
412      <title>User authentication and access control</title>
413     
414      <para>
415         This section gives an overview of user authentication and
416         how groups, roles and projects are used for access control
417         to items.
418      </para>
419     
420      <sect3 id="data_api.authentication.uml">
421        <title>UML diagram</title>
422       
423        <figure id="data_api.figures.authentication">
424          <title>User authentication and access control</title>
425          <screenshot>
426            <mediaobject>
427              <imageobject>
428                <imagedata 
429                  fileref="figures/uml/datalayer.authentication.png" format="PNG" />
430              </imageobject>
431            </mediaobject>
432          </screenshot>
433        </figure>
434      </sect3>
435     
436      <sect3 id="data_api.authentication.users">
437        <title>Users and passwords</title>     
438     
439        <para>
440          The <classname>UserData</classname> class holds information about users.
441          We keep the passwords in a separate table and use proxies to avoid loading
442          password data each time a user is loaded to minimize security risks. It is
443          only if the password needs to be changed that the <classname>PasswordData</classname>
444          object is loaded. The one-to-one mapping between user and password is controlled
445          by the password class, but a cascade attribute on the user class makes sure
446          that the password is deleted when a user is deleted.
447        </para>
448      </sect3>
449
450      <sect3 id="data_api.authentication.groups">
451        <title>Groups, roles and projects</title>     
452     
453        <para>
454          The <classname>GroupData</classname>, <classname>RoleData</classname> and
455          <classname>ProjectData</classname> classes hold information about groups, roles
456          and projects respectively. A user may be a member of any number of groups,
457          roles and/or projects. The membership in a project comes with an attached
458          permission value. This is the highest permission the user has in the
459          project. No matter what permission an item has been shared with, the
460          user will not get a higher permission than this. Groups may be members of other groups and
461          also of projects.
462        </para>
463       
464      </sect3>
465     
466      <sect3 id="data_api.authentication.keys">
467        <title>Keys</title>     
468     
469        <para>
470          The <classname>KeyData</classname> class and its subclasses
471          <classname>ItemKeyData</classname>, <classname>ProjectKeyData</classname> and
472          <classname>RoleKeyData</classname>, are used to store information about access
473          permissions to items. To get permission to manipulate an item a user must have
474          access to a key giving that permission. There are three types of keys:
475        </para>
476       
477        <variablelist>
478        <varlistentry>
479          <term><classname>ItemKey</classname></term>
480          <listitem>
481            <para>
482            Is used to give a user or group access to a specific item. The item
483            must be a <interfacename>ShareableData</interfacename> item.
484            The permissions are usually set by the owner of the item. Once created an
485            item key cannot be changed. This allows the core to reuse a key if the
486            permissions match exactly, i.e. for a given set of users/groups/permissions
487            there can be only one item key object.
488            </para>
489          </listitem>
490        </varlistentry>
491
492        <varlistentry>
493          <term><classname>ProjectKey</classname></term>
494          <listitem>
495            <para>
496            Is used to give members of a project access to a specific item. The item
497            must be a <interfacename>ShareableData</interfacename> item. Once created a
498            project key cannot be changed. This allows the core to reuse a key if the
499            permissions match exactly, i.e. for a given set of projects/permissions
500            there can be only one project key object.
501            </para>
502          </listitem>
503        </varlistentry>
504
505        <varlistentry>
506          <term><classname>RoleKey</classname></term>
507          <listitem>
508            <para>
509            Is used to give a user access to all items of a specific type, e.g.
510            <constant>READ</constant> all <constant>SAMPLES</constant>. The installation
511            will make sure that there already exists a role key for each type of item, and
512            it is not possible to add new or delete existing keys. Unlike the other two types
513            this key can be modified.
514            </para>
515           
516            <para>
517            A role key is also used to assign permissions to plug-ins. If a plug-in has
518            been specified to use permissions the default is to deny everything.
519            The mapping to the role key is used to grant permissions to the plug-in.
520            The <varname>granted</varname> value gives the plug-in access to all items
521            of the related item type regardless of whether the user that is running the plug-in has the
522            permission or not. The <varname>denied</varname> value denies access to all
523            items of the related item type even if the logged-in user has the permission.
524            Permissions that are neither granted nor denied are checked against the
525            logged-in user's regular permissions. Permissions to items that are
526            not linked are always denied.
527            </para>
528          </listitem>
529        </varlistentry>
530        </variablelist>
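        <para>
          The decision order described for plug-in permissions can be sketched
          roughly as follows. This is not the BASE implementation: the class
          <classname>PluginPermissionSketch</classname>, the
          <classname>Permission</classname> enum and the method
          <methodname>isAllowed()</methodname> are hypothetical names, and
          permissions are modelled as plain sets for readability. The sketch
          assumes that a denied permission wins if the same permission is both
          granted and denied, which the text above does not specify. Item types
          that are not linked to the plug-in are not modelled.
        </para>
        <programlisting><![CDATA[
import java.util.EnumSet;
import java.util.Set;

final class PluginPermissionSketch
{
  // Hypothetical subset of permissions, for illustration only.
  enum Permission { READ, USE, WRITE, DELETE }

  // Decide whether a plug-in may use 'wanted' on one item type.
  public static boolean isAllowed(Permission wanted, Set<Permission> granted,
    Set<Permission> denied, Set<Permission> userPermissions)
  {
    // Assumption: an explicit denial is checked first.
    if (denied.contains(wanted)) return false;
    // Granted: allowed even if the logged-in user lacks the permission.
    if (granted.contains(wanted)) return true;
    // Neither granted nor denied: fall back to the user's own permissions.
    return userPermissions.contains(wanted);
  }

  public static void main(String[] args)
  {
    Set<Permission> granted = EnumSet.of(Permission.READ);
    Set<Permission> denied = EnumSet.of(Permission.DELETE);
    Set<Permission> user = EnumSet.of(Permission.WRITE, Permission.DELETE);

    System.out.println(isAllowed(Permission.READ, granted, denied, user));
    System.out.println(isAllowed(Permission.DELETE, granted, denied, user));
    System.out.println(isAllowed(Permission.WRITE, granted, denied, user));
    System.out.println(isAllowed(Permission.USE, granted, denied, user));
  }
}
]]></programlisting>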
531       
532      </sect3>
533
534      <sect3 id="data_api.authentication.permissions">
535        <title>Permissions</title>
536       
537        <para>
538          The <varname>permission</varname> property appearing in many classes is an
539          integer value describing the permission:
540        </para>
541       
542        <informaltable>
543        <tgroup cols="2">
544          <colspec colname="value" />
545          <colspec colname="permission" />
546          <thead>
547            <row>
548              <entry>Value</entry>
549              <entry>Permission</entry>
550            </row>
551          </thead>
552          <tbody>
553            <row>
554              <entry>1</entry>
555              <entry>Read</entry>
556            </row>
557            <row>
558              <entry>3</entry>
559              <entry>Use</entry>
560            </row>
561            <row>
562              <entry>7</entry>
563              <entry>Restricted write</entry>
564            </row>
565            <row>
566              <entry>15</entry>
567              <entry>Write</entry>
568            </row>
569            <row>
570              <entry>31</entry>
571              <entry>Delete</entry>
572            </row>
573            <row>
574              <entry>47 (=32+15)</entry>
575              <entry>Set owner</entry>
576            </row>
577            <row>
578              <entry>79 (=64+15)</entry>
579              <entry>Set permissions</entry>
580            </row>
581            <row>
582              <entry>128</entry>
583              <entry>Create</entry>
584            </row>
585            <row>
586              <entry>256</entry>
587              <entry>Denied</entry>
588            </row>
589          </tbody>
590        </tgroup>
591        </informaltable>
592       
593        <para>
594          The values are constructed so that
595          <constant>READ</constant> -&gt;
596          <constant>USE</constant> -&gt;
597          <constant>RESTRICTED_WRITE</constant> -&gt;
598          <constant>WRITE</constant> -&gt;
599          <constant>DELETE</constant>
600          are chained in the sense that a higher permission always implies the lower
601          permissions. The <constant>SET_OWNER</constant> and <constant>SET_PERMISSION</constant>
602          permissions both imply <constant>WRITE</constant> permission. The <constant>DENIED</constant>
603          permission is only valid for role keys, and if specified it overrides all
604          other permissions.               
605        </para>
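        <para>
          The chaining follows directly from the bit patterns of the values in
          the table. The following is a minimal Java sketch of that idea. The
          class <classname>PermissionBits</classname> and the method
          <methodname>implies()</methodname> are hypothetical names used for
          illustration and are not part of the BASE API. Only the numeric
          values are taken from the table.
        </para>
        <programlisting><![CDATA[
final class PermissionBits
{
  // Numeric values copied from the table above.
  public static final int READ = 1;
  public static final int USE = 3;
  public static final int RESTRICTED_WRITE = 7;
  public static final int WRITE = 15;
  public static final int DELETE = 31;
  public static final int SET_OWNER = 47;       // 32 + WRITE
  public static final int SET_PERMISSION = 79;  // 64 + WRITE

  // True if every bit of 'required' is also set in 'held'.
  public static boolean implies(int held, int required)
  {
    return (held & required) == required;
  }

  public static void main(String[] args)
  {
    System.out.println(implies(DELETE, READ));       // true
    System.out.println(implies(SET_OWNER, WRITE));   // true
    System.out.println(implies(SET_OWNER, DELETE));  // false
    System.out.println(implies(READ, USE));          // false
  }
}
]]></programlisting>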
606       
607        <para>
608          When combining permissions for a single item, the permission codes for the different
609          paths are OR-ed together. For example, a user has a role key with <constant>READ</constant>
610          permission for <constant>SAMPLES</constant>, but also an item key with <constant>USE</constant>
611          permission for a specific sample. Of course, the resulting permission for that
612          sample is <constant>USE</constant>. For other samples the resulting permission is
613          <constant>READ</constant>.
614        </para>
615       
616        <para>
617          If the user is also a member of a project which has <constant>WRITE</constant>
618          permission for the same sample, the user will have <constant>WRITE</constant>
619          permission when working with that project.
620        </para>
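        <para>
          The sample scenario above can be worked through with a small sketch.
          The class <classname>CombinePermissions</classname> and the method
          <methodname>combine()</methodname> are hypothetical names, and only
          the numeric permission values come from the table. Letting
          <constant>DENIED</constant> replace the whole result is one possible
          way to model "overrides all other permissions" and is an assumption
          of this sketch.
        </para>
        <programlisting><![CDATA[
final class CombinePermissions
{
  public static final int READ = 1;
  public static final int USE = 3;
  public static final int WRITE = 15;
  public static final int DENIED = 256;

  // OR together the permission codes found along each path
  // (role key, item key, active project).
  public static int combine(int... codes)
  {
    int result = 0;
    for (int code : codes)
    {
      result |= code;
    }
    // Assumption: DENIED replaces everything else.
    return (result & DENIED) != 0 ? DENIED : result;
  }

  public static void main(String[] args)
  {
    int roleKey = READ;   // READ for all samples
    int itemKey = USE;    // USE for one specific sample
    int project = WRITE;  // WRITE through the active project

    System.out.println(combine(roleKey) == READ);
    System.out.println(combine(roleKey, itemKey) == USE);
    System.out.println(combine(roleKey, itemKey, project) == WRITE);
  }
}
]]></programlisting>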
621       
622        <para>
623          The <constant>RESTRICTED_WRITE</constant> permission is in most cases the same
624          as the <constant>WRITE</constant> permission. So far the <constant>RESTRICTED_WRITE</constant>
625          permission is only given to users to their own <classname>UserData</classname>
626          object so they can change their address and other contact information,
627          but not quota, expiration date and other administrative information.
628        </para>
629
630      </sect3>
631    </sect2>
632
633    <sect2 id="data_api.wares">
634      <title>Hardware and software</title>
635    </sect2>
636   
637    <sect2 id="data_api.reporters">
638      <title>Reporters</title>
639    </sect2>
640
641    <sect2 id="data_api.quota">
642      <title>Quota and disk usage</title>
643    </sect2>
644
645    <sect2 id="data_api.sessions">
646      <title>Client, session and settings</title>
647    </sect2>
648
649    <sect2 id="data_api.files">
650      <title>Files and directories</title>
651    </sect2>
652   
653    <sect2 id="data_api.platforms">
654      <title>Experimental platforms</title>
655
656      <para>
657         This section gives an overview of experimental platforms
658         and how they are used to enable data storage in files
659         instead of in the database.
660      </para>
661     
662      <note>
663        <title>THIS IS A DRAFT!</title>
664        <para>
665          This document is a draft currently being worked on!
666          Changes are expected before the design is finalized.
667        </para>
668      </note>
669     
670      <sect3 id="data_api.platforms.uml">
671        <title>UML diagram</title>
672       
673        <figure id="data_api.figures.platforms">
674          <title>Experimental platforms</title>
675          <screenshot>
676            <mediaobject>
677              <imageobject>
678                <imagedata 
679                  fileref="figures/uml/datalayer.platforms.png" format="PNG" />
680              </imageobject>
681            </mediaobject>
682          </screenshot>
683        </figure>
684      </sect3>
685     
686      <sect3 id="data_api.platforms.platforms">
687        <title>Platforms</title>
688       
689        <para>
690          The <classname>PlatformData</classname> holds information about a
691          platform. A platform can have one or more <classname>PlatformVariant</classname> items.
692          Both the platform and variant are identified by a system ID that
693          is fixed and can't be changed. <emphasis>Affymetrix</emphasis>
694          and <emphasis>Illumina</emphasis> are examples of platforms.
695          If the <varname>fileOnly</varname> flag is set, data for the platform
696          can only be stored in files and not imported into the database. If
697          the flag is not set, data can be imported into the database.
698          The <varname>rawDataType</varname> can be used to lock the platform
699          to a specific raw data type. If the value is <constant>null</constant>
700          the platform can use any raw data type.
701        </para>
702       
703        <para>
704          Each platform and its variants can be connected to one or more
705          <classname>FileSetMemberTypeData</classname> items. This item
706          describes the kind of files that are used to hold data for
707          the platform and/or variant. The file types are re-usable between
708          different platforms and variants. Note that a file type may be attached
709          to either only a platform or to a platform with a variant. File
710          types attached to platforms are inherited by the variants. The variants
711          can only define additional file types, not remove or redefine file types
712          that have been attached to the platform.
713        </para>
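        <para>
          The inheritance rule can be pictured as a set union: a variant sees
          every file type of its platform plus its own additions, and has no
          way of removing an inherited one. The sketch below only illustrates
          this rule. The class <classname>FileTypeInheritance</classname>, the
          method <methodname>effectiveFileTypes()</methodname> and the system
          IDs are made up for the example, and the design itself is still a draft.
        </para>
        <programlisting><![CDATA[
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class FileTypeInheritance
{
  // File types usable by a variant: everything attached to the
  // platform, plus whatever the variant adds on its own.
  public static Set<String> effectiveFileTypes(Set<String> platformTypes,
    Set<String> variantTypes)
  {
    Set<String> all = new LinkedHashSet<String>(platformTypes);
    all.addAll(variantTypes);
    return all;
  }

  public static void main(String[] args)
  {
    // Hypothetical system IDs, not taken from BASE.
    Set<String> platform = new LinkedHashSet<String>(Arrays.asList("example.cel"));
    Set<String> variant = new LinkedHashSet<String>(Arrays.asList("example.cdf"));
    System.out.println(effectiveFileTypes(platform, variant));
  }
}
]]></programlisting>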
714        <para>
715          The file type is also identified
716          by a fixed, non-changeable system ID. The <varname>itemType</varname>
717          property tells us what type of item the file holds data for (e.g.
718          array design or raw bioassay). It also links to a <classname>FileType</classname>
719          which is the generic type of data in the file. This makes it possible to query
720          the database for, as an example, files with the generic type
721          <constant>FileType.RAW_DATA</constant>. If we are in an Affymetrix
722          experiment we will get the CEL file, for another platform we will
723          get another file.
724        </para>
725
726      </sect3>
727     
728      <sect3 id="data_api.platforms.files">
729        <title>Files</title>
730       
731        <para>
732          An item must implement the <interfacename>FileStoreEnabledData</interfacename>
733          interface to be able to store data in files instead of in the database.
734          The interface creates a link to a <classname>FileSetData</classname> object.
735          In a file set it is only possible to store one file for each
736          <classname>FileSetMemberTypeData</classname> item.
737        </para>
738       
739      </sect3>
740    </sect2>
741
742    <sect2 id="data_api.protocols">
743      <title>Protocols</title>
744    </sect2>
745
746    <sect2 id="data_api.parameters">
747      <title>Parameters</title>
748    </sect2>
749
750    <sect2 id="data_api.annotations">
751      <title>Annotations</title>
752    </sect2>
753
754    <sect2 id="data_api.plugins">
755      <title>Plug-ins, jobs and job agents</title>
756    </sect2>
757   
758    <sect2 id="data_api.biomaterials">
759      <title>Biomaterials</title>
760    </sect2>
761
762    <sect2 id="data_api.plates">
763      <title>Array LIMS - plates</title>
764    </sect2>
765
766    <sect2 id="data_api.arrays">
767      <title>Array LIMS - arrays</title>
768    </sect2>
769
770    <sect2 id="data_api.rawdata">
771      <title>Hybridizations and raw data</title>
772    </sect2>
773
774    <sect2 id="data_api.experiments">
775      <title>Experiments and analysis</title>
776    </sect2>
777   
778    <sect2 id="data_api.misc">
779      <title>Other classes</title>
780    </sect2>
781
782  </sect1>
783 
784  <sect1 id="api_overview.core_api" chunked="1">
785    <title>The Core API</title>
786   
787    <para>
788      This section gives an overview of various parts of the core API.
789    </para>
790   
791    <sect2 id="core_api.data_in_files">
792      <title>Using files to store data</title>
793      <note>
794        <title>THIS IS A DRAFT!</title>
795        <para>
          This document is a draft currently being worked on!
797          Changes are expected before the design is finalized.
798        </para>
799      </note>
800     
801      <para>
802        This section is about how BASE can use files to store data instead
803        of importing it into the database. See <xref linkend="data_api.platforms" />
804        for an overview of the database schema for this feature. Files can be attached
805        to any item that implements the <interfacename>FileStoreEnabled</interfacename>
806        interface. Currently this is <classname>RawBioAssay</classname>, <classname>ArrayDesign</classname>,
807        <classname>BioAssaySet</classname> and <classname>BioAssay</classname>. The
808        ability to store data in files is not a replacement for storing data in the
809        database. It is possible (for some platforms/raw data types) to have data in
        files and in the database at the same time. We would have liked to enforce
        that (raw) data is always present in files, but this would not be backwards compatible
        with older installations, so there are three cases:
813      </para>
814     
815      <itemizedlist>
816      <listitem>
817        <para>
818        Data in files only
819        </para>
820      </listitem>
821      <listitem>
822        <para>
823        Data in the database only
824        </para>
825      </listitem>
826      <listitem>
827        <para>
828        Data in both files and in the database
829        </para>
830      </listitem>
831      </itemizedlist>
832     
833      <para>
        Not all three cases are supported for all types of data. This is controlled
        by the <classname>Platform</classname> class, which may disallow
        storing data in the database. To check this, call
        <methodname>getRawDataType()</methodname>, which may return:
838      </para>
839     
840      <itemizedlist>
841      <listitem>
842        <para>
843          <constant>null</constant>: The platform can store data with any
844          raw data type in the database.
845        </para>
846      </listitem>
847      <listitem>
848        <para>
849        A <classname>RawDataType</classname> that has <code>isStoredInDb() == true</code>:
850        The platform can store data in the database but only data with the specified raw
851        data type.
852        </para>
853      </listitem>
854      <listitem>
855        <para>
856        A <classname>RawDataType</classname> that has <code>isStoredInDb() == false</code>:
857        The platform can't store data in the database.
858        </para>
859      </listitem>
860      </itemizedlist>
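
      <para>
        The three outcomes above can be summarized in code. The following is a
        minimal, self-contained sketch and not BASE code:
        <classname>RawDataTypeStub</classname>, <classname>StorageCheckSketch</classname>
        and the <constant>DbStorage</constant> enumeration are hypothetical
        stand-ins invented for this illustration. Only the meaning of
        <constant>null</constant> and of <methodname>isStoredInDb()</methodname>
        is taken from the list above.
      </para>

      <programlisting>
// Hypothetical stand-in for RawDataType; NOT the BASE class.
// It only models the one property that matters here.
final class RawDataTypeStub
{
   private final boolean storedInDb;

   RawDataTypeStub(boolean storedInDb)
   {
      this.storedInDb = storedInDb;
   }

   boolean isStoredInDb()
   {
      return storedInDb;
   }
}

final class StorageCheckSketch
{
   // Names invented for this sketch, one per case in the list above
   enum DbStorage { ANY_RAW_DATA_TYPE, ONLY_THIS_RAW_DATA_TYPE, NOT_ALLOWED }

   // 'rdt' plays the role of the value returned by getRawDataType()
   static DbStorage classify(RawDataTypeStub rdt)
   {
      if (rdt == null)
      {
         return DbStorage.ANY_RAW_DATA_TYPE;
      }
      return rdt.isStoredInDb() ?
         DbStorage.ONLY_THIS_RAW_DATA_TYPE : DbStorage.NOT_ALLOWED;
   }

   public static void main(String[] args)
   {
      System.out.println(classify(null));
      System.out.println(classify(new RawDataTypeStub(true)));
      System.out.println(classify(new RawDataTypeStub(false)));
   }
}
</programlisting>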
861
862      <para>
863        One major modification is that the registration of raw data types
864        has changed. The <filename>raw-data-types.xml</filename> file should
865        only be used for raw data types that are stored in the database. The
866        <sgmltag>storage</sgmltag> tag has been deprecated and BASE will ignore
867        any raw data type definitions with <code>storage="file"</code>.
868        To replace this, each <classname>Platform</classname> that
869        can only store data in files also defines a "virtual" raw data type.
870      </para>
871     
872      <sect3 id="core_api.data_in_files.diagram">
873        <title>Diagram of classes and methods</title>
874        <figure id="core_api.figures.data_in_files">
875          <title>Store data in files</title>
876          <screenshot>
877            <mediaobject>
878              <imageobject>
879                <imagedata 
880                  fileref="figures/uml/corelayer.datainfiles.png" format="PNG" />
881              </imageobject>
882            </mediaobject>
883          </screenshot>
884        </figure>
885      </sect3>
886     
887      <sect3 id="core_api.data_in_files.ask">
888        <title>Asking the user for files</title>
889
890        <para>
891          A client application must know what types of files it makes sense
892          to ask the user for. In some cases, data may be split into more than
893          one file so we need a generic way to select files.
894        </para>
895       
896        <para>
897          Given that we have a <interfacename>FileStoreEnabled</interfacename>
898          item we use the <methodname>FileSetMemberType.getQuery()</methodname>
          method to find which file types can be used for that
900          item. Internally, the <methodname>getQuery()</methodname>
901          method uses the <methodname>FileStoreEnabled.getPlatform()</methodname>
902          and <methodname>FileStoreEnabled.getVariant()</methodname>
903          methods to restrict the query to only return file types for
904          a given platform and/or variant. If the item doesn't have
905          a platform or variant the query will only return file types
906          that are associated with the given item type, but not with any specific
907          platform. In any case, we get a list of <classname>FileSetMemberType</classname>
908          items, each one representing a specific file type that
909          we should ask the user about. Examples:
910        </para>
911
912        <orderedlist>
913        <listitem>
914          <para>
          The <constant>Affymetrix</constant> platform defines <constant>CEL</constant>
          for <constant>FileType.RAW_DATA</constant>
          and <constant>CDF</constant> for <constant>FileType.REPORTER_MAP</constant>.
          If we have a
          <classname>RawBioAssay</classname> the query will only return
          the CEL file type and the client can ask the user for a CEL file.
921          </para>
922        </listitem>
923        <listitem>
924          <para>
925          More examples.... ???
926          </para>
927        </listitem>
928        </orderedlist>
929     
930        <para>
931          Here is a simple code template that might be useful.
932        </para>
933       
        <programlisting>
DbControl dc = ...
FileStoreEnabled item = ...
ItemQuery&lt;FileSetMemberType&gt; query =
   FileSetMemberType.getQuery(item);
List&lt;FileSetMemberType&gt; types = query.list(dc);
// We now have a list of file types...
// ... ask the user to select a file for each one of them
</programlisting>
943     
944      </sect3>
945     
946      <sect3 id="core_api.data_in_files.link">
947        <title>Link to the selected files</title>
948        <para>
949          When the user has selected the file(s) we must store the links
          to them in the database. This is done with a <classname>FileSet</classname>
          object. A file set can contain any number of files. The only limitation
952          is that it can only contain one file for each file type.
953          Call <methodname>FileSet.setMember()</methodname> to store
954          a file in the set. If a file already exists for the given file type
955          it is replaced, otherwise a new entry is created.
956        </para>
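
        <para>
          The "one file per file type" rule can be pictured as a map keyed by
          file type. The sketch below is self-contained and does not use the
          BASE classes: <classname>FileSetSketch</classname>, its method
          signatures and the file names are assumptions made for illustration,
          with plain strings standing in for files and file types.
        </para>

        <programlisting>
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a file set; NOT the BASE FileSet class
final class FileSetSketch
{
   // At most one file per file type: the file type is the map key
   private final Map&lt;String, String&gt; members = new LinkedHashMap&lt;&gt;();

   // Store a file for the given file type. Returns the file that
   // was replaced, or null if a new entry was created.
   String setMember(String fileType, String file)
   {
      return members.put(fileType, file);
   }

   String getMember(String fileType)
   {
      return members.get(fileType);
   }

   int size()
   {
      return members.size();
   }

   public static void main(String[] args)
   {
      FileSetSketch fileSet = new FileSetSketch();
      fileSet.setMember("CEL", "first.cel");
      fileSet.setMember("CDF", "chip.cdf");
      // Same file type again: the existing entry is replaced
      String replaced = fileSet.setMember("CEL", "second.cel");
      System.out.println(replaced + " was replaced by " + fileSet.getMember("CEL"));
      System.out.println("Number of files: " + fileSet.size());
   }
}
</programlisting>

        <para>
          After the three calls the sketch still holds two files, since
          the second CEL file took the place of the first one.
        </para>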
957      </sect3>
958     
959      <sect3 id="core_api.data_in_files.validate">
960        <title>Validate the file and extract metadata</title>
961       
962        <para>
963          Validation and extraction of metadata is important since we want
964          data in files to be equivalent to data in the database. The validation
965          and metadata extraction is automatically done by the core when a
966          file is added to a file set. The process is partly pluggable
967          since each <classname>FileSetMemberType</classname> can name a class
968          that should do the validation and/or metadata extraction.
969          Here is the general outline:
970        </para>
971       
        <programlisting>
FileStoreEnabled item = ...
FileSetMemberType type = ...
File file = ...
FileSetMember member = new FileSetMember(file, type);

FileValidator validator = type.getValidator();
MetadataReader metadata = type.getMetadataReader();
validator.setFile(member);
validator.setItem(item);
// Repeat for 'metadata' if not same as 'validator'

validator.validate();
metadata.extractMetadata();
</programlisting>
987       
988        <note>
989          <title>Only one instance of each validator class is created</title>
990          <para>
          The validation/metadata extraction is not done until all files have been
          added to the file set. If the same validator/metadata extractor is
          used for more than one file, the same instance is reused, i.e.
          the <methodname>setFile()</methodname> method is called once
          for each file/file type pair. The <methodname>validate()</methodname>
          and <methodname>extractMetadata()</methodname> methods are only
          called once.
998          </para>
999        </note>
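
        <para>
          The call pattern described in the note can be sketched as two
          passes over the files. This is only an illustration of the
          pattern and not the core implementation:
          <classname>HandlerReuseSketch</classname>,
          <classname>CountingHandler</classname>, the handler names and
          the file names are all hypothetical.
        </para>

        <programlisting>
import java.util.LinkedHashMap;
import java.util.Map;

final class HandlerReuseSketch
{
   // Hypothetical handler that only counts how often it is called
   static final class CountingHandler
   {
      int setFileCalls = 0;
      int validateCalls = 0;

      void setFile(String file)
      {
         setFileCalls++;
      }

      void validate()
      {
         validateCalls++;
      }
   }

   // 'handlerByFile' maps a file name to the name of its handler class.
   // Returns the handler instances that were created, keyed by name.
   static Map&lt;String, CountingHandler&gt; process(Map&lt;String, String&gt; handlerByFile)
   {
      Map&lt;String, CountingHandler&gt; instances = new LinkedHashMap&lt;&gt;();
      // Pass 1: hand every file to the single instance of its handler
      for (Map.Entry&lt;String, String&gt; entry : handlerByFile.entrySet())
      {
         CountingHandler handler = instances.get(entry.getValue());
         if (handler == null)
         {
            handler = new CountingHandler();
            instances.put(entry.getValue(), handler);
         }
         handler.setFile(entry.getKey());
      }
      // Pass 2: all files are registered, validate once per instance
      for (CountingHandler handler : instances.values())
      {
         handler.validate();
      }
      return instances;
   }

   public static void main(String[] args)
   {
      Map&lt;String, String&gt; files = new LinkedHashMap&lt;&gt;();
      files.put("data.cel", "AffymetrixHandler");
      files.put("chip.cdf", "AffymetrixHandler");
      CountingHandler handler = process(files).get("AffymetrixHandler");
      System.out.println("setFile: " + handler.setFileCalls
         + ", validate: " + handler.validateCalls);
   }
}
</programlisting>

        <para>
          With two files sharing one handler, the sketch reports two
          <methodname>setFile()</methodname> calls and a single
          <methodname>validate()</methodname> call.
        </para>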
1000       
1001        <para>
          All validators and metadata extractors should extend
1003          the <classname>AbstractFileHandler</classname> class. The reason
1004          is that we may want to add more methods to the <interfacename>FileHandler</interfacename>
1005          interface in the future. The <classname>AbstractFileHandler</classname> will
1006          be used to provide default implementations for backwards compatibility.
1007        </para>
1008       
1009      </sect3>
1010     
1011      <sect3 id="core_api.data_in_files.import">
1012        <title>Import data into the database</title>
1013       
1014        <para>
1015          This should be done by existing plug-ins in the same way as before.
1016          A slight modification is needed since it is good if the importers
1017          are made aware of already selected files in the <classname>FileSet</classname>
          to provide good default values. Something like this:
1019        </para>
1020       
        <programlisting>
File defaultFile = null;
RawBioAssay rba = ...;
if (rba.hasFileSet())
{
   FileSet fileSet = rba.getFileSet();
   List&lt;FileSetMember&gt; members =
      fileSet.getMembers(FileType.RAW_DATA);
   if (members.size() &gt; 0)
   {
      defaultFile = members.get(0).getFile();
   }
}
</programlisting>
1035      </sect3>
1036     
1037      <sect3 id="core_api.data_in_files.experiments">
1038        <title>Using raw data from files in an experiment</title>
1039       
1040        <para>
1041          Just as before, an experiment is still locked to a single
1042          <classname>RawDataType</classname>. This is a design issue that
1043          would break too many things if changed. If data is stored in files
1044          the experiment is also locked to a single <classname>Platform</classname>.
1045          This has been designed to have as little impact on existing
1046          plug-ins as possible. In most cases, the plug-ins will continue
1047          to work as before.
1048        </para>
1049       
1050        <para>
          A plug-in (using data from the database) that needs to check if it can
          be used within an experiment can still do:
1053        </para>
1054       
        <programlisting>
Experiment e = ...
RawDataType rdt = e.getRawDataType();
if (rdt.isStoredInDb())
{
   // Check number of channels, etc...
   // ... run plug-in code ...
}
</programlisting>
1064       
1065        <para>
1066          A newer plug-in which uses data from files should do:
1067        </para>
1068       
        <programlisting>
Experiment e = ...
RawDataType rdt = e.getRawDataType();
if (!rdt.isStoredInDb())
{
   Platform p = rdt.getPlatform();
   PlatformVariant v = rdt.getVariant();
   // Check that platform/variant is supported
   // ... run plug-in code ...
}
</programlisting>
1080       
1081      </sect3>
1082     
1083    </sect2>
1084  </sect1>
1085
1086  <sect1 id="api_overview.query_api">
1087    <title>The Query API</title>
1088    <para>
1089      This documentation is only available in the old format.
1090      See <ulink url="http://base.thep.lu.se/chrome/site/doc/development/overview/query/index.html"
1091        >http://base.thep.lu.se/chrome/site/doc/development/overview/query/index.html</ulink>
1092    </para>
1093   
1094  </sect1>
1095 
1096  <sect1 id="api_overview.dynamic_and_batch_api">
    <title>Analysis and the Dynamic and Batch APIs</title>
1098    <para>
1099      This documentation is only available in the old format.
1100      See <ulink url="http://base.thep.lu.se/chrome/site/doc/development/overview/dynamic/index.html"
1101        >http://base.thep.lu.se/chrome/site/doc/development/overview/dynamic/index.html</ulink>
1102    </para>
1103  </sect1>
1104
1105  <sect1 id="api_overview.other_api">
1106    <title>Other useful classes and methods</title>
1107    <para>
1108      TODO
1109    </para>
1110  </sect1>
1111 
1112</chapter>