  • trunk/doc/src/docbook/developerdoc/base_overview.xml

    r4537 r4902  
    27 <chapter id="base_develop_overview">
     27<chapter id="base_develop_overview" chunked="0">
    2828  <?dbhtml dir="develop_overview"?>
    2929  <title>Developer overview of BASE</title>
    3131  <para>
    32     This documentation is only available in the old format.
    33     See <ulink url=""
    34       ></ulink>
     32    This section gives a brief overview of the architecture used
     33    in BASE. This is a good starting point if you need to know how
     34    various parts of BASE are glued together. The figure below
     35    shows most of the important parts of BASE. The following
     36    sections will briefly describe some parts of the figure
     37    and give you pointers for further reading if you are interested in
     38    the details.
    3539  </para>
     41    <figure id="develop.figures.overview">
     42      <title>Overview of the BASE application</title>
     43      <screenshot>
     44        <mediaobject>
     45          <imageobject>
     46            <imagedata
     47              align="center"
     48              scalefit="1" width="100%"
     49              fileref="figures/uml/base.overview.png" format="PNG" />
     50          </imageobject>
     51        </mediaobject>
     52      </screenshot>
     53    </figure>
     55    <sect1 id="base_develop_overview.database">
     56      <title>Fixed vs. dynamic database</title>
     58      <para>
     59        BASE stores most of its data in a database. The database is divided into
     60        two parts, one fixed and one dynamic part.
     61      </para>
     63      <para>
     64        The fixed part contains tables that correspond
     65        to the various items found in BASE. There is, for example, one table
     66        for users, one table for groups and one table for reporters. Some items
     67        share the same table. Biosources, samples, extracts and labeled extracts are
     68        all biomaterials and share the <code>BioMaterials</code> table. The access
     69        to the fixed part of the database goes through Hibernate in most cases
     70        or through the Batch API in some cases (for example, access to reporters).
     71      </para>
     73      <para>
     74        The dynamic part of the database contains tables for storing analyzed data. 
     75        Each experiment has its own set of tables and it is not possible to mix data
     76        from two experiments. The dynamic part of the database can only be accessed
     77        by the Batch API and the Query API using SQL and JDBC.
     78      </para>
     80      <note>
     81        The actual location of the two parts depends on the database that is used.
     82        MySQL uses two separate databases while PostgreSQL uses one database with two schemas.
     83      </note>
     85      <bridgehead>More information</bridgehead>
     86      <itemizedlist>
     87        <listitem>
     88          <para>
     89          <xref linkend="api_overview.dynamic_and_batch_api" />
     90          </para>
     91        </listitem>
     92      </itemizedlist>
     94    </sect1>
     96    <sect1 id="base_develop_overview.hibernate">
     97      <title>Hibernate and the DbEngine</title>
     99      <para>
     100        Hibernate (<ulink url=""></ulink>) is an
     101        object/relational mapping software package. It takes plain Java objects
     102        and stores them in a database. All we have to do is to set the properties
     103        on the objects (for example: <code>user.setName("A name")</code>). Hibernate
     104        will take care of the SQL generation and database communication for us.
     105        This is not a magic or automatic process. We have to provide mapping
     106        information about which objects go into which tables and which properties
     107        go into which columns, along with other settings such as caching and proxying.
     108        This is done by annotating the code with Javadoc comments. The classes
     109        that are mapped to the database are found in the <code></code>
     110        package, which is shown as the <guilabel>Data classes</guilabel> box in the image above.
     111        The <classname docapi="net.sf.basedb.core">HibernateUtil</classname> class contains a
     112        lot of functionality for interacting with Hibernate.
     113      </para>
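      <para>
        As a rough illustration of mapping through Javadoc comments, the sketch
        below assumes XDoclet-style <code>@hibernate.*</code> tags that a build-time
        tool turns into Hibernate mappings. The class name <code>ExampleData</code>,
        the table name and the column settings are hypothetical and are not taken
        from the BASE source code.
      </para>

```java
/**
 * Hypothetical data class, used only for illustration.
 * The Javadoc tags are assumed to be read by a build-time tool
 * (XDoclet-style) that generates the Hibernate mapping.
 *
 * @hibernate.class table="Examples"
 */
class ExampleData {
    private int id;
    private String name;

    /**
     * Primary key, assumed to be generated by the database.
     * @hibernate.id column="id" generator-class="native"
     */
    public int getId() { return id; }
    void setId(int id) { this.id = id; }

    /**
     * A required string property stored in the "name" column.
     * @hibernate.property column="name" type="string" length="255" not-null="true"
     */
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

      <para>
        The class itself stays a plain Java object. Everything related to
        persistence lives in the comments, so the compiler ignores it.
      </para>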
     115      <para>
     116        Hibernate supports many different database systems. In theory, this means
     117        that BASE should work with all those databases. However, in practice we have
     118        found that this is not the case. For example, Oracle converts empty strings
     119        to <code>null</code> values, which breaks some parts of our code that
     120        expect non-null values. Another difficulty is that our Batch API and some parts of
     121        the Query API generate native SQL as well. We try to use database dialect information
     122        from Hibernate, but it is not always possible. The <interfacename
     123        docapi="net.sf.basedb.core.dbengine">DbEngine</interfacename> contains code
     124        for generating the SQL that Hibernate can't help us with. We have implemented
     125        a generic <classname docapi="net.sf.basedb.core.dbengine">DefaultDbEngine</classname>
     126        which follows ANSI specifications, and special drivers for MySQL
     127        (<classname docapi="net.sf.basedb.core.dbengine">MySQLEngine</classname>) and
     128        PostgreSQL (<classname docapi="net.sf.basedb.core.dbengine">PostgresDbEngine</classname>).
     129        We don't expect BASE to work with other databases without modifications.
     130      </para>
     132      <bridgehead>More information</bridgehead>
     133      <itemizedlist>
     134        <listitem>
     135          <para>
     136          <xref linkend="core_ref.rules.datalayer" />
     137          </para>
     138        </listitem>
     139        <listitem>
     140          <para>
     141          <ulink url=""></ulink>
     142          </para>
     143        </listitem>
     144      </itemizedlist>
     146    </sect1>
     148    <sect1 id="base_develop_overview.batchapi">
     149      <title>The Batch API</title>
     151      <para>
     152        Hibernate comes at a price. It affects performance and uses a lot
     153        of memory. This means that those parts of BASE that often handle
     154        lots of items at the same time don't work well with Hibernate.
     155        Examples are reporters, array design features and raw data. We
     156        have created the Batch API to solve these problems.
     157      </para>
     159      <para>
     160        The Batch API uses JDBC and SQL directly against the database. However, we
     161        still use metadata and database dialect information available from Hibernate
     162        to generate most of the SQL we need. In theory, this should make the Batch API
     163        just as database-independent as Hibernate is. In practice there is some information
     164        that we can't extract from Hibernate so we have implemented a simple
     165        <interfacename docapi="net.sf.basedb.core.dbengine">DbEngine</interfacename>
     166        to account for missing pieces. The Batch API can be used for any
     167        <classname docapi="">BatchableData</classname> class in the
     168        fixed part of the database and is the only way to add data to the dynamic part.
     169      </para>
     171      <note>
     172        The main reason for the Batch API is to avoid the internal caching
     173        of Hibernate which eats lots of memory when handling thousands of items.
     174        Hibernate 3.1 introduced a new stateless API which among other things doesn't
     175        do any caching. This version was released after we had created the Batch API.
     176        We ran a few tests to check if it would be better for us to switch back to Hibernate
     177        but found that it didn't perform as well as our own Batch API (it was about two times slower).
     178        In any case, we can never get Hibernate to work with the dynamic database,
     179        so the Batch API is needed.
     180      </note>
     182      <bridgehead>More information</bridgehead>
     183      <itemizedlist>
     184        <listitem>
     185          <para>
     186          <xref linkend="api_overview.dynamic_and_batch_api" />
     187          </para>
     188        </listitem>
     189        <listitem>
     190          <para>
     191          <xref linkend="core_ref.rules.batchclass" />
     192          </para>
     193        </listitem>
     194        <listitem>
     195          <para>
     196          <xref linkend="core_ref.batch" />
     197          </para>
     198        </listitem>
     199      </itemizedlist>
     200    </sect1>
     202    <sect1 id="base_develop_overview.classes">
     203      <title>Data classes vs. item classes</title>
     205      <para>
     206        The data classes are, with few exceptions, for internal use. These are the classes
     207        that are mapped to the database with Hibernate mapping files. They are very simple
     208        and contain no logic at all. They don't do any permission checks or any data
     209        validation.
     210      </para>
     212      <para>
     213        Most of the data classes have a corresponding item class. For example:
     214        <classname docapi="">UserData</classname>
     215        and <classname docapi="net.sf.basedb.core">User</classname>,
     216        <classname docapi="">GroupData</classname> and
     217        <classname docapi="net.sf.basedb.core">Group</classname>.
     218        The item classes are what the client applications can see and use. They contain
     219        logic for permission checking (for example, whether the logged-in user has WRITE permission)
     220        and data validation (for example, rejecting an attempt to set a required property to null).
     221      </para>
     223      <para>
     224        The exceptions to the above scheme are the batchable classes, which are
     225        all subclasses of the <classname docapi="">BatchableData</classname>
     226        class. For example, there is a <classname docapi="">ReporterData</classname>
     227        class but no corresponding item class. Instead there is a 
     228        batcher implementation, <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>,
     229        which takes care of more or less the same things that an item class does,
     230        but it also takes care of its own SQL generation and JDBC calls that
     231        bypass Hibernate and the caching system.
     232      </para>
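      <para>
        The split between data classes and item classes can be sketched as follows.
        This is a minimal illustration, not the actual BASE code: the names
        <code>ExampleUserData</code> and <code>ExampleUser</code> are hypothetical,
        and the permission check is reduced to a boolean flag.
      </para>

```java
/** Hypothetical data class: plain properties, no logic, no checks. */
class ExampleUserData {
    private String name;
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

/** Hypothetical item class: wraps the data class and adds the checks. */
class ExampleUser {
    private final ExampleUserData data;
    private final boolean callerHasWritePermission;

    ExampleUser(ExampleUserData data, boolean callerHasWritePermission) {
        this.data = data;
        this.callerHasWritePermission = callerHasWritePermission;
    }

    String getName() { return data.getName(); }

    void setName(String name) {
        // Permission check: only callers with WRITE permission may modify.
        if (!callerHasWritePermission) {
            throw new SecurityException("WRITE permission denied");
        }
        // Data validation: the name is a required property.
        if (name == null) {
            throw new IllegalArgumentException("name is required");
        }
        data.setName(name);
    }
}
```

      <para>
        Client code in this sketch only ever touches <code>ExampleUser</code>, so
        every change passes through the permission check and the validation before
        it reaches the data object.
      </para>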
     234      <bridgehead>More information</bridgehead>
     235      <itemizedlist>
     236        <listitem>
     237          <para>
     238          <xref linkend="core_ref.rules.datalayer" />
     239          </para>
     240        </listitem>
     241        <listitem>
     242          <para>
     243          <xref linkend="core_ref.rules.itemclass" />
     244          </para>
     245        </listitem>
     246        <listitem>
     247          <para>
     248          <xref linkend="core_ref.rules.batchclass" />
     249          </para>
     250        </listitem>
     251        <listitem>
     252          <para>
     253          <xref linkend="core_ref.accesspermissions" />
     254          </para>
     255        </listitem>
     256        <listitem>
     257          <para>
     258          <xref linkend="core_ref.datavalidation" />
     259          </para>
     260        </listitem>
     261        <listitem>
     262          <para>
     263          <xref linkend="core_ref.batch" />
     264          </para>
     265        </listitem>
     266      </itemizedlist>
     267    </sect1>
     269    <sect1 id="base_develop_overview.queryapi">
     270      <title>The Query API</title>
     271      <para>
     272        The Query API is used to build and execute queries against the data in the
     273        database. It builds a query by using objects that represent certain
     274        operations. For example, there is an <classname
     275        docapi="net.sf.basedb.core.query">EqRestriction</classname> object
     276        which tests if two expressions are equal and there is an <classname
     277        docapi="net.sf.basedb.core.query">AddExpression</classname>
     278        object which adds two expressions. In this way it is possible to build
     279        very complex queries without using SQL or HQL.
     280      </para>
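      <para>
        The idea of composing a query from objects can be sketched with a few toy
        classes. These are simplified stand-ins written for this overview, not the
        real classes in <code>net.sf.basedb.core.query</code>. All names prefixed
        with <code>Toy</code> and the <code>toQl()</code> method are hypothetical.
      </para>

```java
/** Toy stand-in for an expression (something that yields a value). */
interface ToyExpression {
    String toQl();
}

/** Toy stand-in for a restriction (something that yields true or false). */
interface ToyRestriction {
    String toQl();
}

/** A reference to a named property. */
class ToyProperty implements ToyExpression {
    private final String name;
    ToyProperty(String name) { this.name = name; }
    public String toQl() { return name; }
}

/** An integer constant. */
class ToyInteger implements ToyExpression {
    private final int value;
    ToyInteger(int value) { this.value = value; }
    public String toQl() { return Integer.toString(value); }
}

/** Adds two expressions. */
class ToyAdd implements ToyExpression {
    private final ToyExpression left;
    private final ToyExpression right;
    ToyAdd(ToyExpression left, ToyExpression right) {
        this.left = left;
        this.right = right;
    }
    public String toQl() { return "(" + left.toQl() + " + " + right.toQl() + ")"; }
}

/** Tests whether two expressions are equal. */
class ToyEq implements ToyRestriction {
    private final ToyExpression left;
    private final ToyExpression right;
    ToyEq(ToyExpression left, ToyExpression right) {
        this.left = left;
        this.right = right;
    }
    public String toQl() { return left.toQl() + " = " + right.toQl(); }
}
```

      <para>
        With these classes,
        <code>new ToyEq(new ToyAdd(new ToyProperty("a"), new ToyInteger(1)), new ToyProperty("b")).toQl()</code>
        yields the string <code>(a + 1) = b</code>. The sketch only renders text and
        ignores parameter binding and dialect differences.
      </para>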
     282      <para>
     283        The Query API knows how to work both via Hibernate and via SQL. In the first case it
     284        generates HQL (Hibernate Query Language) statements which Hibernate then
     285        translates into SQL. In the second case SQL is generated directly.
     286        In most cases HQL and SQL are identical, but not
     287        always. Some situations are solved by having the Query API generate
     288        slightly different query strings (with the help of information from
     289        Hibernate and the DbEngine). Some query elements can only be used
     290        with one of the query types.
     291      </para>
     293      <note>
     294        The object-based approach makes it a bit difficult to store
     295        a query for later reuse. The <code>net.sf.basedb.util.jep</code>
     296        package contains an expression parser that can be used to convert
     297        a string to <interfacename
     298        docapi="net.sf.basedb.core.query">Restriction</interfacename> and
     299        <interfacename
     300        docapi="net.sf.basedb.core.query">Expression</interfacename> objects for
     301        the Query API. While it doesn't cover 100% of the cases it should be
     302        useful for the <code>WHERE</code> part of a query.
     303      </note>
     305      <bridgehead>More information</bridgehead>
     306      <itemizedlist>
     307        <listitem>
     308          <para>
     309          <xref linkend="api_overview.query_api" />
     310          </para>
     311        </listitem>
     312      </itemizedlist>
     313    </sect1>
     315    <sect1 id="base_develop_overview.controllerapi">
     316      <title>The Controller API</title>
     317      <para>
     318        The Controller API is the very heart of the BASE 2 system. This part
     319        of the core is used for boring but essential details, such as
     320        user authentication, database connection management, transaction
     321        management, data validation, and more. We don't describe this
     322        part further here, but recommend reading the documents below.
     323      </para>
     325      <bridgehead>More information</bridgehead>
     326      <itemizedlist>
     327        <listitem>
     328          <para>
     329          <xref linkend="core_ref.coreinternals" />
     330          </para>
     331        </listitem>
     332      </itemizedlist>
     333    </sect1>
     335    <sect1 id="base_develop_overview.plugins">
     336      <title>Plug-ins</title>
     338      <para>
     339        From the core code's point of view a plug-in is just another client
     340        application. A plug-in doesn't have more powers and doesn't have
     341        access to some special API that allows it to do cool stuff that other
     342        clients can't.
     343      </para>
     345      <para>
     346        However, the core must be able to control when and where a plug-in is
     347        executed. Some plug-ins may take a long time doing their calculations
     348        and may use a lot of memory. It would be bad if several users started
     349        to execute a resource-demanding plug-in at the same time. This problem is
     350        solved by adding a job queue. Each plug-in that should be executed is
     351        registered as a <classname
     352        docapi="net.sf.basedb.core">Job</classname> in the database. A job controller
     353        checks the job queue at regular intervals. The job controller can then
     354        choose if it should execute the plug-in or wait depending on the current
     355        load on the server.
     356      </para>
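      <para>
        The queueing behaviour described above can be sketched as follows. This is
        a simplified, single-threaded illustration under stated assumptions: the
        classes <code>ToyJob</code> and <code>ToyJobController</code> are
        hypothetical, the server load is passed in as a plain number, and the queue
        lives in memory rather than in the database.
      </para>

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical job: a name plus the plug-in work to run. */
class ToyJob {
    final String name;
    final Runnable work;
    ToyJob(String name, Runnable work) {
        this.name = name;
        this.work = work;
    }
}

/** Hypothetical controller that runs queued jobs only while the load permits. */
class ToyJobController {
    private final Queue<ToyJob> queue = new ArrayDeque<>();
    private final int maxLoad;

    ToyJobController(int maxLoad) { this.maxLoad = maxLoad; }

    /** Registers a job for later execution. */
    void register(ToyJob job) { queue.add(job); }

    /** Number of jobs still waiting in the queue. */
    int pending() { return queue.size(); }

    /**
     * One pass of the periodic check. Runs the oldest job if the reported
     * load is below the limit, otherwise leaves the queue untouched.
     * Returns true if a job was executed.
     */
    boolean checkQueue(int currentLoad) {
        if (queue.isEmpty() || currentLoad >= maxLoad) {
            return false;
        }
        queue.remove().work.run();
        return true;
    }
}
```

      <para>
        In this sketch a caller would invoke <code>checkQueue()</code> from a timer.
        A job registered while the load is too high simply stays in the queue until
        a later pass finds the load below the limit.
      </para>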
     358      <note>
     359        BASE ships with two types of job controllers. One internal that runs
     360        inside the web application, and one external that is designed to run
     361        on separate servers, so-called job agents. The internal job controller
     362        should work fine in most cases. The drawback with this controller is
     363        that a badly written plug-in may crash the entire web server. For example,
     364        a call to <code>System.exit()</code> in the plug-in code shuts down Tomcat
     365        as well.
     366      </note>
     368      <bridgehead>More information</bridgehead>
     369      <itemizedlist>
     370        <listitem>
     371          <para>
     372          <xref linkend="plugin_developer" />
     373          </para>
     374        </listitem>
     375        <listitem>
     376          <para>
     377          <xref linkend="core_ref.pluginexecution" />
     378          </para>
     379        </listitem>
     380      </itemizedlist>
     381    </sect1>
     383    <sect1 id="base_develop_overview.clients">
     384      <title>Client applications</title>
     385      <para>
     386        Client applications are applications that use the BASE Core API. The current web
     387        application is built with Java Server Pages (JSP). It is supported by several
     388        application servers but we have only tested it with Tomcat. Other client
     389        applications are the external job agents that execute plug-ins on separate
     390        servers, and the migration tool that migrates data from a BASE 1.2.x installation
     391        to BASE 2.
     392      </para>
     394      <para>
     395        Although it is possible to develop a completely new client application from
     396        scratch, we don't see this as a likely thing to happen. Instead, there are
     397        some other possibilities to access data in BASE and to extend the functionality
     398        in BASE.
     399      </para>
     401      <para>
     402        The first possibility is to use the Web Service API. This allows you to access
     403        some of the data in the BASE database and download it for further use. The
     404        Web Service API is currently very limited but it is not hard to extend it
     405        to cover more use cases.
     406      </para>
     408      <para>
     409        A second possibility is to use the Extension API. This allows a developer to
     410        add functionality that appears directly in the web interface, for example
     411        additional menu items and toolbar buttons. This API is also easy to extend to
     412        cover more use cases.
     413      </para>
     415      <bridgehead>More information</bridgehead>
     416      <itemizedlist>
     417        <listitem>
     418          <para>
     419          <xref linkend="webservices" />
     420          </para>
     421        </listitem>
     422        <listitem>
     423          <para>
     424          <xref linkend="extensions_developer" />
     425          </para>
     426        </listitem>
     427        <listitem>
     428          <para>
     429          The <ulink url="">BASE plug-ins site</ulink> also
     430          has examples of extensions and web services implementations.
     431          </para>
     432        </listitem>
     433      </itemizedlist>
     434    </sect1>
