source: trunk/doc/src/docbook/developer/base_overview.xml @ 5738

Last change on this file since 5738 was 5738, checked in by Nicklas Nordborg, 10 years ago

References #1590: Documentation cleanup

Reorganized the file/directory structure of the documentation to make the paths a little shorter.

  • Property svn:eol-style set to native
  • Property svn:keywords set to Id
File size: 16.5 KB
Line 
1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3    "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4    "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd">
5<!--
6  $Id: base_overview.xml 5738 2011-09-15 06:53:11Z nicklas $
7
8  Copyright (C) 2007 Nicklas Nordborg, Martin Svensson
9
10  This file is part of BASE - BioArray Software Environment.
11  Available at http://base.thep.lu.se/
12
13  BASE is free software; you can redistribute it and/or
14  modify it under the terms of the GNU General Public License
15  as published by the Free Software Foundation; either version 3
16  of the License, or (at your option) any later version.
17
18  BASE is distributed in the hope that it will be useful,
19  but WITHOUT ANY WARRANTY; without even the implied warranty of
20  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
21  GNU General Public License for more details.
22
23  You should have received a copy of the GNU General Public License
24  along with BASE. If not, see <http://www.gnu.org/licenses/>.
25-->
26
27<chapter id="base_develop_overview" chunked="0">
28  <title>Developer overview of BASE</title>
29
30  <para>
31    This section gives a brief overview of the architecture used
32    in BASE. It is a good starting point if you need to know how
33    the various parts of BASE are glued together. The figure below
34    shows most of the important parts of BASE. The following
35    sections briefly describe some parts of the figure
36    and give you pointers for further reading if you are interested in
37    the details.
38  </para>
39 
40    <figure id="develop.figures.overview">
41      <title>Overview of the BASE application</title>
42      <screenshot>
43        <mediaobject>
44          <imageobject>
45            <imagedata 
46              align="center"
47              scalefit="1" width="100%"
48              fileref="figures/uml/base.overview.png" format="PNG" />
49          </imageobject>
50        </mediaobject>
51      </screenshot>
52    </figure>
53   
54    <sect1 id="base_develop_overview.database">
55      <title>Fixed vs. dynamic database</title>
56
57      <para>
58        BASE stores most of its data in a database. The database is divided into
59        two parts, one fixed and one dynamic.
60      </para>
61   
62      <para>
63        The fixed part contains tables that correspond
64        to the various items found in BASE. There is, for example, one table
65        for users, one table for groups and one table for reporters. Some items
66        share the same table. Biosources, samples, extracts and labeled extracts are
67        all biomaterials and share the <code>BioMaterials</code> table. Access
68        to the fixed part of the database goes through Hibernate in most cases,
69        or through the Batch API in some cases (for example, access to reporters).
70      </para>
71     
72      <para>
73        The dynamic part of the database contains tables for storing analyzed data.
74        Each experiment has its own set of tables and it is not possible to mix data
75        from two experiments. The dynamic part of the database can only be accessed
76        through the Batch API and the Query API, which use SQL and JDBC.
77      </para>
78
79      <note>
80        The actual location of the two parts depends on the database that is used.
81        MySQL uses two separate databases while PostgreSQL uses one database with two schemas.
82      </note>
83
84      <bridgehead>More information</bridgehead>
85      <itemizedlist>
86        <listitem>
87          <para>
88          <xref linkend="api_overview.dynamic_and_batch_api" />
89          </para>
90        </listitem>
91      </itemizedlist>
92   
93    </sect1>
94
95    <sect1 id="base_develop_overview.hibernate">
96      <title>Hibernate and the DbEngine</title>
97   
98      <para>
99        Hibernate (<ulink url="http://www.hibernate.org">www.hibernate.org</ulink>) is an
100        object/relational mapping software package. It takes plain Java objects
101        and stores them in a database. All we have to do is to set the properties
102        on the objects (for example: <code>user.setName("A name")</code>). Hibernate
103        will take care of the SQL generation and database communication for us.
104        This is not a magic or automatic process. We have to provide mapping
105        information about which objects go into which tables and which properties
106        go into which columns, as well as settings for things like caching and proxies.
107        This is done by annotating the code with Javadoc comments. The classes
108        that are mapped to the database are found in the <code>net.sf.basedb.core.data</code> 
109        package, which is shown as the <guilabel>Data classes</guilabel> box in the image above.
110        The <classname docapi="net.sf.basedb.core">HibernateUtil</classname> class contains a
111        lot of functionality for interacting with Hibernate.
112      </para>
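      <para>
        As a minimal sketch of what Javadoc-based mapping information can look
        like, the hypothetical class below assumes XDoclet-style
        <code>@hibernate</code> tags. The class name, table name and column names
        are made up for illustration and are not taken from the BASE source code.
      </para>
      <programlisting language="java"><![CDATA[
```java
/**
 * Hypothetical data class, not part of BASE. The Javadoc tags assume
 * XDoclet-style Hibernate mapping and exist only for illustration.
 *
 * @hibernate.class table="ExampleUsers" lazy="false"
 */
class ExampleUserData {
    private int id;
    private String name;

    /**
     * Primary key, generated by the database.
     * @hibernate.id column="id" generator-class="native"
     */
    public int getId() { return id; }
    void setId(int id) { this.id = id; }

    /**
     * A required string property stored in the "name" column.
     * @hibernate.property column="name" type="string" length="255" not-null="true"
     */
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```
]]></programlisting>
      <para>
        With a mapping like this in place, application code only calls
        <code>setName("A name")</code>. The SQL is generated from the mapping
        information.
      </para>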
113   
114      <para>
115        Hibernate supports many different database systems. In theory, this means
116        that BASE should work with all those databases. However, in practice we have
117        found that this is not the case. For example, Oracle converts empty strings
118        to <code>null</code> values, which breaks some parts of our code that
119        expect non-null values. Another difficulty is that our Batch API and some parts of
120        the Query API generate native SQL as well. We try to use database dialect information
121        from Hibernate, but it is not always possible. The <interfacename 
122        docapi="net.sf.basedb.core.dbengine">DbEngine</interfacename> contains code
123        for generating the SQL that Hibernate can't help us with. We have implemented
124        a generic <classname docapi="net.sf.basedb.core.dbengine">DefaultDbEngine</classname>
125        which follows ANSI specifications and special drivers for MySQL
126        (<classname docapi="net.sf.basedb.core.dbengine">MySQLEngine</classname>) and
127        PostgreSQL (<classname docapi="net.sf.basedb.core.dbengine">PostgresDbEngine</classname>).
128        We don't expect BASE to work with other databases without modifications.
129      </para> 
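      <para>
        To illustrate the kind of difference an engine abstraction can hide,
        the sketch below builds the same <code>INSERT</code> statement with
        ANSI-style and MySQL-style identifier quoting. The interface and class
        names are hypothetical. This is not the real <code>DbEngine</code>
        interface, and identifier quoting is only an assumed example of a
        database-specific detail.
      </para>
      <programlisting language="java"><![CDATA[
```java
/** Hypothetical stand-in for a database engine abstraction; not the real DbEngine. */
interface SqlEngineSketch {
    /** Quote an identifier so it can be used as a table or column name. */
    String quote(String identifier);

    /** Build a parameterised INSERT statement for the given table and columns. */
    default String insertSql(String table, String... columns) {
        StringBuilder names = new StringBuilder();
        StringBuilder params = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                names.append(", ");
                params.append(", ");
            }
            names.append(quote(columns[i]));
            params.append("?");
        }
        return "INSERT INTO " + quote(table) + " (" + names + ") VALUES (" + params + ")";
    }
}

/** ANSI SQL quotes identifiers with double quotes. */
class AnsiEngineSketch implements SqlEngineSketch {
    public String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }
}

/** MySQL quotes identifiers with backticks by default. */
class MySqlEngineSketch implements SqlEngineSketch {
    public String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }
}
```
]]></programlisting>
      <para>
        Calling <code>insertSql("Reporters", "id", "name")</code> on the two
        implementations yields statements that differ only in the quote
        characters, so the rest of the code does not need to know which
        database is in use.
      </para>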
130       
131      <bridgehead>More information</bridgehead>
132      <itemizedlist>
133        <listitem>
134          <para>
135          <xref linkend="core_ref.rules.datalayer" />
136          </para>
137        </listitem>
138        <listitem>
139          <para>
140          <ulink url="http://www.hibernate.org">www.hibernate.org</ulink>
141          </para>
142        </listitem>
143      </itemizedlist>
144
145    </sect1>
146
147    <sect1 id="base_develop_overview.batchapi">
148      <title>The Batch API</title>
149   
150      <para>
151        Hibernate comes with a price. It affects performance and uses a lot
152        of memory. This means that those parts of BASE that often handle
153        lots of items at the same time do not work well with Hibernate.
154        Examples are reporters, array design features and raw data. We
155        have created the Batch API to solve these problems.
156      </para>
157
158      <para>
159        The Batch API uses JDBC and SQL directly against the database. However, we
160        still use metadata and database dialect information available from Hibernate
161        to generate most of the SQL we need. In theory, this should make the Batch API
162        just as database-independent as Hibernate is. In practice there is some information
163        that we can't extract from Hibernate so we have implemented a simple
164        <interfacename docapi="net.sf.basedb.core.dbengine">DbEngine</interfacename>
165        to account for the missing pieces. The Batch API can be used for any
166        <classname docapi="net.sf.basedb.core.data">BatchableData</classname> class in the
167        fixed part of the database and is the only way to add data to the dynamic part.
168      </para>
169 
170      <note>
171        The main reason for the Batch API is to avoid the internal caching
172        of Hibernate, which uses a lot of memory when handling thousands of items.
173        Hibernate 3.1 introduced a new stateless API which, among other things, does not
174        do any caching. This version was released after we had created the Batch API.
175        We ran a few tests to check if it would be better for us to switch back to Hibernate
176        but found that it didn't perform as well as our own Batch API (it was about twice as slow).
177        In any case, we can never get Hibernate to work with the dynamic database,
178        so the Batch API is needed.
179      </note>
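      <para>
        The memory argument above can be sketched as follows. The class is
        hypothetical and is not the BASE Batch API. It only shows the general
        idea of buffering a bounded number of items and writing them out in
        chunks instead of keeping every item in a cache. The
        <code>writer</code> callback stands in for a JDBC batch insert.
      </para>
      <programlisting language="java"><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of bounded-memory batching; not the BASE Batch API. */
class BufferingBatcherSketch<T> {
    private final int batchSize;
    private final Consumer<List<T>> writer; // stands in for a JDBC batch insert
    private final List<T> buffer = new ArrayList<>();
    private int totalWritten = 0;

    BufferingBatcherSketch(int batchSize, Consumer<List<T>> writer) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.writer = writer;
    }

    /** Queue one item; write the buffer out as soon as it is full. */
    void insert(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Write any buffered items and release them so they can be garbage collected. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        writer.accept(new ArrayList<>(buffer)); // hand over a copy of the chunk
        totalWritten += buffer.size();
        buffer.clear();
    }

    int getTotalWritten() { return totalWritten; }

    int getBuffered() { return buffer.size(); }
}
```
]]></programlisting>
      <para>
        At most <code>batchSize</code> items are held in memory at any time,
        regardless of how many items are inserted in total.
      </para>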
180     
181      <bridgehead>More information</bridgehead>
182      <itemizedlist>
183        <listitem>
184          <para>
185          <xref linkend="api_overview.dynamic_and_batch_api" />
186          </para>
187        </listitem>
188        <listitem>
189          <para>
190          <xref linkend="core_ref.rules.batchclass" />
191          </para>
192        </listitem>
193        <listitem>
194          <para>
195          <xref linkend="core_ref.batch" />
196          </para>
197        </listitem>
198      </itemizedlist>
199    </sect1>
200   
201    <sect1 id="base_develop_overview.classes">
202      <title>Data classes vs. item classes</title>
203   
204      <para>
205        The data classes are, with few exceptions, for internal use. These are the classes
206        that are mapped to the database with Hibernate mapping files. They are very simple
207        and contain no logic at all. They don't do any permission checks or any data
208        validation.
209      </para>
210 
211      <para>
212        Most of the data classes have a corresponding item class. For example:
213        <classname docapi="net.sf.basedb.core.data">UserData</classname>
214        and <classname docapi="net.sf.basedb.core">User</classname>,
215        <classname docapi="net.sf.basedb.core.data">GroupData</classname> and
216        <classname docapi="net.sf.basedb.core">Group</classname>.
217        The item classes are what the client applications can see and use. They contain
218        logic for permission checking (for example, whether the logged-in user has WRITE permission)
219        and data validation (for example, rejecting an attempt to set a required property to null).
220      </para>
221     
222      <para>
223        The exceptions to the above scheme are the batchable classes, which are
224        all subclasses of the <classname docapi="net.sf.basedb.core.data">BatchableData</classname>
225        class. For example, there is a <classname docapi="net.sf.basedb.core.data">ReporterData</classname>
226        class but no corresponding item class. Instead there is a
227        batcher implementation, <classname docapi="net.sf.basedb.core">ReporterBatcher</classname>,
228        which takes care of more or less the same things that an item class does,
229        but it also takes care of its own SQL generation and JDBC calls that
230        bypass Hibernate and the caching system.
231      </para>
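      <para>
        The split between data classes and item classes can be sketched as
        below. Both classes are hypothetical and heavily simplified. In
        particular, the boolean <code>canWrite</code> flag is an assumption
        that stands in for a real permission lookup.
      </para>
      <programlisting language="java"><![CDATA[
```java
/** Hypothetical data class: plain state, no logic, no checks. */
class SampleDataSketch {
    String name;
}

/** Hypothetical item class: wraps the data class and adds the checks. */
class SampleItemSketch {
    private final SampleDataSketch data;
    private final boolean canWrite; // stands in for a real permission lookup

    SampleItemSketch(SampleDataSketch data, boolean canWrite) {
        this.data = data;
        this.canWrite = canWrite;
    }

    String getName() {
        return data.name;
    }

    void setName(String name) {
        // Permission check: the caller must be allowed to modify the item.
        if (!canWrite) {
            throw new SecurityException("WRITE permission denied");
        }
        // Data validation: a required property must not be null.
        if (name == null) {
            throw new IllegalArgumentException("name is required");
        }
        data.name = name;
    }
}
```
]]></programlisting>
      <para>
        Client code only ever holds the item object, so every change passes
        through the checks before it reaches the data object.
      </para>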
232 
233      <bridgehead>More information</bridgehead>
234      <itemizedlist>
235        <listitem>
236          <para>
237          <xref linkend="core_ref.rules.datalayer" />
238          </para>
239        </listitem>
240        <listitem>
241          <para>
242          <xref linkend="core_ref.rules.itemclass" />
243          </para>
244        </listitem>
245        <listitem>
246          <para>
247          <xref linkend="core_ref.rules.batchclass" />
248          </para>
249        </listitem>
250        <listitem>
251          <para>
252          <xref linkend="core_ref.accesspermissions" />
253          </para>
254        </listitem>
255        <listitem>
256          <para>
257          <xref linkend="core_ref.datavalidation" />
258          </para>
259        </listitem>
260        <listitem>
261          <para>
262          <xref linkend="core_ref.batch" />
263          </para>
264        </listitem>
265      </itemizedlist>
266    </sect1>
267
268    <sect1 id="base_develop_overview.queryapi">
269      <title>The Query API</title>
270      <para>
271        The Query API is used to build and execute queries against the data in the
272        database. It builds a query by using objects that represent certain
273        operations. For example, there is an <classname 
274        docapi="net.sf.basedb.core.query">EqRestriction</classname> object
275        which tests if two expressions are equal and there is an <classname 
276        docapi="net.sf.basedb.core.query">AddExpression</classname>
277        object which adds two expressions. In this way it is possible to build
278        very complex queries without using SQL or HQL.
279      </para>
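      <para>
        The object-based approach can be sketched with a few hypothetical
        classes. They are not the classes in
        <code>net.sf.basedb.core.query</code> and their method names are
        assumptions. The sketch only shows how a tree of objects can be
        rendered into a query fragment.
      </para>
      <programlisting language="java"><![CDATA[
```java
/** Hypothetical query element that can render itself as query text. */
interface ExprSketch {
    String toQl();
}

/** A reference to a column or property. */
class ColumnSketch implements ExprSketch {
    private final String name;
    ColumnSketch(String name) { this.name = name; }
    public String toQl() { return name; }
}

/** An integer constant. */
class IntSketch implements ExprSketch {
    private final int value;
    IntSketch(int value) { this.value = value; }
    public String toQl() { return Integer.toString(value); }
}

/** Adds two expressions. */
class AddSketch implements ExprSketch {
    private final ExprSketch left;
    private final ExprSketch right;
    AddSketch(ExprSketch left, ExprSketch right) { this.left = left; this.right = right; }
    public String toQl() { return "(" + left.toQl() + " + " + right.toQl() + ")"; }
}

/** Tests whether two expressions are equal. */
class EqSketch implements ExprSketch {
    private final ExprSketch left;
    private final ExprSketch right;
    EqSketch(ExprSketch left, ExprSketch right) { this.left = left; this.right = right; }
    public String toQl() { return left.toQl() + " = " + right.toQl(); }
}
```
]]></programlisting>
      <para>
        For example, <code>new EqSketch(new AddSketch(new ColumnSketch("ch1"),
        new ColumnSketch("ch2")), new IntSketch(100)).toQl()</code> returns
        <code>(ch1 + ch2) = 100</code>.
      </para>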
280     
281      <para>
282        The Query API knows how to work both via Hibernate and via SQL. In the first case it
283        generates HQL (Hibernate Query Language) statements which Hibernate then
284        translates into SQL. In the second case SQL is generated directly.
285        In most cases HQL and SQL are identical, but not
286        always. Some situations are solved by having the Query API generate
287        slightly different query strings (with the help of information from
288        Hibernate and the DbEngine). Some query elements can only be used
289        with one of the query types.
290      </para>
291     
292      <note>
293        The object-based approach makes it a bit difficult to store
294        a query for later reuse. The <code>net.sf.basedb.util.jep</code> 
295        package contains an expression parser that can be used to convert
296        a string to <interfacename 
297        docapi="net.sf.basedb.core.query">Restriction</interfacename> and
298        <interfacename 
299        docapi="net.sf.basedb.core.query">Expression</interfacename> objects for
300        the Query API. Although it does not cover every case, it should be
301        useful for the <code>WHERE</code> part of a query.
302      </note>
303     
304      <bridgehead>More information</bridgehead>
305      <itemizedlist>
306        <listitem>
307          <para>
308          <xref linkend="api_overview.query_api" />
309          </para>
310        </listitem>
311      </itemizedlist>
312    </sect1>
313
314    <sect1 id="base_develop_overview.controllerapi">
315      <title>The Controller API</title>
316      <para>
317        The Controller API is the very heart of the BASE system. This part
318        of the core is used for boring but essential details, such as
319        user authentication, database connection management, transaction
320        management, data validation, and more. We do not describe this
321        part further here, but recommend reading the documents below.
322      </para>
323     
324      <bridgehead>More information</bridgehead>
325      <itemizedlist>
326        <listitem>
327          <para>
328          <xref linkend="core_ref.coreinternals" />
329          </para>
330        </listitem>
331      </itemizedlist>
332    </sect1>
333
334    <sect1 id="base_develop_overview.plugins">
335      <title>Plug-ins</title>
336   
337      <para>
338        From the core code's point of view a plug-in is just another client
339        application. A plug-in doesn't have more powers and doesn't have
340        access to some special API that allows it to do cool stuff that other
341        clients can't.
342      </para>
343
344      <para>
345        However, the core must be able to control when and where a plug-in is
346        executed. Some plug-ins may take a long time doing their calculations
347        and may use a lot of memory. It would be bad if several users started
348        to execute a resource-demanding plug-in at the same time. This problem is
349        solved by adding a job queue. Each plug-in that should be executed is
350        registered as a <classname 
351        docapi="net.sf.basedb.core">Job</classname> in the database. A job controller
352        checks the job queue at regular intervals. The job controller can then
353        choose whether to execute the plug-in or wait, depending on the current
354        load on the server.
355      </para>
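      <para>
        The job queue idea can be sketched as below. The class is hypothetical
        and is not the BASE job controller. It uses the number of running jobs
        as an assumed, simplified measure of the server load, and each call to
        <code>poll()</code> represents one of the regular checks of the queue.
      </para>
      <programlisting language="java"><![CDATA[
```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a job controller; not the BASE implementation. */
class JobControllerSketch {
    private final Deque<String> queue = new ArrayDeque<>();
    private final List<String> running = new ArrayList<>();
    private final int maxRunning; // simplified stand-in for a load limit

    JobControllerSketch(int maxRunning) {
        this.maxRunning = maxRunning;
    }

    /** Register a job; it waits in the queue until the controller starts it. */
    void submit(String jobName) {
        queue.addLast(jobName);
    }

    /** One polling pass: start queued jobs while capacity allows. */
    List<String> poll() {
        List<String> started = new ArrayList<>();
        while (!queue.isEmpty() && running.size() < maxRunning) {
            String job = queue.removeFirst();
            running.add(job);
            started.add(job);
        }
        return started;
    }

    /** Mark a job as finished so that its slot becomes available again. */
    void finished(String jobName) {
        running.remove(jobName);
    }

    int waiting() {
        return queue.size();
    }
}
```
]]></programlisting>
      <para>
        With <code>maxRunning</code> set to 2 and three submitted jobs, the
        first pass starts two jobs and leaves the third waiting until one of
        the running jobs has finished.
      </para>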
356 
357      <note>
358        BASE ships with two types of job controllers: an internal one that runs
359        inside the web application, and an external one that is designed to run
360        on separate servers, so-called job agents. The internal job controller
361        should work fine in most cases. The drawback with this controller is
362        that a badly written plug-in may crash the entire web server. For example,
363        a call to <code>System.exit()</code> in the plug-in code shuts down Tomcat
364        as well.
365      </note>
366
367      <bridgehead>More information</bridgehead>
368      <itemizedlist>
369        <listitem>
370          <para>
371          <xref linkend="plugin_developer" />
372          </para>
373        </listitem>
374        <listitem>
375          <para>
376          <xref linkend="core_ref.pluginexecution" />
377          </para>
378        </listitem>
379      </itemizedlist>
380    </sect1>
381
382    <sect1 id="base_develop_overview.clients">
383      <title>Client applications</title>
384      <para>
385        Client applications are applications that use the BASE Core API. The current web
386        application is built with Java Server Pages (JSP). JSP is supported by several
387        application servers but we have only tested it with Tomcat. Another client
388        application is the external job agent, which executes plug-ins on separate
389        servers.
390      </para>
391     
392      <para>
393        Although it is possible to develop a completely new client application from
394        scratch, we don't see this as a likely thing to happen. Instead, there are
395        some other possibilities to access data in BASE and to extend the functionality
396        in BASE.
397      </para>
398     
399      <para>
400        The first possibility is to use the Web Service API. This allows you to access
401        some of the data in the BASE database and download it for further use. The
402        Web Service API is currently very limited but it is not hard to extend it
403        to cover more use cases.
404      </para>
405     
406      <para>
407        A second possibility is to use the Extension API. This allows a developer to
408        add functionality that appears directly in the web interface, for example
409        additional menu items and toolbar buttons. This API is also easy to extend to
410        cover more use cases.
411      </para>
412     
413      <bridgehead>More information</bridgehead>
414      <itemizedlist>
415        <listitem>
416          <para>
417          <xref linkend="webservices" />
418          </para>
419        </listitem>
420        <listitem>
421          <para>
422          <xref linkend="extensions_developer" />
423          </para>
424        </listitem>
425        <listitem>
426          <para>
427          The <ulink url="http://baseplugins.thep.lu.se">BASE plug-ins site</ulink> also
428          has examples of extensions and web services implementations.
429          </para>
430        </listitem>
431      </itemizedlist>
432    </sect1>
433
434</chapter>