<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC
  "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
  "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd"
[
<!ENTITY runplugin.configure.common
  "The top of the window displays the names of the selected plug-in and
  configuration, a list with parameters to the left, an area for input fields to the
  right and buttons to proceed with at the bottom.
  Click on a parameter in the parameter list to show the form fields
  for entering values for the parameter to the right. Parameters
  with an <guilabel>X</guilabel> in front of their names already have a
  value. Parameters marked with a blue rectangle are required and must
  be given a value before it is possible to proceed."
>
]>
<!--
  $Id: import_data.xml 4889 2009-04-06 12:52:39Z nicklas $

  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
  Copyright (C) 2008 Jari Häkkinen

  This file is part of BASE - BioArray Software Environment.
  Available at http://base.thep.lu.se/

  BASE is free software; you can redistribute it and/or
  modify it under the terms of the GNU General Public License
  as published by the Free Software Foundation; either version 3
  of the License, or (at your option) any later version.

  BASE is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with BASE. If not, see <http://www.gnu.org/licenses/>.
-->
<chapter id="import_data" chunked="0">
  <?dbhtml dir="import"?>
  <title>Import of data</title>
  <para>
    In some places the only way to get data into BASE is to import it
    from a file. This typically includes raw data, array design
    features, reporters and other things that would be inconvenient
    to enter by hand due to the large number of data items. There are
    also convenience batch importers for importing other items such as
    biosources, samples, and extracts. The batch importers are
    described later in this chapter after the general import
    description.
  </para>
  <para>
    Normally, a plug-in handles one type of item and may require a
    configuration. For example, the import plug-ins need some
    information about how to find headers and data lines in
    files. BASE ships with a number of import plug-ins as a part of
    the core plug-ins package, cf. <xref linkend="coreplugins.import"
    />. The core plug-in section links to configuration examples for
    some of the plug-ins. Go to
    <menuchoice>
      <guimenu>Administrate</guimenu>
      <guimenuitem>Plugins</guimenuitem>
      <guisubmenu>Definitions</guisubmenu>
    </menuchoice>
    to check which plug-ins are installed on your BASE server. When
    BASE finds a plug-in that supports import of a certain type of
    item, an &gbImport; button is displayed in the toolbar on either
    the list view or the single-item view.
  </para>
  <note>
    <title>Missing/unavailable button</title>
    <para>
      If the import button is missing from a page where you would expect
      to find it, this usually means one of the following:
    </para>
    <itemizedlist>
      <listitem>
        <simpara>
          The logged in user does not have permission to use the plug-in.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          The plug-in requires a configuration, but none has been
          created or the logged in user does not have permission to
          use any of the existing configurations.
        </simpara>
      </listitem>
    </itemizedlist>
    <para>
      Contact the server administrator or another user who has permission to
      administer the plug-ins.
    </para>
  </note>

    <sect1 id="import_data.import">
      <title>General import procedure</title>

    <para>
      A data import is started through a wizard-like interface. There
      are a number of steps you have to go through:
    </para>

    <orderedlist>
      <listitem>
        <simpara>
          Select a plug-in and file format to use, or select the
          auto detect option.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          If you selected the auto detect option, you must select
          a file to use.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Specify plug-in parameters.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Add the import job to the job queue.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Wait for the job to finish.
        </simpara>
      </listitem>
    </orderedlist>

    <sect2 id="import_export_data.import.plugin_fileformat">
      <title>Select plug-in and file format</title>
      <para>
        Click on the &gbImport; button
        in the toolbar to start the import wizard. The first step is to
        select which plug-in and, if supported, which
        file format to use. There is also an <guilabel>auto detect</guilabel>
        option that lets you select a file and have BASE try to find a suitable
        plug-in/file format to use.
      </para>

      <figure id="import_export_data.figures.select_import_plugin">
        <title>Select plug-in and file format</title>
        <screenshot>
          <mediaobject>
            <imageobject><imagedata fileref="figures/select_import_plugin.png" format="PNG" /></imageobject>
          </mediaobject>
        </screenshot>
      </figure>


      <helptext external_id="import.selectplugin"
        title="Select plug-in and file format for data import">

        <variablelist>
          <varlistentry>
            <term><guilabel>Plugin</guilabel></term>
            <listitem>
              <para>
                A list of all plug-ins that are available in the
                current context. The list only includes plug-ins that
                the logged in user has permission to use. If you select
                a plug-in, a short description of it is displayed
                below the lists. More information about the plug-ins can
                be found under the menu choice
                <menuchoice>
                  <guimenu>Administrate</guimenu>
                  <guimenuitem>Plugins</guimenuitem>
                  <guisubmenu>Definitions</guisubmenu>
                </menuchoice>.
              </para>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term><guilabel>File format</guilabel></term>
            <listitem>
              <para>
                A list of the file format configurations
                supported by the selected plug-in. More information
                about the configurations can be found under the menu choice
                <menuchoice>
                  <guimenu>Administrate</guimenu>
                  <guimenuitem>Plugins</guimenuitem>
                  <guisubmenu>Configurations</guisubmenu>
                </menuchoice>.
              </para>

              <note>
                <title>File format vs. Configuration</title>
                <simpara>
                A file format is the same thing as a plug-in configuration.
                It may be confusing that the interface sometimes uses
                <emphasis>file format</emphasis> and sometimes uses
                <emphasis>configuration</emphasis>, but for now, we'll have
                to live with it.
                </simpara>
              </note>

            </listitem>
          </varlistentry>

        </variablelist>

        <para>
          Proceed to the next step by clicking on the
          &gbNext; button.
        </para>

        <seeother>
          <other external_id="import.autodetect">The auto detect function</other>
        </seeother>
      </helptext>

      <sect3 id="import_export_data.import.plugin_fileformat.autodetect">
        <title>The auto detect function</title>

        <helptext
          external_id="import.autodetect"
          title="The auto detect function">

        <para>
          The auto detect function lets you select a file and have
          BASE try to find a suitable plug-in and file format. This option is
          selected by default in both the plug-in and file format lists when there is
          at least one plug-in that supports auto detection.
        </para>
        <note>
          <title>Support of auto detect</title>
          <para>
            Not all plug-ins support auto detection. The ones that do are marked in
            the list with <guilabel>×</guilabel>.
          </para>
        </note>

        <para>
          To use this feature, select the auto detect option either for
          both plug-ins and file formats or only for file formats.
          Continue to the next step by clicking on the &gbNext;
          button.
        </para>

        <seeother>
          <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
          <other external_id="import.autodetect.selectfile">Select file for auto detection</other>
        </seeother>

        </helptext>

        <para>
          You must now select a file to import from.
        </para>

        <figure id="import_export_data.figures.select_autodetect_file">
          <title>Select file for auto detection</title>
          <screenshot>
            <mediaobject>
              <imageobject><imagedata fileref="figures/select_autodetect_file.png" format="PNG" /></imageobject>
            </mediaobject>
          </screenshot>
        </figure>

        <helptext external_id="import.autodetect.selectfile"
          title="Select file for auto detection">

          <variablelist>
            <varlistentry>
              <term><guilabel>File</guilabel></term>
              <listitem>
                <para>
                  Enter the path and file name for the
                  file you want to use. Use the <guibutton>Browse&hellip;</guibutton>
                  button to browse for the file in BASE's file system.
                  If the file does not exist in the file system you have the option
                  to upload it.
                  <nohelp>Read more about this in <xref linkend="file_system" />.</nohelp>
                </para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term><guilabel>Recently used</guilabel></term>
              <listitem>
                <para>
                  A list of files you have recently used
                  for auto detection.
                </para>
              </listitem>
            </varlistentry>
          </variablelist>

          <para>
            Click on the &gbNext; button
            to start the auto detection.
          </para>

          <para>
            If the auto detection finds exactly one plug-in and file format,
            the next step is to configure any additional parameters needed
            by the plug-in. This is the same step as if you had selected
            the same plug-in and file format in the first step.
            If no plug-in can be found, an error message is displayed.
          </para>

          <note>
            <title>More than one compatible plug-in/file format</title>
            <para>
              If more than one matching plug-in or file format is found,
              you will be taken back to the first step. This time
              the lists will only include the matching plug-ins/file formats
              and the auto detect option is not present.
            </para>
          </note>

          <seeother>
            <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
            <other external_id="import.autodetect">The auto detect function</other>
          </seeother>

        </helptext>

      </sect3>

    </sect2>

    <sect2 id="import_export_data.import.pluginparameters">
      <title>Specify plug-in parameters</title>
      <para>
        When you have selected a plug-in and file format, or used
        the auto detect function to find one, a form where
        you can enter additional parameters for the plug-in is displayed.
      </para>

      <figure id="import_export_data.figures.confiure_plugin">
        <title>Specify plug-in parameters</title>
        <screenshot>
          <mediaobject>
            <imageobject>
              <imagedata
                scalefit="1" width="100%"
                fileref="figures/plugin_parameters.png" format="PNG" />
            </imageobject>
          </mediaobject>
        </screenshot>
      </figure>

      <helptext external_id="runplugin.configure.import"
        title="Specify plug-in parameters">
      <para>
        &runplugin.configure.common;
      </para>

      <para>
        The parameter list varies considerably from plug-in to plug-in.
        Common parameters for import plug-ins are:
      </para>

      <variablelist>
        <varlistentry>
          <term><guilabel>File</guilabel></term>
          <listitem>
            <para>
            The file to import data from. A value is already set if
            you used the auto detect function.
            </para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><guilabel>Error handling</guilabel></term>
          <listitem>
            <para>
              A section that contains different options for how to
              handle errors when parsing the file. Normally you can
              select whether the import should fail as a whole or
              the line with the error should be skipped.
            </para>
          </listitem>
        </varlistentry>
      </variablelist>

      <para>
        Continue to the next step by clicking the
        &gbNext; button.
      </para>

      <seeother>
        <other external_id="runplugin.configure">The plug-in configuration wizard</other>
      </seeother>
      </helptext>

    </sect2>

    <sect2 id="import_export_data.import.jobqueue">
      <title>Add the import job to the job queue</title>

      <para>
        In this window you fill in information about the job, such as its name
        and description. The name is required and must be a valid string. There
        are also two check boxes on this page.
        <variablelist>
          <varlistentry>
            <term>
              <guilabel>Send message</guilabel>
            </term>
            <listitem>
              <para>
                Tick this check box if the job should send you a message when it is
                finished, otherwise untick it.
              </para>
            </listitem>
          </varlistentry>
          <varlistentry>
            <term>
              <guilabel>Remove job</guilabel>
            </term>
            <listitem>
              <para>
                If this check box is ticked, the job will be marked as removed when
                it is finished, provided that it finished successfully. This
                is only available for import and export plug-ins.
              </para>
            </listitem>
          </varlistentry>
        </variablelist>
      </para>
      <para>
        Clicking on
        &gbFinish;
        when everything is set will end the job configuration and place the job in the job queue.
        A self-refreshing window appears with information about the
        job's status and execution time. How long it takes before the job starts to run
        depends on its priority and on the priority of the other jobs in the queue. The job does not
        depend on the status window to be able to run and the window can be
        closed without interrupting the execution.
      </para>
      <tip>
        <title>View job status</title>
        <para>
          A job's status can be viewed at any time by opening it from the job list page,
          <menuchoice>
            <guimenuitem>View</guimenuitem>
            <guimenuitem>Jobs</guimenuitem>
          </menuchoice>.
        </para>
      </tip>
    </sect2>

    </sect1>

    <sect1 id="import_data.batch">
      <title>Batch import of data</title>

      <para>
        There are in general several possibilities to import data into
        BASE. Bulk data such as reporter information and raw data
        imports are handled by plug-ins created for these tasks. For
        item types that are imported in more moderate quantities a
        suite of batch item importers is available
        (<xref linkend="coreplugins.import.batch" />). These importers
        allow the user to create new items in BASE and define item
        properties and associations between items using tab-separated
        (or equivalent) files.
      </para>

      <para>
        The batch importers are available for most users and they may
        have been pre-configured, but there is no requirement to
        configure the batch importer plug-ins. Here we assume that no
        plug-in configuration exists for the batch
        importers. Pre-configuration of the importers is really only
        needed for facilities that perform the same imports regularly,
        whereas for occasional use the provided wizard is
        sufficient. Configuring the importers follows the route
        described in <xref linkend="plugins.configuration" />.
      </para>

      <para>
        The batch importers either create new items or update
        already existing items. In either mode the plug-in can set
        values for
        <itemizedlist>
          <listitem>
            <para>
              Simple properties, <emphasis>e.g.</emphasis>, string
              values, numeric values, dates, etc.
            </para>
          </listitem>
          <listitem>
            <para>
              Single-item references, <emphasis>e.g.</emphasis>,
              protocol, label, software, owner, etc.
            </para>
          </listitem>
          <listitem>
            <para>
              Multi-item references are references to several other
              items of the same type. The labeled extracts of a
              hybridization or pooled samples are two examples of
              items that refer to several other items; a hybridization
              may contain several labeled extracts and a sample may be
              a pool of several samples. In some cases a multi-item
              reference is bundled with simple
              values, <emphasis>e.g.</emphasis>, used quantity of a
              source biomaterial, the array index a labeled extract is
              used on, etc. Multi-item references are never removed by
              the importer, only added or updated. Removing an item
              from a multi-item reference is a manual procedure to be
              done using the web interface.
            </para>
          </listitem>
        </itemizedlist>
        The batch importers do not set values for annotations since
        this is handled by the already existing annotation importer
        plug-in (<xref linkend="annotations.massimport" />). However,
        the annotation importer and batch item importers have similar
        behaviour and functionality to minimize the learning cost for
        users.
      </para>

      <para>
        The importer only works on one item type at each use and can be
        used in a <emphasis>dry-run</emphasis> mode where everything
        is performed as if a real import is taking place, but the work
        (transaction) is not committed to the database. The result of
        the test can be stored to a log file and the user can examine
        the output to see how an actual import would perform. Summary
        results such as the number of items imported and the number of
        failed items are reported after the import is finished, and in
        the case of non-recoverable failure the reason is reported.
      </para>

      <sect2 id="import_data.batch.fileformat">
        <title>File format</title>

        <para>
          For proper and efficient use of the batch importers users
          need to understand how the files to be imported should be
          formatted. For users who wish to get hands-on
          experience there is
          an <ulink url="http://base.thep.lu.se/attachment/wiki/DocBookSupport/batchimport_sample.ods?format=raw">OpenOffice
          spreadsheet with sample sheets that work with the batch
          importers</ulink> available for download. This file can be
          used to import a set of data from the biosource level down
          to hybridizations with proper associations and properties
          simply by using the batch importers.
        </para>

        <para>
          The input file must be organised into columns separated by a
          specified character such as a tab or comma character. The
          data header line contains the column headers, which define
          the contents of each column, and marks the beginning of
          item data in the file. The item data block continues until
          the end of the file or to an optional data footer line
          defining the end of the data block.
        </para>
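
        <para>
          As a purely illustrative sketch, a small comma-separated
          file for importing samples might look like the listing
          below. The column names and values are hypothetical and are
          not taken from the sample spreadsheet; any column names can
          be used as long as they are mapped to item properties when
          the plug-in is run.
        </para>
        <programlisting>
Name,External ID,Description
Sample A,S-001,Biopsy from patient 1
Sample B,S-002,Biopsy from patient 2
</programlisting>
        <para>
          In this sketch the first line plays the role of the data
          header line, each following line holds the data for one
          item, and the data block runs to the end of the file since
          no data footer line is used.
        </para>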

        <para>
          When reading data for an item the plug-in must use some
          information for identifying items. Depending on item type
          there are two or three options to select the item identifier:
          <itemizedlist>
            <listitem>
              <para>
                Using the internal <property>id</property>. This is
                always unique for a specific BASE server.
              </para>
            </listitem>
            <listitem>
              <para>
                Using the <property>name</property>. This may or may
                not be unique.
              </para>
            </listitem>
            <listitem>
              <para>
                Some items have
                an <property>externalId</property>. This may or may
                not be unique.
              </para>
            </listitem>
            <listitem>
              <para>
                Array slides may have a <property>barcode</property>
                which is similar to
                the <property>externalId</property>.
              </para>
            </listitem>
          </itemizedlist>
          It is important that the identifier selected
          is <emphasis>unique</emphasis> in the file used. If the
          file is used to update items already existing in BASE, the
          identifier should also be unique in BASE for the user
          performing the update. The plug-in will check uniqueness
          when default parameters are used but the user may change the
          default behaviour.
        </para>

        <para>
          Data for a single item may be split into multiple lines. The
          first line contains simple properties and single-item
          references, and the first multi-item reference. If there are
          more multi-item references they should be on the following
          lines with empty values in all other columns, except for the
          column holding the item identifier. The item identifier must
          have the same value on all lines associated with the
          item. On these following lines, data other than multi-item
          references is either ignored or treated as an error,
          depending on plug-in parameter settings. The reason for treating
          copied data entries as an error is to catch situations where
          two items are given the same item identifier by accident.
        </para>
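
        <para>
          To illustrate the multi-line layout, the hypothetical
          comma-separated file below describes pooled samples. It
          assumes that the <property>name</property> is used as item
          identifier and that the last two columns are mapped to the
          parent reference and its used quantity; the column names
          and values are made up for this example.
        </para>
        <programlisting>
Name,Description,Parent,Used quantity
Pool 1,Pool of two biopsies,Sample A,10
Pool 1,,Sample B,5
Pool 2,Pool of one biopsy,Sample C,8
</programlisting>
        <para>
          The second <constant>Pool 1</constant> line repeats the
          item identifier, leaves the description empty and only adds
          a second parent with its used quantity.
        </para>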

      </sect2>

      <sect2 id="import_data.batch.running">
        <title>Running the item batch importer</title>

        <para>
          This section discusses specific parameters and features of the
          batch importers. The general use of the batch importers
          follows the description outlined in
          <xref linkend="import_data.import" /> and the setting of
          column mapping parameters is assisted by
          the <guilabel>Test with file</guilabel> function described
          in <xref linkend="plugins.configuration.testwithfile"
          />. The column headers are mapped to item properties at each
          use of the plug-in but, as pointed out above, they can also
          be predefined by saving settings as a plug-in
          configuration. The configuration also includes the separator
          character and other information that is needed to parse
          files. The ability to save configurations depends on user
          credentials and is by default only granted to administrators.
        </para>

        <para>
          The plug-in parameter dialog follows the standard BASE plug-in
          layout and shows help information for selected
          parameters. The list below comments on some of the
          parameters available.
          <variablelist>
            <varlistentry>
              <term>
                <guilabel>Mode</guilabel>
              </term>
              <listitem>
                <para>
                  Select the mode of the plug-in. The plug-in can
                  create new items and/or update items already
                  existing in BASE. This setting is available to allow
                  the user to make a conscious choice of how to treat
                  missing or already existing items. For example, if
                  the user selects to only update items already
                  existing, the plug-in will complain if an item in the
                  file does not exist in BASE (using default error
                  condition treatment). This adds an extra layer of
                  security and diagnostics for the user during import.
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term>
                <guilabel>Identification method</guilabel>
              </term>
              <listitem>
                <para>
                  This parameter defines the method to use to find
                  already existing items. The parameter can only be
                  set to one of the item properties listed in the
                  plug-in parameter dialog. The property selected by
                  the user must be mapped to a column in the file. If
                  it is not mapped there is no way for the
                  plug-in to determine whether an item already exists.
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term>
                <guilabel>Owned by me</guilabel>, <guilabel>Shared to
                me</guilabel>, <guilabel>In current
                project</guilabel>, and <guilabel>Owned by
                others</guilabel>
              </term>
              <listitem>
                <para>
                  Defines the set of items the plug-in should look in
                  when it checks whether an item already exists. The
                  options are the same as those available in list
                  views and the actual set of parameters depends on
                  user credentials.
                </para>
                <para>
                  When <property>id</property> is used as
                  the <guilabel>Identification method</guilabel>, the
                  plug-in looks for the item irrespective of the setting
                  of these parameters. Of course, the user still must
                  have proper access to the item referenced.
                </para>
              </listitem>
            </varlistentry>
704            <varlistentry>
705              <term>
706                <guilabel>Column mapping expressions</guilabel>
707              </term>
708              <listitem>
709                <para>
710                  Use the <guilabel>Test with file</guilabel> function
711                  described in
712                  <xref linkend="plugins.configuration.testwithfile"
713                  /> to set the column mapping parameters.
714                </para>
715                <para>
716                  When creating pooled items,
717                  the <property>pooled</property> property is used to
718                  tell the plug-in that an item is pooled. Pooled in
719                  BASE language really means that the item parent is
720                  of the same type as the item itself. If an item is
721                  not pooled then the parent is of another type
722                  following a predefined hierarchy in BASE. In
723                  ascending order the BASE ordering
724                  of <emphasis>parent - child - grandchild -
725                  ...</emphasis> item relation is <emphasis>biosource
726                  - sample - extract - labeled extract</emphasis>.
727                </para>
                <para>
                  The values accepted for <property>pooled</property>
                  are <constant>empty (' ')</constant>,
                  <constant>0</constant>, <constant>1</constant>,
                  <constant>no</constant>, <constant>yes</constant>,
                  <constant>false</constant>,
                  and <constant>true</constant>. Any other string is
                  interpreted as meaning that the item is pooled.
                  Sometimes all items in a file to be imported are
                  pooled but no column marks the pooled status. This
                  can be resolved by setting
                  the <property>pooled</property> mapping to the
                  constant string <constant>'1'</constant>, which
                  makes the import treat all items as pooled. Note
                  that this constant contains no backslash ('\')
                  characters, unlike a column header mapping such
                  as <constant>'\pool column\'</constant>.
                </para>
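                <para>
                  As an illustration only, the two kinds of mapping
                  expression for <property>pooled</property> might be
                  entered as shown below. The column header
                  <constant>pool column</constant> is a hypothetical
                  example and not a name required by BASE, and the
                  text in parentheses is explanatory and not part of
                  the expression.
                </para>
<programlisting>
\pool column\    (column mapping: the value is read from the column headed "pool column")
1                (constant: every item in the file is treated as pooled)
</programlisting>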
              </listitem>
            </varlistentry>
          </variablelist>
          After setting the parameters,
          select <guilabel>Next</guilabel>. Another parameter dialog
          will appear where error handling options can be set,
          along with the following parameters:
          <variablelist>
            <varlistentry>
              <term>
                <guilabel>Log file</guilabel>
              </term>
              <listitem>
                <para>
                  Setting this parameter will turn on logging. The
                  plug-in will give detailed information about how the
                  file is parsed. This is useful for resolving file
                  parsing issues.
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term>
                <guilabel>Dry run</guilabel>
              </term>
              <listitem>
                <para>
                  Enable or disable a test run of the plug-in. If
                  enabled, the plug-in will parse the file and
                  simulate an import. When enabling this option you
                  should also set the <guilabel>Log file</guilabel>
                  parameter. The dry run mode allows large imports and
                  updates to be tested by creating a log file that can
                  be examined for inconsistencies before the import is
                  performed for real.
                </para>
              </listitem>
            </varlistentry>
          </variablelist>
        </para>

        <para>
          During file parsing the plug-in will look for the items
          referenced on each line. There are three possible outcomes
          of this item search:
          <itemizedlist>
            <listitem>
              <para>
                No item is found. Depending on parameter settings,
                the plug-in may abort, ignore the line, or create a
                new item.
              </para>
            </listitem>
            <listitem>
              <para>
                One item is found. This is the item that is going to
                be updated.
              </para>
            </listitem>
            <listitem>
              <para>
                More than one item is found. Depending on parameter
                settings, the plug-in may abort or ignore the line.
              </para>
            </listitem>
          </itemizedlist>
        </para>

      </sect2>

      <sect2 id="import_data.batch.comments">
        <title>Comments on the item batch importers</title>

        <para>
          The item batch importers are not designed to change or
          create annotations. There is another plug-in for this; see
          <xref linkend="annotations.massimport" /> for an
          introduction to the annotation importer.
        </para>

        <para>
          There is no need to map all columns when running the
          importer. When new items are created, the only mandatory
          entry is usually <property>Name</property>, and when
          running the plug-in in update mode only the column defining
          the item identification property needs to be mapped. This
          can be utilized when only one or a few properties need to
          be updated: map only the columns that should be changed and
          the plug-in will ignore the other properties and leave them
          as they are already stored in BASE. This also means that to
          delete a property, that property must be mapped and its
          value must be empty in the file. Note that multi-item
          references cannot be deleted with the batch importer; they
          must be deleted using the web interface.
        </para>

        <para>
          When parent and other relations are created using the
          plug-in, the referenced items are properly linked and
          updated. This means that when a quantity that decreases a
          referenced item is used, the referenced item is updated
          accordingly. Consequently, if the relation is removed in a
          later update (for example, because the wrong parent was
          referenced), the referenced item is restored and any
          decrease of quantities is also reset.
        </para>

        <para>
          A common mistake is to forget to make sure that the
          referenced items already exist in BASE and are accessible
          to the user performing the import. Items such as protocols
          and labels must be added before they are referenced. This
          is of course also true for other items, but during batch
          import one usually follows the natural order of first
          importing biosources, then samples, extracts, and so on. In
          this way the parents are always present and may be
          referenced without any issues.
        </para>

      </sect2>

    </sect1>

</chapter>