source: trunk/doc/src/docbook/user/import_data.xml @ 5798

Last change on this file since 5798 was 5798, checked in by Nicklas Nordborg, 11 years ago

References #1590: Documentation cleanup

New and updated screenshots for chapter 18+19 - Import/export.

1<?xml version="1.0" encoding="UTF-8"?>
2<!DOCTYPE chapter PUBLIC
3  "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
4  "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd"
5[
6<!ENTITY runplugin.configure.common
7  "The top of the window displays the names of the selected plug-in and
8  configuration, a list with parameters to the left, an area for input fields to the
9  right and buttons to proceed with at the bottom.
10  Click on a parameter in the parameter list to show the form fields
11  for entering values for the parameter to the right. Parameters
12  with an <guilabel>X</guilabel> in front of their names already have a
13  value. Parameters marked with a blue rectangle are required and must
14  be given a value before it is possible to proceed."
15>
16]>
17<!--
18  $Id: import_data.xml 5798 2011-10-11 16:32:48Z nicklas $
19 
20  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
21  Copyright (C) 2008 Jari Häkkinen
22 
23  This file is part of BASE - BioArray Software Environment.
24  Available at http://base.thep.lu.se/
25 
26  BASE is free software; you can redistribute it and/or
27  modify it under the terms of the GNU General Public License
28  as published by the Free Software Foundation; either version 3
29  of the License, or (at your option) any later version.
30 
31  BASE is distributed in the hope that it will be useful,
32  but WITHOUT ANY WARRANTY; without even the implied warranty of
33  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
34  GNU General Public License for more details.
35 
36  You should have received a copy of the GNU General Public License
37  along with BASE. If not, see <http://www.gnu.org/licenses/>.
38-->
39<chapter id="import_data" chunked="0">
40  <title>Import of data</title>
41  <para>
42    In some places the only way to get data into BASE is to import it
43    from a file. This typically includes <emphasis>raw data</emphasis>,
44    <emphasis>array design features</emphasis>, <emphasis>reporters</emphasis>
45    and other things, which would be inconvenient
    to enter by hand due to the large number of data items. There are
    also convenience batch importers for importing other items such as
48    <emphasis>biosources</emphasis>, <emphasis>samples</emphasis>, and
49    <emphasis>annotations</emphasis>. The batch importers are
50    described later in this chapter after the general import
51    description.
52  </para>
53  <para>
    Normally, a plug-in handles one type of item and may require a
    configuration. For example, most import plug-ins need some
    information about how to find headers and data lines in
    files. BASE ships with a number of import plug-ins as a part of
    the core plug-ins package, cf. <xref linkend="coreplugins.import"
    />. The core plug-in section links to configuration examples for
    some of the plug-ins. Go to
    <menuchoice>
      <guimenu>Administrate</guimenu>
      <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
      <guisubmenu>Plug-in definitions</guisubmenu>
    </menuchoice>
    to check which plug-ins are installed on your BASE server. When
    BASE finds a plug-in that supports import of a certain type of
    item, an &gbImport; button is displayed in the toolbar on either
    the list view or the single-item view.
70  </para>
71    <note>
72    <title>No "Import" button?</title>
73    <para>
      If the import button is missing from a page where you would expect
      to find it, this usually means that:
76    </para>
77    <itemizedlist>
78      <listitem>
79        <simpara>
80          The logged in user does not have permission to use the plug-in.
81        </simpara>
82      </listitem>
83      <listitem>
84        <simpara>
          The plug-in requires a configuration, but none has been
          created or the logged in user does not have permission to
          use any of the existing configurations.
88        </simpara>
89      </listitem>
90    </itemizedlist>
91    <para>
      Contact the server administrator or another user who has permission to
      administer the plug-ins.
94    </para>
95  </note>
96
97    <sect1 id="import_data.import">
98      <title>General import procedure</title>
99
100    <para>
      A data import is started through a wizard-like interface. There
      are a number of steps you have to go through:
103    </para>
104   
105    <orderedlist>
106      <listitem>
107        <simpara>
108          Select a plug-in and file format to use, or use the
109          <emphasis>auto detect</emphasis> option.
110        </simpara>
111      </listitem>
112      <listitem>
113        <simpara>
114          If you selected the auto detection function, you must select
115          a file to use.
116        </simpara>
117      </listitem>
118      <listitem>
119        <simpara>
120          Specify plug-in parameters.
121        </simpara>
122      </listitem>
123      <listitem>
124        <simpara>
125          Add the import job to the job queue.
126        </simpara>
127      </listitem>
128      <listitem>
129        <simpara>
130          Wait for the job to finish.
131        </simpara>
132      </listitem>
133    </orderedlist>
134
135    <sect2 id="import_export_data.import.plugin_fileformat">
136      <title>Select plug-in and file format</title>
137      <para>
138        Click on the &gbImport; button
139        in the toolbar to start the import wizard. The first step is to
140        select which plug-in and, if supported, which
141        file format to use. There is also an <guilabel>auto detect</guilabel>
142        option that lets you select a file and have BASE try to find a suitable
143        plug-in/file format to use.
144      </para>
145 
146      <figure id="import_export_data.figures.select_import_plugin">
147        <title>Select plug-in and file format</title>
148        <screenshot>
149          <mediaobject>
150            <imageobject><imagedata fileref="figures/select_import_plugin.png" format="PNG" /></imageobject>
151          </mediaobject>
152        </screenshot>
153      </figure>
154   
155
156      <helptext external_id="import.selectplugin"
157        title="Select plug-in and file format for data import">
158
159        <variablelist>
160          <varlistentry>
161            <term><guilabel>Plugin + file format</guilabel></term>
162            <listitem>
163              <para>
164                This is a combined list of plug-ins and their
165                respective file format configurations. The list only
166                includes combinations that
                the logged in user has permission to use. If you select
                an entry, a short description of the plug-in and configuration
                is displayed
                below the lists. More information about the plug-ins can
                be found under the menu choices
172                <menuchoice>
173                  <guimenu>Administrate</guimenu>
174                  <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
175                  <guisubmenu>Plug-in definitions</guisubmenu>
176                </menuchoice>
177                and
178                <menuchoice>
179                  <guimenu>Administrate</guimenu>
180                  <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
181                  <guisubmenu>Plug-in configuration</guisubmenu>
                </menuchoice>.
183              </para>
184              <note>
185                <title>File format vs. Configuration</title>
186                <simpara>
                A file format is the same thing as a plug-in configuration.
                It may be confusing that the interface sometimes uses
                <emphasis>file format</emphasis> and sometimes uses
                <emphasis>configuration</emphasis>, but for now, we'll have
                to live with it.
192                </simpara>
193              </note>
194            </listitem>
195          </varlistentry>
196        </variablelist>
197
198        <para>
199          Proceed to the next step by clicking on the
200          &gbNext; button.
201        </para>
202
203        <seeother>
204          <other external_id="import.autodetect">The auto detect function</other>
205        </seeother>
206      </helptext>
207
208      <sect3 id="import_export_data.import.plugin_fileformat.autodetect">
209        <title>The auto detect function</title>
210       
211        <helptext
212          external_id="import.autodetect"
213          title="The auto detect function">
214       
215        <para>
216          The auto detect function lets you select a file and have
217          BASE try to find a suitable plug-in and file format. This option is
218          selected by default in the combined plug-in and file format list when there is
219          at least one plug-in that supports auto detection.
220        </para>
221        <note>
222          <title>Support of auto detect</title>
223          <para>
224            Not all plug-ins support auto detection. The ones that do are marked in
225            the list with <guilabel>×</guilabel>.
226          </para>
227        </note>
228       
229        <para>
          Select the <guilabel>auto detect (all)</guilabel> option to search for a file format
          in all plug-ins that support the feature, or select the <guilabel>auto detect (plugin)</guilabel>
          option to only search the file formats for a specific plug-in.
233          Continue to the next step by clicking on the &gbNext; button.
234        </para>
235       
236        <seeother>
237          <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
238          <other external_id="import.autodetect.selectfile">Select file for auto detection</other>
239        </seeother>
240     
241        </helptext>
242       
243        <para>
244          You must now select a file to import from.
245        </para>
246       
247        <figure id="import_export_data.figures.select_autodetect_file">
248          <title>Select file for auto detection</title>
249          <screenshot>
250            <mediaobject>
251              <imageobject><imagedata fileref="figures/select_autodetect_file.png" format="PNG" /></imageobject>
252            </mediaobject>
253          </screenshot>
254        </figure>
255       
256        <helptext external_id="import.autodetect.selectfile" 
257          title="Select file for auto detection">
258 
259          <variablelist>
260            <varlistentry>
261              <term><guilabel>Plugin</guilabel></term>
262              <listitem>
263                <para>
264                  Displays the selected plug-in or <guilabel>all</guilabel> if the
265                  auto-detection is used on all supporting plug-ins.
266                </para>
267              </listitem>
268            </varlistentry>
269            <varlistentry>
270              <term><guilabel>File</guilabel></term>
271              <listitem>
272                <para>
                  Enter the path and file name for the
                  file you want to use. Use the <guibutton>Browse&hellip;</guibutton>
                  button to browse for the file in BASE's file system.
276                  If the file does not exist in the file system you have the option
277                  to upload it.
278                  <nohelp>Read more about this in <xref linkend="file_system" />.</nohelp>
279                </para>
280              </listitem>
281            </varlistentry>
282            <varlistentry>
283              <term><guilabel>Character set</guilabel></term>
284              <listitem>
285                <para>
                  The character set used in text files. If the selected file has been configured
                  with a character set, the correct option is automatically selected. In all
                  cases, you have the option to override the default selection. Most files
                  use either the UTF-8 or ISO-8859-1 character set.
290                </para>
291              </listitem>
292            </varlistentry>
293            <varlistentry>
294              <term><guilabel>Recently used</guilabel></term>
295              <listitem>
296                <para>
297                  A list of files you have recently used
298                  for auto detection.
299                </para>
300              </listitem>
301            </varlistentry>
302          </variablelist>
303         
304          <para>
305            Click on the &gbNext; button
306            to start the auto detection. There are three possible outcomes:
307          </para>
308
309          <itemizedlist>
310            <listitem>
311              <para>
312              Exactly one matching plug-in and file format is found. The next step is
313              to configure any additional parameters needed
314              by the plug-in. This is the same step as if you had selected
315              the same plug-in and file format in the first step.
316              </para>
317            </listitem>
318            <listitem>
319              <para>
              If no matching plug-in and file format is found, an error message
              is displayed. If you are logged in with sufficient permissions, there
              is an option to create a new file format/configuration.
323              </para>
324            </listitem>
325            <listitem>
326              <para>
              If multiple matching plug-ins and file formats are found,
              you will be taken back to the first step. This time
              the lists will only include the matching plug-ins/file formats
              and the auto detect option is not present.
331              </para>
332            </listitem>
333          </itemizedlist>
334
335          <seeother>
336            <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
337            <other external_id="import.autodetect">The auto detect function</other>
338          </seeother>
339         
340        </helptext>
341       
342      </sect3>
343
344    </sect2>
345
346    <sect2 id="import_export_data.import.pluginparameters">
347      <title>Specify plug-in parameters</title>
348      <para>
        When you have selected a plug-in and file format or used
        the auto detect function to find one, a form where
        you can enter additional parameters for the plug-in is displayed.
352      </para>
353     
354      <figure id="import_export_data.figures.configure_plugin">
355        <title>Specify plug-in parameters</title>
356        <screenshot>
357          <mediaobject>
358            <imageobject>
359              <imagedata 
360                scalefit="1" width="100%"
361                fileref="figures/plugin_parameters.png" format="PNG" />
362            </imageobject>
363          </mediaobject>
364        </screenshot>
365      </figure>
366     
367      <helptext external_id="runplugin.configure.import" 
368        title="Specify plug-in parameters">
369      <para>
370        &runplugin.configure.common;
371      </para>
372     
373      <para>
        The parameter list varies greatly from plug-in to plug-in.
        Common parameters for import plug-ins are:
376      </para>
377     
378      <variablelist>
379        <varlistentry>
380          <term><guilabel>File</guilabel></term>
381          <listitem>
382            <para>
383            The file to import data from. A value is already set if
384            you used the auto detect function.
385            </para>
386          </listitem>
387        </varlistentry>
388       
389        <varlistentry>
390          <term><guilabel>File parser regular expressions</guilabel></term>
391          <listitem>
392            <para>
393            Various regular expressions that are used when parsing the file
394            to ensure that the data is found. In most cases, all values
395            are taken from the matched configuration and can be left as is.
396            </para>
397          </listitem>
398        </varlistentry>
399       
400        <varlistentry>
401          <term><guilabel>Error handling</guilabel></term>
402          <listitem>
403            <para>
              A section containing options for how to
              handle errors when parsing the file. Normally you can
              choose whether the import should fail as a whole or
              only the line with the error should be skipped.
408            </para>
409          </listitem>
410        </varlistentry>
411      </variablelist>
412     
413      <para>
414        Continue to the next step by clicking the
415        &gbNext; button.
416      </para>
417     
418      <seeother>
419        <other external_id="runplugin.configure">The plug-in configuration wizard</other>
420      </seeother>   
421      </helptext>
422
423    </sect2>
424   
425    <sect2 id="import_export_data.import.jobqueue">
426      <title>Add the import job to the job queue</title>
427
428      <figure id="import_export_data.figures.finish_job">
429        <title>Job name and options</title>
430        <screenshot>
431          <mediaobject>
432            <imageobject>
433              <imagedata 
434                fileref="figures/finish_job.png" format="PNG" />
435            </imageobject>
436          </mediaobject>
437        </screenshot>
438      </figure>
439     
440      <helptext external_id="runplugin.finshjob" 
441        title="Set job name and options">
442      <para>
        In this window you fill in information about the job, such as name and
        description. The name is required and must be a valid string. There
        are also two check boxes on this page.
446        <variablelist>
447          <varlistentry>
448            <term>
449              <guilabel>Name</guilabel>
450            </term>
451            <listitem>
452              <para>
453                Most plug-ins should suggest a name for the job, but you can change it if
454                you want to.
455              </para>
456            </listitem>
457          </varlistentry>
458          <varlistentry>
459            <term>
460              <guilabel>Use job agent</guilabel>
461            </term>
462            <listitem>
463              <para>
464                This option is only available if the BASE system has been configured with
465                job agents and the logged in user has <constant>SELECT_JOBAGENT</constant>
466                permission. Select the <guilabel>automatic</guilabel> option to let
467                BASE automatically select a job agent or select a specific option
468                to force the use of that particular job agent.
469              </para>
470            </listitem>
471          </varlistentry>
472          <varlistentry>
473            <term>
474              <guilabel>Send message</guilabel>
475            </term>
476            <listitem>
477              <para>
                Tick this check box if the job should send you a message when it is
                finished; otherwise untick it.
480              </para>
481            </listitem>
482          </varlistentry>
483          <varlistentry>
484            <term>
485              <guilabel>Remove job</guilabel>
486            </term>
487            <listitem>
488              <para>
                If this check box is ticked, the job will be marked as removed when
                it is finished, provided that it finished successfully. This option
                is only available for import and export plug-ins.
492              </para>
493            </listitem>
494          </varlistentry>
495        </variablelist>
496      </para>
497      <para>
498        Clicking on
499        &gbFinish;
500        when everything is set will end the job configuration and place the job in the job queue.
501        A self-refreshing window appears with information about the
        job's status and execution time. How long it takes before the job starts to run
        depends on its priority and the priorities of the other jobs in the queue. The job does not
504        depend on the status window to be able to run and the window can be
505        closed without interrupting the execution.
506      </para>
507      <tip>
508        <title>View job status</title>
509        <para>
510          A job's status can be viewed at any time by opening it from the job list page,
511          <menuchoice>
512            <guimenuitem>View</guimenuitem>
513            <guimenuitem>Jobs</guimenuitem>
514          </menuchoice>.
515        </para>
516      </tip>
517      </helptext>
518    </sect2>
519
520    </sect1>
521
522    <sect1 id="import_data.batch">
523      <title>Batch import of data</title>
524
525      <para>
        There are several ways to import data into
        BASE. Bulk imports of data such as reporter information and raw
        data are handled by plug-ins created for these tasks. For
        item types that are imported in more moderate quantities a
        suite of batch item importers is available
        (<xref linkend="coreplugins.import.batch" />). These importers
        allow the user to create new items in BASE and define item
        properties and associations between items using tab-separated
        (or equivalent) files.
535      </para>
536
537      <para>
538        The batch importers are available for most users and they may
539        have been pre-configured but there is no requirement to
540        configure the batch importer plug-ins. Here we assume that no
541        plug-in configuration exists for the batch
542        importers. Pre-configuration of the importers is really only
543        needed for facilities that perform the same imports regularly
544        whereas for occasional use the provided wizard is
545        sufficient. Configuring the importers follows the route
546        described in <xref linkend="plugins.configuration" />.
547      </para>
548
549      <para>
        The batch importers either create new items or update
        already existing items. In either mode the plug-in can set
        values for
553        <itemizedlist>
554          <listitem>
555            <para>
              Simple properties, <emphasis>e.g.</emphasis>, string
557              values, numeric values, dates, etc.
558            </para>
559          </listitem>
560          <listitem>
561            <para>
              Single-item references, <emphasis>e.g.</emphasis>,
563              protocol, label, software, owner, etc.
564            </para>
565          </listitem>
566          <listitem>
567            <para>
              Multi-item references, i.e. references to several other
              items of the same type. Physical bioassays and pooled
              samples are two examples of
              items that refer to several other items; a physical bioassay
              may contain several extracts and a sample may be
              a pool of several samples. In some cases a multi-item
              reference is bundled with simple
              values, <emphasis>e.g.</emphasis>, the used quantity of a
              source biomaterial, the position an extract is
              used on, etc. Multi-item references are never removed by
578              the importer, only added or updated. Removing an item
579              from a multi-item reference is a manual procedure to be
580              done using the web interface.
581            </para>
582          </listitem>
583        </itemizedlist>
584        The batch importers do not set values for annotations since
585        this is handled by the annotation importer
586        plug-in (<xref linkend="annotations.massimport" />). However,
587        the annotation importer and batch item importers have similar
588        behaviour and functionality to minimize the learning cost for
589        users.
590      </para>
591
592      <para>
        Each importer works with only one type of item per run and can be
        used in a <emphasis>dry-run</emphasis> mode where everything
        is performed as if a real import is taking place, but the work
        (transaction) is not committed to the database. The result of
597        the test can be stored to a log file and the user can examine
598        the output to see how an actual import would perform. Summary
599        results such as the number of items imported and the number of
600        failed items are reported after the import is finished, and in
601        the case of non-recoverable failure the reason is reported.
602      </para>
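      <para>
        The dry-run mode can be pictured as an import that runs inside a
        database transaction which is rolled back instead of committed. The
        Python sketch below illustrates this idea with the standard
        <literal>sqlite3</literal> module. It is not BASE code: the
        <literal>import_items</literal> function, the <literal>items</literal>
        table and the sample names are assumptions made for the illustration.
      </para>
      <programlisting language="python"><![CDATA[
```python
import sqlite3


def import_items(connection, names, dry_run=False):
    """Insert names into the hypothetical items table.

    Returns the number of items processed. With dry_run=True the
    transaction is rolled back, so nothing is stored.
    """
    cursor = connection.cursor()
    count = 0
    try:
        for name in names:
            cursor.execute("INSERT INTO items (name) VALUES (?)", (name,))
            count += 1
    except sqlite3.Error:
        # Non-recoverable failure: undo everything and report the reason.
        connection.rollback()
        raise
    if dry_run:
        connection.rollback()  # the work was done, but is not committed
    else:
        connection.commit()
    return count


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items (name TEXT NOT NULL UNIQUE)")
connection.commit()

dry_count = import_items(connection, ["Sample 1", "Sample 2"], dry_run=True)
rows_after_dry_run = connection.execute("SELECT COUNT(*) FROM items").fetchone()[0]

real_count = import_items(connection, ["Sample 1", "Sample 2"])
rows_after_import = connection.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```
]]></programlisting>
      <para>
        In this sketch the dry run processes two items but leaves the table
        empty, while the same call without <literal>dry_run</literal> stores
        them.
      </para>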
603
604      <sect2 id="import_data.batch.fileformat">
605        <title>File format</title>
606
607        <para>
608          For proper and efficient use of the batch importers users
609          need to understand how the files to be imported should be
610          formatted. The input file must be organised into columns separated by a
          specified character such as a tab or comma. The
          data header line contains the column headers, which define
          the contents of each column, and marks the beginning of
614          item data in the file. The item data block continues until
615          the end of the file or to an optional data footer line
616          defining the end of the data block.
617        </para>
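        <para>
          As an illustration of this layout, the following Python sketch
          reads such a file: it skips everything before the data header
          line, splits each data line on the separator and stops at an
          optional data footer line. It is only a sketch under stated
          assumptions and not the parser used by BASE; the
          <literal>read_data_block</literal> function, the regular
          expressions and the sample lines are hypothetical.
        </para>
        <programlisting language="python"><![CDATA[
```python
import re


def read_data_block(lines, header_regex, separator="\t", footer_regex=None):
    """Return (headers, rows) for the data block of a column-separated file."""
    header_pattern = re.compile(header_regex)
    footer_pattern = re.compile(footer_regex) if footer_regex else None
    headers = None
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if headers is None:
            # Lines before the data header line are not item data.
            if header_pattern.search(line):
                headers = line.split(separator)
            continue
        if footer_pattern is not None and footer_pattern.search(line):
            break  # the optional data footer line ends the data block
        rows.append(dict(zip(headers, line.split(separator))))
    if headers is None:
        raise ValueError("no data header line found")
    return headers, rows


sample = [
    "free-text line before the data",
    "Name\tExternal ID\tDescription",
    "Sample 1\tS-001\tFirst sample",
    "Sample 2\tS-002\tSecond sample",
    "## end of data",
    "trailing notes",
]
headers, rows = read_data_block(sample, r"^Name\t", footer_regex=r"^## end")
```
]]></programlisting>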
618
619        <para>
620          When reading data for an item the plug-in must use some
621          information for identifying items. Depending on item type
          there are two or three options for the item identifier:
623          <itemizedlist>
624            <listitem>
625              <para>
626                Using the internal <property>id</property>. This is
627                always unique for a specific BASE server.
628              </para>
629            </listitem>
630            <listitem>
631              <para>
632                Using the <property>name</property>. This may or may
633                not be unique.
634              </para>
635            </listitem>
636            <listitem>
637              <para>
638                Some items have
639                an <property>externalId</property>. This may or may
640                not be unique.
641              </para>
642            </listitem>
643            <listitem>
644              <para>
645                Array slides may have a <property>barcode</property>
646                which is similar to
647                the <property>externalId</property>.
648              </para>
649            </listitem>
650          </itemizedlist>
          It is important that the identifier selected
          is <emphasis>unique</emphasis> in the file used. If the
          file is used to update items already existing in BASE, the
          identifier should also be unique in BASE for the user
          performing the update. The plug-in checks uniqueness
          when default parameters are used, but the user may change the
          default behaviour.
658        </para>
659
660        <para>
661          Data for a single item may be split into multiple lines. The
662          first line contains simple properties and single-item
663          references, and the first multi-item reference. If there are
664          more multi-item references they should be on the following
665          lines with empty values in all other columns, except for the
666          column holding the item identifier. The item identifier must
667          have the same value on all lines associated with the
          item. Additional lines containing data other than multi-item references
          are either ignored or treated as an error, depending
          on plug-in parameter settings. The reason for treating
          repeated data entries as an error is to catch situations where
          two items are given the same item identifier by accident.
673        </para>
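        <para>
          The multi-line rule can be sketched in Python as follows. The
          sketch merges all lines that share an item identifier, collects
          the multi-item references from each line, and either raises an
          error or ignores a continuation line that carries other data.
          It is an illustration only: the
          <literal>group_item_lines</literal> function, its parameters and
          the column names are assumptions and do not reflect the actual
          plug-in implementation.
        </para>
        <programlisting language="python"><![CDATA[
```python
def group_item_lines(rows, id_column, multi_columns, fail_on_duplicate=True):
    """Merge the lines belonging to each item into one record per identifier."""
    items = {}
    for line_no, row in enumerate(rows, start=1):
        identifier = row[id_column]
        references = {c: row[c] for c in multi_columns if row.get(c)}
        if identifier not in items:
            # First line: simple properties plus the first multi-item reference.
            properties = {c: v for c, v in row.items() if c not in multi_columns}
            items[identifier] = {"properties": properties, "references": []}
        else:
            # Continuation line: only identifier and reference columns may be set.
            extra = [c for c, v in row.items()
                     if v and c != id_column and c not in multi_columns]
            if extra:
                if fail_on_duplicate:
                    raise ValueError(
                        f"line {line_no}: repeated data for item {identifier!r}")
                continue  # lenient setting: ignore the whole line
        if references:
            items[identifier]["references"].append(references)
    return items


rows = [
    {"Name": "Pool 1", "Description": "Pooled sample", "Parent": "Sample 1"},
    {"Name": "Pool 1", "Description": "", "Parent": "Sample 2"},
    {"Name": "Pool 2", "Description": "Another pool", "Parent": "Sample 3"},
]
items = group_item_lines(rows, "Name", ["Parent"])
```
]]></programlisting>
        <para>
          Here <literal>Pool 1</literal> ends up with two parent references
          taken from two lines, while its simple properties come from the
          first line only.
        </para>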
674
675      </sect2>
676
677      <sect2 id="import_data.batch.running">
678        <title>Running the item batch importer</title>
679
680        <para>
          This section discusses specific parameters and features of the
          batch importers. The general use of the batch importers
          follows the description outlined in
          <xref linkend="import_data.import" /> and the setting of
          column mapping parameters is assisted by
          the <guilabel>Test with file</guilabel> function described
          in <xref linkend="plugins.configuration.testwithfile"
          />. The column headers are mapped to item properties at each
          use of the plug-in but, as pointed out above, they can also
          be predefined by saving settings as a plug-in
          configuration. The configuration also includes the separator
          character and other information that is needed to parse
          files. The ability to save configurations depends on user
          credentials and is by default only granted to administrators.
695        </para>
696
697        <para>
          The plug-in parameter dialog follows the standard BASE plug-in
          layout and shows help information for selected
          parameters. The list below comments on some of the
          parameters available.
702        </para>
703       
704          <variablelist>
705            <varlistentry>
706              <term>
707                <guilabel>Mode</guilabel>
708              </term>
709              <listitem>
710                <para>
711                  Select the mode of the plug-in. The plug-in can
712                  create new items and/or update items already
713                  existing in BASE. This setting is available to allow
714                  the user to make a conscious choice of how to treat
                  missing or already existing items. For example, if
                  the user chooses to only update existing
                  items, the plug-in will report an error if an item in the
                  file does not exist in BASE (using default error
                  condition treatment). This adds an extra layer of
720                  security and diagnostics for the user during import.
721                </para>
722              </listitem>
723            </varlistentry>
724            <varlistentry>
725              <term>
726                <guilabel>Data directory</guilabel>
727              </term>
728              <listitem>
729                <para>
                  This option is only available for items that have support for
                  attaching files (e.g. array design, derived bioassay, etc.).
                  This setting is used to resolve file references that do not
                  include a complete absolute path.
734                </para>
735              </listitem>
736            </varlistentry>
737            <varlistentry>
738              <term>
739                <guilabel>Identification method</guilabel>
740              </term>
741              <listitem>
742                <para>
743                  This parameter defines the method used to find
744                  already existing items. It must be set to one of
745                  the item properties listed in the
746                  plug-in parameter dialog. The property selected by
747                  the user must be mapped to a column in the file. If
748                  it is not mapped, the plug-in has no way to
749                  determine whether an item already exists.
750                </para>
751              </listitem>
752            </varlistentry>
753            <varlistentry>
754              <term>
755                <guilabel>Item subtypes</guilabel>
756              </term>
757              <listitem>
758                <para>
759                  Only look for existing items among the selected subtypes. If no subtype
760                  is selected, all items are searched. If exactly one subtype is selected,
761                  new items are automatically created with this subtype (unless it is overridden
762                  by specific subtype values in the import file).
763                </para>
764              </listitem>
765            </varlistentry>
766            <varlistentry>
767              <term>
768                <guilabel>Owned by me</guilabel>, <guilabel>Shared to
769                me</guilabel>, <guilabel>In current
770                project</guilabel>, and <guilabel>Owned by
771                others</guilabel>
772              </term>
773              <listitem>
774                <para>
775                  Defines the set of items the plug-in should search
776                  when it checks whether an item already exists. The
777                  options are the same as those available in list
778                  views, and the actual set of options depends on
779                  the user's credentials.
780                </para>
781                <para>
782                  When <property>id</property> is used as
783                  the <guilabel>Identification method</guilabel>, the
784                  plug-in looks for the item irrespective of the settings
785                  of these parameters. The user must still
786                  have proper access to the referenced item.
787                </para>
788              </listitem>
789            </varlistentry>
790            <varlistentry>
791              <term>
792                <guilabel>Column mapping expressions</guilabel>
793              </term>
794              <listitem>
795                <para>
796                  Use the <guilabel>Test with file</guilabel> function
797                  described in
798                  <xref linkend="plugins.configuration.testwithfile"
799                  /> to set the column mapping parameters.
800                </para>
801                <para>
802                  When working with biomaterial items, the
803                  <guilabel>Parent type</guilabel> property
804                  tells the plug-in how to find parent items. It only
805                  has to be set if the parent item is of the same type
806                  as the biomaterial being imported, since the default
807                  is to look for the nearest parent type in the predefined hierarchy.
808                  The BASE ordering of the
809                  <emphasis>parent - child - grandchild -
810                  ...</emphasis> item relation is <emphasis>biosource
811                  - sample - extract</emphasis>.
812                </para>
813                <para>
814                  The values accepted for <guilabel>Parent type</guilabel>
815                  are <constant>BIOSOURCE</constant>,
816                  <constant>SAMPLE</constant> and <constant>EXTRACT</constant>.
817                  Sometimes all items in a file to be imported have the same parent
818                  type but the file has no column with this information. This can
819                  be resolved by setting
820                  the <guilabel>Parent type</guilabel> mapping to a
821                  constant string (i.e. a value without any backslash '\' character), for example <constant>SAMPLE</constant>.
822                </para>
823              </listitem>
824            </varlistentry>
825            <varlistentry>
826              <term>
827                <guilabel>Permissions</guilabel>
828              </term>
829              <listitem>
830                <para>
831                  This is a column mapping that can be used to update the permissions
832                  set on items. Normally, new items are only shared to the active project
833                  (if any). If a permission template is named, new items are instead shared using
834                  the permissions from that template. Permissions on already existing
835                  items are merged with the permissions from the template.
836                </para>
837              </listitem>
838            </varlistentry>
839          </variablelist>
840         
841          <para>
842          After setting the parameters,
843          select <guilabel>Next</guilabel>. Another parameter dialog
844          will appear where error handling options can be set, along
845          with the following parameters:
846          </para>
847         
848          <variablelist>
849            <varlistentry>
850              <term>
851                <guilabel>Log file</guilabel>
852              </term>
853              <listitem>
854                <para>
855                  Setting this parameter will turn on logging. The
856                  plug-in will give detailed information about how the
857                  file is parsed. This is useful for resolving file
858                  parsing issues.
859                </para>
860              </listitem>
861            </varlistentry>
862            <varlistentry>
863              <term>
864                <guilabel>Dry run</guilabel>
865              </term>
866              <listitem>
867                <para>
868                  Enable or disable a test run of the plug-in. If
869                  enabled, the plug-in parses the file and simulates an
870                  import. When enabling this option you should also set
871                  the <guilabel>Log file</guilabel>. The dry run
872                  mode allows testing of large imports and updates by
873                  creating a log file that can be examined for
874                  inconsistencies before the import is actually
875                  performed without a safety net.
876                </para>
877              </listitem>
878            </varlistentry>
879          </variablelist>
880
881
882        <para>
883          During file parsing the plug-in looks for the items
884          referenced on each line. There are three possible outcomes of this
885          item search:
886        </para>
887       
888          <itemizedlist>
889            <listitem>
890              <para>
891                No item is found. Depending on parameter settings, the
892                plug-in may abort, ignore the
893                line, or create a new item.
894              </para>
895            </listitem>
896            <listitem>
897              <para>
898                One item is found. This is the item that is going to
899                be updated.
900              </para>
901            </listitem>
902            <listitem>
903              <para>
904                More than one item is found. Depending on parameter
905                settings, the plug-in may abort or
906                ignore the line.
907              </para>
908            </listitem>
909          </itemizedlist>
910
911      </sect2>
912
913      <sect2 id="import_data.batch.comments">
914        <title>Comments on the item batch importers</title>
915
916        <para>
917          The item batch importers are not designed to change or
918          create annotations. There is another plug-in for this, see
919          <xref linkend="annotations.massimport" /> for an
920          introduction to the annotation importer.
921        </para>
922
923        <para>
924          There is no need to map all columns when running the
925          importer. When new items are created, usually the only
926          mandatory entry is <property>Name</property>, and when
927          running the plug-in in update mode only the column defining
928          the item identification property needs to be mapped. This
929          can be utilized when only one or a few properties need to
930          be updated: map only the columns that should be changed and the
931          plug-in will ignore the other properties and leave them as
932          they are already stored in BASE. This also means that to delete
933          a property value, that property must be mapped
934          and the value must be empty in the file. Note that multi-item
935          references cannot be deleted with the batch importer;
936          deletion of multi-item references must be done using the web
937          interface.
938        </para>
939
940        <para>
941          When parent and other relations are created using the
942          plug-in, the referenced items are properly linked and
943          updated. This means that when the import specifies a used quantity that
944          decreases the remaining quantity of a referenced item, the referenced item is updated
945          accordingly. Consequently, if the relation is removed in a
946          later update (for example because the wrong parent was referenced), the
947          referenced item is restored and any decrease in quantity
948          is also reset.
949        </para>
950
951        <para>
952          A common mistake is to forget to make sure that some of the
953          referenced items already exist in BASE, or at least are
954          accessible to the user performing the import. Items such as
955          protocols and labels must be added before they are
956          referenced. This is of course also true for other items, but during
957          batch import one usually follows the natural order of first
958          importing biosources, then samples, extracts, and so on. In this
959          way the parents are always present and may be referenced
960          without any issues.
961        </para>
962
963      </sect2>
964
965    </sect1>
966
967</chapter>