<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC
  "-//Dawid Weiss//DTD DocBook V3.1-Based Extension for XML and graphics inclusion//EN"
  "../../../../lib/docbook/preprocess/dweiss-docbook-extensions.dtd"
[
  <!ENTITY runplugin.configure.common
  "The top of the window displays the names of the selected plug-in and
  configuration, a list with parameters to the left, an area for input fields to the
  right and buttons to proceed with at the bottom.
  Click on a parameter in the parameter list to show the form fields
  for entering values for the parameter to the right. Parameters
  with an <guilabel>X</guilabel> in front of their names already have a
  value. Parameters marked with a blue rectangle are required and must
  be given a value before it is possible to proceed."
  >
]>
<!--
  $Id: import_data.xml 5784 2011-10-05 12:52:42Z nicklas $

  Copyright (C) 2007 Peter Johansson, Nicklas Nordborg, Martin Svensson
  Copyright (C) 2008 Jari Häkkinen

  This file is part of BASE - BioArray Software Environment.
  Available at http://base.thep.lu.se/

  BASE is free software; you can redistribute it and/or
  modify it under the terms of the GNU General Public License
  as published by the Free Software Foundation; either version 3
  of the License, or (at your option) any later version.

  BASE is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with BASE. If not, see <http://www.gnu.org/licenses/>.
-->
<chapter id="import_data" chunked="0">
  <title>Import of data</title>
  <para>
    In some places the only way to get data into BASE is to import it
    from a file. This typically includes raw data, array design
    features, reporters and other things that would be inconvenient
    to enter by hand due to the large number of data items. There are
    also convenience batch importers for importing other items such as
    biosources, samples, and annotations. The batch importers are
    described later in this chapter after the general import
    description.
  </para>
  <para>
    Normally, a plug-in handles one type of item and may require a
    configuration. For example, the import plug-ins need some
    information about how to find headers and data lines in
    files. BASE ships with a number of import plug-ins as a part of
    the core plug-ins package, cf. <xref linkend="coreplugins.import"
    />. The core plug-in section links to configuration examples for
    some of the plug-ins. Go to
    <menuchoice>
      <guimenu>Administrate</guimenu>
      <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
      <guisubmenu>Plug-in definitions</guisubmenu>
    </menuchoice>
    to check which plug-ins are installed on your BASE server. When
    BASE finds a plug-in that supports import of a certain type of
    item, an &gbImport; button is displayed in the toolbar on either
    the list view or the single-item view.
  </para>
  <note>
    <title>No "Import" button?</title>
    <para>
      If the import button is missing from a page where you would expect
      to find it, this usually means one of the following:
    </para>
    <itemizedlist>
      <listitem>
        <simpara>
          The logged-in user does not have permission to use the plug-in.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          The plug-in requires a configuration, but none has been
          created or the logged-in user does not have permission to
          use any of the existing configurations.
        </simpara>
      </listitem>
    </itemizedlist>
    <para>
      Contact the server administrator or another user who has permission to
      administrate the plug-ins.
    </para>
  </note>

  <sect1 id="import_data.import">
    <title>General import procedure</title>

    <para>
      A data import is started through a wizard-like interface. There
      are a number of steps you have to go through:
    </para>

    <orderedlist>
      <listitem>
        <simpara>
          Select a plug-in and file format to use, or select the
          auto detect option.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          If you selected the auto detect option, you must select
          a file to use.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Specify plug-in parameters.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Add the import job to the job queue.
        </simpara>
      </listitem>
      <listitem>
        <simpara>
          Wait for the job to finish.
        </simpara>
      </listitem>
    </orderedlist>

    <sect2 id="import_export_data.import.plugin_fileformat">
      <title>Select plug-in and file format</title>
      <para>
        Click on the &gbImport; button
        in the toolbar to start the import wizard. The first step is to
        select which plug-in and, if supported, which
        file format to use. There is also an <guilabel>auto detect</guilabel>
        option that lets you select a file and have BASE try to find a suitable
        plug-in/file format to use.
      </para>

      <figure id="import_export_data.figures.select_import_plugin">
        <title>Select plug-in and file format</title>
        <screenshot>
          <mediaobject>
            <imageobject><imagedata fileref="figures/select_import_plugin.png" format="PNG" /></imageobject>
          </mediaobject>
        </screenshot>
      </figure>


      <helptext external_id="import.selectplugin"
        title="Select plug-in and file format for data import">

        <variablelist>
          <varlistentry>
            <term><guilabel>Plugin + file format</guilabel></term>
            <listitem>
              <para>
                This is a combined list of plug-ins and their
                respective file format configurations. The list only
                includes combinations that
                the logged-in user has permission to use. If you select
                an entry, a short description of the plug-in and configuration
                is displayed
                below the list. More information about the plug-ins can
                be found under the menu choices
                <menuchoice>
                  <guimenu>Administrate</guimenu>
                  <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
                  <guisubmenu>Plug-in definitions</guisubmenu>
                </menuchoice>
                and
                <menuchoice>
                  <guimenu>Administrate</guimenu>
                  <guimenuitem>Plug-ins &amp; extensions</guimenuitem>
                  <guisubmenu>Plug-in configuration</guisubmenu>
                </menuchoice>.
              </para>
              <note>
                <title>File format vs. Configuration</title>
                <simpara>
                  A file format is the same thing as a plug-in configuration.
                  It may be confusing that the interface sometimes uses
                  <emphasis>file format</emphasis> and sometimes uses
                  <emphasis>configuration</emphasis>, but for now, we'll have
                  to live with it.
                </simpara>
              </note>
            </listitem>
          </varlistentry>
        </variablelist>

        <para>
          Proceed to the next step by clicking on the
          &gbNext; button.
        </para>

        <seeother>
          <other external_id="import.autodetect">The auto detect function</other>
        </seeother>
      </helptext>

      <sect3 id="import_export_data.import.plugin_fileformat.autodetect">
        <title>The auto detect function</title>

        <helptext
          external_id="import.autodetect"
          title="The auto detect function">

          <para>
            The auto detect function lets you select a file and have
            BASE try to find a suitable plug-in and file format. This option is
            selected by default in the combined plug-in and file format list when there is
            at least one plug-in that supports auto detection.
          </para>
          <note>
            <title>Support of auto detect</title>
            <para>
              Not all plug-ins support auto detection. The ones that do are marked in
              the list with <guilabel>×</guilabel>.
            </para>
          </note>

          <para>
            Select the <guilabel>auto detect (all)</guilabel> option to search for a file format
            in all plug-ins that support the feature, or select the <guilabel>auto detect (plugin)</guilabel>
            option to only search the file formats for a specific plug-in.
            Continue to the next step by clicking on the &gbNext; button.
          </para>

          <seeother>
            <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
            <other external_id="import.autodetect.selectfile">Select file for auto detection</other>
          </seeother>

        </helptext>

        <para>
          You must now select a file to import from.
        </para>

        <figure id="import_export_data.figures.select_autodetect_file">
          <title>Select file for auto detection</title>
          <screenshot>
            <mediaobject>
              <imageobject><imagedata fileref="figures/select_autodetect_file.png" format="PNG" /></imageobject>
            </mediaobject>
          </screenshot>
        </figure>

        <helptext external_id="import.autodetect.selectfile"
          title="Select file for auto detection">

          <variablelist>
            <varlistentry>
              <term><guilabel>Plugin</guilabel></term>
              <listitem>
                <para>
                  Displays the selected plug-in or <guilabel>all</guilabel> if
                  auto detection is used on all supporting plug-ins.
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term><guilabel>File</guilabel></term>
              <listitem>
                <para>
                  Enter the path and file name for the
                  file you want to use. Use the <guibutton>Browse…</guibutton>
                  button to browse for the file in BASE's file system.
                  If the file does not exist in the file system you have the option
                  to upload it.
                  <nohelp>Read more about this in <xref linkend="file_system" />.</nohelp>
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term><guilabel>Character set</guilabel></term>
              <listitem>
                <para>
                  The character set used in text files. If the selected file has been configured
                  with a character set, the correct option is automatically selected. In all
                  cases, you have the option to override the default selection. Most files
                  use either the UTF-8 or the ISO-8859-1 character set.
                </para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term><guilabel>Recently used</guilabel></term>
              <listitem>
                <para>
                  A list of files you have recently used
                  for auto detection.
                </para>
              </listitem>
            </varlistentry>
          </variablelist>

          <para>
            Click on the &gbNext; button
            to start the auto detection. There are three possible outcomes:
          </para>

          <itemizedlist>
            <listitem>
              <para>
                If exactly one matching plug-in and file format is found, the next step is
                to configure any additional parameters needed
                by the plug-in. This is the same step as if you had selected
                the same plug-in and file format in the first step.
              </para>
            </listitem>
            <listitem>
              <para>
                If no matching plug-in and file format is found, an error message
                is displayed. If you are logged in with sufficient permissions, there
                is an option to create a new file format/configuration.
              </para>
            </listitem>
            <listitem>
              <para>
                If multiple matching plug-ins and file formats are found,
                you will be taken back to the first step. This time
                the list will only include the matching plug-ins/file formats
                and the auto detect option is not present.
              </para>
            </listitem>
          </itemizedlist>

          <seeother>
            <other external_id="import.selectplugin">Select plug-in and file format for data import</other>
            <other external_id="import.autodetect">The auto detect function</other>
          </seeother>

        </helptext>

      </sect3>

    </sect2>

    <sect2 id="import_export_data.import.pluginparameters">
      <title>Specify plug-in parameters</title>
      <para>
        When you have selected a plug-in and file format or used
        the auto detect function to find one, a form where
        you can enter additional parameters for the plug-in is displayed.
      </para>

      <figure id="import_export_data.figures.confiure_plugin">
        <title>Specify plug-in parameters</title>
        <screenshot>
          <mediaobject>
            <imageobject>
              <imagedata
                scalefit="1" width="100%"
                fileref="figures/plugin_parameters.png" format="PNG" />
            </imageobject>
          </mediaobject>
        </screenshot>
      </figure>

      <helptext external_id="runplugin.configure.import"
        title="Specify plug-in parameters">
        <para>
          &runplugin.configure.common;
        </para>

        <para>
          The parameter list varies considerably from plug-in to plug-in.
          Common parameters for import plug-ins are:
        </para>

        <variablelist>
          <varlistentry>
            <term><guilabel>File</guilabel></term>
            <listitem>
              <para>
                The file to import data from. A value is already set if
                you used the auto detect function.
              </para>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term><guilabel>Error handling</guilabel></term>
            <listitem>
              <para>
                A section with options for how to
                handle errors when parsing the file. Normally you can
                select whether the import should fail as a whole or
                only the line with the error should be skipped.
              </para>
            </listitem>
          </varlistentry>
        </variablelist>

        <para>
          Continue to the next step by clicking the
          &gbNext; button.
        </para>

        <seeother>
          <other external_id="runplugin.configure">The plug-in configuration wizard</other>
        </seeother>
      </helptext>

    </sect2>

    <sect2 id="import_export_data.import.jobqueue">
      <title>Add the import job to the job queue</title>

      <para>
        In this window, fill in information about the job, such as name and
        description. The name is required and must be a valid string. There
        are also two check boxes on this page.
        <variablelist>
          <varlistentry>
            <term>
              <guilabel>Send message</guilabel>
            </term>
            <listitem>
              <para>
                Tick this check box if the job should send you a message when it is
                finished, otherwise untick it.
              </para>
            </listitem>
          </varlistentry>
          <varlistentry>
            <term>
              <guilabel>Remove job</guilabel>
            </term>
            <listitem>
              <para>
                If this check box is ticked, the job will be marked as removed when
                it is finished, provided that it finished successfully. This
                is only available for import and export plug-ins.
              </para>
            </listitem>
          </varlistentry>
        </variablelist>
      </para>
      <para>
        Clicking on
        &gbFinish;
        when everything is set will end the job configuration and place the job in the job queue.
        A self-refreshing window appears with information about the
        job's status and execution time. How long it takes before the job starts to run
        depends on its priority and the priority of the other jobs in the queue. The job does not
        depend on the status window to be able to run and the window can be
        closed without interrupting the execution.
      </para>
      <tip>
        <title>View job status</title>
        <para>
          A job's status can be viewed at any time by opening it from the job list page,
          <menuchoice>
            <guimenuitem>View</guimenuitem>
            <guimenuitem>Jobs</guimenuitem>
          </menuchoice>.
        </para>
      </tip>
    </sect2>

  </sect1>

  <sect1 id="import_data.batch">
    <title>Batch import of data</title>

    <para>
      There are in general several possibilities to import data into
      BASE. Bulk data such as reporter information and raw data
      imports are handled by plug-ins created for these tasks. For
      item types that are imported in more moderate quantities, a
      suite of batch item importers is available
      (<xref linkend="coreplugins.import.batch" />). These importers
      allow the user to create new items in BASE and define item
      properties and associations between items using tab-separated
      (or equivalent) files.
    </para>

    <para>
      The batch importers are available for most users and they may
      have been pre-configured, but there is no requirement to
      configure the batch importer plug-ins. Here we assume that no
      plug-in configuration exists for the batch
      importers. Pre-configuration of the importers is really only
      needed for facilities that perform the same imports regularly,
      whereas for occasional use the provided wizard is
      sufficient. Configuring the importers follows the route
      described in <xref linkend="plugins.configuration" />.
    </para>

    <para>
      The batch importers either create new items or update
      already existing items. In either mode the plug-in can set
      values for
      <itemizedlist>
        <listitem>
          <para>
            Simple properties, <emphasis>e.g.</emphasis>, string
            values, numeric values, dates, etc.
          </para>
        </listitem>
        <listitem>
          <para>
            Single-item references, <emphasis>e.g.</emphasis>,
            protocol, label, software, owner, etc.
          </para>
        </listitem>
        <listitem>
          <para>
            Multi-item references are references to several other
            items of the same type. The extracts of a
            physical bioassay or pooled samples are two examples of
            items that refer to several other items; a physical bioassay
            may contain several extracts and a sample may be
            a pool of several samples. In some cases a multi-item
            reference is bundled with simple
            values, <emphasis>e.g.</emphasis>, used quantity of a
            source biomaterial, the position an extract is
            used on, etc. Multi-item references are never removed by
            the importer, only added or updated. Removing an item
            from a multi-item reference is a manual procedure to be
            done using the web interface.
          </para>
        </listitem>
      </itemizedlist>
      The batch importers do not set values for annotations since
      this is handled by the annotation importer
      plug-in (<xref linkend="annotations.massimport" />). However,
      the annotation importer and batch item importers have similar
      behaviour and functionality to minimize the learning cost for
      users.
    </para>

    <para>
      The importer only works with one type of item at a time and can be
      used in a <emphasis>dry-run</emphasis> mode where everything
      is performed as if a real import is taking place, but the work
      (transaction) is not committed to the database. The result of
      the test can be stored to a log file and the user can examine
      the output to see how an actual import would perform. Summary
      results such as the number of items imported and the number of
      failed items are reported after the import is finished, and in
      the case of non-recoverable failure the reason is reported.
    </para>

    <sect2 id="import_data.batch.fileformat">
      <title>File format</title>

      <para>
        For proper and efficient use of the batch importers, users
        need to understand how the files to be imported should be
        formatted. The input file must be organised into columns separated by a
        specified character such as a tab or comma character. The
        data header line contains the column headers, which define
        the contents of each column, and marks the beginning of
        item data in the file. The item data block continues until
        the end of the file or to an optional data footer line
        defining the end of the data block.
      </para>

      <para>
        When reading data for an item the plug-in must use some
        information for identifying items. Depending on item type
        there are two or three options to select the item identifier:
        <itemizedlist>
          <listitem>
            <para>
              Using the internal <property>id</property>. This is
              always unique for a specific BASE server.
            </para>
          </listitem>
          <listitem>
            <para>
              Using the <property>name</property>. This may or may
              not be unique.
            </para>
          </listitem>
          <listitem>
            <para>
              Some items have
              an <property>externalId</property>. This may or may
              not be unique.
            </para>
          </listitem>
          <listitem>
            <para>
              Array slides may have a <property>barcode</property>
              which is similar to
              the <property>externalId</property>.
            </para>
          </listitem>
        </itemizedlist>
        It is important that the identifier selected
        is <emphasis>unique</emphasis> in the file used. If the
        file is used to update items already existing in BASE, the
        identifier should also be unique in BASE for the user
        performing the update. The plug-in will check uniqueness
        when default parameters are used but the user may change the
        default behaviour.
      </para>

      <para>
        Data for a single item may be split into multiple lines. The
        first line contains simple properties and single-item
        references, and the first multi-item reference. If there are
        more multi-item references they should be on the following
        lines with empty values in all other columns, except for the
        column holding the item identifier. The item identifier must
        have the same value on all lines associated with the
        item. Lines containing other data than multi-item references
        will be ignored or may be considered as an error depending
        on plug-in parameter settings. The reason for treating
        duplicate data entries as an error is to catch situations where
        two items are given the same item identifier by accident.
      </para>
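      <para>
        As an illustration of the layout described above, the following
        sketch shows a hypothetical comma-separated file that creates two
        pooled samples. The column headers and values are invented for
        this example and are not taken from an actual BASE configuration;
        each column still has to be mapped to an item property in the
        plug-in parameters. The name is assumed to be the item identifier,
        so the second line for <userinput>Pool 1</userinput> repeats the
        name, leaves the simple properties empty and only adds one more
        parent sample.
      </para>
      <programlisting>Name,Description,Parent sample,Used quantity
Pool 1,Pool of two samples,Sample A,10
Pool 1,,Sample B,15
Pool 2,Second pool,Sample C,5</programlisting>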

    </sect2>

    <sect2 id="import_data.batch.running">
      <title>Running the item batch importer</title>

      <para>
        This section discusses specific parameters and features of the
        batch importers. The general use of the batch importers
        follows the description outlined in
        <xref linkend="import_data.import" /> and the setting of
        column mapping parameters is assisted by
        the <guilabel>Test with file</guilabel> function described
        in <xref linkend="plugins.configuration.testwithfile"
        />. The column headers are mapped to item properties at each
        use of the plug-in but, as pointed out above, they can also
        be predefined by saving settings as a plug-in
        configuration. The configuration also includes the separator
        character and other information that is needed to parse
        files. The ability to save configurations depends on user
        credentials and is by default only granted to administrators.
      </para>

      <para>
        The plug-in parameter dialog follows the standard BASE plug-in
        layout and shows help information for selected
        parameters. The list below comments on some of the
        parameters available.
      </para>

      <variablelist>
        <varlistentry>
          <term>
            <guilabel>Mode</guilabel>
          </term>
          <listitem>
            <para>
              Select the mode of the plug-in. The plug-in can
              create new items and/or update items already
              existing in BASE. This setting is available to allow
              the user to make a conscious choice of how to treat
              missing or already existing items. For example, if
              the user selects to only update items already
              existing, the plug-in will complain if an item in the
              file does not exist in BASE (using default error
              condition treatment). This adds an extra layer of
              security and diagnostics for the user during import.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <guilabel>Data directory</guilabel>
          </term>
          <listitem>
            <para>
              This option is only available for items that have support for
              attaching files (e.g. array design, derived bioassay, etc.).
              This setting is used to resolve file references that do not
              include a complete absolute path.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <guilabel>Identification method</guilabel>
          </term>
          <listitem>
            <para>
              This parameter defines the method to use to find
              already existing items. The parameter can only be
              set to one of the item properties listed in the
              plug-in parameter dialog. The property selected by
              the user must be mapped to a column in the file. If
              it is not mapped, there is no way for the
              plug-in to identify whether an item already exists.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <guilabel>Item subtypes</guilabel>
          </term>
          <listitem>
            <para>
              Only look for existing items among the selected subtypes. If no subtype
              is selected all items are searched. If exactly one subtype is selected
              new items are automatically created with this subtype (unless it is overridden
              by specific subtype values in the import file).
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>
            <guilabel>Owned by me</guilabel>, <guilabel>Shared to
            me</guilabel>, <guilabel>In current
            project</guilabel>, and <guilabel>Owned by
            others</guilabel>
          </term>
          <listitem>
            <para>
              Defines the set of items the plug-in should look in
              when it checks whether an item already exists. The
              options are the same as those available in list
              views and the actual set of parameters depends on
              user credentials.
            </para>
            <para>
              When <property>id</property> is used as
              the <guilabel>Identification method</guilabel>, the
              plug-in looks for the item irrespective of the settings
              of these parameters. Of course, the user still must
              have proper access to the item referenced.
            </para>
          </listitem>
        </varlistentry>
737 | <varlistentry> |
---|
738 | <term> |
---|
739 | <guilabel>Column mapping expressions</guilabel> |
---|
740 | </term> |
---|
741 | <listitem> |
---|
742 | <para> |
---|
743 | Use the <guilabel>Test with file</guilabel> function |
---|
744 | described in |
---|
745 | <xref linkend="plugins.configuration.testwithfile" |
---|
746 | /> to set the column mapping parameters. |
---|
747 | </para> |
---|
748 | <para> |
---|
749 | When working with biomaterial items, the |
---|
750 | <guilabel>Parent type</guilabel> property is used to |
---|
751 | tell the plug-in how to find parent items. This only |
---|
752 | has to be set if the parent item is of the same type |
---|
753 | as the biomaterial being imported since the default |
---|
754 | is to look for the nearest parent type in the predefined hierarchy. |
---|
755 | Starting from the top of the hierarchy, the BASE ordering
---|
756 | of the <emphasis>parent - child - grandchild -
---|
757 | ...</emphasis> item relation is <emphasis>biosource
---|
758 | - sample - extract</emphasis>.
---|
759 | </para> |
---|
760 | <para> |
---|
761 | The values accepted for <guilabel>Parent type</guilabel> |
---|
762 | are <constant>BIOSOURCE</constant>, |
---|
763 | <constant>SAMPLE</constant> or <constant>EXTRACT</constant>. |
---|
764 | Sometimes all items in a file to be imported have the same parent |
---|
765 | type but there is no column with this information. This can |
---|
766 | be resolved by setting |
---|
767 | the <guilabel>Parent type</guilabel> mapping to a |
---|
768 | constant string (that is, a string without any backslash '\' character).
---|
769 | </para> |
---|
770 | </listitem> |
---|
771 | </varlistentry> |
---|
772 | <varlistentry> |
---|
773 | <term> |
---|
774 | <guilabel>Permissions</guilabel> |
---|
775 | </term> |
---|
776 | <listitem> |
---|
777 | <para> |
---|
778 | This is a column mapping that can be used to update the permissions |
---|
779 | set on items. Normally, new items are only shared to the active project |
---|
780 | (if any). By naming a permission template, new items are shared using |
---|
781 | the permissions from that template instead. Permissions on already existing |
---|
782 | items are merged with the permissions from the template.
---|
783 | </para> |
---|
784 | </listitem> |
---|
785 | </varlistentry> |
---|
786 | </variablelist> |
---|
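<para>
The constant mapping for <guilabel>Parent type</guilabel> described
above can be illustrated with a minimal sketch. It assumes a
hypothetical tab-separated file with the column headers
<constant>Name</constant> and <constant>Parent sample</constant>,
in which every parent is a sample. The column headers and the
<guilabel>Name</guilabel> and <guilabel>Parent</guilabel> mapping
parameters are assumptions made for illustration only. The actual
parameter list depends on the importer in use. A value enclosed in
backslashes refers to a column in the file, and a value without
backslashes is used as a constant for every line:
</para>
<programlisting>
Name           \Name\
Parent type    SAMPLE
Parent         \Parent sample\
</programlisting>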
787 | |
---|
788 | <para> |
---|
789 | After setting the parameters, |
---|
790 | select <guilabel>Next</guilabel>. Another parameter dialog |
---|
791 | will appear where error handling options can be set along
---|
792 | with the following parameters:
---|
793 | </para> |
---|
794 | |
---|
795 | <variablelist> |
---|
796 | <varlistentry> |
---|
797 | <term> |
---|
798 | <guilabel>Log file</guilabel> |
---|
799 | </term> |
---|
800 | <listitem> |
---|
801 | <para> |
---|
802 | Setting this parameter will turn on logging. The |
---|
803 | plug-in will give detailed information about how the |
---|
804 | file is parsed. This is useful for resolving file |
---|
805 | parsing issues. |
---|
806 | </para> |
---|
807 | </listitem> |
---|
808 | </varlistentry> |
---|
809 | <varlistentry> |
---|
810 | <term> |
---|
811 | <guilabel>Dry run</guilabel> |
---|
812 | </term> |
---|
813 | <listitem> |
---|
814 | <para> |
---|
815 | Enable or disable a test run of the plug-in. If
---|
816 | enabled, the plug-in will parse the file and simulate an
---|
817 | import. When enabling this option you should also set
---|
818 | the <guilabel>Log file</guilabel> parameter. The dry run
---|
819 | mode allows testing of large imports and updates by
---|
820 | creating a log file that can be examined for
---|
821 | inconsistencies before actually performing the action
---|
822 | without a safety net.
---|
823 | </para> |
---|
824 | </listitem> |
---|
825 | </varlistentry> |
---|
826 | </variablelist> |
---|
827 | |
---|
828 | |
---|
829 | <para> |
---|
830 | During file parsing the plug-in will look for items |
---|
831 | referenced on each line. There are three possible outcomes of this
---|
832 | item search:
---|
833 | </para> |
---|
834 | |
---|
835 | <itemizedlist> |
---|
836 | <listitem> |
---|
837 | <para> |
---|
838 | No item is found. Depending on parameter settings, the
---|
839 | plug-in may abort, ignore the line, or
---|
840 | create a new item.
---|
841 | </para> |
---|
842 | </listitem> |
---|
843 | <listitem> |
---|
844 | <para> |
---|
845 | One item is found. This is the item that is going to |
---|
846 | be updated. |
---|
847 | </para> |
---|
848 | </listitem> |
---|
849 | <listitem> |
---|
850 | <para> |
---|
851 | More than one item is found. Depending on parameter
---|
852 | settings, the plug-in may abort or
---|
853 | ignore the line.
---|
854 | </para> |
---|
855 | </listitem> |
---|
856 | </itemizedlist> |
---|
857 | |
---|
858 | </sect2> |
---|
859 | |
---|
860 | <sect2 id="import_data.batch.comments"> |
---|
861 | <title>Comments on the item batch importers</title> |
---|
862 | |
---|
863 | <para> |
---|
864 | The item batch importers are not designed to change or |
---|
865 | create annotations. There is another plug-in for this; see
---|
866 | <xref linkend="annotations.massimport" /> for an |
---|
867 | introduction to the annotation importer. |
---|
868 | </para> |
---|
869 | |
---|
870 | <para> |
---|
871 | There is no need to map all columns when running the |
---|
872 | importer. When new items are created, usually the only
---|
873 | mandatory entry is <property>Name</property>, and when
---|
874 | running the plug-in in update mode only the column defining
---|
875 | the item identification property needs to be defined. This
---|
876 | can be utilized when only one or a few properties need to
---|
877 | be updated; map only the columns that should be changed and the
---|
878 | plug-in will ignore the other properties and leave them as
---|
879 | they are already stored in BASE. This also means that if a
---|
880 | property should be deleted, that property must be mapped
---|
881 | and the value must be empty in the file. Note that multi-item
---|
882 | references cannot be deleted with the batch importer;
---|
883 | they must be deleted using the web
---|
884 | interface.
---|
885 | </para> |
---|
886 | |
---|
887 | <para> |
---|
888 | When parent and other relations are created using the |
---|
889 | plug-in the referenced items are properly linked and |
---|
890 | updated. This means that when a used quantity decreases the
---|
891 | quantity of a referenced item, the referenced item is updated
---|
892 | accordingly. Consequently, if the relation is removed in a
---|
893 | later update (for example, because the wrong parent was referenced), the
---|
894 | referenced item is restored and any decrease in quantity
---|
895 | is also reset.
---|
896 | </para> |
---|
897 | |
---|
898 | <para> |
---|
899 | A common mistake is to forget to make sure that the
---|
900 | referenced items already exist in BASE and are
---|
901 | accessible to the user performing the import. Items such as
---|
902 | protocols and labels must be added before referencing
---|
903 | them. This is also true for other items, but during
---|
904 | batch import one usually follows the natural order of
---|
905 | importing biosources first, then samples, extracts, and so on. In this
---|
906 | way the parents are always present and may be referenced
---|
907 | without any issues.
---|
908 | </para> |
---|
909 | |
---|
910 | </sect2> |
---|
911 | |
---|
912 | </sect1> |
---|
913 | |
---|
914 | </chapter> |
---|