1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
---|
2 | <!-- |
---|
3 | $Id: fileupload.html 4509 2008-09-11 20:01:44Z jari $ |
---|
4 | |
---|
5 | Copyright (C) 2005 Jari Hakkinen, Nicklas Nordborg |
---|
6 | Copyright (C) 2006 Jari Hakkinen |
---|
7 | |
---|
8 | This file is part of BASE - BioArray Software Environment. |
---|
9 | Available at http://base.thep.lu.se/ |
---|
10 | |
---|
11 | BASE is free software; you can redistribute it and/or |
---|
12 | modify it under the terms of the GNU General Public License |
---|
13 | as published by the Free Software Foundation; either version 3 |
---|
14 | of the License, or (at your option) any later version. |
---|
15 | |
---|
16 | BASE is distributed in the hope that it will be useful, |
---|
17 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
---|
18 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
---|
19 | GNU General Public License for more details. |
---|
20 | |
---|
21 | You should have received a copy of the GNU General Public License |
---|
22 | along with BASE. If not, see <http://www.gnu.org/licenses/>. |
---|
23 | --> |
---|
24 | <html> |
---|
25 | <head> |
---|
26 | <title>BASE - Core specification - File upload and disk quota</title> |
---|
27 | <link rel=stylesheet type="text/css" href="../../styles.css"> |
---|
28 | </head> |
---|
29 | <body> |
---|
30 | |
---|
31 | <div class="navigation"> |
---|
32 | <a href="../../index.html">BASE</a> |
---|
33 | <img src="../../next.gif"> |
---|
34 | <a href="index.html">Core specification</a> |
---|
35 | <img src="../../next.gif"> |
---|
36 | File upload and disk quota |
---|
37 | </div> |
---|
38 | |
---|
39 | <h1>File upload and disk quota</h1> |
---|
40 | |
---|
41 | <div class="abstract"> |
---|
42 | <p> |
---|
43 | This document covers the details of how to handle file uploads in BASE. |
---|
44 | A discussion about disk quota is also included. |
---|
45 | </p> |
---|
46 | |
---|
47 | <b>Contents</b><br> |
---|
48 | <ol> |
---|
49 | <li><a href="#files">Files and directories</a> |
---|
50 | <li><a href="#secondary">Secondary storage</a> |
---|
51 | <li><a href="#quota">Disk quota</a> |
---|
52 | </ol> |
---|
53 | |
---|
54 | <b>See also</b><br> |
---|
55 | <ul> |
---|
56 | <li><a href="../../development/overview/data/files.html">Implementation overview - files</a> |
---|
57 | <li><a href="../../development/overview/data/quota.html">Implementation overview - quota</a> |
---|
58 | </ul> |
---|
59 | |
---|
60 | <p class="authors"> |
---|
61 | <b>Last updated:</b> $Date: 2008-09-11 20:01:44 +0000 (Thu, 11 Sep 2008) $ |
---|
62 | </p> |
---|
63 | </div> |
---|
64 | |
---|
65 | <a name="files"> |
---|
66 | <h2>1. Files and directories</h2> |
---|
67 | </a> |
---|
68 | |
---|
69 | <h3>1.1 Files</h3> |
---|
70 | <ol> |
---|
71 | <li>BASE should be able to store files related to experiments and |
---|
72 | other items. |
---|
73 | |
---|
74 | <li>A file may have a type, for example, raw data, protocol, reporter list, |
---|
75 | etc. The type of the file is only used for giving client applications |
---|
76 | a better way to filter files, not for stopping a user from using |
---|
77 | a file wherever it can be used. |
---|
78 | |
---|
79 | <li>BASE should keep track of whether a file has been used. By "used" |
---|
80 | we mean that it has been linked to another item, for example a |
---|
81 | protocol. |
---|
82 | |
---|
83 | <li>It should be possible to use a file multiple times. |
---|
84 | |
---|
85 | <li>A user should be able to delete the "physical" file from disk, but the |
---|
86 | information about the file should still remain in the database. |
---|
87 | |
---|
88 | <li>A deleted file can be re-uploaded in case it is needed again. |
---|
89 | |
---|
90 | <li>BASE may rename an uploaded file to avoid overwriting an existing one. |
---|
91 | |
---|
92 | <li>The core should calculate and store a unique value, for example |
---|
93 | the MD5 sum, for each file. This value is used to warn a user |
---|
94 | who is re-uploading a file. The user is not prevented from uploading |
---|
95 | since it is possible that errors may have been corrected. |
---|
96 | |
---|
97 | <li>Files brought back from secondary storage should, however, be checked |
---|
98 | for a valid MD5 value. |
---|
99 | |
---|
100 | </ol> |
---|
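<p>
The checksum rules in 1.1 could be sketched roughly as follows. This is a
minimal illustration, not the BASE implementation: the function names
<code>md5_of_stream</code>, <code>reupload</code> and
<code>restore_from_secondary</code> are hypothetical, and it assumes that the
warning on re-upload is triggered when the new content differs from the
stored MD5 sum.
</p>

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=64 * 1024):
    """Return the hex MD5 digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def reupload(stream, stored_md5):
    """Hypothetical helper for re-uploading a deleted file.

    Assumption: a mismatch with the stored MD5 only produces a warning,
    because the user may have corrected errors in the file.
    """
    new_md5 = md5_of_stream(stream)
    warning = None
    if new_md5 != stored_md5:
        warning = "content differs from the original upload"
    return new_md5, warning


def restore_from_secondary(stream, stored_md5):
    """Hypothetical helper for files brought back from secondary storage.

    Here a mismatch is treated as an error, since the content is expected
    to be unchanged.
    """
    if md5_of_stream(stream) != stored_md5:
        raise ValueError("MD5 mismatch for file restored from secondary storage")
```

<p>
The design choice illustrated here is that the same checksum is advisory for
user uploads but mandatory for files returning from secondary storage.
</p>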
101 | |
---|
102 | <h3>1.2 Directories</h3> |
---|
103 | |
---|
104 | <ol> |
---|
105 | <li>It should be possible to create a directory structure. The directory |
---|
106 | structure doesn't have to be physically represented on the disk. |
---|
107 | |
---|
108 | <li>Each directory may contain multiple files, but a single file can only |
---|
109 | appear inside one directory. |
---|
110 | |
---|
111 | <li>The directory structure may not limit how a file can be used, but |
---|
112 | is only used as a means for users to organise their files. |
---|
113 | |
---|
114 | <li>A client application may ignore the directory structure and |
---|
115 | display all files as if they were part of the root directory. |
---|
116 | |
---|
117 | <li>It is not possible to delete directories that contain files. |
---|
118 | <span class="note">[NOTE] Alternatively, if a directory that contains files is deleted |
---|
119 | the files are moved to the root directory, but this is a client issue and not |
---|
120 | a core issue.</span> |
---|
121 | |
---|
122 | <li>Some client applications, for example an FTP client, require that a file |
---|
123 | can be uniquely identified by name. This implies that all file names in a |
---|
124 | directory must be unique. |
---|
125 | |
---|
126 | <li class="question">[QUESTION] How do we handle sharing of files |
---|
127 | with users and groups? Should we require that all parent directories |
---|
128 | must also be shared? Or do we magically "create" a parallel directory |
---|
129 | structure like: /shared/johan, /shared/nicklas<br> |
---|
130 | [ANSWER] This is mainly a client issue. But, the core must allow |
---|
131 | a client to traverse the path leading to the file. |
---|
132 | |
---|
133 | |
---|
134 | </ol> |
---|
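<p>
A directory structure that exists only as data, as described in 1.2, might be
modelled along these lines. The <code>Directory</code> class and its methods
are hypothetical names used for illustration. The sketch covers unique names
within a directory and the rule against deleting directories that contain
files. Refusing to delete a directory that still has subdirectories is an
added assumption, not something the list above states.
</p>

```python
class Directory:
    """Hypothetical virtual directory; nothing is created on disk."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}  # subdirectory name -> Directory
        self.files = {}     # file name -> file id

    def path(self):
        """Return the absolute path, so a client can traverse to a file."""
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name

    def _require_free(self, name):
        # Names must be unique within a directory (needed by e.g. FTP clients).
        if name in self.children or name in self.files:
            raise ValueError("name already used in " + self.path() + ": " + name)

    def mkdir(self, name):
        self._require_free(name)
        child = Directory(name, parent=self)
        self.children[name] = child
        return child

    def add_file(self, name, file_id):
        self._require_free(name)
        self.files[name] = file_id

    def remove_child(self, name):
        child = self.children[name]
        # Spec: directories that contain files cannot be deleted.
        # Assumption: subdirectories block deletion as well.
        if child.files or child.children:
            raise ValueError("directory is not empty: " + child.path())
        del self.children[name]
```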
135 | |
---|
136 | <a name="secondary"> |
---|
137 | <h2>2. Secondary storage</h2> |
---|
138 | </a> |
---|
139 | |
---|
140 | <ol> |
---|
141 | <li>BASE can be configured to support a secondary storage, where files |
---|
142 | that are rarely used can be placed, for example on tape-backup. |
---|
143 | <div class="note"> |
---|
144 | [NOTE] The secondary storage is intended to be used for large |
---|
145 | files that are not regularly used once they have been parsed after |
---|
146 | the upload, for example images and raw data files. Such files may be |
---|
147 | moved to cheaper long-term storage. |
---|
148 | </div> |
---|
149 | |
---|
150 | <li>A user may flag that a file should be moved to the secondary storage. |
---|
151 | Information about the file should remain in the database. |
---|
152 | |
---|
153 | <li>A user may flag that a file placed in the secondary storage should be |
---|
154 | retrieved and placed in the primary storage again. |
---|
155 | |
---|
156 | <li>The BASE core will only handle the flagging of files to be moved. |
---|
157 | It is the responsibility of an external application to actually move |
---|
158 | the files between the primary and secondary storage. |
---|
159 | |
---|
160 | <li>The external application should check at regular intervals, |
---|
161 | for example once every night, whether files need to be moved. |
---|
162 | |
---|
163 | <li>A file that is placed in secondary storage can also be flagged |
---|
164 | to be deleted. |
---|
165 | |
---|
166 | </ol> |
---|
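<p>
The division of labour in section 2, where the core only records a flag and
an external application performs the move, could look something like the
sketch below. All names (<code>Location</code>, <code>FileRecord</code>,
<code>flag</code>, <code>run_mover</code>) are hypothetical, and the
<code>move</code> callback stands in for whatever actually transfers the
physical file.
</p>

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Location(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    DELETED = "deleted"  # physical file removed, database record kept


@dataclass
class FileRecord:
    name: str
    location: Location = Location.PRIMARY
    requested: Optional[Location] = None  # pending flag set by a user, if any


def flag(record, target):
    """Core side: only record the request, never touch the physical file."""
    if record.location is Location.DELETED:
        # Assumption: a deleted file has to be re-uploaded instead.
        raise ValueError("file has been deleted: " + record.name)
    record.requested = None if target is record.location else target


def run_mover(records, move):
    """External side: process every pending flag, e.g. from a nightly job.

    `move(record, source, target)` is a caller-supplied callback that does
    the real work; the record is updated only after it returns.
    """
    moved = []
    for record in records:
        if record.requested is None:
            continue
        move(record, record.location, record.requested)
        record.location = record.requested
        record.requested = None
        moved.append(record.name)
    return moved
```

<p>
Because the record is updated only after the callback returns, a failed move
leaves the flag in place so it can be retried on the next run.
</p>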
167 | |
---|
168 | <a name="quota"> |
---|
169 | <h2>3. Disk quota</h2> |
---|
170 | </a> |
---|
171 | |
---|
172 | <ol> |
---|
173 | <li>A user must be assigned a disk quota that may not be exceeded. |
---|
174 | |
---|
175 | <li>The quota is checked at the beginning of an operation, i.e. before |
---|
176 | uploading a file. If the check is successful the operation is |
---|
177 | allowed to proceed, even if the quota is exceeded after the operation. |
---|
178 | <span class="note">[NOTE] This is because if a plugin runs for |
---|
179 | several hours it should not be rejected while saving the result</span>. |
---|
180 | |
---|
181 | <li>The quota applies to uploaded files, and other data that takes |
---|
182 | a lot of disk space. What we mean by "other data" and "a lot of |
---|
183 | disk space" is decided for each case and should not matter to |
---|
184 | the quota system. |
---|
185 | |
---|
186 | <li>Quota values may be specified as a total sum, or with values |
---|
187 | for each type of data or file. |
---|
188 | |
---|
189 | <li>It should be possible to have independent quota settings for |
---|
190 | primary and secondary storage. |
---|
191 | |
---|
192 | <li>Files that have been deleted should not be counted. |
---|
193 | |
---|
194 | <li class="note">[IMPLEMENTATION NOTE] Checking against quota values |
---|
195 | is something that is done fairly often. Used disk space should not |
---|
196 | have to be recalculated each time. A cache holding the most recent |
---|
197 | values should be considered. |
---|
198 | |
---|
199 | <li>A group may also be assigned quota values. |
---|
200 | |
---|
201 | <li>A user may be configured to use the quota from <b>one</b> |
---|
202 | of the groups where the user is a member. Then, both the user's |
---|
203 | individual quota and the group's quota are checked. |
---|
204 | |
---|
205 | <li>The amount of disk space used should be stored per user, |
---|
206 | item and type. It will then be possible to generate reports of |
---|
207 | disk usage for groups and projects as well. |
---|
208 | |
---|
209 | <li class="question">[QUESTION] How do we handle removing a user from |
---|
210 | a group from which the user has used quota? How do we handle adding |
---|
211 | a user to a group? How do we handle changing owner on an item |
---|
212 | that uses quota? |
---|
213 | <p> |
---|
214 | [ANSWER] We do not make any automatic changes that require batch |
---|
215 | updates. If a user |
---|
216 | is removed from or added to the quota group, the disk usage is still |
---|
217 | counted against the original group. Changing the owner of the |
---|
218 | item will cause the disk usage to be taken over by |
---|
219 | the new owner. |
---|
220 | </ol> |
---|
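<p>
The quota rules in section 3 can be illustrated with a small sketch. It is
not the BASE implementation: the names <code>Quota</code>, <code>User</code>,
<code>store</code> and <code>release</code> are hypothetical, a limit of
<code>None</code> is assumed to mean "unlimited", and the check is assumed to
pass as long as the cached usage is still below the limit when the operation
starts.
</p>

```python
from dataclasses import dataclass
from typing import Optional


class QuotaExceeded(Exception):
    pass


@dataclass
class Quota:
    limit: Optional[int]  # bytes; None means unlimited (assumption)
    used: int = 0         # cached usage, so it is not recalculated each time

    def has_room(self):
        return self.limit is None or self.used < self.limit


@dataclass
class User:
    name: str
    quota: Quota
    group_quota: Optional[Quota] = None  # quota of the one selected group


def applicable_quotas(user):
    """Both the individual quota and the group quota, if any, apply."""
    quotas = [user.quota]
    if user.group_quota is not None:
        quotas.append(user.group_quota)
    return quotas


def store(user, num_bytes):
    """Check the quota before the operation, then record the usage.

    The usage is recorded without a second check, so an operation that
    started within the quota may end above it.
    """
    if not all(q.has_room() for q in applicable_quotas(user)):
        raise QuotaExceeded(user.name)
    for q in applicable_quotas(user):
        q.used += num_bytes


def release(user, num_bytes):
    """Deleted files are no longer counted against the quota."""
    for q in applicable_quotas(user):
        q.used = max(0, q.used - num_bytes)
```

<p>
Checking only at the start is what allows a long-running plugin to save its
result even when that result pushes the user over the limit.
</p>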
221 | |
---|
222 | |
---|
223 | |
---|
224 | </body> |
---|
225 | </html> |
---|