Opened 14 years ago

Last modified 11 years ago

#791 new task

Plug-in execution time estimates

Reported by: Jari Häkkinen
Owned by: everyone
Priority: major
Milestone: BASE Future Release
Component: coreplugins
Version:
Keywords:
Cc:

Description (last modified by Nicklas Nordborg)

Plug-ins should be better at estimating their expected execution time. This is needed for better utilization of the job slots. Too many of the core plug-ins have a SHORT time estimate.

We probably need support functions in the core for this. For example, an index number can be created based on the total size of the files a plug-in is going to process and/or the total number of spots, number of queries, etc. The plug-in should give scaling factors (log, linear, log * linear, quadratic, etc.) and coefficients. It may also have to tell whether it does most of its work by running SQL against the database or in program code.
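A minimal sketch of how such a support function might combine the scaling factors and coefficients a plug-in reports. Python is used only for brevity (BASE itself is Java), and every name here (`SCALINGS`, `raw_cost`, the term tuples) is hypothetical and not part of the BASE API.

```python
import math

# Hypothetical scaling functions a plug-in could declare.
# log(n + 1) is used so that an input size of 0 gives a cost of 0.
SCALINGS = {
    "log": lambda n: math.log(n + 1),
    "linear": lambda n: float(n),
    "log_linear": lambda n: n * math.log(n + 1),
    "quadratic": lambda n: float(n) ** 2,
}


def raw_cost(terms):
    """Sum coefficient * scaling(size) over (scaling_name, coefficient, size) terms."""
    return sum(coeff * SCALINGS[name](size) for name, coeff, size in terms)


# Illustrative numbers only: a plug-in whose work is linear in total
# file size (MB) and log * linear in the number of spots.
cost = raw_cost([
    ("linear", 0.5, 200),          # 200 MB of input files
    ("log_linear", 0.001, 50000),  # 50,000 spots
])
```

The result is an unnormalized number; turning it into a time estimate would still need a server-dependent calibration.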

The server admin should provide configuration settings (in base.config) that depend on the hardware/performance of the server. Based on this, an index in the range 1-100 is calculated, and that range is divided into 4 (or more??) slots.
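A rough sketch of the index-to-slot mapping, written in Python for brevity. The `server_factor` and `max_cost` parameters stand in for settings an admin might put in base.config; they are assumptions, as are the slot names other than SHORT and LONG.

```python
# Assumed slot names; only SHORT and LONG are mentioned in this ticket.
SLOTS = ["SHORTEST", "SHORT", "MEDIUM", "LONG"]


def to_index(raw_cost, server_factor, max_cost):
    """Scale a plug-in's raw cost to an index clamped to the range 1-100.

    server_factor: hypothetical per-server setting (>1 for slower hardware).
    max_cost: hypothetical raw cost that should map to index 100.
    """
    scaled = raw_cost * server_factor / max_cost * 100
    return max(1, min(100, round(scaled)))


def to_slot(index, slots=SLOTS):
    """Divide the 1-100 range into equally wide slots and pick one."""
    width = 100 / len(slots)
    position = int((index - 1) // width)
    return slots[min(len(slots) - 1, position)]
```

With four slots, indexes 1-25 fall in the first slot, 26-50 in the second, and so on. Passing a longer `slots` list gives a finer division without changing the code.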

Change History (3)

comment:1 Changed 14 years ago by Nicklas Nordborg

Description: modified (diff)

comment:2 Changed 14 years ago by Johan Enell

Milestone: changed from BASE 2.6 to BASE 2.x+

comment:3 Changed 11 years ago by Nicklas Nordborg

This ticket has been around for some time and I really don't think there is any reasonable automatic solution for it. There are too many parameters involved. Consider, for example, the test result in #1238 (BioAssaySetExporter performance). For the same data set, the export times vary a lot depending on which fields are selected.

The impact of the execution time also depends on what the plug-in is doing. Base1 plug-ins that run as a separate process have a much lower impact than a plug-in that works mainly against the database. In the first case, a reasonably equipped job agent can run several plug-ins at a time, and it is possible to add more job agents if more performance is needed. In the latter case, the database is the limiting factor and adding more job agents will not help. In fact, it will only make everything (including the web interface) slower, since most of the time everybody is just waiting for the database to deliver the results.

So, I think a different approach is needed. My main ideas are:

  • Possibility to configure different queues for different plug-ins or groups of plug-ins. E.g. one queue for Base1 plug-ins, one queue for import plug-ins, etc.
  • Let users select an estimated execution time. Possibly limited by role permissions or by an admin for each plug-in.
  • Job agents may be configured to automatically kill a job if its running time exceeds the expected time by some specified amount (e.g. 50%). LONG jobs should not be killed.
  • Make the internal job queue look like a job agent so that we don't have to provide special handling and options for it every time.
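The queue and kill ideas above could look roughly like the following sketch (Python for brevity). The queue names, the `queue_for` and `should_kill` functions, and the way the expected time is expressed in seconds are all hypothetical; nothing here reflects an existing job agent setting.

```python
# Hypothetical mapping from plug-in group to queue name.
QUEUES = {
    "base1": "base1-queue",
    "import": "import-queue",
}
DEFAULT_QUEUE = "default-queue"


def queue_for(plugin_group):
    """Pick the queue configured for a plug-in group, else the default."""
    return QUEUES.get(plugin_group, DEFAULT_QUEUE)


def should_kill(estimate, expected_seconds, elapsed_seconds, tolerance=0.5):
    """Decide whether a job agent should kill a running job.

    estimate: the job's time estimate name, e.g. "SHORT" or "LONG".
    tolerance: allowed overrun as a fraction of the expected time (0.5 = 50%).
    LONG jobs are never killed.
    """
    if estimate == "LONG":
        return False
    return elapsed_seconds > expected_seconds * (1 + tolerance)
```

For a job expected to take 60 seconds with a 50% tolerance, the limit is 90 seconds, so the job is killed only once it has run longer than that.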