JUDI Task

A JUDI task is associated with a parameter database and represents a collection of DoIt tasks, one for each row of the parameter database.

A JUDI task is a Python class that inherits from the class Task. It should define the following class variables.

Essential class variables:
  • inputs: A Python dictionary of the JUDI files that are input to the current task.
  • targets: A Python dictionary of the JUDI files generated by the current task.
  • actions: A list of DoIt actions.
Optional class variables:
  • mask: A list of parameters that are masked from the global parameter database for the current task.
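
The relationship between the global parameter database, mask, and the resulting task instances can be sketched in plain Python. This is only an illustration under stated assumptions, not JUDI's implementation: the helpers global_param_db and task_instances are hypothetical, and the instance label format (Test:X~a) is copied from the example output later in this section.

```python
from itertools import product

def global_param_db(params):
    """Build every combination of parameter values as a list of row dicts.

    `params` maps a parameter name to its list of values (hypothetical helper).
    """
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]

def task_instances(rows, mask):
    """Drop the masked parameters and keep one row per remaining setting."""
    seen = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in mask}
        if kept not in seen:
            seen.append(kept)
    return seen

rows = global_param_db({"W": ["1", "2"], "X": ["a", "b"]})
instances = task_instances(rows, mask=["W"])

for inst in instances:
    # Label format taken from the example output. The "," separator for
    # several unmasked parameters is an assumption and is unused here.
    label = "Test:" + ",".join(f"{k}~{v}" for k, v in inst.items())
    print(label)
```

With W masked, the four rows of the global database collapse to two task instances, one per value of X.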

Parameter substitution in actions

In addition to the forms of actions supported in DoIt, JUDI supports the following form:

  • (func, args): Here func is either a string or a callable, and args is a list of arguments. When func is a string, it can contain {} placeholders, which are replaced in order by the elements of args. When func is a callable, it must take only positional arguments, which are supplied through args. An element of args can be one of the special strings below, which is replaced by a value as shown in the following table:

Argument substitution rules

  arg     case                           substituted value
  '$x'    'x' is an input/target file    Blank-separated list of paths for the instances of JUDI file 'x' applicable to the current JUDI task instance
  '#x'    'x' is an input/target file    Parameter database associated with 'x'
  '#x'    'x' is a parameter             Value of parameter 'x'
  '##'                                   Python dictionary containing all parameters and their values
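
The substitution rules in the table above can be read as the following plain-Python sketch. It does not use or reproduce JUDI's code: the functions substitute and run_action, and the layout of the files and params dictionaries, are assumptions made for illustration, as is the choice to check file names before parameter names for '#x'.

```python
def substitute(arg, files, params):
    """Resolve one element of `args` following the table above.

    `files` maps a JUDI file name to a dict with the hypothetical keys
    'paths' (list of str) and 'pdb' (its parameter database).
    `params` maps parameter names to values for the current task instance.
    """
    if arg == "##":
        return dict(params)                      # all parameters and values
    if isinstance(arg, str) and arg.startswith("$") and arg[1:] in files:
        return " ".join(files[arg[1:]]["paths"])  # blank-separated paths
    if isinstance(arg, str) and arg.startswith("#"):
        name = arg[1:]
        if name in files:                        # assumption: files win ties
            return files[name]["pdb"]
        if name in params:
            return params[name]
    return arg  # assumption: anything else is passed through unchanged

def run_action(func, args, files, params):
    """Apply the (func, args) form: format a string or call a callable."""
    values = [substitute(a, files, params) for a in args]
    if isinstance(func, str):
        return func.format(*values)  # the command string that would be run
    return func(*values)

files = {
    "foo": {"paths": ["a1.txt", "a2.txt"],
            "pdb": [{"W": "1", "path": "a1.txt"}, {"W": "2", "path": "a2.txt"}]},
    "zoo": {"paths": ["./judi_files/X~a/combined.txt"],
            "pdb": [{"path": "./judi_files/X~a/combined.txt"}]},
}
params = {"X": "a"}

cmd = run_action("cat {} > {}", ["$foo", "$zoo"], files, params)
print(cmd)
```

Here the sketch only builds the command string; in a real task the resulting command would be executed by DoIt.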

Some special actions

To help users of JUDI summarize data across parameter settings, two example actions are provided in the module judi.utils.

  • combine_csvs(pdb_big, pdb_small): This function row-binds the CSV files listed in the path column of the file parameter database pdb_big into a single CSV file whose path is given in the path column of the file parameter database pdb_small. The function also adds the extra parameter settings from pdb_big as columns of the consolidated CSV file.
  • combine_pdfs(pdb_big, pdb_small): This function combines the pages from PDF files given in the path column of the file parameter database pdb_big into a single PDF file whose path is given in the path column of the file parameter database pdb_small.
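
The row-binding idea behind combine_csvs can be sketched with the standard library alone. This is not the judi.utils implementation and does not share its signature: row_bind_csvs is a hypothetical helper, pdb_big is stood in for by a list of dicts, every non-path column is treated as an extra parameter, and all input files are assumed to share the same header.

```python
import csv
import os
import tempfile

def row_bind_csvs(big_rows, out_path):
    """Concatenate CSV files and add each file's parameter settings as columns.

    `big_rows` is a list of dicts, each with a 'path' key plus parameter
    columns (a simplified stand-in for the rows of pdb_big).
    """
    combined = []
    for row in big_rows:
        extra = {k: v for k, v in row.items() if k != "path"}
        with open(row["path"], newline="") as fh:
            for record in csv.DictReader(fh):
                combined.append({**record, **extra})
    fieldnames = list(combined[0]) if combined else []
    with open(out_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(combined)
    return combined

# Build two small input files, one per value of the parameter W.
tmp = tempfile.mkdtemp()
big_rows = []
for w, value in [("1", "10"), ("2", "20")]:
    path = os.path.join(tmp, f"a{w}.csv")
    with open(path, "w", newline="") as fh:
        fh.write("score\n" + value + "\n")
    big_rows.append({"W": w, "path": path})

out_path = os.path.join(tmp, "combined.csv")
merged = row_bind_csvs(big_rows, out_path)
print(merged)
```

Each output row keeps its original columns and gains a W column recording which input file it came from.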

Some examples

The following dodo.py snippet creates a global parameter database with two parameters, W and X, and then creates a task whose parameter database masks W. Each task instance, one per value of X, then concatenates the input files for all values of W. The actions class variable demonstrates several of the parameter substitutions.

from judi import add_param, show_param_db, File, Task

# Global parameter database: every combination of W in {1, 2} and X in {a, b}.
add_param("1 2".split(), 'W')
add_param("a b".split(), 'X')
show_param_db()

class Test(Task):
  # Mask W so that there is one task instance per value of X.
  mask = ['W']
  # Input files are not masked: one path per (X, W) pair, e.g. a1.txt.
  inputs = {'foo': File('bar', path=lambda x: ''.join([x['X'], x['W']]) + '.txt')}
  # The target is masked on W: one combined.txt per value of X.
  targets = {'zoo': File('combined.txt', mask=mask)}
  actions = [('echo ">>" foo files: {}', ['$foo']),
             ('echo ">>" foo param db:'),
             (show_param_db, ['#foo']),
             ('echo ">>" zoo files: {}', ['$zoo']),
             ('echo ">>" zoo param db:'),
             (show_param_db, ['#zoo']),
             ('echo ">>" param X: {}', ['#X']),
             ('echo ">>" All parameters:'),
             (lambda x: print(x), ['##']),
             ('cat {} > {}', ['$foo', '$zoo'])]

The output of doit -f dodo.py is shown below:

Global param db:
    W  X
0  1  a
1  1  b
2  2  a
3  2  b
.  Test:X~a
>> foo files: a1.txt a2.txt
>> foo param db:
Param db:
    W name    path
1  1  bar  a1.txt
3  2  bar  a2.txt
>> zoo files: ./judi_files/X~a/combined.txt
>> zoo param db:
Param db:
            name                           path
1  combined.txt  ./judi_files/X~a/combined.txt
>> param X: a
>> All parameters:
X    a
Name: 0, dtype: object
.  Test:X~b
>> foo files: b1.txt b2.txt
>> foo param db:
Param db:
    W name    path
1  1  bar  b1.txt
3  2  bar  b2.txt
>> zoo files: ./judi_files/X~b/combined.txt
>> zoo param db:
Param db:
            name                           path
1  combined.txt  ./judi_files/X~b/combined.txt
>> param X: b
>> All parameters:
X    b
Name: 1, dtype: object