UDFs in Flow V2

User-defined functions (UDFs) add custom functionality to your Flow.

Adding the Apply UDF step

You can add the Apply UDF step to a Flow.

  1. Open an .ibflow file and click Tools > Add Step > Select Step > Apply UDF.

  2. Set the input and output extensions.

  3. Set the input and output folders.

  4. Add a registered custom function to the Formula field. The Apply UDF step is added as the last step.

Tip: Because the Apply UDF step is always added last, placing it earlier in a Flow takes a workaround: delete steps until you can add the Apply UDF step in the right place, then add the steps that you removed back to your Flow. Be sure to verify the associated input and output folders, scripts directories, and other supporting file structures for the steps that you move.

Pre- and post-run hooks

Custom functions that are defined in the scripts directory and registered under the appropriate custom function name can run before and after a Flow.

  • Pre-Flow UDFs run immediately after the output folder is set up.

  • Post-Flow UDFs run after the entire Run Flow, Run Flows, or Run Metaflow operation completes.

Post-Flow UDFs for Metaflow work only with binary (.ibflowbin) files.

Scripts location

For Run Flow, Run Flows, and Run Metaflow, you must define the pre-Flow and post-Flow UDFs within a scripts directory. The scripts directory must be in the same folder as the Flows. This folder is usually named Workflows/.

To use the Flow root directory and the file-like object during the custom post-hook step, write your UDF to accept the flow_info_json and clients parameters.

Special input variables

The flow_info_json dictionary is of type FlowInfoDict and contains runtime information about the Flow.

FlowInfoDict = TypedDict('FlowInfoDict',
                         {'root_output_folder': Text})

  • root_output_folder is the absolute path to the Run Flow/Flows/Metaflow operation’s output directory.

  • input_folder is the absolute path to the input directory on which the Flow runs.

  • CONFIG is a set of key-value pairs that are dynamically passed at runtime into a Flow.

    • An example runtime config:
      {"key1": "val1", "key2": "val2"}
      
  • clients is an object that contains a property called ibfile.

Note: To enable forward compatibility, update your existing functions to accept arbitrary keyword arguments (**kwargs).
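
The note above can be illustrated with a minimal sketch of a hook that tolerates arguments it does not recognize. The function name tolerant_hook and the extra keyword new_runtime_arg are hypothetical; only flow_info_json, clients, and the root_output_folder key come from this page. Because FlowInfoDict declares only root_output_folder, the sketch assumes that other values might be absent and reads them with .get().

```python
import logging


def tolerant_hook(flow_info_json, clients=None, **kwargs):
    """Hypothetical hook that ignores arguments it does not know about."""
    # root_output_folder is the only key declared in FlowInfoDict.
    root_out = flow_info_json['root_output_folder']
    # Assumption: input_folder may be missing, so read it defensively.
    input_folder = flow_info_json.get('input_folder')
    if kwargs:
        # Log only the names of unexpected arguments, never their values.
        logging.info('Ignoring extra arguments: %s', sorted(kwargs))
    # The return value exists only so the sketch can be checked locally.
    return root_out, input_folder


# An extra, unknown keyword argument does not raise a TypeError.
result = tolerant_hook({'root_output_folder': '/out'}, None, new_runtime_arg=1)
```

Without **kwargs in the signature, the same call would fail with a TypeError as soon as the caller passed an argument that the function does not name.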

Logging in UDF

Use Python’s standard logging library to log messages from an Apply UDF step, a pre-Flow UDF, or a post-Flow UDF. Logs appear in the Flow Dashboard. To see only the logs from UDFs, select the “Show Developer Logs Only” option.

Note: By default, Flow logs have a size limit of 20 MB per job ID. As a good practice, avoid logging binary values (like images), entire IBDOCs, or extraction results that might contain PII. Logs are stored in the file system.

Note: Logging in UDFs was previously done through the LOGGER object from the function context. Although LOGGER is still supported, we recommend using Python’s logging library directly.
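
One way to follow the logging guidance above is to shorten values before they reach the log. This is a minimal sketch, not part of the product: the helper names summarize and log_step and the 200-character cap are hypothetical and chosen only for illustration.

```python
import logging

MAX_LOGGED_CHARS = 200  # hypothetical cap, chosen for illustration


def summarize(value, limit=MAX_LOGGED_CHARS):
    """Return a short, log-safe description of a value."""
    if isinstance(value, (bytes, bytearray)):
        # Never log raw binary content such as images; log its size only.
        return '<{} bytes>'.format(len(value))
    text = str(value)
    if len(text) > limit:
        # Keep the head of long values and note the full length.
        return text[:limit] + '... ({} chars total)'.format(len(text))
    return text


def log_step(message, value):
    # Uses the standard logging module directly, as recommended above.
    logging.info('%s: %s', message, summarize(value))


log_step('Image payload', b'\x00' * 1024)
```

Truncation limits log size but does not remove PII; values that might contain PII are better left out of log messages entirely.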

Pre-Flow UDF example

Sample pre-run UDF:

import logging

def custom_prep_fn(flow_info_json, clients, **kwargs):
    # Log the runtime information passed to the hook.
    logging.info('Flow info json {}'.format(flow_info_json))
    root_out = flow_info_json['root_output_folder']
    logging.info('Custom UDF started')
    if clients:
        # Write the output folder path to a marker file in that folder.
        clients.ibfile.write_file(root_out + '/custom_output.txt', root_out)
    logging.info('Custom UDF ended')

def register(name_to_fn):
    more_fns = {
        'custom_fn_name': {
            'fn': custom_prep_fn,
            'ex': '',
            'desc': ''
        }
    }
    name_to_fn.update(more_fns)

Replace the custom_fn_name key with one of these function names for the desired Run type:

Run type        Custom function name
Run Flow        flow_custom_prep
Run Flows       multiflow_custom_prep
Run Metaflow    metaflow_custom_prep
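
To try the sample pre-run UDF outside the platform, one option is to pass it a stand-in for clients. The sketch below is an assumption-heavy illustration: InMemoryIbfile and StubClients are hypothetical test doubles, the path /tmp/flow_out is made up, and the real ibfile client may accept other arguments or return different values.

```python
import logging


class InMemoryIbfile:
    """Hypothetical stand-in for clients.ibfile; keeps files in a dict."""

    def __init__(self):
        self.files = {}

    def write_file(self, path, content):
        # Record the write instead of touching a real file system.
        self.files[path] = content


class StubClients:
    """Hypothetical stand-in for the clients object passed to a hook."""

    def __init__(self):
        self.ibfile = InMemoryIbfile()


def custom_prep_fn(flow_info_json, clients, **kwargs):
    # Same logic as the sample pre-run UDF on this page.
    logging.info('Flow info json {}'.format(flow_info_json))
    root_out = flow_info_json['root_output_folder']
    logging.info('Custom UDF started')
    if clients:
        clients.ibfile.write_file(root_out + '/custom_output.txt', root_out)
    logging.info('Custom UDF ended')


# Exercise the hook locally with an illustrative output path.
stub = StubClients()
custom_prep_fn({'root_output_folder': '/tmp/flow_out'}, stub)
print(stub.ibfile.files)
```

A stub like this checks only the hook's own logic, such as which paths it writes. It says nothing about how the real client behaves.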

Post-Flow UDF example

Sample post-run UDF:

import logging

def custom_post_fn(flow_info_json, clients, **kwargs):
    root_out = flow_info_json['root_output_folder']
    logging.info('Root output folder {}'.format(root_out))
    if not clients:
        # Without a file client there is nothing to read or write.
        return
    # Copy the class output folder listing to a new file.
    classes = clients.ibfile.read_file(root_out + '/class_output_folders.json')
    clients.ibfile.write_file(root_out + '/class_output_folders_new.json', classes)
    # Record the root output folder path alongside it.
    clients.ibfile.write_file(root_out + '/root_output_folder.txt', root_out)
    logging.info('Files written successfully')

def register(name_to_fn):
    more_fns = {
        'custom_fn_name': {
            'fn': custom_post_fn,
            'ex': '',
            'desc': ''
        }
    }
    name_to_fn.update(more_fns)

Replace the custom_fn_name key with one of these function names for the desired run type:

Run type        Custom function name
Run Flow        flow_custom_finish
Run Flows       multiflow_custom_finish
Run Metaflow    metaflow_custom_finish

Note: Because the post-Flow and pre-Flow function names are different, you can register both UDF types at the same time so they can be executed for the appropriate run type at the appropriate time.
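
As a sketch of the note above, a single register function could add both hook types for Run Flow. This assumes that both functions live in the same file in the scripts directory and that one register call may add several entries. The hook bodies and the 'desc' strings are placeholders.

```python
import logging


def custom_prep_fn(flow_info_json, clients, **kwargs):
    # Placeholder pre-Flow logic.
    logging.info('Pre-Flow hook started')


def custom_post_fn(flow_info_json, clients, **kwargs):
    # Placeholder post-Flow logic.
    logging.info('Post-Flow hook finished')


def register(name_to_fn):
    more_fns = {
        # Pre-Flow hook name for the Run Flow run type.
        'flow_custom_prep': {
            'fn': custom_prep_fn,
            'ex': '',
            'desc': 'Runs before Run Flow'
        },
        # Post-Flow hook name for the Run Flow run type.
        'flow_custom_finish': {
            'fn': custom_post_fn,
            'ex': '',
            'desc': 'Runs after Run Flow'
        }
    }
    name_to_fn.update(more_fns)


# Local check of the registration mapping, outside the platform.
registry = {}
register(registry)
```

For Run Flows or Run Metaflow, swap in the multiflow_ or metaflow_ names from the tables above.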