Apply Classifier

The apply classifier step identifies the type, or class, of each record using a classification model.

For common document types, you might be able to use an existing Marketplace model for classification. Otherwise, you can train classification models in ML Studio. Classification models are connected as modules to the apply classifier step in a flow.

Split PDFs

The Split PDFs option, if enabled, splits and groups pages in a PDF or TIFF file into separate records according to their class. Split documents are assigned file names that indicate the original document, the assigned class, and the range of pages included.

For example, if pages 1–5 of a 10-page PDF file named input.pdf are classified as class_name_1 and pages 6–10 are class_name_2, then:

  • labeled_outputs/class_name_1/input.class_name_1-1-5.pdf contains pages 1–5 of input.pdf

  • labeled_outputs/class_name_2/input.class_name_2-6-10.pdf contains pages 6–10 of input.pdf
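
The naming pattern in the example above can be sketched in Python. This is only an illustration inferred from that example, not the product's implementation: the function names group_pages and split_output_path are hypothetical, and the sketch assumes that only consecutive pages with the same class are grouped into one record.

```python
from itertools import groupby
from pathlib import PurePosixPath


def group_pages(page_classes):
    """Group consecutive pages that share a class into (class, first, last) ranges.

    page_classes is a list of class names, where index 0 is page 1.
    Assumption: only contiguous runs are grouped together.
    """
    ranges = []
    first_page = 1
    for class_name, run in groupby(page_classes):
        run_length = len(list(run))
        ranges.append((class_name, first_page, first_page + run_length - 1))
        first_page += run_length
    return ranges


def split_output_path(source_name, class_name, first_page, last_page,
                      root="labeled_outputs"):
    """Build an output path that follows the pattern shown in the example above."""
    source = PurePosixPath(source_name)
    # Pattern: <original stem>.<class>-<first page>-<last page><original extension>
    file_name = f"{source.stem}.{class_name}-{first_page}-{last_page}{source.suffix}"
    return str(PurePosixPath(root) / class_name / file_name)


# Reproduce the 10-page example: pages 1-5 are class_name_1, pages 6-10 are class_name_2.
page_classes = ["class_name_1"] * 5 + ["class_name_2"] * 5
for class_name, first_page, last_page in group_pages(page_classes):
    print(split_output_path("input.pdf", class_name, first_page, last_page))
```

Run as written, the loop prints the two paths listed in the example above.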