Reader

Reader is used to digitize documents for annotation and flow processing. You can use the Reader app independently to visualize the impact of digitization settings on your documents.

Reader digitizes the contents of a specified folder and produces a sibling project directory that includes:

  • <project_name>.ibreader file containing all the configuration and settings for your project.

  • read_result directory where all the output .ibdoc files are stored.
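
For example, digitizing an input folder of invoices might produce a layout like the following (all folder and file names here are illustrative):

```
invoices/                      <- input folder
invoices_project/              <- sibling project directory
├── invoices_project.ibreader  <- project configuration and settings
└── read_result/
    ├── invoice_01.ibdoc
    └── invoice_02.ibdoc
```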

For details about supported file types, see the Reader support reference.

Working in the Reader interface

The Reader interface includes a documents sidebar and a split view document preview that displays the document image on the left and corresponding text on the right.

The toolbar contains display options, as well as buttons for accessing configuration settings and for running Reader. Lens mode lets you select text output by whole word or by line. If visual entities are enabled for the project, a right sidebar lets you search for and navigate between entities.

Reader configuration options

To access Reader configuration options, click the Edit Reader Config icon in the split view panel.

Configuration settings include these categories:

  • Profile – Specifies input and output directories, and provides the option to reset configuration to defaults.

  • Read – Specifies digitization settings, broken down by General, Pre-Processing, and OCR options.

    Tip

    For details about digitization settings, see the parameter reference for the process files step in Flow, which provides similar settings.

  • Entity – Specifies options for enabling and configuring visual entities, including checkboxes, signatures, barcodes and QR codes, tables, and custom entities.

When you change a setting and click Save and Run, the updated configuration is saved to the .ibreader file and your files are redigitized with the new settings.

Creating and using custom entity models

Custom entities let you train Instabase to recognize unique visual objects in your documents.

To add custom entities to a flow, create and publish a custom entity model, enable it in Reader, then specify the Reader module in the process files step of your flow.

  1. In ML Studio, create a model.

    • The model request includes an input_document field, which contains the page image bytes of the document.

    • Ensure that the model creates detected entities as a List[DetectedEntity] and adds them to the DetectionResult, as shown in the example at the end of this section.

  2. Publish the model to the Marketplace.

  3. In Reader, with a new or existing module open, click Edit Reader Config.

  4. Select Entity > Custom Entity, then click Add New Entity.

  5. Enter the name of your published custom entity model and click Add.

    Modify the version number if needed.

  6. Click Save and Run.

    The entity model is saved to the Reader module.

    Tip

    Note the path to your Reader module, shown in the breadcrumbs below the header. You select this module in Flow in the next step.

  7. In Flow, in a process files step, set Use Reader Module to True, then select the Reader module that you created in the previous steps.

    Alternatively, in the OCR Config advanced view, you can set the entity_models parameter to your published custom entity model and version, for example:

     ```json
     "entity_models": [
       {
         "model_name": "selection_box_model",
         "model_version": "0.0.3",
         "params": {}
       }
     ]
     ```
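
    In this example, model_name matches the name under which the custom entity model was published to the Marketplace, and model_version pins the published version to use.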
    

Example custom entity model

Here is an example custom entity model. The load_weights and preprocess_image functions are placeholders for your own model-loading and image pre-processing logic.

```python
import io

from PIL import Image

from instabase.model_service.model_cache import InvalidModelRequest, Model
from instabase.protos.model_service import model_service_pb2
from instabase.protos.doc_types import core_types_pb2


class CustomModel(Model):
  """Defines a model that can be called."""

  def load(self) -> None:
    """Loads the model - for example, read a file, create a neural network
    object, and so on. Important: All loading must occur here. Use the
    `cleanup()` method to remove from memory everything that is loaded here.
    """
    # `load_weights` is a placeholder for your own model-loading logic.
    self.model = load_weights('path/to/weights.pt')

  def cleanup(self) -> None:
    """Cleans up any objects from memory related to this model, such as
    objects loaded inside the `load()` method.
    """
    self.model = None

  def validate_request(self,
                       request: model_service_pb2.RunModelRequest) -> None:
    """Validates that the request contains all the necessary values."""
    if not request.HasField('input_document'):
      raise InvalidModelRequest(
          'input_document not provided within RunModelRequest')
    if not request.input_document.HasField('content'):
      raise InvalidModelRequest(
          'content field not provided within input_document of RunModelRequest'
      )

  def run(self, request: model_service_pb2.RunModelRequest) -> model_service_pb2.ModelResult:
    """This is the actual logic that defines how this model is run."""
    # Decode the page image bytes from the request.
    bytes_io = io.BytesIO(request.input_document.content)
    pil_image = Image.open(bytes_io)
    # `preprocess_image` is a placeholder for your own pre-processing logic.
    processed_image = preprocess_image(pil_image)

    # Run the model. Each prediction is assumed to be a tuple of
    # (label, score, top_x, top_y, bottom_x, bottom_y).
    predictions = self.model(processed_image)

    # Create a DetectedEntity for each prediction.
    found_entities = [
      model_service_pb2.DetectedEntity(
        content='[{}]'.format(pred[0]),
        label=pred[0],
        bounding_box=core_types_pb2.Rectangle(
          top_x=pred[2],
          top_y=pred[3],
          bottom_x=pred[4],
          bottom_y=pred[5]
        ),
        score=pred[1]
      ) for pred in predictions]

    # Finally, return the result of the model.
    return model_service_pb2.ModelResult(
      detection_result=model_service_pb2.DetectionResult(
        entities=found_entities
      )
    )
```
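
After you publish this model and enable it in Reader, the entities it detects are available for search and navigation in the right sidebar, as described in Working in the Reader interface.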