Automation Metrics API

The Automation Metrics API lets you query extraction-performance metrics, based on validation success and human review modifications, for any deployed solution over the last 90 days.

To authorize your request, the Automation Metrics API requires an Authorization header with the value Bearer XYZ, where XYZ is your access token. See API authorization.

In this document, URL_BASE refers to the root URL of your Instabase instance, such as https://www.instabase.com.

import json
import requests

url_base = "https://www.instabase.com"
automation_metrics_api_url = url_base + '/api/v2/automation-metrics'

Query for Metrics

Method Syntax
POST URL_BASE/api/v2/automation-metrics

Description

Query for automation metrics over a specified time range, aggregated in one or more formats.

Request body

The request body is a JSON object that specifies which metrics to query and how to aggregate them in the API response.

Parameters are required unless marked as optional.

| Name | Type | Description | Values |
| --- | --- | --- | --- |
| solution_name | string | The name of the deployed solution to query metrics from. | A valid name of an existing deployed solution |
| start_time | int (epoch-milliseconds) | Beginning of the lookback window for the query range. | An epoch within the last 3 months |
| end_time | int (epoch-milliseconds) | End of the lookback window for the query range. | An epoch within the last 3 months and later than start_time |
| aggregations | list[dict] | List of aggregation actions. | Valid aggregation dictionaries |
| username | string | Optional. Username of the human reviewer. | Username of an existing Instabase user |
| job-id | string | Optional. A unique identifier for a specific flow job. | Valid flow job ID |

Aggregation types

Each aggregation type is associated with specific values in the aggregation dictionary sent through the aggregations parameter. Supported aggregation types are:

  • TIME-SERIES
  • CUSTOM-METRIC
  • SUM
Aggregation dictionary format
{
  "name": "", // str
  "type": "", // str
  "options": {} // Dict[str, Any]
}
| Key | Type | Description |
| --- | --- | --- |
| name | string | A custom user-defined name |
| type | string | One of TIME-SERIES, CUSTOM-METRIC, SUM |
| options | dict | Required values depend on the aggregation type |

Aggregation type: Time Series

A time series aggregation request returns bucketed results for the given sub-aggregations. The results are divided into evenly sized buckets across the time interval, and each requested sub-aggregation runs inside each bucket.

Request
{
  "name": "<custom name>",
  "type": "TIME-SERIES",
  "options": {
    "buckets": 1, // int
    "timezone": "", // str (optional)
    "aggregations": [] // List[Dict]
  }
}
| Key | Type | Description |
| --- | --- | --- |
| buckets | int | Number of buckets that the queried time series data is divided into |
| aggregations | list[dict] | A list of sub-aggregations to perform on the data in each bucket. At least one sub-aggregation is required. |
| timezone | string | Optional. Timezone used to divide bucket intervals. Defaults to UTC, so for a query with 30 buckets over 30 days of data, each bucket is 1 day and intervals start at midnight UTC. |
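As an illustration of how bucketing divides the query range, the sketch below computes evenly sized bucket start times. This only assumes even division of the interval; exact boundary rounding is server-side behavior.

```python
from datetime import datetime, timezone

def bucket_starts(start_time_ms, end_time_ms, buckets):
    """Divide [start_time_ms, end_time_ms) into evenly sized buckets and
    return each bucket's start as an epoch-milliseconds timestamp."""
    width = (end_time_ms - start_time_ms) // buckets
    return [start_time_ms + i * width for i in range(buckets)]

# 30 buckets over 30 days starting at midnight UTC -> one bucket per day.
start = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
end = int(datetime(2024, 1, 31, tzinfo=timezone.utc).timestamp() * 1000)
starts = bucket_starts(start, end, 30)
```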
Response

Returns a list of buckets, each of them containing a timestamp and all requested sub-aggregations.

[
  {
    "timestamp": int (epoch milliseconds),
    <sub aggregation name>: <sub aggregation value>,
    <sub aggregation name>: <sub aggregation value>
  }
]
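Each bucket's timestamp is in epoch milliseconds. A quick way to make it human-readable, using only the standard library:

```python
from datetime import datetime, timezone

def bucket_time(timestamp_ms):
    """Convert a bucket timestamp (epoch milliseconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

print(bucket_time(1704067200000))  # 2024-01-01 00:00:00+00:00
```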

Aggregation type: Custom Metric

A custom metric aggregation returns a predefined response format for a specific metric type.

Request
{
  "name": "<custom name>",
  "type": "CUSTOM-METRIC",
  "options": {
    "metric_name": "" // str
  }
}
| Key | Type | Description |
| --- | --- | --- |
| metric_name | string | Name of the custom metric. Supported options are FIELD-LEVEL-ACCUMULATE and CLASS-LEVEL-ACCUMULATE |
Response

Each custom metric returns its own response format.

CLASS-LEVEL-ACCUMULATE

[
  {
    "version": str,
    "class_name": str,
    "pages_processed": int,
    "class_counters": {
      // Each integer records the number of records with classification
      // results in the given validation state. Counts include all records
      // originally classified by the flow as this class, so these results
      // can be used to determine the reclassification rate.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    },
    "field_counters": {
      // Each integer records the number of fields with extraction results
      // in the given validation state. These results can be used to
      // determine the class-level automation rate.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    }
  }
]
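The field_counters above can be used to derive a class-level automation rate. A sketch of one plausible definition — treating fields that needed no human modification as automated — is below; this definition is an assumption, and yours may differ:

```python
def automation_rate(field_counters):
    """Fraction of fields whose extraction needed no human modification.
    Assumption: valid_unmodified and no_validation_unmodified count as automated."""
    automated = (field_counters["valid_unmodified"]
                 + field_counters["no_validation_unmodified"])
    total = sum(field_counters.values())
    return automated / total if total else 0.0

counters = {
    "invalid_modified": 2, "invalid_unmodified": 1, "no_extraction": 1,
    "valid_modified": 3, "no_validation_unmodified": 5,
    "valid_unmodified": 8, "no_validation_modified": 0,
}
print(automation_rate(counters))  # 0.65
```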

FIELD-LEVEL-ACCUMULATE

[
  {
    "version": str,
    "class_name": str,
    "field_name": str,
    "field_counters": {
      // Each integer records the number of occurrences of the field in
      // the given validation state. These results can be used to build a
      // field-level confusion matrix.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    }
  }
]
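One way to arrange these counters into a field-level confusion matrix is to cross validation outcome with whether a reviewer modified the field. This 2x2 pairing is an assumption about how you want to read the states (it ignores the no_extraction and no_validation states):

```python
def field_confusion_matrix(field_counters):
    """Group per-field counters into a 2x2 table:
    rows = validation outcome, columns = whether a reviewer modified the field."""
    return {
        "valid":   {"unmodified": field_counters["valid_unmodified"],
                    "modified":   field_counters["valid_modified"]},
        "invalid": {"unmodified": field_counters["invalid_unmodified"],
                    "modified":   field_counters["invalid_modified"]},
    }

counts = {
    "invalid_modified": 2, "invalid_unmodified": 1, "no_extraction": 1,
    "valid_modified": 3, "no_validation_unmodified": 5,
    "valid_unmodified": 8, "no_validation_modified": 0,
}
matrix = field_confusion_matrix(counts)
```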

Aggregation type: Sum

A sum aggregation returns the integer sum of the requested metric.

Request
{
  "name": "<custom name>",
  "type": "SUM",
  "options": {
    "metric_name": "" // str
  }
}
Key Type Description
metric_name string Metric to sum over. Supported options are listed below.

Supported Metric Names

  • ERRORED-RECORDS
  • MODIFIED-RECORDS
  • PROCESSED-RECORDS
  • REMAPPED-RECORDS
  • STP-RECORDS
  • RECLASSIFIED-RECORDS
  • FAILED-VALIDATION-RECORDS
  • PROCESSED-PAGES
  • PROCESSED-DOCUMENTS
  • FAILED-VALIDATION-DOCUMENTS
  • STP-DOCUMENTS
  • ERRORED-DOCUMENTS
Response

Returns an integer sum.
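A minimal sketch of building a SUM-only request body locally (the solution name and times are placeholders; you would send the serialized body with requests.post as shown in the Examples section):

```python
import json

def sum_aggregation(name, metric_name):
    """Build one SUM aggregation dictionary."""
    return {"name": name, "type": "SUM", "options": {"metric_name": metric_name}}

body = json.dumps({
    "solution_name": "My Example Solution",  # placeholder
    "start_time": 12345,                     # placeholder epoch ms
    "end_time": 67890,                       # placeholder epoch ms
    "aggregations": [sum_aggregation("total_stp_records", "STP-RECORDS")],
})
```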

Response status

| Status | Meaning |
| --- | --- |
| 200 OK | Metrics were successfully queried and aggregated |
| 400 Bad Request | Request body had bad syntax or invalid parameter values |
| 401 Unauthorized | The authenticated user does not have permission to access the Deployed Solutions app |
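A small client-side sketch mapping the documented statuses to a diagnosis (the wording of each diagnosis is our own, not part of the API):

```python
def check_status(status_code):
    """Map the documented Automation Metrics API statuses to a short diagnosis."""
    if status_code == 200:
        return "ok"
    if status_code == 400:
        return "bad request: check the request body syntax and parameter values"
    if status_code == 401:
        return "unauthorized: check your token and Deployed Solutions permissions"
    return "unexpected status: {0}".format(status_code)
```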

Examples

Request

# token holds your access token (see API authorization).
headers = {
  'Authorization': 'Bearer {0}'.format(token)
}
data = json.dumps({
  'solution_name': 'My Example Solution',
  'start_time': 12345,
  'end_time': 67890,
  'aggregations': [
    {
      'name': 'time_series_data',
      'type': 'TIME-SERIES',
      'options': {
        'buckets': 2,
        'aggregations': [
          {
            'name': 'class_accumulate_results',
            'type': 'CUSTOM-METRIC',
            'options': {
              'metric_name': 'CLASS-LEVEL-ACCUMULATE'
            }
          },
          {
            'name': 'field_accumulate_results',
            'type': 'CUSTOM-METRIC',
            'options': {
              'metric_name': 'FIELD-LEVEL-ACCUMULATE'
            }
          },
          {
            'name': 'total_errored_records',
            'type': 'SUM',
            'options': {
              'metric_name': 'ERRORED-RECORDS'
            }
          }
        ]
      }
    },
    {
      'name': 'total_processed_records',
      'type': 'SUM',
      'options': {
        'metric_name': 'PROCESSED-RECORDS'
      }
    }
  ]
})
resp = requests.post(automation_metrics_api_url, headers=headers, data=data)

This request queries time series data for three sub-aggregations: class-level accumulate, field-level accumulate, and the number of errored records, each split across two buckets in the returned data. The request also queries the total number of records processed across the full time range.

Response

HTTP STATUS CODE 200

{
  "status": "OK",
  "data": {
    "time_series_data": [
      {
        "timestamp": 12345,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 4
      },
      {
        "timestamp": 34567,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 2
      }
    ],
    "total_processed_records": 100
  }
}
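Once parsed (resp.json() with requests), the aggregation results are keyed by the names you chose in the request. A sketch using a literal copy of the example data above, with the ... placeholders replaced by empty lists:

```python
response_data = {
    "status": "OK",
    "data": {
        "time_series_data": [
            {"timestamp": 12345, "class_accumulate_results": [],
             "field_accumulate_results": [], "total_errored_records": 4},
            {"timestamp": 34567, "class_accumulate_results": [],
             "field_accumulate_results": [], "total_errored_records": 2},
        ],
        "total_processed_records": 100,
    },
}

# Sum the per-bucket SUM aggregation across all time series buckets.
buckets = response_data["data"]["time_series_data"]
total_errored = sum(b["total_errored_records"] for b in buckets)
print(total_errored)  # 6
```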