Automation Metrics API

The Automation Metrics API lets you query extraction-performance metrics, based on validation success and human review modifications, for any deployed solution over the last 90 days.

To authorize your request, the Automation Metrics API requires an Authorization header with the value Bearer XYZ, where XYZ is your access token. See API authorization.

In this document, URL_BASE refers to the root URL of your Instabase instance, such as https://www.instabase.com.

import json
import requests

url_base = "https://www.instabase.com"
automation_metrics_api_url = url_base + '/api/v2/automation-metrics'

Query for Metrics

Method Syntax
POST URL_BASE/api/v2/automation-metrics

Description

Query for automation metrics over a specified time range, aggregated in one or more formats.

Request body

The request body is a JSON object that specifies which metrics to query and how to aggregate them in the API response.

Parameters are required unless marked as optional.

| Name | Type | Description | Values |
| --- | --- | --- | --- |
| solution_name | string | The name of the deployed solution to query metrics from. | A valid name of an existing deployed solution |
| start_time | int (epoch-milliseconds) | Beginning of the lookback window for the query range. | An epoch within the last 3 months |
| end_time | int (epoch-milliseconds) | End of the lookback window for the query range. | An epoch within the last 3 months and later than start_time |
| aggregations | list[dict] | List of aggregation actions. | Valid aggregation dictionaries |
| username | string | Optional. Username of the human reviewer. | Username of an existing Instabase user |
| job-id | string | Optional. A unique identifier for a specific flow job. | Valid flow job ID |

Aggregation types

Each aggregation type is associated with specific values in the aggregation dictionary sent through the aggregations parameter. Supported aggregation types are:

  • TIME-SERIES
  • CUSTOM-METRIC
  • SUM
Aggregation dictionary format
{
  "name": "", // str
  "type": "", // str
  "options": {} // Dict[str, Any]
}
| Key | Type | Description |
| --- | --- | --- |
| name | string | A custom user-defined name |
| type | string | One of TIME-SERIES, CUSTOM-METRIC, SUM |
| options | dict | Required values depend on the aggregation type |

Aggregation type: Time Series

A time series aggregation request returns bucketed results for the given sub-aggregations. The results are divided into evenly sized buckets across the time interval, and each requested sub-aggregation runs inside each bucket.

Request
{
  "name": "<custom name>",
  "type": "TIME-SERIES",
  "options": {
    "buckets": 1, // int
    "timezone": "", // str (optional)
    "aggregations": [] // List[Dict]
  }
}
| Key | Type | Description |
| --- | --- | --- |
| buckets | int | Number of buckets that the queried time series data is divided into |
| aggregations | list[dict] | A list of sub-aggregations to perform on the data in each bucket. At least one sub-aggregation is required. |
| timezone | string | Optional. Timezone used to divide bucket intervals. Defaults to UTC, so for a query with 30 buckets over 30 days of data, each bucket is 1 day and intervals start at midnight UTC. |
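As an illustration of how bucketing divides the query range, the sketch below computes evenly sized bucket start times. This only assumes even division of the interval; exact boundary rounding is server-side behavior.

```python
from datetime import datetime, timezone

def bucket_starts(start_time_ms, end_time_ms, buckets):
    """Divide [start_time_ms, end_time_ms) into evenly sized buckets and
    return each bucket's start as an epoch-milliseconds timestamp."""
    width = (end_time_ms - start_time_ms) // buckets
    return [start_time_ms + i * width for i in range(buckets)]

# 30 buckets over 30 days starting at midnight UTC -> one bucket per day.
start = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
end = int(datetime(2024, 1, 31, tzinfo=timezone.utc).timestamp() * 1000)
starts = bucket_starts(start, end, 30)
```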
Response

Returns a list of buckets, each of them containing a timestamp and all requested sub-aggregations.

[
  {
    "timestamp": int (epoch milliseconds),
    <sub aggregation name>: <sub aggregation value>,
    <sub aggregation name>: <sub aggregation value>
  }
]
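Each bucket's timestamp is in epoch milliseconds. A quick way to make it human-readable, using only the standard library:

```python
from datetime import datetime, timezone

def bucket_time(timestamp_ms):
    """Convert a bucket timestamp (epoch milliseconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

print(bucket_time(1704067200000))  # 2024-01-01 00:00:00+00:00
```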

Aggregation type: Custom Metric

A custom metric aggregation returns a predefined response format for a specific metric type.

Request
{
  "name": "<custom name>",
  "type": "CUSTOM-METRIC",
  "options": {
    "metric_name": "" // str
  }
}
| Key | Type | Description |
| --- | --- | --- |
| metric_name | string | Name of the custom metric. Supported options are FIELD-LEVEL-ACCUMULATE and CLASS-LEVEL-ACCUMULATE |
Response

Each custom metric returns its own response format.

CLASS-LEVEL-ACCUMULATE

[
  {
    "version": str,
    "class_name": str,
    "pages_processed": int,
    "class_counters": {
      // Each integer records the number of records with classification
      // results in the given validation state. Counts include all records
      // originally classified by the flow as this class, so these results
      // can be used to determine the reclassification rate.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    },
    "field_counters": {
      // Each integer records the number of fields with extraction results
      // in the given validation state. These results can be used to
      // determine the class-level automation rate.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    }
  }
]
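The field_counters above can be used to derive a class-level automation rate. A sketch of one plausible definition — treating fields that needed no human modification as automated — is below; this definition is an assumption, and yours may differ:

```python
def automation_rate(field_counters):
    """Fraction of fields whose extraction needed no human modification.
    Assumption: valid_unmodified and no_validation_unmodified count as automated."""
    automated = (field_counters["valid_unmodified"]
                 + field_counters["no_validation_unmodified"])
    total = sum(field_counters.values())
    return automated / total if total else 0.0

counters = {
    "invalid_modified": 2, "invalid_unmodified": 1, "no_extraction": 1,
    "valid_modified": 3, "no_validation_unmodified": 5,
    "valid_unmodified": 8, "no_validation_modified": 0,
}
print(automation_rate(counters))  # 0.65
```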

FIELD-LEVEL-ACCUMULATE

[
  {
    "version": str,
    "class_name": str,
    "field_name": str,
    "field_counters": {
      // Each integer records the number of occurrences of the field in
      // the given validation state. These results can be used to build a
      // field-level confusion matrix.
      "invalid_modified": int,
      "invalid_unmodified": int,
      "no_extraction": int,
      "valid_modified": int,
      "no_validation_unmodified": int,
      "valid_unmodified": int,
      "no_validation_modified": int
    }
  }
]
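One way to arrange these counters into a field-level confusion matrix is to cross validation outcome with whether a reviewer modified the field. This 2x2 pairing is an assumption about how you want to read the states (it ignores the no_extraction and no_validation states):

```python
def field_confusion_matrix(field_counters):
    """Group per-field counters into a 2x2 table:
    rows = validation outcome, columns = whether a reviewer modified the field."""
    return {
        "valid":   {"unmodified": field_counters["valid_unmodified"],
                    "modified":   field_counters["valid_modified"]},
        "invalid": {"unmodified": field_counters["invalid_unmodified"],
                    "modified":   field_counters["invalid_modified"]},
    }

counts = {
    "invalid_modified": 2, "invalid_unmodified": 1, "no_extraction": 1,
    "valid_modified": 3, "no_validation_unmodified": 5,
    "valid_unmodified": 8, "no_validation_modified": 0,
}
matrix = field_confusion_matrix(counts)
```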

Aggregation type: Sum

A sum aggregation returns the integer sum of the requested metric.

Request
{
  "name": "<custom name>",
  "type": "SUM",
  "options": {
    "metric_name": "" // str
  }
}
Key Type Description
metric_name string Metric to sum over. Supported options are listed below.

Supported Metric Names

  • ERRORED-RECORDS
  • MODIFIED-RECORDS
  • PROCESSED-RECORDS
  • REMAPPED-RECORDS
  • STP-RECORDS
  • RECLASSIFIED-RECORDS
  • FAILED-VALIDATION-RECORDS
  • PROCESSED-PAGES
  • PROCESSED-DOCUMENTS
  • FAILED-VALIDATION-DOCUMENTS
  • STP-DOCUMENTS
  • ERRORED-DOCUMENTS
Response

Returns an integer sum.
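A minimal sketch of building a SUM-only request body locally (the solution name and times are placeholders; you would send the serialized body with requests.post as shown in the Examples section):

```python
import json

def sum_aggregation(name, metric_name):
    """Build one SUM aggregation dictionary."""
    return {"name": name, "type": "SUM", "options": {"metric_name": metric_name}}

body = json.dumps({
    "solution_name": "My Example Solution",  # placeholder
    "start_time": 12345,                     # placeholder epoch ms
    "end_time": 67890,                       # placeholder epoch ms
    "aggregations": [sum_aggregation("total_stp_records", "STP-RECORDS")],
})
```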

Response status

| Status | Meaning |
| --- | --- |
| 200 OK | Metrics were successfully queried and aggregated |
| 400 Bad Request | Request body had bad syntax or invalid parameter values |
| 401 Unauthorized | The authenticated user does not have permission to access the Deployed Solutions app |
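A small client-side sketch mapping the documented statuses to a diagnosis (the wording of each diagnosis is our own, not part of the API):

```python
def check_status(status_code):
    """Map the documented Automation Metrics API statuses to a short diagnosis."""
    if status_code == 200:
        return "ok"
    if status_code == 400:
        return "bad request: check the request body syntax and parameter values"
    if status_code == 401:
        return "unauthorized: check your token and Deployed Solutions permissions"
    return "unexpected status: {0}".format(status_code)
```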

Examples

Request

# token holds your access token (see API authorization).
headers = {
  'Authorization': 'Bearer {0}'.format(token)
}
data = json.dumps({
  'solution_name': 'My Example Solution',
  'start_time': 12345,
  'end_time': 67890,
  'aggregations': [
    {
      'name': 'time_series_data',
      'type': 'TIME-SERIES',
      'options': {
        'buckets': 2,
        'aggregations': [
          {
            'name': 'class_accumulate_results',
            'type': 'CUSTOM-METRIC',
            'options': {
              'metric_name': 'CLASS-LEVEL-ACCUMULATE'
            }
          },
          {
            'name': 'field_accumulate_results',
            'type': 'CUSTOM-METRIC',
            'options': {
              'metric_name': 'FIELD-LEVEL-ACCUMULATE'
            }
          },
          {
            'name': 'total_errored_records',
            'type': 'SUM',
            'options': {
              'metric_name': 'ERRORED-RECORDS'
            }
          }
        ]
      }
    },
    {
      'name': 'total_processed_records',
      'type': 'SUM',
      'options': {
        'metric_name': 'PROCESSED-RECORDS'
      }
    }
  ]
})
resp = requests.post(automation_metrics_api_url, headers=headers, data=data)

This request queries time series data for three sub-aggregations: class-level accumulate, field-level accumulate, and the number of errored records, each split across two buckets in the returned data. The request also queries the total number of records processed across the full time range.

Response

HTTP STATUS CODE 200

{
  "status": "OK",
  "data": {
    "time_series_data": [
      {
        "timestamp": 12345,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 4
      },
      {
        "timestamp": 34567,
        "class_accumulate_results": ...,
        "field_accumulate_results": ...,
        "total_errored_records": 2
      }
    ],
    "total_processed_records": 100
  }
}
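Once parsed (resp.json() with requests), the aggregation results are keyed by the names you chose in the request. A sketch using a literal copy of the example data above, with the ... placeholders replaced by empty lists:

```python
response_data = {
    "status": "OK",
    "data": {
        "time_series_data": [
            {"timestamp": 12345, "class_accumulate_results": [],
             "field_accumulate_results": [], "total_errored_records": 4},
            {"timestamp": 34567, "class_accumulate_results": [],
             "field_accumulate_results": [], "total_errored_records": 2},
        ],
        "total_processed_records": 100,
    },
}

# Sum the per-bucket SUM aggregation across all time series buckets.
buckets = response_data["data"]["time_series_data"]
total_errored = sum(b["total_errored_records"] for b in buckets)
print(total_errored)  # 6
```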