Search API

Use the Search API to query for documents and retrieve document index attributes.

Before you begin

Document Search must be enabled in your Instabase deployment.

For the Search API, api_root defines where to route API requests for your Instabase instance:

import json
import requests

api_root = "https://www.instabase.com/api/v1/search"

See Instabase API authorization and response conventions for authorization and error convention details.

Search Documents

Use this API to search for documents that match the specified query and that the user has permission to access.

Request

Query documents by sending a POST request to api_root/document/query with the request body encoded as JSON:

# token is assumed to hold a valid API token (see the authorization conventions).
headers = {"Authorization": "Bearer {0}".format(token)}
data = json.dumps(
    {
        "global_search": "hello",
        "terms_search_dict": {
          "filename": {
            "value": "foo.txt",
            "type": "exact"
          }
        }
    }
)
resp = requests.post(api_root + "/document/query", headers=headers, data=data).json()

Fields:

  • global_search: Optional. Search for documents that match the specified string across all eligible term attributes. If specified, terms_search_dict is ignored.

  • terms_search_dict: Optional. Search for documents whose attributes match the specified values. Each attribute takes a value and one of four query types:

    • exact: The default type. A document matches only if the specified value exactly matches its term attribute.

    • fuzzy: Matches values within an edit distance of two, such as returning “foo.txt” as a match for “foot.txt”.

    • wildcard: The value can include one or more wildcard characters that match any character.

    • fulltext: Performs a full-text search on the given value. This is an advanced search option and is useful only when querying against complete_file_path.
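
The field descriptions above can be sketched as a small helper that assembles the request body and rejects query types outside the four listed. The helper name build_query_body and its (value, query_type) argument shape are hypothetical and not part of the Search API; only the field names and type names come from this page.

```python
import json

# The four query types listed above.
QUERY_TYPES = {"exact", "fuzzy", "wildcard", "fulltext"}


def build_query_body(terms=None, global_search=None):
    """Build the JSON body for POST api_root/document/query.

    Hypothetical helper: terms maps an attribute key to a
    (value, query_type) pair.
    """
    body = {}
    if global_search is not None:
        body["global_search"] = global_search
    if terms:
        terms_dict = {}
        for attribute, (value, query_type) in terms.items():
            if query_type not in QUERY_TYPES:
                raise ValueError("unknown query type: {0}".format(query_type))
            terms_dict[attribute] = {"value": value, "type": query_type}
        body["terms_search_dict"] = terms_dict
    return json.dumps(body)


# Fuzzy match on filename, using the example value from the list above.
data = build_query_body(terms={"filename": ("foot.txt", "fuzzy")})
```

The resulting string can be passed as the data argument of requests.post in the same way as the request example in this section.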

Response

If successful, the response contains a documents field: a list of dictionaries, one per matched document, with information such as file path, type, and metadata. For example:

{
    "status": "OK",
    "documents": [{
        "name": "hello.txt",
        "path": "search_testing/hello.txt",
        "type": "file",
        "full_path": "user1/my-repo/fs/Instabase Drive/search_testing/hello.txt",
        "metadata": {
            "modified_timestamp": "1625577144",
            "size": "1993",
            "type": "file",
            "source_url": None
        },
        "perms": None,
        "ext": "txt",
        "open_options": [{
            "display_name": "Flow",
            "href": "/apps/flow/edit/user1/my-repo/fs/Instabase Drive/search_testing/hello.txt"
        },
        {
            "display_name": "Text Editor",
            "href": "/apps/text-editor/user1/my-repo/fs/Instabase Drive/files/search_testing/hello.txt"
        }],
        "default_open_href": "/apps/flow/edit/user1/my-repo/fs/Instabase Drive/search_testing/hello.txt"
    }],
    "num_documents": 1
}
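
As a minimal sketch of consuming this response, the helper below collects the full_path of each matched document from an already parsed response. The function name document_paths is hypothetical, and treating any status other than "OK" as a failure is an assumption; see the response conventions for the authoritative error format.

```python
def document_paths(resp):
    """Return the full_path of every document in a parsed query response."""
    # Assumption: any status other than "OK" indicates a failed request.
    if resp.get("status") != "OK":
        raise RuntimeError("search failed: {0}".format(resp))
    return [doc["full_path"] for doc in resp.get("documents", [])]


# Trimmed copy of the example response above.
example = {
    "status": "OK",
    "documents": [{
        "name": "hello.txt",
        "full_path": "user1/my-repo/fs/Instabase Drive/search_testing/hello.txt",
    }],
    "num_documents": 1,
}
paths = document_paths(example)
```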

Get Document Index Attributes

Use this API to retrieve the list of existing document index attributes.

Request

Send a GET request to api_root/document/index:

headers = {"Authorization": "Bearer {0}".format(token)}
resp = requests.get(api_root + "/document/index", headers=headers).json()

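Response

If successful, the response contains an attributes field: a list of dictionaries, one per index attribute, each with a key, name, type, description, and is_custom flag. For example: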

{
    "status": "OK",
    "attributes": [
      {"key": "complete_file_path_tree", "name": "CompleteFilePathTree", "type": "Text", "description": "CompleteFilePath indexed with a path analyzer", "is_custom": False},
      {"key": "repo_name", "name": "RepoName", "type": "Keyword", "description": "RepoName of the document", "is_custom": False},
      {"key": "file_path", "name": "FilePath", "type": "Keyword", "description": "File path of the file relative to the mount point, such as `a/b/c.file.txt`", "is_custom": False},
      {"key": "account_id", "name": "AccountID", "type": "Keyword", "description": "AccountID of the owner of the document", "is_custom": False},
      {"key": "repo_owner", "name": "RepoOwner", "type": "Keyword", "description": "RepoOwner of the document", "is_custom": False},
      {"key": "file_size", "name": "FileSize", "type": "Integer", "description": "The size of the file in bytes", "is_custom": False},
      {"key": "content_hash_xxhash", "name": "ContentHashXXHash", "type": "Keyword", "description": "The content hash of the file using the xxHash algorithm", "is_custom": False},
      {"key": "complete_file_path_reverse_tree", "name": "CompleteFilePathReverseTree", "type": "Text", "description": "CompleteFilePath indexed with a reverse path analyzer", "is_custom": False},
      {"key": "date_created", "name": "DateCreated", "type": "DateTime", "description": "Time the document was created", "is_custom": False},
      {"key": "document_type", "name": "DocumentType", "type": "Keyword", "description": "Type of the document", "is_custom": False},
      {"key": "file_extension", "name": "FileExtension", "type": "Keyword", "description": "File extension, excluding the dot, such as `txt`", "is_custom": False},
      {"key": "mount_point", "name": "MountPoint", "type": "Keyword", "description": "MountPoint of the document", "is_custom": False},
      {"key": "date_modified", "name": "DateModified", "type": "DateTime", "description": "Time the document was last modified", "is_custom": False},
      {"key": "index_time", "name": "IndexTime", "type": "DateTime", "description": "The time the document was indexed (milliseconds since epoch)", "is_custom": False},
      {"key": "complete_file_path", "name": "CompleteFilePath", "type": "Keyword", "description": "Complete file path, including repoOwner/repoName of the file", "is_custom": False},
      {"key": "last_accessed", "name": "LastAccessedTs", "type": "DateTime", "description": "Time the document was last accessed", "is_custom": False},
      {"key": "document_id", "name": "DocumentID", "type": "Keyword", "description": "Unique identifier for this document", "is_custom": False},
      {"key": "filename", "name": "Filename", "type": "Keyword", "description": "Filename, including extension, such as `file.txt`", "is_custom": False},
      {"key": "file_basename", "name": "FileBasename", "type": "Keyword", "description": "The file base name, excluding extension, such as `file`", "is_custom": False}
    ]
}
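
One way to work with this list is to group attribute keys by their index type, for example to see which attributes are indexed as Keyword and which as Text. This is a sketch over an already parsed response; the function name keys_by_type is hypothetical, and the sample data is a trimmed copy of the response above.

```python
from collections import defaultdict


def keys_by_type(resp):
    """Group attribute keys from a parsed index response by their type."""
    grouped = defaultdict(list)
    for attribute in resp.get("attributes", []):
        grouped[attribute["type"]].append(attribute["key"])
    return dict(grouped)


# Trimmed copy of the example response above.
example = {
    "status": "OK",
    "attributes": [
        {"key": "filename", "name": "Filename", "type": "Keyword"},
        {"key": "file_size", "name": "FileSize", "type": "Integer"},
        {"key": "complete_file_path", "name": "CompleteFilePath", "type": "Keyword"},
    ],
}
grouped = keys_by_type(example)
```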