PDF Functions

get_pdf_fonts

get_pdf_fonts(ibocr)

Gets the PDF fonts associated with the provided input.



NOTE: The flavour of this function that takes INPUT_IBOCR will be deprecated

after September 30th, 2019. Please use INPUT_IBOCR_RECORD instead.



Args:

    ibocr (Union[IBOCRRecordDict, IBOCRRecord]): Either:

      - A dictionary with info about one IBOCR record

      - The IBOCRRecord itself



Returns:

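    A list of the PDF fonts used across the entire document. As in the examples

    below, each entry is a dictionary with 'name', 'type', and 'encoding' keys.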



Examples:

    get_pdf_fonts(INPUT_IBOCR) -> [{'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'}]

    get_pdf_fonts(INPUT_IBOCR_RECORD) -> [{'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'}]
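
The returned list can be post-processed with ordinary Python. The following is a minimal sketch, not part of the documented API: the helper names unique_font_names and count_by_type are hypothetical, and the sample list is hand-written to mirror the shape shown in the examples above (the 'Helvetica' entry is invented for illustration). In a real formula the list would come from get_pdf_fonts(INPUT_IBOCR_RECORD).

```python
from collections import Counter

# Hand-written sample shaped like the documented return value of get_pdf_fonts.
# The 'Helvetica' entry is a hypothetical addition for illustration only.
fonts = [
    {'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'},
    {'name': 'Helvetica', 'type': 'Type1', 'encoding': 'PDFEncoding'},
    {'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'},
]


def unique_font_names(font_list):
    """Return the distinct font names in sorted order (hypothetical helper)."""
    return sorted({font['name'] for font in font_list})


def count_by_type(font_list):
    """Count how many entries exist per font type (hypothetical helper)."""
    return dict(Counter(font['type'] for font in font_list))


print(unique_font_names(fonts))
print(count_by_type(fonts))
```

Deduplicating by name is one possible design choice; it assumes that repeated entries for the same font are not meaningful to the caller.
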

get_pdf_metadata

get_pdf_metadata(ibocr, field_name)

Gets the PDF metadata associated with the provided input.



NOTE: The flavour of this function that takes INPUT_IBOCR will be deprecated

after September 30th, 2019. Please use INPUT_IBOCR_RECORD instead.



Args:

    ibocr (Union[IBOCRRecordDict, IBOCRRecord]): Either:

      - A dictionary with info about one IBOCR record

      - The IBOCRRecord itself

    field_name (string): PDF metadata field name to retrieve. Valid field names

        are: title, author, subject, keywords_str, creator, producer,

        creation_timestamp, modification_timestamp, trapped_str.

        Timestamps are given in seconds since the Unix epoch. See

        PDDocumentInformation for a description of each field.



Returns:

    The PDF metadata value for the specified field.



Examples:

    get_pdf_metadata(INPUT_IBOCR, 'title') -> "title of the PDF"

    get_pdf_metadata(INPUT_IBOCR_RECORD, 'title') -> "title of the PDF"
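
Because the timestamp fields are given in seconds since epoch, a caller will often want to convert them to a readable date. The sketch below is illustrative only: is_valid_field and epoch_seconds_to_iso are hypothetical helpers, the sample timestamp is invented, and the code assumes the value is numeric (or a numeric string) and should be interpreted as UTC. In a real formula the value would come from get_pdf_metadata(INPUT_IBOCR_RECORD, 'creation_timestamp').

```python
from datetime import datetime, timezone

# Field names listed as valid in the documentation above.
VALID_FIELDS = {
    'title', 'author', 'subject', 'keywords_str', 'creator', 'producer',
    'creation_timestamp', 'modification_timestamp', 'trapped_str',
}


def is_valid_field(field_name):
    """Check a field name against the documented list (hypothetical helper)."""
    return field_name in VALID_FIELDS


def epoch_seconds_to_iso(seconds):
    """Convert seconds since epoch to an ISO 8601 string in UTC.

    Assumes the value is numeric or a numeric string; the actual return
    type of get_pdf_metadata for timestamp fields is not stated above.
    """
    return datetime.fromtimestamp(float(seconds), tz=timezone.utc).isoformat()


# Invented sample value standing in for a 'creation_timestamp' result.
creation_timestamp = 1569801600
print(epoch_seconds_to_iso(creation_timestamp))  # 2019-09-30T00:00:00+00:00
```

Checking the field name up front is optional; it simply catches typos such as 'keywords' (the documented name is 'keywords_str') before the lookup is made.
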