Recognition result specification (XML)
RecognitionResult
Root element of the recognition result
property | type | written as | required | description |
---|---|---|---|---|
resultSchemaVersion | string | attribute | | Recognition result schema version in major.minor.patch format, where major, minor, and patch are non-negative integers. |
dataFieldResults | DataFieldResult | multiple elements | | List of data field results. |
RecognitionResult xml example
<?xml version="1.0" encoding="UTF-8"?>
<results resultSchemaVersion="1.0.0">
  <dataFieldResult ...>
    ...
  </dataFieldResult>
  ...
  <dataFieldResult ...>
    ...
  </dataFieldResult>
</results>
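A consumer will usually want to check the schema version before reading anything else. The Python sketch below is one way to do that, assuming the standard-library `xml.etree.ElementTree` parser; the helper name `parse_schema_version` and the sample document are illustrative and not part of the specification.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative document; only the root attribute matters here.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<results resultSchemaVersion="1.0.0">
  <dataFieldResult name="M1" dataType="root"/>
</results>"""

# major.minor.patch, each part a non-negative integer.
VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)", re.ASCII)


def parse_schema_version(xml_bytes):
    """Return (major, minor, patch) read from the root <results> element."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "results":
        raise ValueError(f"unexpected root element: {root.tag!r}")
    version = root.get("resultSchemaVersion", "")
    match = VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    print(parse_schema_version(SAMPLE))
```

Returning a tuple of integers lets callers compare versions numerically, for example `parse_schema_version(doc) >= (1, 0, 0)`.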
DataFieldResult
property | type | written as | required | description |
---|---|---|---|---|
name | string | attribute | | The name of the data field. |
dataType | string | attribute | | The type of the data field as specified in the template. |
results | ResultValue | multiple elements | | The list of recognition results, each of which is one of several types: TEXT, IMAGE, TABLE, GROUP. |
DataFieldResult xml example
<dataFieldResult name="M1" dataType="root">
  <result ...>
    ...
  </result>
  ...
  <result ...>
    ...
  </result>
</dataFieldResult>
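To illustrate how the `name`, `dataType`, and nested `result` elements fit together, here is a minimal Python sketch that lists every data field with the types of its results. It assumes the standard-library `xml.etree.ElementTree` parser; `summarize_fields` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative input modelled on the examples in this specification.
SAMPLE = b"""<results resultSchemaVersion="1.0.0">
  <dataFieldResult name="M1" dataType="root1">
    <result resultType="TEXT"><content>st nd</content></result>
  </dataFieldResult>
  <dataFieldResult name="M2" dataType="root2">
    <result resultType="IMAGE"><base64>abcdefghijk</base64></result>
  </dataFieldResult>
</results>"""


def summarize_fields(xml_bytes):
    """Return a list of (name, dataType, [resultType, ...]) per data field."""
    root = ET.fromstring(xml_bytes)
    summary = []
    for field in root.findall("dataFieldResult"):
        # findall() with a bare tag name only matches direct children,
        # so nested table rows or group entries are not counted here.
        types = [result.get("resultType") for result in field.findall("result")]
        summary.append((field.get("name"), field.get("dataType"), types))
    return summary


if __name__ == "__main__":
    for entry in summarize_fields(SAMPLE):
        print(entry)
```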
ResultValue
TextResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always TEXT. |
content | string | single element | | Extracted text data. |
pageLocationMeta | PageLocationMeta | single element | | Describes the location of the result within the PDF file. |
fontMeta | FontMeta | single element | | Describes the font of the result content. |
TextResult xml without meta example
<result resultType="TEXT">
  <content>st nd</content>
</result>
TextResult xml with meta example
<result resultType="TEXT">
  <pageLocationMeta .../>
  <fontMeta .../>
  <content>st nd</content>
</result>
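Because the two examples above differ only in whether the meta elements are present, a reader has to treat them as optional. The following Python sketch shows one way to do so with the standard-library `xml.etree.ElementTree`; the function name `read_text_result` and the attribute values in the sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

WITHOUT_META = """<result resultType="TEXT">
  <content>st nd</content>
</result>"""

# Attribute values below are illustrative.
WITH_META = """<result resultType="TEXT">
  <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
  <fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
  <content>st nd</content>
</result>"""


def read_text_result(xml_text):
    """Return the text content and flags telling which meta elements exist."""
    elem = ET.fromstring(xml_text)
    if elem.get("resultType") != "TEXT":
        raise ValueError("not a TEXT result")
    return {
        # find()/findtext() look children up by name, so element order
        # inside <result> does not matter to this reader.
        "content": elem.findtext("content", default=""),
        "has_location": elem.find("pageLocationMeta") is not None,
        "has_font": elem.find("fontMeta") is not None,
    }


if __name__ == "__main__":
    print(read_text_result(WITHOUT_META))
    print(read_text_result(WITH_META))
```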
ImageResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always IMAGE. |
base64 | string | single element | | The extracted image bytes encoded as a base64 string. |
pageLocationMeta | PageLocationMeta | single element | | Describes the location of the result within the PDF file. |
ImageResult xml without meta example
<result resultType="IMAGE">
  <base64>abcdefghijk</base64>
</result>
ImageResult xml with meta example
<result resultType="IMAGE">
  <pageLocationMeta .../>
  <base64>abcdefghijk</base64>
</result>
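Recovering the image bytes means decoding the `base64` element. A minimal Python sketch follows, assuming the standard-library `base64` and `xml.etree.ElementTree` modules; `decode_image` is a hypothetical helper. The placeholder `abcdefghijk` used in the examples above is not valid padded base64, so the sketch uses a different illustrative payload (`aGVsbG8=`, which is not a real image).

```python
import base64
import xml.etree.ElementTree as ET

# "aGVsbG8=" is an illustrative payload, not actual image data.
SAMPLE = """<result resultType="IMAGE">
  <base64>aGVsbG8=</base64>
</result>"""


def decode_image(xml_text):
    """Return the raw bytes stored in an IMAGE result."""
    elem = ET.fromstring(xml_text)
    if elem.get("resultType") != "IMAGE":
        raise ValueError("not an IMAGE result")
    payload = elem.findtext("base64")
    if payload is None:
        raise ValueError("IMAGE result has no <base64> element")
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(payload.strip(), validate=True)


if __name__ == "__main__":
    print(decode_image(SAMPLE))
```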
TableResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always TABLE. |
rows | TableRowResult | multiple elements | | The list of table row results. |
pageLocationMetas | PageLocationMeta | multiple elements | | Describes the locations of the result within the PDF file. Contains multiple values when the table spans several pages. |
TableResult without meta xml example
<result resultType="TABLE">
  <result resultType="TABLE_ROW">
    ...
  </result>
  ...
  <result resultType="TABLE_ROW">
    ...
  </result>
</result>
TableResult with meta xml example
<result resultType="TABLE">
  <pageLocationMeta .../>
  ...
  <pageLocationMeta .../>
  <result resultType="TABLE_ROW">
    ...
  </result>
  ...
  <result resultType="TABLE_ROW">
    ...
  </result>
</result>
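Since a table may carry several `pageLocationMeta` elements, one per page it occupies, a consumer can derive the set of pages a table spans. The Python sketch below shows this under the assumption that the table-level locations are the direct children of the TABLE result; `table_pages` and all coordinate values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Coordinates and page numbers are illustrative.
SAMPLE = """<result resultType="TABLE">
  <pageLocationMeta x="50.0" y="700.0" width="400.0" height="120.0" page="2"/>
  <pageLocationMeta x="50.0" y="40.0" width="400.0" height="80.0" page="3"/>
  <result resultType="TABLE_ROW">
    <pageLocationMeta x="50.0" y="700.0" width="400.0" height="12.0" page="2"/>
  </result>
</result>"""


def table_pages(xml_text):
    """Return the sorted page numbers covered by a TABLE result."""
    table = ET.fromstring(xml_text)
    if table.get("resultType") != "TABLE":
        raise ValueError("not a TABLE result")
    # Only direct children are inspected; row and cell locations are
    # nested one or two levels deeper and are ignored here.
    pages = {int(meta.get("page")) for meta in table.findall("pageLocationMeta")}
    return sorted(pages)


if __name__ == "__main__":
    print(table_pages(SAMPLE))
```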
TableRowResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always TABLE_ROW. |
cells | TableCellResult | multiple elements | | The list of table cells in the row. |
pageLocationMeta | PageLocationMeta | single element | | Describes the location of the result within the PDF file. |
TableRowResult xml without meta example
<result resultType="TABLE_ROW">
  <result resultType="TABLE_CELL" ...>
    ...
  </result>
  ...
  <result resultType="TABLE_CELL" ...>
    ...
  </result>
</result>
TableRowResult xml with meta example
<result resultType="TABLE_ROW">
  <pageLocationMeta .../>
  <result resultType="TABLE_CELL" ...>
    ...
  </result>
  ...
  <result resultType="TABLE_CELL" ...>
    ...
  </result>
</result>
TableCellResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always TABLE_CELL. |
content | string | single element | | Text data extracted from the cell. |
pageLocationMeta | PageLocationMeta | single element | | Describes the location of the result within the PDF file. |
fontMeta | FontMeta | single element | | Describes the font of the result content. |
rowspan | int | attribute | | The number of rows the cell spans. |
colspan | int | attribute | | The number of columns the cell spans. |
TableCellResult without meta xml example
<result resultType="TABLE_CELL" rowspan="2" colspan="2">
  <content>Key</content>
</result>
TableCellResult with meta xml example
<result resultType="TABLE_CELL" rowspan="2" colspan="2">
  <pageLocationMeta .../>
  <fontMeta .../>
  <content>Key</content>
</result>
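The `rowspan` and `colspan` attributes mean a cell can occupy more than one grid position. The Python sketch below expands a TABLE result into a rectangular grid. It rests on two assumptions not stated in this specification: spans behave like their HTML counterparts, and a missing attribute means a span of 1. The function name `table_to_grid` is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<result resultType="TABLE">
  <result resultType="TABLE_ROW">
    <result resultType="TABLE_CELL" rowspan="2"><content>A</content></result>
    <result resultType="TABLE_CELL"><content>B</content></result>
  </result>
  <result resultType="TABLE_ROW">
    <result resultType="TABLE_CELL"><content>C</content></result>
  </result>
</result>"""


def table_to_grid(xml_text):
    """Expand spans so every covered grid position holds the cell text."""
    table = ET.fromstring(xml_text)
    occupied = {}  # (row, column) -> cell text
    rows = table.findall("result[@resultType='TABLE_ROW']")
    for r, row in enumerate(rows):
        c = 0
        for cell in row.findall("result[@resultType='TABLE_CELL']"):
            # Skip positions already claimed by a rowspan from above.
            while (r, c) in occupied:
                c += 1
            text = cell.findtext("content", default="")
            rowspan = int(cell.get("rowspan", "1"))  # assumed default
            colspan = int(cell.get("colspan", "1"))  # assumed default
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied[(r + dr, c + dc)] = text
            c += colspan
    if not occupied:
        return []
    height = max(r for r, _ in occupied) + 1
    width = max(c for _, c in occupied) + 1
    # Positions no cell covers are reported as None.
    return [[occupied.get((r, c)) for c in range(width)] for r in range(height)]


if __name__ == "__main__":
    for line in table_to_grid(SAMPLE):
        print(line)
```

Repeating the text in every covered position keeps the grid rectangular; a consumer that needs to distinguish the origin cell from its continuation could store a marker instead.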
GroupResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always GROUP. |
entries | GroupEntryResult | multiple elements | | List of grouped result entries. |
GroupResult xml example
<result resultType="GROUP">
  <result resultType="GROUP_ENTRY" ...>
    ...
  </result>
  ...
  <result resultType="GROUP_ENTRY" ...>
    ...
  </result>
</result>
GroupEntryResult
property | type | written as | required | description |
---|---|---|---|---|
resultType | string | attribute | | The type of the result value. Always GROUP_ENTRY. |
name | string | attribute | | The group entry name. |
dataType | string | attribute | | The type of the group entry data. |
results | ResultValue | multiple elements | | The list of recognition results, each of which is one of several types: TEXT, IMAGE, TABLE, GROUP. |
GroupEntryResult xml example
<result name="GroupEntry" dataType="dataType" resultType="GROUP_ENTRY">
  <result ...>
    ...
  </result>
  ...
  <result ...>
    ...
  </result>
</result>
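Because a group entry can itself contain GROUP results, the structure is recursive. The Python sketch below converts a GROUP result into nested dictionaries and lists. It handles only TEXT and GROUP for brevity, and it assumes entry names are unique within a group (duplicates would overwrite each other); `convert` and the sample names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Entry names and values are illustrative.
SAMPLE = """<result resultType="GROUP">
  <result name="Address" dataType="address" resultType="GROUP_ENTRY">
    <result resultType="TEXT"><content>Main Street</content></result>
    <result resultType="GROUP">
      <result name="City" dataType="city" resultType="GROUP_ENTRY">
        <result resultType="TEXT"><content>Springfield</content></result>
      </result>
    </result>
  </result>
</result>"""


def convert(result):
    """Map a TEXT result to str and a GROUP result to {entry name: [values]}."""
    kind = result.get("resultType")
    if kind == "TEXT":
        return result.findtext("content", default="")
    if kind == "GROUP":
        return {
            # Each GROUP_ENTRY holds a list of results, converted recursively.
            entry.get("name"): [convert(child) for child in entry.findall("result")]
            for entry in result.findall("result")
        }
    # IMAGE and TABLE are left out of this sketch.
    raise NotImplementedError(f"unsupported resultType: {kind!r}")


if __name__ == "__main__":
    print(convert(ET.fromstring(SAMPLE)))
```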
Meta
PageLocationMeta
property | type | written as | required | description |
---|---|---|---|---|
x | double | attribute | | The x coordinate on the page. |
y | double | attribute | | The y coordinate on the page. |
width | double | attribute | | The width of the location. |
height | double | attribute | | The height of the location. |
page | int | attribute | | The page number. |
PageLocationMeta xml example
<pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
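The attribute types in the table map directly onto a small typed record. As a minimal sketch, the Python code below reads a `pageLocationMeta` element into a dataclass using the standard-library parser; the `PageLocation` class and `parse_page_location` function are hypothetical names. The specification does not state the coordinate origin or units, so the sketch does not interpret the values.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SAMPLE = '<pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>'


@dataclass(frozen=True)
class PageLocation:
    x: float
    y: float
    width: float
    height: float
    page: int


def parse_page_location(xml_text):
    """Convert a <pageLocationMeta> element into a PageLocation record."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "pageLocationMeta":
        raise ValueError(f"unexpected element: {elem.tag!r}")
    # double attributes become float, the page attribute becomes int.
    return PageLocation(
        x=float(elem.get("x")),
        y=float(elem.get("y")),
        width=float(elem.get("width")),
        height=float(elem.get("height")),
        page=int(elem.get("page")),
    )


if __name__ == "__main__":
    print(parse_page_location(SAMPLE))
```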
FontMeta
property | type | written as | required | description |
---|---|---|---|---|
fontName | string | attribute | | The font name. |
fontStyle | string | attribute | | The font style. Possible values: NORMAL, BOLD, ITALIC, BOLD_ITALIC. |
fontColor | string (RGB format) | attribute | | The font color. The format is #rrggbb, where rr, gg, and bb are the hexadecimal representations of the red, green, and blue color values. |
FontMeta xml example
<fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
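The `fontStyle` value set and the `#rrggbb` color format can both be checked while reading the element. The Python sketch below does so and converts the color into an integer RGB triple; `parse_font_meta` is a hypothetical helper, and accepting upper-case hex digits is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative element; the color differs from the example above.
SAMPLE = '<fontMeta fontName="TimesNewRomanPSMT" fontStyle="BOLD" fontColor="#ff8000"/>'

FONT_STYLES = {"NORMAL", "BOLD", "ITALIC", "BOLD_ITALIC"}
COLOR_PATTERN = re.compile(r"#([0-9a-fA-F]{2})([0-9a-fA-F]{2})([0-9a-fA-F]{2})")


def parse_font_meta(xml_text):
    """Return the font name, style, and color as an (r, g, b) tuple."""
    elem = ET.fromstring(xml_text)
    style = elem.get("fontStyle")
    if style not in FONT_STYLES:
        raise ValueError(f"unknown fontStyle: {style!r}")
    color = elem.get("fontColor", "")
    match = COLOR_PATTERN.fullmatch(color)
    if match is None:
        raise ValueError(f"fontColor is not in #rrggbb format: {color!r}")
    # Each two-digit hex group becomes an integer in the range 0..255.
    rgb = tuple(int(group, 16) for group in match.groups())
    return {"name": elem.get("fontName"), "style": style, "rgb": rgb}


if __name__ == "__main__":
    print(parse_font_meta(SAMPLE))
```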
Complete example
RecognitionResult without meta xml example
<?xml version="1.0" encoding="UTF-8"?>
<results resultSchemaVersion="1.0.0">
  <dataFieldResult name="M1" dataType="root1">
    <result resultType="TEXT">
      <content>st nd</content>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M2" dataType="root2">
    <result resultType="IMAGE">
      <base64>abcdefghijk</base64>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M3" dataType="root3">
    <result resultType="TABLE">
      <result resultType="TABLE_ROW">
        <result resultType="TABLE_CELL">
          <content>Key</content>
        </result>
        <result resultType="TABLE_CELL" rowspan="2" colspan="2">
          <content>Key</content>
        </result>
      </result>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M4" dataType="root4">
    <result resultType="GROUP">
      <result name="GroupEntry" dataType="dataType" resultType="GROUP_ENTRY">
        <result resultType="TEXT">
          <content>Group Text</content>
        </result>
      </result>
    </result>
  </dataFieldResult>
</results>
RecognitionResult with meta xml example
<?xml version="1.0" encoding="UTF-8"?>
<results resultSchemaVersion="1.0.0">
  <dataFieldResult name="M1" dataType="root1">
    <result resultType="TEXT">
      <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
      <fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
      <content>st nd</content>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M2" dataType="root2">
    <result resultType="IMAGE">
      <pageLocationMeta x="160.8" y="400.31" width="20.1" height="7.42" page="2"/>
      <base64>abcdefghijk</base64>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M3" dataType="root3">
    <result resultType="TABLE">
      <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
      <result resultType="TABLE_ROW">
        <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
        <result resultType="TABLE_CELL">
          <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
          <fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
          <content>Key</content>
        </result>
        <result resultType="TABLE_CELL" rowspan="2" colspan="2">
          <pageLocationMeta x="176.8" y="350.9" width="34.1" height="6.42" page="2"/>
          <fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
          <content>Key</content>
        </result>
      </result>
    </result>
  </dataFieldResult>
  <dataFieldResult name="M4" dataType="root4">
    <result resultType="GROUP">
      <result name="GroupEntry" dataType="dataType" resultType="GROUP_ENTRY">
        <result resultType="TEXT">
          <pageLocationMeta x="176.8" y="543.52" width="34.1" height="6.42" page="2"/>
          <fontMeta fontName="TimesNewRomanPSMT" fontStyle="NORMAL" fontColor="#000000"/>
          <content>Group Text</content>
        </result>
      </result>
    </result>
  </dataFieldResult>
</results>