Search area

By default pdf2Data applies a parsing pipeline to the entire document.

You can restrict the search area by specifying a single page, a range of pages, and/or a specific area within the document.

Restricting pages

If you only need to restrict the page(s), choose the Page option. You can then specify a single page number or select a range of pages.

When defining a range, you can specify where to count from: the start of the document or its end.
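To make the two counting directions concrete, here is a minimal sketch of how a page range could be resolved to absolute page numbers. The function `resolve_page_range` and its parameters are hypothetical and are not part of pdf2Data; the sketch assumes 1-based page numbers and that, when counting from the end, offset 1 means the last page.

```python
def resolve_page_range(first: int, last: int, total_pages: int,
                       from_end: bool = False) -> list[int]:
    """Return absolute 1-based page numbers for a page range.

    `first` and `last` are 1-based offsets counted from the start of the
    document, or from its end when `from_end` is True (assumption: offset 1
    is the last page). This is an illustration, not the pdf2Data API.
    """
    if not (1 <= first <= last <= total_pages):
        raise ValueError("expected 1 <= first <= last <= total_pages")
    if from_end:
        # Offset 1 from the end is the last page, offset 2 the page before it.
        return list(range(total_pages - last + 1, total_pages - first + 2))
    return list(range(first, last + 1))


# Pages 2 to 4 of a 10-page document, counted from the start.
pages_from_start = resolve_page_range(2, 4, total_pages=10)

# The last two pages of the same document, counted from the end.
pages_from_end = resolve_page_range(1, 2, total_pages=10, from_end=True)
```

Counting from the end is handy when the page of interest (for example, a totals page) is always last but the document length varies.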

Selecting a specific area

To select a specific area in the document, choose the Custom option. You can then draw the area on the canvas.

You can further adjust the coordinates by resizing the rectangle, or redraw the area using the corresponding button.

Extending the search area

You can include parts of the page adjacent to the selected area in your search area, or exclude them again, by clicking on them.

Repeat search area on multiple pages

After you draw a region, the page on which it was drawn is selected as the target page. You can include more pages by selecting the "All" or "Multiple" option, in which case the selected area is repeated on those pages as well. This is useful for cutting off static parts of the document, such as headers and footers, that may interfere with the results.
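The idea of repeating one region across pages can be sketched as follows. Everything here is hypothetical and for illustration only: `Rect`, `repeat_area`, and the mode names `"single"`, `"all"`, and `"multiple"` are assumptions, not pdf2Data types or settings, and the coordinates are arbitrary page units.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (hypothetical model)."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # A point is inside when it lies within both the x and y extents.
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def repeat_area(area: Rect, drawn_on: int, total_pages: int,
                mode: str = "single",
                pages: Optional[Iterable[int]] = None) -> dict[int, Rect]:
    """Map each target page number (1-based) to the same search rectangle."""
    if mode == "single":
        targets = [drawn_on]
    elif mode == "all":
        targets = list(range(1, total_pages + 1))
    elif mode == "multiple":
        targets = sorted(set(pages or []))
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {page: area for page in targets}


# A body region that leaves out a header band at the top and a footer band
# at the bottom, repeated on every page of a 3-page document.
body = Rect(x=0, y=50, width=600, height=700)
areas = repeat_area(body, drawn_on=1, total_pages=3, mode="all")

# A point in the header band (y below 50 in this sketch) falls outside the
# search area on every page, so content there would be ignored.
header_point_included = areas[2].contains(300, 20)
```

Because the same rectangle is reused on each page, this approach only works well when the static parts sit at the same position throughout the document.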

Finish editing

When you are done modifying the search area, you can exit search area editing mode by clicking on the data field name, any selector, or the data field settings. The region then becomes non-interactive on the canvas, so you can focus on the result rectangles.