Deploying pdf2Data Editor manually

This guide describes how to manually configure and deploy pdf2Data Editor. It requires a deep understanding of Docker and its possible pitfalls, which are outside the scope of this guide. Manual deployment has a few obvious shortcomings:

  1. It is error-prone and time-consuming.
  2. Not only the initial installation but also every upgrade requires manually reviewing and changing configuration files.
caution

For these reasons, we recommend following the guide "Deploying pdf2Data Editor with a helper script" instead whenever possible.

Prerequisites

Since pdf2Data Editor is provided as Docker containers, we assume that you have some familiarity with containerization, particularly with Docker Compose.

To deploy and start the app, the following software must be pre-installed.

  • Docker 19.03.0+. To verify the installation, run docker --version in a terminal
  • Docker Compose plugin 1.27.0+. To verify the installation, run docker compose --version in a terminal
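
The version requirements above can also be checked with a short script. The following is a minimal sketch, assuming a Bash-compatible shell and a sort implementation that supports -V (as in GNU coreutils). The helper names version_ge and check_tool are illustrative and not part of pdf2Data Editor.

```shell
#!/usr/bin/env bash
# Minimum versions taken from the prerequisites list.
MIN_DOCKER="19.03.0"
MIN_COMPOSE="1.27.0"

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort), which is an assumption about the host.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# $1: label, $2: detected version (may be empty), $3: minimum version
check_tool() {
  if [ -z "$2" ]; then
    echo "$1: not found"
    return 1
  fi
  if version_ge "$2" "$3"; then
    echo "$1: $2 (ok, >= $3)"
  else
    echo "$1: $2 (too old, need >= $3)"
    return 1
  fi
}

if command -v docker >/dev/null 2>&1; then
  # Ask the Docker CLI and the Compose plugin for their bare version strings.
  docker_version="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
  compose_version="$(docker compose version --short 2>/dev/null)"
  check_tool "Docker" "$docker_version" "$MIN_DOCKER"
  # Strip a leading "v" in case the plugin reports one.
  check_tool "Docker Compose" "${compose_version#v}" "$MIN_COMPOSE"
else
  echo "docker: not found on PATH"
fi
```

The script only reports what it finds; it does not install or change anything.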

All pdf2Data Editor components are available as images on AWS ECR (or on Docker Hub for older versions), so your system must be able to reach the corresponding registry.

Deployment

Create Docker Compose configuration file

To deploy the application, create a docker-compose.yml file with the content that matches the variant you want to run:

Lightweight UI

services:
  pdf2data-editor:
    container_name: pdf2data-editor
    pull_policy: always
    restart: unless-stopped
    image: public.ecr.aws/apryse/pdf2data-editor:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    ports: ['80:8080']

Fully-functional UI

services:
  #=========== FRONTEND ===============
  pdf2data-manager-frontend:
    container_name: pdf2data-manager-frontend
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-frontend:{version} # replace {version} placeholder with the actual app version
    ports: ['80:8080']
    env_file: .env
    depends_on: [pdf2data-manager-backend, pdf2data-editor]
  #=========== BACKEND ===============
  pdf2data-manager-backend:
    container_name: pdf2data-manager-backend
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-backend:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    depends_on: [pdf2data-manager-db]
  #=========== DATABASE ===============
  pdf2data-manager-db:
    container_name: pdf2data-manager-db
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-manager-db:{version} # replace {version} placeholder with the actual app version
    env_file: .env
    volumes: ['pdf2data-manager-db:/var/lib/postgresql/data']
  #=========== EDITOR =================
  pdf2data-editor:
    container_name: pdf2data-editor
    restart: unless-stopped
    pull_policy: always
    image: public.ecr.aws/apryse/pdf2data-editor:{version} # replace {version} placeholder with the actual app version
    env_file: .env
#=========== GENERAL ================
volumes:
  pdf2data-manager-db: null
networks:
  default:
    name: pdf2data-manager-network

caution

To install a version of the application prior to 4.2.0, drop the public.ecr.aws/ prefix from all image names so that the images are downloaded from Docker Hub instead.
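
For versions prior to 4.2.0, the prefix can be removed from every image line in one pass rather than by hand. The following is a minimal sketch, assuming the image lines look like the ones in the configuration above. The function name strip_ecr_prefix and the output file name are hypothetical.

```shell
# Print the given Compose file with the public.ecr.aws/ prefix removed
# from every "image:" line. The input file itself is left unchanged.
strip_ecr_prefix() {
  sed 's#image: public\.ecr\.aws/#image: #' "$1"
}

# Example usage (hypothetical output file name):
#   strip_ecr_prefix docker-compose.yml > docker-compose.hub.yml
```

Writing to a separate file keeps the original configuration available for comparison before it is replaced.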

Create an environment configuration file

pdf2Data Editor is configured through environment variables.

The variables are grouped by purpose and have self-explanatory names and default values. Set all of them in a separate .env file located in the same directory as the docker-compose.yml file.

note

The Editor page uses Apryse WebViewer. Without a license, PDF files are displayed with a watermark. To remove it, set the PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN variable to your license value.

Below is a sample of the .env file:

Lightweight UI:

PDF2DATA_EDITOR_MODE=STANDALONE
PDF2DATA_EDITOR_URL=pdf2data-editor:8080
PDF2DATA_EDITOR_CONTAINER_MEMORY_LIMIT=2048
PDF2DATA_EDITOR_JVM_MEMORY_LIMIT_MB=1848
PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN=... # WebViewer license value

Fully-functional UI:

PDF2DATA_EDITOR_MODE=MANAGER
PDF2DATA_EDITOR_TEMPLATE_REPOSITORY_MANAGER_HOST=http://pdf2data-manager-backend:8080/api
PDF2DATA_EDITOR_URL=pdf2data-editor:8080
PDF2DATA_EDITOR_WEB_VIEWER_API_TOKEN=... # WebViewer license value

PDF2DATA_MANAGER_BACKEND_URL=pdf2data-manager-backend:8080
PDF2DATA_MANAGER_MULTIPLE_WORKSPACES=false
PDF2DATA_MANAGER_DEFAULT_ADMIN_EMAIL=<valid email address>
PDF2DATA_MANAGER_DEFAULT_ADMIN_PASSWORD=<SOME_SECURE_PASSWORD>
PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY=... # ! Fill with Base64-encoded value of a random key. Minimum 512 bit keys are recommended

PDF2DATA_MANAGER_DB_NAME=postgres
PDF2DATA_MANAGER_DB_PASSWORD=postgres
PDF2DATA_MANAGER_DB_SCHEMA=manager
PDF2DATA_MANAGER_DB_URL=pdf2data-manager-db:5432
PDF2DATA_MANAGER_DB_USERNAME=postgres
PDF2DATA_MANAGER_JPA_GENERATE_STATISTICS=false
PDF2DATA_MANAGER_JPA_SHOW_SQL=false

POSTGRES_DB=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_USER=postgres
PDF2DATA_EDITOR_CONTAINER_MEMORY_LIMIT=819M
PDF2DATA_EDITOR_JVM_MEMORY_LIMIT_MB=619
PDF2DATA_MANAGER_BACKEND_CONTAINER_MEMORY_LIMIT=819
PDF2DATA_MANAGER_BACKEND_JVM_MEMORY_LIMIT_MB=619
PDF2DATA_MANAGER_DATABASE_CONTAINER_MEMORY_LIMIT=205M
PDF2DATA_MANAGER_FRONTEND_CONTAINER_MEMORY_LIMIT=205M
important

For more information on the available settings and their meaning, see Customizing pdf2Data Editor application.
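
The sample for the fully-functional UI asks for PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY to hold a Base64-encoded random key of at least 512 bits. The following is a minimal sketch of one way to produce such a value, assuming /dev/urandom and the base64 utility are available. The guide does not state which key format the application expects beyond Base64-encoded random data, so treat the raw-bytes approach here as an assumption. The function name generate_token_key is hypothetical.

```shell
# Read 64 random bytes (512 bits), Base64-encode them,
# and remove any line wrapping added by base64.
generate_token_key() {
  head -c 64 /dev/urandom | base64 | tr -d '\n'
}

key="$(generate_token_key)"

# Print a line that can be pasted into the .env file.
echo "PDF2DATA_MANAGER_DEFAULT_TOKEN_PRIVATE_KEY=$key"
```

Generate the key once and keep it stable; the printed value is a secret and should not be committed to version control.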

Run / Stop the application

To run the application, execute the following command in the folder that contains the docker-compose.yml and .env files described above:

docker compose up -d
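
Before starting the application, it can help to confirm that both files are in place and that the {version} placeholder has been replaced. The following is a minimal sketch of such a check. The function name preflight is hypothetical and not part of pdf2Data Editor.

```shell
# Check a deployment folder (default: current directory) before start-up.
# Prints "ok" and succeeds only when both files exist and the Compose file
# no longer contains the literal {version} placeholder.
preflight() {
  preflight_dir="${1:-.}"
  for preflight_file in docker-compose.yml .env; do
    if [ ! -f "$preflight_dir/$preflight_file" ]; then
      echo "missing: $preflight_dir/$preflight_file"
      return 1
    fi
  done
  if grep -qF '{version}' "$preflight_dir/docker-compose.yml"; then
    echo "docker-compose.yml still contains the {version} placeholder"
    return 1
  fi
  echo "ok"
}

# Example usage: run the check in the deployment folder,
# then start the application only if it prints "ok".
#   preflight .
```

The check only inspects the files; it does not validate the variable values inside .env.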

To stop the application, execute the following command in the same folder:

docker compose down