Docker Compose (Recommended)
The simplest way to run OpenDataMask in any environment:
git clone https://github.com/MaximumTrainer/OpenDataMask.git
cd OpenDataMask
# Generate and export secrets
export JWT_SECRET=$(openssl rand -base64 32)
export ENCRYPTION_KEY=$(openssl rand -base64 32 | head -c 32)
# Start all services
docker-compose up -d
# Check status
docker-compose ps
docker-compose logs -f backend
Services started:
- frontend: Vue 3 UI served by nginx on port 80
- backend: Spring Boot REST API on port 8080
- postgres: PostgreSQL 16 database on port 5432
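Once the containers are up, it can help to wait until the backend answers before opening the UI. The following is a minimal sketch, not part of the project: `wait_for` is a hypothetical helper name, and the example URL assumes the backend is published on localhost:8080 and serves /actuator/health.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Hypothetical helper: retries COMMAND until it succeeds or ATTEMPTS run out.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (assumes the backend is reachable on localhost:8080):
#   wait_for 30 2 curl -fsS http://localhost:8080/actuator/health && echo "backend is up"
```

Passing the probe as a command keeps the loop reusable for other checks, such as the frontend on port 80.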
Docker Images
Pre-built images are published to the GitHub Container Registry on every push to main:
# Backend
docker pull ghcr.io/maximumtrainer/opendatamask/backend:latest
# Frontend
docker pull ghcr.io/maximumtrainer/opendatamask/frontend:latest
# CLI
docker pull ghcr.io/maximumtrainer/opendatamask/cli:latest
Building locally
docker build -t opendatamask-backend ./backend
docker build -t opendatamask-frontend ./frontend
docker build -t opendatamask-cli ./cli
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_URL | Yes | (none) | JDBC URL for PostgreSQL metadata store |
| DATABASE_USERNAME | Yes | (none) | PostgreSQL username |
| DATABASE_PASSWORD | Yes | (none) | PostgreSQL password |
| JWT_SECRET | Yes | (none) | JWT signing secret. Generate with openssl rand -base64 32 |
| ENCRYPTION_KEY | Yes | (none) | Credential encryption key, exactly 16 or 32 characters |
| SERVER_PORT | No | 8080 | Backend HTTP listen port |
| JWT_EXPIRATION | No | 86400000 | Token expiry in milliseconds (default 24 h) |
| MONGODB_URI | No | (none) | MongoDB connection URI (only needed when masking MongoDB sources) |
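A missing variable or a key of the wrong length is easier to catch before the backend starts. The sketch below is one possible pre-flight check based only on the table above; `check_env` is a hypothetical helper and is not shipped with OpenDataMask.

```shell
# check_env: hypothetical pre-flight check for the required variables.
# Returns non-zero if any required variable is empty or unset, or if
# ENCRYPTION_KEY is not exactly 16 or 32 characters long.
check_env() {
  status=0
  for name in DATABASE_URL DATABASE_USERNAME DATABASE_PASSWORD JWT_SECRET ENCRYPTION_KEY; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      status=1
    fi
  done
  key=${ENCRYPTION_KEY:-}
  if [ -n "$key" ] && [ "${#key}" -ne 16 ] && [ "${#key}" -ne 32 ]; then
    echo "ENCRYPTION_KEY must be exactly 16 or 32 characters (got ${#key})" >&2
    status=1
  fi
  return "$status"
}

# Example:
#   check_env && docker-compose up -d
```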
Database Setup
OpenDataMask uses PostgreSQL 15+ for its metadata store. On first startup, Hibernate automatically creates the required schema (ddl-auto: update). No manual migration is needed.
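# Create the user and database manually (if not using docker-compose).
# On PostgreSQL 15+, only the database owner can create tables in the public
# schema by default, so the application user is made the owner.
# Replace 'secret' with your own password.
psql -U postgres -c "CREATE USER opendatamask WITH PASSWORD 'secret';"
psql -U postgres -c "CREATE DATABASE opendatamask OWNER opendatamask;"
psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE opendatamask TO opendatamask;"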
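For a database created this way, the backend's connection variables might be set as follows. This is a sketch that assumes PostgreSQL is listening on localhost:5432; `secret` is the placeholder password from the commands above and should be replaced with your own.

```shell
# Assumes PostgreSQL on localhost:5432; adjust host and port for your setup.
export DATABASE_URL="jdbc:postgresql://localhost:5432/opendatamask"
export DATABASE_USERNAME="opendatamask"
# Placeholder password from the CREATE USER command; use your own value.
export DATABASE_PASSWORD="secret"
```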
Terraform (AWS)
The infra/ directory provides Terraform configuration to provision a complete AWS environment: VPC, EC2 instance, security groups, Elastic IP, and S3/DynamoDB remote state. Everything runs as docker-compose on a single t3.small EC2 instance (Amazon Linux 2023), keeping costs low while remaining production-upgradeable.
Prerequisites
- Terraform 1.6+ installed (see the Terraform install guide)
- AWS account with IAM credentials (EC2, VPC, S3, DynamoDB permissions)
- AWS CLI configured:
aws configure
One-time: Bootstrap Remote State
# Create S3 bucket for Terraform state
aws s3api create-bucket --bucket my-opendatamask-tfstate --region us-east-1
aws s3api put-bucket-versioning \
--bucket my-opendatamask-tfstate \
--versioning-configuration Status=Enabled
# Create DynamoDB table for state locking
aws dynamodb create-table \
--table-name opendatamask-tf-locks \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
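Before running terraform init, it may be worth confirming that both resources are ready. The sketch below wraps two read-only AWS CLI calls in a hypothetical helper, `verify_state_backend`; the bucket and table names in the example are the ones used above.

```shell
# verify_state_backend BUCKET TABLE
# Hypothetical helper: succeeds only if the bucket has versioning enabled
# and the DynamoDB lock table is ACTIVE.
verify_state_backend() {
  bucket=$1
  table=$2
  versioning=$(aws s3api get-bucket-versioning --bucket "$bucket" \
    --query Status --output text) || return 1
  if [ "$versioning" != "Enabled" ]; then
    echo "bucket $bucket: versioning is '$versioning', expected 'Enabled'" >&2
    return 1
  fi
  table_status=$(aws dynamodb describe-table --table-name "$table" \
    --query Table.TableStatus --output text) || return 1
  if [ "$table_status" != "ACTIVE" ]; then
    echo "table $table: status is '$table_status', expected 'ACTIVE'" >&2
    return 1
  fi
}

# Example:
#   verify_state_backend my-opendatamask-tfstate opendatamask-tf-locks
```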
Deploy
cd infra
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars and add your SSH public key
terraform init \
-backend-config="bucket=my-opendatamask-tfstate" \
-backend-config="dynamodb_table=opendatamask-tf-locks" \
-backend-config="region=us-east-1"
terraform plan
terraform apply
# Get the server's public IP
terraform output server_public_ip
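The output value can be fed straight into a health check. This sketch assumes the security group lets your machine reach the backend on port 8080, which this page does not confirm; `server_health_url` is a hypothetical helper.

```shell
# server_health_url: prints the backend health URL for the provisioned server.
# Assumes port 8080 is reachable from where you run this (not confirmed here).
server_health_url() {
  ip=$(terraform output -raw server_public_ip) || return 1
  printf 'http://%s:8080/actuator/health\n' "$ip"
}

# Example (run inside infra/ after terraform apply):
#   curl -fsS "$(server_health_url)"
```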
GitHub Secrets for CI/CD Pipeline
Configure in GitHub → Settings → Secrets and variables → Actions:
| Secret | Description |
|---|---|
| AWS_ACCESS_KEY_ID | AWS IAM access key |
| AWS_SECRET_ACCESS_KEY | AWS IAM secret key |
| AWS_REGION | AWS region (e.g. us-east-1) |
| EC2_SSH_PRIVATE_KEY | PEM private key for SSH deploys |
| EC2_SSH_PUBLIC_KEY | Matching SSH public key (stored in EC2) |
| JWT_SECRET | 32+ char JWT signing secret |
| ENCRYPTION_KEY | 32 char field encryption key |
| TF_STATE_BUCKET | S3 bucket for Terraform state |
| TF_STATE_DYNAMODB_TABLE | DynamoDB table for state locking |
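As an alternative to the web UI, the two generated secrets can be set with the GitHub CLI, which reads a secret's value from standard input. This is a sketch assuming `gh` is installed and authenticated for the repository; `push_generated_secrets` is a hypothetical helper.

```shell
# push_generated_secrets: hypothetical helper that generates the two
# application secrets and stores them as GitHub Actions secrets.
# Values are piped via stdin so they never appear in the process arguments.
push_generated_secrets() {
  openssl rand -base64 32 | gh secret set JWT_SECRET || return 1
  openssl rand -base64 32 | head -c 32 | gh secret set ENCRYPTION_KEY || return 1
}

# Example (run inside the repository clone after `gh auth login`):
#   push_generated_secrets
```

The remaining secrets in the table hold existing credentials rather than generated values, so they would be set individually.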
Kubernetes
A basic Kubernetes deployment uses standard Deployment and Service resources. Store secrets with kubectl create secret:
kubectl create secret generic opendatamask-secrets \
--from-literal=JWT_SECRET="$(openssl rand -base64 32)" \
--from-literal=ENCRYPTION_KEY="$(openssl rand -base64 32 | head -c 32)" \
--from-literal=DATABASE_PASSWORD="your-db-password"
Reference the secret in your Deployment envFrom block. A PostgreSQL StatefulSet or a managed cloud database (AWS RDS, Azure Database for PostgreSQL, Google Cloud SQL) is recommended for production.
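A minimal sketch of such a Deployment is shown below, emitted from a shell function so it can be piped to kubectl. The Deployment name and labels are hypothetical; the image, port, and secret name come from this page.

```shell
# print_backend_deployment: writes a minimal Deployment manifest to stdout.
# Name and labels are hypothetical; adjust replicas and resources as needed.
print_backend_deployment() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: opendatamask-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opendatamask-backend
  template:
    metadata:
      labels:
        app: opendatamask-backend
    spec:
      containers:
        - name: backend
          image: ghcr.io/maximumtrainer/opendatamask/backend:latest
          ports:
            - containerPort: 8080
          envFrom:
            - secretRef:
                name: opendatamask-secrets
EOF
}

# Example:
#   print_backend_deployment | kubectl apply -f -
```

With envFrom, every key in the secret becomes an environment variable in the container. The secret created above holds only three keys, so DATABASE_URL and DATABASE_USERNAME would still need to be supplied, for example as additional keys in the same secret.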
CI / CD Pipelines
OpenDataMask ships with six GitHub Actions workflows delivering a complete build → deploy → verify → docs pipeline:
| Workflow | File | Trigger | Purpose |
|---|---|---|---|
| CI | ci.yml | push / PR to main | Build, lint, test backend + frontend + CLI; build and push Docker images to GHCR |
| Deploy | deploy.yml | after CI on main | Full pipeline: terraform apply → SSH deploy → health verify |
| Sandbox Verification | sandbox-verification.yml | push / PR to main | End-to-end masking correctness check; publishes JUnit report artifact |
| Playwright E2E | playwright-e2e.yml | after Sandbox Verification | Full browser E2E test suite against the deployed frontend |
| Deploy Website | deploy-website.yml | after E2E on main | Generate screenshots and publish documentation to GitHub Pages |
| CodeQL | codeql.yml | push / PR / weekly | Static security analysis for Kotlin, JS/TS, Go |
Deploy Pipeline Flow
push to main
 ├─► CI (build + test + Docker push → GHCR)
 │    └─► deploy.yml:
 │         ├─ Job 1: terraform apply → provision/update AWS infra
 │         ├─ Job 2: SSH deploy → docker-compose pull && up
 │         └─ Job 3: verify → curl /actuator/health → 200 ✓
 └─► Sandbox Masking Verification
      └─► Playwright E2E Tests
           └─► Deploy Website (GitHub Pages)
GitHub Environments (staging, production) track each deployment, enabling deployment status, history, and environment URLs in the GitHub UI.
Security Notes
- Never commit JWT_SECRET or ENCRYPTION_KEY to source control. Use GitHub Secrets or a secrets manager.
- The health endpoint (/actuator/health) is publicly accessible by default. Restrict it at the network level in production.
- All connection passwords are encrypted at rest using AES-256 keyed by ENCRYPTION_KEY.
- HTTPS termination should be handled by a reverse proxy (nginx, Cloudflare, ALB) in front of the backend.
- JWT tokens expire after 24 hours by default (JWT_EXPIRATION).
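JWT_EXPIRATION is given in milliseconds, so shortening the token lifetime means converting from hours. A small sketch of that arithmetic follows; `hours_to_ms` is a hypothetical helper, not part of the project.

```shell
# hours_to_ms HOURS: converts whole hours to milliseconds.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

hours_to_ms 24   # prints 86400000, the documented default

# Example: limit tokens to 8 hours (28800000 ms)
#   export JWT_EXPIRATION=$(hours_to_ms 8)
```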