Docker Deployment
Overview
Bodhi App provides optimized Docker images for different hardware configurations:
- CPU: Multi-platform (AMD64 + ARM64)
- CUDA: NVIDIA GPU acceleration (8-12x speedup)
- ROCm: AMD GPU acceleration
- Vulkan: Cross-vendor GPU acceleration
All variants run the same Bodhi App codebase with hardware-specific optimizations for llama.cpp inference.
Docker Registry: GitHub Container Registry (ghcr.io)
Note: Use the GitHub CLI (`gh`) to explore available images and tags at github.com/BodhiSearch/BodhiApp/pkgs/container/bodhiapp.
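As a sketch, the tags can be listed from the terminal via the GitHub packages API, assuming the package is owned by the BodhiSearch organization (adjust the owner/package path if it differs; requires a token with the read:packages scope):
# List all tags published for the bodhiapp container package
gh api "orgs/BodhiSearch/packages/container/bodhiapp/versions" \
--paginate --jq '.[].metadata.container.tags[]'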
Latest Docker Releases
For the most up-to-date Docker image versions and variants, visit getbodhi.app. The website automatically displays the latest production Docker releases with copy-to-clipboard commands for all available variants.
Why check the website:
- Always shows the latest version numbers
- Automatically updates when new variants are released
- Provides ready-to-use docker pull commands
- No manual version checking required
The examples in this documentation use `latest-{variant}` tags for convenience, but you can find specific version tags (e.g., `0.0.2-cpu`) on the website for production deployments.
Variant Comparison
| Variant | Platforms | Hardware | Use Case | Performance |
|---|---|---|---|---|
| CPU | AMD64, ARM64 | Any CPU | General purpose, ARM devices | Baseline |
| CUDA | AMD64 | NVIDIA GPU | NVIDIA GPUs, cloud instances | 8-12x faster |
| ROCm | AMD64 | AMD GPU | AMD GPUs | GPU accelerated* |
| Vulkan | AMD64 | Cross-vendor GPU | Multi-vendor GPU support | GPU accelerated* |
Note: Performance benchmark data is not yet available for ROCm and Vulkan variants. For image sizes, use the `gh` CLI to query the container registry (see the example above). New variants may be added over time; check getbodhi.app for the complete list.
Choosing a Variant:
- Have NVIDIA GPU? → CUDA variant (best performance)
- Have AMD GPU? → ROCm variant
- Need cross-vendor GPU support? → Vulkan variant
- CPU only or ARM device? → CPU variant
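If you are unsure what your host offers, a rough detection sketch follows (heuristics only: it assumes the presence of nvidia-smi implies a working NVIDIA driver, and /dev/kfd implies the AMD ROCm driver):
# Rough hardware detection to guide variant choice (heuristics only)
if command -v nvidia-smi >/dev/null 2>&1; then
echo "NVIDIA GPU detected -> use the CUDA variant"
elif [ -e /dev/kfd ]; then
echo "AMD ROCm device node found -> use the ROCm variant"
else
echo "No supported GPU detected -> use the CPU variant"
fi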
Prerequisites
- Docker 20.10+ installed (installation guide)
For GPU Variants:
- CUDA: NVIDIA GPU with CUDA 11+ support, Docker with NVIDIA GPU support (installation guide)
- ROCm: AMD GPU with ROCm support (refer to llama.cpp documentation for requirements)
Quick Start
CPU Variant (Recommended for Most Users)
# Pull image
docker pull ghcr.io/bodhisearch/bodhiapp:latest-cpu
# Run container
docker run --name bodhiapp \
-p 1135:8080 \
-e BODHI_PUBLIC_HOST=0.0.0.0 \
-e BODHI_PUBLIC_PORT=1135 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v $(pwd)/docker-data:/data \
ghcr.io/bodhisearch/bodhiapp:latest-cpu
Important: Replace `your-strong-encryption-key-here` with your own strong encryption key. The container will not start with the placeholder value.
Access: Open browser to http://localhost:1135
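For long-running deployments you will likely want the container detached with a restart policy; a sketch using the same placeholders as the quick start above:
# Run detached and restart automatically unless explicitly stopped
docker run -d --restart unless-stopped \
--name bodhiapp \
-p 1135:8080 \
-e BODHI_PUBLIC_HOST=0.0.0.0 \
-e BODHI_PUBLIC_PORT=1135 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v $(pwd)/docker-data:/data \
ghcr.io/bodhisearch/bodhiapp:latest-cpu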
CUDA Variant (NVIDIA GPU)
# Pull image
docker pull ghcr.io/bodhisearch/bodhiapp:latest-cuda
# Run container with GPU access
docker run --name bodhiapp-cuda \
-p 1135:8080 \
-e BODHI_PUBLIC_HOST=0.0.0.0 \
-e BODHI_PUBLIC_PORT=1135 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v $(pwd)/docker-data:/data \
--gpus all \
ghcr.io/bodhisearch/bodhiapp:latest-cuda
Important: Replace `your-strong-encryption-key-here` with your own strong encryption key. The container will not start with the placeholder value.
Note: The Docker images use base images from GPU vendors with the required runtime libraries included. Use the `--gpus all` flag to provide GPU access to the container.
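To confirm Docker itself can see the GPU before debugging Bodhi App, you can run nvidia-smi in a throwaway container (the nvidia/cuda base tag below is only an example; any CUDA base image works, and the NVIDIA Container Toolkit must be installed):
# Verify the NVIDIA runtime is wired up; this should print your GPU table
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi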
ROCm Variant (AMD GPU)
# Pull image
docker pull ghcr.io/bodhisearch/bodhiapp:latest-rocm
# Run container with GPU access
docker run --name bodhiapp-rocm \
-p 1135:8080 \
-e BODHI_PUBLIC_HOST=0.0.0.0 \
-e BODHI_PUBLIC_PORT=1135 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v $(pwd)/docker-data:/data \
--device=/dev/kfd \
--device=/dev/dri \
ghcr.io/bodhisearch/bodhiapp:latest-rocm
Important: Replace `your-strong-encryption-key-here` with your own strong encryption key. The container will not start with the placeholder value.
Note: For AMD GPU device mapping, refer to llama.cpp ROCm documentation for specific requirements.
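Before starting the container, you can check that the device nodes passed with --device actually exist on the host (a missing /dev/kfd usually means the amdgpu/ROCm driver is not loaded):
# Both paths must exist for the --device flags above to work
ls -l /dev/kfd /dev/dri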
Volume Configuration
Required Volumes
Data Volume (`/data`):
- Configuration files
- Database (users, tokens, requests)
- Application settings
- Logs
Models Volume (`/models`):
- Downloaded GGUF models
- Model cache
- Model aliases (stored as files)
- SQLite-based application state
Volume Size: Depends on the models you download. Allocate space based on your model requirements (typically 4-80GB per model), plus a small amount of space for the SQLite database and alias files.
Volume Examples
Named Volumes (Recommended):
# Create named volumes
docker volume create bodhi-data
docker volume create bodhi-models
# Use in run command
docker run -v bodhi-data:/data -v bodhi-models:/models ...
Bind Mounts (Direct host paths):
# Use host directories
docker run \
-v /path/to/data:/data \
-v /path/to/models:/models \
...
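To find where a named volume actually lives on the host (useful for backups and debugging), docker volume inspect reports the mount point:
# Show the host path backing the named volume
docker volume inspect bodhi-data --format '{{ .Mountpoint }}'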
Environment Variables
Essential Configuration
Common Environment Variables:
# Server Configuration
-e BODHI_PORT=1135 \ # Server port (default: 1135)
-e BODHI_HOST=0.0.0.0 \ # Server host (default: 0.0.0.0)
-e BODHI_ENCRYPTION_KEY=your-key \ # Required for data encryption
# RunPod Auto-Configuration
-e BODHI_ON_RUNPOD=true \ # Enables RunPod-specific auto-config
# Public Host (for cloud/network deployments)
-e BODHI_PUBLIC_SCHEME=https \
-e BODHI_PUBLIC_HOST=your-domain.com \
-e BODHI_PUBLIC_PORT=443 \
Note: `BODHI_ENCRYPTION_KEY` is required for securing stored data. For the complete environment variable reference, see the Configuration Guide (coming soon).
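Putting these together, a sketch of a run command for a deployment reachable at a public HTTPS domain (the domain, encryption key, and volume names are placeholders; all variables are those listed above):
docker run -d --name bodhiapp \
-p 1135:1135 \
-e BODHI_PORT=1135 \
-e BODHI_HOST=0.0.0.0 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-e BODHI_PUBLIC_SCHEME=https \
-e BODHI_PUBLIC_HOST=your-domain.com \
-e BODHI_PUBLIC_PORT=443 \
-v bodhi-data:/data \
-v bodhi-models:/models \
ghcr.io/bodhisearch/bodhiapp:latest-cpu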
Cloud Platform Deployment
Note: Railway-specific deployment is not yet supported. Use Docker deployment on your preferred cloud platform.
RunPod
RunPod Auto-Configuration:
Bodhi App supports automatic configuration for RunPod deployments. Set the `BODHI_ON_RUNPOD=true` environment variable to enable auto-configuration using RunPod-injected environment variables for the public host and other properties.
Steps:
- Create a new pod on RunPod
- Select the Docker variant: `ghcr.io/bodhisearch/bodhiapp:latest-cuda` for GPU pods, `ghcr.io/bodhisearch/bodhiapp:latest-cpu` for CPU pods
- Configure volumes (`/data` and `/models`)
- Set environment variables (including `BODHI_ON_RUNPOD=true`)
- Deploy
Generic Cloud Platform
Bodhi App works on any platform supporting Docker:
Requirements:
- Docker support
- Volume/storage support
- Network ingress
- Minimum Resources (API-only, lightweight workflow):
  - 2GB RAM
  - Single-core 2.4GHz CPU
- For Local Model Inference: resources depend on the model size and requirements
Configuration:
- Deploy Docker image
- Configure volumes (`/data` and `/models`)
- Set public host environment variables
- Configure the OAuth callback URL
- Open port 1135 (or your custom port)
Docker Compose
Note: Docker Compose deployment has not been tested. Use single-container deployment commands shown above.
Performance Optimization
GPU Configuration
Docker Factory Settings:
- Docker files include factory settings optimized for fastest single-request inference
- Settings can be overridden via environment variables or the settings dashboard
- All settings are llama.cpp pass-through parameters
CUDA Requirements:
- NVIDIA GPU with CUDA 11+ support
- Docker with NVIDIA GPU support (installation guide)
- Use the `--gpus all` flag for GPU access
ROCm: Check llama.cpp documentation for specific requirements.
Parallel Request Optimization
For handling parallel requests, override factory settings (optimized for single requests) using:
- Environment variables
- Settings dashboard configuration
- Refer to llama.cpp documentation for optimal settings for your hardware
Note: Performance benchmark data is not yet available.
Troubleshooting
Container Won't Start
Symptoms: Container exits immediately or won't start
Common Issues:
- Missing `BODHI_ENCRYPTION_KEY`: this environment variable is required for data encryption
- Port conflicts: the application runs on port 1135 by default; if the port is unavailable, it fails with an error message
Solutions:
- Check logs: `docker logs <container-name>`
- Verify `BODHI_ENCRYPTION_KEY` is set
- Check environment variables are correctly configured
- Verify volume permissions
- Search the codebase for error codes to understand specific issues
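A quick way to surface error lines from a failed start (the grep pattern is just a starting point; adjust to the error codes you are hunting):
# Pull error lines (and two lines of context) out of the container logs
docker logs bodhiapp 2>&1 | grep -i -A 2 "error"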
GPU Not Detected (CUDA)
Symptoms: Running but using CPU instead of GPU
Solutions:
- Verify the NVIDIA Docker runtime is installed
- Check the `--gpus all` flag is used in the docker run command
- Run `nvidia-smi` in the container to verify the GPU is visible
- Verify CUDA 11+ support
GPU Not Detected (ROCm)
Symptoms: AMD GPU not utilized
Solutions:
- Verify correct device mapping flags are used
- Check llama.cpp documentation for ROCm requirements
- Verify driver installation
Performance Slower Than Expected
Symptoms: Inference speed below expectations
Possible Causes & Solutions:
- Factory settings for single request: If running parallel requests, override settings via environment variables or settings dashboard
- Model inference powered by llama.cpp: Refer to llama.cpp documentation for optimal configuration
- Wrong hardware variant: Ensure using appropriate Docker variant for your hardware (CUDA for NVIDIA, ROCm for AMD, etc.)
OAuth Redirect Issues
Symptoms: OAuth callback fails in cloud deployment
Solutions:
- Verify BODHI_PUBLIC_HOST matches actual domain
- Set BODHI_PUBLIC_SCHEME to https (if using HTTPS)
- Set BODHI_PUBLIC_PORT correctly
- Update OAuth callback URL in provider
- See OAuth Configuration Guide
Upgrading
Upgrade Process
# Stop container
docker stop bodhiapp
# Pull new image
docker pull ghcr.io/bodhisearch/bodhiapp:latest-cpu
# Remove old container (keeps volumes)
docker rm bodhiapp
# Start new container with same config
docker run -d \
-p 1135:1135 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v bodhi-data:/data \
-v bodhi-models:/models \
--name bodhiapp \
ghcr.io/bodhisearch/bodhiapp:latest-cpu
Data Safety: Volumes are preserved. Database migrations run automatically on startup.
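Before pulling the new image, you may want to record the image currently backing the container so you can roll back if the upgrade misbehaves (standard docker inspect output):
# Note the current image ID, for rollback
docker inspect --format '{{ .Image }}' bodhiapp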
Backup and Restore
Backup Strategy: To back up your Bodhi App installation, you need to save:
- The BODHI_HOME folder (or the `/data` volume) - contains configuration, database, and application state
- The `BODHI_ENCRYPTION_KEY` environment variable - required for data decryption
Both must match during restore for the application to work correctly.
Backup Commands:
# Backup data volume
docker run --rm \
-v bodhi-data:/data \
-v $(pwd):/backup \
alpine tar czf /backup/bodhi-data-backup.tar.gz /data
# Backup models volume
docker run --rm \
-v bodhi-models:/models \
-v $(pwd):/backup \
alpine tar czf /backup/bodhi-models-backup.tar.gz /models
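It is worth verifying an archive is readable before relying on it:
# List the first entries of the backup to confirm it is a valid archive
tar tzf bodhi-data-backup.tar.gz | head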
Version Pinning
Latest Tag:
# Always latest version (not recommended for production)
docker pull ghcr.io/bodhisearch/bodhiapp:latest-cpu
Specific Version:
# Pin to specific version (recommended for production)
docker pull ghcr.io/bodhisearch/bodhiapp:v0.1.0-cpu
Note: Use the `gh` CLI to explore available version tags at the container registry.
Hardware Acceleration Details
CUDA Variant
Supported GPUs:
- NVIDIA GPUs with CUDA Compute Capability 5.0+ (Maxwell and newer)
- CUDA 11 and 12 support
Performance Gains:
- 8-12x speedup vs CPU for typical models
- Performance varies by model size
VRAM Requirements:
- Varies based on model size
- Refer to HuggingFace model page for specific llama.cpp hardware requirements
GPU Recommendations:
- Bodhi App recommends optimal models during setup based on open-source benchmarks
- No dynamic recommendations currently available
Configuration:
# Check GPU availability in container
docker exec -it bodhiapp-cuda nvidia-smi
# Monitor GPU usage
nvidia-smi -l 1
ROCm Variant
Supported GPUs:
- AMD Radeon RX and Instinct series
- Refer to llama.cpp documentation for ROCm support details
VRAM Requirements:
- Varies based on model size
- Refer to HuggingFace model page for specific llama.cpp hardware requirements
CPU Variant
Optimizations:
- Multi-threaded inference with configurable thread count
- AVX/AVX2/AVX512 instruction set support (when available)
- ARM NEON optimizations for ARM64
Performance Tips:
- For optimal settings, refer to llama.cpp documentation
- Run on dedicated hardware for best performance
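On shared hosts, Docker's standard CPU flags can approximate dedicated hardware by pinning the container to specific cores (the core counts below are illustrative; other flags as in the Quick Start):
# Pin the container to cores 0-7 to reduce scheduler interference
docker run --cpus 8 --cpuset-cpus 0-7 --name bodhiapp \
-p 1135:8080 \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v $(pwd)/docker-data:/data \
ghcr.io/bodhisearch/bodhiapp:latest-cpu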
Security Considerations
Container Security
Best Practices:
- Run containers with non-root user when possible
- Use read-only root filesystem where feasible
- Limit container capabilities
- Keep images updated with latest security patches
Network Security:
- Use reverse proxy (nginx, Traefik) for HTTPS termination
- Configure firewall rules appropriately
- Use Docker networks for service isolation
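For service isolation behind a reverse proxy, a user-defined Docker network keeps Bodhi App off the host's published ports entirely (network and container names are placeholders; the proxy joins the same network and publishes 80/443 instead):
# Create an isolated network and attach Bodhi App without publishing ports
docker network create bodhi-net
docker run -d --name bodhiapp --network bodhi-net \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
-v bodhi-data:/data \
ghcr.io/bodhisearch/bodhiapp:latest-cpu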
Secrets Management
OAuth Credentials:
- Use environment variables for sensitive configuration
- Consider Docker secrets or external secret management (Vault, AWS Secrets Manager)
- Never commit credentials to version control
API Keys:
- Store remote AI API keys as environment variables
- Rotate keys regularly
- Use separate keys for development and production
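One way to keep the encryption key and API keys out of shell history and process listings is Docker's --env-file flag (the filename is arbitrary; this is a sketch, not a substitute for a proper secret manager):
# Store secrets in a root-only file and pass them via --env-file
printf 'BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here\n' > bodhi.env
chmod 600 bodhi.env
docker run -d --name bodhiapp --env-file bodhi.env \
-p 1135:8080 -v bodhi-data:/data \
ghcr.io/bodhisearch/bodhiapp:latest-cpu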
Resource Planning
Storage Requirements
Data Volume:
- Initial: ~100MB (configuration and database)
- Growth: Minimal (user data, chat history in LocalStorage)
Models Volume:
- Small models (7B quantized): 4-8GB each
- Medium models (13B quantized): 8-16GB each
- Large models (70B quantized): 40-80GB each
Total Recommendations:
- Light usage (1-2 models): 50GB
- Medium usage (3-5 models): 150GB
- Heavy usage (10+ models): 500GB+
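To see how much space your volumes are actually consuming, Docker's built-in accounting is usually enough:
# Per-volume disk usage (-v breaks the report down by volume)
docker system df -v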
Memory and CPU Requirements
Minimum Requirements (API-only, lightweight workflow):
- 2GB RAM
- Single-core 2.4GHz CPU
For Local Model Inference:
- CPU and Memory: Requirements depend on model size
- VRAM: Varies by model size - refer to HuggingFace model page for specific llama.cpp requirements
- GPU Variants: CPU usage is reduced when GPU is active
Monitoring and Observability
Logs
Access Container Logs:
# View logs
docker logs bodhiapp
# Follow logs in real-time
docker logs -f bodhiapp
# Last 100 lines
docker logs --tail 100 bodhiapp
Log Levels:
- Configure via the `BODHI_LOG_LEVEL` environment variable
- For available log levels and configuration details, see the Configuration Guide (coming soon)
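For example, raising verbosity for troubleshooting is just another flag on your usual run command (the debug value shown is an assumption; check the Configuration Guide for accepted level names):
# Raise log verbosity for troubleshooting (add to your docker run command)
-e BODHI_LOG_LEVEL=debug \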
Health Checks
Manual Health Check:
# Check if server responds
curl http://localhost:1135/health
# Check from inside container
docker exec bodhiapp curl http://localhost:1135/health
Note: For health check endpoint details, see the OpenAPI documentation.
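You can also have Docker itself poll the endpoint so `docker ps` reports health status (the flags below are standard docker run options; this assumes curl is available inside the image, as the manual exec check above suggests):
# Let Docker poll the health endpoint every 30 seconds
docker run -d --name bodhiapp \
--health-cmd "curl -f http://localhost:1135/health || exit 1" \
--health-interval 30s --health-retries 3 \
-p 1135:8080 -v bodhi-data:/data \
-e BODHI_ENCRYPTION_KEY=your-strong-encryption-key-here \
ghcr.io/bodhisearch/bodhiapp:latest-cpu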
Performance Monitoring
Resource Usage:
# Container resource stats
docker stats bodhiapp
# Detailed inspection
docker inspect bodhiapp
GPU Monitoring (CUDA):
# Inside container
docker exec -it bodhiapp-cuda nvidia-smi
# Continuous monitoring
docker exec -it bodhiapp-cuda nvidia-smi -l 1
Multi-Container Deployments
Note: For multi-container deployments, load balancing, high availability configurations, and session persistence details, see Advanced Deployment Guide (coming soon).
Migration and Backup
Backup Strategy
Data Backup: Use the backup commands from the Backup and Restore section above to archive both the `/data` and `/models` volumes.
Restore:
# Restore data volume
docker run --rm \
-v bodhi-data:/data \
-v $(pwd):/backup \
alpine sh -c "cd / && tar xzf /backup/bodhi-data-backup.tar.gz"
# Restore models volume
docker run --rm \
-v bodhi-models:/models \
-v $(pwd):/backup \
alpine sh -c "cd / && tar xzf /backup/bodhi-models-backup.tar.gz"
Migration Between Hosts
Steps:
- Stop container on source host
- Backup volumes (see above)
- Transfer backup files to destination host
- Restore volumes on destination host
- Start container with same configuration
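A minimal transfer sketch using scp (the host name and destination path are placeholders):
# Copy both backup archives to the destination host
scp bodhi-data-backup.tar.gz bodhi-models-backup.tar.gz \
user@destination-host:/path/to/backups/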
Note: For database portability and migration details, see Configuration Guide (coming soon).
Related Documentation
- Installation Guide - Desktop and server installation
- Environment Variables - Complete configuration reference
- Authentication - OAuth2 setup
- Multi-Platform Installation - Desktop apps
- llama.cpp Documentation - GPU requirements and performance tuning