This guide explains how to build and publish Docker images to GitHub Container Registry (ghcr.io) for the Data Lake Receiver application.
- Overview
- Prerequisites
- Automated Deployment (GitHub Actions)
- Manual Deployment
- Pulling Published Images
- Configuration
- Troubleshooting
The Data Lake Receiver image can be published to GitHub Container Registry in two ways:
- GitHub Actions - Automated CI/CD on push/tag
- Manual Scripts - For local development and testing
Published images are available at:
ghcr.io/sparkworks/data-lake-receiver:latest
ghcr.io/sparkworks/data-lake-receiver:1.0-SNAPSHOT
ghcr.io/sparkworks/data-lake-receiver:v1.0.0
Prerequisites for automated deployment (GitHub Actions):
- Repository hosted on GitHub
- GitHub Actions enabled (enabled by default)
- No additional setup needed; the workflow uses the built-in GITHUB_TOKEN
Prerequisites for manual deployment:
- Docker installed and running
- GitHub Personal Access Token with the write:packages permission
- Maven (if building locally)
To create a Personal Access Token:
- Go to GitHub Settings > Developer settings > Personal access tokens > Tokens (classic)
- Click "Generate new token (classic)"
- Give it a descriptive name (e.g., "Docker Push Token")
- Select scopes:
  - ✅ write:packages - Upload packages to GitHub Package Registry
  - ✅ read:packages - Download packages from GitHub Package Registry
  - ✅ delete:packages - Delete packages (optional)
- Click "Generate token"
- Copy the token immediately (you won't see it again!)
- Save it securely
The GitHub Actions workflow (.github/workflows/docker-publish.yml) automatically:
- Builds the Docker image when you push code
- Tags images based on branch/tag
- Pushes to GitHub Container Registry
- Creates multi-architecture images (amd64 + arm64)
The workflow runs automatically on:
| Event | When | Image Tags Created |
|---|---|---|
| Push to main/master | Merge to default branch | latest, main |
| Push to develop | Push to develop branch | develop |
| Git tag | Create tag v1.0.0 | v1.0.0, 1.0, 1, latest |
| Pull Request | Create/update PR | pr-123 |
| Manual | Workflow dispatch | Current branch |
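The mapping in the table above can be sketched as a small shell function. This is only an illustration of the tagging rules as the table states them; `tags_for_ref` is a hypothetical helper, not part of the repository, and the real workflow may compute tags differently.

```shell
# Hypothetical helper: prints the image tags the trigger table describes
# for a given Git ref. Illustration only, not the workflow's own logic.
tags_for_ref() {
  ref=$1
  case "$ref" in
    refs/tags/v[0-9]*.[0-9]*.[0-9]*)
      version=${ref#refs/tags/v}      # e.g. 1.2.3
      major=${version%%.*}            # e.g. 1
      rest=${version#*.}              # e.g. 2.3
      minor=${rest%%.*}               # e.g. 2
      echo "v$version $major.$minor $major latest"
      ;;
    refs/heads/main|refs/heads/master)
      echo "latest ${ref#refs/heads/}"
      ;;
    refs/heads/*)
      # Assumption: slashes in branch names are replaced, since "/" is not valid in a tag
      printf '%s\n' "${ref#refs/heads/}" | tr '/' '-'
      ;;
    refs/pull/*/merge)
      number=${ref#refs/pull/}
      echo "pr-${number%/merge}"
      ;;
    *)
      return 1                        # unknown ref type: no tags
      ;;
  esac
}
```

For example, `tags_for_ref refs/tags/v1.2.3` prints the same four tags as the "Git tag" row.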
# Create and push a version tag
git tag v1.2.3
git push origin v1.2.3
# GitHub Actions will automatically build and publish:
# - ghcr.io/sparkworks/data-lake-receiver:v1.2.3
# - ghcr.io/sparkworks/data-lake-receiver:1.2
# - ghcr.io/sparkworks/data-lake-receiver:1
# - ghcr.io/sparkworks/data-lake-receiver:latest
# Push to develop branch
git push origin develop
# GitHub Actions will automatically build and publish:
# - ghcr.io/sparkworks/data-lake-receiver:develop
To trigger the workflow manually:
- Go to repository on GitHub
- Click Actions tab
- Select Build and Push Docker Image workflow
- Click Run workflow
- Select branch
- Click Run workflow button
To monitor workflow runs:
- Go to repository on GitHub
- Click Actions tab
- See workflow runs and logs
The workflow is configured in .github/workflows/docker-publish.yml:
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }} # sparkworks/data-lake-receiver
To customize:
- Change organization: Fork repository or change IMAGE_NAME
- Add more triggers: Edit the on: section
- Modify platforms: Edit platforms: (amd64, arm64, arm/v7, etc.)
On Linux or macOS, use the push-to-github.sh script.
# Make script executable
chmod +x push-to-github.sh
# Set GitHub token
export GITHUB_TOKEN=ghp_xxxxxxxxxxxx
# Basic usage (auto-detects version from pom.xml)
./push-to-github.sh
# Specify version and user/org
./push-to-github.sh --user sparkworks --version 1.0.0
# Use Spring Boot buildpacks instead of Dockerfile
./push-to-github.sh --build-method buildpacks
# Pass token directly
./push-to-github.sh --token ghp_xxxxxxxxxxxx --version 1.2.3
Options:
-t, --token TOKEN       GitHub Personal Access Token
-u, --user USERNAME     GitHub username or organization (default: sparkworks)
-v, --version VERSION   Version tag (default: auto-detected from pom.xml)
-b, --build-method      Build method: 'dockerfile' or 'buildpacks' (default: dockerfile)
-h, --help              Show help message
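How the script detects the version is not shown here. As a rough sketch of one way it could be done, the hypothetical `pom_version` function below reads the first `<version>` element that sits outside any `<parent>` block. It assumes a conventionally laid out pom.xml with the project version declared before the dependencies.

```shell
# Hypothetical helper: print the project version from a pom.xml.
# Assumption: the first <version> outside <parent>...</parent> is the project version.
pom_version() {
  awk '
    /<parent>/   { in_parent = 1 }          # entering the parent block
    /<\/parent>/ { in_parent = 0; next }    # leaving the parent block
    !in_parent && /<version>/ {
      sub(/.*<version>/, "")                # strip everything up to the opening tag
      sub(/<\/version>.*/, "")              # strip the closing tag and the rest
      print
      exit
    }
  ' "${1:-pom.xml}"
}
```

If Maven is installed, `mvn help:evaluate -Dexpression=project.version -q -DforceStdout` is likely a more robust alternative, since it resolves inherited and property-based versions.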
On Windows, use the push-to-github.bat script.
REM Set GitHub token
set GITHUB_TOKEN=ghp_xxxxxxxxxxxx
REM Basic usage
push-to-github.bat
REM With arguments: [TOKEN] [USER] [VERSION]
push-to-github.bat ghp_xxxxxxxxxxxx sparkworks 1.0.0
REM Using environment variable
set GITHUB_TOKEN=ghp_xxxxxxxxxxxx
push-to-github.bat
If the repository is public, anyone can pull images:
# Pull latest version
docker pull ghcr.io/sparkworks/data-lake-receiver:latest
# Pull specific version
docker pull ghcr.io/sparkworks/data-lake-receiver:1.0.0
# Run the image
docker run -p 4000:4000 ghcr.io/sparkworks/data-lake-receiver:latest
For private repositories, authenticate first:
# Login to GitHub Container Registry
echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin
# Pull the image
docker pull ghcr.io/sparkworks/data-lake-receiver:latest
# Logout (optional)
docker logout ghcr.io
Update your docker-compose.yml:
services:
  data-lake-receiver:
    image: ghcr.io/sparkworks/data-lake-receiver:latest
    ports:
      - "4000:4000"
    environment:
      - STORAGE_TYPE=FILESYSTEM
For private images, login first:
echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin
docker-compose up -d
By default, packages are private. To make them public:
- Go to the package page: https://github.com/users/sparkworks/packages/container/data-lake-receiver
- Click Package settings
- Scroll to Danger Zone
- Click Change visibility
- Select Public
- Confirm
To allow teams or users to access private images:
- Go to package settings (link above)
- Click Manage Actions access
- Add repositories, teams, or users
- Set permissions (read, write, admin)
To automatically delete old images:
- Go to package settings
- Under Package retention, configure retention rules:
  - Keep latest N versions
  - Delete images older than X days
  - Keep tagged versions
Example:
Delete untagged versions after 7 days
Keep 5 most recent tagged versions
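To make the "keep the N most recent" rule concrete, here is a minimal sketch of the selection step only. `versions_to_prune` is a hypothetical helper; it assumes the version list is supplied newest first, and it deletes nothing itself.

```shell
# Hypothetical helper: reads version tags on stdin, newest first,
# and prints the ones that fall outside the N most recent (default 5).
versions_to_prune() {
  keep=${1:-5}
  tail -n "+$((keep + 1))"   # skip the first $keep lines, print the rest
}
```

With seven versions on stdin and `versions_to_prune 5`, only the two oldest are printed.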
Error:
Error response from daemon: Get "https://ghcr.io/v2/": unauthorized
Solution:
- Check token has write:packages permission
- Verify token is not expired
- Re-login:
  docker logout ghcr.io
  echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin
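One way to check the token's scopes from the command line is to inspect the X-OAuth-Scopes response header that the GitHub API returns for classic personal access tokens. The sketch below only parses headers passed on stdin; `has_scope` is a hypothetical helper, and fine-grained tokens may not report this header at all.

```shell
# Hypothetical helper: succeeds if the HTTP headers on stdin list the
# given scope in X-OAuth-Scopes (as returned for classic tokens).
has_scope() {
  wanted=$1
  tr -d '\r' |                        # drop carriage returns from HTTP line endings
    grep -i '^x-oauth-scopes:' |      # keep only the scopes header
    cut -d: -f2- |                    # drop the header name
    tr ',' '\n' |                     # one scope per line
    sed 's/^ *//; s/ *$//' |          # trim surrounding spaces
    grep -qxF "$wanted"               # exact match on the requested scope
}
```

A possible invocation, assuming curl is available: `curl -sI -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/user | has_scope write:packages && echo "scope present"`.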
Error:
Error: ghcr.io/sparkworks/data-lake-receiver:latest: not found
Solution:
- Check package exists: https://github.com/sparkworks?tab=packages
- If private, ensure you're authenticated
- Check image name matches repository name
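A common cause of name mismatches is letter case: Docker requires repository names to be lowercase, while GitHub owner and repository names may contain capitals. The hypothetical `image_ref` helper below sketches how a reference could be normalized before pulling or pushing; the tag is left untouched because tags may be mixed case.

```shell
# Hypothetical helper: build a ghcr.io image reference with a lowercase
# owner/name path. The tag defaults to "latest" and is not modified.
image_ref() {
  owner=$1
  name=$2
  tag=${3:-latest}
  path=$(printf '%s/%s' "$owner" "$name" | tr '[:upper:]' '[:lower:]')
  printf 'ghcr.io/%s:%s\n' "$path" "$tag"
}
```

For example, `docker pull "$(image_ref SparkWorks Data-Lake-Receiver)"` would request ghcr.io/sparkworks/data-lake-receiver:latest.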
Error:
denied: permission_denied: write_package
Solution:
- Ensure you're using the correct GitHub username
- For organization packages, you need organization member access
- Check repository permissions
- Regenerate token with correct permissions
Check workflow logs:
- Go to repository Actions tab
- Click the failed workflow run
- Expand failed step
- Check error messages
Common issues:
- Maven build fails → Check pom.xml and dependencies
- Docker build fails → Check Dockerfile syntax
- Permission denied → Check repository settings and token
Solutions for large image sizes:
- Use multi-stage builds (already implemented in Dockerfile)
- Minimize layers in Dockerfile
- Use .dockerignore to exclude files
- Clean up in same RUN command:
  RUN apt-get update && \
      apt-get install -y package && \
      apt-get clean && \
      rm -rf /var/lib/apt/lists/*
If building for multiple platforms fails:
# Setup buildx
docker buildx create --use
# Build for specific platform
docker buildx build --platform linux/amd64 -t image:tag .
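When narrowing a failing multi-platform build down to a single platform, it can help to build for the host's own architecture first. The sketch below maps common `uname -m` values to Docker platform strings; `platform_for_arch` is a hypothetical helper and covers only the three platforms this guide mentions.

```shell
# Hypothetical helper: map a machine architecture (default: this host's)
# to a Docker platform string. Returns non-zero for anything unrecognized.
platform_for_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)  echo linux/amd64 ;;
    aarch64|arm64) echo linux/arm64 ;;
    armv7l)        echo linux/arm/v7 ;;
    *)             return 1 ;;
  esac
}
```

It could then be used as `docker buildx build --platform "$(platform_for_arch)" -t image:tag .`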
Best practices:
- Use Semantic Versioning
  - Tags: v1.0.0, v1.1.0, v2.0.0
  - GitHub Actions will create major/minor tags automatically
- Always Tag Releases
  git tag -a v1.0.0 -m "Release version 1.0.0"
  git push origin v1.0.0
- Keep Images Small
  - Use Alpine base images
  - Multi-stage builds
  - Minimize layers
- Secure Tokens
  - Never commit tokens to repository
  - Use environment variables
  - Rotate tokens periodically
- Test Before Release
  - Test develop/feature branches first
  - Only tag main/master when stable
- Document Changes
  - Update version in pom.xml
  - Add release notes
  - Update CHANGELOG
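The semantic versioning convention above could be checked before a tag is pushed. The following is a minimal sketch; `is_release_tag` is a hypothetical helper that accepts only plain vMAJOR.MINOR.PATCH tags and rejects pre-release suffixes and leading zeros.

```shell
# Hypothetical helper: succeeds only for tags of the form vMAJOR.MINOR.PATCH
# with no leading zeros and no pre-release or build suffix.
is_release_tag() {
  printf '%s\n' "$1" |
    grep -Eq '^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}
```

For example: `is_release_tag v1.0.0 && git tag -a v1.0.0 -m "Release version 1.0.0"`.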
# 1. Create feature branch
git checkout -b feature/new-feature
# 2. Make changes and commit
git add .
git commit -m "Add new feature"
# 3. Push to GitHub
git push origin feature/new-feature
# 4. Create Pull Request
# GitHub Actions builds PR image: ghcr.io/sparkworks/data-lake-receiver:pr-123
# 5. Merge to develop
# GitHub Actions builds: ghcr.io/sparkworks/data-lake-receiver:develop
# 6. Test develop branch image
docker pull ghcr.io/sparkworks/data-lake-receiver:develop
# Test thoroughly...
# 7. Merge to main
# GitHub Actions builds: ghcr.io/sparkworks/data-lake-receiver:main
# 8. Create release tag
git tag v1.0.0
git push origin v1.0.0
# GitHub Actions builds:
# - ghcr.io/sparkworks/data-lake-receiver:v1.0.0
# - ghcr.io/sparkworks/data-lake-receiver:1.0
# - ghcr.io/sparkworks/data-lake-receiver:1
# - ghcr.io/sparkworks/data-lake-receiver:latest
Additional resources:
- GitHub Container Registry Documentation
- GitHub Actions Documentation
- Docker Build Push Action
- Docker Metadata Action
For issues with:
- GitHub Actions: Check repository Actions tab
- Manual scripts: Run with --help flag
- Authentication: Verify token permissions
- Package visibility: Check package settings on GitHub