Case Study: Optimizing Docker Build Pipelines for Microservices

FinanceHub: A Microservices Journey

FinanceHub (name changed for privacy) is a leading financial technology company serving over 5 million users across North America. Its platform consists of 70+ microservices handling everything from payment processing to investment analytics.

Key results at a glance:

  • 70% build time reduction
  • 45% image size reduction
  • $156K annual CI cost savings

As organizations scale their microservices architectures, Docker build pipelines often become bottlenecks in the development process. This case study explores how FinanceHub transformed its Docker build process to dramatically improve developer productivity and reduce infrastructure costs.

The Initial Situation

FinanceHub's engineering team had grown to over 200 developers across 15 teams, each responsible for multiple microservices. Their architecture included services built with various technologies:

  • 35 Node.js services
  • 20 Java/Spring Boot services
  • 8 Python services
  • 7 Go services
  • Various other specialized services

Each service had its own repository and CI/CD pipeline. The team was deploying to production approximately 50 times per day across all services, with each deployment requiring a Docker build and push to their container registry.

January 2024

The Breaking Point

The engineering leadership realized they had a problem when their CI/CD costs reached $28,000 per month, with Docker builds accounting for nearly 60% of build minutes. Many developers complained about waiting 15-20 minutes for CI/CD pipelines to complete for even minor changes.

February 2024

Assessment and Planning

A dedicated team of 3 DevOps engineers and 2 senior developers conducted a thorough assessment of their Docker build processes. They discovered numerous inefficiencies, including unoptimized Dockerfiles, lack of caching, and oversized base images.

March-April 2024

Implementation Phase

The team implemented a series of Docker build optimizations across all services, starting with the most frequently updated ones. They created standardized Dockerfile templates for each technology stack and updated CI/CD pipelines to leverage BuildKit and remote caching.

May 2024

Results and Ongoing Improvements

By May, all 70+ services had been migrated to the optimized Docker build process. The team continued to refine their approach, implementing automation to ensure all new services followed best practices.

Key Challenges and Solutions

Challenge #1: Slow Build Times

Most services had build times of 8-15 minutes in CI, with an average of 12 minutes. This resulted in developers waiting for feedback and delayed deployments, especially for urgent fixes.

Analysis revealed common issues:

  • Poor layer ordering causing unnecessary rebuilds
  • No caching between builds in CI/CD
  • Full rebuilds for minor code changes
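The layer-ordering problem is easiest to see in a minimal before/after sketch (illustrative, not one of FinanceHub's actual Dockerfiles):

```dockerfile
# Before: COPY . . comes first, so ANY source change invalidates the
# dependency-install layer and forces a full npm ci on every build
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm ci

# After: dependency manifests are copied alone, so the npm ci layer is
# reused from cache until package*.json actually changes
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
```

With the second ordering, a typical source-only change rebuilds nothing above the final COPY layer.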

Solution: Optimized Dockerfile Templates

The team created standardized Dockerfile templates for each technology stack with:

  • Optimized layer ordering for better cache utilization
  • Multi-stage builds to separate build and runtime dependencies
  • BuildKit cache mounts for package manager caches
  • Remote cache storage and retrieval in CI/CD

Example for Node.js services:

# syntax=docker/dockerfile:1.4
# Stage 1: install all dependencies (npm cache persisted via BuildKit mount)
FROM node:18-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# Stage 2: build the application
FROM node:18-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN --mount=type=cache,target=/root/.npm \
    npm run build

# Stage 3: minimal runtime image with production-only dependencies
FROM node:18-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
COPY --from=builder /app/dist ./dist
USER node
EXPOSE 3000
CMD ["node", "dist/main.js"]
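In CI, templates like this are combined with BuildKit's registry cache. A hedged sketch of the equivalent local command, where registry.example.com and my-service are placeholders:

```shell
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/my-service:cache \
  --cache-to type=registry,ref=registry.example.com/my-service:cache,mode=max \
  --tag registry.example.com/my-service:latest \
  --push .
```

The mode=max option exports cache metadata for all build stages, not just the final image layers, so intermediate stages can also be reused across builds.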

Challenge #2: Oversized Images

Container images were unnecessarily large, causing:

  • Slower deployments due to large image pulls
  • Higher storage costs in container registries
  • Increased attack surface with unneeded tools
  • Wasted resources in production

The average Node.js service image was 1.2GB, and Java services averaged 850MB.

Solution: Image Size Optimization

The team implemented several image size reduction techniques:

  • Strict multi-stage builds with minimal final images
  • Alpine-based images where appropriate
  • Distroless images for Java services
  • Production-only dependencies in final stage
  • Removal of development tools and documentation

Example for Java services:

# syntax=docker/dockerfile:1.4
# Stage 1: build the Spring Boot jar (Gradle cache persisted via BuildKit mount)
FROM eclipse-temurin:17-jdk-alpine AS builder
WORKDIR /app
COPY gradle/ gradle/
COPY gradlew build.gradle settings.gradle ./
RUN --mount=type=cache,target=/root/.gradle \
    ./gradlew --no-daemon dependencies

COPY src/ src/
RUN --mount=type=cache,target=/root/.gradle \
    ./gradlew --no-daemon bootJar

# Stage 2: distroless runtime containing only the JRE and the application jar
FROM gcr.io/distroless/java17-debian11
WORKDIR /app
COPY --from=builder /app/build/libs/*.jar app.jar
EXPOSE 8080
USER nonroot
ENTRYPOINT ["java", "-jar", "app.jar"]

Challenge #3: CI/CD Inefficiencies

CI/CD pipelines were not optimized for Docker builds:

  • No reuse of layer cache between pipeline runs
  • Separate build and push steps causing duplication
  • BuildKit features not enabled in CI
  • High CI minutes consumption (approximately 500,000 minutes/month)

Solution: CI/CD Pipeline Optimization

The team redesigned their CI/CD pipeline approach:

  • Implemented BuildKit's remote caching in all pipelines
  • Created shared base images for common dependencies
  • Added distributed caching to store and retrieve layers
  • Combined build and push steps to reduce overhead

Example GitHub Actions workflow excerpt:

steps:
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v2

  - name: Login to Container Registry
    uses: docker/login-action@v2
    with:
      registry: ${{ env.REGISTRY }}
      username: ${{ secrets.REGISTRY_USERNAME }}
      password: ${{ secrets.REGISTRY_PASSWORD }}

  - name: Build and push
    uses: docker/build-push-action@v4
    with:
      context: .
      push: true
      tags: ${{ env.IMAGE_NAME }}:${{ env.TAG }}
      cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
      cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max

Results and Impact

The optimization efforts led to dramatic improvements across all key metrics:

Metric                      Before        After          Improvement
Avg. Build Time (CI)        12 minutes    3.6 minutes    70%
Avg. Node.js Image Size     1.2 GB        154 MB         87%
Avg. Java Image Size        850 MB        180 MB         79%
CI Compute Minutes/Month    500,000       150,000        70%
Deployment Time             4.5 minutes   1.8 minutes    60%
Monthly CI Cost             $28,000       $9,000         68%

These improvements had a profound effect on the development process:

  • Developer Productivity: Faster feedback cycles led to more iterations and fewer context switches
  • Incident Response: Time to deploy critical fixes reduced by 65%
  • Infrastructure Costs: Annual savings of approximately $156,000 in CI costs alone
  • Security Posture: Smaller attack surface with minimal production images
  • Deployment Reliability: Faster, more reliable deployments with fewer timeout issues

[Figure: build pipeline before optimization]

The original pipeline had long build times with many redundant steps and no layer caching between runs.

[Figure: build pipeline after optimization]

The optimized pipeline leverages BuildKit caching, parallel builds, and shared base images for dramatic speed improvements.

Key Learnings and Best Practices

The FinanceHub team identified several key learnings that can be applied to other microservices environments:

Standardize Across Teams

Creating standardized Dockerfile templates for each technology stack ensured consistency and simplified maintenance. The team developed a central repository of Dockerfile templates that teams could easily adapt to their specific services.

This standardization made it easier to implement improvements across all services and onboard new services with optimized builds from day one.

Invest in Shared Base Images

The team created a set of custom base images for each major technology stack that included common dependencies and security configurations. These images were rebuilt weekly with the latest security patches.

This approach reduced duplication, improved security compliance, and further decreased build times by providing optimized starting points for service builds.
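The case study does not include FinanceHub's base images, but a shared Node.js base image along these lines is a reasonable sketch (the package choices and the dumb-init entrypoint are assumptions):

```dockerfile
# Hypothetical shared base image, rebuilt weekly with the latest patches
FROM node:18-alpine
# Common OS-level dependencies most services need
RUN apk add --no-cache ca-certificates tzdata dumb-init
# Organization-wide default: run as the unprivileged node user
USER node
WORKDIR /app
# dumb-init forwards signals correctly to the Node.js process
ENTRYPOINT ["dumb-init", "--"]
```

Service Dockerfiles then start FROM this image instead of the upstream node image, inheriting the hardening and shared layers for free.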

Measure Everything

The team implemented comprehensive metrics collection for their Docker build process:

  • Build times for each pipeline stage
  • Image sizes and layer counts
  • Cache hit ratios in CI/CD
  • CI minutes consumed per service

These metrics allowed them to identify bottlenecks, prioritize optimizations, and quantify improvements.
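As a hedged illustration, a metric such as image size can be captured with one extra CI step (the step below is a sketch, not FinanceHub's actual pipeline):

```yaml
# Illustrative GitHub Actions step: record the built image's size
- name: Record image metrics
  run: |
    SIZE_BYTES=$(docker image inspect "$IMAGE_NAME:$TAG" --format '{{.Size}}')
    echo "Image size: $((SIZE_BYTES / 1024 / 1024)) MB"
    echo "image_size_bytes=$SIZE_BYTES" >> "$GITHUB_OUTPUT"
```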

Pipeline Architecture Matters

The team discovered that the design of CI/CD pipelines significantly impacted build performance. They restructured their pipelines to:

  • Run tests in parallel with Docker builds where possible
  • Use ephemeral environments for integration tests
  • Implement smart skipping of stages when appropriate
  • Optimize for cold starts with distributed caching

These pipeline architecture improvements complemented the Dockerfile optimizations for maximum effect.
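The first of these points, running tests in parallel with the Docker build, can be sketched as two independent GitHub Actions jobs gated by a third (job names and steps are illustrative):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: npm ci && npm test
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - uses: docker/build-push-action@v4
        with:
          context: .
          push: false
  deploy:
    # runs only after both parallel jobs succeed
    needs: [test, build]
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy here"
```

Because test and build have no dependency on each other, they start simultaneously, so the pipeline's critical path is the longer of the two rather than their sum.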

Lessons for Other Organizations

Based on FinanceHub's experience, here are key recommendations for organizations looking to optimize their Docker build pipelines for microservices:

  1. Start with Measurement: Collect baseline metrics before making changes to quantify improvements
  2. Prioritize High-Impact Services: Begin with the most frequently built services or those with the longest build times
  3. Standardize for Scale: Create templates and standards that can be applied consistently across all services
  4. Educate Teams: Ensure all developers understand Docker build best practices through workshops and documentation
  5. Automate Compliance: Implement CI checks to ensure Dockerfile best practices are followed
  6. Consider Total Costs: Factor in both infrastructure costs and developer time when evaluating optimizations
  7. Iterate and Improve: Continuously monitor metrics and refine your approach based on real-world results
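One common way to automate such compliance checks is a Dockerfile linter in CI; a hedged sketch using hadolint (the action reference and version are assumptions to verify against current documentation):

```yaml
# Illustrative compliance gate: fail the build on Dockerfile lint errors
- name: Lint Dockerfile
  uses: hadolint/hadolint-action@v3.1.0
  with:
    dockerfile: Dockerfile
```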

"Our Docker build optimization project started as a cost-saving initiative, but quickly became a major productivity win. Developers who previously had to wait 15 minutes for feedback now get it in under 4 minutes. That translates to more iterations, better code quality, and happier engineers."

— VP of Engineering, FinanceHub

Conclusion

FinanceHub's journey to optimize their Docker build pipelines demonstrates that with careful analysis and implementation of best practices, organizations can achieve dramatic improvements in build performance, image size, and costs.

The key to their success was a holistic approach that addressed:

  • Dockerfile structure and optimization
  • CI/CD pipeline architecture
  • Caching strategies at multiple levels
  • Standardization across teams and services
  • Continuous measurement and improvement

For organizations with growing microservices architectures, investing in Docker build optimization can yield significant returns in terms of both direct costs and developer productivity. The principles and techniques demonstrated in this case study can be adapted to microservices environments of any size.
