
Build a Docker image like Heroku with Buildpacks.io

Dusty Candland | docker, buildpacks.io, ruby, heroku, aws

Buildpacks.io brings buildpacks, like the ones Heroku uses, to anyone. Buildpacks are a better way to build Docker images, partly because of caching layers, but also because they help create reusable build processes. For a better intro, see Turn Your Code into Docker Images with Cloud Native Buildpacks.

The docs are good and I recommend going through them before jumping in here.

The existing buildpacks didn't work for my needs, but it's easy to build your own. It's more work to make one available to everyone, but I think eventually there will be prebuilt packs that "just work". Currently, there are some edge cases they don't cover for me.

If you find a pack that works for you, you only need a project.toml file.

[project]
id = "com.myapp"
name = "MyApp"
version = "1.0.0"

[build]
exclude = [
    ".bundle",
    "log",
    "tmp",
    "storage",
    "public/assets",
    ".byebug_history",
    "public/packs",
    "public/packs-test",
    "node_modules",
    "yarn-error.log",
    ".DS_Store",
    "config/master.key",
    "config/credentials/production.key",
    "config/credentials/staging.key",
    ".env",
    ".envrc",
    ".sprockets-manifest-*.json",
    ".git",
    ".gitignore",
    ".ebextensions",
    ".elasticbeanstalk",
    "dist"
]

The buildpack

There are more details to this process than I'm using, but this has been working well. You can use any executable for these phases; I went the simple route with Bash scripts.

There is a detect phase and a build phase. I didn't use the features of the detect phase much, and this is where you'd want to put in more time for a published pack. The detect phase can determine what's being built and interact with other buildpacks.

Here's the file structure for my custom buildpack.

dist/ruby-buildpack
├── bin
│  ├── build
│  └── detect
└── buildpack.toml

To start we need a buildpack.toml file.

# Buildpack API version
api = "0.2"

# Buildpack ID and metadata
[buildpack]
id = "net.candland.buildpacks.ruby"
version = "0.0.1"
name = "Ruby Buildpack"

# Stacks that the buildpack will work with
[[stacks]]
id = "io.buildpacks.stacks.bionic"

Next we need a detect file. This one only makes sure there is a Gemfile. You can do more interesting things here and pass values to the build phase; a sketch of that follows the script.

#!/usr/bin/env bash
set -eo pipefail

if [[ ! -f Gemfile ]]; then
   exit 100
fi
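As a sketch of passing values along (not part of the buildpack above): under Buildpack API 0.2, detect receives the build plan path as its second argument, and provides/requires entries written there are handed to the build phase. The .ruby-version lookup and the metadata layout here are assumptions for illustration.

#!/usr/bin/env bash
set -eo pipefail

# Buildpack API 0.2: $1 is the platform dir, $2 is the build plan path
planfile=$2

if [[ ! -f Gemfile ]]; then
   exit 100
fi

# Hypothetical: read the desired Ruby version from .ruby-version, falling back to 2.7.1
ruby_version="2.7.1"
if [[ -f .ruby-version ]]; then
  ruby_version=$(tr -d '[:space:]' < .ruby-version)
fi

# Declare that this buildpack provides and requires ruby, passing the version as metadata
cat >> "$planfile" <<EOL
[[provides]]
name = "ruby"

[[requires]]
name = "ruby"

[requires.metadata]
version = "$ruby_version"
EOL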

The real work is in the build file. This has multiple caching layers so building an image doesn't require a fresh start each time. For example, gems and node_modules are cached between builds saving a lot of time.

This is a long file and could (probably should) be split into different buildpacks. However, each section follows a similar pattern of setting up a caching layer, building into that layer, and setting some configuration for that layer.

Here's the big file and we'll look at some details after.

#!/usr/bin/env bash
set -eo pipefail

echo "---> Ruby Buildpack"

# GET ARGS
layersdir=$1

# DOWNLOAD RUBY
echo ""
echo "---> Ruby Install"
rubylayer="$layersdir"/ruby
mkdir -p "$rubylayer"

ruby_url=https://s3-external-1.amazonaws.com/heroku-buildpack-ruby/heroku-18/ruby-2.7.1.tgz
if [[ ! -f "$rubylayer"/bin/ruby ]] ; then
  echo "     Downloading and extracting Ruby"
  wget -q -O - "$ruby_url" | tar -xzf - -C "$rubylayer"
  echo -e 'cache = true\nlaunch = true' > "$rubylayer.toml"
fi

# MAKE RUBY AVAILABLE TO THIS SCRIPT
RUBY_PATH="$rubylayer/bin"
export PATH="$RUBY_PATH:$PATH"
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}"$rubylayer/lib"

echo "     ruby version $(ruby -v)"

# INSTALL NODE & YARN
echo ""
echo "---> NodeJS"
nodelayer="$layersdir/node"
mkdir -p "$nodelayer"
echo -e 'cache = true\nlaunch = true' > "$nodelayer.toml"

node_url=https://nodejs.org/download/release/v12.20.0/node-v12.20.0-linux-x64.tar.gz
if [[ ! -f "$nodelayer"/node/bin/node ]] ; then
  echo "     Downloading and extracting NodeJS"
  wget -q -O - "$node_url" | tar -xzf - -C "$nodelayer"
  mv "$nodelayer"/node-v12.20.0-linux-x64 "$nodelayer"/node
fi

NODE_PATH="$nodelayer/node/bin"
export PATH="$NODE_PATH:$PATH"

echo "     node version $(node -v)"

# YARN
echo ""
echo "---> Yarn"

yarn_url=https://github.com/yarnpkg/yarn/releases/download/v1.22.10/yarn-v1.22.10.tar.gz
if [[ ! -f "$nodelayer"/yarn/bin/yarn ]] ; then
  echo "     Downloading and extracting Yarn"
  wget -q -O - "$yarn_url" | tar -xzf - -C "$nodelayer"
  mv "$nodelayer"/yarn-v1.22.10 "$nodelayer"/yarn
fi

YARN_PATH="$nodelayer/yarn/bin"
export PATH="$YARN_PATH:$PATH"

mkdir -p "$nodelayer/profile.d/"
echo "     Prepending $NODE_PATH:$YARN_PATH to PATH"
echo "export PATH=$NODE_PATH:$YARN_PATH:\$PATH" > "$nodelayer/profile.d/01_path.sh"

echo "     yarn version $(yarn -v)"

#  Modules layer
echo ""
echo "---> Node modules layer"

moduleslayer="$layersdir/modules"
mkdir -p "$moduleslayer"

local_modules="$(pwd)/node_modules"

rm -rf "$local_modules"
mkdir -p "$moduleslayer/node_modules"
ln -s "$moduleslayer/node_modules" "$local_modules"

echo -e 'cache = true\nlaunch = true' > "$moduleslayer.toml"

NODE_MODULES_PATH="$moduleslayer/node_modules/.bin"
export PATH="$NODE_MODULES_PATH:$PATH"

mkdir -p "$moduleslayer/profile.d/"
echo "     Prepending $NODE_MODULES_PATH to PATH"
echo "export PATH=$NODE_MODULES_PATH:\$PATH" > "$moduleslayer/profile.d/01_path.sh"


# INSTALL GEMS
echo ""
echo "---> Bundle"

bundlerlayer="$layersdir/bundler"
mkdir -p "$bundlerlayer"

local_bundler_checksum=$(sha256sum Gemfile.lock | cut -d ' ' -f 1)
remote_bundler_checksum=$(cat "$bundlerlayer.toml" | yj -t | jq -r .metadata 2>/dev/null || echo 'not found')

bundle config --local path "$bundlerlayer" >/dev/null
bundle config --local bin "$bundlerlayer/bin" >/dev/null
bundle config --local without development:test >/dev/null

if [[ ! -z "$SKIP_GEM_CACHE" ]] ; then
  echo "     SKIPPING GEM CACHE"
fi

if [[ -f Gemfile.lock && $local_bundler_checksum == $remote_bundler_checksum && -z "$SKIP_GEM_CACHE" ]] ; then
  # Determine if no gem dependencies have changed, so it can reuse existing gems without running bundle install
  echo "---> Reusing gems"
else
  # Determine if there has been a gem dependency change and install new gems to the bundler layer; re-using existing and un-changed gems
  echo "---> Installing gems"
  echo -e "cache = true\nlaunch = true\nmetadata = \"$local_bundler_checksum\"" > "$bundlerlayer.toml"
  bundle install
fi

bundle binstubs --all --path="$bundlerlayer/bin"

# ASSETS
echo ""
echo "---> compile assets"

bundle exec rails assets:precompile

echo ""
echo "---> PATH"
echo "     $PATH"

echo ""
echo "---> launch commands"
echo "     Writing to: $layersdir/launch.toml"

# SET DEFAULT START COMMAND
cat > "$layersdir/launch.toml" <<EOL
[[processes]]
type = "web"
command = 'rails db:migrate && rails server -b 0.0.0.0 -p \${PORT:-5000} -e \$RAILS_ENV'

# our worker process
[[processes]]
type = "sidekiq"
command = 'bundle exec sidekiq -C config/sidekiq.yml -e \$RAILS_ENV'

# console
[[processes]]
type = "console"
command = 'bundle exec rails console -e \$RAILS_ENV'

# rails
[[processes]]
type = "rails"
command = 'bundle exec rails'
EOL

It's all pretty specific to this project. Versions could be detected in the detect phase and then built accordingly in the build phase.
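For instance, a hedged sketch of reading such a value back in bin/build: Buildpack API 0.2 passes the buildpack plan as the third argument, so a version that detect put in the plan could be pulled out with yj and jq (the same tools used for the bundler checksum above). The entries path and the 2.7.1 fallback are assumptions.

# Sketch only: read a Ruby version that detect placed in the buildpack plan ($3)
plan=$3
ruby_version=$(yj -t < "$plan" 2>/dev/null | jq -r '.entries[0].metadata.version // "2.7.1"' || echo "2.7.1")
ruby_url="https://s3-external-1.amazonaws.com/heroku-buildpack-ruby/heroku-18/ruby-${ruby_version}.tgz"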

Layers

Anything written to a layer can be made available to later builds and included when launching. You write another toml file for each layer to describe how it's going to be used.

echo -e "cache = true\nlaunch = true\nmetadata = \"$local_bundler_checksum\"" > "$bundlerlayer.toml"

There we're saying this layer should be around again for the next build (cache = true) and should be available to the running container (launch = true). We're also saving a checksum that we read back on the next build.

Environment vars are not available in the launched image unless we write them to a file in the layer. This adds the node_modules bin path to PATH by writing a script that will get sourced on launch. It needs to go into the profile.d directory of a layer; the launcher handles the rest.

echo "export PATH=$NODE_MODULES_PATH:\$PATH" > "$moduleslayer/profile.d/01_path.sh"

node_modules also presented a tricky issue: as far as I can tell, you can symlink to the layer directory as long as the directory you're linking to is also named node_modules.

ln -s "$moduleslayer/node_modules" "$local_modules"

Launcher

The launcher is the entry point when running your image. It makes sure everything is set up and ready, like environment vars. Here you'll see we have an entry point for Rails and one for Sidekiq. The other two are shortcuts that aren't strictly needed.

cat > "$layersdir/launch.toml" <<EOL
[[processes]]
type = "web"
command = 'rails db:migrate && rails server -b 0.0.0.0 -p \${PORT:-5000} -e \$RAILS_ENV'

# our worker process
[[processes]]
type = "sidekiq"
command = 'bundle exec sidekiq -C config/sidekiq.yml -e \$RAILS_ENV'

# console
[[processes]]
type = "console"
command = 'bundle exec rails console -e \$RAILS_ENV'

# rails
[[processes]]
type = "rails"
command = 'bundle exec rails'
EOL

web is the default and any of the others can be used when starting the Docker image.

An example to get to a bash shell

sudo docker-compose exec web launcher bash

Or run directly in a container

sudo docker exec -i -t my_application console

Building

Install the pack CLI from Buildpacks.io. Then run the build command, something like this:

pack build my_application --env NODE_ENV=production --env RAILS_ENV=production --env RAILS_PRODUCTION_KEY=$(cat config/credentials/production.key) --path . --buildpack ./ruby-buildpack  --descriptor project.toml --builder paketobuildpacks/builder:full

The builder you use provides the base image, paketobuildpacks/builder:full in this case.

Once built, you can use it like any normal Docker image.
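For example, a minimal run of the default web process, assuming the ${PORT:-5000} fallback from the launch command above and that any other required environment variables are passed in the same way:

docker run -it -p 5000:5000 -e RAILS_ENV=production my_application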

Next we'll look at running this on Elastic Beanstalk using the new AWS Linux 2 Docker environment: Ruby on Rails deployment to Elastic Beanstalk
