sjmonson
Collaborator

@sjmonson sjmonson commented Jul 31, 2025

Summary

Various fixes to container images and CI. Container images should now reuse most layers on build, need fewer hacks when running in OCP, and properly detect the guidellm version. Release and RC git tags/branches now tag the container image with the version. The latest and stable tags now update based on the highest version tag (previously they tracked the most recent build).

Details

  • Moves Containerfile to top-level
  • Adds a .containerignore that mirrors .gitignore
  • Fixes version detection in image builds
  • Correctly sets release type and tags rc/release with version when building images
  • Uses PDM to lock the venv in image builds
  • Adds weekly job for tagging latest and stable based on existing version tagged images
  • Adds weekly job to prune dev containers that are more than 2 weeks old
    • Note: This job is currently set to dry-run and will be enabled in a future PR
  • Improves container compatibility with unprivileged K8s users
  • Reorders image layers to improve build caching
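
As an illustration of the latest/stable tagging described above, here is a minimal Python sketch of picking the highest version tag from a list of existing image tags. It is not the workflow's actual implementation. The tag format (`v1.2.3` with an optional `rcN` suffix), the rule that latest may point at a release candidate while stable only points at final releases, and the function names are all assumptions made for this example.

```python
import re

# Hypothetical tag format: optional "v", three numeric parts, optional "rcN".
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")


def parse_version_tag(tag):
    """Return a sortable key for a version tag, or None for other tags."""
    match = _VERSION_RE.match(tag)
    if match is None:
        return None
    major, minor, patch, rc = match.groups()
    # A final release sorts above every release candidate of that version.
    is_final = rc is None
    return (int(major), int(minor), int(patch), is_final, int(rc or 0))


def pick_floating_tags(tags):
    """Map the floating tags to the highest matching version tag.

    Assumption: "latest" considers releases and release candidates,
    "stable" considers final releases only.
    """
    versioned = [(parse_version_tag(tag), tag) for tag in tags]
    versioned = [(key, tag) for key, tag in versioned if key is not None]
    finals = [(key, tag) for key, tag in versioned if key[3]]
    result = {}
    if versioned:
        result["latest"] = max(versioned)[1]
    if finals:
        result["stable"] = max(finals)[1]
    return result
```

Tags that are not versions, such as `pr-254` or `nightly`, are ignored, so a newer development build never moves either floating tag.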

Test Plan

1. Workflows

  • Workflows were partially tested with act but need to be run on GitHub to verify

2. Container Changes

  1. Run podman run --rm ghcr.io/vllm-project/guidellm:pr-254 --version to verify version is set
  2. Run podman run --rm --entrypoint /opt/app-root/guidellm/bin/pip ghcr.io/vllm-project/guidellm:pr-254 freeze and verify that versions match the pylock.toml.
  3. Run a pod with the GuideLLM container and verify the user has write permissions to /home/guidellm. (For tokenizer/dataset caching)

Related Issues

  • Resolves #

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch from 3604c63 to f57f9df Compare July 31, 2025 18:50
@sjmonson sjmonson changed the title WIP: Make container more K8s friendly WIP: Make container more OCP friendly Jul 31, 2025
@sjmonson sjmonson marked this pull request as draft August 1, 2025 16:15
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch from f57f9df to 250df65 Compare September 8, 2025 14:49
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch 2 times, most recently from c62588f to ef02649 Compare September 8, 2025 18:05
Signed-off-by: Samuel Monson <[email protected]>
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch from ef02649 to e360711 Compare September 8, 2025 18:05
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch from 437ca73 to 4bb780f Compare September 8, 2025 21:24
@sjmonson sjmonson changed the title WIP: Make container more OCP friendly Various fixes to container and container CI Sep 9, 2025
@sjmonson sjmonson changed the title Various fixes to container and container CI Various fixes to container build and CI Sep 9, 2025
@sjmonson sjmonson changed the title Various fixes to container build and CI Various fixes to container image and CI Sep 9, 2025
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch 2 times, most recently from b9dc4ad to cb05e48 Compare September 9, 2025 18:11
@sjmonson sjmonson force-pushed the feat/k8s_container_fixes branch from cb05e48 to ca3f6e8 Compare September 9, 2025 18:18
@sjmonson sjmonson marked this pull request as ready for review September 9, 2025 18:24
Signed-off-by: Samuel Monson <[email protected]>
@sjmonson sjmonson requested review from Copilot and markurtz September 9, 2025 20:16
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the container image build process and CI workflows to improve caching, fix version detection, and automate container maintenance. The changes move the Containerfile to the root directory, improve image layer ordering for better build caching, and add automation for tagging latest/stable versions and cleaning up development images.

  • Relocated Containerfile from deploy/ to root directory with improved layer caching strategy
  • Enhanced CI workflows to properly tag container images with versions and build types
  • Added automated container maintenance workflow for cleaning up old PR images and updating latest/stable tags

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| deploy/Containerfile | Removed old Containerfile from deploy directory |
| Containerfile | New Containerfile at root with improved caching and PDM integration |
| README.md | Added documentation for available container image tags |
| .github/workflows/release.yml | Updated to use new Containerfile path and tag with version |
| .github/workflows/release-candidate.yml | Updated to use new Containerfile path and tag RC versions |
| .github/workflows/nightly.yml | Updated to use new Containerfile path and set nightly build type |
| .github/workflows/development.yml | Updated to use new Containerfile path and set dev build type |
| .github/workflows/container-maintenance.yml | New workflow for automated container cleanup and tag management |
| .containerignore | Added container ignore file referencing .gitignore |

image-names: ${{ github.event.repository.name }}
image-tags: "pr-*"
cut-off: 2w
dry-run: true
Collaborator Author

Set this to dry-run for now until we can confirm it deletes the correct set of images.
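
To make the intended selection concrete, here is a rough Python sketch of the pruning logic with a dry-run mode. It does not reproduce the behavior of the action used in the workflow. The image records (dicts with `tags` and `created`), the `delete` callback, and the rule that every tag on an image must match the pattern are assumptions chosen for the example. The defaults mirror `image-tags: "pr-*"`, `cut-off: 2w`, and `dry-run: true` from the snippet above.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch


def select_for_pruning(images, pattern, max_age, now):
    """Return images older than max_age whose tags all match pattern."""
    cutoff = now - max_age
    selected = []
    for image in images:
        tags = image["tags"]
        # Conservative assumption: require every tag to match, so an image
        # that also carries a release tag is never selected.
        all_match = bool(tags) and all(fnmatch(tag, pattern) for tag in tags)
        if all_match and image["created"] < cutoff:
            selected.append(image)
    return selected


def prune(images, pattern="pr-*", max_age=timedelta(weeks=2),
          dry_run=True, now=None, delete=None):
    """Delete (or, in dry-run mode, only report) the selected images."""
    now = now or datetime.now(timezone.utc)
    candidates = select_for_pruning(images, pattern, max_age, now)
    for image in candidates:
        if dry_run or delete is None:
            print(f"[dry-run] would delete image tagged {image['tags']}")
        else:
            delete(image)
    return candidates
```

Reviewing the dry-run output against the registry is the check this comment asks for before the job is allowed to delete anything.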

Comment on lines +17 to +18
RUN apt-get update \
&& apt-get install -y --no-install-recommends git \
Collaborator Author

I had hoped to keep this image distro-agnostic to enable downstream RHEL builds but we need git to have the correct package version tag after install. There are currently two alternatives I am considering:

  1. Write a helper script that implements apt and dnf based installs based on /etc/os-release.
  2. Change base_image to a community dnf based image.

I prefer (2), but there are currently no minimal 3.13 images based on Fedora/CentOS/RHEL. The closest is quay.io/fedora/python-312-minimal. I've filed sclorg/s2i-python-container#753 with the project to hopefully enable python-313-minimal builds.
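
For reference, option (1) could look roughly like the Python sketch below, which reads os-release content and chooses between apt-get and dnf. This is only an illustration of the idea, not code from this PR. The function names are hypothetical, and the set of recognized `ID` / `ID_LIKE` values is an assumption that would need checking against the base images actually in use.

```python
import shlex


def parse_os_release(text):
    """Parse os-release content into a dict of KEY -> value."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex.split strips the quotes.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def install_commands(os_release_text, package):
    """Return the argv lists needed to install package on this distro."""
    fields = parse_os_release(os_release_text)
    ids = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    if any(i in ("debian", "ubuntu") for i in ids):
        return [
            ["apt-get", "update"],
            ["apt-get", "install", "-y", "--no-install-recommends", package],
        ]
    if any(i in ("fedora", "rhel", "centos") for i in ids):
        return [["dnf", "install", "-y", package]]
    raise ValueError(f"unsupported distribution: {fields.get('ID', 'unknown')}")
```

A build step would read `/etc/os-release`, call `install_commands(text, "git")`, and run each returned command in order, for example with `subprocess.run(cmd, check=True)`.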

Collaborator

Is it possible to use a 3.12 base image instead?

Collaborator

@jaredoconnell jaredoconnell left a comment

Looks good to me. I left one comment.
