Add Docker support #253

Open
WolffM wants to merge 8 commits into microsoft:dev from WolffM:fix/85-add-docker-support

WolffM commented Mar 14, 2026

Summary

Fixes #85: Add Docker support

Changes

This PR addresses the issue described above. Changes were developed on the fix/85-add-docker-support branch.

Related Issue

@WolffM WolffM force-pushed the fix/85-add-docker-support branch from c44ab8a to e1a890e Compare March 14, 2026 14:50

WolffM commented Mar 14, 2026

@microsoft-github-policy-service agree

Copilot AI left a comment

Pull request overview

Adds first-class Docker artifacts to run Data Formulator via docker compose / docker build, bundling the Vite-built frontend into the Python runtime image and documenting container usage.

Changes:

  • Introduces a multi-stage Dockerfile to build the frontend and ship a Python runtime image that serves the bundled UI.
  • Adds docker-compose.yml and .dockerignore for easier local deployment and smaller build contexts.
  • Adds a minimal sandbox image Dockerfile and Docker usage documentation.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| py-src/data_formulator/sandbox/Dockerfile.sandbox | Adds a minimal Python sandbox image definition for Docker-based code execution. |
| docker-compose.yml | Defines a compose service for running Data Formulator with persisted workspace data. |
| Dockerfile | Adds multi-stage Docker build (frontend build + Python runtime image). |
| DEVELOPMENT.md | Documents Docker quick-start and sandbox limitations in containers. |
| .dockerignore | Reduces Docker build context size and avoids leaking local artifacts/secrets. |

Comment on lines +35 to +37
# Create a non-root user to run the application
RUN useradd -m -s /bin/bash appuser

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Chenglong-MS Chenglong-MS changed the base branch from main to dev March 16, 2026 17:42
@Chenglong-MS Chenglong-MS requested a review from Copilot March 16, 2026 22:15

Copilot AI left a comment

Pull request overview

Adds first-class Docker artifacts to run Data Formulator via docker build / docker compose, including a separate minimal image intended for the Docker-based sandbox backend.

Changes:

  • Added a multi-stage Dockerfile (Node build stage + Python runtime stage) and a docker-compose.yml for local deployment.
  • Added a minimal Dockerfile.sandbox for the Docker sandbox backend image.
  • Documented Docker usage in DEVELOPMENT.md and added a .dockerignore.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| py-src/data_formulator/sandbox/Dockerfile.sandbox | Defines a minimal Python image for the Docker sandbox backend. |
| docker-compose.yml | Compose service to run the app and persist workspace data in a named volume. |
| Dockerfile | Multi-stage build bundling the built frontend into the Python package image. |
| DEVELOPMENT.md | Adds Docker quick-start and notes about sandbox limitations in containers. |
| .dockerignore | Reduces build context size and avoids copying local artifacts/secrets into images. |

Comment on lines +22 to +25
# Drop to a non-root user for extra isolation
RUN useradd -m sandbox
USER sandbox

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Adds first-class Docker artifacts to run Data Formulator via a container image / Docker Compose, and introduces a minimal Docker image for the Docker-based sandbox backend.

Changes:

  • Added a multi-stage Dockerfile that builds the Vite frontend and installs the Python package into a slim runtime image.
  • Added docker-compose.yml and .dockerignore to support an easy local deployment workflow with persisted workspace data.
  • Added py-src/data_formulator/sandbox/Dockerfile.sandbox for building the data-formulator-sandbox image used by the Docker sandbox backend.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| py-src/data_formulator/sandbox/Dockerfile.sandbox | Defines a minimal Python image with data/ML deps for the Docker sandbox runtime. |
| docker-compose.yml | Compose setup to build/run the app and persist workspace data across restarts. |
| Dockerfile | Multi-stage build: frontend build (Node) + Python runtime image bundling built frontend. |
| DEVELOPMENT.md | Documents Docker quick-start and limitations of running the Docker sandbox from inside a container. |
| .dockerignore | Reduces Docker build context size and avoids copying secrets/caches into images. |

      - "5567:5567"
    env_file:
      - .env
    user: "0:0"

To stop the container: `docker compose down`

Workspace data (uploaded files, sessions) is persisted in a Docker volume (`data_formulator_home`) so it survives container restarts.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

Adds containerization support to run Data Formulator via Docker/Docker Compose, aligning with the repo’s existing “bundle frontend into py-src/data_formulator/dist and serve via Flask” packaging approach.

Changes:

  • Added a multi-stage Dockerfile to build the frontend and install/serve the Python app in a runtime image.
  • Added docker-compose.yml for a one-command local run with persisted workspace data via a named volume.
  • Added a sandbox image Dockerfile (Dockerfile.sandbox), .dockerignore, and Docker usage documentation in DEVELOPMENT.md.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| Dockerfile | Multi-stage build: frontend build (Node) + Python runtime that installs the package and serves on port 5567 |
| docker-compose.yml | Compose service definition, env file loading, port mapping, and a persistent volume for workspace data |
| py-src/data_formulator/sandbox/Dockerfile.sandbox | Minimal Python image intended for the Docker-based sandbox backend |
| DEVELOPMENT.md | Added Docker quickstart and clarified sandbox limitations when running inside containers |
| .dockerignore | Reduces Docker build context size and avoids copying local artifacts/secrets |

Comment on lines +22 to +25
# Drop to a non-root user for extra isolation
RUN useradd -m sandbox
USER sandbox

Comment on lines +20 to +25
      - .env
    user: "0:0"
    volumes:
      # Persist workspace data (uploaded files, sessions, etc.) across container restarts.
      - data_formulator_home:/home/appuser/.data_formulator
    restart: unless-stopped

Comment on lines +26 to +33
# System dependencies needed by some Python packages
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    g++ \
    libpq-dev \
    unixodbc-dev \
    curl \
    && rm -rf /var/lib/apt/lists/*

Remove user: "0:0" override in docker-compose.yml. The Dockerfile
already creates /home/appuser/.data_formulator and chowns it to appuser
before switching to USER appuser, so the override was causing the app to
run as root and write to /root/.data_formulator, bypassing the mounted
volume entirely.
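
To illustrate why the root override bypasses the volume, here is a minimal Python sketch. It assumes the app resolves its workspace directory as `<home>/.data_formulator`, which is what the paths in this commit message suggest. `data_dir_for` and `is_persisted` are hypothetical helpers and are not part of Data Formulator.

```python
from pathlib import PurePosixPath

# Assumption: the app stores workspace data under "<home>/.data_formulator".
DATA_DIR_NAME = ".data_formulator"
# Mount target taken from the compose fragment discussed in this PR.
VOLUME_TARGET = PurePosixPath("/home/appuser/.data_formulator")


def data_dir_for(home: str) -> PurePosixPath:
    """Return the workspace directory for a user whose home is `home`."""
    return PurePosixPath(home) / DATA_DIR_NAME


def is_persisted(home: str) -> bool:
    """True if the workspace directory is the mounted volume or lies under it."""
    data_dir = data_dir_for(home)
    return data_dir == VOLUME_TARGET or VOLUME_TARGET in data_dir.parents


print(is_persisted("/home/appuser"))  # appuser: writes land on the volume
print(is_persisted("/root"))          # root via user "0:0": writes miss the volume
```

With the override removed, the container runs as appuser, so the resolved directory coincides with the mount target.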

Pass --user with host uid:gid to docker run in DockerSandbox so the
sandbox container UID matches the host user that created the bind-mounted
output directory. Without this, the non-root sandbox user cannot write
the output parquet file, silently breaking all Docker sandbox executions.
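
The UID/GID matching described above could be sketched roughly as follows. This is not the actual DockerSandbox code: `build_sandbox_cmd` and the `/output` container path are hypothetical names chosen for illustration. Only `docker run`, `--rm`, `--user`, and `-v` are real Docker CLI options, and the image name is the one mentioned in the review summaries.

```python
import os
from typing import List, Optional


def build_sandbox_cmd(
    image: str,
    host_output_dir: str,
    uid: Optional[int] = None,
    gid: Optional[int] = None,
) -> List[str]:
    """Build a `docker run` argv that runs the container as the host user."""
    # os.getuid/os.getgid exist only on POSIX; on other platforms pass ids explicitly.
    if uid is None:
        uid = os.getuid()
    if gid is None:
        gid = os.getgid()
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",           # match the owner of the bind mount
        "-v", f"{host_output_dir}:/output",  # hypothetical container path
        image,
    ]


cmd = build_sandbox_cmd("data-formulator-sandbox", "/tmp/df-out", uid=1000, gid=1000)
print(" ".join(cmd))
```

Because the container process then has the same numeric UID as the owner of the bind-mounted directory, it can write its output file without the host directory needing to be world-writable.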

WolffM commented Mar 18, 2026

hey @Chenglong-MS, i think the comments are addressed now. thanks for taking a look.

Copilot AI left a comment

Pull request overview

Adds first-class Docker deployment support for Data Formulator (closes #85), including containerizing the full build (frontend + Python runtime) and documenting Docker usage.

Changes:

  • Adds a multi-stage Dockerfile to build the frontend and package it into a Python runtime image.
  • Adds docker-compose.yml plus .dockerignore for easier local deployment and persistence of workspace data.
  • Updates the Docker-based sandbox runner to run containers as the host user (UID/GID) to avoid permission issues on bind mounts, and adds a dedicated sandbox Dockerfile.sandbox.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| py-src/data_formulator/sandbox/docker_sandbox.py | Runs docker sandbox containers as the invoking host UID/GID. |
| py-src/data_formulator/sandbox/Dockerfile.sandbox | Defines a minimal Python sandbox image with common data libraries. |
| docker-compose.yml | Adds compose config to build/run the app with a persisted workspace volume. |
| Dockerfile | Adds multi-stage image build (Node build stage + Python runtime stage). |
| DEVELOPMENT.md | Documents Docker quickstart and limitations around docker-in-docker sandboxing. |
| .dockerignore | Reduces Docker build context size and avoids copying secrets like .env. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Chenglong-MS commented

seems good! let me do a little test and merge it! 🚀

Development

Successfully merging this pull request may close these issues.

Docker Support

3 participants