Docker
Gyre can be run in Docker, either locally or on a cloud host like vast.ai. This makes it easy to manage Gyre's dependencies.
The basic command to run Gyre in Docker looks something like this:
docker run --gpus all -it -p 5000:5000 -p 50051:50051 \
-e HF_API_TOKEN={your huggingface token} \
-e SD_LISTEN_TO_ALL=1 \
-v $HOME/.cache/huggingface:/huggingface \
-v `pwd`/weights:/weights \
ghcr.io/stablecabal/gyre:cuda118-xformers-latest
Images & Tags
Pre-built Docker images are available from the GitHub Container Registry or Docker Hub.
Docker images are provided in CUDA 11.6, 11.7 and 11.8 based versions, with basic, xformers, bundle and xformers-training options.
Unless you have a specific reason to use another, cuda118-xformers is the recommended option.
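For example, you can pull the recommended image ahead of time with:
docker pull ghcr.io/stablecabal/gyre:cuda118-xformers-latest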
basic
Just Gyre on top of the NVIDIA runtime image for the selected CUDA version. These are the fastest to build, so they will be available very quickly after any release.
xformers
basic + xformers. Faster and uses less VRAM, but a bigger Docker image and a slightly longer delay before it's available.
bundle
xformers + the flying dog web interface. For end-users who just want to create AI art (but who are comfortable with Docker).
xformers-training
xformers + bitsandbytes (for 8-bit Adam) + DeepSpeed. Useful extras for running training via a shell into the image. Much bigger Docker image.
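A minimal sketch of getting a training shell, assuming the training image tag follows the same naming pattern as the other images and that the image's default entrypoint can be overridden with a shell:
docker run --gpus all -it \
-v $HOME/.cache/huggingface:/huggingface \
-v `pwd`/weights:/weights \
--entrypoint /bin/bash \
ghcr.io/stablecabal/gyre:cuda118-xformers-training-latest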
Volume mounts
The basic command above already shares the weights folder and the Hugging Face cache, but you can mount other folders into the container to do other things:
/huggingface
The Hugging Face cache; usually the host's ~/.cache/huggingface is mounted here
/weights
Any local weights can be mounted here
/config
Override the config by creating a config directory, putting engines.yaml and any other YAML files in it, and mounting it into the container:
-v `pwd`/config:/config \
/gyre
You can check out the latest version of the server code and then mount it into the Docker image to run the very latest code (including any local edits you make):
-v `pwd`/gyre:/gyre \
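For example (the repository URL is an assumption based on the image names, so adjust it to match the actual repository):
git clone https://github.com/stablecabal/gyre.git
The -v mount above then overlays that checkout on top of the server code baked into the image.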
Environment variables
Localtunnel
The Docker image has built-in support for localtunnel, which will expose the GRPC-WEB endpoint on an https domain. It will automatically set an access token key if you don't provide one. Check your Docker log for the values to use:
-e SD_LOCALTUNNEL=1 \
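If you'd rather pin the access token than read a generated one from the log, combine it with SD_ACCESS_TOKEN (the token value here is just a placeholder):
-e SD_LOCALTUNNEL=1 \
-e SD_ACCESS_TOKEN=my-secret-token \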
Server arguments
All the server arguments can be provided as environment variables, prefixed with SD_ (see the example after this list):
- SD_ENGINECFG
- SD_GRPC_PORT
- SD_HTTP_PORT
- SD_VRAM_OPTIMISATION_LEVEL
- SD_NSFW_BEHAVIOUR
- SD_WEIGHT_ROOT
- SD_HTTP_FILE_ROOT
- SD_ACCESS_TOKEN
- SD_LISTEN_TO_ALL
- SD_ENABLE_MPS
- SD_RELOAD
- SD_LOCALTUNNEL
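For example, to start the container with a specific VRAM optimisation level and access token (the values shown here are illustrative assumptions, not recommended settings):
docker run --gpus all -it -p 5000:5000 -p 50051:50051 \
-e SD_VRAM_OPTIMISATION_LEVEL=2 \
-e SD_ACCESS_TOKEN=my-secret-token \
ghcr.io/stablecabal/gyre:cuda118-xformers-latest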
Building
PyTorch does not distribute official binaries built against CUDA 11.8, so Gyre uses a two-step build process:
- Dockerfile.devbase: PyTorch + torchvision, built from source on top of the selected CUDA version and installed into a micromamba environment
- Dockerfile: the server images themselves
Building gyre-devbase
You can pass in a short CUDA version (116, 117, 118) and a full CUDA version (11.6.2, 11.7.1, 11.8.0) to target. The default is CUDA 11.8.0:
docker build . -f Dockerfile.devbase \
--target devbase \
--build-arg CUDA_VER=118 --build-arg CUDA_FULLVER=11.8.0 \
--tag ghcr.io/stablecabal/gyre-dev:pytorch112-cuda118-latest
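For example, to build a CUDA 11.7 devbase instead (the tag here is just adjusted to match the version):
docker build . -f Dockerfile.devbase \
--target devbase \
--build-arg CUDA_VER=117 --build-arg CUDA_FULLVER=11.7.1 \
--tag ghcr.io/stablecabal/gyre-dev:pytorch112-cuda117-latest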
Building gyre
You can pass in GitHub references to specify the exact versions of the dependencies to build:
docker build . -f Dockerfile --target xformers \
--build-arg XFORMERS_REPO=https://github.com/hafriedlander/xformers.git \
--build-arg XFORMERS_REF=53b7454 \
--build-arg TRITON_REF=8650b4d \
--tag ghcr.io/stablecabal/gyre:cuda118-xformers-latest
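The image options above map to Dockerfile targets of the same name (as the --target xformers flag shows), so building the basic image would look something like this (the tag simply follows the same naming pattern as the pre-built images):
docker build . -f Dockerfile --target basic \
--tag ghcr.io/stablecabal/gyre:cuda118-basic-latest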