
Containers in workspaces and jobs are ephemeral. To make a dataset, model checkpoint, or large config file available to your training scripts, put it in a persistent volume and mount that volume into the workload. This guide walks through the practical ways to get data into a volume.
Do not encode data into the job command itself. The API rejects large inline payloads passed via --cmd, such as a gzip+base64 blob or a long heredoc: command bodies over 256 KiB, environment variable values over 8 KiB, and more than 128 environment variable pairs all return a 4xx error. Use one of the patterns below instead.

At a glance

The right approach depends on which kind of volume you target.
Volume kind        How to load data
Object storage     Upload directly from your machine with vesslctl volume upload, or use any S3-compatible client via vesslctl volume token.
Cluster storage    Mount the volume into a workspace, then bring the data into the mount path from inside that workspace.

Object storage — upload from your machine

vesslctl volume upload is the primary path. Files stream from your machine straight to S3, so transfer size is limited only by your network and storage quota.
1. Pick or create an Object storage volume

vesslctl volume list --type object
If you need a new volume, see Create a volume or use the CLI:
vesslctl volume create \
  --name training-data \
  --storage <object-storage-slug> \
  --teams <team-name>
--teams controls which teams can mount this volume; it is required for Object storage.
2. Upload local files

vesslctl volume upload <volume-slug> ./dataset/ \
  --remote-prefix datasets/v1/ \
  --exclude "*.pyc" \
  --exclude "__pycache__"
--dry-run previews the file list without transferring. --overwrite replaces existing remote keys; without it, identical keys are skipped. See vesslctl volume upload for the full flag reference.
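For example, preview first, then run the real transfer with the same flags:
# Dry run: print the file list, transfer nothing
vesslctl volume upload <volume-slug> ./dataset/ \
  --remote-prefix datasets/v1/ --dry-run

# Real upload, replacing any existing remote keys
vesslctl volume upload <volume-slug> ./dataset/ \
  --remote-prefix datasets/v1/ --overwrite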
3. Verify and mount

vesslctl volume ls <volume-slug> --prefix datasets/v1/
Mount the volume into a workspace or pass it to a job with --object-volume <volume-slug>:/shared.
Need to drive the upload from another tool — aws s3 cp, rclone, DVC, or a custom pipeline? Run vesslctl volume token <volume-slug> to get temporary S3 credentials and an endpoint URL scoped to just that volume. See vesslctl volume token.
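A sketch of that token flow with the AWS CLI (the placeholder values below are assumptions; substitute whatever vesslctl volume token actually prints):
# Obtain scoped credentials, then hand them to aws s3
vesslctl volume token <volume-slug>
export AWS_ACCESS_KEY_ID=<printed-access-key>       # placeholder name
export AWS_SECRET_ACCESS_KEY=<printed-secret-key>   # placeholder name
aws s3 sync ./dataset/ s3://<printed-bucket>/datasets/v1/ \
  --endpoint-url <printed-endpoint>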

Cluster storage — load through a workspace

vesslctl volume upload does not support Cluster storage volumes. Instead, mount the Cluster storage volume into a workspace and bring the data into the mount path from inside that workspace.
1. Create a workspace that mounts the volume

Mount the Cluster storage volume at a clear path under Persistent volume, for example /data:
vesslctl workspace create \
  --name data-loader \
  --cluster <cluster-slug> \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --cluster-volume <cluster-volume-slug>:/data
Any container image with the tools you need (curl, wget, aws, huggingface-cli, git-lfs, …) works. To minimize hourly cost while you move data, pick a CPU-only spec from vesslctl resource-spec list.
2. Connect to the workspace

Wait until the workspace is running, then connect over SSH or in JupyterLab; see Connect to a workspace. Once connected, cd /data (or whatever mount path you chose). Anything you write below this path lands in the Cluster storage volume and persists after the workspace is paused or terminated.
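A minimal SSH session looks like this, using the host, port, and key from the Connect tab:
ssh -i /path/to/<key> -p <port> root@<workspace-host>
cd /data   # writes below this path persist in the volume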
3. Bring the data in (pick a pattern below)

Several patterns work. Pick the one that matches where the data lives.
4. Pause the workspace when you are done

Cluster storage data persists past pause — you do not need a running workspace to keep the data alive. Pause to stop compute billing:
vesslctl workspace pause <workspace-slug>
Resume with vesslctl workspace start <workspace-slug> later if you need to add or modify data.

Pattern A — Pull from the public internet

The simplest and most common case: the data is already at a public (or token-authenticated) URL.
# Inside the workspace shell, with the volume mounted at /data
cd /data

# Plain HTTP(S) downloads
wget https://example.com/datasets/imagenet-subset.tar.gz
tar -xzf imagenet-subset.tar.gz

# Hugging Face datasets / model repos
pip install -U "huggingface_hub[cli]"
huggingface-cli download <org>/<repo> --local-dir ./hf-cache --repo-type dataset

# S3 / GCS / Azure (use the matching CLI)
aws s3 sync s3://my-bucket/datasets/v1/ ./datasets/v1/
For very large transfers, aria2c -x 16 parallelizes HTTP downloads, and rclone copy handles cloud-storage providers with built-in retry and verification.
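For example (the rclone line assumes an S3 remote named s3remote already set up via rclone config):
# 16 parallel connections for one large file
aria2c -x 16 https://example.com/datasets/imagenet-subset.tar.gz

# Cloud copy with automatic retries and checksum verification
rclone copy s3remote:my-bucket/datasets/v1/ ./datasets/v1/ --progress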

Pattern B — Push from your laptop over SSH

When the data is only on your laptop and you want to skip the round trip through the public internet, use SSH to copy directly into the mount path.
# scp a single file or directory
scp -i /path/to/<key> -P <port> ./dataset.tar.gz \
  root@<workspace-host>:/data/

# rsync (resumable, deduped, recommended for large trees)
rsync -avh --progress -e "ssh -i /path/to/<key> -p <port>" \
  ./dataset/ root@<workspace-host>:/data/
The host, port, and key path come from the workspace Connect tab — see Connect to a workspace. rsync is preferable for anything multi-gigabyte: it resumes after a dropped connection (--partial) and only retransmits changed files on a re-run.
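A resumable variant; --partial keeps incomplete files so an interrupted transfer continues where it stopped:
rsync -avh --progress --partial -e "ssh -i /path/to/<key> -p <port>" \
  ./dataset/ root@<workspace-host>:/data/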

Pattern C — Stage through Object storage

When you want a one-time copy from your laptop into Cluster storage on a different cluster (or from one cluster to another), use Object storage as a portable intermediate. Object storage is reachable from any cluster.
# 1. From your laptop: upload to an Object storage volume
vesslctl volume upload <object-volume-slug> ./dataset/ \
  --remote-prefix v1/

# 2. Create the data-loader workspace mounting BOTH volumes
vesslctl workspace create \
  --name data-loader \
  --cluster <cluster-slug> \
  --resource-spec <spec-slug> \
  --image quay.io/vessl-ai/torch:2.9.1-cuda13.0.1-py3.13-slim \
  --object-volume <object-volume-slug>:/shared \
  --cluster-volume <cluster-volume-slug>:/data

# 3. Inside the workspace: copy from /shared into /data
cp -r /shared/v1/. /data/v1/
After the copy completes, the staging copy in Object storage is no longer needed: delete it from the volume to reclaim space, or keep it as a backup.
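One way to sanity-check the copy before discarding the staging data:
# Recursively compare staging copy and destination
diff -r /shared/v1 /data/v1 && echo "copy verified"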

Pattern D — Open a custom HTTP port

Need a browser drag-and-drop, a sync server, or a temporary webhook into the workspace? Open a custom HTTP or TCP port when you create the workspace (see Workspace ports) and serve directly from the mount path.
# Quick browser-friendly file upload UI on a custom HTTP port (for example, 8000)
pip install --no-cache-dir uvicorn fastapi python-multipart
# … or use any small upload server such as filebrowser or miniserve --upload-files
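# Sketch, assuming miniserve is installed (check miniserve --help for
# exact flags): serve /data with browser uploads enabled on port 8000
miniserve /data --upload-files -p 8000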

# rclone serve: expose /data over HTTP/WebDAV/SFTP for a chosen client
rclone serve http /data --addr :8000  # then point your laptop's rclone at it
Use this when neither the CLI upload (Object) nor SSH copy (Pattern B) fits — for example, when a teammate without SSH access needs to drop files in, or when an external service is pushing data to the workspace.
Custom ports are reachable from anywhere the workspace URL is — treat them like any other public endpoint. Add basic auth, a one-time token, or shut the port down once the load is finished.
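If you serve with rclone, one option is its built-in basic auth flags:
# Require a username/password on the ad-hoc HTTP server
rclone serve http /data --addr :8000 --user uploader --pass <one-time-token>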

Anti-pattern: do not embed data in --cmd

A pattern that looks tempting — especially to LLM coding agents — is to gzip+base64 a dataset into a single shell line and pass it via --cmd:
# DON'T DO THIS — the API rejects --cmd over 256 KiB,
# and even shorter inline blobs make jobs hard to reproduce and observe.
vesslctl job create \
  --cmd "echo 'H4sIAA...<3 MiB of base64>...' | base64 -d | gunzip > /tmp/data && python eval.py"
The API rejects this: Job.command is capped at 256 KiB, each environment variable value at 8 KiB, and the total environment variable count at 128. Requests beyond these thresholds return 4xx. Always upload the data to a volume (this page) and mount it instead.
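The volume-backed equivalent of the job above references the data by path instead of inlining it:
# Upload once from your machine...
vesslctl volume upload <volume-slug> ./data/ --remote-prefix eval/v1/

# ...then mount the volume and point --cmd at the path
vesslctl job create \
  --object-volume <volume-slug>:/shared \
  --cmd "python eval.py --data /shared/eval/v1"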

Next steps