ModelSphere
Available
Your own Hugging Face Hub. Change one env var and every client just works.
ModelSphere is an on-premises model and dataset registry that speaks the Hugging Face Hub API on the wire. Set HF_ENDPOINT and every standard client — transformers, datasets, diffusers, huggingface-cli, plain git with LFS — works against your registry without patches. It doubles as a pull-through cache for the public Hub, fetching upstream models lazily or via scheduled prefetch for air-gapped sites, and adds what an enterprise registry needs: OIDC SSO, scoped personal access tokens, organizations, private repos, audit logging, Prometheus metrics, and one-tarball backup and restore.
Specification
- Version: v1.6.4 (generally available)
- Protocol: Hugging Face Hub API · git over HTTPS and SSH · git-LFS (objects over 5 GiB via HF multipart)
- Modes: Primary registry · pull-through cache · scheduled prefetch
- Operations: OIDC · PAT scopes · audit log · Prometheus · one-tarball backup
- Languages: English · 简体中文
Proof, not promises
See it in one block.
No proprietary SDKs, no rewrites — ModelSphere meets your tools where they already are.
$ export HF_ENDPOINT=https://models.intra.example
# everything downstream just works — no client patches
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
# served from your registry; cached from upstream or fully air-gapped
Wire-compatible with the Hugging Face Hub API, including git clone and multi-gigabyte LFS transfers.
Capabilities
What ModelSphere gives you
Wire-compatible with the Hub
Wire-compatible with huggingface_hub — so transformers, datasets, diffusers, and huggingface-cli work by setting HF_ENDPOINT. No forks, no client patches, no proprietary SDK to adopt.
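To make "wire-compatible" concrete: Hub clients download files from URLs shaped like {endpoint}/{repo_id}/resolve/{revision}/{filename}, so switching the endpoint changes only the host. The sketch below reproduces that layout with the standard library. The helper name resolve_url is hypothetical, and the host is the example one used on this page.

```python
import os
from urllib.parse import quote

DEFAULT_ENDPOINT = "https://huggingface.co"

def resolve_url(repo_id, filename, revision="main", repo_type="model", endpoint=None):
    """Build the download URL a Hub client would request for one file.

    Hypothetical helper for illustration; real clients do this internally.
    """
    base = (endpoint or os.environ.get("HF_ENDPOINT", DEFAULT_ENDPOINT)).rstrip("/")
    # Datasets and Spaces live under a path prefix; models have none.
    prefix = {"model": "", "dataset": "datasets/", "space": "spaces/"}[repo_type]
    return f"{base}/{prefix}{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"

url = resolve_url(
    "meta-llama/Llama-3.1-8B",
    "config.json",
    endpoint="https://models.intra.example",
)
print(url)
```

Note that huggingface_hub reads HF_ENDPOINT when it is imported, so the variable should be set before the first import.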
Real git underneath
git clone, git lfs pull, and git lfs push work natively; objects beyond 5 GiB upload via Hugging Face's multipart transfer. Model versioning is git versioning — branches, tags, and commits you already understand.
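A minimal sketch of the size rule above, assuming the 5 GiB boundary stated in the specification: it splits a set of local files into those that fit a single LFS upload and those that would go through multipart transfer. The function name and the sample sizes are illustrative only; the actual chunking is handled by the client and the registry.

```python
GIB = 1024 ** 3
MULTIPART_THRESHOLD = 5 * GIB  # assumption: "beyond 5 GiB" means strictly greater

def split_by_transfer(files):
    """Split {path: size_in_bytes} into (single_part, multipart) path lists."""
    single = sorted(p for p, size in files.items() if size <= MULTIPART_THRESHOLD)
    multi = sorted(p for p, size in files.items() if size > MULTIPART_THRESHOLD)
    return single, multi

# Illustrative sizes, not measurements of any real checkpoint.
sample = {
    "config.json": 2_048,
    "model-00001-of-00002.safetensors": 4 * GIB,
    "consolidated.pth": 15 * GIB,
}
single, multi = split_by_transfer(sample)
print("single-part:", single)
print("multipart:", multi)
```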
A browsable Hub, not just an API
Beyond the wire protocol, ModelSphere ships a web UI: searchable browse with tag, pipeline, and library filters; rendered model cards; an in-browser file viewer and commit-diff history; discussions and likes. Push over HTTPS or SSH; clone, diff, and review in the browser.
Pull-through cache for the public Hub
Point it at huggingface.co and it lazily caches what your team pulls — or prefetch on a schedule for fully air-gapped sites. One registry serves both worlds.
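The lazy and prefetch modes follow the usual pull-through pattern, which can be sketched in a few lines. This is an illustration of the pattern, not ModelSphere's implementation: the class PullThroughCache, its method names, and the in-memory store are all hypothetical.

```python
class PullThroughCache:
    """Serve from the local store; fetch from upstream only on a miss."""

    def __init__(self, fetch_upstream=None):
        self._fetch = fetch_upstream  # None models an air-gapped site
        self._store = {}

    def get(self, key):
        if key not in self._store:
            if self._fetch is None:
                raise LookupError(f"{key} is not cached and no upstream is reachable")
            self._store[key] = self._fetch(key)  # lazy fill on first request
        return self._store[key]

    def prefetch(self, keys):
        """Warm the cache ahead of time, e.g. from a scheduled job."""
        for key in keys:
            self.get(key)

    def detach(self):
        """Cut the upstream link; only cached objects remain available."""
        self._fetch = None


calls = []

def fake_upstream(key):
    calls.append(key)  # record every upstream round trip
    return f"bytes-of-{key}"

cache = PullThroughCache(fake_upstream)
cache.get("org/model/config.json")            # miss: fetched from upstream
cache.get("org/model/config.json")            # hit: served locally
cache.prefetch(["org/model/tokenizer.json"])  # warmed before going offline
cache.detach()
print(cache.get("org/model/tokenizer.json"))
print(calls)
```

Each object reaches upstream once; after detach, anything prefetched is still served and anything else fails fast.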
Enterprise plumbing included
OIDC SSO, personal access tokens with fine-grained scopes, organizations and teams, public/private repos, an audit log of every change, Prometheus metrics with shipped Grafana dashboards, and backup/restore as a single tarball.
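As a rough picture of what a fine-grained token scope could mean, the sketch below checks an action and a repository against a list of scopes. The scope model is an assumption made for illustration: the Scope fields, the action names "read" and "write", and the glob patterns are hypothetical and not taken from ModelSphere's documentation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

@dataclass(frozen=True)
class Scope:
    action: str        # hypothetical action name: "read" or "write"
    repo_pattern: str  # hypothetical glob over repo ids, e.g. "acme/*"

def is_allowed(scopes, action, repo_id):
    """Return True if any scope grants `action` on `repo_id`."""
    return any(
        s.action == action and fnmatchcase(repo_id, s.repo_pattern)
        for s in scopes
    )

# A read-only token for one organization, as a CI job might carry.
ci_token_scopes = [Scope("read", "acme/*")]

print(is_allowed(ci_token_scopes, "read", "acme/llama-ft"))
print(is_allowed(ci_token_scopes, "write", "acme/llama-ft"))
print(is_allowed(ci_token_scopes, "read", "other/model"))
```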
How it works
From public dependency to private registry.
- Step 01
Set HF_ENDPOINT
One environment variable. Every standard Hugging Face client now talks to your registry instead of the public Hub.
- Step 02
Push, pull, clone
Upload fine-tunes with huggingface-cli or git-lfs. Pull base models through the built-in cache — lazily, or prefetched for air-gap.
- Step 03
Govern and observe
OIDC decides who sees what. Tokens are scoped. Every change lands in the audit log; usage lands in Prometheus.
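The three steps can be wired into a job launcher by preparing the environment once and handing it to whatever tool runs next. The sketch below assumes the registry accepts its personal access tokens through the standard HF_TOKEN variable that huggingface_hub reads; the helper name registry_env and the token value are hypothetical.

```python
import os

def registry_env(endpoint, token=None, base_env=None):
    """Return a copy of the environment that points Hub clients at `endpoint`."""
    env = dict(os.environ if base_env is None else base_env)
    env["HF_ENDPOINT"] = endpoint.rstrip("/")
    if token:
        # Assumption: the registry's scoped token is passed the same way as a Hub token.
        env["HF_TOKEN"] = token
    return env

env = registry_env(
    "https://models.intra.example/",
    token="hf_example_token",       # placeholder, not a real credential
    base_env={"PATH": "/usr/bin"},  # small base so the example is deterministic
)
print(env["HF_ENDPOINT"])
```

The resulting dictionary can be passed as the env argument of subprocess.run, so a training script or CLI call inherits the registry settings without changes to its own code.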
Who it's for
Built for these teams
- Teams hosting fine-tuned and internal models
- Air-gapped organizations that still want HF workflows
- Platform teams centralizing model and dataset storage
Pairs well with
Other builder products
ConsoleX
Available
Log in, get a governed Kubernetes workspace. No kubectl, no tickets.
On first SSO login every user gets an isolated namespace with quotas, default-deny networking, storage, and a web terminal — provisioned automatically, reconciled continuously.
DevSpace
Available
Jupyter or VS Code on a GPU in seconds. Idle environments shut themselves down.
Single-click Jupyter, Marimo, Streamlit, Gradio, and VS Code environments — GPU-ready, isolated per user behind a per-pod auth proxy, with SSH access and idle shutdown by default.
TrainX
Available
Admins write the template. Users fill a form. Kubernetes runs the job.
Self-describing training templates render straight into UI forms — with live quota checks, streaming logs, parsed progress bars, and one-click TensorBoard.