Initial production-ready Gemma 3 vLLM ROCm stack
Co-Authored-By: Oz <oz-agent@warp.dev>
50
docs/UPGRADE_NOTES.md
Normal file
@@ -0,0 +1,50 @@
# Upgrade Notes
## Standard safe upgrade path
From repository root:
```bash
git pull
docker compose pull
./scripts/restart.sh
```
Then run smoke tests:
```bash
./scripts/test_api.sh
./scripts/test_ui.sh
python3 scripts/test_python_client.py
```
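
The three checks above can also be chained so that the first failure stops the run. This is a minimal sketch, assuming each script exits non-zero on failure (the notes do not state this); `run_checks` is a hypothetical helper name, not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run each command string in order and stop at the
# first one that exits non-zero.
run_checks() {
  local check
  for check in "$@"; do
    echo "==> ${check}"
    # Each argument is a full command line, so run it through a shell.
    if ! bash -c "${check}"; then
      echo "FAILED: ${check}" >&2
      return 1
    fi
  done
  echo "All checks passed."
}

# Usage, from the repository root:
#   run_checks ./scripts/test_api.sh ./scripts/test_ui.sh \
#     "python3 scripts/test_python_client.py"
```

Stopping early keeps the failing check's output at the bottom of the terminal, which is usually what you want right after an upgrade.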
## Versioning guidance
- Prefer pinning image tags in `docker-compose.yml` once your deployment is stable.
- Upgrading vLLM may change runtime defaults or engine behavior; check vLLM release notes before major version jumps.
- Keep `GEMMA_MODEL_ID` explicit in `.env` to avoid unintentional model drift.
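
To see which images are still unpinned, a text-level check along these lines may help. It is a rough sketch, not a YAML parser: it assumes each image is declared on its own `image:` line, it does not resolve `${VAR}` interpolation, and `find_unpinned_images` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list images in a Compose file that have no tag,
# use the `latest` tag, and are not pinned by an @sha256 digest.
find_unpinned_images() {
  local compose_file="${1:-docker-compose.yml}"
  # Extract the first token after each `image:` key and drop any quotes.
  sed -n 's/^[[:space:]]*image:[[:space:]]*\([^[:space:]]*\).*/\1/p' "${compose_file}" \
    | tr -d "\"'" \
    | awk '
        /@sha256:/ { next }                             # pinned by digest
        $0 ~ ":[^/:]+$" && $0 !~ ":latest$" { next }    # pinned by explicit tag
        { print }                                       # everything else is unpinned
      '
}

# Usage, from the repository root:
#   find_unpinned_images docker-compose.yml
```

If the Compose file relies on variable interpolation, running the same check on the output of `docker compose config` gives a more faithful result, since that command prints the resolved configuration.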
## Model upgrade considerations
When changing Gemma 3 variants (for example, from 1B to larger sizes):
- Verify host RAM and GPU memory capacity.
- Expect re-download of model weights and larger disk usage.
- Re-tune:
  - `VLLM_MAX_MODEL_LEN`
  - `VLLM_GPU_MEMORY_UTILIZATION`
- Re-run validation scripts after restart.
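
Before switching variants, it can help to print the current settings next to the available resources. The sketch below assumes the three variables are plain `KEY=VALUE` lines in `.env`, that the Hugging Face cache lives under `${HOME}`, and that the `rocm-smi` CLI may or may not be installed on the host. `env_value` and `preflight_report` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the last value assigned to KEY in a .env-style file.
env_value() {
  local key="$1" env_file="${2:-.env}"
  sed -n "s/^${key}=//p" "${env_file}" | tail -n 1
}

# Hypothetical helper: show current settings and available resources.
preflight_report() {
  local env_file="${1:-.env}" key
  for key in GEMMA_MODEL_ID VLLM_MAX_MODEL_LEN VLLM_GPU_MEMORY_UTILIZATION; do
    echo "${key}=$(env_value "${key}" "${env_file}")"
  done
  free -h            # host RAM (Linux)
  df -h "${HOME}"    # free disk space on the filesystem holding ${HOME}
  # GPU memory, only if the ROCm CLI is available (an assumption).
  if command -v rocm-smi > /dev/null 2>&1; then
    rocm-smi --showmeminfo vram
  fi
}

# Usage, from the repository root:
#   preflight_report .env
```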
## Backup recommendations
Before major upgrades, back up local persistent data:
```bash
mkdir -p backups
tar -czf backups/hf-cache-$(date +%Y%m%d-%H%M%S).tar.gz "${HOME}/.cache/huggingface"
tar -czf backups/open-webui-data-$(date +%Y%m%d-%H%M%S).tar.gz frontend/data/open-webui
```
If you use locally pre-downloaded models:
```bash
tar -czf backups/models-$(date +%Y%m%d-%H%M%S).tar.gz models
```
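
A backup is only useful if it can be read back. As a quick sanity check, each archive can be listed with `tar -tzf` before the upgrade proceeds. This is a sketch under the assumption that all archives sit in `backups/` with a `.tar.gz` suffix, as in the commands above; `verify_archive` and `verify_backups` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: confirm a gzip-compressed tar archive can be listed.
verify_archive() {
  local archive="$1"
  if tar -tzf "${archive}" > /dev/null 2>&1; then
    echo "OK      ${archive}"
  else
    echo "BROKEN  ${archive}" >&2
    return 1
  fi
}

# Hypothetical helper: check every archive in a directory and return
# non-zero if any of them is unreadable.
verify_backups() {
  local dir="${1:-backups}" archive status=0
  for archive in "${dir}"/*.tar.gz; do
    [ -e "${archive}" ] || continue   # directory has no archives
    verify_archive "${archive}" || status=1
  done
  return "${status}"
}

# Usage, from the repository root:
#   verify_backups backups
```

Listing the archive catches truncated or corrupt files, but it does not compare the contents against the source directories.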
## Rollback approach
If a new image/model combination fails:
1. Revert `docker-compose.yml` and `.env` to the previous known-good values.
2. Pull the previously pinned images (this only works if they were pinned by tag or digest).
3. Restart:
   ```bash
   ./scripts/restart.sh
   ```
4. Re-run smoke tests.
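
Step 1 is easier if the known-good files were saved before the upgrade. The following is a minimal sketch of that idea, assuming both files live in the repository root and reusing the `backups/` directory from the backup section; `snapshot_config` and `restore_config` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy the current config into a timestamped directory
# and print that directory's path.
snapshot_config() {
  local dest
  dest="backups/config-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "${dest}"
  cp -p docker-compose.yml .env "${dest}/"
  echo "${dest}"
}

# Hypothetical helper: copy a saved config back into the repository root.
restore_config() {
  local src="$1"
  cp -p "${src}/docker-compose.yml" "${src}/.env" .
}

# Usage, from the repository root:
#   snap="$(snapshot_config)"     # before the upgrade
#   restore_config "${snap}"      # if the upgrade has to be rolled back
```

If `docker-compose.yml` is tracked in git, reverting it through version control works as well; the snapshot mainly helps with `.env`, which is often left untracked.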