Commit Graph

411 Commits

SHA1 Message Date
f219abb74f chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v197-vulkan-b8248 2026-03-13 04:00:10 +01:00
0130991c74 refactor: move llama-swap package config to renovate.json 2026-03-13 04:00:10 +01:00
bbb57cc174 configure renovate to automatically merge patch updates 2026-03-13 04:00:10 +01:00
966d2c50c0 update renovate comment for llama-swap image tag management 2026-03-13 04:00:10 +01:00
fb4fcc7c12 Update renovate/renovate Docker tag to v43.60.4 2026-03-13 04:00:10 +01:00
1026beb722 Update Helm release ingress-nginx to v4.15.0 2026-03-13 04:00:10 +01:00
af737ab82b Update caddy Docker tag to v2.11.2 2026-03-13 04:00:10 +01:00
6dc09ec242 Update Helm release open-webui to v12.10.0 2026-03-13 04:00:10 +01:00
39fc38d62b add qwen3.5 4b heretic 2026-03-13 04:00:10 +01:00
e72a79be8f add glm-5 from openrouter to llama-swap 2026-03-13 04:00:10 +01:00
4fda343b01 clean up llama-swap config 2026-03-13 04:00:10 +01:00
266ced7362 adjust parameters of qwen3-coder-next 2026-03-13 04:00:10 +01:00
8a074839b1 automatically fit context on qwen3.5 2b and 4b 2026-03-13 04:00:10 +01:00
42038207fc Add Q3_K_M variant of Qwen3.5-9B 2026-03-13 04:00:10 +01:00
28cb53c031 fix thinking versions of Qwen3.5 small 2026-03-13 04:00:10 +01:00
88a73cbb41 set strategy to recreate on llama-swap deployment 2026-03-13 04:00:10 +01:00
46a7e24932 add 2B, 4B, 9B versions of Qwen3.5 in thinking + nonthinking variants 2026-03-13 04:00:10 +01:00
cd7ebac6b9 increase target margin of 2048MB of VRAM 2026-03-13 04:00:10 +01:00
ba9db6ce41 add Qwen3.5 Small 0.8B model and replace Qwen3-VL-2B as task model 2026-03-13 04:00:10 +01:00
6dd9a717e2 shorten context for qwen3-vl-2b and lower kv cache quant 2026-03-13 04:00:10 +01:00
c67b6f7ebe add path to mmproj in qwen3.5 heretic 2026-03-13 04:00:10 +01:00
8d7cf402fd manually update llama-swap image tag 2026-03-13 04:00:10 +01:00
2a59555c3b Add more README 2026-03-13 04:00:10 +01:00
f236b89cca Update Helm release immich to v1.1.1 2026-03-13 04:00:10 +01:00
5f3f3d33ee Update renovate/renovate Docker tag to v43.46.6 2026-03-13 04:00:10 +01:00
b22498c60f Update caddy Docker tag to v2.11.1 2026-03-13 04:00:10 +01:00
13aaae7620 Update Helm release cert-manager to v1.19.4 2026-03-13 04:00:10 +01:00
1d7fba80d4 Update Helm release cert-manager-webhook-ovh to v0.9.2 2026-03-13 04:00:10 +01:00
3fdad80b22 Update Helm release openbao to v0.25.6 2026-03-13 04:00:10 +01:00
865a98ed97 revamp readme 2026-03-13 04:00:10 +01:00
78a81c5b72 Add mmproj-url for Qwen3.5-35B-A3B-heretic model 2026-03-13 04:00:10 +01:00
2bb23c4ed0 add gemma-3-270m-it-qat model 2026-03-13 04:00:10 +01:00
8c29fc8018 Add Qwen3.5-35B-A3B-heretic models 2026-03-13 04:00:10 +01:00
2836542569 Add always loaded Qwen3-VL-2B-Instruct 2026-03-13 04:00:10 +01:00
1e68450d8a Add Qwen3.5-35-A3B model 2026-03-13 04:00:10 +01:00
0a57fdd22d update CoreDNS logging configuration to include all log classes 2026-03-13 04:00:10 +01:00
a0a7b85cc2 custom config of coredns to deny ipv6 huggingface 2026-03-13 04:00:10 +01:00
2c83eb26b3 automatically fit models by llama.cpp 2026-03-13 04:00:10 +01:00
ec038d7154 fix models mount 2026-03-13 04:00:10 +01:00
b61e3b5c08 add schema reference to config.yaml 2026-03-13 04:00:10 +01:00
59bf4a1aa6 configure llama-swap to log llama.cpp output 2026-03-13 04:00:10 +01:00
63a8e2f7ac add Qwen3-Coder-Next model 2026-03-13 04:00:10 +01:00
1ddef7951a update llama-swap image 2026-03-13 04:00:10 +01:00
b431b9c038 disable built in open-webui ingress 2026-03-13 04:00:10 +01:00
6b0c50b104 increase openwebui storage to 10Gi 2026-03-13 04:00:10 +01:00
9f55d67ffa migrate llama models to ssd 2026-03-13 04:00:10 +01:00
3ffadc8628 add ssd volume for llama models 2026-03-13 04:00:10 +01:00
a138171c2f add lvmpv ssd storage class 2026-03-13 04:00:10 +01:00
a986aea9ed add openwebui 2026-03-13 04:00:10 +01:00
3939bc9138 add workaround for cert-manager-webhook-ovh 2026-03-13 04:00:10 +01:00