Commit Graph

142 Commits

Author SHA1 Message Date
Lumpiasty 6dd9a717e2 shorten context for qwen3-vl-2b and lower kv cache quant 2026-03-13 04:00:10 +01:00
Lumpiasty c67b6f7ebe add path to mmproj in qwen3.5 heretic 2026-03-13 04:00:10 +01:00
Lumpiasty 8d7cf402fd manually update llama-swap image tag 2026-03-13 04:00:10 +01:00
Renovate b22498c60f Update caddy Docker tag to v2.11.1 2026-03-13 04:00:10 +01:00
Lumpiasty 78a81c5b72 Add mmproj-url for Qwen3.5-35B-A3B-heretic model 2026-03-13 04:00:10 +01:00
Lumpiasty 2bb23c4ed0 add gemma-3-270m-it-qat model 2026-03-13 04:00:10 +01:00
Lumpiasty 8c29fc8018 Add Qwen3.5-35B-A3B-heretic models 2026-03-13 04:00:10 +01:00
Lumpiasty 2836542569 Add always loaded Qwen3-VL-2B-Instruct 2026-03-13 04:00:10 +01:00
Lumpiasty 1e68450d8a Add Qwen3.5-35-A3B model 2026-03-13 04:00:10 +01:00
Lumpiasty 2c83eb26b3 automatically fit models by llama.cpp 2026-03-13 04:00:10 +01:00
Lumpiasty ec038d7154 fix models mount 2026-03-13 04:00:10 +01:00
Lumpiasty b61e3b5c08 add schema reference to config.yaml 2026-03-13 04:00:10 +01:00
Lumpiasty 59bf4a1aa6 configure llama-swap to log llama.cpp output 2026-03-13 04:00:10 +01:00
Lumpiasty 63a8e2f7ac add Qwen3-Coder-Next model 2026-03-13 04:00:10 +01:00
Lumpiasty 1ddef7951a update llama-swap image 2026-03-13 04:00:10 +01:00
Lumpiasty 9f55d67ffa migrate llama models to ssd 2026-03-13 04:00:10 +01:00
Lumpiasty 3ffadc8628 add ssd volume for llama models 2026-03-13 04:00:10 +01:00
Lumpiasty a3f30873f9 switch llama models dir to lvm hdd 2026-03-13 04:00:09 +01:00
Lumpiasty 96e5202e6d add lvm hdd llama models pvc 2026-03-13 04:00:09 +01:00
Lumpiasty 9032060930 add abliterated versions of qwen3-vl 2026-03-13 04:00:08 +01:00
Lumpiasty f13c3ae3e7 Add 8B and 2B variants of qwen3-vl 2026-03-13 04:00:08 +01:00
Lumpiasty 669beccc35 fix Qwen3-VL-4B-Instruct-GGUF models looping issue 2026-03-13 04:00:08 +01:00
Lumpiasty 5eb7b7bb0c add qwen3-vl thinking variant 2026-03-13 04:00:08 +01:00
Lumpiasty 0b677d0faf add qwen3-vl, fix librechat taking over settings and clean up llama config 2026-03-13 04:00:08 +01:00
Lumpiasty e3325670de fix cache location after llama-swap update 2026-03-13 04:00:08 +01:00
Lumpiasty b9200d3a4c update llama-swap 2026-03-13 04:00:08 +01:00
Lumpiasty 8063cbaf80 update llama-swap docker image 2026-03-13 04:00:08 +01:00
Renovate 77ebe2cc89 Update caddy Docker tag to v2.10.2 2026-03-13 04:00:08 +01:00
Lumpiasty 9544f4719f Add Qwen2.5-VL models 2026-03-13 04:00:07 +01:00
Lumpiasty eb4ac7acf4 add qwen3-4b-2507 model 2026-03-13 04:00:07 +01:00
Lumpiasty 60fafe2a91 move all ingresses to new nginx ingress 2026-03-13 04:00:07 +01:00
Lumpiasty feaf805208 update llama-swap 2026-03-13 04:00:07 +01:00
Lumpiasty 93855dc712 llama automatic unloading and longer start timeout 2026-03-13 04:00:07 +01:00
Lumpiasty 241dce4524 disable warmups 2026-03-13 04:00:07 +01:00
Lumpiasty 17805e6b31 add gemma3 model 2026-03-13 04:00:07 +01:00
Lumpiasty 6f3e612dde move llama models to ssd 2026-03-13 04:00:07 +01:00
Lumpiasty 32eea7c3af add gemma3n 2026-03-13 04:00:07 +01:00
Lumpiasty de3ef465f0 add qwen3 no thinking 2026-03-13 04:00:07 +01:00
Lumpiasty fc8860f89a increase context size 2026-03-13 04:00:07 +01:00
Lumpiasty 869cc79898 add qwen3 2026-03-13 04:00:07 +01:00
Lumpiasty 5813db75dc gpu offload in llama.cpp 2026-03-13 04:00:07 +01:00
Lumpiasty af6545444b llama-swap 2026-03-13 04:00:07 +01:00