Commit Graph

44 Commits

Author SHA1 Message Date
Lumpiasty ae7f53c395 add bonsai model (ci/woodpecker/push/flux-reconcile-source: successful; ci/woodpecker/cron/renovate: successful) 2026-04-14 18:31:18 +02:00
Lumpiasty 8160a52176 add gemma 4 models (ci/woodpecker/push/flux-reconcile-source: successful) 2026-04-04 02:48:02 +02:00
Lumpiasty ad3b2229c2 get rid of openrouter proxying via llama-swap (ci/woodpecker/push/flux-reconcile-source: successful) 2026-04-04 02:39:26 +02:00
Lumpiasty 054df42d8b update qwen3.5 4b ctx size to 128k 2026-03-30 21:05:00 +02:00
Lumpiasty 9e74ed6a19 increase --fit-target to 1.5GB 2026-03-29 23:50:45 +02:00
Lumpiasty ce0b13ebb3 change kv cache quant to q8_0 2026-03-20 00:57:39 +01:00
Lumpiasty 7e7b3e3d71 add max ctx on llama.cpp 2026-03-17 01:33:35 +01:00
Lumpiasty 79315d32db add GLM-4.7-Flash model 2026-03-16 18:19:28 +01:00
Lumpiasty 2d295d24e0 add 27b q3 variant of qwen3.5 2026-03-13 04:00:10 +01:00
Lumpiasty e8efa9ddc1 lower kv cache quant to q4_0 and increase ctx to 64k 2026-03-13 04:00:10 +01:00
Lumpiasty c88dd2899a remove ttl of all models in llama-swap 2026-03-13 04:00:10 +01:00
Lumpiasty 39fc38d62b add qwen3.5 4b heretic 2026-03-13 04:00:10 +01:00
Lumpiasty e72a79be8f add glm-5 from openrouter to llama-swap 2026-03-13 04:00:10 +01:00
Lumpiasty 4fda343b01 clean up llama-swap config 2026-03-13 04:00:10 +01:00
Lumpiasty 266ced7362 adjust parameters of qwen3-coder-next 2026-03-13 04:00:10 +01:00
Lumpiasty 8a074839b1 automatically fit context on qwen3.5 2b and 4b 2026-03-13 04:00:10 +01:00
Lumpiasty 42038207fc Add Q3_K_M variant of Qwen3.5-9B 2026-03-13 04:00:10 +01:00
Lumpiasty 28cb53c031 fix thinking versions of Qwen3.5 small 2026-03-13 04:00:10 +01:00
Lumpiasty 46a7e24932 add 2B, 4B, 9B versions of Qwen3.5 in thinking + nonthinking variants 2026-03-13 04:00:10 +01:00
Lumpiasty cd7ebac6b9 increase target margin of 2048MB of VRAM 2026-03-13 04:00:10 +01:00
Lumpiasty ba9db6ce41 add Qwen3.5 Small 0.8B model and replace Qwen3-VL-2B as task model 2026-03-13 04:00:10 +01:00
Lumpiasty 6dd9a717e2 shorten context for qwen3-vl-2b and lower kv cache quant 2026-03-13 04:00:10 +01:00
Lumpiasty c67b6f7ebe add path to mmproj in qwen3.5 heretic 2026-03-13 04:00:10 +01:00
Lumpiasty 78a81c5b72 Add mmproj-url for Qwen3.5-35B-A3B-heretic model 2026-03-13 04:00:10 +01:00
Lumpiasty 2bb23c4ed0 add gemma-3-270m-it-qat model 2026-03-13 04:00:10 +01:00
Lumpiasty 8c29fc8018 Add Qwen3.5-35B-A3B-heretic models 2026-03-13 04:00:10 +01:00
Lumpiasty 2836542569 Add always loaded Qwen3-VL-2B-Instruct 2026-03-13 04:00:10 +01:00
Lumpiasty 1e68450d8a Add Qwen3.5-35-A3B model 2026-03-13 04:00:10 +01:00
Lumpiasty 2c83eb26b3 automatically fit models by llama.cpp 2026-03-13 04:00:10 +01:00
Lumpiasty b61e3b5c08 add schema reference to config.yaml 2026-03-13 04:00:10 +01:00
Lumpiasty 59bf4a1aa6 configure llama-swap to log llama.cpp output 2026-03-13 04:00:10 +01:00
Lumpiasty 63a8e2f7ac add Qwen3-Coder-Next model 2026-03-13 04:00:10 +01:00
Lumpiasty 9032060930 add abliterated versions of qwen3-vl 2026-03-13 04:00:08 +01:00
Lumpiasty f13c3ae3e7 Add 8B and 2B variants of qwen3-vl 2026-03-13 04:00:08 +01:00
Lumpiasty 669beccc35 fix Qwen3-VL-4B-Instruct-GGUF models looping issue 2026-03-13 04:00:08 +01:00
Lumpiasty 5eb7b7bb0c add qwen3-vl thinking variant 2026-03-13 04:00:08 +01:00
Lumpiasty 0b677d0faf add qwen3-vl, fix librechat taking over settings and clean up llama config 2026-03-13 04:00:08 +01:00
Lumpiasty 9544f4719f Add Qwen2.5-VL models 2026-03-13 04:00:07 +01:00
Lumpiasty eb4ac7acf4 add qwen3-4b-2507 model 2026-03-13 04:00:07 +01:00
Lumpiasty 93855dc712 llama automatic unloading and longer start timeout 2026-03-13 04:00:07 +01:00
Lumpiasty 241dce4524 disable warmups 2026-03-13 04:00:07 +01:00
Lumpiasty 17805e6b31 add gemma3 model 2026-03-13 04:00:07 +01:00
Lumpiasty 32eea7c3af add gemma3n 2026-03-13 04:00:07 +01:00
Lumpiasty de3ef465f0 add qwen3 no thinking 2026-03-13 04:00:07 +01:00