Commit Graph

57 Commits

SHA1 Message Date
6b012e01a8 update renovate comment for llama-swap image tag management 2026-03-10 12:55:03 +01:00
82029fa745 Merge pull request 'Update caddy Docker tag to v2.11.2' (#147) from renovate/caddy-2.x into fresh-start (Reviewed-on: #147) 2026-03-10 11:53:51 +00:00
2df8303905 add qwen3.5 4b heretic 2026-03-08 21:39:53 +01:00
65c11ab4ca add glm-5 from openrouter to llama-swap 2026-03-08 17:58:01 +01:00
55da75f06e clean up llama-swap config 2026-03-08 17:25:44 +01:00
ac0165cf01 adjust parameters of qwen3-coder-next 2026-03-07 22:52:49 +01:00
15989f4891 automatically fit context on qwen3.5 2b and 4b 2026-03-07 21:01:32 +01:00
1b11201ad0 Update caddy Docker tag to v2.11.2 2026-03-07 00:00:27 +00:00
a3ebc531fe Add Q3_K_M variant of Qwen3.5-9B 2026-03-06 23:21:58 +01:00
63f154293d fix thinking versions of Qwen3.5 small 2026-03-06 23:17:48 +01:00
42aa0a7263 set strategy to recreate on llama-swap deployment 2026-03-06 23:08:03 +01:00
a9b8b45328 add 2B, 4B, 9B versions of Qwen3.5 in thinking + nonthinking variants 2026-03-06 23:07:02 +01:00
3dc481bc8b increase target margin of 2048MB of VRAM 2026-03-06 02:41:34 +01:00
711c437c0a add Qwen3.5 Small 0.8B model and replace Qwen3-VL-2B as task model 2026-03-05 23:17:30 +01:00
975f1db8f5 shorten context for qwen3-vl-2b and lower kv cache quant 2026-03-05 22:42:54 +01:00
ab9ddd0f3b add path to mmproj in qwen3.5 heretic 2026-03-05 19:31:03 +01:00
3e59786c83 manually update llama-swap image tag 2026-03-05 19:27:45 +01:00
96a09ae6f9 Merge pull request 'Update caddy Docker tag to v2.11.1' (#141) from renovate/caddy-2.x into fresh-start (Reviewed-on: #141) 2026-03-02 17:26:21 +00:00
5c4535beb6 Add mmproj-url for Qwen3.5-35B-A3B-heretic model 2026-03-02 03:19:16 +01:00
44aa0c8136 add gemma-3-270m-it-qat model 2026-02-28 23:20:13 +01:00
902004f2e7 Add Qwen3.5-35B-A3B-heretic models 2026-02-28 18:33:42 +01:00
bf1f1c0b41 Add always loaded Qwen3-VL-2B-Instruct 2026-02-28 17:48:20 +01:00
5915b8dd30 Add Qwen3.5-35-A3B model 2026-02-28 15:49:59 +01:00
c14257842a automatically fit models by llama.cpp 2026-02-26 01:38:39 +01:00
d053342234 fix models mount 2026-02-26 01:25:21 +01:00
2dbd964c28 add schema reference to config.yaml 2026-02-26 00:43:16 +01:00
7712aac0f5 configure llama-swap to log llama.cpp output 2026-02-26 00:39:58 +01:00
c7bc79f574 add Qwen3-Coder-Next model 2026-02-26 00:10:53 +01:00
6cba277b9d update llama-swap image 2026-02-25 19:07:10 +01:00
bfb089aeff migrate llama models to ssd 2026-02-25 16:03:12 +01:00
ed83a66a83 add ssd volume for llama models 2026-02-25 15:43:42 +01:00
b4ba66dc18 Update caddy Docker tag to v2.11.1 2026-02-24 00:00:41 +00:00
b95c9e7c69 switch llama models dir to lvm hdd 2026-02-21 16:51:04 +01:00
05c28d0d46 add lvm hdd llama models pvc 2026-02-21 16:28:06 +01:00
b21f8e402b add abliterated versions of qwen3-vl 2025-12-06 23:33:56 +01:00
65e75a4d39 Add 8B and 2B variants of qwen3-vl 2025-11-15 22:21:10 +01:00
6c7457d095 fix Qwen3-VL-4B-Instruct-GGUF models looping issue 2025-11-15 20:40:27 +01:00
9b556e98a9 add qwen3-vl thinking variant 2025-11-15 19:31:53 +01:00
202ebc7b86 add qwen3-vl, fix librechat taking over settings and clean up llama config 2025-11-15 19:18:43 +01:00
ec61023f74 fix cache location after llama-swap update 2025-11-15 18:05:12 +01:00
05d3493bb7 update llama-swap 2025-11-15 17:57:46 +01:00
f4a865ce7a update llama-swap docker image 2025-10-19 20:38:39 +02:00
c0f9670837 Update caddy Docker tag to v2.10.2 2025-10-19 18:18:35 +00:00
708ffe203c Add Qwen2.5-VL models 2025-09-13 02:42:21 +02:00
9c61d47fda add qwen3-4b-2507 model 2025-08-18 02:50:46 +02:00
444c4faf96 move all ingresses to new nginx ingress 2025-08-03 18:17:37 +02:00
a26a351396 update llama-swap 2025-08-03 17:16:25 +02:00
c4628523bc llama automatic unloading and longer start timeout 2025-07-29 02:31:39 +02:00
071e87ee44 disable warmups 2025-07-29 02:24:14 +02:00
9e17aadb56 add gemma3 model 2025-07-29 02:22:52 +02:00