Commit Graph

23 Commits

SHA1 Message Date
b21f8e402b add abliterated versions of qwen3-vl 2025-12-06 23:33:56 +01:00
65e75a4d39 Add 8B and 2B variants of qwen3-vl 2025-11-15 22:21:10 +01:00
6c7457d095 fix Qwen3-VL-4B-Instruct-GGUF models looping issue 2025-11-15 20:40:27 +01:00
9b556e98a9 add qwen3-vl thinking variant 2025-11-15 19:31:53 +01:00
202ebc7b86 add qwen3-vl, fix librechat taking over settings and clean up llama config 2025-11-15 19:18:43 +01:00
ec61023f74 fix cache location after llama-swap update 2025-11-15 18:05:12 +01:00
05d3493bb7 update llama-swap 2025-11-15 17:57:46 +01:00
f4a865ce7a update llama-swap docker image 2025-10-19 20:38:39 +02:00
c0f9670837 Update caddy Docker tag to v2.10.2 2025-10-19 18:18:35 +00:00
708ffe203c Add Qwen2.5-VL models 2025-09-13 02:42:21 +02:00
9c61d47fda add qwen3-4b-2507 model 2025-08-18 02:50:46 +02:00
444c4faf96 move all ingresses to new nginx ingress 2025-08-03 18:17:37 +02:00
a26a351396 update llama-swap 2025-08-03 17:16:25 +02:00
c4628523bc llama automatic unloading and longer start timeout 2025-07-29 02:31:39 +02:00
071e87ee44 disable warmups 2025-07-29 02:24:14 +02:00
9e17aadb56 add gemma3 model 2025-07-29 02:22:52 +02:00
0fde3108d6 move llama models to ssd 2025-07-26 17:54:23 +02:00
9765f1cf86 add gemma3n 2025-07-23 23:46:44 +02:00
5f3a00b382 add qwen3 no thinking 2025-07-23 22:56:52 +02:00
b379c181f2 increase context size 2025-07-23 22:06:45 +02:00
e1801347f2 add qwen3 2025-07-23 20:15:37 +02:00
d53db88fd2 gpu offload in llama.cpp 2025-07-23 19:55:48 +02:00
18eb912f03 llama-swap 2025-07-23 00:18:45 +02:00