Commit Graph

73 Commits

SHA1 Message Date
9032060930 add abliterated versions of qwen3-vl 2026-03-13 04:00:08 +01:00
f13c3ae3e7 Add 8B and 2B variants of qwen3-vl 2026-03-13 04:00:08 +01:00
669beccc35 fix Qwen3-VL-4B-Instruct-GGUF models looping issue 2026-03-13 04:00:08 +01:00
5eb7b7bb0c add qwen3-vl thinking variant 2026-03-13 04:00:08 +01:00
0b677d0faf add qwen3-vl, fix librechat taking over settings and clean up llama config 2026-03-13 04:00:08 +01:00
e3325670de fix cache location after llama-swap update 2026-03-13 04:00:08 +01:00
b9200d3a4c update llama-swap 2026-03-13 04:00:08 +01:00
8063cbaf80 update llama-swap docker image 2026-03-13 04:00:08 +01:00
77ebe2cc89 Update caddy Docker tag to v2.10.2 2026-03-13 04:00:08 +01:00
9544f4719f Add Qwen2.5-VL models 2026-03-13 04:00:07 +01:00
eb4ac7acf4 add qwen3-4b-2507 model 2026-03-13 04:00:07 +01:00
60fafe2a91 move all ingresses to new nginx ingress 2026-03-13 04:00:07 +01:00
feaf805208 update llama-swap 2026-03-13 04:00:07 +01:00
93855dc712 llama automatic unloading and longer start timeout 2026-03-13 04:00:07 +01:00
241dce4524 disable warmups 2026-03-13 04:00:07 +01:00
17805e6b31 add gemma3 model 2026-03-13 04:00:07 +01:00
6f3e612dde move llama models to ssd 2026-03-13 04:00:07 +01:00
32eea7c3af add gemma3n 2026-03-13 04:00:07 +01:00
de3ef465f0 add qwen3 no thinking 2026-03-13 04:00:07 +01:00
fc8860f89a increase context size 2026-03-13 04:00:07 +01:00
869cc79898 add qwen3 2026-03-13 04:00:07 +01:00
5813db75dc gpu offload in llama.cpp 2026-03-13 04:00:07 +01:00
af6545444b llama-swap 2026-03-13 04:00:07 +01:00