| Commit | Message | Date |
|---|---|---|
| 9032060930 | add abliterated versions of qwen3-vl | 2026-03-13 04:00:08 +01:00 |
| f13c3ae3e7 | Add 8B and 2B variants of qwen3-vl | 2026-03-13 04:00:08 +01:00 |
| 669beccc35 | fix Qwen3-VL-4B-Instruct-GGUF models looping issue | 2026-03-13 04:00:08 +01:00 |
| 5eb7b7bb0c | add qwen3-vl thinking variant | 2026-03-13 04:00:08 +01:00 |
| 0b677d0faf | add qwen3-vl, fix librechat taking over settings and clean up llama config | 2026-03-13 04:00:08 +01:00 |
| e3325670de | fix cache location after llama-swap update | 2026-03-13 04:00:08 +01:00 |
| b9200d3a4c | update llama-swap | 2026-03-13 04:00:08 +01:00 |
| 8063cbaf80 | update llama-swap docker image | 2026-03-13 04:00:08 +01:00 |
| 77ebe2cc89 | Update caddy Docker tag to v2.10.2 | 2026-03-13 04:00:08 +01:00 |
| 9544f4719f | Add Qwen2.5-VL models | 2026-03-13 04:00:07 +01:00 |
| eb4ac7acf4 | add qwen3-4b-2507 model | 2026-03-13 04:00:07 +01:00 |
| 60fafe2a91 | move all ingresses to new nginx ingress | 2026-03-13 04:00:07 +01:00 |
| feaf805208 | update llama-swap | 2026-03-13 04:00:07 +01:00 |
| 93855dc712 | llama automatic unloading and longer start timeout | 2026-03-13 04:00:07 +01:00 |
| 241dce4524 | disable warmups | 2026-03-13 04:00:07 +01:00 |
| 17805e6b31 | add gemma3 model | 2026-03-13 04:00:07 +01:00 |
| 6f3e612dde | move llama models to ssd | 2026-03-13 04:00:07 +01:00 |
| 32eea7c3af | add gemma3n | 2026-03-13 04:00:07 +01:00 |
| de3ef465f0 | add qwen3 no thinking | 2026-03-13 04:00:07 +01:00 |
| fc8860f89a | increase context size | 2026-03-13 04:00:07 +01:00 |
| 869cc79898 | add qwen3 | 2026-03-13 04:00:07 +01:00 |
| 5813db75dc | gpu offload in llama.cpp | 2026-03-13 04:00:07 +01:00 |
| af6545444b | llama-swap | 2026-03-13 04:00:07 +01:00 |