Commit Graph

13 Commits

SHA1 Message Date
eb4ac7acf4 add qwen3-4b-2507 model 2026-03-13 04:00:07 +01:00
60fafe2a91 move all ingresses to new nginx ingress 2026-03-13 04:00:07 +01:00
feaf805208 update llama-swap 2026-03-13 04:00:07 +01:00
93855dc712 llama automatic unloading and longer start timeout 2026-03-13 04:00:07 +01:00
241dce4524 disable warmups 2026-03-13 04:00:07 +01:00
17805e6b31 add gemma3 model 2026-03-13 04:00:07 +01:00
6f3e612dde move llama models to ssd 2026-03-13 04:00:07 +01:00
32eea7c3af add gemma3n 2026-03-13 04:00:07 +01:00
de3ef465f0 add qwen3 no thinking 2026-03-13 04:00:07 +01:00
fc8860f89a increase context size 2026-03-13 04:00:07 +01:00
869cc79898 add qwen3 2026-03-13 04:00:07 +01:00
5813db75dc gpu offload in llama.cpp 2026-03-13 04:00:07 +01:00
af6545444b llama-swap 2026-03-13 04:00:07 +01:00