Commit history:

f4a865ce7a  2025-10-19 20:38:39 +02:00  update llama-swap docker image
c0f9670837  2025-10-19 18:18:35 +00:00  Update caddy Docker tag to v2.10.2
708ffe203c  2025-09-13 02:42:21 +02:00  Add Qwen2.5-VL models
9c61d47fda  2025-08-18 02:50:46 +02:00  add qwen3-4b-2507 model
444c4faf96  2025-08-03 18:17:37 +02:00  move all ingresses to new nginx ingress
a26a351396  2025-08-03 17:16:25 +02:00  update llama-swap
c4628523bc  2025-07-29 02:31:39 +02:00  llama automatic unloading and longer start timeout
071e87ee44  2025-07-29 02:24:14 +02:00  disable warmups
9e17aadb56  2025-07-29 02:22:52 +02:00  add gemma3 model
0fde3108d6  2025-07-26 17:54:23 +02:00  move llama models to ssd
9765f1cf86  2025-07-23 23:46:44 +02:00  add gemma3n
5f3a00b382  2025-07-23 22:56:52 +02:00  add qwen3 no thinking
b379c181f2  2025-07-23 22:06:45 +02:00  increase context size
e1801347f2  2025-07-23 20:15:37 +02:00  add qwen3
d53db88fd2  2025-07-23 19:55:48 +02:00  gpu offload in llama.cpp
18eb912f03  2025-07-23 00:18:45 +02:00  llama-swap