|
|
a0814e76ee
|
increase pvc for llama to 300 Gi
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
|
2026-04-04 22:49:26 +02:00 |
|
|
|
8160a52176
|
add gemma 4 models
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
|
2026-04-04 02:48:02 +02:00 |
|
|
|
ad3b2229c2
|
get rid of openrouter proxying via llama-swap
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
|
2026-04-04 02:39:26 +02:00 |
|
|
|
e923fc3c30
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8637
|
2026-04-04 00:00:54 +00:00 |
|
|
|
4e30c9b94d
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8606
|
2026-04-03 00:00:32 +00:00 |
|
|
|
3d53b4b10b
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8589
|
2026-04-02 00:00:30 +00:00 |
|
|
|
054df42d8b
|
update qwen3.5 4b ctx size to 128k
|
2026-03-30 21:05:00 +02:00 |
|
|
|
e485a4fc7f
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8576
|
2026-03-30 00:00:49 +00:00 |
|
|
|
9e74ed6a19
|
increase --fit-target to 1.5GB
|
2026-03-29 23:50:45 +02:00 |
|
|
|
99bc04b76a
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8562
|
2026-03-29 00:00:50 +00:00 |
|
|
|
cb53301926
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199-vulkan-b8547
|
2026-03-27 17:42:04 +00:00 |
|
|
|
66cb3c9d82
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v199
|
2026-03-27 00:00:28 +00:00 |
|
|
|
9a1fe1f740
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8508
|
2026-03-26 00:00:49 +00:00 |
|
|
|
8cf02fea0e
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8496
|
2026-03-25 00:00:29 +00:00 |
|
|
|
1d85bf3a88
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8477
|
2026-03-24 00:00:39 +00:00 |
|
|
|
bfede17c87
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8468
|
2026-03-23 00:00:21 +00:00 |
|
|
|
471c0ba62d
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8461
|
2026-03-22 00:00:23 +00:00 |
|
|
|
8717526358
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8445
|
2026-03-20 22:31:36 +00:00 |
|
|
|
ce0b13ebb3
|
change kv cache quant to q8_0
|
2026-03-20 00:57:39 +01:00 |
|
|
|
73d6d1f15a
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8400
|
2026-03-19 00:00:34 +00:00 |
|
|
|
8d994e7aa1
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8390
|
2026-03-18 00:00:28 +00:00 |
|
|
|
7e7b3e3d71
|
add max ctx on llama.cpp
|
2026-03-17 01:33:35 +01:00 |
|
|
|
82864a4738
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8369
|
2026-03-17 00:00:58 +00:00 |
|
|
|
79315d32db
|
add GLM-4.7-Flash model
|
2026-03-16 18:19:28 +01:00 |
|
|
|
afbcea4e82
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198-vulkan-b8352
|
2026-03-15 17:40:26 +00:00 |
|
|
|
4b4cec10be
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v198
|
2026-03-15 00:00:34 +00:00 |
|
|
|
2d295d24e0
|
add 27b q3 variant of qwen3.5
|
2026-03-13 04:00:10 +01:00 |
|
|
|
e8efa9ddc1
|
lower kv cache quant to q4_0 and increase ctx to 64k
|
2026-03-13 04:00:10 +01:00 |
|
|
|
c88dd2899a
|
remove ttl of all models in llama-swap
|
2026-03-13 04:00:10 +01:00 |
|
|
|
f219abb74f
|
chore(deps): update ghcr.io/mostlygeek/llama-swap docker tag to v197-vulkan-b8248
|
2026-03-13 04:00:10 +01:00 |
|
|
|
0130991c74
|
refactor: move llama-swap package config to renovate.json
|
2026-03-13 04:00:10 +01:00 |
|
|
|
966d2c50c0
|
update renovate comment for llama-swap image tag management
|
2026-03-13 04:00:10 +01:00 |
|
|
|
af737ab82b
|
Update caddy Docker tag to v2.11.2
|
2026-03-13 04:00:10 +01:00 |
|
|
|
39fc38d62b
|
add qwen3.5 4b heretic
|
2026-03-13 04:00:10 +01:00 |
|
|
|
e72a79be8f
|
add glm-5 from openrouter to llama-swap
|
2026-03-13 04:00:10 +01:00 |
|
|
|
4fda343b01
|
clean up llama-swap config
|
2026-03-13 04:00:10 +01:00 |
|
|
|
266ced7362
|
adjust parameters of qwen3-coder-next
|
2026-03-13 04:00:10 +01:00 |
|
|
|
8a074839b1
|
automatically fit context on qwen3.5 2b and 4b
|
2026-03-13 04:00:10 +01:00 |
|
|
|
42038207fc
|
Add Q3_K_M variant of Qwen3.5-9B
|
2026-03-13 04:00:10 +01:00 |
|
|
|
28cb53c031
|
fix thinking versions of Qwen3.5 small
|
2026-03-13 04:00:10 +01:00 |
|
|
|
88a73cbb41
|
set strategy to recreate on llama-swap deployment
|
2026-03-13 04:00:10 +01:00 |
|
|
|
46a7e24932
|
add 2B, 4B, 9B versions of Qwen3.5 in thinking + nonthinking variants
|
2026-03-13 04:00:10 +01:00 |
|
|
|
cd7ebac6b9
|
increase target margin of 2048MB of VRAM
|
2026-03-13 04:00:10 +01:00 |
|
|
|
ba9db6ce41
|
add Qwen3.5 Small 0.8B model and replace Qwen3-VL-2B as task model
|
2026-03-13 04:00:10 +01:00 |
|
|
|
6dd9a717e2
|
shorten context for qwen3-vl-2b and lower kv cache quant
|
2026-03-13 04:00:10 +01:00 |
|
|
|
c67b6f7ebe
|
add path to mmproj in qwen3.5 heretic
|
2026-03-13 04:00:10 +01:00 |
|
|
|
8d7cf402fd
|
manually update llama-swap image tag
|
2026-03-13 04:00:10 +01:00 |
|
|
|
b22498c60f
|
Update caddy Docker tag to v2.11.1
|
2026-03-13 04:00:10 +01:00 |
|
|
|
78a81c5b72
|
Add mmproj-url for Qwen3.5-35B-A3B-heretic model
|
2026-03-13 04:00:10 +01:00 |
|
|
|
2bb23c4ed0
|
add gemma-3-270m-it-qat model
|
2026-03-13 04:00:10 +01:00 |
|