LocalAI/backend/cpp
Latest commit: e49ea0123b — feat(llama.cpp): add flash_attention and no_kv_offloading (#2310)
Author: Ettore Di Giacinto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Date: 2024-05-13 19:07:51 +02:00
grpc/   fix: respect concurrency from parent build parameters when building GRPC (#2023)   2024-04-13 09:14:32 +02:00
llama/  feat(llama.cpp): add flash_attention and no_kv_offloading (#2310)                   2024-05-13 19:07:51 +02:00