LocalAI/core
Ettore Di Giacinto c89271b2e4
feat(llama.cpp): add distributed llama.cpp inferencing (#2324)
* feat(llama.cpp): support distributed llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: allow tweaking how chat messages are merged together

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Makefile: register to ALL_GRPC_BACKENDS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring, allow disabling auto-detection of backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* minor fixups

Signed-off-by: mudler <mudler@localai.io>

* feat: add cmd to start rpc-server from llama.cpp

Signed-off-by: mudler <mudler@localai.io>

* ci: add ccache

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-15 01:17:02 +02:00
backend feat(llama.cpp): add flash_attention and no_kv_offloading (#2310) 2024-05-13 19:07:51 +02:00
cli feat(llama.cpp): add distributed llama.cpp inferencing (#2324) 2024-05-15 01:17:02 +02:00
clients feat(store): add Golang client (#1977) 2024-04-16 15:54:14 +02:00
config feat(llama.cpp): add distributed llama.cpp inferencing (#2324) 2024-05-15 01:17:02 +02:00
http feat(llama.cpp): add distributed llama.cpp inferencing (#2324) 2024-05-15 01:17:02 +02:00
schema feat(grammar): support models with specific construct (#2291) 2024-05-12 01:13:22 +02:00
services feat(webui): ux improvements (#2247) 2024-05-07 01:17:07 +02:00
startup feat(startup): show CPU/GPU information with --debug (#2241) 2024-05-05 09:10:23 +02:00
application.go refactor(application): introduce application global state (#2072) 2024-04-29 17:42:37 +00:00