* wip: llama.cpp c++ gRPC server
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* make it work, attach it to the build process
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* update deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: add protobuf dep
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* try fix protobuf on cmake
* cmake: workarounds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add packages
* cmake: use fixed version of grpc
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* cmake(grpc): install locally
* install grpc
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* install required deps for grpc on debian bullseye
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* debug
* debug
* Fixups
* no need to install cmake manually
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: fixup macOS
* use brew whenever possible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* macOS fixups
* debug
* fix container build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* workaround
* try mac
https://stackoverflow.com/questions/23905661/on-mac-g-clang-fails-to-search-usr-local-include-and-usr-local-lib-by-def
* Temporarily disable arm64 docker image builds
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
**Description**
This PR syncs up the `llama` backend to use `gguf`
(https://github.com/go-skynet/go-llama.cpp/pull/180). It also adds
`llama-stable` to the targets so we can still load ggml. It adapts the
current tests to use the `llama-backend` for ggml and uses a `gguf`
model to run tests on the new backend.
To consume the new version of go-llama.cpp, it also bumps Go to
1.21 (images, pipelines, etc.).
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>