LocalAI

Mirror of https://github.com/mudler/LocalAI.git, synced 2024-06-07 19:40:48 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	17cf6c4a4d	feat(amdgpu): try to build in single binary (#2485 ) * feat(amdgpu): try to build in single binary Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Release space from worker Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-05 08:44:15 +02:00
Ettore Di Giacinto	c603b95ac7	ci: pin build-time protoc (#2461 ) ci: pin protoc Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-01 18:59:15 +02:00
Ettore Di Giacinto	10c64dbb55	models(gallery): add mopeymule (#2449 ) * models(gallery): add mopeymule Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: try to fix workflow Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-31 18:08:39 +02:00
Ettore Di Giacinto	2bbc52fcc8	feat(build): add arm64 core containers (#2421 ) ci: add arm64 container images Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-28 10:34:59 +02:00
Ettore Di Giacinto	d075dc44dd	ci: push test images when building PRs (#2424 ) ci: try to push image Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-27 22:07:35 +02:00
Ettore Di Giacinto	be8ffbdfcf	ci(grpc-cache): also arm64 (#2423 ) grpc-cache: also arm64 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-27 17:23:34 +02:00
Sertaç Özercan	29615576fb	ci: fix sd release (#2400 ) Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-05-25 09:33:50 +02:00
Ettore Di Giacinto	e0187c2a1a	ci: do not tag latest on AIO automatically Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-24 09:41:13 +02:00
Sertaç Özercan	7efa8e75d4	fix: stablediffusion binary (#2385 ) Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-05-23 08:34:37 +02:00
Ettore Di Giacinto	7551369abe	Update checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-23 08:33:58 +02:00
Ettore Di Giacinto	21a12c2cdd	ci(checksum_checker): do get sha from hf API when available (#2380 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-22 23:51:02 +02:00
Ettore Di Giacinto	371d0cc1f7	ci: generate specific image for intel builds (#2374 ) ci: fix intel images until are fixed upstream Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-22 23:35:39 +02:00
Ettore Di Giacinto	1a3dedece0	dependencies(grpcio): bump to fix CI issues (#2362 ) feat(grpcio): bump to fix CI issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-21 14:33:47 +02:00
Ettore Di Giacinto	fdb45153fe	feat(llama.cpp): Totally decentralized, private, distributed, p2p inference (#2343 ) * feat(llama.cpp): Enable decentralized, distributed inference As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to @rgerganov implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, now it is possible to distribute the workload to remote llama.cpp gRPC server. This changeset now uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token. The token is generated automatically when starting the server with the `--p2p` flag, and can be used by starting the workers with `local-ai worker p2p-llama-cpp-rpc` by passing the token via environment variable (TOKEN) or with args (--token). As per how mudler/edgevpn works, a network is established between the server and the workers with dht and mdns discovery protocols, the llama.cpp rpc server is automatically started and exposed to the underlying p2p network so the API server can connect on. When the HTTP server is started, it will discover the workers in the network and automatically create the port-forwards to the service locally. Then llama.cpp is configured to use the services. This feature is behind the "p2p" GO_FLAGS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * go mod tidy Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: add p2p tag Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * better message Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 19:17:59 +02:00
Ettore Di Giacinto	8ad669339e	add openvoice backend (#2334 ) Wip openvoice	2024-05-19 16:27:08 +02:00
Ettore Di Giacinto	c89271b2e4	feat(llama.cpp): add distributed llama.cpp inferencing (#2324 ) * feat(llama.cpp): support distributed llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: let tweak how chat messages are merged together Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Makefile: register to ALL_GRPC_BACKENDS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring, allow disable auto-detection of backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * minor fixups Signed-off-by: mudler <mudler@localai.io> * feat: add cmd to start rpc-server from llama.cpp Signed-off-by: mudler <mudler@localai.io> * ci: add ccache Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: mudler <mudler@localai.io>	2024-05-15 01:17:02 +02:00
Sertaç Özercan	a670318a9f	feat: auto select llama-cpp cuda runtime (#2306 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * auto select cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * update test Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * select CUDA backend only if present Signed-off-by: mudler <mudler@localai.io> * ci: keep cuda bin in path Signed-off-by: mudler <mudler@localai.io> * Makefile: make dist now builds also cuda Signed-off-by: mudler <mudler@localai.io> * Keep pushing fallback in case auto-flagset/nvidia fails There could be other reasons for which the default binary may fail. For example we might have detected an Nvidia GPU, however the user might not have the drivers/cuda libraries installed in the system, and so it would fail to start. We keep the fallback of llama.cpp at the end of the llama.cpp backends to try to fallback loading in case things go wrong Signed-off-by: mudler <mudler@localai.io> * Do not build cuda on MacOS Signed-off-by: mudler <mudler@localai.io> * cleanup Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * Apply suggestions from code review Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: mudler <mudler@localai.io>	2024-05-14 19:40:18 +02:00
cryptk	28a421cb1d	feat: migrate python backends from conda to uv (#2215 ) * feat: migrate diffusers backend from conda to uv - replace conda with UV for diffusers install (prototype for all extras backends) - add ability to build docker with one/some/all extras backends instead of all or nothing Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate autogtpq bark coqui from conda to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: convert exllama over to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate exllama2 to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate mamba to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate parler to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate petals to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: fix tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate rerankers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate sentencetransformers to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: install uv for tests-linux Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: make sure file exists before installing on intel images Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers backend to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate transformers-musicgen to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vall-e-x to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: migrate vllm to uv Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add uv install to the rest of test-extra.yml Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust file perms on all install/run/test scripts Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add missing acclerate dependencies Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add some more missing dependencies to python backends Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: parler tests venv py dir fix Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: correct filename for transformers-musicgen tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: adjust the pwd for valle tests Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: cleanup and optimization work for uv migration Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add setuptools to requirements-install for mamba Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: more size optimization work Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: make installs and tests more consistent, cleanup some deps Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: cleanup Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: mamba backend is cublas only Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: uncomment lines in makefile Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-10 15:08:08 +02:00
Ettore Di Giacinto	650ae620c5	ci: get latest git version Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 11:33:16 +02:00
Ettore Di Giacinto	6a209cbef6	ci: get file name correctly in checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 10:57:23 +02:00
Ettore Di Giacinto	9786bb826d	ci: try to fix checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 09:34:07 +02:00
Ettore Di Giacinto	9b4c6f348a	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:57:22 +02:00
Ettore Di Giacinto	cb6ddb21ec	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:55:48 +02:00
Ettore Di Giacinto	0baacca605	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:54:35 +02:00
Ettore Di Giacinto	222d714ec7	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:51:57 +02:00
Ettore Di Giacinto	fd2d89d37b	Update checksum_checker.sh Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:43:16 +02:00
Ettore Di Giacinto	6440b608dc	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:42:48 +02:00
Ettore Di Giacinto	1937118eab	Update checksum_checker.yaml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-09 00:34:56 +02:00
Ettore Di Giacinto	bc272d1e4b	ci: add checksum checker pipeline (#2274 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-09 00:31:27 +02:00
Ettore Di Giacinto	c5798500cb	feat(single-build): generate single binaries for releases (#2246 ) * feat(single-build): generate single binaries for releases Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * drop old targets Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-05 17:20:51 +02:00
cryptk	a0aa5d01a1	feat: update ROCM and use smaller image (#2196 ) * feat: update ROCM and use smaller image Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add call to ldconfig to fix AMDs broken library packages Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-03 18:46:49 +02:00
cryptk	f7aabf1b50	fix: bring everything onto the same GRPC version to fix tests (#2199 ) fix: more places where we are installing grpc that need a version specified fix: attempt to fix metal tests fix: metal/brew is forcing an update, they don't have 1.58 available anymore Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-30 19:12:15 +00:00
dependabot[bot]	53c3842bc2	build(deps): bump dependabot/fetch-metadata from 2.0.0 to 2.1.0 (#2186 ) Bumps [dependabot/fetch-metadata](https://github.com/dependabot/fetch-metadata) from 2.0.0 to 2.1.0. - [Release notes](https://github.com/dependabot/fetch-metadata/releases) - [Commits](https://github.com/dependabot/fetch-metadata/compare/v2.0.0...v2.1.0) --- updated-dependencies: - dependency-name: dependabot/fetch-metadata dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-29 21:12:37 +00:00
Dave	982dc6a2bd	fix: github bump_docs.sh regex to drop emoji and other text (#2180 ) fix: bump_docs regex Signed-off-by: Dave Lee <dave@gray101.com>	2024-04-29 03:55:29 +00:00
cryptk	987b7ad42d	feat: only keep the build artifacts from the grpc build (#2172 ) * feat: only keep the build artifacts from the grpc build Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: remove separate Cache GRPC build step Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: remove docker inspect step, it is leftover from previous debugging Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-28 19:24:16 +00:00
Ettore Di Giacinto	7e6bf6e7a1	ci: add auto-label rule for gallery in labeler.yml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-27 19:52:26 +02:00
cryptk	9fc0135991	feat: cleanup Dockerfile and make final image a little smaller (#2146 ) * feat: cleanup Dockerfile and make final image a little smaller Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add build-essential to final stage Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: more GRPC cache misses Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: correct for another cause of GRPC cache misses Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: generate new GRPC cache automatically if needed Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: use new GRPC_MAKEFLAGS build arg in GRPC cache generation Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-27 19:48:20 +02:00
fakezeta	c9451cb604	Bump oneapi-basekit, optimum and openvino (#2139 ) * Bump oneapi-basekit, optimum and openvino * Changed PERFORMANCE HINT to CUMULATIVE_THROUGHPUT Minor latency change for first token but about 10-15% speedup on token generation.	2024-04-26 16:20:43 +02:00
Ettore Di Giacinto	5d170e9264	Update yaml-check.yml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-25 16:05:02 +02:00
Ettore Di Giacinto	1b0a64aa46	Update yaml-check.yml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-25 15:57:06 +02:00
Ettore Di Giacinto	aa8e1c63d5	Create yaml-check.yml Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-25 15:52:52 +02:00
Ettore Di Giacinto	60690c9fc4	ci: add swagger pipeline Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-04-25 15:11:01 +02:00
Ettore Di Giacinto	b664edde29	feat(rerankers): Add new backend, support jina rerankers API (#2121 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-04-25 00:19:02 +02:00
Dave	228bc4903f	fix: action-tmate detached (#2092 ) connect-timeout-seconds works best with `detached: true` Signed-off-by: Dave <dave@gray101.com>	2024-04-21 22:39:17 +02:00
Dave	1038f7469c	fix: action-tmate: use connect-timeout-sections and limit-access-to-actor (#2083 ) fix for action-tmate: connect-timeout-sections and limit-access-to-actor Signed-off-by: Dave Lee <dave@gray101.com>	2024-04-20 08:42:02 +00:00
cryptk	852316c5a6	fix: move the GRPC cache generation workflow into it's own concurrency group (#2071 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-18 20:52:34 -04:00
cryptk	13012cfa70	feat: better control of GRPC docker cache (#2070 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-18 16:19:36 -04:00
Ettore Di Giacinto	af9e5a2d05	Revert #1963 (#2056 ) * Revert "fix(fncall): fix regression introduced in #1963 (#2048)" This reverts commit `6b06d4e0af`. * Revert "fix: action-tmate back to upstream, dead code removal (#2038)" This reverts commit `fdec8a9d00`. * Revert "feat(grpc): return consumed token count and update response accordingly (#2035)" This reverts commit `e843d7df0e`. * Revert "refactor: backend/service split, channel-based llm flow (#1963)" This reverts commit `eed5706994`. * feat(grpc): return consumed token count and update response accordingly Fixes: #1920 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-04-17 23:33:49 +02:00
Dave	fdec8a9d00	fix: action-tmate back to upstream, dead code removal (#2038 ) cleanup: upstream action-tmate has taken my PR, drop master reference. Also remove dead code from api.go Signed-off-by: Dave Lee <dave@gray101.com>	2024-04-16 01:46:36 +00:00
dependabot[bot]	320d8a48d9	build(deps): bump github/codeql-action from 2 to 3 (#2041 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/v2...v3) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-15 22:02:44 +00:00


185 Commits