Commit Graph

98 Commits

Author SHA1 Message Date
Ettore Di Giacinto
c603b95ac7
ci: pin build-time protoc (#2461)
ci: pin protoc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-01 18:59:15 +02:00
Ettore Di Giacinto
2bbc52fcc8
feat(build): add arm64 core containers (#2421)
ci: add arm64 container images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-28 10:34:59 +02:00
Ettore Di Giacinto
9f5c274321
feat(images): do not install python deps in the core image (#2425)
do not install python deps in the core image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-27 22:07:48 +02:00
Sertaç Özercan
3200a6655e
fix: gpu fetch device info (#2403)
* fix: gpu fetch device info

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

* use pciutils package

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>

---------

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-05-26 09:56:06 +02:00
Ettore Di Giacinto
371d0cc1f7
ci: generate specific image for intel builds (#2374)
ci: fix intel images until they are fixed upstream

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-22 23:35:39 +02:00
Ettore Di Giacinto
f91e4e5c03
ci: correctly build p2p in GO_TAGS (#2369)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-22 10:15:36 +02:00
Ettore Di Giacinto
fdb45153fe
feat(llama.cpp): Totally decentralized, private, distributed, p2p inference (#2343)
* feat(llama.cpp): Enable decentralized, distributed inference

https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing, thanks to
@rgerganov's implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, so
it is now possible to distribute the workload to remote llama.cpp gRPC servers.

This changeset now uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token.
The token is generated automatically when starting the server with the `--p2p` flag, and can be used by starting the workers
with `local-ai worker p2p-llama-cpp-rpc` by passing the token via environment variable (TOKEN) or with args (--token).
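
As a rough illustration of the two ways of handing the token to a worker described above, the sketch below assembles the worker command in Python without running it. The helper name `worker_invocation` and its parameters are hypothetical and not part of LocalAI; only the `local-ai worker p2p-llama-cpp-rpc` subcommand, the `TOKEN` variable, and the `--token` argument come from this changeset, and the position of `--token` on the command line is an assumption.

```python
import os

# Subcommand named in this changeset for starting a p2p llama.cpp RPC worker.
WORKER_CMD = ["local-ai", "worker", "p2p-llama-cpp-rpc"]


def worker_invocation(token, use_env=True, base_env=None):
    """Return (argv, env) for a worker that joins the network with `token`.

    Hypothetical helper: it only assembles the command and never executes it.
    """
    if not token:
        raise ValueError("a shared token is required")
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    argv = list(WORKER_CMD)
    if use_env:
        # Option 1: pass the token through the TOKEN environment variable.
        env["TOKEN"] = token
    else:
        # Option 2: pass the token as an argument (placement is an assumption).
        argv += ["--token", token]
    return argv, env
```

On a machine where `local-ai` is installed, the returned pair could be handed to `subprocess.run(argv, env=env)`.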

Following how mudler/edgevpn works, a network is established between the server and the workers using the DHT and mDNS discovery protocols;
the llama.cpp rpc server is started automatically and exposed to the underlying p2p network so the API server can connect to it.

When the HTTP server is started, it will discover the workers in the network and automatically create the port-forwards to the service locally.
Then llama.cpp is configured to use the services.

This feature is behind the "p2p" GO_TAGS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* go mod tidy

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: add p2p tag

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* better message

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-20 19:17:59 +02:00
Ettore Di Giacinto
8ad669339e
add openvoice backend (#2334)
Wip openvoice
2024-05-19 16:27:08 +02:00
Ettore Di Giacinto
c89271b2e4
feat(llama.cpp): add distributed llama.cpp inferencing (#2324)
* feat(llama.cpp): support distributed llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: allow tweaking how chat messages are merged together

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Makefile: register to ALL_GRPC_BACKENDS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring, allow disabling auto-detection of backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* minor fixups

Signed-off-by: mudler <mudler@localai.io>

* feat: add cmd to start rpc-server from llama.cpp

Signed-off-by: mudler <mudler@localai.io>

* ci: add ccache

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-15 01:17:02 +02:00
cryptk
28a421cb1d
feat: migrate python backends from conda to uv (#2215)
* feat: migrate diffusers backend from conda to uv

  - replace conda with UV for diffusers install (prototype for all
    extras backends)
  - add ability to build docker with one/some/all extras backends
    instead of all or nothing

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate autogptq bark coqui from conda to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: convert exllama over to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate exllama2 to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate mamba to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate parler to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate petals to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate rerankers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate sentencetransformers to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: install uv for tests-linux

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: make sure file exists before installing on intel images

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers backend to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate transformers-musicgen to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vall-e-x to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: migrate vllm to uv

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add uv install to the rest of test-extra.yml

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust file perms on all install/run/test scripts

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add missing accelerate dependencies

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add some more missing dependencies to python backends

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: parler tests venv py dir fix

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct filename for transformers-musicgen tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: adjust the pwd for valle tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: cleanup and optimization work for uv migration

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add setuptools to requirements-install for mamba

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: more size optimization work

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: make installs and tests more consistent, cleanup some deps

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: cleanup

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: mamba backend is cublas only

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: uncomment lines in makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-10 15:08:08 +02:00
cryptk
a0aa5d01a1
feat: update ROCM and use smaller image (#2196)
* feat: update ROCM and use smaller image

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add call to ldconfig to fix AMD's broken library packages

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-03 18:46:49 +02:00
cryptk
3754f154ee
feat: organize Dockerfile into distinct sections (#2181)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-30 10:12:19 +02:00
cryptk
987b7ad42d
feat: only keep the build artifacts from the grpc build (#2172)
* feat: only keep the build artifacts from the grpc build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove separate Cache GRPC build step

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove docker inspect step, it is leftover from previous debugging

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-28 19:24:16 +00:00
cryptk
9fc0135991
feat: cleanup Dockerfile and make final image a little smaller (#2146)
* feat: cleanup Dockerfile and make final image a little smaller

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add build-essential to final stage

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct for another cause of GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: generate new GRPC cache automatically if needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use new GRPC_MAKEFLAGS build arg in GRPC cache generation

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-27 19:48:20 +02:00
Ettore Di Giacinto
b664edde29
feat(rerankers): Add new backend, support jina rerankers API (#2121)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-25 00:19:02 +02:00
cryptk
3411e072ca
Fix cleanup sonarqube findings (#2106)
* fix: update dockerignore and gitignore to exclude sonarqube work dir

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove useless equality check

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use sonarqube Dockerfile recommendations

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-23 18:43:00 +02:00
cryptk
13012cfa70
feat: better control of GRPC docker cache (#2070)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-18 16:19:36 -04:00
Ettore Di Giacinto
0fdff26924
feat(parler-tts): Add new backend (#2027)
* feat(parler-tts): Add new backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(parler-tts): try downgrade protobuf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(parler-tts): add parler conda env

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Revert "feat(parler-tts): try downgrade protobuf"

This reverts commit bd5941d5cfc00676b45a99f71debf3c34249cf3c.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* deps: add grpc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: try to gen proto with same environment

* workaround

* Revert "fix: try to gen proto with same environment"

This reverts commit 998c745e2f.

* Workaround fixup

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-04-13 18:59:21 +02:00
cryptk
1981154f49
fix: don't commit generated files to git (#1993)
* fix: initial work towards not committing generated files to the repository

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: improve build docs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove unused folder from .dockerignore and .gitignore

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix extra backend tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix other tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more test fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix apple tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more extras tests fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add GOBIN to PATH in docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: extra tests and Dockerfile corrections

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove build dependency checks

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add golang protobuf compilers to tests-linux action

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: ensure protogen is run for extra backend installs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use newer protobuf

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more missing protoc binaries

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: missing dependencies during docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: don't install grpc compilers in the final stage if they aren't needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: python-grpc-tools in 22.04 repos is too old

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add a couple of extra build dependencies to Makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: unbreak container rebuild functionality

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-13 09:37:32 +02:00
cryptk
93702e39d4
feat(build): adjust number of parallel make jobs (#1915)
* feat(build): adjust number of parallel make jobs

* fix: update make on MacOS from brew to support --output-sync argument

* fix: cache grpc with version as part of key to improve validity of cache hits

* fix: use gmake for tests-apple to use the updated GNU make version

* fix: actually use the new make version for tests-apple

* feat: parallelize tests-extra

* feat: attempt to cache grpc build for docker images

* fix: don't quote GRPC version

* fix: don't cache go modules, we have limited cache space, better used elsewhere

* fix: release with the same version of go that we test with

* fix: don't fail on exporting cache layers

* fix: remove deprecated BUILD_GRPC docker arg from Makefile
2024-03-29 22:32:40 +01:00
Ettore Di Giacinto
6cf99527f8
docs(aio): Add All-in-One images docs (#1887)
* docs(aio): Add AIO images docs

* add image generation link to quickstart

* while reviewing I noticed this one link was missing, so quickly adding it.

Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-03-25 02:01:30 +00:00
Dave
ed5734ae25
test/fix: OSX Test Repair (#1843)
* test with gguf instead of ggml. Updates testPrompt to match? Adds debugging line to Dockerfile that I've found helpful recently.

* fix testPrompt slightly

* Sad Experiment: Test GH runner without metal?

* break apart CGO_LDFLAGS

* switch runner

* upstream llama.cpp disables Metal on Github CI!

* missed a dir from clean-tests

* CGO_LDFLAGS

* tmate failure + NO_ACCELERATE

* whisper.cpp has a metal fix

* do the exact opposite of the name of this branch, but keep it around for unrelated fixes?

* add back newlines

* add tmate to linux for testing

* update fixtures

* timeout for tmate
2024-03-18 19:19:43 +01:00
cryptk
020ce29cd8
fix(make): allow to parallelize jobs (#1845)
* fix: clean up Makefile dependencies to allow for parallel builds

* refactor: remove old unused backend from Makefile

* fix: finish removing legacy backend, update piper

* fix: I broke llama... I fixed llama

* feat: give the tests and builds a few threads

* fix: ensure libraries are replaced before build, add dropreplace target

* Fix image build workflows
2024-03-17 15:39:20 +01:00
cryptk
a6b540737f
fix: missing OpenCL libraries from docker containers during clblas docker build (#1830)
2024-03-14 08:40:37 +01:00
Ettore Di Giacinto
5d1018495f
feat(intel): add diffusers/transformers support (#1746)
* feat(intel): add diffusers support

* try to consume upstream container image

* Debug

* Manually install deps

* Map transformers/hf cache dir to modelpath if not specified

* fix(compel): update initialization, pass by all gRPC options

* fix: add dependencies, implement transformers for xpu

* base it from the oneapi image

* Add pillow

* set threads if specified when launching the API

* Skip conda install if intel

* defaults to non-intel

* ci: add to pipelines

* prepare compel only if enabled

* Skip conda install if intel

* fix cleanup

* Disable compel by default

* Install torch 2.1.0 with Intel

* Skip conda on some setups

* Detect python

* Quiet output

* Do not override system python with conda

* Prefer python3

* Fixups

* exllama2: do not install without conda (overrides pytorch version)

* exllama/exllama2: do not install if not using cuda

* Add missing dataset dependency

* Small fixups, symlink to python, add requirements

* Add neural_speed to the deps

* correctly handle model offloading

* fix: device_map == xpu

* go back to calling python, fixed at dockerfile level

* Exllama2 restricted to only nvidia gpus

* Tokenizer to xpu
2024-03-07 14:37:45 +01:00
fenfir
fb0a4c5d9a
Build docker container for ROCm (#1595)
* Dockerfile changes to build for ROCm

* Adjust linker flags for ROCm

* Update conda env for diffusers and transformers to use ROCm pytorch

* Update transformers conda env for ROCm

* ci: build hipblas images

* fixup rebase

* use self-hosted

Signed-off-by: mudler <mudler@localai.io>

* specify LD_LIBRARY_PATH only when BUILD_TYPE=hipblas

---------

Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: mudler <mudler@localai.io>
2024-02-16 15:08:50 +01:00
Ettore Di Giacinto
53dbe36f32
feat(tts): respect YAMLs config file, add sycl docs/examples (#1692)
* feat(refactor): refactor config and input reading

* feat(tts): read config file for TTS

* examples(kubernetes): Add simple deployment example

* examples(kubernetes): Add simple deployment for intel arc

* docs(sycl): add sycl example

* feat(tts): do not always pick a first model

* fixups to run vall-e-x on container

* Correctly resolve backend
2024-02-10 21:37:03 +01:00
Ettore Di Giacinto
ddd21f1644
feat: Use ubuntu as base for container images, drop deprecated ggml-transformers backends (#1689)
* cleanup backends

* switch image to ubuntu 22.04

* adapt commands for ubuntu

* transformers cleanup

* no contrib on ubuntu

* Change test model to gguf

* ci: disable bark tests (too cpu-intensive)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cleanup

* refinements

* use intel base image

* Makefile: Add docker targets

* Change test model

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-08 20:12:51 +01:00
Ettore Di Giacinto
e23e490455
Revert "fix(Dockerfile): sycl dependencies" (#1687)
Revert "fix(Dockerfile): sycl dependencies (#1686)"

This reverts commit f76bb8954b.
2024-02-06 20:48:29 +01:00
Ettore Di Giacinto
f76bb8954b
fix(Dockerfile): sycl dependencies (#1686)
* fix(Dockerfile): sycl dependencies

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ci): cleanup before running bark test

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-06 19:42:52 +01:00
Ettore Di Giacinto
1c57f8d077
feat(sycl): Add support for Intel GPUs with sycl (#1647) (#1660)
* feat(sycl): Add sycl support (#1647)

* onekit: install without prompts

* set cmake args only in grpc-server

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cleanup

* fixup sycl source env

* Cleanup docs

* ci: runs on self-hosted

* fix typo

* bump llama.cpp

* llama.cpp: update server

* adapt to upstream changes

* adapt to upstream changes

* docs: add sycl

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-01 19:21:52 +01:00
Ettore Di Giacinto
9e653d6abe
feat: 🐍 add mamba support (#1589)
feat(mamba): Initial import

This is a first iteration of the mamba backend, loosely based on
mamba-chat(https://github.com/havenhq/mamba-chat).
2024-01-19 23:42:50 +01:00
Ettore Di Giacinto
5309da40b7
Update Dockerfile
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-01-09 08:55:43 +01:00
Ettore Di Giacinto
e19d7226f8
feat: more embedded models, coqui fixes, add model usage and description (#1556)
* feat: add model descriptions and usage

* remove default model gallery

* models: add embeddings and tts

* docs: update table

* docs: updates

* images: cleanup pip cache after install

* images: always run apt-get clean

* ux: improve gRPC connection errors

* ux: improve some messages

* fix: fix coqui when no AudioPath is passed

* embedded: add more models

* Add usage

* Reorder table
2024-01-08 00:37:02 +01:00
Ettore Di Giacinto
db926896bd
Revert "[Refactor]: Core/API Split" (#1550)
Revert "[Refactor]: Core/API Split (#1506)"

This reverts commit ab7b4d5ee9.
2024-01-05 18:04:46 +01:00
Dave
ab7b4d5ee9
[Refactor]: Core/API Split (#1506)
Refactors api folder to core, creates firm split between backend code and api frontend.
2024-01-05 15:34:56 +01:00
Gianluca Boiano
cae7b197ec
feat: add tiny dream stable diffusion support (#1283)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2023-12-24 19:27:24 +00:00
Ettore Di Giacinto
95eb72bfd3
feat: add 🐸 coqui (#1489)
* feat: add coqui

* docs: update news
2023-12-24 19:38:54 +01:00
Dave
8b6e601405
Feat: new backend: transformers-musicgen (#1387)
Transformers-MusicGen
---------

Signed-off-by: Dave <dave@gray101.com>
2023-12-08 10:01:02 +01:00
Ettore Di Giacinto
6011911746
fix(piper): pin petals, phonemize and espeak (#1393)
* fix: pin phonemize and espeak

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: pin petals deps

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-12-07 22:58:41 +01:00
Ettore Di Giacinto
2b2d6673ff
exllama(v2): fix exllamav1, add exllamav2 (#1384)
* fix(exllama): fix exllama deps with anaconda

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(exllamav2): add exllamav2 backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-12-05 08:15:37 +01:00
Ettore Di Giacinto
238fec244a
fix(vall-e-x): correctly install reqs in environment (#1377)
2023-12-03 21:16:36 +01:00
Ettore Di Giacinto
b7821361c3
feat(petals): add backend (#1350)
* feat(petals): add backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-28 09:01:46 +01:00
Ettore Di Giacinto
6d187af643
fix: handle grpc and llama-cpp with REBUILD=true (#1328)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-25 08:48:24 +01:00
Ettore Di Giacinto
92cbc4d516
feat(transformers): add embeddings with Automodel (#1308)
* Update huggingface.py

Switch SentenceTransformer for AutoModel in order to set trust_remote_code needed to use the encode method with embeddings models like jinai-v2

Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu>

* feat(transformers): split in separate backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Lucas Hänke de Cansino <lhc@next-boss.eu>
2023-11-20 21:21:17 +01:00
Ettore Di Giacinto
3c9544b023
refactor: rename llama-stable to llama-ggml (#1287)
* refactor: rename llama-stable to llama-ggml

* Makefile: get sources in sources/

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup path

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup sources

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups sd

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* update SD

* fixup

* fixup: create piper libdir also when not built

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix make target on linux test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-18 08:18:43 +01:00
Ettore Di Giacinto
ad0e30bca5
refactor: move backends into the backends directory (#1279)
* refactor: move backends into the backends directory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: move main close to implementation for every backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-13 22:40:16 +01:00
Gianluca Boiano
bde87d00b9
deps(go-piper): update to 2023.11.6-3 (#1257)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2023-11-11 18:40:26 +01:00
Ettore Di Giacinto
8123f009d0
dockerfile: fixup duplicate
This should have been "exllama"

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-05 14:09:31 +01:00
Ettore Di Giacinto
622aaa9f7d
dockerfile: avoid pushing a big layer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-05 10:31:33 +01:00