Commit Graph

50 Commits

Author SHA1 Message Date
Ettore Di Giacinto
af9e5a2d05
Revert #1963 (#2056)
* Revert "fix(fncall): fix regression introduced in #1963 (#2048)"

This reverts commit 6b06d4e0af.

* Revert "fix: action-tmate back to upstream, dead code removal (#2038)"

This reverts commit fdec8a9d00.

* Revert "feat(grpc): return consumed token count and update response accordingly (#2035)"

This reverts commit e843d7df0e.

* Revert "refactor: backend/service split, channel-based llm flow (#1963)"

This reverts commit eed5706994.

* feat(grpc): return consumed token count and update response accordingly

Fixes: #1920

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-17 23:33:49 +02:00
Dave
eed5706994
refactor: backend/service split, channel-based llm flow (#1963)
Refactor: channel based llm flow and services split

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-13 09:45:34 +02:00
cryptk
1981154f49
fix: dont commit generated files to git (#1993)
* fix: initial work towards not committing generated files to the repository

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: improve build docs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove unused folder from .dockerignore and .gitignore

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix extra backend tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: attempt to fix other tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more test fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: fix apple tests

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more extras tests fixes

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add GOBIN to PATH in docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: extra tests and Dockerfile corrections

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: remove build dependency checks

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add golang protobuf compilers to tests-linux action

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: ensure protogen is run for extra backend installs

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use newer protobuf

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more missing protoc binaries

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: missing dependencies during docker build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: don't install grpc compilers in the final stage if they aren't needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: python-grpc-tools in 22.04 repos is too old

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add a couple of extra build dependencies to Makefile

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: unbreak container rebuild functionality

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-13 09:37:32 +02:00
Ludovic Leroux
12c0d9443e
feat: use tokenizer.apply_chat_template() in vLLM (#1990)
Use tokenizer.apply_chat_template() in vLLM

Signed-off-by: Ludovic LEROUX <ludovic@inpher.io>
2024-04-11 19:20:22 +02:00
Richard Palethorpe
643d85d2cc
feat(stores): Vector store backend (#1795)
Add simple vector store backend

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2024-03-22 21:14:04 +01:00
Ettore Di Giacinto
20136ca8b7
feat(tts): add Elevenlabs and OpenAI TTS compatibility layer (#1834)
* feat(elevenlabs): map elevenlabs API support to TTS

This allows elevenlabs Clients to work automatically with LocalAI by
supporting the elevenlabs API.

The elevenlabs server endpoint is implemented such as it is wired to the
TTS endpoints.

Fixes: https://github.com/mudler/LocalAI/issues/1809

* feat(openai/tts): compat layer with openai tts

Fixes: #1276

* fix: adapt tts CLI
2024-03-14 23:08:34 +01:00
Ludovic Leroux
939411300a
Bump vLLM version + more options when loading models in vLLM (#1782)
* Bump vLLM version to 0.3.2

* Add vLLM model loading options

* Remove transformers-exllama

* Fix install exllama
2024-03-01 22:48:53 +01:00
Dave
255748bcba
MQTT Startup Refactoring Part 1: core/ packages part 1 (#1728)
This PR specifically introduces a `core` folder and moves the following packages over, without any other changes:

- `api/backend`
- `api/config`
- `api/options`
- `api/schema`

Once this is merged and we confirm there's no regressions, I can migrate over the remaining changes piece by piece to split up application startup, backend services, http, and mqtt as was the goal of the earlier PRs!
2024-02-21 01:21:19 +00:00
Ettore Di Giacinto
cb7512734d
transformers: correctly load automodels (#1643)
* backends(transformers): use AutoModel with LLM types

* examples: animagine-xl

* Add codellama examples
2024-01-26 00:13:21 +01:00
coyzeng
d5d82ba344
feat(grpc): backend SPI pluggable in embedding mode (#1621)
* run server

* grpc backend embedded support

* backend providable
2024-01-23 08:56:36 +01:00
Ettore Di Giacinto
e19d7226f8
feat: more embedded models, coqui fixes, add model usage and description (#1556)
* feat: add model descriptions and usage

* remove default model gallery

* models: add embeddings and tts

* docs: update table

* docs: updates

* images: cleanup pip cache after install

* images: always run apt-get clean

* ux: improve gRPC connection errors

* ux: improve some messages

* fix: fix coqui when no AudioPath is passed by

* embedded: add more models

* Add usage

* Reorder table
2024-01-08 00:37:02 +01:00
Ettore Di Giacinto
db926896bd
Revert "[Refactor]: Core/API Split" (#1550)
Revert "[Refactor]: Core/API Split (#1506)"

This reverts commit ab7b4d5ee9.
2024-01-05 18:04:46 +01:00
Dave
ab7b4d5ee9
[Refactor]: Core/API Split (#1506)
Refactors api folder to core, creates firm split between backend code and api frontend.
2024-01-05 15:34:56 +01:00
Ettore Di Giacinto
7641f92cde
feat(diffusers): update, add autopipeline, controlnet (#1432)
* feat(diffusers): update, add autopipeline, controlenet

* tests with AutoPipeline

* simplify logic
2023-12-13 19:20:22 +01:00
Ettore Di Giacinto
824612f1b4
feat: initial watchdog implementation (#1341)
* feat: initial watchdog implementation

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* fiuxups

* Add more output

* wip: idletime checker

* wire idle watchdog checks

* enlarge watchdog time window

* small fixes

* Use stopmodel

* Always delete process

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-26 18:36:23 +01:00
Ettore Di Giacinto
548959b50f
feat: queue up requests if not running parallel requests (#1296)
Return a GRPC which handles a lock in case it is not meant to be
parallel.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-16 22:20:16 +01:00
Ettore Di Giacinto
ad0e30bca5
refactor: move backends into the backends directory (#1279)
* refactor: move backends into the backends directory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: move main close to implementation for every backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-13 22:40:16 +01:00
Ettore Di Giacinto
803a0ac02a
feat(llama.cpp): support lora with scale and yarn (#1277)
* feat(llama.cpp): support lora with scale

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): support yarn

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-11 18:40:48 +01:00
Ettore Di Giacinto
0eae727366
🔥 add LaVA support and GPT vision API, Multiple requests for llama.cpp, return JSON types (#1254)
* wip

* wip

* Make it functional

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip

* Small fixups

* do not inject space on role encoding, encode img at beginning of messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add examples/config defaults

* Add include dir of current source dir

* cleanup

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

* Revert "fixups"

This reverts commit f1a4731cca.

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-11 13:14:59 +01:00
Ettore Di Giacinto
a28ab18987
feat(vllm): Allow to set quantization (#1094)
This particularly useful to set AWQ

**Description**

Follow up of #1015 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-22 15:52:38 +02:00
Ettore Di Giacinto
8ccf5b2044
feat(speculative-sampling): allow to specify a draft model in the model config (#1052)
**Description**

This PR fixes #1013.

It adds `draft_model` and `n_draft` to the model YAML config in order to
load models with speculative sampling. This should be compatible as well
with grammars.

example:

```yaml
backend: llama                                                                                                                                                                   
context_size: 1024                                                                                                                                                                        
name: my-model-name
parameters:
  model: foo-bar
n_draft: 16                                                                                                                                                                      
draft_model: model-name
```

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-14 17:44:16 +02:00
Ettore Di Giacinto
dc307a1cc0
feat: add vall-e-x (#1007)
**Description**

This PR fixes #985 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-04 19:25:23 +02:00
Ettore Di Giacinto
44bc7aa3d0
feat: Allow to load lora adapters for llama.cpp (#955)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-25 21:58:46 +02:00
Dave
901f0709c5
Feat: rwkv improvements: (#937) 2023-08-22 18:48:06 +02:00
Ettore Di Giacinto
cc060a283d
fix: drop racy code, refactor and group API schema (#931)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-20 14:04:45 +02:00
Ettore Di Giacinto
afdc0ebfd7
feat: add --single-active-backend to allow only one backend active at the time (#925)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-19 01:49:33 +02:00
Ettore Di Giacinto
1079b18ff7
feat(diffusers): be consistent with pipelines, support also depthimg2img (#926)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-18 22:06:24 +02:00
Dave
8cb1061c11
Usage Features (#863) 2023-08-18 21:23:14 +02:00
Ettore Di Giacinto
2bacd0180d
feat(diffusers): add img2img and clip_skip, support more kernels schedulers (#906)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-17 23:38:59 +02:00
Ettore Di Giacinto
37700f2d98
feat(diffusers): add DPMSolverMultistepScheduler++, DPMSolverMultistepSchedulerSDE++, guidance_scale (#903)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-16 01:11:42 +02:00
Ettore Di Giacinto
a96c3bc885
feat(diffusers): various enhancements (#895) 2023-08-14 23:12:00 +02:00
Ettore Di Giacinto
8c781a6a44
feat: Add Diffusers (#874)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-09 08:38:51 +02:00
Ettore Di Giacinto
3c8fc37c56 feat: Add UseFastTokenizer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 01:10:05 +02:00
Ettore Di Giacinto
a843e64fc2 feat: add initial AutoGPTQ backend implementation 2023-08-07 22:53:28 +02:00
Ettore Di Giacinto
5ca21ee398
feat: add ngqa and RMSNormEps parameters (#860)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-03 00:51:08 +02:00
Ettore Di Giacinto
00ccb8d4f1 fix: set default rope freq base to 10000 during model load
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-29 10:40:56 +02:00
Dave
8b90ac2b1a
1000 -> 10,000 for ropeFreqBase?
the error message talks about a default of 10k, so setting this to 10k instead of 1k experimentally.
2023-07-29 02:37:24 -04:00
Ettore Di Giacinto
f085baa77d fix: set default rope if not specified
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-29 01:07:16 +02:00
Ettore Di Giacinto
096d98c3d9
fix: add rope settings during model load, fix CUDA (#821)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-27 21:56:05 +02:00
Ettore Di Giacinto
b96e30e66c
fix: use bytes in gRPC proto instead of strings (#813)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-27 18:41:04 +02:00
Ettore Di Giacinto
569c1d1163
feat: add rope settings and negative prompt, drop grammar backend (#797)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-25 19:05:27 +02:00
Ettore Di Giacinto
c71c729bc2 debug 2023-07-21 10:53:26 +02:00
Ettore Di Giacinto
3feb632eb4
refactor: rename "llama-master" and "llama" (#776)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 00:36:16 +02:00
Ettore Di Giacinto
6352448b72
feat: add llama-master backend (#752)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-17 23:58:15 +02:00
Ettore Di Giacinto
1d0ed95a54 feat: move other backends to grpc
This finally makes everything more consistent

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00
Ettore Di Giacinto
5dcfdbe51d feat: various refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00
Ettore Di Giacinto
f2f1d7fe72 feat: use gRPC for transformers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00
Ettore Di Giacinto
ae533cadef feat: move gpt4all to a grpc service
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00
Ettore Di Giacinto
58f6aab637 feat: move llama to a grpc
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00
Ettore Di Giacinto
b816009db0 feat: add falcon ggllm via grpc client
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-15 01:19:43 +02:00