Commit Graph

19 Commits

Author SHA1 Message Date
Ettore Di Giacinto fb2a05ff43
feat(gallery): display job status also during navigation (#2151)
* feat(gallery): keep showing progress also when refreshing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(intel-gpu): better defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: make it thread-safe

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-04-27 09:08:33 +02:00
Ettore Di Giacinto 0d8bf91699
feat: Galleries UI (#2104)
* WIP: add models to webui

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Register routes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: don't cache models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: fixup multiple installs (strings.Clone)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-23 09:22:58 +02:00
Ettore Di Giacinto 180cd4ccda
fix(llama.cpp-ggml): fixup `max_tokens` for old backend (#2094)
fix(llama.cpp-ggml): set 0 as default for `max_tokens`

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-21 16:34:00 +02:00
Ettore Di Giacinto afa1bca1e3
fix(llama.cpp): set -1 as default for max tokens (#2087)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-20 20:20:10 +02:00
Taikono-Himazin 03adc1f60d
Add tensor_parallel_size setting to vllm setting items (#2085)
Signed-off-by: Taikono-Himazin <kazu@po.harenet.ne.jp>
2024-04-20 14:37:02 +00:00
Ettore Di Giacinto bbea62b907
feat(functions): support models with no grammar, add tests (#2068)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-18 22:43:12 +02:00
Ettore Di Giacinto af9e5a2d05
Revert #1963 (#2056)
* Revert "fix(fncall): fix regression introduced in #1963 (#2048)"

This reverts commit 6b06d4e0af.

* Revert "fix: action-tmate back to upstream, dead code removal (#2038)"

This reverts commit fdec8a9d00.

* Revert "feat(grpc): return consumed token count and update response accordingly (#2035)"

This reverts commit e843d7df0e.

* Revert "refactor: backend/service split, channel-based llm flow (#1963)"

This reverts commit eed5706994.

* feat(grpc): return consumed token count and update response accordingly

Fixes: #1920

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-17 23:33:49 +02:00
Dave eed5706994
refactor: backend/service split, channel-based llm flow (#1963)
Refactor: channel based llm flow and services split

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-04-13 09:45:34 +02:00
Ludovic Leroux 12c0d9443e
feat: use tokenizer.apply_chat_template() in vLLM (#1990)
Use tokenizer.apply_chat_template() in vLLM

Signed-off-by: Ludovic LEROUX <ludovic@inpher.io>
2024-04-11 19:20:22 +02:00
Ettore Di Giacinto 8342553214
fix(llama.cpp): set better defaults for llama.cpp (#1961)
fix(defaults): set better defaults for llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-06 22:56:45 +02:00
Ettore Di Giacinto ff77d3bc22
fix(seed): generate random seed per-request if -1 is set (#1952)
* fix(seed): generate random seed per-request if -1 is set

Also update ci with new workflows and allow the aio tests to run with an
api key

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs(openvino): Add OpenVINO example

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-03 22:25:47 +02:00
Ettore Di Giacinto 9bc209ba73
fix(welcome): stable model list (#1949) 2024-04-02 19:25:32 +02:00
Ettore Di Giacinto e8f02c083f
fix(functions): respect when selected from string (#1940)
* fix(functions): respect when selected from string

* fix(toolschoice): decode both string and objects
2024-04-01 19:39:54 +02:00
Ettore Di Giacinto 600152df23
fix(config): pass by config options, respect defaults (#1878)
This bug had the unpleasant effect that it ignored defaults passed by
the CLI. For instance threads could be changed only via model config
file.
2024-03-22 20:55:11 +01:00
Ettore Di Giacinto 843f93e1ab
fix(config): default to debug=false if not set (#1853) 2024-03-18 18:59:39 +01:00
Ettore Di Giacinto b9e77d394b
feat(model-help): display help text in markdown (#1825)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-03-13 21:50:46 +01:00
Ettore Di Giacinto f895d06605
fix(config): set better defaults for inferencing (#1822)
* fix(defaults): set better defaults for inferencing

This changeset aim to have better defaults and to properly detect when
no inference settings are provided with the model.

If not specified, we defaults to mirostat sampling, and offload all the
GPU layers (if a GPU is detected).

Related to https://github.com/mudler/LocalAI/issues/1373 and https://github.com/mudler/LocalAI/issues/1723

* Adapt tests

* Also pre-initialize default seed
2024-03-13 10:05:30 +01:00
Ludovic Leroux 939411300a
Bump vLLM version + more options when loading models in vLLM (#1782)
* Bump vLLM version to 0.3.2

* Add vLLM model loading options

* Remove transformers-exllama

* Fix install exllama
2024-03-01 22:48:53 +01:00
Dave 1c312685aa
refactor: move remaining api packages to core (#1731)
* core 1

* api/openai/files fix

* core 2 - core/config

* move over core api.go and tests to the start of core/http

* move over localai specific endpoints to core/http, begin the service/endpoint split there

* refactor big chunk on the plane

* refactor chunk 2 on plane, next step: port and modify changes to request.go

* easy fixes for request.go, major changes not done yet

* lintfix

* json tag lintfix?

* gitignore and .keep files

* strange fix attempt: rename the config dir?
2024-03-01 16:19:53 +01:00