fakezeta
3882130911
feat: Add Bitsandbytes quantization for transformer backend enhancement #1775 and fix: Transformer backend error on CUDA #1774 ( #1823 )
...
* fixes #1775 and #1774
Add BitsAndBytes quantization and fix embedding on CUDA devices
* Manage 4-bit and 8-bit quantization
Manage different BitsAndBytes options with the quantization: parameter in YAML
* fix compilation errors on non-CUDA environments
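A minimal sketch of how a `quantization:` value from the YAML could be turned into BitsAndBytes settings. The option strings `bnb_4bit` and `bnb_8bit` and the helper name `bnb_kwargs` are assumptions for illustration and are not confirmed by this commit; the returned keys are keyword arguments accepted by `transformers.BitsAndBytesConfig`.

```python
from typing import Any, Dict, Optional


def bnb_kwargs(quantization: Optional[str]) -> Dict[str, Any]:
    """Return BitsAndBytesConfig keyword arguments, or {} for no quantization.

    The accepted strings are hypothetical; the real backend may use others.
    """
    if not quantization:
        return {}
    if quantization == "bnb_4bit":
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_use_double_quant": True,
        }
    if quantization == "bnb_8bit":
        return {"load_in_8bit": True}
    raise ValueError(f"unsupported quantization option: {quantization!r}")
```

On a CUDA machine with `transformers` and `bitsandbytes` installed, the result could be passed as `BitsAndBytesConfig(**bnb_kwargs("bnb_4bit"))`. Returning an empty dict for an unset value keeps the non-CUDA path free of any bitsandbytes import.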
2024-03-14 23:06:30 +01:00
cryptk
a6b540737f
fix: missing OpenCL libraries from docker containers during clblas docker build ( #1830 )
2024-03-14 08:40:37 +01:00
LocalAI [bot]
f82065703d
⬆️ Update ggerganov/llama.cpp ( #1827 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-14 08:39:39 +01:00
cryptk
b423af001d
fix: the correct BUILD_TYPE for OpenCL is clblas (with no t) ( #1828 )
2024-03-14 08:39:21 +01:00
Ettore Di Giacinto
b9e77d394b
feat(model-help): display help text in markdown ( #1825 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-03-13 21:50:46 +01:00
Ettore Di Giacinto
57222497ec
fix(docker-compose): update docker compose file ( #1824 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-03-13 17:57:45 +01:00
LocalAI [bot]
5c5f07c1e7
⬆️ Update ggerganov/llama.cpp ( #1821 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-13 10:05:46 +01:00
Ettore Di Giacinto
f895d06605
fix(config): set better defaults for inferencing ( #1822 )
...
* fix(defaults): set better defaults for inferencing
This changeset aims to provide better defaults and to properly detect when
no inference settings are provided with the model.
If not specified, we default to mirostat sampling and offload all the
GPU layers (if a GPU is detected).
Related to https://github.com/mudler/LocalAI/issues/1373 and https://github.com/mudler/LocalAI/issues/1723
* Adapt tests
* Also pre-initialize default seed
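The defaulting logic described above might look roughly like the sketch below. It is written in Python for illustration only; the key names, the mirostat values, the "all layers" sentinel, and the seed value are assumptions, since the commit message does not list them.

```python
from typing import Any, Dict

# Hypothetical option names; the commit does not enumerate them.
SAMPLING_KEYS = ("temperature", "top_k", "top_p", "mirostat")
ALL_LAYERS = 99999999  # assumed sentinel meaning "offload every layer"


def apply_defaults(cfg: Dict[str, Any], gpu_detected: bool) -> Dict[str, Any]:
    """Return a copy of cfg with defaults filled in where nothing was set."""
    out = dict(cfg)
    # Fall back to mirostat only when no sampling setting was provided at all.
    if not any(key in out for key in SAMPLING_KEYS):
        out.update({"mirostat": 2, "mirostat_tau": 5.0, "mirostat_eta": 0.1})
    # Offload all layers only if a GPU exists and the user did not choose.
    if gpu_detected and "gpu_layers" not in out:
        out["gpu_layers"] = ALL_LAYERS
    # Assumed convention: -1 asks the backend to pick a random seed.
    out.setdefault("seed", -1)
    return out
```

Detecting "no settings provided" by key presence, rather than by comparing against zero values, lets an explicit `temperature: 0` survive untouched.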
2024-03-13 10:05:30 +01:00
Ettore Di Giacinto
bc8f648a91
fix(doc/examples): set defaults to mirostat ( #1820 )
...
The default sampler on some models doesn't return enough candidates, which
leads to a false sense of randomness. Tracing back through the code, it looks
like with the temperature sampler there might not be enough
candidates to pick from, and since the seed and "randomness" take effect
while picking a good candidate, this yields the same results over and
over.
Fixes https://github.com/mudler/LocalAI/issues/1723 by updating the
examples and documentation to use mirostat instead.
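A toy illustration of the effect described above, not llama.cpp's actual sampler: when filtering leaves a single candidate, the seed has nothing to choose between. The function `sample` and the probabilities are made up for the example.

```python
import random
from typing import Dict, List


def sample(probs: Dict[str, float], top_k: int, seed: int) -> str:
    """Keep the top_k most likely tokens, then draw one with a seeded RNG."""
    candidates: List[str] = sorted(probs, key=lambda t: probs[t], reverse=True)[:top_k]
    weights = [probs[t] for t in candidates]
    return random.Random(seed).choices(candidates, weights=weights, k=1)[0]


probs = {"the": 0.90, "a": 0.06, "an": 0.04}

# With one surviving candidate, every seed produces the same token.
one_candidate = {sample(probs, top_k=1, seed=s) for s in range(100)}
```

Here `one_candidate` contains only `"the"`, regardless of the seed, which matches the "same results over and over" symptom.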
2024-03-11 19:49:03 +01:00
LocalAI [bot]
8e57f4df31
⬆️ Update ggerganov/llama.cpp ( #1818 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-11 00:02:37 +01:00
LocalAI [bot]
a08cc5adbb
⬆️ Update ggerganov/llama.cpp ( #1816 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-10 09:32:09 +01:00
LocalAI [bot]
595a73fce4
⬆️ Update ggerganov/llama.cpp ( #1813 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-09 09:27:06 +01:00
LocalAI [bot]
dc919e08e8
⬆️ Update ggerganov/llama.cpp ( #1811 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-08 08:21:25 +01:00
Ettore Di Giacinto
5d1018495f
feat(intel): add diffusers/transformers support ( #1746 )
...
* feat(intel): add diffusers support
* try to consume upstream container image
* Debug
* Manually install deps
* Map transformers/hf cache dir to modelpath if not specified
* fix(compel): update initialization, pass by all gRPC options
* fix: add dependencies, implement transformers for xpu
* base it on the oneapi image
* Add pillow
* set threads if specified when launching the API
* Skip conda install if intel
* default to non-intel
* ci: add to pipelines
* prepare compel only if enabled
* Skip conda install if intel
* fix cleanup
* Disable compel by default
* Install torch 2.1.0 with Intel
* Skip conda on some setups
* Detect python
* Quiet output
* Do not override system python with conda
* Prefer python3
* Fixups
* exllama2: do not install without conda (overrides pytorch version)
* exllama/exllama2: do not install if not using cuda
* Add missing dataset dependency
* Small fixups, symlink to python, add requirements
* Add neural_speed to the deps
* correctly handle model offloading
* fix: device_map == xpu
* go back to calling python, fixed at dockerfile level
* Exllama2 restricted to only nvidia gpus
* Tokenizer to xpu
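As a rough sketch of the "map transformers/hf cache dir to modelpath if not specified" item above: the helper below only fills in cache variables the user has not already set. `HF_HOME` and `TRANSFORMERS_CACHE` are environment variables read by Hugging Face libraries, but which ones this commit actually sets is an assumption, as is the helper name `map_hf_cache`.

```python
import os
from typing import MutableMapping


def map_hf_cache(model_path: str, env: MutableMapping[str, str] = os.environ) -> None:
    """Point Hugging Face cache variables at model_path unless already set."""
    for var in ("HF_HOME", "TRANSFORMERS_CACHE"):
        # setdefault leaves any user-provided value untouched.
        env.setdefault(var, model_path)
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dict.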
2024-03-07 14:37:45 +01:00
LocalAI [bot]
ad6fd7a991
⬆️ Update ggerganov/llama.cpp ( #1805 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-06 23:28:31 +01:00
LocalAI [bot]
e022b5959e
⬆️ Update mudler/go-stable-diffusion ( #1802 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-05 23:39:57 +00:00
LocalAI [bot]
db7f4955a1
⬆️ Update ggerganov/llama.cpp ( #1801 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-05 21:50:27 +00:00
Dave
5c69dd155f
feat(autogpt/transformers): consume trust_remote_code ( #1799 )
...
trusting remote code by default is a danger to our users
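A minimal sketch of the opt-in behaviour this implies: remote code stays disabled unless the model configuration explicitly enables it. The `options` dictionary and the helper name `loader_kwargs` are hypothetical; `trust_remote_code` itself is a real keyword argument of the Hugging Face `from_pretrained` loaders.

```python
from typing import Any, Dict


def loader_kwargs(options: Dict[str, Any]) -> Dict[str, Any]:
    """Build from_pretrained keyword arguments with remote code off by default."""
    # Only an explicit boolean True opts in; a string such as "true" does not.
    return {"trust_remote_code": options.get("trust_remote_code") is True}
```

The strict `is True` check is a deliberate choice in this sketch, so that a mistyped or stringly-typed value fails closed.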
2024-03-05 19:47:15 +01:00
TwinFin
504f2e8bf4
Update Backend Dependencies ( #1797 )
...
* Update transformers.yml
Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>
* Update transformers-rocm.yml
Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>
* Update transformers-nvidia.yml
Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>
---------
Signed-off-by: TwinFin <57421631+TwinFinz@users.noreply.github.com>
2024-03-05 10:10:00 +00:00
Luna Midori
e586dc2924
Edit links in readme and integrations page ( #1796 )
...
* Update integrations.md
Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com>
* Update README.md
Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com>
* Update README.md
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com>
* Update README.md
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com>
---------
Signed-off-by: Luna Midori <118759930+lunamidori5@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-05 10:14:30 +01:00
Ettore Di Giacinto
333f918005
Update integrations.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-05 09:45:54 +01:00
LocalAI [bot]
c8e29033c2
⬆️ Update ggerganov/llama.cpp ( #1794 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-05 08:59:09 +01:00
LocalAI [bot]
d0bd961bde
⬆️ Update ggerganov/llama.cpp ( #1791 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-04 09:44:21 +01:00
Ettore Di Giacinto
006511ee25
Revert "feat(assistant): Initial implementation of assistants api" ( #1790 )
...
Revert "feat(assistant): Initial implementation of assistants api (#1761 )"
This reverts commit 4ab72146cd.
2024-03-03 10:31:06 +01:00
Steven Christou
4ab72146cd
feat(assistant): Initial implementation of assistants api ( #1761 )
...
Initial implementation of assistants api
2024-03-03 08:50:43 +01:00
LocalAI [bot]
b60a3fc879
⬆️ Update ggerganov/llama.cpp ( #1789 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-03 08:49:23 +01:00
Ettore Di Giacinto
a0eeb74957
Update hot topics/roadmap
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-02 09:35:40 +01:00
LocalAI [bot]
daa0b8741c
⬆️ Update ggerganov/llama.cpp ( #1785 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-01 22:38:24 +00:00
Ludovic Leroux
939411300a
Bump vLLM version + more options when loading models in vLLM ( #1782 )
...
* Bump vLLM version to 0.3.2
* Add vLLM model loading options
* Remove transformers-exllama
* Fix install exllama
2024-03-01 22:48:53 +01:00
Dave
1c312685aa
refactor: move remaining api packages to core ( #1731 )
...
* core 1
* api/openai/files fix
* core 2 - core/config
* move over core api.go and tests to the start of core/http
* move over localai specific endpoints to core/http, begin the service/endpoint split there
* refactor big chunk on the plane
* refactor chunk 2 on plane, next step: port and modify changes to request.go
* easy fixes for request.go, major changes not done yet
* lintfix
* json tag lintfix?
* gitignore and .keep files
* strange fix attempt: rename the config dir?
2024-03-01 16:19:53 +01:00
LocalAI [bot]
316de82f51
⬆️ Update ggerganov/llama.cpp ( #1779 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-02-29 22:33:30 +00:00
Ettore Di Giacinto
9068bc5271
Create SECURITY.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-29 19:53:04 +01:00
Oussama
31a4c9c9d3
Fix Command Injection Vulnerability ( #1778 )
...
* Added fix for command injection
* changed function name from sh to runCommand
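The commit does not show the fix itself, and the project code is not Python. As a general illustration of the technique usually used against command injection, the sketch below runs a program from an explicit argument list so that no shell ever parses the input. The name `run_command` only mirrors the rename mentioned above and is otherwise hypothetical.

```python
import subprocess
import sys
from typing import List


def run_command(args: List[str]) -> str:
    """Run a program with an explicit argument list; no shell is involved."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# The semicolon arrives as literal data, not as a command separator.
out = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "x; echo injected"]
)
```

Because the untrusted string is a single element of the list, it reaches the child process as one argument and is printed back verbatim.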
2024-02-29 18:32:29 +00:00
Ettore Di Giacinto
c1966af2cf
ci: reduce stress on self-hosted runners ( #1776 )
...
Split jobs between self-hosted runners and the free public runners provided by GitHub
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-29 11:40:08 +01:00
LocalAI [bot]
c665898652
⬆️ Update donomii/go-rwkv.cpp ( #1771 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-02-28 23:50:27 +00:00
LocalAI [bot]
f651a660aa
⬆️ Update ggerganov/llama.cpp ( #1772 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-02-28 23:02:30 +01:00
Ettore Di Giacinto
ba672b51da
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-28 16:03:38 +01:00
Ettore Di Giacinto
be498c5dd9
Update openai-functions.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-28 15:58:31 +01:00
Ettore Di Giacinto
6e95beccb9
Update overview.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-28 15:24:08 +01:00
Ettore Di Giacinto
c8be839481
Update openai-functions.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-27 23:24:46 +01:00
LocalAI [bot]
c7e08813a5
⬆️ Update ggerganov/llama.cpp ( #1767 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-02-27 23:12:51 +01:00
LocalAI [bot]
d21a6b33ab
⬆️ Update ggerganov/llama.cpp ( #1756 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-02-27 18:07:51 +00:00
Joshua Waring
9112cf153e
Update integrations.md ( #1765 )
...
Added JetBrains-compatible plugin for LocalAI
Signed-off-by: Joshua Waring <Joshhua5@users.noreply.github.com>
2024-02-27 17:35:59 +01:00
Ettore Di Giacinto
3868ac8402
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-27 15:44:15 +01:00
Ettore Di Giacinto
3f09010227
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-27 15:43:15 +01:00
Ettore Di Giacinto
d6cf82aba3
fix(tests): re-enable tests after code move ( #1764 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-27 15:04:19 +01:00
Ettore Di Giacinto
dfe54639b1
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-02-27 10:37:56 +01:00
Ettore Di Giacinto
bc5f5aa538
deps(llama.cpp): update ( #1759 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-26 13:18:44 +01:00
Ettore Di Giacinto
05818e0425
fix(functions): handle correctly when there are no results ( #1758 )
2024-02-26 08:38:23 +01:00
Sertaç Özercan
7f72a61104
ci: add stablediffusion to release ( #1757 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-02-25 23:06:18 +00:00