LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-06-07 19:40:48 +00:00

fakezeta e7cbe32601 feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892)	2024-03-26 23:31:43 +00:00
transformers	feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892)	2024-03-26 23:31:43 +00:00

feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA (#1892)

* fixes #1775 and #1774

Add BitsAndBytes Quantization and fixes embedding on CUDA devices

* Manage 4bit and 8 bit quantization

Manage different BitsAndBytes options with the quantization: parameter in yaml

* fix compilation errors on non CUDA environment

* OpenVINO draft

First draft of OpenVINO integration in transformer backend

* first working implementation

* Streaming working

* Small fix for regression on CUDA and XPU

* use pip version of optimum[openvino]

* Update backend/python/transformers/transformers_server.py

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2024-03-26 23:31:43 +00:00
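
The commit message above mentions selecting BitsAndBytes options through the `quantization:` parameter in YAML. As a rough illustration only, the sketch below shows how such a value might be turned into keyword arguments. The function name `bnb_kwargs` and the option strings `bnb_4bit` and `bnb_8bit` are assumptions made for this example, not taken from the commit; the keys `load_in_4bit`, `load_in_8bit`, and `bnb_4bit_quant_type` are parameter names of `transformers.BitsAndBytesConfig`.

```python
from typing import Any, Dict


def bnb_kwargs(quantization: str) -> Dict[str, Any]:
    """Map a `quantization:` value to BitsAndBytesConfig-style keyword arguments.

    The accepted strings are hypothetical; the actual backend may use others.
    """
    if quantization == "bnb_4bit":
        # 4-bit loading; "nf4" is one of the quant types bitsandbytes supports.
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    if quantization == "bnb_8bit":
        # 8-bit loading needs no extra options in this sketch.
        return {"load_in_8bit": True}
    if quantization in ("", "none"):
        # No quantization requested: pass nothing through.
        return {}
    raise ValueError(f"unsupported quantization option: {quantization!r}")


if __name__ == "__main__":
    for option in ("bnb_4bit", "bnb_8bit", ""):
        print(repr(option), "->", bnb_kwargs(option))
```

With `transformers` and `bitsandbytes` installed, the returned dictionary could be passed on as `BitsAndBytesConfig(**bnb_kwargs("bnb_4bit"))`. Keeping the mapping in a plain function means it can be checked without a GPU or the libraries present.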
