LocalAI/backend/python/transformers
Latest commit 3882130911 by fakezeta, 2024-03-14 23:06:30 +01:00:

feat: Add Bitsandbytes quantization for transformer backend enhancement #1775 and fix: Transformer backend error on CUDA #1774 (#1823)

* Fixes #1775 and #1774

  Adds BitsAndBytes quantization and fixes embeddings on CUDA devices.

* Manages 4-bit and 8-bit quantization

  Selects among the BitsAndBytes options with the `quantization:` parameter in the model YAML (sketched below).

* Fixes compilation errors in non-CUDA environments.
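The `quantization:` value is applied in transformers_server.py when the model is loaded. Below is a minimal sketch of how a 4-bit/8-bit choice can map onto a transformers `BitsAndBytesConfig`; the value names `bnb_4bit`/`bnb_8bit`, the helper name, and the model id are illustrative assumptions, not the backend's exact code:

    # Illustrative sketch: map a YAML `quantization:` value onto a
    # transformers BitsAndBytesConfig. The value names "bnb_4bit" and
    # "bnb_8bit" are assumptions, not confirmed backend options.
    from typing import Optional

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    def quantization_config(quantization: Optional[str]) -> Optional[BitsAndBytesConfig]:
        if quantization == "bnb_4bit":
            # 4-bit NF4 with bfloat16 compute, a common bitsandbytes setup
            return BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_quant_type="nf4",
                bnb_4bit_compute_dtype=torch.bfloat16,
            )
        if quantization == "bnb_8bit":
            return BitsAndBytesConfig(load_in_8bit=True)
        return None  # no quantization requested

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1",  # placeholder model id
        quantization_config=quantization_config("bnb_4bit"),
        device_map="auto",  # let accelerate place the quantized layers
    )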
File                         Last commit                                                              Date
backend_pb2_grpc.py          feat(transformers): add embeddings with Automodel (#1308)               2023-11-20 21:21:17 +01:00
backend_pb2.py               Bump vLLM version + more options when loading models in vLLM (#1782)    2024-03-01 22:48:53 +01:00
Makefile                     feat(conda): share envs with transformer-based backends (#1465)         2023-12-21 08:35:15 +01:00
README.md                    feat(transformers): add embeddings with Automodel (#1308)               2023-11-20 21:21:17 +01:00
run.sh                       feat(intel): add diffusers/transformers support (#1746)                 2024-03-07 14:37:45 +01:00
test_transformers_server.py  tests: add diffusers tests (#1419)                                      2023-12-11 08:20:34 +01:00
test.sh                      fix: rename transformers.py to avoid circular import (#1337)            2023-11-26 08:49:43 +01:00
transformers_server.py       feat: Add Bitsandbytes quantization for transformer backend enhancement #1775 and fix: Transformer backend error on CUDA #1774 (#1823)    2024-03-14 23:06:30 +01:00
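The embeddings path added in #1308 goes through AutoModel, and #1774 is the kind of failure that appears when input tensors stay on the CPU while the model sits on a CUDA device. Below is a hedged sketch of the usual fix, moving the tokenized inputs to the model's device before the forward pass; the model id and the mean-pooling step are illustrative, not necessarily what transformers_server.py does:

    # Sketch of an AutoModel embeddings pass that is safe on CUDA:
    # the tokenized inputs are moved to the model's device first.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder
    model = AutoModel.from_pretrained("bert-base-uncased").to(
        "cuda" if torch.cuda.is_available() else "cpu"
    )

    inputs = tokenizer("hello world", return_tensors="pt").to(model.device)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    embedding = hidden.mean(dim=1).squeeze(0)       # illustrative mean pooling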

Creating a separate environment for the transformers project:

    make transformers
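Judging by the files listed above, run.sh would then start the gRPC backend inside that environment and test.sh would drive test_transformers_server.py against it, though the exact invocations live in those scripts.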