+++
disableToc = false
title = "Model compatibility table"
weight = 24
+++

Besides llama-based models, LocalAI is also compatible with other architectures. The table below lists all the compatible model families and the associated binding repository.

{{% alert note %}}

LocalAI will attempt to automatically load models which are not explicitly configured for a specific backend. You can specify the backend to use by configuring a model with a YAML file. See [the advanced section]({{%relref "docs/advanced" %}}) for more details.

{{% /alert %}}

| Backend and Bindings | Compatible models | Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration |
|---|---|---|---|---|---|---|
| [llama.cpp]({{%relref "docs/features/text-generation#llama.cpp" %}}) | Vicuna, Alpaca, LLaMa | yes | GPT and Functions | yes** | yes | CUDA, openCL, cuBLAS, Metal |
| gpt4all-llama | Vicuna, Alpaca, LLaMa | yes | GPT | no | yes | N/A |
| gpt4all-mpt | MPT | yes | GPT | no | yes | N/A |
| gpt4all-j | GPT4ALL-J | yes | GPT | no | yes | N/A |
| falcon-ggml (binding) | Falcon (*) | yes | GPT | no | no | N/A |
| gpt2 (binding) | GPT2, Cerebras | yes | GPT | no | no | N/A |
| dolly (binding) | Dolly | yes | GPT | no | no | N/A |
| gptj (binding) | GPTJ | yes | GPT | no | no | N/A |
| mpt (binding) | MPT | yes | GPT | no | no | N/A |
| replit (binding) | Replit | yes | GPT | no | no | N/A |
| gptneox (binding) | GPT NeoX, RedPajama, StableLM | yes | GPT | no | no | N/A |
| starcoder (binding) | Starcoder | yes | GPT | no | no | N/A |
| bloomz (binding) | Bloom | yes | GPT | no | no | N/A |
| rwkv (binding) | rwkv | yes | GPT | no | yes | N/A |
| bert (binding) | bert | no | Embeddings only | yes | no | N/A |
| whisper | whisper | no | Audio | no | no | N/A |
| stablediffusion (binding) | stablediffusion | no | Image | no | no | N/A |
| langchain-huggingface | Any text generators available on HuggingFace through API | yes | GPT | no | no | N/A |
| piper (binding) | Any piper onnx model | no | Text to voice | no | no | N/A |
| falcon (binding) | Falcon *** | yes | GPT | no | yes | CUDA |
| sentencetransformers | BERT | no | Embeddings only | yes | no | N/A |
| bark | bark | no | Audio generation | no | no | yes |
| autogptq | GPTQ | yes | GPT | yes | no | N/A |
| exllama | GPTQ | yes | GPT only | no | no | N/A |
| diffusers | SD,... | no | Image generation | no | no | N/A |
| vall-e-x | Vall-E | no | Audio generation and Voice cloning | no | no | CPU/CUDA |
| vllm | Various GPTs and quantization formats | yes | GPT | no | no | CPU/CUDA |
| exllama2 | GPTQ | yes | GPT only | no | no | N/A |
| transformers-musicgen | | no | Audio generation | no | no | N/A |
| tinydream | stablediffusion | no | Image | no | no | N/A |
| coqui | Coqui | no | Audio generation and Voice cloning | no | no | CPU/CUDA |
| petals | Various GPTs and quantization formats | yes | GPT | no | no | CPU/CUDA |

Note: any backend name listed above can be used in the `backend` field of the model configuration file (see [the advanced section]({{%relref "docs/advanced" %}})).
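
As a minimal sketch, a model configuration pinning one of the backends above could look like the following (the model name and file name here are placeholders; substitute your own):

```yaml
# models/gpt4all-j.yaml — hypothetical example configuration
name: my-gpt4all-j            # name the model is exposed as via the API
backend: gpt4all-j            # any backend name from the table above
parameters:
  model: ggml-gpt4all-j.bin   # model file placed in your models directory
```

With such a file in place, requests referencing `my-gpt4all-j` are served by the specified backend instead of relying on automatic backend detection.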