fdb45153fe
* feat(llama.cpp): Enable decentralized, distributed inference As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to @rgerganov implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, now it is possible to distribute the workload to remote llama.cpp gRPC server. This changeset now uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token. The token is generated automatically when starting the server with the `--p2p` flag, and can be used by starting the workers with `local-ai worker p2p-llama-cpp-rpc` by passing the token via environment variable (TOKEN) or with args (--token). As per how mudler/edgevpn works, a network is established between the server and the workers with dht and mdns discovery protocols, the llama.cpp rpc server is automatically started and exposed to the underlying p2p network so the API server can connect on. When the HTTP server is started, it will discover the workers in the network and automatically create the port-forwards to the service locally. Then llama.cpp is configured to use the services. This feature is behind the "p2p" GO_FLAGS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * go mod tidy Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: add p2p tag Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * better message Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> |
||
---|---|---|
.github | ||
.vscode | ||
aio | ||
backend | ||
configuration | ||
core | ||
custom-ca-certs | ||
docs | ||
embedded | ||
examples | ||
gallery | ||
internal | ||
models | ||
pkg | ||
prompt-templates | ||
swagger | ||
tests | ||
.dockerignore | ||
.editorconfig | ||
.env | ||
.gitattributes | ||
.gitignore | ||
.gitmodules | ||
.yamllint | ||
assets.go | ||
CONTRIBUTING.md | ||
docker-compose.yaml | ||
Dockerfile | ||
Dockerfile.aio | ||
Earthfile | ||
Entitlements.plist | ||
entrypoint.sh | ||
go.mod | ||
go.sum | ||
LICENSE | ||
main.go | ||
Makefile | ||
README.md | ||
renovate.json | ||
SECURITY.md |
LocalAI
💡 Get help - ❓FAQ 💭Discussions 💬 Discord 📖 Documentation website
LocalAI is the free, Open Source OpenAI alternative. LocalAI act as a drop-in replacement REST API that’s compatible with OpenAI (Elevenlabs, Anthropic... ) API specifications for local AI inferencing. It allows you to run LLMs, generate images, audio (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families. Does not require GPU. It is created and maintained by Ettore Di Giacinto.
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
# Alternative images:
# - if you have an Nvidia GPU:
# docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12
# - without preconfigured models
# docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
# - without preconfigured models for Nvidia GPUs
# docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
🔥🔥 Hot topics / Roadmap
- 🔥🔥 Decentralized llama.cpp: https://github.com/mudler/LocalAI/pull/2343 (peer2peer llama.cpp!)
- 🔥🔥 Openvoice: https://github.com/mudler/LocalAI/pull/2334
- 🆕 Function calls without grammars and mixed mode: https://github.com/mudler/LocalAI/pull/2328
- 🔥🔥 Distributed inferencing: https://github.com/mudler/LocalAI/pull/2324
- Chat, TTS, and Image generation in the WebUI: https://github.com/mudler/LocalAI/pull/2222
- Reranker API: https://github.com/mudler/LocalAI/pull/2121
Hot topics (looking for contributors):
- WebUI improvements: https://github.com/mudler/LocalAI/issues/2156
- Backends v2: https://github.com/mudler/LocalAI/issues/1126
- Improving UX v2: https://github.com/mudler/LocalAI/issues/1373
- Assistant API: https://github.com/mudler/LocalAI/issues/1273
- Moderation endpoint: https://github.com/mudler/LocalAI/issues/999
- Vulkan: https://github.com/mudler/LocalAI/issues/1647
If you want to help and contribute, issues up for grabs: https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22up+for+grabs%22
🚀 Features
- 📖 Text generation with GPTs (
llama.cpp
,gpt4all.cpp
, ... 📖 and more) - 🗣 Text to Audio
- 🔈 Audio to Text (Audio transcription with
whisper.cpp
) - 🎨 Image generation with stable diffusion
- 🔥 OpenAI functions 🆕
- 🧠 Embeddings generation for vector databases
- ✍️ Constrained grammars
- 🖼️ Download Models directly from Huggingface
- 🥽 Vision API
- 🆕 Reranker API
💻 Usage
Check out the Getting started section in our documentation.
🔗 Community and integrations
Build and deploy custom containers:
WebUIs:
Model galleries
Other:
- Helm chart https://github.com/go-skynet/helm-charts
- VSCode extension https://github.com/badgooooor/localai-vscode-plugin
- Terminal utility https://github.com/djcopley/ShellOracle
- Local Smart assistant https://github.com/mudler/LocalAGI
- Home Assistant https://github.com/sammcj/homeassistant-localai / https://github.com/drndos/hass-openai-custom-conversation
- Discord bot https://github.com/mudler/LocalAGI/tree/main/examples/discord
- Slack bot https://github.com/mudler/LocalAGI/tree/main/examples/slack
- Telegram bot https://github.com/mudler/LocalAI/tree/master/examples/telegram-bot
- Examples: https://github.com/mudler/LocalAI/tree/master/examples/
🔗 Resources
- 🆕 New! LLM finetuning guide
- How to build locally
- How to install in Kubernetes
- Projects integrating LocalAI
- How tos section (curated by our community)
📖 🎥 Media, Blogs, Social
- Run LocalAI on AWS EKS with Pulumi
- Run LocalAI on AWS
- Create a slackbot for teams and OSS projects that answer to documentation
- LocalAI meets k8sgpt
- Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All
- Tutorial to use k8sgpt with LocalAI
Citation
If you utilize this repository, data in a downstream project, please consider citing it with:
@misc{localai,
author = {Ettore Di Giacinto},
title = {LocalAI: The free, Open source OpenAI alternative},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/go-skynet/LocalAI}},
❤️ Sponsors
Do you find LocalAI useful?
Support the project by becoming a backer or sponsor. Your logo will show up here with a link to your website.
A huge thank you to our generous sponsors who support this project:
Spectro Cloud |
Spectro Cloud kindly supports LocalAI by providing GPU and computing resources to run tests on lamdalabs! |
And a huge shout-out to individuals sponsoring the project by donating hardware or backing the project.
- Sponsor list
- JDAM00 (donating HW for the CI)
🌟 Star history
📖 License
LocalAI is a community-driven project created by Ettore Di Giacinto.
MIT - Author Ettore Di Giacinto
🙇 Acknowledgements
LocalAI couldn't have been built without the help of great software already available from the community. Thank you!
- llama.cpp
- https://github.com/tatsu-lab/stanford_alpaca
- https://github.com/cornelk/llama-go for the initial ideas
- https://github.com/antimatter15/alpaca.cpp
- https://github.com/EdVince/Stable-Diffusion-NCNN
- https://github.com/ggerganov/whisper.cpp
- https://github.com/saharNooby/rwkv.cpp
- https://github.com/rhasspy/piper
🤗 Contributors
This is a community project, a special thanks to our contributors! 🤗