Compare commits

...

48 Commits

Author SHA1 Message Date
Richard Palethorpe ae61a307c6
Merge 49df11b4e8 into 6559ac11b1 2024-05-08 12:34:22 +02:00
Ettore Di Giacinto 6559ac11b1
feat(ui): prompt for chat, support vision, enhancements (#2259)
* feat(ui): allow to set system prompt for chat

Make also the models in the index clickable, and display as table

Fixes #2257

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(vision): support also png with base64 input

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(ui): support vision and upload of files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* display the processed image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make trust remote code stand out

Signed-off-by: mudler <mudler@localai.io>

* feat(ui): track in progress job across index/model gallery

Signed-off-by: mudler <mudler@localai.io>

* minor fixups

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-08 00:42:34 +02:00
Ettore Di Giacinto 02ec546dd6
models(gallery): Add Soliloquy (#2260)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-08 00:14:19 +02:00
LocalAI [bot] 995aa5ed21
⬆️ Update ggerganov/llama.cpp (#2263)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-07 21:39:12 +00:00
Michael Mior e28ba4b807
Add missing Homebrew dependencies (#2256)
Signed-off-by: Michael Mior <michael.mior@gmail.com>
Signed-off-by: Michael Mior <mmior@mail.rit.edu>
2024-05-07 16:34:30 +00:00
Daniel d1e3436de5
Update readme: add ShellOracle to community integrations (#2254)
Signed-off-by: Daniel Copley <djcopley@users.noreply.github.com>
2024-05-07 08:39:58 +02:00
Dave d3ddc9e4aa
UI: flag `trust_remote_code` to users // favicon support (#2253)
* attempt to indicate trust_remote_code in some way

* bonus: favicon support!

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-05-07 08:39:23 +02:00
fakezeta fea9522982
fix: OpenVINO winograd always disabled (#2252)
Winograd convolutions were always disabled giving error when inference device was CPU.
This commit implement logic to disable Winograd convolutions only if CPU or NPU are declared.
2024-05-07 08:38:58 +02:00
Ettore Di Giacinto fe055d4b36
feat(webui): ux improvements (#2247)
* ux: change welcome when there are no models installed

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ux: filter

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ux: show tags in filter

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make tags clickable

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* allow to delete models from the list

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ui: display icon of installed models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* gallery: remove gallery file when removing model

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(gallery): show a re-install button

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make filter buttons, rename Gallery field

Signed-off-by: mudler <mudler@localai.io>

* show again buttons at end of operations

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-07 01:17:07 +02:00
LocalAI [bot] 581b894789
⬆️ Update ggerganov/llama.cpp (#2255)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-06 21:28:07 +00:00
Ettore Di Giacinto 477655f6e6
models(gallery): average_norrmie reupload
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-06 19:56:24 +02:00
fakezeta 169d8d21ff
gallery: Added some OpenVINO models (#2249)
* Added some OpenVINO models

Added Phi-3 trust_remote_code: true
Added Hermes 2 Pro Llama3
Added Multilingual-E5-base embedding model with OpenVINO acceleration (CPU and XPU)
Added all-MiniLM-L6-v2 with OpenVINO acceleration (CPU and XPU)

* Added Remote Code for phi, fixed error on Yamllint

* update openvino.yaml

I need to go to rest: today is not my day...
2024-05-06 10:52:05 +02:00
LocalAI [bot] c5475020fe
⬆️ Update ggerganov/llama.cpp (#2251)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-05 21:16:00 +00:00
Dave b52ff1249f
test: check the response URL during image gen in `app_test.go` (#2248)
test: actually check the response URL from image gen

Signed-off-by: Dave Lee <dave@gray101.com>
2024-05-05 18:46:33 +00:00
Ettore Di Giacinto c5798500cb
feat(single-build): generate single binaries for releases (#2246)
* feat(single-build): generate single binaries for releases

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* drop old targets

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 17:20:51 +02:00
Ettore Di Giacinto 67ad3532ec
Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-05 15:45:55 +02:00
Ettore Di Giacinto 5cb96fe7df
models(gallery): add openbiollm (#2245)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 15:19:46 +02:00
Ettore Di Giacinto 810e8e5855
models(gallery): add lumimaid (#2244)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 15:19:33 +02:00
Ettore Di Giacinto f3bcc648e7
models(gallery): add icon for instruct-coder
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-05 12:20:06 +02:00
Ettore Di Giacinto 3096566333
models(gallery): poppy porpoise fix
correct mmproj URL

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-05 11:56:07 +02:00
Ettore Di Giacinto f50c6a4e88
models(gallery): update poppy porpoise (#2243)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-05 11:19:09 +02:00
Ettore Di Giacinto ab4ee54855
models(gallery): add llama3-instruct-coder (#2242)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 11:18:50 +02:00
Ettore Di Giacinto f2d35062d4
models(gallery): moondream2 fixups
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-05 10:49:04 +02:00
Ettore Di Giacinto b69ff46c7e
feat(startup): show CPU/GPU information with --debug (#2241)
Signed-off-by: mudler <mudler@localai.io>
2024-05-05 09:10:23 +02:00
Ettore Di Giacinto 117c9873e1
fix(webui): display small navbar with smaller screens (#2240)
Signed-off-by: mudler <mudler@localai.io>
2024-05-04 23:38:39 +02:00
LocalAI [bot] 17e94fbcb1
⬆️ Update ggerganov/llama.cpp (#2239)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-04 21:26:22 +00:00
Ettore Di Giacinto 92f7feb874
models(gallery): add llama3-llava (#2238)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-04 22:43:11 +02:00
Ettore Di Giacinto b70e2bffa3
models(gallery): add moondream2 (#2237)
* models(gallery): add moondream2

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* models(gallery): fix typo for TTS models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* models(gallery): add base config for moondream2 and icon

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* linter fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-04 18:29:04 +02:00
nold 06c43ca285
fix(gallery): hermes-2-pro-llama3 models checksum changed (#2236)
fix(gallery): hermes-2-pro-llama3 models checksum

Signed-off-by: Gerrit Pannek <nold@gnu.one>
2024-05-04 17:59:54 +02:00
Ettore Di Giacinto 530bec9c64
feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants (#2232)
* feat(initializer): do not specify backends to autoload

We can simply try to autoload the backends extracted in the asset dir.
This will allow to build variants of the same backend (for e.g. with different instructions sets),
so to have a single binary for all the variants.

Signed-off-by: mudler <mudler@localai.io>

* refactor(prepare): refactor out llama.cpp prepare steps

Make it so are idempotent and that we can re-build

Signed-off-by: mudler <mudler@localai.io>

* [TEST] feat(build): build noavx version along

Signed-off-by: mudler <mudler@localai.io>

* build: make build parallel

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* build: do not override CMAKE_ARGS

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* build: add fallback variant

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(huggingface-langchain): fail if no token is set

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(huggingface-langchain): rename

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: do not autoload local-store

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: give priority between the listed backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: mudler <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-04 17:56:12 +02:00
fakezeta fa10302dd2
docs: updated Transformer parameters description (#2234)
updated Transformer parameters
2024-05-04 10:45:25 +02:00
Ettore Di Giacinto 54faaa87ea
fix(webui): correct documentation URL for text2img (#2233)
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-05-04 00:25:13 +00:00
dependabot[bot] daba8a85f9
build(deps): bump tqdm from 4.65.0 to 4.66.3 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory (#2231)
build(deps): bump tqdm

Bumps the pip group with 1 update in the /examples/langchain/langchainpy-localai-example directory: [tqdm](https://github.com/tqdm/tqdm).


Updates `tqdm` from 4.65.0 to 4.66.3
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](https://github.com/tqdm/tqdm/compare/v4.65.0...v4.66.3)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-03 23:15:06 +00:00
LocalAI [bot] ac0f3d6e82
⬆️ Update ggerganov/whisper.cpp (#2230)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-03 22:16:26 +00:00
LocalAI [bot] da0b6a89ae
⬆️ Update ggerganov/llama.cpp (#2229)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-03 21:39:28 +00:00
LocalAI [bot] 929a68c06d
⬆️ Update docs version mudler/LocalAI (#2228)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-03 21:18:11 +00:00
cryptk a0aa5d01a1
feat: update ROCM and use smaller image (#2196)
* feat: update ROCM and use smaller image

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add call to ldconfig to fix AMDs broken library packages

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-05-03 18:46:49 +02:00
Ettore Di Giacinto dc834cc9d2
Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-03 09:42:35 +02:00
Ettore Di Giacinto b58274b8a2
feat(ui): support multilineand style `ul` (#2226)
* feat(ui/chat): handle multiline in the input field

Signed-off-by: mudler <mudler@localai.io>

* feat(ui/chat): correctly display multiline messages

Signed-off-by: mudler <mudler@localai.io>

* feat(ui/chat): add list style

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: mudler <mudler@localai.io>
2024-05-03 00:43:02 +02:00
Ettore Di Giacinto a31d00d904
feat(aio): switch to llama3-based for LLM (#2225)
Signed-off-by: mudler <mudler@localai.io>
2024-05-03 00:41:45 +02:00
LocalAI [bot] 2cc1bd85af
⬆️ Update ggerganov/llama.cpp (#2224)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-02 21:23:40 +00:00
Ettore Di Giacinto 2c5a46bc34
feat(ux): Add chat, tts, and image-gen pages to the WebUI (#2222)
* feat(webui): Add chat page

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(webui): Add image-gen page

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(webui): Add tts page

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-02 21:14:10 +02:00
Ettore Di Giacinto f7f8b4804b
models(gallery): Add Hermes-2-Pro-Llama-3-8B-GGUF (#2218)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-02 18:31:13 +02:00
Ettore Di Giacinto e5bd9a76c7
models(gallery): add wizardlm2 (#2209)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-02 18:31:02 +02:00
fakezeta 4690b534e0
feat: user defined inference device for CUDA and OpenVINO (#2212)
user defined inference device

configuration via main_gpu parameter
2024-05-02 09:54:29 +02:00
LocalAI [bot] 6a7a7996bb
⬆️ Update ggerganov/llama.cpp (#2213)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-01 21:19:44 +00:00
Ettore Di Giacinto 962ebbaf77
models(gallery): fixup phi-3 sha
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-05-01 23:06:58 +02:00
Richard Palethorpe 49df11b4e8 Replace own vector search with Chromem-go
Because my implementation is apparently not producing the expected
results and this project fits the bill quite nicely.

Most of the tests are failing, but the core functionality appears to
work.

- [x] Set
- [ ] Get
- [ ] Delete (we have to delete the whole collection presently)
- [ ] Query (it returns normalized embeddings, not the originals)
- [x] Query Normalized

Some features are missing from Chromem, but we don't strictly need
them. Meanwhile there are maybe some minor things that need adding on
our side to get the basic functionality working.
2024-04-18 10:16:41 +01:00
60 changed files with 2422 additions and 906 deletions

View File

@ -61,7 +61,7 @@ jobs:
tag-suffix: '-hipblas'
ffmpeg: 'false'
image-type: 'extras'
base-image: "rocm/dev-ubuntu-22.04:6.0-complete"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"

View File

@ -129,7 +129,7 @@ jobs:
ffmpeg: 'true'
image-type: 'extras'
aio: "-aio-gpu-hipblas"
base-image: "rocm/dev-ubuntu-22.04:6.0-complete"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
latest-image: 'latest-gpu-hipblas'
latest-image-aio: 'latest-aio-gpu-hipblas'
@ -141,7 +141,7 @@ jobs:
tag-suffix: '-hipblas'
ffmpeg: 'false'
image-type: 'extras'
base-image: "rocm/dev-ubuntu-22.04:6.0-complete"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
@ -218,7 +218,7 @@ jobs:
tag-suffix: '-hipblas-ffmpeg-core'
ffmpeg: 'true'
image-type: 'core'
base-image: "rocm/dev-ubuntu-22.04:6.0-complete"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"
@ -228,7 +228,7 @@ jobs:
tag-suffix: '-hipblas-core'
ffmpeg: 'false'
image-type: 'core'
base-image: "rocm/dev-ubuntu-22.04:6.0-complete"
base-image: "rocm/dev-ubuntu-22.04:6.1"
grpc-base-image: "ubuntu:22.04"
runs-on: 'arc-runner-set'
makeflags: "--jobs=3 --output-sync=target"

View File

@ -19,12 +19,8 @@ jobs:
strategy:
matrix:
include:
- build: 'avx2'
- build: ''
defines: ''
- build: 'avx'
defines: '-DLLAMA_AVX2=OFF'
- build: 'avx512'
defines: '-DLLAMA_AVX512=ON'
- build: 'cuda12'
defines: ''
- build: 'cuda11'
@ -74,7 +70,6 @@ jobs:
- name: Build
id: build
env:
CMAKE_ARGS: "${{ matrix.defines }}"
BUILD_ID: "${{ matrix.build }}"
run: |
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
@ -124,63 +119,7 @@ jobs:
name: stablediffusion
path: release/
build-macOS:
strategy:
matrix:
include:
- build: 'avx2'
defines: ''
- build: 'avx'
defines: '-DLLAMA_AVX2=OFF'
- build: 'avx512'
defines: '-DLLAMA_AVX512=ON'
runs-on: macOS-latest
steps:
- name: Clone
uses: actions/checkout@v4
with:
submodules: true
- uses: actions/setup-go@v5
with:
go-version: '1.21.x'
cache: false
- name: Dependencies
run: |
brew install protobuf grpc
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
- name: Build
id: build
env:
CMAKE_ARGS: "${{ matrix.defines }}"
BUILD_ID: "${{ matrix.build }}"
run: |
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
export PATH=$PATH:$GOPATH/bin
make dist
- uses: actions/upload-artifact@v4
with:
name: LocalAI-MacOS-${{ matrix.build }}
path: release/
- name: Release
uses: softprops/action-gh-release@v2
if: startsWith(github.ref, 'refs/tags/')
with:
files: |
release/*
build-macOS-arm64:
strategy:
matrix:
include:
- build: 'avx2'
defines: ''
- build: 'avx'
defines: '-DLLAMA_AVX2=OFF'
- build: 'avx512'
defines: '-DLLAMA_AVX512=ON'
runs-on: macos-14
steps:
- name: Clone
@ -198,9 +137,6 @@ jobs:
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
- name: Build
id: build
env:
CMAKE_ARGS: "${{ matrix.defines }}"
BUILD_ID: "${{ matrix.build }}"
run: |
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
@ -208,7 +144,7 @@ jobs:
make dist
- uses: actions/upload-artifact@v4
with:
name: LocalAI-MacOS-arm64-${{ matrix.build }}
name: LocalAI-MacOS-arm64
path: release/
- name: Release
uses: softprops/action-gh-release@v2

View File

@ -140,6 +140,18 @@ RUN if [ "${BUILD_TYPE}" = "clblas" ]; then \
rm -rf /var/lib/apt/lists/* \
; fi
RUN if [ "${BUILD_TYPE}" = "hipblas" ]; then \
apt-get update && \
apt-get install -y --no-install-recommends \
hipblas-dev \
rocblas-dev && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
# I have no idea why, but the ROCM lib packages don't trigger ldconfig after they install, which results in local-ai and others not being able
# to locate the libraries. We run ldconfig ourselves to work around this packaging deficiency
ldconfig \
; fi
###################################
###################################

View File

@ -5,7 +5,7 @@ BINARY_NAME=local-ai
# llama.cpp versions
GOLLAMA_STABLE_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
CPPLLAMA_VERSION?=f364eb6fb5d46118a76fa045f487318de4c24961
CPPLLAMA_VERSION?=b6aa6702030320a3d5fbc2508307af0d7c947e40
# gpt4all version
GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
@ -16,7 +16,7 @@ RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp
RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6
# whisper.cpp version
WHISPER_CPP_VERSION?=8fac6455ffeb0a0950a84e790ddb74f7290d33c4
WHISPER_CPP_VERSION?=58210d6a7634ea1e42e0a2dab611f4a0518731dc
# bert.cpp version
BERT_VERSION?=6abe312cded14042f6b7c3cd8edf082713334a4d
@ -152,9 +152,11 @@ ifeq ($(findstring tts,$(GO_TAGS)),tts)
OPTIONAL_GRPC+=backend-assets/grpc/piper
endif
ALL_GRPC_BACKENDS=backend-assets/grpc/langchain-huggingface
ALL_GRPC_BACKENDS=backend-assets/grpc/huggingface
ALL_GRPC_BACKENDS+=backend-assets/grpc/bert-embeddings
ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-cpp
ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-cpp-noavx
ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-cpp-fallback
ALL_GRPC_BACKENDS+=backend-assets/grpc/llama-ggml
ALL_GRPC_BACKENDS+=backend-assets/grpc/gpt4all
ALL_GRPC_BACKENDS+=backend-assets/grpc/rwkv
@ -293,6 +295,7 @@ clean: ## Remove build related file
rm -rf backend-assets/*
$(MAKE) -C backend/cpp/grpc clean
$(MAKE) -C backend/cpp/llama clean
rm -rf backend/cpp/llama-* || true
$(MAKE) dropreplace
$(MAKE) protogen-clean
rmdir pkg/grpc/proto || true
@ -311,14 +314,19 @@ build: prepare backend-assets grpcs ## Build the project
CGO_LDFLAGS="$(CGO_LDFLAGS)" $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o $(BINARY_NAME) ./
build-minimal:
BUILD_GRPC_FOR_BACKEND_LLAMA=true GRPC_BACKENDS=backend-assets/grpc/llama-cpp GO_TAGS=none $(MAKE) build
BUILD_GRPC_FOR_BACKEND_LLAMA=true GRPC_BACKENDS="backend-assets/grpc/llama-cpp" GO_TAGS=none $(MAKE) build
build-api:
BUILD_GRPC_FOR_BACKEND_LLAMA=true BUILD_API_ONLY=true GO_TAGS=none $(MAKE) build
dist: build
mkdir -p release
# if BUILD_ID is empty, then we don't append it to the binary name
ifeq ($(BUILD_ID),)
cp $(BINARY_NAME) release/$(BINARY_NAME)-$(OS)-$(ARCH)
else
cp $(BINARY_NAME) release/$(BINARY_NAME)-$(BUILD_ID)-$(OS)-$(ARCH)
endif
osx-signed: build
codesign --deep --force --sign "$(OSX_SIGNING_IDENTITY)" --entitlements "./Entitlements.plist" "./$(BINARY_NAME)"
@ -616,8 +624,8 @@ backend-assets/grpc/gpt4all: sources/gpt4all sources/gpt4all/gpt4all-bindings/go
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(CURDIR)/sources/gpt4all/gpt4all-bindings/golang/ \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt4all ./backend/go/llm/gpt4all/
backend-assets/grpc/langchain-huggingface: backend-assets/grpc
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/langchain-huggingface ./backend/go/llm/langchain/
backend-assets/grpc/huggingface: backend-assets/grpc
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/huggingface ./backend/go/llm/langchain/
backend/cpp/llama/llama.cpp:
LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/llama llama.cpp
@ -629,7 +637,7 @@ ADDED_CMAKE_ARGS=-Dabsl_DIR=${INSTALLED_LIB_CMAKE}/absl \
-Dutf8_range_DIR=${INSTALLED_LIB_CMAKE}/utf8_range \
-DgRPC_DIR=${INSTALLED_LIB_CMAKE}/grpc \
-DCMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES=${INSTALLED_PACKAGES}/include
backend/cpp/llama/grpc-server:
build-llama-cpp-grpc-server:
# Conditionally build grpc for the llama backend to use if needed
ifdef BUILD_GRPC_FOR_BACKEND_LLAMA
$(MAKE) -C backend/cpp/grpc build
@ -638,19 +646,37 @@ ifdef BUILD_GRPC_FOR_BACKEND_LLAMA
PATH="${INSTALLED_PACKAGES}/bin:${PATH}" \
CMAKE_ARGS="${CMAKE_ARGS} ${ADDED_CMAKE_ARGS}" \
LLAMA_VERSION=$(CPPLLAMA_VERSION) \
$(MAKE) -C backend/cpp/llama grpc-server
$(MAKE) -C backend/cpp/${VARIANT} grpc-server
else
echo "BUILD_GRPC_FOR_BACKEND_LLAMA is not defined."
LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/llama grpc-server
LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/${VARIANT} grpc-server
endif
backend-assets/grpc/llama-cpp: backend-assets/grpc backend/cpp/llama/grpc-server
cp -rfv backend/cpp/llama/grpc-server backend-assets/grpc/llama-cpp
backend-assets/grpc/llama-cpp: backend-assets/grpc
$(info ${GREEN}I llama-cpp build info:standard${RESET})
cp -rf backend/cpp/llama backend/cpp/llama-default
$(MAKE) -C backend/cpp/llama-default purge
$(MAKE) VARIANT="llama-default" build-llama-cpp-grpc-server
cp -rfv backend/cpp/llama-default/grpc-server backend-assets/grpc/llama-cpp
# TODO: every binary should have its own folder instead, so can have different metal implementations
ifeq ($(BUILD_TYPE),metal)
cp backend/cpp/llama/llama.cpp/build/bin/default.metallib backend-assets/grpc/
cp backend/cpp/llama-default/llama.cpp/build/bin/default.metallib backend-assets/grpc/
endif
backend-assets/grpc/llama-cpp-noavx: backend-assets/grpc
cp -rf backend/cpp/llama backend/cpp/llama-noavx
$(MAKE) -C backend/cpp/llama-noavx purge
$(info ${GREEN}I llama-cpp build info:noavx${RESET})
CMAKE_ARGS="$(CMAKE_ARGS) -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF" $(MAKE) VARIANT="llama-noavx" build-llama-cpp-grpc-server
cp -rfv backend/cpp/llama-noavx/grpc-server backend-assets/grpc/llama-cpp-noavx
backend-assets/grpc/llama-cpp-fallback: backend-assets/grpc
cp -rf backend/cpp/llama backend/cpp/llama-fallback
$(MAKE) -C backend/cpp/llama-fallback purge
$(info ${GREEN}I llama-cpp build info:fallback${RESET})
CMAKE_ARGS="$(CMAKE_ARGS) -DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" $(MAKE) VARIANT="llama-fallback" build-llama-cpp-grpc-server
cp -rfv backend/cpp/llama-fallback/grpc-server backend-assets/grpc/llama-cpp-fallback
backend-assets/grpc/llama-ggml: sources/go-llama.cpp sources/go-llama.cpp/libbinding.a backend-assets/grpc
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(CURDIR)/sources/go-llama.cpp LIBRARY_PATH=$(CURDIR)/sources/go-llama.cpp \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama-ggml ./backend/go/llm/llama-ggml/

View File

@ -50,6 +50,7 @@
[Roadmap](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap)
- Chat, TTS, and Image generation in the WebUI: https://github.com/mudler/LocalAI/pull/2222
- Reranker API: https://github.com/mudler/LocalAI/pull/2121
- Gallery WebUI: https://github.com/mudler/LocalAI/pull/2104
- llama3: https://github.com/mudler/LocalAI/discussions/2076
@ -113,6 +114,7 @@ Model galleries
Other:
- Helm chart https://github.com/go-skynet/helm-charts
- VSCode extension https://github.com/badgooooor/localai-vscode-plugin
- Terminal utility https://github.com/djcopley/ShellOracle
- Local Smart assistant https://github.com/mudler/LocalAGI
- Home Assistant https://github.com/sammcj/homeassistant-localai / https://github.com/drndos/hass-openai-custom-conversation
- Discord bot https://github.com/mudler/LocalAGI/tree/main/examples/discord
@ -131,7 +133,7 @@ Other:
## :book: 🎥 [Media, Blogs, Social](https://localai.io/basics/news/#media-blogs-social)
- [Run LocalAI on AWS EKS with Pulumi](https://www.pulumi.com/ai/answers/tiZMDoZzZV6TLxgDXNBnFE/deploying-helm-charts-on-aws-eks)
- [Run LocalAI on AWS EKS with Pulumi](https://www.pulumi.com/blog/low-code-llm-apps-with-local-ai-flowise-and-pulumi/)
- [Run LocalAI on AWS](https://staleks.hashnode.dev/installing-localai-on-aws-ec2-instance)
- [Create a slackbot for teams and OSS projects that answer to documentation](https://mudler.pm/posts/smart-slackbot-for-teams/)
- [LocalAI meets k8sgpt](https://www.youtube.com/watch?v=PKrDNuJ_dfE)

View File

@ -1,7 +1,7 @@
name: gpt-4
mmap: true
parameters:
model: huggingface://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF/Hermes-2-Pro-Mistral-7B.Q2_K.gguf
model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
template:
chat_message: |

View File

@ -1,7 +1,7 @@
name: gpt-4
mmap: true
parameters:
model: huggingface://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF/Hermes-2-Pro-Mistral-7B.Q6_K.gguf
model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
template:
chat_message: |

View File

@ -2,7 +2,7 @@ name: gpt-4
mmap: false
f16: false
parameters:
model: huggingface://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF/Hermes-2-Pro-Mistral-7B.Q6_K.gguf
model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
template:
chat_message: |

View File

@ -43,31 +43,23 @@ llama.cpp:
llama.cpp/examples/grpc-server: llama.cpp
mkdir -p llama.cpp/examples/grpc-server
cp -r $(abspath ./)/CMakeLists.txt llama.cpp/examples/grpc-server/
cp -r $(abspath ./)/grpc-server.cpp llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/json.hpp llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/utils.hpp llama.cpp/examples/grpc-server/
echo "add_subdirectory(grpc-server)" >> llama.cpp/examples/CMakeLists.txt
## XXX: In some versions of CMake clip wasn't being built before llama.
## This is an hack for now, but it should be fixed in the future.
cp -rfv llama.cpp/examples/llava/clip.h llama.cpp/examples/grpc-server/clip.h
cp -rfv llama.cpp/examples/llava/llava.cpp llama.cpp/examples/grpc-server/llava.cpp
echo '#include "llama.h"' > llama.cpp/examples/grpc-server/llava.h
cat llama.cpp/examples/llava/llava.h >> llama.cpp/examples/grpc-server/llava.h
cp -rfv llama.cpp/examples/llava/clip.cpp llama.cpp/examples/grpc-server/clip.cpp
bash prepare.sh
rebuild:
cp -rfv $(abspath ./)/CMakeLists.txt llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/grpc-server.cpp llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/json.hpp llama.cpp/examples/grpc-server/
bash prepare.sh
rm -rf grpc-server
$(MAKE) grpc-server
clean:
rm -rf llama.cpp
purge:
rm -rf llama.cpp/build
rm -rf llama.cpp/examples/grpc-server
rm -rf grpc-server
clean: purge
rm -rf llama.cpp
grpc-server: llama.cpp llama.cpp/examples/grpc-server
@echo "Building grpc-server with $(BUILD_TYPE) build type and $(CMAKE_ARGS)"
ifneq (,$(findstring sycl,$(BUILD_TYPE)))
bash -c "source $(ONEAPI_VARS); \
cd llama.cpp && mkdir -p build && cd build && cmake .. $(CMAKE_ARGS) && cmake --build . --config Release"

View File

@ -0,0 +1,20 @@
#!/bin/bash
cp -r CMakeLists.txt llama.cpp/examples/grpc-server/
cp -r grpc-server.cpp llama.cpp/examples/grpc-server/
cp -rfv json.hpp llama.cpp/examples/grpc-server/
cp -rfv utils.hpp llama.cpp/examples/grpc-server/
if grep -q "grpc-server" llama.cpp/examples/CMakeLists.txt; then
echo "grpc-server already added"
else
echo "add_subdirectory(grpc-server)" >> llama.cpp/examples/CMakeLists.txt
fi
## XXX: In some versions of CMake clip wasn't being built before llama.
## This is an hack for now, but it should be fixed in the future.
cp -rfv llama.cpp/examples/llava/clip.h llama.cpp/examples/grpc-server/clip.h
cp -rfv llama.cpp/examples/llava/llava.cpp llama.cpp/examples/grpc-server/llava.cpp
echo '#include "llama.h"' > llama.cpp/examples/grpc-server/llava.h
cat llama.cpp/examples/llava/llava.h >> llama.cpp/examples/grpc-server/llava.h
cp -rfv llama.cpp/examples/llava/clip.cpp llama.cpp/examples/grpc-server/clip.cpp

View File

@ -4,6 +4,7 @@ package main
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)
import (
"fmt"
"os"
"github.com/go-skynet/LocalAI/pkg/grpc/base"
pb "github.com/go-skynet/LocalAI/pkg/grpc/proto"
@ -18,9 +19,14 @@ type LLM struct {
}
func (llm *LLM) Load(opts *pb.ModelOptions) error {
llm.langchain, _ = langchain.NewHuggingFace(opts.Model)
var err error
hfToken := os.Getenv("HUGGINGFACEHUB_API_TOKEN")
if hfToken == "" {
return fmt.Errorf("no huggingface token provided")
}
llm.langchain, err = langchain.NewHuggingFace(opts.Model, hfToken)
llm.model = opts.Model
return nil
return err
}
func (llm *LLM) Predict(opts *pb.PredictOptions) (string, error) {

View File

@ -19,8 +19,12 @@ func main() {
log.Logger = log.Output(zerolog.ConsoleWriter{Out: os.Stderr})
flag.Parse()
s, err := NewStore()
if err != nil {
panic(err)
}
if err := grpc.StartServer(*addr, NewStore()); err != nil {
if err := grpc.StartServer(*addr, s); err != nil {
panic(err)
}
}

View File

@ -3,505 +3,90 @@ package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)
import (
"container/heap"
"context"
"fmt"
"math"
"slices"
"strconv"
"github.com/go-skynet/LocalAI/pkg/grpc/base"
pb "github.com/go-skynet/LocalAI/pkg/grpc/proto"
"github.com/rs/zerolog/log"
"github.com/philippgille/chromem-go"
)
type Store struct {
base.SingleThread
// The sorted keys
keys [][]float32
// The sorted values
values [][]byte
// If for every K it holds that ||k||^2 = 1, then we can use the normalized distance functions
// TODO: Should we normalize incoming keys if they are not instead?
keysAreNormalized bool
// The first key decides the length of the keys
keyLen int
maxId int
db *chromem.DB
c *chromem.Collection
}
// TODO: Only used for sorting using Go's builtin implementation. The interfaces are columnar because
// that's theoretically best for memory layout and cache locality, but this isn't optimized yet.
type Pair struct {
Key []float32
Value []byte
}
func NewStore() (*Store, error) {
db := chromem.NewDB()
c, err := db.CreateCollection("default", nil, nil)
if err != nil {
return nil, err
}
func NewStore() *Store {
return &Store{
keys: make([][]float32, 0),
values: make([][]byte, 0),
keysAreNormalized: true,
keyLen: -1,
}
}
func compareSlices(k1, k2 []float32) int {
assert(len(k1) == len(k2), fmt.Sprintf("compareSlices: len(k1) = %d, len(k2) = %d", len(k1), len(k2)))
return slices.Compare(k1, k2)
}
func hasKey(unsortedSlice [][]float32, target []float32) bool {
return slices.ContainsFunc(unsortedSlice, func(k []float32) bool {
return compareSlices(k, target) == 0
})
}
func findInSortedSlice(sortedSlice [][]float32, target []float32) (int, bool) {
return slices.BinarySearchFunc(sortedSlice, target, func(k, t []float32) int {
return compareSlices(k, t)
})
}
func isSortedPairs(kvs []Pair) bool {
for i := 1; i < len(kvs); i++ {
if compareSlices(kvs[i-1].Key, kvs[i].Key) > 0 {
return false
}
}
return true
}
func isSortedKeys(keys [][]float32) bool {
for i := 1; i < len(keys); i++ {
if compareSlices(keys[i-1], keys[i]) > 0 {
return false
}
}
return true
}
func sortIntoKeySlicese(keys []*pb.StoresKey) [][]float32 {
ks := make([][]float32, len(keys))
for i, k := range keys {
ks[i] = k.Floats
}
slices.SortFunc(ks, compareSlices)
assert(len(ks) == len(keys), fmt.Sprintf("len(ks) = %d, len(keys) = %d", len(ks), len(keys)))
assert(isSortedKeys(ks), "keys are not sorted")
return ks
db: db,
c: c,
}, nil
}
func (s *Store) Load(opts *pb.ModelOptions) error {
return nil
}
// Sort the incoming kvs and merge them with the existing sorted kvs
func (s *Store) StoresSet(opts *pb.StoresSetOptions) error {
if len(opts.Keys) == 0 {
return fmt.Errorf("no keys to add")
ids := make([]string, len(opts.Keys))
for i, _ := range(ids) {
ids[i] = strconv.Itoa(i)
}
if len(opts.Keys) != len(opts.Values) {
return fmt.Errorf("len(keys) = %d, len(values) = %d", len(opts.Keys), len(opts.Values))
embeddings := make([][]float32, len(opts.Keys))
for i, key := range opts.Keys {
embeddings[i] = key.Floats
}
if s.keyLen == -1 {
s.keyLen = len(opts.Keys[0].Floats)
} else {
if len(opts.Keys[0].Floats) != s.keyLen {
return fmt.Errorf("Try to add key with length %d when existing length is %d", len(opts.Keys[0].Floats), s.keyLen)
}
contents := make([]string, len(opts.Values))
for i, value := range opts.Values {
contents[i] = string(value.Bytes)
}
kvs := make([]Pair, len(opts.Keys))
for i, k := range opts.Keys {
if s.keysAreNormalized && !isNormalized(k.Floats) {
s.keysAreNormalized = false
var sample []float32
if len(s.keys) > 5 {
sample = k.Floats[:5]
} else {
sample = k.Floats
}
log.Debug().Msgf("Key is not normalized: %v", sample)
}
kvs[i] = Pair{
Key: k.Floats,
Value: opts.Values[i].Bytes,
}
}
slices.SortFunc(kvs, func(a, b Pair) int {
return compareSlices(a.Key, b.Key)
})
assert(len(kvs) == len(opts.Keys), fmt.Sprintf("len(kvs) = %d, len(opts.Keys) = %d", len(kvs), len(opts.Keys)))
assert(isSortedPairs(kvs), "keys are not sorted")
l := len(kvs) + len(s.keys)
merge_ks := make([][]float32, 0, l)
merge_vs := make([][]byte, 0, l)
i, j := 0, 0
for {
if i+j >= l {
break
}
if i >= len(kvs) {
merge_ks = append(merge_ks, s.keys[j])
merge_vs = append(merge_vs, s.values[j])
j++
continue
}
if j >= len(s.keys) {
merge_ks = append(merge_ks, kvs[i].Key)
merge_vs = append(merge_vs, kvs[i].Value)
i++
continue
}
c := compareSlices(kvs[i].Key, s.keys[j])
if c < 0 {
merge_ks = append(merge_ks, kvs[i].Key)
merge_vs = append(merge_vs, kvs[i].Value)
i++
} else if c > 0 {
merge_ks = append(merge_ks, s.keys[j])
merge_vs = append(merge_vs, s.values[j])
j++
} else {
merge_ks = append(merge_ks, kvs[i].Key)
merge_vs = append(merge_vs, kvs[i].Value)
i++
j++
}
}
assert(len(merge_ks) == l, fmt.Sprintf("len(merge_ks) = %d, l = %d", len(merge_ks), l))
assert(isSortedKeys(merge_ks), "merge keys are not sorted")
s.keys = merge_ks
s.values = merge_vs
return nil
return s.c.Add(context.Background(), ids, embeddings, nil, contents)
}
func (s *Store) StoresDelete(opts *pb.StoresDeleteOptions) error {
if len(opts.Keys) == 0 {
return fmt.Errorf("no keys to delete")
}
if len(opts.Keys) == 0 {
return fmt.Errorf("no keys to add")
}
if s.keyLen == -1 {
s.keyLen = len(opts.Keys[0].Floats)
} else {
if len(opts.Keys[0].Floats) != s.keyLen {
return fmt.Errorf("Trying to delete key with length %d when existing length is %d", len(opts.Keys[0].Floats), s.keyLen)
}
}
ks := sortIntoKeySlicese(opts.Keys)
l := len(s.keys) - len(ks)
merge_ks := make([][]float32, 0, l)
merge_vs := make([][]byte, 0, l)
tail_ks := s.keys
tail_vs := s.values
for _, k := range ks {
j, found := findInSortedSlice(tail_ks, k)
if found {
merge_ks = append(merge_ks, tail_ks[:j]...)
merge_vs = append(merge_vs, tail_vs[:j]...)
tail_ks = tail_ks[j+1:]
tail_vs = tail_vs[j+1:]
} else {
assert(!hasKey(s.keys, k), fmt.Sprintf("Key exists, but was not found: t=%d, %v", len(tail_ks), k))
}
log.Debug().Msgf("Delete: found = %v, t = %d, j = %d, len(merge_ks) = %d, len(merge_vs) = %d", found, len(tail_ks), j, len(merge_ks), len(merge_vs))
}
merge_ks = append(merge_ks, tail_ks...)
merge_vs = append(merge_vs, tail_vs...)
assert(len(merge_ks) <= len(s.keys), fmt.Sprintf("len(merge_ks) = %d, len(s.keys) = %d", len(merge_ks), len(s.keys)))
s.keys = merge_ks
s.values = merge_vs
assert(len(s.keys) >= l, fmt.Sprintf("len(s.keys) = %d, l = %d", len(s.keys), l))
assert(isSortedKeys(s.keys), "keys are not sorted")
assert(func() bool {
for _, k := range ks {
if _, found := findInSortedSlice(s.keys, k); found {
return false
}
}
return true
}(), "Keys to delete still present")
if len(s.keys) != l {
log.Debug().Msgf("Delete: Some keys not found: len(s.keys) = %d, l = %d", len(s.keys), l)
}
return nil
return fmt.Errorf("Per document delete not implemented in chromem")
}
func (s *Store) StoresGet(opts *pb.StoresGetOptions) (pb.StoresGetResult, error) {
pbKeys := make([]*pb.StoresKey, 0, len(opts.Keys))
pbValues := make([]*pb.StoresValue, 0, len(opts.Keys))
ks := sortIntoKeySlicese(opts.Keys)
if len(s.keys) == 0 {
log.Debug().Msgf("Get: No keys in store")
}
if s.keyLen == -1 {
s.keyLen = len(opts.Keys[0].Floats)
} else {
if len(opts.Keys[0].Floats) != s.keyLen {
return pb.StoresGetResult{}, fmt.Errorf("Try to get a key with length %d when existing length is %d", len(opts.Keys[0].Floats), s.keyLen)
}
}
tail_k := s.keys
tail_v := s.values
for i, k := range ks {
j, found := findInSortedSlice(tail_k, k)
if found {
pbKeys = append(pbKeys, &pb.StoresKey{
Floats: k,
})
pbValues = append(pbValues, &pb.StoresValue{
Bytes: tail_v[j],
})
tail_k = tail_k[j+1:]
tail_v = tail_v[j+1:]
} else {
assert(!hasKey(s.keys, k), fmt.Sprintf("Key exists, but was not found: i=%d, %v", i, k))
}
}
if len(pbKeys) != len(opts.Keys) {
log.Debug().Msgf("Get: Some keys not found: len(pbKeys) = %d, len(opts.Keys) = %d, len(s.Keys) = %d", len(pbKeys), len(opts.Keys), len(s.keys))
}
return pb.StoresGetResult{
Keys: pbKeys,
Values: pbValues,
}, nil
}
func isNormalized(k []float32) bool {
var sum float32
for _, v := range k {
sum += v
}
return sum == 1.0
}
// TODO: This we could replace with handwritten SIMD code
func normalizedCosineSimilarity(k1, k2 []float32) float32 {
assert(len(k1) == len(k2), fmt.Sprintf("normalizedCosineSimilarity: len(k1) = %d, len(k2) = %d", len(k1), len(k2)))
var dot float32
for i := 0; i < len(k1); i++ {
dot += k1[i] * k2[i]
}
assert(dot >= -1 && dot <= 1, fmt.Sprintf("dot = %f", dot))
// 2.0 * (1.0 - dot) would be the Euclidean distance
return dot
}
type PriorityItem struct {
Similarity float32
Key []float32
Value []byte
}
type PriorityQueue []*PriorityItem
func (pq PriorityQueue) Len() int { return len(pq) }
func (pq PriorityQueue) Less(i, j int) bool {
// Inverted because the most similar should be at the top
return pq[i].Similarity < pq[j].Similarity
}
func (pq PriorityQueue) Swap(i, j int) {
pq[i], pq[j] = pq[j], pq[i]
}
func (pq *PriorityQueue) Push(x any) {
item := x.(*PriorityItem)
*pq = append(*pq, item)
}
func (pq *PriorityQueue) Pop() any {
old := *pq
n := len(old)
item := old[n-1]
*pq = old[0 : n-1]
return item
}
func (s *Store) StoresFindNormalized(opts *pb.StoresFindOptions) (pb.StoresFindResult, error) {
tk := opts.Key.Floats
top_ks := make(PriorityQueue, 0, int(opts.TopK))
heap.Init(&top_ks)
for i, k := range s.keys {
sim := normalizedCosineSimilarity(tk, k)
heap.Push(&top_ks, &PriorityItem{
Similarity: sim,
Key: k,
Value: s.values[i],
})
if top_ks.Len() > int(opts.TopK) {
heap.Pop(&top_ks)
}
}
similarities := make([]float32, top_ks.Len())
pbKeys := make([]*pb.StoresKey, top_ks.Len())
pbValues := make([]*pb.StoresValue, top_ks.Len())
for i := top_ks.Len() - 1; i >= 0; i-- {
item := heap.Pop(&top_ks).(*PriorityItem)
similarities[i] = item.Similarity
pbKeys[i] = &pb.StoresKey{
Floats: item.Key,
}
pbValues[i] = &pb.StoresValue{
Bytes: item.Value,
}
}
return pb.StoresFindResult{
Keys: pbKeys,
Values: pbValues,
Similarities: similarities,
}, nil
}
func cosineSimilarity(k1, k2 []float32, mag1 float64) float32 {
assert(len(k1) == len(k2), fmt.Sprintf("cosineSimilarity: len(k1) = %d, len(k2) = %d", len(k1), len(k2)))
var dot, mag2 float64
for i := 0; i < len(k1); i++ {
dot += float64(k1[i] * k2[i])
mag2 += float64(k2[i] * k2[i])
}
sim := float32(dot / (mag1 * math.Sqrt(mag2)))
assert(sim >= -1 && sim <= 1, fmt.Sprintf("sim = %f", sim))
return sim
}
func (s *Store) StoresFindFallback(opts *pb.StoresFindOptions) (pb.StoresFindResult, error) {
tk := opts.Key.Floats
top_ks := make(PriorityQueue, 0, int(opts.TopK))
heap.Init(&top_ks)
var mag1 float64
for _, v := range tk {
mag1 += float64(v * v)
}
mag1 = math.Sqrt(mag1)
for i, k := range s.keys {
dist := cosineSimilarity(tk, k, mag1)
heap.Push(&top_ks, &PriorityItem{
Similarity: dist,
Key: k,
Value: s.values[i],
})
if top_ks.Len() > int(opts.TopK) {
heap.Pop(&top_ks)
}
}
similarities := make([]float32, top_ks.Len())
pbKeys := make([]*pb.StoresKey, top_ks.Len())
pbValues := make([]*pb.StoresValue, top_ks.Len())
for i := top_ks.Len() - 1; i >= 0; i-- {
item := heap.Pop(&top_ks).(*PriorityItem)
similarities[i] = item.Similarity
pbKeys[i] = &pb.StoresKey{
Floats: item.Key,
}
pbValues[i] = &pb.StoresValue{
Bytes: item.Value,
}
}
return pb.StoresFindResult{
Keys: pbKeys,
Values: pbValues,
Similarities: similarities,
}, nil
return pb.StoresGetResult{}, fmt.Errorf("Get not really implemented in chromem, although query may work")
}
func (s *Store) StoresFind(opts *pb.StoresFindOptions) (pb.StoresFindResult, error) {
tk := opts.Key.Floats
if len(tk) != s.keyLen {
return pb.StoresFindResult{}, fmt.Errorf("Try to find key with length %d when existing length is %d", len(tk), s.keyLen)
res, err := s.c.QueryEmbedding(context.Background(), opts.Key.Floats, int(opts.TopK), nil, nil)
if err != nil {
return pb.StoresFindResult{}, err
}
if opts.TopK < 1 {
return pb.StoresFindResult{}, fmt.Errorf("opts.TopK = %d, must be >= 1", opts.TopK)
keys := make([]*pb.StoresKey, len(res))
values := make([]*pb.StoresValue, len(res))
similarities := make([]float32, len(res))
for i, r := range(res) {
keys[i] = &pb.StoresKey{Floats: r.Embedding}
similarities[i] = r.Similarity
values[i] = &pb.StoresValue{Bytes: []byte(r.Content)}
}
if s.keyLen == -1 {
s.keyLen = len(opts.Key.Floats)
} else {
if len(opts.Key.Floats) != s.keyLen {
return pb.StoresFindResult{}, fmt.Errorf("Try to add key with length %d when existing length is %d", len(opts.Key.Floats), s.keyLen)
}
}
if s.keysAreNormalized && isNormalized(tk) {
return s.StoresFindNormalized(opts)
} else {
if s.keysAreNormalized {
var sample []float32
if len(s.keys) > 5 {
sample = tk[:5]
} else {
sample = tk
}
log.Debug().Msgf("Trying to compare non-normalized key with normalized keys: %v", sample)
}
return s.StoresFindFallback(opts)
}
return pb.StoresFindResult{
Keys: keys,
Values: values,
Similarities: similarities,
}, nil
}

View File

@ -89,8 +89,8 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
quantization = None
if self.CUDA:
if request.Device:
device_map=request.Device
if request.MainGPU:
device_map=request.MainGPU
else:
device_map="cuda:0"
if request.Quantization == "bnb_4bit":
@ -143,28 +143,48 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
from optimum.intel.openvino import OVModelForCausalLM
from openvino.runtime import Core
if "GPU" in Core().available_devices:
device_map="GPU"
if request.MainGPU:
device_map=request.MainGPU
else:
device_map="CPU"
device_map="AUTO"
devices = Core().available_devices
if "GPU" in " ".join(devices):
device_map="AUTO:GPU"
# While working on a fine tuned model, inference may give an inaccuracy and performance drop on GPU if winograd convolutions are selected.
# https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html
if "CPU" or "NPU" in device_map:
if "-CPU" or "-NPU" not in device_map:
ovconfig={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"}
else:
ovconfig={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT","GPU_DISABLE_WINOGRAD_CONVOLUTION": "YES"}
self.model = OVModelForCausalLM.from_pretrained(model_name,
compile=True,
trust_remote_code=request.TrustRemoteCode,
ov_config={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"},
ov_config=ovconfig,
device=device_map)
self.OV = True
elif request.Type == "OVModelForFeatureExtraction":
from optimum.intel.openvino import OVModelForFeatureExtraction
from openvino.runtime import Core
if "GPU" in Core().available_devices:
device_map="GPU"
if request.MainGPU:
device_map=request.MainGPU
else:
device_map="CPU"
device_map="AUTO"
devices = Core().available_devices
if "GPU" in " ".join(devices):
device_map="AUTO:GPU"
# While working on a fine tuned model, inference may give an inaccuracy and performance drop on GPU if winograd convolutions are selected.
# https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html
if "CPU" or "NPU" in device_map:
if "-CPU" or "-NPU" not in device_map:
ovconfig={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"}
else:
ovconfig={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT","GPU_DISABLE_WINOGRAD_CONVOLUTION": "YES"}
self.model = OVModelForFeatureExtraction.from_pretrained(model_name,
compile=True,
trust_remote_code=request.TrustRemoteCode,
ov_config={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"},
ov_config=ovconfig,
export=True,
device=device_map)
self.OV = True
@ -226,8 +246,8 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
# Pool to get sentence embeddings; i.e. generate one 1024 vector for the entire sentence
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print("Calculated embeddings for: " + request.Embeddings, file=sys.stderr)
print("Embeddings:", sentence_embeddings, file=sys.stderr)
# print("Calculated embeddings for: " + request.Embeddings, file=sys.stderr)
# print("Embeddings:", sentence_embeddings, file=sys.stderr)
return backend_pb2.EmbeddingResult(embeddings=sentence_embeddings[0])
async def _predict(self, request, context, streaming=False):
@ -371,4 +391,4 @@ if __name__ == "__main__":
)
args = parser.parse_args()
asyncio.run(serve(args.addr))
asyncio.run(serve(args.addr))

View File

@ -42,7 +42,7 @@ type RunCMD struct {
CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"`
UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"`
APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. When this is set, all the requests must be authenticated with one of these API keys" group:"api"`
DisableWelcome bool `env:"LOCALAI_DISABLE_WELCOME,DISABLE_WELCOME" default:"false" help:"Disable welcome pages" group:"api"`
DisableWebUI bool `env:"LOCALAI_DISABLE_WEBUI,DISABLE_WEBUI" default:"false" help:"Disable webui" group:"api"`
ParallelRequests bool `env:"LOCALAI_PARALLEL_REQUESTS,PARALLEL_REQUESTS" help:"Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm)" group:"backends"`
SingleActiveBackend bool `env:"LOCALAI_SINGLE_ACTIVE_BACKEND,SINGLE_ACTIVE_BACKEND" help:"Allow only one backend to be run at a time" group:"backends"`
@ -84,8 +84,8 @@ func (r *RunCMD) Run(ctx *Context) error {
idleWatchDog := r.EnableWatchdogIdle
busyWatchDog := r.EnableWatchdogBusy
if r.DisableWelcome {
opts = append(opts, config.DisableWelcomePage)
if r.DisableWebUI {
opts = append(opts, config.DisableWebUI)
}
if idleWatchDog || busyWatchDog {

View File

@ -15,7 +15,7 @@ type ApplicationConfig struct {
ConfigFile string
ModelPath string
UploadLimitMB, Threads, ContextSize int
DisableWelcomePage bool
DisableWebUI bool
F16 bool
Debug bool
ImageDir string
@ -107,8 +107,8 @@ var EnableWatchDogBusyCheck = func(o *ApplicationConfig) {
o.WatchDogBusy = true
}
var DisableWelcomePage = func(o *ApplicationConfig) {
o.DisableWelcomePage = true
var DisableWebUI = func(o *ApplicationConfig) {
o.DisableWebUI = true
}
func SetWatchDogBusyTimeout(t time.Duration) AppOption {

View File

@ -182,6 +182,12 @@ func (cl *BackendConfigLoader) GetAllBackendConfigs() []BackendConfig {
return res
}
func (cl *BackendConfigLoader) RemoveBackendConfig(m string) {
cl.Lock()
defer cl.Unlock()
delete(cl.configs, m)
}
func (cl *BackendConfigLoader) ListBackendConfigs() []string {
cl.Lock()
defer cl.Unlock()

View File

@ -1,7 +1,9 @@
package http
import (
"embed"
"errors"
"net/http"
"strings"
"github.com/go-skynet/LocalAI/pkg/utils"
@ -18,6 +20,8 @@ import (
"github.com/gofiber/contrib/fiberzerolog"
"github.com/gofiber/fiber/v2"
"github.com/gofiber/fiber/v2/middleware/cors"
"github.com/gofiber/fiber/v2/middleware/favicon"
"github.com/gofiber/fiber/v2/middleware/filesystem"
"github.com/gofiber/fiber/v2/middleware/recover"
// swagger handler
@ -42,6 +46,11 @@ func readAuthHeader(c *fiber.Ctx) string {
return authHeader
}
// Embed a directory
//
//go:embed static/*
var embedDirStatic embed.FS
// @title LocalAI API
// @version 2.0.0
// @description The LocalAI Rest API.
@ -169,10 +178,25 @@ func App(cl *config.BackendConfigLoader, ml *model.ModelLoader, appConfig *confi
routes.RegisterElevenLabsRoutes(app, cl, ml, appConfig, auth)
routes.RegisterLocalAIRoutes(app, cl, ml, appConfig, galleryService, auth)
routes.RegisterOpenAIRoutes(app, cl, ml, appConfig, auth)
routes.RegisterPagesRoutes(app, cl, ml, appConfig, auth)
routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService, auth)
if !appConfig.DisableWebUI {
routes.RegisterUIRoutes(app, cl, ml, appConfig, galleryService, auth)
}
routes.RegisterJINARoutes(app, cl, ml, appConfig, auth)
httpFS := http.FS(embedDirStatic)
app.Use(favicon.New(favicon.Config{
URL: "/favicon.ico",
FileSystem: httpFS,
File: "static/favicon.ico",
}))
app.Use("/static", filesystem.New(filesystem.Config{
Root: httpFS,
PathPrefix: "static",
Browse: true,
}))
// Define a custom 404 handler
// Note: keep this at the bottom!
app.Use(notFoundHandler)

View File

@ -708,10 +708,26 @@ var _ = Describe("API test", func() {
// The response should contain an URL
Expect(err).ToNot(HaveOccurred(), fmt.Sprint(resp))
dat, err := io.ReadAll(resp.Body)
Expect(err).ToNot(HaveOccurred(), string(dat))
Expect(string(dat)).To(ContainSubstring("http://127.0.0.1:9090/"), string(dat))
Expect(string(dat)).To(ContainSubstring(".png"), string(dat))
Expect(err).ToNot(HaveOccurred(), "error reading /image/generations response")
imgUrlResp := &schema.OpenAIResponse{}
err = json.Unmarshal(dat, imgUrlResp)
Expect(imgUrlResp.Data).ToNot(Or(BeNil(), BeZero()))
imgUrl := imgUrlResp.Data[0].URL
Expect(imgUrl).To(ContainSubstring("http://127.0.0.1:9090/"), imgUrl)
Expect(imgUrl).To(ContainSubstring(".png"), imgUrl)
imgResp, err := http.Get(imgUrl)
Expect(err).To(BeNil())
Expect(imgResp).ToNot(BeNil())
Expect(imgResp.StatusCode).To(Equal(200))
Expect(imgResp.ContentLength).To(BeNumerically(">", 0))
imgData := make([]byte, 512)
count, err := io.ReadFull(imgResp.Body, imgData)
Expect(err).To(Or(BeNil(), MatchError(io.EOF)))
Expect(count).To(BeNumerically(">", 0))
Expect(count).To(BeNumerically("<=", 512))
Expect(http.DetectContentType(imgData)).To(Equal("image/png"))
})
})
@ -787,11 +803,11 @@ var _ = Describe("API test", func() {
})
It("returns errors", func() {
backends := len(model.AutoLoadBackends) + 1 // +1 for huggingface
_, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "foomodel", Prompt: testPrompt})
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring(fmt.Sprintf("error, status code: 500, message: could not load model - all backends returned error: %d errors occurred:", backends)))
Expect(err.Error()).To(ContainSubstring("error, status code: 500, message: could not load model - all backends returned error:"))
})
It("transcribes audio", func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")

View File

@ -2,9 +2,11 @@ package elements
import (
"fmt"
"strings"
"github.com/chasefleming/elem-go"
"github.com/chasefleming/elem-go/attrs"
"github.com/go-skynet/LocalAI/core/services"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/xsync"
)
@ -13,7 +15,12 @@ const (
NoImage = "https://upload.wikimedia.org/wikipedia/commons/6/65/No-Image-Placeholder.svg"
)
func DoneProgress(uid, text string) string {
func DoneProgress(galleryID, text string, showDelete bool) string {
// Split by @ and grab the name
if strings.Contains(galleryID, "@") {
galleryID = strings.Split(galleryID, "@")[1]
}
return elem.Div(
attrs.Props{},
elem.H3(
@ -25,10 +32,11 @@ func DoneProgress(uid, text string) string {
},
elem.Text(text),
),
elem.If(showDelete, deleteButton(galleryID), reInstallButton(galleryID)),
).Render()
}
func ErrorProgress(err string) string {
func ErrorProgress(err, galleryName string) string {
return elem.Div(
attrs.Props{},
elem.H3(
@ -38,8 +46,9 @@ func ErrorProgress(err string) string {
"tabindex": "-1",
"autofocus": "",
},
elem.Text("Error"+err),
elem.Text("Error "+err),
),
installButton(galleryName),
).Render()
}
@ -64,12 +73,13 @@ func StartProgressBar(uid, progress, text string) string {
if progress == "" {
progress = "0"
}
return elem.Div(attrs.Props{
"hx-trigger": "done",
"hx-get": "/browse/job/" + uid,
"hx-swap": "outerHTML",
"hx-target": "this",
},
return elem.Div(
attrs.Props{
"hx-trigger": "done",
"hx-get": "/browse/job/" + uid,
"hx-swap": "innerHTML",
"hx-target": "this",
},
elem.H3(
attrs.Props{
"role": "status",
@ -99,11 +109,123 @@ func cardSpan(text, icon string) elem.Node {
elem.I(attrs.Props{
"class": icon + " pr-2",
}),
elem.Text(text),
//elem.Text(text),
)
}
func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[string, string]) string {
func searchableElement(text, icon string) elem.Node {
return elem.Form(
attrs.Props{},
elem.Input(
attrs.Props{
"type": "hidden",
"name": "search",
"value": text,
},
),
elem.Span(
attrs.Props{
"class": "inline-block bg-gray-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2",
},
elem.A(
attrs.Props{
// "name": "search",
// "value": text,
//"class": "inline-block bg-gray-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2",
"href": "#!",
"hx-post": "/browse/search/models",
"hx-target": "#search-results",
// TODO: this doesn't work
// "hx-vals": `{ \"search\": \"` + text + `\" }`,
"hx-indicator": ".htmx-indicator",
},
elem.I(attrs.Props{
"class": icon + " pr-2",
}),
elem.Text(text),
),
),
//elem.Text(text),
)
}
func link(text, url string) elem.Node {
return elem.A(
attrs.Props{
"class": "inline-block bg-gray-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2",
"href": url,
"target": "_blank",
},
elem.I(attrs.Props{
"class": "fas fa-link pr-2",
}),
elem.Text(text),
)
}
func installButton(galleryName string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-swap": "outerHTML",
// post the Model ID as param
"hx-post": "/browse/install/model/" + galleryName,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-download pr-2",
},
),
elem.Text("Install"),
)
}
func reInstallButton(galleryName string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-block rounded bg-primary ml-2 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-swap": "outerHTML",
// post the Model ID as param
"hx-post": "/browse/install/model/" + galleryName,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-arrow-rotate-right pr-2",
},
),
elem.Text("Reinstall"),
)
}
func deleteButton(modelName string) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"hx-confirm": "Are you sure you wish to delete the model?",
"class": "float-right inline-block rounded bg-red-800 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-red-accent-300 hover:shadow-red-2 focus:bg-red-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-red-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-swap": "outerHTML",
// post the Model ID as param
"hx-post": "/browse/delete/model/" + modelName,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-cancel pr-2",
},
),
elem.Text("Delete"),
)
}
func ListModels(models []*gallery.GalleryModel, processing *xsync.SyncedMap[string, string], galleryService *services.GalleryService) string {
//StartProgressBar(uid, "0")
modelsElements := []elem.Node{}
// span := func(s string) elem.Node {
@ -114,43 +236,6 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
// elem.Text(s),
// )
// }
deleteButton := func(m *gallery.GalleryModel) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-block rounded bg-red-800 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-red-accent-300 hover:shadow-red-2 focus:bg-red-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-red-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-swap": "outerHTML",
// post the Model ID as param
"hx-post": "/browse/delete/model/" + m.Name,
},
elem.I(
attrs.Props{
"class": "fa-solid fa-cancel pr-2",
},
),
elem.Text("Delete"),
)
}
installButton := func(m *gallery.GalleryModel) elem.Node {
return elem.Button(
attrs.Props{
"data-twe-ripple-init": "",
"data-twe-ripple-color": "light",
"class": "float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong",
"hx-swap": "outerHTML",
// post the Model ID as param
"hx-post": "/browse/install/model/" + fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name),
},
elem.I(
attrs.Props{
"class": "fa-solid fa-download pr-2",
},
),
elem.Text("Install"),
)
}
descriptionDiv := func(m *gallery.GalleryModel) elem.Node {
@ -175,7 +260,15 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
actionDiv := func(m *gallery.GalleryModel) elem.Node {
galleryID := fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name)
currentlyInstalling := installing.Exists(galleryID)
currentlyProcessing := processing.Exists(galleryID)
isDeletionOp := false
if currentlyProcessing {
status := galleryService.GetStatus(galleryID)
if status != nil && status.Deletion {
isDeletionOp = true
}
// if status == nil : "Waiting"
}
nodes := []elem.Node{
cardSpan("Repository: "+m.Gallery.Name, "fa-brands fa-git-alt"),
@ -187,25 +280,31 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
)
}
tagsNodes := []elem.Node{}
for _, tag := range m.Tags {
nodes = append(nodes,
cardSpan(tag, "fas fa-tag"),
tagsNodes = append(tagsNodes,
searchableElement(tag, "fas fa-tag"),
)
}
nodes = append(nodes,
elem.Div(
attrs.Props{
"class": "flex flex-row flex-wrap content-center",
},
tagsNodes...,
),
)
for i, url := range m.URLs {
nodes = append(nodes,
elem.A(
attrs.Props{
"class": "inline-block bg-gray-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2",
"href": url,
"target": "_blank",
},
elem.I(attrs.Props{
"class": "fas fa-link pr-2",
}),
elem.Text("Link #"+fmt.Sprintf("%d", i+1)),
))
link("Link #"+fmt.Sprintf("%d", i+1), url),
)
}
progressMessage := "Installation"
if isDeletionOp {
progressMessage = "Deletion"
}
return elem.Div(
@ -219,17 +318,17 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
nodes...,
),
elem.If(
currentlyInstalling,
currentlyProcessing,
elem.Node( // If currently installing, show progress bar
elem.Raw(StartProgressBar(installing.Get(galleryID), "0", "Installing")),
elem.Raw(StartProgressBar(processing.Get(galleryID), "0", progressMessage)),
), // Otherwise, show install button (if not installed) or display "Installed"
elem.If(m.Installed,
//elem.Node(elem.Div(
// attrs.Props{},
// span("Installed"), deleteButton(m),
// )),
deleteButton(m),
installButton(m),
elem.Node(elem.Div(
attrs.Props{},
reInstallButton(m.ID()),
deleteButton(m.Name),
)),
installButton(m.ID()),
),
),
)
@ -243,11 +342,13 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
m.Icon = NoImage
}
divProperties := attrs.Props{
"class": "flex justify-center items-center",
}
elems = append(elems,
elem.Div(attrs.Props{
"class": "flex justify-center items-center",
},
elem.Div(divProperties,
elem.A(attrs.Props{
"href": "#!",
// "class": "justify-center items-center",
@ -260,6 +361,19 @@ func ListModels(models []*gallery.GalleryModel, installing *xsync.SyncedMap[stri
),
))
_, trustRemoteCodeExists := m.Overrides["trust_remote_code"]
if trustRemoteCodeExists {
elems = append(elems, elem.Div(
attrs.Props{
"class": "flex justify-center items-center bg-red-500 text-white p-2 rounded-lg mt-2",
},
elem.I(attrs.Props{
"class": "fa-solid fa-circle-exclamation pr-2",
}),
elem.Text("Attention: Trust Remote Code is required for this model"),
))
}
elems = append(elems, descriptionDiv(m), actionDiv(m))
modelsElements = append(modelsElements,
elem.Div(

View File

@ -61,11 +61,11 @@ func (mgs *ModelGalleryEndpointService) ApplyModelGalleryEndpoint() func(c *fibe
return err
}
mgs.galleryApplier.C <- gallery.GalleryOp{
Req: input.GalleryModel,
Id: uuid.String(),
GalleryName: input.ID,
Galleries: mgs.galleries,
ConfigURL: input.ConfigURL,
Req: input.GalleryModel,
Id: uuid.String(),
GalleryModelName: input.ID,
Galleries: mgs.galleries,
ConfigURL: input.ConfigURL,
}
return c.JSON(struct {
ID string `json:"uuid"`
@ -79,8 +79,8 @@ func (mgs *ModelGalleryEndpointService) DeleteModelGalleryEndpoint() func(c *fib
modelName := c.Params("name")
mgs.galleryApplier.C <- gallery.GalleryOp{
Delete: true,
GalleryName: modelName,
Delete: true,
GalleryModelName: modelName,
}
uuid, err := uuid.NewUUID()

View File

@ -3,22 +3,39 @@ package localai
import (
"github.com/go-skynet/LocalAI/core/config"
"github.com/go-skynet/LocalAI/internal"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
)
func WelcomeEndpoint(appConfig *config.ApplicationConfig,
cl *config.BackendConfigLoader, ml *model.ModelLoader) func(*fiber.Ctx) error {
cl *config.BackendConfigLoader, ml *model.ModelLoader, modelStatus func() (map[string]string, map[string]string)) func(*fiber.Ctx) error {
return func(c *fiber.Ctx) error {
models, _ := ml.ListModels()
backendConfigs := cl.GetAllBackendConfigs()
galleryConfigs := map[string]*gallery.Config{}
for _, m := range backendConfigs {
cfg, err := gallery.GetLocalModelConfiguration(ml.ModelPath, m.Name)
if err != nil {
continue
}
galleryConfigs[m.Name] = cfg
}
// Get model statuses to display in the UI the operation in progress
processingModels, taskTypes := modelStatus()
summary := fiber.Map{
"Title": "LocalAI API - " + internal.PrintableVersion(),
"Version": internal.PrintableVersion(),
"Models": models,
"ModelsConfig": backendConfigs,
"GalleryConfig": galleryConfigs,
"ApplicationConfig": appConfig,
"ProcessingModels": processingModels,
"TaskTypes": taskTypes,
}
if string(c.Context().Request.Header.ContentType()) == "application/json" || len(c.Accepts("html")) == 0 {

View File

@ -63,10 +63,14 @@ func getBase64Image(s string) (string, error) {
return encoded, nil
}
// if the string instead is prefixed with "data:image/jpeg;base64,", drop it
if strings.HasPrefix(s, "data:image/jpeg;base64,") {
return strings.ReplaceAll(s, "data:image/jpeg;base64,", ""), nil
// if the string instead is prefixed with "data:image/...;base64,", drop it
dropPrefix := []string{"data:image/jpeg;base64,", "data:image/png;base64,"}
for _, prefix := range dropPrefix {
if strings.HasPrefix(s, prefix) {
return strings.ReplaceAll(s, prefix, ""), nil
}
}
return "", fmt.Errorf("not valid string")
}
@ -181,7 +185,7 @@ func updateRequestConfig(config *config.BackendConfig, input *schema.OpenAIReque
input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", index) + input.Messages[i].StringContent
index++
} else {
fmt.Print("Failed encoding image", err)
log.Error().Msgf("Failed encoding image: %s", err)
}
}
}

View File

@ -3,11 +3,14 @@ package routes
import (
"fmt"
"html/template"
"sort"
"strings"
"github.com/go-skynet/LocalAI/core/config"
"github.com/go-skynet/LocalAI/core/http/elements"
"github.com/go-skynet/LocalAI/core/http/endpoints/localai"
"github.com/go-skynet/LocalAI/core/services"
"github.com/go-skynet/LocalAI/internal"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/xsync"
@ -24,16 +27,64 @@ func RegisterUIRoutes(app *fiber.App,
auth func(*fiber.Ctx) error) {
// keeps the state of models that are being installed from the UI
var installingModels = xsync.NewSyncedMap[string, string]()
var processingModels = xsync.NewSyncedMap[string, string]()
// modelStatus returns the current status of the models being processed (installation or deletion)
// it is called asynchonously from the UI
modelStatus := func() (map[string]string, map[string]string) {
processingModelsData := processingModels.Map()
taskTypes := map[string]string{}
for k, v := range processingModelsData {
status := galleryService.GetStatus(v)
taskTypes[k] = "Installation"
if status != nil && status.Deletion {
taskTypes[k] = "Deletion"
} else if status == nil {
taskTypes[k] = "Waiting"
}
}
return processingModelsData, taskTypes
}
app.Get("/", auth, localai.WelcomeEndpoint(appConfig, cl, ml, modelStatus))
// Show the Models page (all models)
app.Get("/browse", auth, func(c *fiber.Ctx) error {
term := c.Query("term")
models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
// Get all available tags
allTags := map[string]struct{}{}
tags := []string{}
for _, m := range models {
for _, t := range m.Tags {
allTags[t] = struct{}{}
}
}
for t := range allTags {
tags = append(tags, t)
}
sort.Strings(tags)
if term != "" {
models = gallery.GalleryModels(models).Search(term)
}
// Get model statuses
processingModelsData, taskTypes := modelStatus()
summary := fiber.Map{
"Title": "LocalAI - Models",
"Models": template.HTML(elements.ListModels(models, installingModels)),
"Repositories": appConfig.Galleries,
"Title": "LocalAI - Models",
"Version": internal.PrintableVersion(),
"Models": template.HTML(elements.ListModels(models, processingModels, galleryService)),
"Repositories": appConfig.Galleries,
"AllTags": tags,
"ProcessingModels": processingModelsData,
"TaskTypes": taskTypes,
// "ApplicationConfig": appConfig,
}
@ -53,17 +104,7 @@ func RegisterUIRoutes(app *fiber.App,
models, _ := gallery.AvailableGalleryModels(appConfig.Galleries, appConfig.ModelPath)
filteredModels := []*gallery.GalleryModel{}
for _, m := range models {
if strings.Contains(m.Name, form.Search) ||
strings.Contains(m.Description, form.Search) ||
strings.Contains(m.Gallery.Name, form.Search) ||
strings.Contains(strings.Join(m.Tags, ","), form.Search) {
filteredModels = append(filteredModels, m)
}
}
return c.SendString(elements.ListModels(filteredModels, installingModels))
return c.SendString(elements.ListModels(gallery.GalleryModels(models).Search(form.Search), processingModels, galleryService))
})
/*
@ -84,12 +125,12 @@ func RegisterUIRoutes(app *fiber.App,
uid := id.String()
installingModels.Set(galleryID, uid)
processingModels.Set(galleryID, uid)
op := gallery.GalleryOp{
Id: uid,
GalleryName: galleryID,
Galleries: appConfig.Galleries,
Id: uid,
GalleryModelName: galleryID,
Galleries: appConfig.Galleries,
}
go func() {
galleryService.C <- op
@ -110,15 +151,16 @@ func RegisterUIRoutes(app *fiber.App,
uid := id.String()
installingModels.Set(galleryID, uid)
processingModels.Set(galleryID, uid)
op := gallery.GalleryOp{
Id: uid,
Delete: true,
GalleryName: galleryID,
Id: uid,
Delete: true,
GalleryModelName: galleryID,
}
go func() {
galleryService.C <- op
cl.RemoveBackendConfig(galleryID)
}()
return c.SendString(elements.StartProgressBar(uid, "0", "Deletion"))
@ -141,7 +183,7 @@ func RegisterUIRoutes(app *fiber.App,
return c.SendString(elements.ProgressBar("100"))
}
if status.Error != nil {
return c.SendString(elements.ErrorProgress(status.Error.Error()))
return c.SendString(elements.ErrorProgress(status.Error.Error(), status.GalleryModelName))
}
return c.SendString(elements.ProgressBar(fmt.Sprint(status.Progress)))
@ -153,17 +195,123 @@ func RegisterUIRoutes(app *fiber.App,
status := galleryService.GetStatus(c.Params("uid"))
for _, k := range installingModels.Keys() {
if installingModels.Get(k) == c.Params("uid") {
installingModels.Delete(k)
galleryID := ""
for _, k := range processingModels.Keys() {
if processingModels.Get(k) == c.Params("uid") {
galleryID = k
processingModels.Delete(k)
}
}
showDelete := true
displayText := "Installation completed"
if status.Deletion {
showDelete = false
displayText = "Deletion completed"
}
return c.SendString(elements.DoneProgress(c.Params("uid"), displayText))
return c.SendString(elements.DoneProgress(galleryID, displayText, showDelete))
})
// Show the Chat page
app.Get("/chat/:model", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
summary := fiber.Map{
"Title": "LocalAI - Chat with " + c.Params("model"),
"ModelsConfig": backendConfigs,
"Model": c.Params("model"),
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/chat", summary)
})
app.Get("/chat/", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
if len(backendConfigs) == 0 {
// If no model is available redirect to the index which suggests how to install models
return c.Redirect("/")
}
summary := fiber.Map{
"Title": "LocalAI - Chat with " + backendConfigs[0].Name,
"ModelsConfig": backendConfigs,
"Model": backendConfigs[0].Name,
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/chat", summary)
})
app.Get("/text2image/:model", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
summary := fiber.Map{
"Title": "LocalAI - Generate images with " + c.Params("model"),
"ModelsConfig": backendConfigs,
"Model": c.Params("model"),
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/text2image", summary)
})
app.Get("/text2image/", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
if len(backendConfigs) == 0 {
// If no model is available redirect to the index which suggests how to install models
return c.Redirect("/")
}
summary := fiber.Map{
"Title": "LocalAI - Generate images with " + backendConfigs[0].Name,
"ModelsConfig": backendConfigs,
"Model": backendConfigs[0].Name,
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/text2image", summary)
})
app.Get("/tts/:model", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
summary := fiber.Map{
"Title": "LocalAI - Generate images with " + c.Params("model"),
"ModelsConfig": backendConfigs,
"Model": c.Params("model"),
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/tts", summary)
})
app.Get("/tts/", auth, func(c *fiber.Ctx) error {
backendConfigs := cl.GetAllBackendConfigs()
if len(backendConfigs) == 0 {
// If no model is available redirect to the index which suggests how to install models
return c.Redirect("/")
}
summary := fiber.Map{
"Title": "LocalAI - Generate audio with " + backendConfigs[0].Name,
"ModelsConfig": backendConfigs,
"Model": backendConfigs[0].Name,
"Version": internal.PrintableVersion(),
}
// Render index
return c.Render("views/tts", summary)
})
}

View File

@ -1,19 +0,0 @@
package routes
import (
"github.com/go-skynet/LocalAI/core/config"
"github.com/go-skynet/LocalAI/core/http/endpoints/localai"
"github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
)
func RegisterPagesRoutes(app *fiber.App,
cl *config.BackendConfigLoader,
ml *model.ModelLoader,
appConfig *config.ApplicationConfig,
auth func(*fiber.Ctx) error) {
if !appConfig.DisableWelcomePage {
app.Get("/", auth, localai.WelcomeEndpoint(appConfig, cl, ml))
}
}

238
core/http/static/chat.js Normal file
View File

@ -0,0 +1,238 @@
/*
https://github.com/david-haerer/chatapi
MIT License
Copyright (c) 2023 David Härer
Copyright (c) 2024 Ettore Di Giacinto
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/
function submitKey(event) {
event.preventDefault();
localStorage.setItem("key", document.getElementById("apiKey").value);
document.getElementById("apiKey").blur();
}
function submitSystemPrompt(event) {
event.preventDefault();
localStorage.setItem("system_prompt", document.getElementById("systemPrompt").value);
document.getElementById("systemPrompt").blur();
}
var image = "";
function submitPrompt(event) {
event.preventDefault();
const input = document.getElementById("input").value;
Alpine.store("chat").add("user", input, image);
document.getElementById("input").value = "";
const key = localStorage.getItem("key");
const systemPrompt = localStorage.getItem("system_prompt");
promptGPT(systemPrompt, key, input);
}
function readInputImage() {
if (!this.files || !this.files[0]) return;
const FR = new FileReader();
FR.addEventListener("load", function(evt) {
image = evt.target.result;
});
FR.readAsDataURL(this.files[0]);
}
async function promptGPT(systemPrompt, key, input) {
const model = document.getElementById("chat-model").value;
// Set class "loader" to the element with "loader" id
//document.getElementById("loader").classList.add("loader");
// Make the "loader" visible
document.getElementById("loader").style.display = "block";
document.getElementById("input").disabled = true;
document.getElementById('messages').scrollIntoView(false)
messages = Alpine.store("chat").messages();
// if systemPrompt isn't empty, push it at the start of messages
if (systemPrompt) {
messages.unshift({
role: "system",
content: systemPrompt
});
}
// loop all messages, and check if there are images. If there are, we need to change the content field
messages.forEach((message) => {
if (message.image) {
// The content field now becomes an array
message.content = [
{
"type": "text",
"text": message.content
}
]
message.content.push(
{
"type": "image_url",
"image_url": {
"url": message.image,
}
}
);
// remove the image field
delete message.image;
}
});
// reset the form and the image
image = "";
document.getElementById("input_image").value = null;
document.getElementById("fileName").innerHTML = "";
// if (image) {
// // take the last element content's and add the image
// last_message = messages[messages.length - 1]
// // The content field now becomes an array
// last_message.content = [
// {
// "type": "text",
// "text": last_message.content
// }
// ]
// last_message.content.push(
// {
// "type": "image_url",
// "image_url": {
// "url": image,
// }
// }
// );
// // and we replace it in the messages array
// messages[messages.length - 1] = last_message
// // reset the form and the image
// image = "";
// document.getElementById("input_image").value = null;
// document.getElementById("fileName").innerHTML = "";
// }
// Source: https://stackoverflow.com/a/75751803/11386095
const response = await fetch("/v1/chat/completions", {
method: "POST",
headers: {
Authorization: `Bearer ${key}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: model,
messages: messages,
stream: true,
}),
});
if (!response.ok) {
Alpine.store("chat").add(
"assistant",
`<span class='error'>Error: POST /v1/chat/completions ${response.status}</span>`,
);
return;
}
const reader = response.body
?.pipeThrough(new TextDecoderStream())
.getReader();
if (!reader) {
Alpine.store("chat").add(
"assistant",
`<span class='error'>Error: Failed to decode API response</span>`,
);
return;
}
while (true) {
const { value, done } = await reader.read();
if (done) break;
let dataDone = false;
const arr = value.split("\n");
arr.forEach((data) => {
if (data.length === 0) return;
if (data.startsWith(":")) return;
if (data === "data: [DONE]") {
dataDone = true;
return;
}
const token = JSON.parse(data.substring(6)).choices[0].delta.content;
if (!token) {
return;
}
hljs.highlightAll();
Alpine.store("chat").add("assistant", token);
document.getElementById('messages').scrollIntoView(false)
});
hljs.highlightAll();
if (dataDone) break;
}
// Remove class "loader" from the element with "loader" id
//document.getElementById("loader").classList.remove("loader");
document.getElementById("loader").style.display = "none";
// enable input
document.getElementById("input").disabled = false;
// scroll to the bottom of the chat
document.getElementById('messages').scrollIntoView(false)
// set focus to the input
document.getElementById("input").focus();
}
document.getElementById("key").addEventListener("submit", submitKey);
document.getElementById("system_prompt").addEventListener("submit", submitSystemPrompt);
document.getElementById("prompt").addEventListener("submit", submitPrompt);
document.getElementById("input").focus();
document.getElementById("input_image").addEventListener("change", readInputImage);
storeKey = localStorage.getItem("key");
if (storeKey) {
document.getElementById("apiKey").value = storeKey;
} else {
document.getElementById("apiKey").value = null;
}
storesystemPrompt = localStorage.getItem("system_prompt");
if (storesystemPrompt) {
document.getElementById("systemPrompt").value = storesystemPrompt;
} else {
document.getElementById("systemPrompt").value = null;
}
marked.setOptions({
highlight: function (code) {
return hljs.highlightAuto(code).value;
},
});

Binary file not shown.

After

Width:  |  Height:  |  Size: 15 KiB

View File

@ -0,0 +1,93 @@
body {
font-family: 'Inter', sans-serif;
}
.chat-container { height: 90vh; display: flex; flex-direction: column; }
.chat-messages { overflow-y: auto; flex-grow: 1; }
.htmx-indicator{
opacity:0;
transition: opacity 10ms ease-in;
}
.htmx-request .htmx-indicator{
opacity:1
}
/* Loader (https://cssloaders.github.io/) */
.loader {
width: 12px;
height: 12px;
border-radius: 50%;
display: block;
margin:15px auto;
position: relative;
color: #FFF;
box-sizing: border-box;
animation: animloader 2s linear infinite;
}
@keyframes animloader {
0% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 -2px; }
25% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 2px; }
50% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 2px, -38px 0 0 -2px; }
75% { box-shadow: 14px 0 0 2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 -2px; }
100% { box-shadow: 14px 0 0 -2px, 38px 0 0 2px, -14px 0 0 -2px, -38px 0 0 -2px; }
}
.progress {
height: 20px;
margin-bottom: 20px;
overflow: hidden;
background-color: #f5f5f5;
border-radius: 4px;
box-shadow: inset 0 1px 2px rgba(0,0,0,.1);
}
.progress-bar {
float: left;
width: 0%;
height: 100%;
font-size: 12px;
line-height: 20px;
color: #fff;
text-align: center;
background-color: #337ab7;
-webkit-box-shadow: inset 0 -1px 0 rgba(0,0,0,.15);
box-shadow: inset 0 -1px 0 rgba(0,0,0,.15);
-webkit-transition: width .6s ease;
-o-transition: width .6s ease;
transition: width .6s ease;
}
.user {
background-color: #007bff;
}
.assistant {
background-color: #28a745;
}
.message {
display: flex;
align-items: center;
}
.user, .assistant {
flex-grow: 1;
margin: 0.5rem;
}
ul {
list-style-type: disc; /* Adds bullet points */
padding-left: 1.25rem; /* Indents the list from the left margin */
margin-top: 1rem; /* Space above the list */
}
li {
font-size: 0.875rem; /* Small text size */
color: #4a5568; /* Dark gray text */
background-color: #f7fafc; /* Very light gray background */
border-radius: 0.375rem; /* Rounded corners */
padding: 0.5rem; /* Padding inside each list item */
box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1), 0 1px 2px 0 rgba(0, 0, 0, 0.06); /* Subtle shadow */
margin-bottom: 0.5rem; /* Vertical space between list items */
}
li:last-child {
margin-bottom: 0; /* Removes bottom margin from the last item */
}

96
core/http/static/image.js Normal file
View File

@ -0,0 +1,96 @@
/*
https://github.com/david-haerer/chatapi
MIT License
Copyright (c) 2023 David Härer
Copyright (c) 2024 Ettore Di Giacinto
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/
function submitKey(event) {
event.preventDefault();
localStorage.setItem("key", document.getElementById("apiKey").value);
document.getElementById("apiKey").blur();
}
function genImage(event) {
event.preventDefault();
const input = document.getElementById("input").value;
const key = localStorage.getItem("key");
promptDallE(key, input);
}
async function promptDallE(key, input) {
document.getElementById("loader").style.display = "block";
document.getElementById("input").value = "";
document.getElementById("input").disabled = true;
const model = document.getElementById("image-model").value;
const response = await fetch("/v1/images/generations", {
method: "POST",
headers: {
Authorization: `Bearer ${key}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: model,
steps: 10,
prompt: input,
n: 1,
size: "512x512",
}),
});
const json = await response.json();
if (json.error) {
// Display error if there is one
var div = document.getElementById('result'); // Get the div by its ID
div.innerHTML = '<p style="color:red;">' + json.error.message + '</p>';
return;
}
const url = json.data[0].url;
var div = document.getElementById('result'); // Get the div by its ID
var img = document.createElement('img'); // Create a new img element
img.src = url; // Set the source of the image
img.alt = 'Generated image'; // Set the alt text of the image
div.innerHTML = ''; // Clear the existing content of the div
div.appendChild(img); // Add the new img element to the div
document.getElementById("loader").style.display = "none";
document.getElementById("input").disabled = false;
document.getElementById("input").focus();
}
document.getElementById("key").addEventListener("submit", submitKey);
document.getElementById("input").focus();
document.getElementById("genimage").addEventListener("submit", genImage);
document.getElementById("loader").style.display = "none";
const storeKey = localStorage.getItem("key");
if (storeKey) {
document.getElementById("apiKey").value = storeKey;
}

64
core/http/static/tts.js Normal file
View File

@ -0,0 +1,64 @@
function submitKey(event) {
event.preventDefault();
localStorage.setItem("key", document.getElementById("apiKey").value);
document.getElementById("apiKey").blur();
}
function genAudio(event) {
event.preventDefault();
const input = document.getElementById("input").value;
const key = localStorage.getItem("key");
tts(key, input);
}
async function tts(key, input) {
document.getElementById("loader").style.display = "block";
document.getElementById("input").value = "";
document.getElementById("input").disabled = true;
const model = document.getElementById("tts-model").value;
const response = await fetch("/tts", {
method: "POST",
headers: {
Authorization: `Bearer ${key}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: model,
input: input,
}),
});
if (!response.ok) {
const jsonData = await response.json(); // Now safely parse JSON
var div = document.getElementById('result');
div.innerHTML = '<p style="color:red;">Error: ' +jsonData.error.message + '</p>';
return;
}
var div = document.getElementById('result'); // Get the div by its ID
var link=document.createElement('a');
link.className = "m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong";
link.innerHTML = "<i class='fa-solid fa-download'></i> Download result";
const blob = await response.blob();
link.href=window.URL.createObjectURL(blob);
div.innerHTML = ''; // Clear the existing content of the div
div.appendChild(link); // Add the new img element to the div
console.log(link)
document.getElementById("loader").style.display = "none";
document.getElementById("input").disabled = false;
document.getElementById("input").focus();
}
document.getElementById("key").addEventListener("submit", submitKey);
document.getElementById("input").focus();
document.getElementById("tts").addEventListener("submit", genAudio);
document.getElementById("loader").style.display = "none";
const storeKey = localStorage.getItem("key");
if (storeKey) {
document.getElementById("apiKey").value = storeKey;
}

228
core/http/views/chat.html Normal file
View File

@ -0,0 +1,228 @@
<!--
Part of this page is based on the OpenAI Chatbot example by David Härer:
https://github.com/david-haerer/chatapi
MIT License Copyright (c) 2023 David Härer
Copyright (c) 2024 Ettore Di Giacinto
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
-->
<!doctype html>
<html lang="en">
{{template "views/partials/head" .}}
<script defer src="/static/chat.js"></script>
<style>
body {
overflow: hidden;
}
</style>
<body class="bg-gray-900 text-gray-200" x-data="{ key: $store.chat.key }">
<div class="flex flex-col min-h-screen">
{{template "views/partials/navbar"}}
<div class="chat-container mt-2 mr-2 ml-2 mb-2 bg-gray-800 shadow-lg rounded-lg" >
<!-- Chat Header -->
<div class="border-b border-gray-700 p-4" x-data="{ component: 'menu' }">
<div class="flex items-center justify-between">
<h1 class="text-lg font-semibold"> <i class="fa-solid fa-comments"></i> Chat with {{.Model}} <a href="https://localai.io/features/text-generation/" target="_blank" >
<i class="fas fa-circle-info pr-2"></i>
</a></h1>
<div x-show="component === 'menu'" id="menu">
<button
@click="$store.chat.clear()"
id="clear"
title="Clear chat history"
data-twe-ripple-init
data-twe-ripple-color="light"
class="m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
>
Clear chat 🔥
</button>
<button @click="component = 'key'" title="Update API key"
class="m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
>Set API Key🔑</button>
<button @click="component = 'system_prompt'" title="System Prompt"
class="m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
>Set system prompt</button>
</div>
<form x-show="component === 'key'" id="key">
<input
type="password"
id="apiKey"
name="apiKey"
class="bg-gray-800 text-white border border-gray-600 focus:border-blue-500 focus:ring focus:ring-blue-500 focus:ring-opacity-50 rounded-md shadow-sm p-2 appearance-none"
placeholder="OpenAI API Key"
x-model.lazy="key"
/>
<button @click="component = 'menu'" type="submit" title="Save API key">
<i class="fa-solid fa-arrow-right"></i>
</button>
</form>
<form x-show="component === 'system_prompt'" id="system_prompt">
<textarea
type="text"
id="systemPrompt"
name="systemPrompt"
class="bg-gray-800 text-white border border-gray-600 focus:border-blue-500 focus:ring focus:ring-blue-500 focus:ring-opacity-50 rounded-md shadow-sm p-2 appearance-none"
placeholder="System prompt"
x-model.lazy="system_prompt"
></textarea>
<button @click="component = 'menu'" type="submit" title="Save Prompt">
<i class="fa-solid fa-arrow-right"></i>
</button>
</form>
<select x-data="{ link : '' }" x-model="link" x-init="$watch('link', value => window.location = link)"
class="bg-gray-800 text-white border border-gray-600 focus:border-blue-500 focus:ring focus:ring-blue-500 focus:ring-opacity-50 rounded-md shadow-sm p-2 appearance-none"
>
<!-- Options -->
<option value="" disabled class="text-gray-400" >Select a model</option>
{{ $model:=.Model}}
{{ range .ModelsConfig }}
{{ if eq .Name $model }}
<option value="/chat/{{.Name}}" selected class="bg-gray-700 text-white">{{.Name}}</option>
{{ else }}
<option value="/chat/{{.Name}}" class="bg-gray-700 text-white">{{.Name}}</option>
{{ end }}
{{ end }}
</select>
</div>
</div>
<div class="chat-messages p-4" id="chat" x-data="{history: $store.chat.history}">
<p id="usage" x-show="history.length === 0">
Start chatting with the AI by typing a prompt in the input field below.
</p>
<div id="messages">
<template x-for="message in history">
<div class="message flex items-start space-x-2 my-2" >
<!--<img :src="message.role === 'user' ? '/path/to/user-icon.png' : '/path/to/bot-icon.png'" alt="" class="h-6 w-6">-->
<i class="fa-solid h-8 w-8" :class="message.role === 'user' ? 'fa-user' : 'fa-robot'" ></i>
<div class="flex flex-col flex-1">
<span class="text-xs font-semibold text-gray-600" x-text="message.role === 'user' ? 'User' : 'Assistant ({{.Model}})'"></span>
<template x-if="message.role === 'user'">
<div class="p-2 flex-1 rounded" :class="message.role" x-html="message.html"></div>
</template>
<template x-if="message.role === 'assistant'">
<div class="p-2 flex-1 rounded" :class="message.role" x-html="message.html"></div>
</template>
<template x-if="message.image">
<img :src="message.image" alt="Image" class="rounded-lg mt-2 h-36 w-36">
</template>
</div>
</div>
</template>
</div>
</div>
<div class="p-4 border-t border-gray-700" x-data="{ inputValue: '', shiftPressed: false, fileName: '' }">
<div id="loader" class="my-2 loader" style="display: none;"></div>
<input id="chat-model" type="hidden" value="{{.Model}}">
<input id="input_image" type="file" style="display: none;" @change="fileName = $event.target.files[0].name">
<form id="prompt" action="/chat/{{.Model}}" method="get" @submit.prevent="submitPrompt">
<div class="relative w-full">
<textarea
id="input"
name="input"
x-model="inputValue"
placeholder="Send a message..."
class="p-2 pl-2 border rounded w-full bg-gray-600 text-white placeholder-gray-300"
required
@keydown.shift="shiftPressed = true"
@keyup.shift="shiftPressed = false"
@keydown.enter="if (!shiftPressed) { submitPrompt($event); }"
style="padding-right: 4rem;"
></textarea>
<span x-text="fileName" id="fileName" class="absolute right-16 top-5 text-gray-300 text-sm mr-2"></span>
<button type="button" onclick="document.getElementById('input_image').click()" class="fa-solid fa-paperclip text-gray-300 ml-2 absolute right-10 top-3 text-lg p-2">
</button>
<button type=submit><i class="fa-solid fa-circle-up text-gray-300 absolute right-2 top-3 text-lg p-2"></i></button>
</div>
</form>
</div>
<script>
document.addEventListener("alpine:init", () => {
Alpine.store("chat", {
history: [],
languages: [undefined],
clear() {
this.history.length = 0;
},
add(role, content, image) {
const N = this.history.length - 1;
if (this.history.length && this.history[N].role === role) {
this.history[N].content += content;
str = this.history[N].content;
this.history[N].html = DOMPurify.sanitize(
marked.parse(this.history[N].content),
);
} else {
c = ""
// split content newlines in content
const lines = content.split("\n");
// for each line, do DOMPurify.sanitize(marked.parse(line)) and add it to c
lines.forEach((line) => {
c += DOMPurify.sanitize(marked.parse(line));
});
this.history.push({
role: role,
content: content,
html: c,
image: image,
});
}
const parser = new DOMParser();
const html = parser.parseFromString(
this.history[this.history.length - 1].html,
"text/html",
);
const code = html.querySelectorAll("pre code");
if (!code.length) return;
code.forEach((el) => {
const language = el.className.split("language-")[1];
if (this.languages.includes(language)) return;
const script = document.createElement("script");
script.src = `https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.8.0/build/languages/${language}.min.js`;
document.head.appendChild(script);
this.languages.push(language);
});
},
messages() {
return this.history.map((message) => {
return {
role: message.role,
content: message.content,
image: message.image,
};
});
},
});
});
</script>
</div>
</body>
</html>

View File

@ -10,23 +10,76 @@
<div class="container mx-auto px-4 flex-grow">
<div class="header text-center py-12">
<h1 class="text-5xl font-bold text-gray-100">Welcome to <i>your</i> LocalAI instance!</h1>
<div class="mt-6">
<!-- Logo can be uncommented and updated with a valid URL -->
</div>
<p class="mt-4 text-lg">The FOSS alternative to OpenAI, Claude, ...</p>
<a href="https://localai.io" target="_blank" class="mt-4 inline-block bg-blue-500 text-white py-2 px-4 rounded-lg shadow transition duration-300 ease-in-out hover:bg-blue-700 hover:shadow-lg">
<i class="fas fa-book-reader pr-2"></i>Documentation
</a>
</a>
</div>
<div class="models mt-12">
<div class="models mt-4">
<!-- Show in progress operations-->
{{ if .ProcessingModels }}
<h3
class="mt-4 mb-4 text-center text-3xl font-semibold text-gray-100">Operations in progress</h2>
{{end}}
{{$taskType:=.TaskTypes}}
{{ range $key,$value:=.ProcessingModels }}
{{ $op := index $taskType $key}}
{{$parts := split "@" $key}}
<div class="flex items-center justify-between bg-slate-600 p-2 mb-2 rounded-md">
<div class="flex items center">
<span class="text-gray-300"><a href="/browse?term={{$parts._1}}"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
>{{$parts._1}}</a> (from the '{{$parts._0}}' repository)</span>
</div>
<div hx-get="/browse/job/{{$value}}" hx-swap="innerHTML" hx-target="this" hx-trigger="done">
<h3 role="status" id="pblabel" >{{$op}}
<div hx-get="/browse/job/progress/{{$value}}" hx-trigger="every 600ms" hx-target="this"
hx-swap= "innerHTML" ></div></h3>
</div>
</div>
{{ end }}
<!-- END Show in progress operations-->
{{ if eq (len .ModelsConfig) 0 }}
<h2 class="text-center text-3xl font-semibold text-gray-100"> <i class="text-yellow-200 ml-2 fa-solid fa-triangle-exclamation animate-pulse"></i> Ouch! seems you don't have any models installed!</h2>
<p class="text-center mt-4 text-xl">..install something from the <a class="text-gray-400 hover:text-white ml-1 px-3 py-2 rounded" href="/browse">🖼️ Gallery</a> or check the <a href="https://localai.io/basics/getting_started/" class="text-gray-400 hover:text-white ml-1 px-3 py-2 rounded"> <i class="fa-solid fa-book"></i> Getting started documentation </a></p>
{{ else }}
<h2 class="text-center text-3xl font-semibold text-gray-100">Installed models</h2>
<p class="text-center mt-4 text-xl">We have {{len .ModelsConfig}} pre-loaded models available.</p>
<ul class="mt-8 space-y-4">
<table class="table-auto mt-4 w-full text-left text-gray-200">
<thead class="text-xs text-gray-400 uppercase bg-gray-700">
<tr>
<th class="px-4 py-2"></th>
<th class="px-4 py-2">Model Name</th>
<th class="px-4 py-2">Backend</th>
<th class="px-4 py-2 float-right">Actions</th>
</tr>
</thead>
<tbody>
{{$galleryConfig:=.GalleryConfig}}
{{$noicon:="https://upload.wikimedia.org/wikipedia/commons/6/65/No-Image-Placeholder.svg"}}
{{ range .ModelsConfig }}
<li class="bg-gray-800 border border-gray-700 p-4 rounded-lg">
<div class="flex justify-between items-center">
<p class="font-bold text-white flex items-center"><i class="fas fa-brain pr-2"></i>{{.Name}}</p>
{{ $cfg:= index $galleryConfig .Name}}
<tr class="bg-gray-800 border-b border-gray-700">
<td class="px-4 py-3">
{{ with $cfg }}
<img {{ if $cfg.Icon }}
src="{{$cfg.Icon}}"
{{ else }}
src="{{$noicon}}"
{{ end }}
class="rounded-t-lg max-h-24 max-w-24 object-cover mt-3"
>
{{ else}}
<img src="{{$noicon}}" class="rounded-t-lg max-h-24 max-w-24 object-cover mt-3">
{{ end }}
</td>
<td class="px-4 py-3 font-bold">
<p class="font-bold text-white flex items-center"><i class="fas fa-brain pr-2"></i><a href="/browse?term={{.Name}}">{{.Name}}</a></p>
</td>
<td class="px-4 py-3 font-bold">
{{ if .Backend }}
<!-- Badge for Backend -->
<span class="inline-block bg-blue-500 text-white py-1 px-3 rounded-full text-xs">
@ -37,11 +90,20 @@
auto
</span>
{{ end }}
</div>
<!-- Additional details can go here -->
</li>
</td>
<td class="px-4 py-3">
<button
class="float-right inline-block rounded bg-red-800 px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-red-accent-300 hover:shadow-red-2 focus:bg-red-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-red-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
data-twe-ripple-color="light" data-twe-ripple-init="" hx-confirm="Are you sure you wish to delete the model?" hx-post="/browse/delete/model/{{.Name}}" hx-swap="outerHTML"><i class="fa-solid fa-cancel pr-2"></i>Delete</button>
</td>
{{ end }}
</ul>
</tbody>
</table>
{{ end }}
</div>
</div>

View File

@ -13,10 +13,83 @@
🖼️ Available models from <i>{{ len .Repositories }}</i> repositories <a href="https://localai.io/models/" target="_blank" >
<i class="fas fa-circle-info pr-2"></i>
</a></h2>
<div class="text-center font-semibold text-gray-100">
<h2>Filter by type:</h2>
<button hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "tts"}'
hx-indicator=".htmx-indicator" >TTS</button>
<button hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "stablediffusion"}'
hx-indicator=".htmx-indicator" >Image generation</button>
<button hx-post="/browse/search/models" \
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "llm"}'
hx-indicator=".htmx-indicator" >Text generation</button>
<button hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "multimodal"}'
hx-indicator=".htmx-indicator" >Multimodal</button>
<button hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "embedding"}'
hx-indicator=".htmx-indicator" >Embeddings</button>
<button hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "rerank"}'
hx-indicator=".htmx-indicator" >Rerankers</button>
<button
hx-post="/browse/search/models"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
hx-target="#search-results"
hx-vals='{"search": "whisper"}'
hx-indicator=".htmx-indicator" >Audio transcription</button>
</div>
<div class="text-center text-xs font-semibold text-gray-100">
Filter by tags:
{{ range .AllTags }}
<button hx-post="/browse/search/models" class="text-blue-500" hx-target="#search-results"
hx-vals='{"search": "{{.}}"}'
hx-indicator=".htmx-indicator" >{{.}}</button>
{{ end }}
</div>
<span class="htmx-indicator loader"></span>
<input class="form-control appearance-none block w-full px-3 py-2 text-base font-normal text-gray-300 pb-2 mb-5 bg-gray-800 bg-clip-padding border border-solid border-gray-600 rounded transition ease-in-out m-0 focus:text-gray-300 focus:bg-gray-900 focus:border-blue-500 focus:outline-none" type="search"
<!-- Show in progress operations-->
{{ if .ProcessingModels }}
<h2
class="mt-4 mb-4 text-center text-3xl font-semibold text-gray-100">Operations in progress</h2>
{{end}}
{{$taskType:=.TaskTypes}}
{{ range $key,$value:=.ProcessingModels }}
{{ $op := index $taskType $key}}
{{$parts := split "@" $key}}
<div class="flex items-center justify-between bg-slate-600 p-2 mb-2 rounded-md">
<div class="flex items center">
<span class="text-gray-300"><a href="/browse?term={{$parts._1}}"
class="text-white-500 inline-block bg-blue-200 rounded-full px-3 py-1 text-sm font-semibold text-gray-700 mr-2 mb-2 hover:bg-gray-300 hover:shadow-gray-2"
>{{$parts._1}}</a> (from the '{{$parts._0}}' repository)</span>
</div>
<div hx-get="/browse/job/{{$value}}" hx-swap="innerHTML" hx-target="this" hx-trigger="done">
<h3 role="status" id="pblabel" >{{$op}}
<div hx-get="/browse/job/progress/{{$value}}" hx-trigger="every 600ms" hx-target="this"
hx-swap= "innerHTML" ></div></h3>
</div>
</div>
{{ end }}
<!-- END Show in progress operations-->
<input class="form-control appearance-none block w-full mt-5 px-3 py-2 text-base font-normal text-gray-300 pb-2 mb-5 bg-gray-800 bg-clip-padding border border-solid border-gray-600 rounded transition ease-in-out m-0 focus:text-gray-300 focus:bg-gray-900 focus:border-blue-500 focus:outline-none" type="search"
name="search" placeholder="Begin Typing To Search models..."
hx-post="/browse/search/models"
hx-trigger="input changed delay:500ms, search"

View File

@ -2,6 +2,28 @@
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{.Title}}</title>
<link
rel="stylesheet"
href="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.8.0/build/styles/default.min.css"
/>
<script
defer
src="https://cdn.jsdelivr.net/gh/highlightjs/cdn-release@11.8.0/build/highlight.min.js"
></script>
<script
defer
src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"
></script>
<script
defer
src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"
></script>
<script
defer
src="https://cdn.jsdelivr.net/npm/dompurify@3.0.6/dist/purify.min.js"
></script>
<link href="/static/general.css" rel="stylesheet" />
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600;700&family=Roboto:wght@400;500&display=swap" rel="stylesheet">
<link
href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700,900&display=swap"
@ -27,52 +49,4 @@
</script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.1.1/css/all.min.css">
<script src="https://unpkg.com/htmx.org@1.9.12" integrity="sha384-ujb1lZYygJmzgSwoxRggbCHcjc0rB2XoQrxeTUQyRjrOnlCoYta87iKBWq3EsdM2" crossorigin="anonymous"></script>
<style>
body {
font-family: 'Inter', sans-serif;
}
/* Loader (https://cssloaders.github.io/) */
.loader {
width: 12px;
height: 12px;
border-radius: 50%;
display: block;
margin:15px auto;
position: relative;
color: #FFF;
box-sizing: border-box;
animation: animloader 2s linear infinite;
}
@keyframes animloader {
0% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 -2px; }
25% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 2px; }
50% { box-shadow: 14px 0 0 -2px, 38px 0 0 -2px, -14px 0 0 2px, -38px 0 0 -2px; }
75% { box-shadow: 14px 0 0 2px, 38px 0 0 -2px, -14px 0 0 -2px, -38px 0 0 -2px; }
100% { box-shadow: 14px 0 0 -2px, 38px 0 0 2px, -14px 0 0 -2px, -38px 0 0 -2px; }
}
.progress {
height: 20px;
margin-bottom: 20px;
overflow: hidden;
background-color: #f5f5f5;
border-radius: 4px;
box-shadow: inset 0 1px 2px rgba(0,0,0,.1);
}
.progress-bar {
float: left;
width: 0%;
height: 100%;
font-size: 12px;
line-height: 20px;
color: #fff;
text-align: center;
background-color: #337ab7;
-webkit-box-shadow: inset 0 -1px 0 rgba(0,0,0,.15);
box-shadow: inset 0 -1px 0 rgba(0,0,0,.15);
-webkit-transition: width .6s ease;
-o-transition: width .6s ease;
transition: width .6s ease;
}
</style>
</head>

View File

@ -6,12 +6,42 @@
<a href="/" class="text-white text-xl font-bold"><img src="https://github.com/go-skynet/LocalAI/assets/2420543/0966aa2a-166e-4f99-a3e5-6c915fc997dd" alt="LocalAI Logo" class="h-10 mr-3 border-2 border-gray-300 shadow rounded"></a>
<a href="/" class="text-white text-xl font-bold">LocalAI</a>
</div>
<div>
<!-- Menu button for small screens -->
<div class="lg:hidden">
<button id="menu-toggle" class="text-gray-400 hover:text-white focus:outline-none">
<i class="fas fa-bars fa-lg"></i>
</button>
</div>
<!-- Navigation links -->
<div class="hidden lg:flex lg:items-center lg:justify-end lg:flex-1 lg:w-0">
<a href="/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fas fa-home pr-2"></i>Home</a>
<a href="https://localai.io" class="text-gray-400 hover:text-white px-3 py-2 rounded" target="_blank" ><i class="fas fa-book-reader pr-2"></i> Documentation</a>
<a href="/browse/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fas fa-brain pr-2"></i> Models</a>
<a href="/chat/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fa-solid fa-comments pr-2"></i> Chat</a>
<a href="/text2image/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fas fa-image pr-2"></i> Generate images</a>
<a href="/tts/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fa-solid fa-music pr-2"></i> TTS </a>
<a href="/swagger/" class="text-gray-400 hover:text-white px-3 py-2 rounded"><i class="fas fa-code pr-2"></i> API</a>
</div>
</div>
<!-- Collapsible menu for small screens -->
<div class="hidden lg:hidden" id="mobile-menu">
<div class="pt-4 pb-3 border-t border-gray-700">
<a href="/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fas fa-home pr-2"></i>Home</a>
<a href="https://localai.io" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1" target="_blank" ><i class="fas fa-book-reader pr-2"></i> Documentation</a>
<a href="/browse/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fas fa-brain pr-2"></i> Models</a>
<a href="/chat/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fa-solid fa-comments pr-2"></i> Chat</a>
<a href="/text2image/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fas fa-image pr-2"></i> Generate images</a>
<a href="/tts/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fa-solid fa-music pr-2"></i> TTS </a>
<a href="/swagger/" class="block text-gray-400 hover:text-white px-3 py-2 rounded mt-1"><i class="fas fa-code pr-2"></i> API</a>
</div>
</div>
</div>
</nav>
</nav>
<script>
// JavaScript to toggle the mobile menu
document.getElementById('menu-toggle').addEventListener('click', function () {
var mobileMenu = document.getElementById('mobile-menu');
mobileMenu.classList.toggle('hidden');
});
</script>

View File

@ -0,0 +1,89 @@
<!DOCTYPE html>
<html lang="en">
{{template "views/partials/head" .}}
<script defer src="/static/image.js"></script>
<body class="bg-gray-900 text-gray-200">
<div class="flex flex-col min-h-screen">
{{template "views/partials/navbar" .}}
<div class="container mx-auto px-4 flex-grow " x-data="{ component: 'menu' }">
<div class="mt-12">
<div class="flex items-center justify-center text-center pb-2">
<span class="text-3xl font-semibold text-gray-100">
🖼️ Text to Image
<a href="https://localai.io/features/image-generation" target="_blank" >
<i class="fas fa-circle-info pr-2"></i>
</a>
</span>
</div>
<div class="text-center font-semibold text-gray-100">
<div class="flex items-center justify-between">
<div x-show="component === 'menu'" id="menu">
<button @click="component = 'key'" title="Update API key"
class="m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
>Set API Key🔑</button>
</div>
<form x-show="component === 'key'" id="key">
<input
type="password"
id="apiKey"
name="apiKey"
placeholder="OpenAI API Key"
x-model.lazy="key"
/>
<button @click="component = 'menu'" type="submit" title="Save API key">
🔒
</button>
</form>
<select x-data="{ link : '' }" x-model="link" x-init="$watch('link', value => window.location = link)"
class="bg-gray-800 text-white border border-gray-600 focus:border-blue-500 focus:ring focus:ring-blue-500 focus:ring-opacity-50 rounded-md shadow-sm p-2 appearance-none"
>
<!-- Options -->
<option value="" disabled class="text-gray-400" >Select a model</option>
{{ $model:=.Model}}
{{ range .ModelsConfig }}
{{ if eq .Name $model }}
<option value="/text2image/{{.Name}}" selected class="bg-gray-700 text-white">{{.Name}}</option>
{{ else }}
<option value="/text2image/{{.Name}}" class="bg-gray-700 text-white">{{.Name}}</option>
{{ end }}
{{ end }}
</select>
</div>
</div>
<div class="mt-12">
<input id="image-model" type="hidden" value="{{.Model}}">
<form id="genimage" action="/text2image/{{.Model}}" method="get">
<input
type="text"
id="input"
name="input"
placeholder="Prompt…"
autocomplete="off"
class="p-2 border rounded w-full bg-gray-600 text-white placeholder-gray-300"
required
/>
</form>
<div class="container max-w-screen-lg mx-auto mt-4 pb-10 flex justify-center">
<div id="loader" class="my-2 loader" ></div>
</div>
<div class="container max-w-screen-lg mx-auto mt-4 pb-10 flex justify-center">
<div id="result" class="mx-auto"></div>
</div>
</div>
</div>
</div>
{{template "views/partials/footer" .}}
</div>
</body>
</html>

86
core/http/views/tts.html Normal file
View File

@ -0,0 +1,86 @@
<!DOCTYPE html>
<html lang="en">
{{template "views/partials/head" .}}
<script defer src="/static/tts.js"></script>
<body class="bg-gray-900 text-gray-200">
<div class="flex flex-col min-h-screen">
{{template "views/partials/navbar" .}}
<div class="container mx-auto px-4 flex-grow " x-data="{ component: 'menu' }">
<div class="mt-12">
<div class="flex items-center justify-center text-center pb-2">
<span class="text-3xl font-semibold text-gray-100">
<i class="fa-solid fa-music"></i> Text to speech/audio
<a href="https://localai.io/features/text-to-audio/" target="_blank" >
<i class="fas fa-circle-info pr-2"></i>
</a>
</span>
</div>
<div class="text-center font-semibold text-gray-100">
<div class="flex items-center justify-between">
<div x-show="component === 'menu'" id="menu">
<button @click="component = 'key'" title="Update API key"
class="m-2 float-right inline-block rounded bg-primary px-6 pb-2.5 mb-3 pt-2.5 text-xs font-medium uppercase leading-normal text-white shadow-primary-3 transition duration-150 ease-in-out hover:bg-primary-accent-300 hover:shadow-primary-2 focus:bg-primary-accent-300 focus:shadow-primary-2 focus:outline-none focus:ring-0 active:bg-primary-600 active:shadow-primary-2 dark:shadow-black/30 dark:hover:shadow-dark-strong dark:focus:shadow-dark-strong dark:active:shadow-dark-strong"
>Set API Key🔑</button>
</div>
<form x-show="component === 'key'" id="key">
<input
type="password"
id="apiKey"
name="apiKey"
placeholder="OpenAI API Key"
x-model.lazy="key"
/>
<button @click="component = 'menu'" type="submit" title="Save API key">
🔒
</button>
</form>
<select x-data="{ link : '' }" x-model="link" x-init="$watch('link', value => window.location = link)"
class="bg-gray-800 text-white border border-gray-600 focus:border-blue-500 focus:ring focus:ring-blue-500 focus:ring-opacity-50 rounded-md shadow-sm p-2 appearance-none"
>
<!-- Options -->
<option value="" disabled class="text-gray-400" >Select a model</option>
{{ $model:=.Model}}
{{ range .ModelsConfig }}
{{ if eq .Name $model }}
<option value="/tts/{{.Name}}" selected class="bg-gray-700 text-white">{{.Name}}</option>
{{ else }}
<option value="/tts/{{.Name}}" class="bg-gray-700 text-white">{{.Name}}</option>
{{ end }}
{{ end }}
</select>
</div>
</div>
<div class="mt-12">
<input id="tts-model" type="hidden" value="{{.Model}}">
<form id="tts" action="/tts/{{.Model}}" method="get">
<input
type="text"
id="input"
name="input"
placeholder="Prompt…"
autocomplete="off"
class="p-2 border rounded w-full bg-gray-600 text-white placeholder-gray-300"
required
/>
</form>
<div class="container max-w-screen-lg mx-auto mt-4 pb-10 flex justify-center">
<div id="loader" class="my-2 loader" ></div>
</div>
<div class="container max-w-screen-lg mx-auto mt-4 pb-10 flex justify-center">
<div id="result" class="mx-auto"></div>
</div>
</div>
</div>
</div>
{{template "views/partials/footer" .}}
</div>
</body>
</html>

View File

@ -90,7 +90,7 @@ func (g *GalleryService) Start(c context.Context, cl *config.BackendConfigLoader
if op.Delete {
modelConfig := &config.BackendConfig{}
// Galleryname is the name of the model in this case
dat, err := os.ReadFile(filepath.Join(g.modelPath, op.GalleryName+".yaml"))
dat, err := os.ReadFile(filepath.Join(g.modelPath, op.GalleryModelName+".yaml"))
if err != nil {
updateError(err)
continue
@ -111,14 +111,14 @@ func (g *GalleryService) Start(c context.Context, cl *config.BackendConfigLoader
files = append(files, modelConfig.MMProjFileName())
}
err = gallery.DeleteModelFromSystem(g.modelPath, op.GalleryName, files)
err = gallery.DeleteModelFromSystem(g.modelPath, op.GalleryModelName, files)
} else {
// if the request contains a gallery name, we apply the gallery from the gallery list
if op.GalleryName != "" {
if strings.Contains(op.GalleryName, "@") {
err = gallery.InstallModelFromGallery(op.Galleries, op.GalleryName, g.modelPath, op.Req, progressCallback)
if op.GalleryModelName != "" {
if strings.Contains(op.GalleryModelName, "@") {
err = gallery.InstallModelFromGallery(op.Galleries, op.GalleryModelName, g.modelPath, op.Req, progressCallback)
} else {
err = gallery.InstallModelFromGalleryByName(op.Galleries, op.GalleryName, g.modelPath, op.Req, progressCallback)
err = gallery.InstallModelFromGalleryByName(op.Galleries, op.GalleryModelName, g.modelPath, op.Req, progressCallback)
}
} else if op.ConfigURL != "" {
startup.PreloadModelsConfigurations(op.ConfigURL, g.modelPath, op.ConfigURL)
@ -148,10 +148,11 @@ func (g *GalleryService) Start(c context.Context, cl *config.BackendConfigLoader
g.UpdateStatus(op.Id,
&gallery.GalleryOpStatus{
Deletion: op.Delete,
Processed: true,
Message: "completed",
Progress: 100})
Deletion: op.Delete,
Processed: true,
GalleryModelName: op.GalleryModelName,
Message: "completed",
Progress: 100})
}
}
}()

View File

@ -11,6 +11,7 @@ import (
"github.com/go-skynet/LocalAI/pkg/assets"
"github.com/go-skynet/LocalAI/pkg/model"
pkgStartup "github.com/go-skynet/LocalAI/pkg/startup"
"github.com/go-skynet/LocalAI/pkg/xsysinfo"
"github.com/rs/zerolog/log"
)
@ -19,12 +20,23 @@ func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.Mode
log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.ModelPath)
log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion())
caps, err := xsysinfo.CPUCapabilities()
if err == nil {
log.Debug().Msgf("CPU capabilities: %v", caps)
}
gpus, err := xsysinfo.GPUs()
if err == nil {
log.Debug().Msgf("GPU count: %d", len(gpus))
for _, gpu := range gpus {
log.Debug().Msgf("GPU: %s", gpu.String())
}
}
// Make sure directories exists
if options.ModelPath == "" {
return nil, nil, nil, fmt.Errorf("options.ModelPath cannot be empty")
}
err := os.MkdirAll(options.ModelPath, 0750)
err = os.MkdirAll(options.ModelPath, 0750)
if err != nil {
return nil, nil, nil, fmt.Errorf("unable to create ModelPath: %q", err)
}

View File

@ -296,7 +296,7 @@ backend: transformers
parameters:
model: "facebook/opt-125m"
type: AutoModelForCausalLM
quantization: bnb_4bit # One of: bnb_8bit, bnb_4bit, xpu_4bit (optional)
quantization: bnb_4bit # One of: bnb_8bit, bnb_4bit, xpu_4bit, xpu_8bit (optional)
```
The backend will automatically download the required files in order to run the model.
@ -307,10 +307,42 @@ The backend will automatically download the required files in order to run the m
| Type | Description |
| --- | --- |
| `AutoModelForCausalLM` | `AutoModelForCausalLM` is a model that can be used to generate sequences. |
| `OVModelForCausalLM` | for OpenVINO models |
| `AutoModelForCausalLM` | `AutoModelForCausalLM` is a model that can be used to generate sequences. Use it for NVIDIA CUDA and Intel GPU with Intel Extensions for Pytorch acceleration |
| `OVModelForCausalLM` | for Intel CPU/GPU/NPU OpenVINO Text Generation models |
| `OVModelForFeatureExtraction` | for Intel CPU/GPU/NPU OpenVINO Embedding acceleration |
| N/A | Defaults to `AutoModel` |
- `OVModelForCausalLM` requires OpenVINO IR [Text Generation](https://huggingface.co/models?library=openvino&pipeline_tag=text-generation) models from Hugging face
- `OVModelForFeatureExtraction` works with any Safetensors Transformer [Feature Extraction](https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers,safetensors) model from Huggingface (Embedding Model)
Please note that streaming is currently not implemente in `AutoModelForCausalLM` for Intel GPU.
AMD GPU support is not implemented.
Although AMD CPU is not officially supported by OpenVINO there are reports that it works: YMMV.
##### Embeddings
Use `embeddings: true` if the model is an embedding model
##### Inference device selection
Transformer backend tries to automatically select the best device for inference, anyway you can override the decision manually overriding with the `main_gpu` parameter.
| Inference Engine | Applicable Values |
| --- | --- |
| CUDA | `cuda`, `cuda.X` where X is the GPU device like in `nvidia-smi -L` output |
| OpenVINO | Any applicable value from [Inference Modes](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes.html) like `AUTO`,`CPU`,`GPU`,`NPU`,`MULTI`,`HETERO` |
Example for CUDA:
`main_gpu: cuda.0`
Example for OpenVINO:
`main_gpu: AUTO:-CPU`
This parameter applies to both Text Generation and Feature Extraction (i.e. Embeddings) models.
##### Inference Precision
Transformer backend automatically select the fastest applicable inference precision according to the device support.
CUDA backend can manually enable *bfloat16* if your hardware support it with the following parameter:
`f16: true`
##### Quantization
@ -318,8 +350,42 @@ The backend will automatically download the required files in order to run the m
| --- | --- |
| `bnb_8bit` | 8-bit quantization |
| `bnb_4bit` | 4-bit quantization |
| `xpu_8bit` | 8-bit quantization for Intel XPUs |
| `xpu_4bit` | 4-bit quantization for Intel XPUs |
##### Trust Remote Code
Some models like Microsoft Phi-3 requires external code than what is provided by the transformer library.
By default it is disabled for security.
It can be manually enabled with:
`trust_remote_code: true`
##### Maximum Context Size
Maximum context size in bytes can be specified with the parameter: `context_size`. Do not use values higher than what your model support.
Usage example:
`context_size: 8192`
##### Auto Prompt Template
Usually chat template is defined by the model author in the `tokenizer_config.json` file.
To enable it use the `use_tokenizer_template: true` parameter in the `template` section.
Usage example:
```
template:
use_tokenizer_template: true
```
##### Custom Stop Words
Stopwords are usually defined in `tokenizer_config.json` file.
They can be overridden with the `stopwords` parameter in case of need like in llama3-Instruct model.
Usage example:
```
stopwords:
- "<|eot_id|>"
- "<|end_of_text|>"
```
#### Usage
Use the `completions` endpoint by specifying the `transformers` model:

View File

@ -144,7 +144,7 @@ Install `xcode` from the Apps Store (needed for metalkit)
```
# install build dependencies
brew install abseil cmake go grpc protobuf wget
brew install abseil cmake go grpc protobuf wget protoc-gen-go protoc-gen-go-grpc
# clone the repo
git clone https://github.com/go-skynet/LocalAI.git

View File

@ -45,10 +45,11 @@ LocalAI will attempt to automatically load models which are not explicitly confi
| [tinydream](https://github.com/symisc/tiny-dream#tiny-dreaman-embedded-header-only-stable-diffusion-inference-c-librarypixlabiotiny-dream) | stablediffusion | no | Image | no | no | N/A |
| `coqui` | Coqui | no | Audio generation and Voice cloning | no | no | CPU/CUDA |
| `petals` | Various GPTs and quantization formats | yes | GPT | no | no | CPU/CUDA |
| `transformers` | Various GPTs and quantization formats | yes | GPT, embeddings | yes | no | CPU/CUDA |
| `transformers` | Various GPTs and quantization formats | yes | GPT, embeddings | yes | yes**** | CPU/CUDA/XPU |
Note: any backend name listed above can be used in the `backend` field of the model configuration file (See [the advanced section]({{%relref "docs/advanced" %}})).
- \* 7b ONLY
- ** doesn't seem to be accurate
- *** 7b and 40b with the `ggccv` format, for instance: https://huggingface.co/TheBloke/WizardLM-Uncensored-Falcon-40B-GGML
- *** 7b and 40b with the `ggccv` format, for instance: https://huggingface.co/TheBloke/WizardLM-Uncensored-Falcon-40B-GGML
- **** Only for CUDA and OpenVINO CPU/XPU acceleration.

View File

@ -1,3 +1,3 @@
{
"version": "v2.13.0"
"version": "v2.14.0"
}

View File

@ -25,7 +25,7 @@ PyYAML==6.0
requests==2.31.0
SQLAlchemy==2.0.12
tenacity==8.2.2
tqdm==4.65.0
tqdm==4.66.3
typing-inspect==0.8.0
typing_extensions==4.5.0
urllib3==1.26.18

View File

@ -96,6 +96,22 @@
- filename: Meta-Llama-3-8B-Instruct.Q6_K.gguf
sha256: b7bad45618e2a76cc1e89a0fbb93a2cac9bf410e27a619c8024ed6db53aa9b4a
uri: huggingface://QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct.Q6_K.gguf
- <<: *llama3
name: "llama-3-8b-instruct-coder"
icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/0O4cIuv3wNbY68-FP7tak.jpeg
urls:
- https://huggingface.co/bartowski/Llama-3-8B-Instruct-Coder-GGUF
- https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder
description: |
Original model: https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder
All quants made using imatrix option with dataset provided by Kalomaze here
overrides:
parameters:
model: Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
files:
- filename: Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
sha256: 639ab8e3aeb7aa82cff6d8e6ef062d1c3e5a6d13e6d76e956af49f63f0e704f8
uri: huggingface://bartowski/Llama-3-8B-Instruct-Coder-GGUF/Llama-3-8B-Instruct-Coder-Q4_K_M.gguf
- <<: *llama3
name: "llama3-70b-instruct"
overrides:
@ -242,6 +258,22 @@
- filename: Llama-3-LewdPlay-8B-evo.q8_0.gguf
sha256: 1498152d598ff441f73ec6af9d3535875302e7251042d87feb7e71a3618966e8
uri: huggingface://Undi95/Llama-3-LewdPlay-8B-evo-GGUF/Llama-3-LewdPlay-8B-evo.q8_0.gguf
- <<: *llama3
name: "llama-3-soliloquy-8b-v2-iq-imatrix"
license: cc-by-nc-4.0
icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/u98dnnRVCwMh6YYGFIyff.png
urls:
- https://huggingface.co/Lewdiculous/Llama-3-Soliloquy-8B-v2-GGUF-IQ-Imatrix
description: |
Soliloquy-L3 is a highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base, rich literary expression, and support for up to 24k context length. It outperforms existing ~13B models, delivering enhanced roleplaying capabilities.
overrides:
context_size: 8192
parameters:
model: Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
files:
- filename: Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
sha256: 3e4e066e57875c36fc3e1c1b0dba506defa5b6ed3e3e80e1f77c08773ba14dc8
uri: huggingface://Lewdiculous/Llama-3-Soliloquy-8B-v2-GGUF-IQ-Imatrix/Llama-3-Soliloquy-8B-v2-Q4_K_M-imat.gguf
- <<: *llama3
name: "chaos-rp_l3_b-iq-imatrix"
urls:
@ -319,8 +351,26 @@
model: Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
files:
- filename: Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
sha256: 9e98cd2672f716a0872912fdc4877969efd14d6f682f28e156f8591591c00d9c
sha256: 159eb62f2c8ae8fee10d9ed8386ce592327ca062807194a88e10b7cbb47ef986
uri: huggingface://Lewdiculous/Average_Normie_l3_v1_8B-GGUF-IQ-Imatrix/Average_Normie_l3_v1_8B-Q4_K_M-imat.gguf
- <<: *llama3
name: "openbiollm-llama3-8b"
urls:
- https://huggingface.co/aaditya/OpenBioLLM-Llama3-8B-GGUF
- https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B
license: llama3
icon: https://cdn-uploads.huggingface.co/production/uploads/5f3fe13d79c1ba4c353d0c19/KGmRE5w2sepNtwsEu8t7K.jpeg
description: |
Introducing OpenBioLLM-8B: A State-of-the-Art Open Source Biomedical Large Language Model
OpenBioLLM-8B is an advanced open source language model designed specifically for the biomedical domain. Developed by Saama AI Labs, this model leverages cutting-edge techniques to achieve state-of-the-art performance on a wide range of biomedical tasks.
overrides:
parameters:
model: openbiollm-llama3-8b.Q4_K_M.gguf
files:
- filename: openbiollm-llama3-8b.Q4_K_M.gguff
sha256: 806fa724139b6a2527e33a79c25a13316188b319d4eed33e20914d7c5955d349
uri: huggingface://aaditya/OpenBioLLM-Llama3-8B-GGUF/openbiollm-llama3-8b.Q4_K_M.gguf
- <<: *llama3
name: "llama-3-8b-lexifun-uncensored-v1"
icon: "https://cdn-uploads.huggingface.co/production/uploads/644ad182f434a6a63b18eee6/GrOs1IPG5EXR3MOCtcQiz.png"
@ -406,6 +456,25 @@
- filename: Aura_Uncensored_l3_8B-Q4_K_M-imat.gguf
sha256: 265ded6a4f439bec160f394e3083a4a20e32ebb9d1d2d85196aaab23dab87fb2
uri: huggingface://Lewdiculous/Aura_Uncensored_l3_8B-GGUF-IQ-Imatrix/Aura_Uncensored_l3_8B-Q4_K_M-imat.gguf
- <<: *llama3
name: "llama-3-lumimaid-8b-v0.1"
urls:
- https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-GGUF
icon: https://cdn-uploads.huggingface.co/production/uploads/630dfb008df86f1e5becadc3/d3QMaxy3peFTpSlWdWF-k.png
license: cc-by-nc-4.0
description: |
This model uses the Llama3 prompting format
Llama3 trained on our RP datasets, we tried to have a balance between the ERP and the RP, not too horny, but just enough.
We also added some non-RP dataset, making the model less dumb overall. It should look like a 40%/60% ratio for Non-RP/RP+ERP data.
overrides:
parameters:
model: Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
files:
- filename: Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
sha256: 23ac0289da0e096d5c00f6614dfd12c94dceecb02c313233516dec9225babbda
uri: huggingface://NeverSleep/Llama-3-Lumimaid-8B-v0.1-GGUF/Llama-3-Lumimaid-8B-v0.1.q4_k_m.gguf
- <<: *llama3
name: "suzume-llama-3-8B-multilingual"
urls:
@ -520,6 +589,62 @@
- filename: Noromaid-13B-0.4-DPO.q4_k_m.gguf
sha256: cb28e878d034fae3d0b43326c5fc1cfb4ab583b17c56e41d6ce023caec03c1c1
uri: huggingface://NeverSleep/Noromaid-13B-0.4-DPO-GGUF/Noromaid-13B-0.4-DPO.q4_k_m.gguf
### START Vicuna based
- &wizardlm2
url: "github:mudler/LocalAI/gallery/wizardlm2.yaml@master"
name: "wizardlm2-7b"
description: |
We introduce and opensource WizardLM-2, our next generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning and agent. New family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.
WizardLM-2 8x22B is our most advanced model, demonstrates highly competitive performance compared to those leading proprietary works and consistently outperforms all the existing state-of-the-art opensource models.
WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice in the same size.
WizardLM-2 7B is the fastest and achieves comparable performance with existing 10x larger opensource leading models.
icon: https://github.com/nlpxucan/WizardLM/raw/main/imgs/WizardLM.png
license: apache-2.0
urls:
- https://huggingface.co/MaziyarPanahi/WizardLM-2-7B-GGUF
tags:
- llm
- gguf
- gpu
- cpu
- mistral
overrides:
parameters:
model: WizardLM-2-7B.Q4_K_M.gguf
files:
- filename: WizardLM-2-7B.Q4_K_M.gguf
sha256: 613212417701a26fd43f565c5c424a2284d65b1fddb872b53a99ef8add796f64
uri: huggingface://MaziyarPanahi/WizardLM-2-7B-GGUF/WizardLM-2-7B.Q4_K_M.gguf
### moondream2
- url: "github:mudler/LocalAI/gallery/moondream.yaml@master"
license: apache-2.0
description: |
a tiny vision language model that kicks ass and runs anywhere
icon: https://github.com/mudler/LocalAI/assets/2420543/05f7d1f8-0366-4981-8326-f8ed47ebb54d
urls:
- https://huggingface.co/vikhyatk/moondream2
- https://huggingface.co/moondream/moondream2-gguf
- https://github.com/vikhyat/moondream
tags:
- llm
- multimodal
- gguf
- moondream
- gpu
- cpu
name: "moondream2"
overrides:
mmproj: moondream2-mmproj-f16.gguf
parameters:
model: moondream2-text-model-f16.gguf
files:
- filename: moondream2-text-model-f16.gguf
sha256: 4e17e9107fb8781629b3c8ce177de57ffeae90fe14adcf7b99f0eef025889696
uri: huggingface://moondream/moondream2-gguf/moondream2-text-model-f16.gguf
- filename: moondream2-mmproj-f16.gguf
sha256: 4cc1cb3660d87ff56432ebeb7884ad35d67c48c7b9f6b2856f305e39c38eed8f
uri: huggingface://moondream/moondream2-gguf/moondream2-mmproj-f16.gguf
### START LLaVa
- &llava
url: "github:mudler/LocalAI/gallery/llava.yaml@master"
@ -575,7 +700,9 @@
sha256: 09c230de47f6f843e4841656f7895cac52c6e7ec7392acb5e8527de8b775c45a
uri: huggingface://jartine/llava-v1.5-7B-GGUF/llava-v1.5-7b-mmproj-Q8_0.gguf
- <<: *llama3
name: "poppy_porpoise-v0.7-l3-8b-iq-imatrix"
name: "poppy_porpoise-v0.72-l3-8b-iq-imatrix"
urls:
- https://huggingface.co/Lewdiculous/Poppy_Porpoise-0.72-L3-8B-GGUF-IQ-Imatrix
description: |
"Poppy Porpoise" is a cutting-edge AI roleplay assistant based on the Llama 3 8B model, specializing in crafting unforgettable narrative experiences. With its advanced language capabilities, Poppy expertly immerses users in an interactive and engaging adventure, tailoring each adventure to their individual preferences.
@ -590,16 +717,41 @@
- cpu
- llava-1.5
overrides:
mmproj: Llava_1.5_Llama3_mmproj.gguf
mmproj: Llava_1.5_Llama3_mmproj_updated.gguf
parameters:
model: Poppy_Porpoise-v0.7-L3-8B-Q4_K_M-imat.gguf
model: Poppy_Porpoise-0.72-L3-8B-Q4_K_M-imat.gguf
files:
- filename: Poppy_Porpoise-v0.7-L3-8B-Q4_K_M-imat.gguf
sha256: 04badadd6c88cd9c706efef8f5cd337057c805e43dd440a5936f87720c37eb33
uri: huggingface://Lewdiculous/Poppy_Porpoise-v0.7-L3-8B-GGUF-IQ-Imatrix/Poppy_Porpoise-v0.7-L3-8B-Q4_K_M-imat.gguf
- filename: Llava_1.5_Llama3_mmproj.gguf
sha256: d2a9ca943975f6c49c4d55886e873f676a897cff796e92410ace6c20f4efd03b
uri: huggingface://ChaoticNeutrals/Llava_1.5_Llama3_mmproj/mmproj-model-f16.gguf
- filename: Poppy_Porpoise-0.72-L3-8B-Q4_K_M-imat.gguf
sha256: 53743717f929f73aa4355229de114d9b81814cb2e83c6cc1c6517844da20bfd5
uri: huggingface://Lewdiculous/Poppy_Porpoise-0.72-L3-8B-GGUF-IQ-Imatrix/Poppy_Porpoise-0.72-L3-8B-Q4_K_M-imat.gguf
- filename: Llava_1.5_Llama3_mmproj_updated.gguf
sha256: 4f2bb77ca60f2c932d1c6647d334f5d2cd71966c19e850081030c9883ef1906c
uri: https://huggingface.co/ChaoticNeutrals/LLaVA-Llama-3-8B-mmproj-Updated/resolve/main/llava-v1.5-8B-Updated-Stop-Token/mmproj-model-f16.gguf
- <<: *llama3
name: "llava-llama-3-8b-v1_1"
description: |
llava-llama-3-8b-v1_1 is a LLaVA model fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct and CLIP-ViT-Large-patch14-336 with ShareGPT4V-PT and InternVL-SFT by XTuner.
urls:
- https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf
tags:
- llm
- multimodal
- gguf
- gpu
- llama3
- cpu
- llava
overrides:
mmproj: llava-llama-3-8b-v1_1-mmproj-f16.gguf
parameters:
model: llava-llama-3-8b-v1_1-int4.gguf
files:
- filename: llava-llama-3-8b-v1_1-int4.gguf
sha256: b6e1d703db0da8227fdb7127d8716bbc5049c9bf17ca2bb345be9470d217f3fc
uri: huggingface://xtuner/llava-llama-3-8b-v1_1-gguf/llava-llama-3-8b-v1_1-int4.gguf
- filename: llava-llama-3-8b-v1_1-mmproj-f16.gguf
sha256: eb569aba7d65cf3da1d0369610eb6869f4a53ee369992a804d5810a80e9fa035
uri: huggingface://xtuner/llava-llama-3-8b-v1_1-gguf/llava-llama-3-8b-v1_1-mmproj-f16.gguf
### START Phi-2
- &phi-2-chat
url: "github:mudler/LocalAI/gallery/phi-2-chat.yaml@master"
@ -697,7 +849,7 @@
model: Phi-3-mini-4k-instruct-q4.gguf
files:
- filename: "Phi-3-mini-4k-instruct-q4.gguf"
sha256: "4fed7364ee3e0c7cb4fe0880148bfdfcd1b630981efa0802a6b62ee52e7da97e"
sha256: "8a83c7fb9049a9b2e92266fa7ad04933bb53aa1e85136b7b30f1b8000ff2edef"
uri: "huggingface://microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf"
- <<: *phi-3
name: "phi-3-mini-4k-instruct:fp16"
@ -760,6 +912,58 @@
- filename: "Hermes-2-Pro-Mistral-7B.Q8_0.gguf"
sha256: "b6d95d7ec9a395b7568cc94b0447fd4f90b6f69d6e44794b1fbb84e3f732baca"
uri: "huggingface://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF/Hermes-2-Pro-Mistral-7B.Q8_0.gguf"
### LLAMA3 version
- <<: *hermes-2-pro-mistral
name: "hermes-2-pro-llama-3-8b"
tags:
- llm
- gguf
- gpu
- llama3
- cpu
urls:
- https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF
overrides:
parameters:
model: Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
files:
- filename: "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf"
sha256: "10c52a4820137a35947927be741bb411a9200329367ce2590cc6757cd98e746c"
uri: "huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf"
- <<: *hermes-2-pro-mistral
tags:
- llm
- gguf
- gpu
- llama3
- cpu
name: "hermes-2-pro-llama-3-8b:Q5_K_M"
urls:
- https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF
overrides:
parameters:
model: Hermes-2-Pro-Llama-3-8B-Q5_K_M.gguf
files:
- filename: "Hermes-2-Pro-Llama-3-8B-Q5_K_M.gguf"
sha256: "107f3f55e26b8cc144eadd83e5f8a60cfd61839c56088fa3ae2d5679abf45f29"
uri: "huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q5_K_M.gguf"
- <<: *hermes-2-pro-mistral
tags:
- llm
- gguf
- gpu
- llama3
- cpu
name: "hermes-2-pro-llama-3-8b:Q8_0"
urls:
- https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF
overrides:
parameters:
model: Hermes-2-Pro-Llama-3-8B-Q8_0.gguf
files:
- filename: "Hermes-2-Pro-Llama-3-8B-Q8_0.gguf"
sha256: "d138388cfda04d185a68eaf2396cf7a5cfa87d038a20896817a9b7cf1806f532"
uri: "huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q8_0.gguf"
- <<: *hermes-2-pro-mistral
name: "biomistral-7b"
description: |
@ -868,11 +1072,19 @@
urls:
- https://huggingface.co/fakezeta/Phi-3-mini-128k-instruct-ov-int8
overrides:
trust_remote_code: true
context_size: 131072
parameters:
model: fakezeta/Phi-3-mini-128k-instruct-ov-int8
stopwords:
- <|end|>
tags:
- llm
- openvino
- gpu
- phi3
- cpu
- Remote Code Enabled
- <<: *openvino
name: "openvino-starling-lm-7b-beta-openvino-int8"
urls:
@ -881,6 +1093,12 @@
context_size: 8192
parameters:
model: fakezeta/Starling-LM-7B-beta-openvino-int8
tags:
- llm
- openvino
- gpu
- mistral
- cpu
- <<: *openvino
name: "openvino-wizardlm2"
urls:
@ -889,6 +1107,50 @@
context_size: 8192
parameters:
model: fakezeta/Not-WizardLM-2-7B-ov-int8
- <<: *openvino
name: "openvino-hermes2pro-llama3"
urls:
- https://huggingface.co/fakezeta/Hermes-2-Pro-Llama-3-8B-ov-int8
overrides:
context_size: 8192
parameters:
model: fakezeta/Hermes-2-Pro-Llama-3-8B-ov-int8
tags:
- llm
- openvino
- gpu
- llama3
- cpu
- <<: *openvino
name: "openvino-multilingual-e5-base"
urls:
- https://huggingface.co/intfloat/multilingual-e5-base
overrides:
embeddings: true
type: OVModelForFeatureExtraction
parameters:
model: intfloat/multilingual-e5-base
tags:
- llm
- openvino
- gpu
- embedding
- cpu
- <<: *openvino
name: "openvino-all-MiniLM-L6-v2"
urls:
- https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
overrides:
embeddings: true
type: OVModelForFeatureExtraction
parameters:
model: sentence-transformers/all-MiniLM-L6-v2
tags:
- llm
- openvino
- gpu
- embedding
- cpu
### START Embeddings
- &sentencentransformers
description: |
@ -994,7 +1256,7 @@
- text-to-speech
- cpu
override:
overrides:
parameters:
model: en-us-kathleen-low.onnx
files:
@ -1002,7 +1264,7 @@
uri: https://github.com/rhasspy/piper/releases/download/v0.0.2/voice-en-us-kathleen-low.tar.gz
- <<: *piper
name: voice-ca-upc_ona-x-low
override:
overrides:
parameters:
model: ca-upc_ona-x-low.onnx
files:
@ -1011,7 +1273,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-ca-upc_pau-x-low
override:
overrides:
parameters:
model: ca-upc_pau-x-low.onnx
files:
@ -1020,7 +1282,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-da-nst_talesyntese-medium
override:
overrides:
parameters:
model: da-nst_talesyntese-medium.onnx
files:
@ -1029,7 +1291,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-eva_k-x-low
override:
overrides:
parameters:
model: de-eva_k-x-low.onnx
files:
@ -1038,7 +1300,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-karlsson-low
override:
overrides:
parameters:
model: de-karlsson-low.onnx
files:
@ -1047,7 +1309,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-kerstin-low
override:
overrides:
parameters:
model: de-kerstin-low.onnx
files:
@ -1056,7 +1318,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-pavoque-low
override:
overrides:
parameters:
model: de-pavoque-low.onnx
files:
@ -1065,7 +1327,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-ramona-low
override:
overrides:
parameters:
model: de-ramona-low.onnx
files:
@ -1075,7 +1337,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-de-thorsten-low
override:
overrides:
parameters:
model: de-thorsten-low.onnx
files:
@ -1085,7 +1347,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-el-gr-rapunzelina-low
override:
overrides:
parameters:
model: el-gr-rapunzelina-low.onnx
files:
@ -1095,7 +1357,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-gb-alan-low
override:
overrides:
parameters:
model: en-gb-alan-low.onnx
files:
@ -1105,7 +1367,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-gb-southern_english_female-low
override:
overrides:
parameters:
model: en-gb-southern_english
files:
@ -1115,7 +1377,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-amy-low
override:
overrides:
parameters:
model: en-us-amy-low.onnx
files:
@ -1125,7 +1387,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-danny-low
override:
overrides:
parameters:
model: en-us-danny-low.onnx
files:
@ -1135,7 +1397,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-kathleen-low
override:
overrides:
parameters:
model: en-us-kathleen-low.onnx
files:
@ -1145,7 +1407,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-lessac-low
override:
overrides:
parameters:
model: en-us-lessac-low.onnx
files:
@ -1155,7 +1417,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-lessac-medium
override:
overrides:
parameters:
model: en-us-lessac-medium.onnx
files:
@ -1165,7 +1427,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-libritts-high
override:
overrides:
parameters:
model: en-us-libritts-high.onnx
files:
@ -1175,7 +1437,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-ryan-high
override:
overrides:
parameters:
model: en-us-ryan-high.onnx
files:
@ -1185,7 +1447,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-ryan-low
override:
overrides:
parameters:
model: en-us-ryan-low.onnx
files:
@ -1196,7 +1458,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us-ryan-medium
override:
overrides:
parameters:
model: en-us-ryan-medium.onnx
files:
@ -1206,7 +1468,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-en-us_lessac
override:
overrides:
parameters:
model: en-us-lessac.onnx
files:
@ -1216,7 +1478,7 @@
- <<: *piper
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-es-carlfm-x-low
override:
overrides:
parameters:
model: es-carlfm-x-low.onnx
files:
@ -1227,7 +1489,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-es-mls_10246-low
override:
overrides:
parameters:
model: es-mls_10246-low.onnx
files:
@ -1238,7 +1500,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-es-mls_9972-low
override:
overrides:
parameters:
model: es-mls_9972-low.onnx
files:
@ -1249,7 +1511,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-fi-harri-low
override:
overrides:
parameters:
model: fi-harri-low.onnx
files:
@ -1260,7 +1522,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-fr-gilles-low
override:
overrides:
parameters:
model: fr-gilles-low.onnx
files:
@ -1271,7 +1533,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-fr-mls_1840-low
override:
overrides:
parameters:
model: fr-mls_1840-low.onnx
files:
@ -1282,7 +1544,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-fr-siwis-low
override:
overrides:
parameters:
model: fr-siwis-low.onnx
files:
@ -1293,7 +1555,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-fr-siwis-medium
override:
overrides:
parameters:
model: fr-siwis-medium.onnx
files:
@ -1304,7 +1566,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-is-bui-medium
override:
overrides:
parameters:
model: is-bui-medium.onnx
files:
@ -1315,7 +1577,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-is-salka-medium
override:
overrides:
parameters:
model: is-salka-medium.onnx
files:
@ -1326,7 +1588,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-is-steinn-medium
override:
overrides:
parameters:
model: is-steinn-medium.onnx
files:
@ -1337,7 +1599,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-is-ugla-medium
override:
overrides:
parameters:
model: is-ugla-medium.onnx
files:
@ -1348,7 +1610,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-it-riccardo_fasol-x-low
override:
overrides:
parameters:
model: it-riccardo_fasol-x-low.onnx
files:
@ -1359,7 +1621,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-kk-iseke-x-low
override:
overrides:
parameters:
model: kk-iseke-x-low.onnx
files:
@ -1370,7 +1632,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-kk-issai-high
override:
overrides:
parameters:
model: kk-issai-high.onnx
files:
@ -1381,7 +1643,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-kk-raya-x-low
override:
overrides:
parameters:
model: kk-raya-x-low.onnx
files:
@ -1392,7 +1654,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-ne-google-medium
override:
overrides:
parameters:
model: ne-google-medium.onnx
files:
@ -1403,7 +1665,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-ne-google-x-low
override:
overrides:
parameters:
model: ne-google-x-low.onnx
files:
@ -1414,7 +1676,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-nl-mls_5809-low
override:
overrides:
parameters:
model: nl-mls_5809-low.onnx
files:
@ -1425,7 +1687,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-nl-mls_7432-low
override:
overrides:
parameters:
model: nl-mls_7432-low.onnx
files:
@ -1436,7 +1698,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-nl-nathalie-x-low
override:
overrides:
parameters:
model: nl-nathalie-x-low.onnx
files:
@ -1447,7 +1709,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-nl-rdh-medium
override:
overrides:
parameters:
model: nl-rdh-medium.onnx
files:
@ -1458,7 +1720,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-nl-rdh-x-low
override:
overrides:
parameters:
model: nl-rdh-x-low.onnx
files:
@ -1469,7 +1731,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-no-talesyntese-medium
override:
overrides:
parameters:
model: no-talesyntese-medium.onnx
files:
@ -1480,7 +1742,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-pl-mls_6892-low
override:
overrides:
parameters:
model: pl-mls_6892-low.onnx
files:
@ -1491,7 +1753,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-pt-br-edresson-low
override:
overrides:
parameters:
model: pt-br-edresson-low.onnx
files:
@ -1502,7 +1764,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-ru-irinia-medium
override:
overrides:
parameters:
model: ru-irinia-medium.onnx
files:
@ -1513,7 +1775,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-sv-se-nst-medium
override:
overrides:
parameters:
model: sv-se-nst-medium.onnx
files:
@ -1524,7 +1786,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-uk-lada-x-low
override:
overrides:
parameters:
model: uk-lada-x-low.onnx
files:
@ -1535,7 +1797,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-vi-25hours-single-low
override:
overrides:
parameters:
model: vi-25hours-single-low.onnx
files:
@ -1546,7 +1808,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-vi-vivos-x-low
override:
overrides:
parameters:
model: vi-vivos-x-low.onnx
files:
@ -1557,7 +1819,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-zh-cn-huayan-x-low
override:
overrides:
parameters:
model: zh-cn-huayan-x-low.onnx
files:
@ -1568,7 +1830,7 @@
url: github:mudler/LocalAI/gallery/piper.yaml@master
name: voice-zh_CN-huayan-medium
override:
overrides:
parameters:
model: zh_CN-huayan-medium.onnx
files:

19
gallery/moondream.yaml Normal file
View File

@ -0,0 +1,19 @@
---
name: "moondream2"
config_file: |
backend: llama-cpp
context_size: 2046
roles:
user: "\nQuestion: "
system: "\nSystem: "
assistant: "\nAnswer: "
stopwords:
- "Question:"
- "<|endoftext|>"
f16: true
template:
completion: |
Complete the following sentence: {{.Input}}
chat: "{{.Input}}\nAnswer:\n"

View File

@ -7,6 +7,3 @@ config_file: |
type: OVModelForCausalLM
template:
use_tokenizer_template: true
stopwords:
- "<|eot_id|>"
- "<|end_of_text|>"

15
gallery/wizardlm2.yaml Normal file
View File

@ -0,0 +1,15 @@
---
name: "wizardlm2"
config_file: |
mmap: true
template:
chat_message: |-
{{if eq .RoleName "assistant"}}ASSISTANT: {{.Content}}</s>{{else if eq .RoleName "system"}}{{.Content}}{{else if eq .RoleName "user"}}USER: {{.Content}}{{end}}
chat: "{{.Input}}ASSISTANT: "
completion: |-
{{.Input}}
context_size: 32768
f16: true
stopwords:
- </s>

8
go.mod
View File

@ -67,6 +67,7 @@ require (
github.com/Masterminds/semver/v3 v3.2.0 // indirect
github.com/Microsoft/go-winio v0.6.0 // indirect
github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5 // indirect
github.com/StackExchange/wmi v1.2.1 // indirect
github.com/alecthomas/chroma/v2 v2.8.0 // indirect
github.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect
github.com/aymerick/douceur v0.2.0 // indirect
@ -82,6 +83,7 @@ require (
github.com/docker/go-connections v0.4.0 // indirect
github.com/docker/go-units v0.4.0 // indirect
github.com/dsnet/compress v0.0.2-0.20210315054119-f66993602bf5 // indirect
github.com/ghodss/yaml v1.0.0 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-openapi/jsonpointer v0.21.0 // indirect
github.com/go-openapi/jsonreference v0.21.0 // indirect
@ -95,7 +97,10 @@ require (
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect
github.com/gorilla/css v1.0.1 // indirect
github.com/huandu/xstrings v1.3.3 // indirect
github.com/jaypipes/ghw v0.12.0 // indirect
github.com/jaypipes/pcidb v1.0.0 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/klauspost/cpuid/v2 v2.2.7 // indirect
github.com/klauspost/pgzip v1.2.5 // indirect
github.com/lucasb-eyer/go-colorful v1.2.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
@ -103,6 +108,7 @@ require (
github.com/microcosm-cc/bluemonday v1.0.26 // indirect
github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db // indirect
github.com/mitchellh/copystructure v1.0.0 // indirect
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/mitchellh/reflectwalk v1.0.0 // indirect
github.com/moby/term v0.0.0-20201216013528-df9cb8a40635 // indirect
@ -113,6 +119,7 @@ require (
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/image-spec v1.0.2 // indirect
github.com/opencontainers/runc v1.1.12 // indirect
github.com/philippgille/chromem-go v0.5.0 // indirect
github.com/pierrec/lz4/v4 v4.1.2 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pkoukk/tiktoken-go v0.1.2 // indirect
@ -139,6 +146,7 @@ require (
google.golang.org/genproto/googleapis/rpc v0.0.0-20230822172742-b8732ec3820d // indirect
gopkg.in/fsnotify.v1 v1.4.7 // indirect
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 // indirect
howett.net/plist v1.0.0 // indirect
)
require (

20
go.sum
View File

@ -14,6 +14,8 @@ github.com/Microsoft/go-winio v0.6.0 h1:slsWYD/zyx7lCXoZVlvQrj0hPTM1HI4+v1sIda2y
github.com/Microsoft/go-winio v0.6.0/go.mod h1:cTAf44im0RAYeL23bpB+fzCyDH2MJiz2BO69KH/soAE=
github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5 h1:TngWCqHvy9oXAN6lEVMRuU21PR1EtLVZJmdB18Gu3Rw=
github.com/Nvveen/Gotty v0.0.0-20120604004816-cd527374f1e5/go.mod h1:lmUJ/7eu/Q8D7ML55dXQrVaamCz2vxCfdQBasLZfHKk=
github.com/StackExchange/wmi v1.2.1 h1:VIkavFPXSjcnS+O8yTq7NI32k0R5Aj+v39y29VYDOSA=
github.com/StackExchange/wmi v1.2.1/go.mod h1:rcmrprowKIVzvc+NUiLncP2uuArMWLCbu9SBzvHz7e8=
github.com/alecthomas/assert/v2 v2.6.0 h1:o3WJwILtexrEUk3cUVal3oiQY2tfgr/FHWiz/v2n4FU=
github.com/alecthomas/assert/v2 v2.6.0/go.mod h1:Bze95FyfUr7x34QZrjL+XP+0qgp/zg8yS+TtBj1WA3k=
github.com/alecthomas/chroma/v2 v2.8.0 h1:w9WJUjFFmHHB2e8mRpL9jjy3alYDlU0QLDezj1xE264=
@ -71,6 +73,8 @@ github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nos
github.com/fsnotify/fsnotify v1.7.0/go.mod h1:40Bi/Hjc2AVfZrqy+aj+yEI+/bRxZnMJyTJwOpGvigM=
github.com/ggerganov/whisper.cpp/bindings/go v0.0.0-20230628193450-85ed71aaec8e h1:KtbU2JR3lJuXFASHG2+sVLucfMPBjWKUUKByX6C81mQ=
github.com/ggerganov/whisper.cpp/bindings/go v0.0.0-20230628193450-85ed71aaec8e/go.mod h1:QIjZ9OktHFG7p+/m3sMvrAJKKdWrr1fZIK0rM6HZlyo=
github.com/ghodss/yaml v1.0.0 h1:wQHKEahhL6wmXdzwWG11gIVCkOv05bNOh+Rxn0yngAk=
github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04=
github.com/go-audio/audio v1.0.0 h1:zS9vebldgbQqktK4H0lUqWrG8P0NxCJVqcj7ZpNnwd4=
github.com/go-audio/audio v1.0.0/go.mod h1:6uAu0+H2lHkwdGsAY+j2wHPNPpPoeg5AaEFh9FlA+Zs=
github.com/go-audio/riff v1.0.0 h1:d8iCGbDvox9BfLagY94fBynxSPHO80LmZCaOsmKxokA=
@ -82,6 +86,7 @@ github.com/go-logr/logr v1.2.4 h1:g01GSCwiDw2xSZfjJ2/T9M+S6pFdcNtFYsp+Y43HYDQ=
github.com/go-logr/logr v1.2.4/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-ole/go-ole v1.2.5/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0=
github.com/go-ole/go-ole v1.2.6 h1:/Fpf6oFPoeFik9ty7siob0G6Ke8QvQEuVcuChpwXzpY=
github.com/go-ole/go-ole v1.2.6/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0=
github.com/go-openapi/jsonpointer v0.21.0 h1:YgdVicSA9vH5RiHs9TZW5oyafXZFc6+2Vc1rr/O9oNQ=
@ -162,6 +167,11 @@ github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:
github.com/imdario/mergo v0.3.11/go.mod h1:jmQim1M+e3UYxmgPu/WyfjB3N3VflVyUjjjwH0dnCYA=
github.com/imdario/mergo v0.3.16 h1:wwQJbIsHYGMUyLSPrEq1CT16AhnhNJQ51+4fdHUnCl4=
github.com/imdario/mergo v0.3.16/go.mod h1:WBLT9ZmE3lPoWsEzCh9LPo3TiwVN+ZKEjmz+hD27ysY=
github.com/jaypipes/ghw v0.12.0 h1:xU2/MDJfWmBhJnujHY9qwXQLs3DBsf0/Xa9vECY0Tho=
github.com/jaypipes/ghw v0.12.0/go.mod h1:jeJGbkRB2lL3/gxYzNYzEDETV1ZJ56OKr+CSeSEym+g=
github.com/jaypipes/pcidb v1.0.0 h1:vtZIfkiCUE42oYbJS0TAq9XSfSmcsgo9IdxSm9qzYU8=
github.com/jaypipes/pcidb v1.0.0/go.mod h1:TnYUvqhPBzCKnH34KrIX22kAeEbDCSRJ9cqLRCuNDfk=
github.com/jessevdk/go-flags v1.4.0/go.mod h1:4FA24M0QyGHXBuZZK/XkWh8h0e1EYbRYJSGM75WSRxI=
github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
@ -174,6 +184,8 @@ github.com/klauspost/compress v1.11.4/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYs
github.com/klauspost/compress v1.17.0 h1:Rnbp4K9EjcDuVuHtd0dgA4qNuv9yKDYKK1ulpJwgrqM=
github.com/klauspost/compress v1.17.0/go.mod h1:ntbaceVETuRiXiv4DpjP66DpAtAGkEQskQzEyD//IeE=
github.com/klauspost/cpuid v1.2.0/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
github.com/klauspost/cpuid/v2 v2.2.7 h1:ZWSB3igEs+d0qvnxR/ZBzXVmxkgt8DdzP6m9pfuVLDM=
github.com/klauspost/cpuid/v2 v2.2.7/go.mod h1:Lcz8mBdAVJIBVzewtcLocK12l3Y+JytZYpaMropDUws=
github.com/klauspost/pgzip v1.2.5 h1:qnWYvvKqedOF2ulHpMG72XQol4ILEJ8k2wwRl/Km8oE=
github.com/klauspost/pgzip v1.2.5/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
@ -210,6 +222,8 @@ github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db h1:62I3jR2Em
github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db/go.mod h1:l0dey0ia/Uv7NcFFVbCLtqEBQbrT4OCwCSKTEv6enCw=
github.com/mitchellh/copystructure v1.0.0 h1:Laisrj+bAB6b/yJwB5Bt3ITZhGJdqmxquMKeZ+mmkFQ=
github.com/mitchellh/copystructure v1.0.0/go.mod h1:SNtv71yrdKgLRyLFxmLdkAbkKEFWgYaq1OVrnRcwhnw=
github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY=
github.com/mitchellh/mapstructure v1.5.0/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
github.com/mitchellh/reflectwalk v1.0.0 h1:9D+8oIskB4VJBN5SFlmc27fSlIBZaov1Wpk/IfikLNY=
@ -260,6 +274,8 @@ github.com/otiai10/openaigo v1.6.0 h1:YTQEbtDSvawETOB/Kmb/6JvuHdHH/eIpSQfHVufiwY
github.com/otiai10/openaigo v1.6.0/go.mod h1:kIaXc3V+Xy5JLplcBxehVyGYDtufHp3PFPy04jOwOAI=
github.com/phayes/freeport v0.0.0-20220201140144-74d24b5ae9f5 h1:Ii+DKncOVM8Cu1Hc+ETb5K+23HdAMvESYE3ZJ5b5cMI=
github.com/phayes/freeport v0.0.0-20220201140144-74d24b5ae9f5/go.mod h1:iIss55rKnNBTvrwdmkUpLnDpZoAHvWaiq5+iMmen4AE=
github.com/philippgille/chromem-go v0.5.0 h1:bryX0F3N6jnN/21iBd8i2/k9EzPTZn3nyiqAti19si8=
github.com/philippgille/chromem-go v0.5.0/go.mod h1:hTd+wGEm/fFPQl7ilfCwQXkgEUxceYh86iIdoKMolPo=
github.com/pierrec/lz4/v4 v4.1.2 h1:qvY3YFXRQE/XB8MlLzJH7mSzBs74eA2gg52YTk6jUPM=
github.com/pierrec/lz4/v4 v4.1.2/go.mod h1:gZWDp/Ze/IJXGXf23ltt2EXimqmTUXEy0GFuRQyBid4=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
@ -430,6 +446,7 @@ golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.10.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
@ -489,6 +506,7 @@ gopkg.in/fsnotify.v1 v1.4.7/go.mod h1:Tz8NjZHkW78fSQdbUxIjBTcgA1z1m8ZHf0WmKUhAMy
gopkg.in/op/go-logging.v1 v1.0.0-20160211212156-b2cb9fa56473/go.mod h1:N1eN2tsCx0Ydtgjl4cqmbRCsY4/+z4cYDeqwZTk6zog=
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 h1:uRGJdciOHaEIrze2W8Q3AKkepLTh2hOroT7a+7czfdQ=
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw=
gopkg.in/yaml.v1 v1.0.0-20140924161607-9f9df34309c0/go.mod h1:WDnlLJ4WF5VGsH/HVa3CI79GS0ol3YnhVnKP89i0kNg=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.3.0/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
@ -500,3 +518,5 @@ gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gotest.tools/v3 v3.0.2/go.mod h1:3SzNCllyD9/Y+b5r9JIKQ474KzkZyqLqEfYqMsX94Bk=
gotest.tools/v3 v3.3.0 h1:MfDY1b1/0xN1CyMlQDac0ziEy9zJQd9CXBRRDHw2jJo=
gotest.tools/v3 v3.3.0/go.mod h1:Mcr9QNxkg0uMvy/YElmo4SpXgJKWgQvYrT7Kw5RzJ1A=
howett.net/plist v1.0.0 h1:7CrbWYbPPO/PyNy38b2EB/+gYbjCe2DXBxgtOOZbSQM=
howett.net/plist v1.0.0/go.mod h1:lqaXoTrLY4hg8tnEzNru53gicrbv7rrk+2xJA/7hw9g=

View File

@ -55,6 +55,9 @@ func InstallModelFromGallery(galleries []Gallery, name string, basePath string,
installName = req.Name
}
// Copy the model configuration from the request schema
config.URLs = append(config.URLs, model.URLs...)
config.Icon = model.Icon
config.Files = append(config.Files, req.AdditionalFiles...)
config.Files = append(config.Files, model.AdditionalFiles...)
@ -186,6 +189,12 @@ func getGalleryModels(gallery Gallery, basePath string) ([]*GalleryModel, error)
return models, nil
}
func GetLocalModelConfiguration(basePath string, name string) (*Config, error) {
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
galleryFile := filepath.Join(basePath, galleryFileName(name))
return ReadConfigFile(galleryFile)
}
func DeleteModelFromSystem(basePath string, name string, additionalFiles []string) error {
// os.PathSeparator is not allowed in model names. Replace them with "__" to avoid conflicts with file paths.
name = strings.ReplaceAll(name, string(os.PathSeparator), "__")
@ -228,5 +237,8 @@ func DeleteModelFromSystem(basePath string, name string, additionalFiles []strin
err = errors.Join(err, fmt.Errorf("failed to remove file %s: %w", configFile, e))
}
// Delete gallery config file
os.Remove(galleryFile)
return err
}

View File

@ -40,8 +40,10 @@ prompt_templates:
*/
// Config is the model configuration which contains all the model details
// This configuration is read from the gallery endpoint and is used to download and install the model
// It is the internal structure, separated from the request
type Config struct {
Description string `yaml:"description"`
Icon string `yaml:"icon"`
License string `yaml:"license"`
URLs []string `yaml:"urls"`
Name string `yaml:"name"`

View File

@ -1,10 +1,10 @@
package gallery
type GalleryOp struct {
Id string
GalleryName string
ConfigURL string
Delete bool
Id string
GalleryModelName string
ConfigURL string
Delete bool
Req GalleryModel
Galleries []Gallery
@ -19,4 +19,5 @@ type GalleryOpStatus struct {
Progress float64 `json:"progress"`
TotalFileSize string `json:"file_size"`
DownloadedFileSize string `json:"downloaded_size"`
GalleryModelName string `json:"gallery_model_name"`
}

View File

@ -1,5 +1,10 @@
package gallery
import (
"fmt"
"strings"
)
// GalleryModel is the struct used to represent a model in the gallery returned by the endpoint.
// It is used to install the model by resolving the URL and downloading the files.
// The other fields are used to override the configuration of the model.
@ -22,3 +27,23 @@ type GalleryModel struct {
// Installed is used to indicate if the model is installed or not
Installed bool `json:"installed,omitempty" yaml:"installed,omitempty"`
}
func (m GalleryModel) ID() string {
return fmt.Sprintf("%s@%s", m.Gallery.Name, m.Name)
}
type GalleryModels []*GalleryModel
func (gm GalleryModels) Search(term string) GalleryModels {
var filteredModels GalleryModels
for _, m := range gm {
if strings.Contains(m.Name, term) ||
strings.Contains(m.Description, term) ||
strings.Contains(m.Gallery.Name, term) ||
strings.Contains(strings.Join(m.Tags, ","), term) {
filteredModels = append(filteredModels, m)
}
}
return filteredModels
}

View File

@ -2,6 +2,7 @@ package langchain
import (
"context"
"fmt"
"github.com/tmc/langchaingo/llms"
"github.com/tmc/langchaingo/llms/huggingface"
@ -9,11 +10,16 @@ import (
type HuggingFace struct {
modelPath string
token string
}
func NewHuggingFace(repoId string) (*HuggingFace, error) {
func NewHuggingFace(repoId, token string) (*HuggingFace, error) {
if token == "" {
return nil, fmt.Errorf("no huggingface token provided")
}
return &HuggingFace{
modelPath: repoId,
token: token,
}, nil
}
@ -21,7 +27,7 @@ func (s *HuggingFace) PredictHuggingFace(text string, opts ...PredictOption) (*P
po := NewPredictOptions(opts...)
// Init client
llm, err := huggingface.New()
llm, err := huggingface.New(huggingface.WithToken(s.token))
if err != nil {
return nil, err
}

View File

@ -2,27 +2,32 @@ package model
import (
"context"
"errors"
"fmt"
"os"
"path/filepath"
"slices"
"strings"
"time"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
"github.com/hashicorp/go-multierror"
"github.com/phayes/freeport"
"github.com/rs/zerolog/log"
)
var Aliases map[string]string = map[string]string{
"go-llama": LLamaCPP,
"llama": LLamaCPP,
"embedded-store": LocalStoreBackend,
"go-llama": LLamaCPP,
"llama": LLamaCPP,
"embedded-store": LocalStoreBackend,
"langchain-huggingface": LCHuggingFaceBackend,
}
const (
LlamaGGML = "llama-ggml"
LLamaCPP = "llama-cpp"
LlamaGGML = "llama-ggml"
LLamaCPP = "llama-cpp"
LLamaCPPFallback = "llama-cpp-fallback"
Gpt4AllLlamaBackend = "gpt4all-llama"
Gpt4AllMptBackend = "gpt4all-mpt"
Gpt4AllJBackend = "gpt4all-j"
@ -34,21 +39,75 @@ const (
StableDiffusionBackend = "stablediffusion"
TinyDreamBackend = "tinydream"
PiperBackend = "piper"
LCHuggingFaceBackend = "langchain-huggingface"
LCHuggingFaceBackend = "huggingface"
LocalStoreBackend = "local-store"
)
var AutoLoadBackends []string = []string{
LLamaCPP,
LlamaGGML,
Gpt4All,
BertEmbeddingsBackend,
RwkvBackend,
WhisperBackend,
StableDiffusionBackend,
TinyDreamBackend,
PiperBackend,
func backendPath(assetDir, backend string) string {
return filepath.Join(assetDir, "backend-assets", "grpc", backend)
}
// backendsInAssetDir returns the list of backends in the asset directory
// that should be loaded
func backendsInAssetDir(assetDir string) ([]string, error) {
// Exclude backends from automatic loading
excludeBackends := []string{LocalStoreBackend}
entry, err := os.ReadDir(backendPath(assetDir, ""))
if err != nil {
return nil, err
}
var backends []string
ENTRY:
for _, e := range entry {
for _, exclude := range excludeBackends {
if e.Name() == exclude {
continue ENTRY
}
}
if !e.IsDir() {
backends = append(backends, e.Name())
}
}
// order backends from the asset directory.
// as we scan for backends, we want to keep some order which backends are tried of.
// for example, llama.cpp should be tried first, and we want to keep the huggingface backend at the last.
// sets a priority list
// First has more priority
priorityList := []string{
// First llama.cpp and llama-ggml
LLamaCPP, LLamaCPPFallback, LlamaGGML, Gpt4All,
}
toTheEnd := []string{
// last has to be huggingface
LCHuggingFaceBackend,
// then bert embeddings
BertEmbeddingsBackend,
}
slices.Reverse(priorityList)
slices.Reverse(toTheEnd)
// order certain backends first
for _, b := range priorityList {
for i, be := range backends {
if be == b {
backends = append([]string{be}, append(backends[:i], backends[i+1:]...)...)
break
}
}
}
// make sure that some others are pushed at the end
for _, b := range toTheEnd {
for i, be := range backends {
if be == b {
backends = append(append(backends[:i], backends[i+1:]...), be)
break
}
}
}
return backends, nil
}
// starts the grpcModelProcess for the backend, and returns a grpc client
@ -99,7 +158,7 @@ func (ml *ModelLoader) grpcModel(backend string, o *Options) func(string, string
client = ModelAddress(uri)
}
} else {
grpcProcess := filepath.Join(o.assetDir, "backend-assets", "grpc", backend)
grpcProcess := backendPath(o.assetDir, backend)
// Check if the file exists
if _, err := os.Stat(grpcProcess); os.IsNotExist(err) {
return "", fmt.Errorf("grpc process not found: %s. some backends(stablediffusion, tts) require LocalAI compiled with GO_TAGS", grpcProcess)
@ -243,7 +302,12 @@ func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) {
// autoload also external backends
allBackendsToAutoLoad := []string{}
allBackendsToAutoLoad = append(allBackendsToAutoLoad, AutoLoadBackends...)
autoLoadBackends, err := backendsInAssetDir(o.assetDir)
if err != nil {
return nil, err
}
log.Debug().Msgf("Loading from the following backends (in order): %+v", autoLoadBackends)
allBackendsToAutoLoad = append(allBackendsToAutoLoad, autoLoadBackends...)
for _, b := range o.externalBackends {
allBackendsToAutoLoad = append(allBackendsToAutoLoad, b)
}
@ -271,10 +335,10 @@ func (ml *ModelLoader) GreedyLoader(opts ...Option) (grpc.Backend, error) {
log.Info().Msgf("[%s] Loads OK", b)
return model, nil
} else if modelerr != nil {
err = multierror.Append(err, modelerr)
err = errors.Join(err, modelerr)
log.Info().Msgf("[%s] Fails: %s", b, modelerr.Error())
} else if model == nil {
err = multierror.Append(err, fmt.Errorf("backend returned no usable model"))
err = errors.Join(err, fmt.Errorf("backend returned no usable model"))
log.Info().Msgf("[%s] Fails: %s", b, "backend returned no usable model")
}
}

View File

@ -15,6 +15,12 @@ func NewSyncedMap[K comparable, V any]() *SyncedMap[K, V] {
}
}
func (m *SyncedMap[K, V]) Map() map[K]V {
m.mu.RLock()
defer m.mu.RUnlock()
return m.m
}
func (m *SyncedMap[K, V]) Get(key K) V {
m.mu.RLock()
defer m.mu.RUnlock()

38
pkg/xsysinfo/cpu.go Normal file
View File

@ -0,0 +1,38 @@
package xsysinfo
import (
"sort"
"github.com/jaypipes/ghw"
"github.com/klauspost/cpuid/v2"
)
func CPUCapabilities() ([]string, error) {
cpu, err := ghw.CPU()
if err != nil {
return nil, err
}
caps := map[string]struct{}{}
for _, proc := range cpu.Processors {
for _, c := range proc.Capabilities {
caps[c] = struct{}{}
}
}
ret := []string{}
for c := range caps {
ret = append(ret, c)
}
// order
sort.Strings(ret)
return ret, nil
}
func HasCPUCaps(ids ...cpuid.FeatureID) bool {
return cpuid.CPU.Supports(ids...)
}

15
pkg/xsysinfo/gpu.go Normal file
View File

@ -0,0 +1,15 @@
package xsysinfo
import (
"github.com/jaypipes/ghw"
"github.com/jaypipes/ghw/pkg/gpu"
)
func GPUs() ([]*gpu.GraphicsCard, error) {
gpu, err := ghw.GPU()
if err != nil {
return nil, err
}
return gpu.GraphicsCards, nil
}