LocalAI [bot]
9ab8f8f5e0
⬆️ Update ggerganov/llama.cpp ( #2339 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-17 21:13:01 +00:00
LocalAI [bot]
9a255d6453
⬆️ Update ggerganov/llama.cpp ( #2337 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-16 21:53:19 +00:00
LocalAI [bot]
4e92569d45
⬆️ Update ggerganov/whisper.cpp ( #2329 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-15 22:24:06 +00:00
LocalAI [bot]
b584dcf18a
⬆️ Update ggerganov/llama.cpp ( #2316 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-15 22:20:37 +00:00
Ettore Di Giacinto
c89271b2e4
feat(llama.cpp): add distributed llama.cpp inferencing ( #2324 )
...
* feat(llama.cpp): support distributed llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: let tweak how chat messages are merged together
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactor
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Makefile: register to ALL_GRPC_BACKENDS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring, allow disable auto-detection of backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* minor fixups
Signed-off-by: mudler <mudler@localai.io>
* feat: add cmd to start rpc-server from llama.cpp
Signed-off-by: mudler <mudler@localai.io>
* ci: add ccache
Signed-off-by: mudler <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: mudler <mudler@localai.io>
2024-05-15 01:17:02 +02:00
LocalAI [bot]
566b5cf2ee
⬆️ Update ggerganov/whisper.cpp ( #2326 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-14 21:17:46 +00:00
Sertaç Özercan
a670318a9f
feat: auto select llama-cpp cuda runtime ( #2306 )
...
* auto select cpu variant
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* remove cuda target for now
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix metal
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix path
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* cuda
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* auto select cuda
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* update test
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* select CUDA backend only if present
Signed-off-by: mudler <mudler@localai.io>
* ci: keep cuda bin in path
Signed-off-by: mudler <mudler@localai.io>
* Makefile: make dist now builds also cuda
Signed-off-by: mudler <mudler@localai.io>
* Keep pushing fallback in case auto-flagset/nvidia fails
There could be other reasons for which the default binary may fail. For example we might have detected an Nvidia GPU,
however the user might not have the drivers/cuda libraries installed in the system, and so it would fail to start.
We keep the fallback of llama.cpp at the end of the llama.cpp backends to try to fallback loading in case things go wrong
Signed-off-by: mudler <mudler@localai.io>
* Do not build cuda on MacOS
Signed-off-by: mudler <mudler@localai.io>
* cleanup
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* Apply suggestions from code review
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
2024-05-14 19:40:18 +02:00
LocalAI [bot]
4ac7956f68
⬆️ Update ggerganov/whisper.cpp ( #2317 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-13 22:25:14 +00:00
Sertaç Özercan
e2c3ffb09b
feat: auto select llama-cpp cpu variant ( #2305 )
...
* auto select cpu variant
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* remove cuda target for now
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix metal
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* fix path
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
---------
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-05-13 11:37:52 +02:00
LocalAI [bot]
b4cb22f444
⬆️ Update ggerganov/llama.cpp ( #2303 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-12 21:18:59 +00:00
LocalAI [bot]
dfc420706c
⬆️ Update ggerganov/llama.cpp ( #2290 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-11 21:16:34 +00:00
LocalAI [bot]
93e581dfd0
⬆️ Update ggerganov/llama.cpp ( #2285 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-10 21:09:22 +00:00
Ettore Di Giacinto
9b09eb005f
build: do not specify a BUILD_ID by default ( #2284 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-10 16:01:55 +02:00
LocalAI [bot]
18a04246fa
⬆️ Update ggerganov/llama.cpp ( #2281 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-09 22:18:49 +00:00
LocalAI [bot]
d651f390cd
⬆️ Update ggerganov/whisper.cpp ( #2273 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-08 22:11:10 +00:00
LocalAI [bot]
eca5200fbd
⬆️ Update ggerganov/llama.cpp ( #2272 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-08 21:34:56 +00:00
LocalAI [bot]
995aa5ed21
⬆️ Update ggerganov/llama.cpp ( #2263 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-07 21:39:12 +00:00
LocalAI [bot]
581b894789
⬆️ Update ggerganov/llama.cpp ( #2255 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-06 21:28:07 +00:00
LocalAI [bot]
c5475020fe
⬆️ Update ggerganov/llama.cpp ( #2251 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-05 21:16:00 +00:00
Ettore Di Giacinto
c5798500cb
feat(single-build): generate single binaries for releases ( #2246 )
...
* feat(single-build): generate single binaries for releases
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* drop old targets
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-05 17:20:51 +02:00
LocalAI [bot]
17e94fbcb1
⬆️ Update ggerganov/llama.cpp ( #2239 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-04 21:26:22 +00:00
Ettore Di Giacinto
530bec9c64
feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants ( #2232 )
...
* feat(initializer): do not specify backends to autoload
We can simply try to autoload the backends extracted in the asset dir.
This will allow to build variants of the same backend (for e.g. with different instructions sets),
so to have a single binary for all the variants.
Signed-off-by: mudler <mudler@localai.io>
* refactor(prepare): refactor out llama.cpp prepare steps
Make it so are idempotent and that we can re-build
Signed-off-by: mudler <mudler@localai.io>
* [TEST] feat(build): build noavx version along
Signed-off-by: mudler <mudler@localai.io>
* build: make build parallel
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* build: do not override CMAKE_ARGS
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* build: add fallback variant
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(huggingface-langchain): fail if no token is set
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(huggingface-langchain): rename
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: do not autoload local-store
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: give priority between the listed backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: mudler <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-04 17:56:12 +02:00
LocalAI [bot]
ac0f3d6e82
⬆️ Update ggerganov/whisper.cpp ( #2230 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-03 22:16:26 +00:00
LocalAI [bot]
da0b6a89ae
⬆️ Update ggerganov/llama.cpp ( #2229 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-03 21:39:28 +00:00
LocalAI [bot]
2cc1bd85af
⬆️ Update ggerganov/llama.cpp ( #2224 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-02 21:23:40 +00:00
LocalAI [bot]
6a7a7996bb
⬆️ Update ggerganov/llama.cpp ( #2213 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-05-01 21:19:44 +00:00
LocalAI [bot]
f90d56d371
⬆️ Update ggerganov/llama.cpp ( #2203 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-30 21:53:31 +00:00
Chris Jowett
970cb3a219
chore: update go-stablediffusion to latest commit with Make jobserver fix
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-30 15:59:28 -05:00
LocalAI [bot]
29d7812344
⬆️ Update ggerganov/whisper.cpp ( #2188 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-29 22:16:04 +00:00
cryptk
5fd46175dc
fix: ensure GNUMake jobserver is passed through to whisper.cpp build ( #2187 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-29 16:40:50 -05:00
LocalAI [bot]
52a268c38c
⬆️ Update ggerganov/llama.cpp ( #2189 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-29 21:36:30 +00:00
cryptk
93ca56086e
update go-tinydream to latest commit ( #2182 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-29 15:17:09 +02:00
LocalAI [bot]
5fef3b0ff1
⬆️ Update ggerganov/whisper.cpp ( #2177 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-28 22:32:45 +00:00
LocalAI [bot]
01860674c4
⬆️ Update ggerganov/llama.cpp ( #2176 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-28 21:41:12 +00:00
cryptk
21974fe1d3
fix: swap to WHISPER_CUDA per deprecation message from whisper.cpp ( #2170 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-28 17:51:53 +00:00
LocalAI [bot]
c3982212f9
⬆️ Update ggerganov/llama.cpp ( #2159 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-27 21:32:43 +00:00
LocalAI [bot]
030d555995
⬆️ Update ggerganov/llama.cpp ( #2150 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-27 02:18:28 +00:00
fakezeta
c9451cb604
Bump oneapi-basekit, optimum and openvino ( #2139 )
...
* Bump oneapi-basekit, optimum and openvino
* Changed PERFORMANCE HINT to CUMULATIVE_THROUGHPUT
Minor latency change for first token but about 10-15% speedup on token generation.
2024-04-26 16:20:43 +02:00
LocalAI [bot]
365ef92530
⬆️ Update mudler/go-stable-diffusion ( #2134 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-25 21:41:38 +00:00
LocalAI [bot]
5fceb876c4
⬆️ Update ggerganov/llama.cpp ( #2133 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-25 21:40:41 +00:00
Ettore Di Giacinto
b664edde29
feat(rerankers): Add new backend, support jina rerankers API ( #2121 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-25 00:19:02 +02:00
LocalAI [bot]
e16658b7ec
⬆️ Update ggerganov/llama.cpp ( #2123 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-24 22:00:17 +00:00
LocalAI [bot]
d30280ed23
⬆️ Update ggerganov/whisper.cpp ( #2122 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-24 21:55:30 +00:00
Ettore Di Giacinto
4fffc47e77
deps(llama.cpp): update, use better model for function call tests ( #2119 )
...
deps(llama.cpp): update
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-24 18:44:04 +02:00
LocalAI [bot]
38c9abed8b
⬆️ Update ggerganov/llama.cpp ( #2089 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-21 16:35:30 +00:00
Ettore Di Giacinto
284ad026b1
refactor(routes): split routes registration ( #2077 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-21 01:19:57 +02:00
LocalAI [bot]
1e37101930
⬆️ Update ggerganov/llama.cpp ( #2080 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-20 00:05:16 +00:00
LocalAI [bot]
e9448005a5
⬆️ Update ggerganov/llama.cpp ( #2051 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-04-18 21:30:55 +00:00
cryptk
e9f090257c
fix: adjust some sources names to match the naming of their repositories ( #2061 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-18 01:59:05 +00:00
Ettore Di Giacinto
af9e5a2d05
Revert #1963 ( #2056 )
...
* Revert "fix(fncall): fix regression introduced in #1963 (#2048 )"
This reverts commit 6b06d4e0af
.
* Revert "fix: action-tmate back to upstream, dead code removal (#2038 )"
This reverts commit fdec8a9d00
.
* Revert "feat(grpc): return consumed token count and update response accordingly (#2035 )"
This reverts commit e843d7df0e
.
* Revert "refactor: backend/service split, channel-based llm flow (#1963 )"
This reverts commit eed5706994
.
* feat(grpc): return consumed token count and update response accordingly
Fixes : #1920
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-17 23:33:49 +02:00