Sebastian.W
d23e73b118
fix(autogptq): do not use_triton with qwen-vl ( #1985 )
...
* Enhance autogptq backend to support VL models
* update dependencies for autogptq
* remove redundant auto-gptq dependency
* Convert base64 to image_url for Qwen-VL model
* implemented model inference for qwen-vl
* remove user prompt from generated answer
* fixed write image error
* fixed use_triton issue when loading Qwen-VL model
---------
Co-authored-by: Binghua Wu <bingwu@estee.com>
2024-04-10 10:36:10 +00:00
Ettore Di Giacinto
d692b2c32a
ci: push latest images for dockerhub ( #1984 )
...
Fixes: #1983
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-10 10:31:59 +02:00
LocalAI [bot]
7e2f8bb408
⬆️ Update ggerganov/whisper.cpp ( #1980 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:08:00 +02:00
LocalAI [bot]
951e39d36c
⬆️ Update ggerganov/llama.cpp ( #1979 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:07:41 +02:00
LocalAI [bot]
aeb3f835ae
⬆️ Update docs version mudler/LocalAI ( #1978 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-10 09:07:21 +02:00
Ettore Di Giacinto
cc3d601836
ci: fixup latest image push
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-09 09:49:11 +02:00
Ettore Di Giacinto
2bbb221fb1
tests(petals): temp disable
2024-04-08 21:28:59 +00:00
LocalAI [bot]
195be10050
⬆️ Update ggerganov/llama.cpp ( #1973 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 23:26:52 +02:00
fakezeta
a38618db02
fix regression #1971 ( #1972 )
...
fixes regression #1971 introduced by intel_extension_for_transformers==1.4
2024-04-08 22:33:51 +02:00
LocalAI [bot]
efcca15d3f
⬆️ Update ggerganov/llama.cpp ( #1970 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 08:38:47 +02:00
LocalAI [bot]
a153b628c2
⬆️ Update ggerganov/whisper.cpp ( #1969 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-08 08:38:17 +02:00
Ettore Di Giacinto
f36d86ba6d
fix(hermes-2-pro-mistral): correct dashes in template to suppress newlines ( #1966 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-07 18:23:47 +02:00
Ettore Di Giacinto
74492a81c7
doc(quickstart): fix typo
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-07 11:06:35 +02:00
LocalAI [bot]
ed13782986
⬆️ Update ggerganov/llama.cpp ( #1964 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-07 10:32:10 +02:00
Ettore Di Giacinto
8342553214
fix(llama.cpp): set better defaults for llama.cpp ( #1961 )
...
fix(defaults): set better defaults for llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-06 22:56:45 +02:00
LocalAI [bot]
8aa5f5a660
⬆️ Update ggerganov/llama.cpp ( #1960 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-06 19:15:25 +00:00
LocalAI [bot]
b2d9e3f704
⬆️ Update ggerganov/llama.cpp ( #1959 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-05 08:41:55 +02:00
LocalAI [bot]
f744e1f931
⬆️ Update ggerganov/whisper.cpp ( #1958 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-05 08:41:35 +02:00
cryptk
b85dad0286
feat: first pass at improving logging ( #1956 )
...
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-04 09:24:22 +02:00
LocalAI [bot]
3851b51d98
⬆️ Update ggerganov/llama.cpp ( #1953 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-04 00:27:57 +02:00
Ettore Di Giacinto
ff77d3bc22
fix(seed): generate random seed per-request if -1 is set ( #1952 )
...
* fix(seed): generate random seed per-request if -1 is set
Also update CI with new workflows and allow the aio tests to run with an
API key
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* docs(openvino): Add OpenVINO example
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-03 22:25:47 +02:00
Ettore Di Giacinto
93cfec3c32
ci: correctly tag latest and aio images
2024-04-03 11:30:23 +02:00
Ettore Di Giacinto
89560ef87f
fix(ci): manually tag latest images ( #1948 )
...
fix(ci): manually tag images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-02 19:25:46 +02:00
Ettore Di Giacinto
9bc209ba73
fix(welcome): stable model list ( #1949 )
2024-04-02 19:25:32 +02:00
Ettore Di Giacinto
84e0dc3246
fix(hermes-2-pro-mistral): correct stopwords ( #1947 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-02 15:38:00 +02:00
LocalAI [bot]
4d4d76114d
⬆️ Update ggerganov/llama.cpp ( #1941 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-02 09:16:04 +02:00
cryptk
86bc5f1350
fix: use exec in entrypoint scripts to fix signal handling ( #1943 )
2024-04-02 09:15:44 +02:00
Ettore Di Giacinto
e8f02c083f
fix(functions): respect when selected from string ( #1940 )
...
* fix(functions): respect when selected from string
* fix(toolschoice): decode both string and objects
2024-04-01 19:39:54 +02:00
Ettore Di Giacinto
ebb1fcedea
fix(hermes-2-pro-mistral): add stopword for toolcall ( #1939 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-01 11:48:35 +02:00
LocalAI [bot]
66f90f8dc1
⬆️ Update ggerganov/llama.cpp ( #1937 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-04-01 08:59:23 +02:00
Ettore Di Giacinto
3c778b538a
Update phi-2-orange.yaml
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-31 13:06:41 +02:00
Ettore Di Giacinto
35290e146b
fix(grammar): respect JSONmode and grammar from user input ( #1935 )
...
* fix(grammar): Fix JSON mode and custom grammar
* tests(aio): add jsonmode test
* tests(aio): add functioncall test
* fix(aio): use hermes-2-pro-mistral as llm for CPU profile
* add phi-2-orange
2024-03-31 13:04:09 +02:00
LocalAI [bot]
784657a652
⬆️ Update ggerganov/llama.cpp ( #1934 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-31 00:27:38 +01:00
LocalAI [bot]
831efa8893
⬆️ Update ggerganov/whisper.cpp ( #1933 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-31 00:27:16 +01:00
Ettore Di Giacinto
957f428fd5
fix(tools): correctly render tools response in templates ( #1932 )
...
* fix(tools): correctly display both Functions and Tools
* models(hermes-2-pro): correctly display function results
2024-03-30 19:02:07 +01:00
Ettore Di Giacinto
61e5e6bc36
fix(swagger): do not specify a host ( #1930 )
...
This way, requests are directed to the host the client used to make
the request.
2024-03-30 12:04:41 +01:00
Ettore Di Giacinto
eab4a91a9b
fix(aio): correctly detect intel systems ( #1931 )
...
Also rename SIZE to PROFILE
2024-03-30 12:04:32 +01:00
LocalAI [bot]
2bba62ca4d
⬆️ Update ggerganov/llama.cpp ( #1928 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 22:52:01 +00:00
Ettore Di Giacinto
bcdc83b46d
Update quickstart.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-29 23:00:06 +01:00
Ettore Di Giacinto
92fbdfd06f
feat(swagger): update ( #1929 )
2024-03-29 22:48:58 +01:00
cryptk
93702e39d4
feat(build): adjust number of parallel make jobs ( #1915 )
...
* feat(build): adjust number of parallel make jobs
* fix: update make on macOS from brew to support --output-sync argument
* fix: cache grpc with version as part of key to improve validity of cache hits
* fix: use gmake for tests-apple to use the updated GNU make version
* fix: actually use the new make version for tests-apple
* feat: parallelize tests-extra
* feat: attempt to cache grpc build for docker images
* fix: don't quote GRPC version
* fix: don't cache go modules, we have limited cache space, better used elsewhere
* fix: release with the same version of go that we test with
* fix: don't fail on exporting cache layers
* fix: remove deprecated BUILD_GRPC docker arg from Makefile
2024-03-29 22:32:40 +01:00
LocalAI [bot]
a7fc89c207
⬆️ Update ggerganov/whisper.cpp ( #1927 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 22:29:50 +01:00
Ettore Di Giacinto
123a5a2e16
feat(swagger): Add swagger API doc ( #1926 )
...
* makefile(build): add minimal and api build target
* feat(swagger): Add swagger
2024-03-29 22:29:33 +01:00
LocalAI [bot]
ab2f403dd0
⬆️ Update ggerganov/whisper.cpp ( #1924 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 00:13:59 +01:00
LocalAI [bot]
b9c5e14e2c
⬆️ Update ggerganov/llama.cpp ( #1923 )
...
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2024-03-29 00:13:38 +01:00
Ettore Di Giacinto
bf65ed6eb8
feat(webui): add partials, show backends associated to models ( #1922 )
...
* feat(webui): add partials, show backends associated to models
* fix(auth): put assistant and backend under auth
2024-03-28 21:52:52 +01:00
Ettore Di Giacinto
4e79294f97
Update README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-28 19:52:40 +01:00
Ettore Di Giacinto
8477e8fac3
Update quickstart.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-28 18:28:30 +01:00
Ettore Di Giacinto
13ccd2afef
docs(aio-usage): update docs to show examples ( #1921 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-03-28 18:16:58 +01:00
Ettore Di Giacinto
23b833d171
Update run-other-models.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-28 12:42:37 +01:00