LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2024-06-07 19:40:48 +00:00

Author	SHA1	Message	Date
Ettore Di Giacinto	fdb45153fe	feat(llama.cpp): Totally decentralized, private, distributed, p2p inference (#2343) * feat(llama.cpp): Enable decentralized, distributed inference As https://github.com/mudler/LocalAI/pull/2324 introduced distributed inferencing thanks to @rgerganov's implementation in https://github.com/ggerganov/llama.cpp/pull/6829 in upstream llama.cpp, it is now possible to distribute the workload to a remote llama.cpp gRPC server. This changeset uses mudler/edgevpn to establish a secure, distributed network between the nodes using a shared token. The token is generated automatically when starting the server with the `--p2p` flag, and workers join by running `local-ai worker p2p-llama-cpp-rpc` with the token passed either via environment variable (TOKEN) or as an argument (--token). As per how mudler/edgevpn works, a network is established between the server and the workers with the DHT and mDNS discovery protocols; the llama.cpp RPC server is automatically started and exposed to the underlying p2p network so the API server can connect to it. When the HTTP server is started, it discovers the workers in the network and automatically creates local port-forwards to their services. llama.cpp is then configured to use those services. This feature is behind the "p2p" GO_FLAGS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * go mod tidy Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: add p2p tag Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * better message Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 19:17:59 +02:00
Ettore Di Giacinto	16474bfb40	build: add sha (#2356 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 18:02:19 +02:00
Ettore Di Giacinto	5a6d120a56	feat(functions): don't use yaml.MapSlice (#2354 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 08:31:06 +02:00
Ettore Di Giacinto	7a480bb16f	models(gallery): add LocalAI-Llama3-8b-Function-Call-v0.2-GGUF (#2355 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-20 00:59:17 +02:00
LocalAI [bot]	053531e434	⬆️ Update ggerganov/whisper.cpp (#2352 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-19 22:23:02 +00:00
LocalAI [bot]	b7ab4f25d9	⬆️ Update ggerganov/llama.cpp (#2351 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-19 22:22:03 +00:00
Ettore Di Giacinto	73566a2bb2	feat(functions): allow to use JSONRegexMatch unconditionally (#2349 ) * feat(functions): allow to use JSONRegexMatch unconditionally Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(functions): make json_regex_match a list Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 18:24:49 +02:00
Ettore Di Giacinto	8ccd5ab040	feat(webui): statically embed js/css assets (#2348 ) * feat(webui): statically embed js/css assets Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * update font assets Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 18:24:27 +02:00
Ettore Di Giacinto	5a3db730b9	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-19 16:37:10 +02:00
Ettore Di Giacinto	8ad669339e	add openvoice backend (#2334 ) Wip openvoice	2024-05-19 16:27:08 +02:00
Ettore Di Giacinto	a10a952085	models(gallery): update poppy porpoise mmproj (#2346 ) models(gallery): update poppy porpose mmproj Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 13:26:02 +02:00
Ettore Di Giacinto	b37447cac5	models(gallery): add master-yi (#2345 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 13:25:29 +02:00
Ettore Di Giacinto	f2d182a2eb	models(gallery): add anita (#2344 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-19 13:25:16 +02:00
lenaxia	6b6c8cdd5f	feat(functions): Enable true regex replacement for the regexReplacement option (#2341 ) * Adding regex capabilities to ParseFunctionCall replacement Signed-off-by: Lenaxia <github@47north.lat> * Adding tests for the regex replace in ParseFunctionCall Signed-off-by: Lenaxia <github@47north.lat> * Fixing tests and adding a test case to validate double quote replacement works Signed-off-by: Lenaxia <github@47north.lat> * Make Regex replacement stable, drop lookaheads Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Lenaxia <github@47north.lat> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Lenaxia <github@47north.lat> Co-authored-by: mudler <mudler@localai.io>	2024-05-19 01:29:10 +02:00
LocalAI [bot]	5f35e85e86	⬆️ Update ggerganov/llama.cpp (#2342 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-18 21:06:29 +00:00
Ettore Di Giacinto	02f1b477df	feat(functions): simplify parsing, read functions as list (#2340 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-18 09:35:28 +02:00
LocalAI [bot]	9ab8f8f5e0	⬆️ Update ggerganov/llama.cpp (#2339 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-17 21:13:01 +00:00
LocalAI [bot]	9a255d6453	⬆️ Update ggerganov/llama.cpp (#2337 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-16 21:53:19 +00:00
Ettore Di Giacinto	e0ef9e2bb9	models(gallery): add yi 6/9b, sqlcoder, sfr-iterative-dpo (#2335 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-16 20:05:20 +02:00
cryptk	86627b27f7	fix: add setuptools to all requirements-intel.txt files for python backends (#2333 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-16 19:15:46 +02:00
LocalAI [bot]	4e92569d45	⬆️ Update ggerganov/whisper.cpp (#2329 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-15 22:24:06 +00:00
Ettore Di Giacinto	f7508e3888	models(gallery): add hermes-2-theta-llama-3-8b (#2331 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-16 00:22:32 +02:00
Aleksandr Oleinikov	badfc16df1	fix(gallery) Correct llama3-8b-instruct model file (#2330 ) Correct llama3-8b-instruct model file This must be a mistake because the config tries to use a model file that is different from the one actually being downloaded. I assumed the downloaded file is what should be used so I corrected the specified model file to that Signed-off-by: Aleksandr Oleinikov <10602045+tannisroot@users.noreply.github.com>	2024-05-16 00:22:05 +02:00
LocalAI [bot]	b584dcf18a	⬆️ Update ggerganov/llama.cpp (#2316 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-15 22:20:37 +00:00
Ettore Di Giacinto	4c845fb47d	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-15 23:56:52 +02:00
Ettore Di Giacinto	07c0559d06	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-15 23:56:22 +02:00
Ettore Di Giacinto	beb598e4f9	feat(functions): mixed JSON BNF grammars (#2328) feat(functions): support mixed JSON BNF grammar

This PR provides new options to control how functions are extracted from the LLM, and also provides more control over how JSON grammars can be used (also in conjunction). New YAML settings introduced:

- `grammar_message`: when enabled, the generated grammar can also emit plain strings, not only JSON objects. This lets the LLM choose between responding freely and responding with JSON.
- `grammar_prefix`: prefixes a string to the JSON grammar definition.
- `replace_results`: a map of strings to replace in the LLM result.

As an example, consider the following settings for Hermes-2-Pro-Mistral, which allow extracting both the JSON results coming from the model and the ones coming from the grammar:

```yaml
function:
  # disable injecting the "answer" tool
  disable_no_action: true
  # This allows the grammar to also return messages
  grammar_message: true
  # Prefix to add to the grammar
  grammar_prefix: '<tool_call>\n'
  return_name_in_function_response: true
  # Without grammar uncomment the lines below
  # Warning: this is relying only on the capability of the
  # LLM model to generate the correct function call.
  # no_grammar: true
  # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
```

Note: To disable grammar usage entirely in the example above, uncomment `no_grammar` and `json_regex_match`.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-15 20:03:18 +02:00
Ettore Di Giacinto	c89271b2e4	feat(llama.cpp): add distributed llama.cpp inferencing (#2324 ) * feat(llama.cpp): support distributed llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: let tweak how chat messages are merged together Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Makefile: register to ALL_GRPC_BACKENDS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring, allow disable auto-detection of backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * minor fixups Signed-off-by: mudler <mudler@localai.io> * feat: add cmd to start rpc-server from llama.cpp Signed-off-by: mudler <mudler@localai.io> * ci: add ccache Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: mudler <mudler@localai.io>	2024-05-15 01:17:02 +02:00
Ettore Di Giacinto	29909666c3	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-05-15 00:33:16 +02:00
LocalAI [bot]	566b5cf2ee	⬆️ Update ggerganov/whisper.cpp (#2326 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-14 21:17:46 +00:00
Sertaç Özercan	a670318a9f	feat: auto select llama-cpp cuda runtime (#2306 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * auto select cuda Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * update test Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * select CUDA backend only if present Signed-off-by: mudler <mudler@localai.io> * ci: keep cuda bin in path Signed-off-by: mudler <mudler@localai.io> * Makefile: make dist now builds also cuda Signed-off-by: mudler <mudler@localai.io> * Keep pushing fallback in case auto-flagset/nvidia fails There could be other reasons for which the default binary may fail. For example we might have detected an Nvidia GPU, however the user might not have the drivers/cuda libraries installed in the system, and so it would fail to start. We keep the fallback of llama.cpp at the end of the llama.cpp backends to try to fallback loading in case things go wrong Signed-off-by: mudler <mudler@localai.io> * Do not build cuda on MacOS Signed-off-by: mudler <mudler@localai.io> * cleanup Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * Apply suggestions from code review Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: mudler <mudler@localai.io>	2024-05-14 19:40:18 +02:00
Ettore Di Giacinto	84e2407afa	feat(functions): allow to set JSON matcher (#2319 ) Signed-off-by: mudler <mudler@localai.io>	2024-05-14 09:39:20 +02:00
Ettore Di Giacinto	c4186f13c3	feat(functions): support models with no grammar and no regex (#2315 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-14 00:32:32 +02:00
LocalAI [bot]	4ac7956f68	⬆️ Update ggerganov/whisper.cpp (#2317 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-13 22:25:14 +00:00
Ettore Di Giacinto	e49ea0123b	feat(llama.cpp): add `flash_attention` and `no_kv_offloading` (#2310 ) feat(llama.cpp): add flash_attn and no_kv_offload Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 19:07:51 +02:00
Ettore Di Giacinto	7123d07456	models(gallery): add orthocopter (#2313 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:45:58 +02:00
Ettore Di Giacinto	2db22087ae	models(gallery): add lumimaidv2 (#2312 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:44 +02:00
Ettore Di Giacinto	fa7b2aee9c	models(gallery): add Bunny-llama (#2311 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:25 +02:00
Ettore Di Giacinto	4d70b6fb2d	models(gallery): add aura-llama-Abliterated (#2309 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-13 18:44:10 +02:00
Sertaç Özercan	e2c3ffb09b	feat: auto select llama-cpp cpu variant (#2305 ) * auto select cpu variant Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * remove cuda target for now Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix metal Signed-off-by: Sertac Ozercan <sozercan@gmail.com> * fix path Signed-off-by: Sertac Ozercan <sozercan@gmail.com> --------- Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-05-13 11:37:52 +02:00
LocalAI [bot]	b4cb22f444	⬆️ Update ggerganov/llama.cpp (#2303 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-12 21:18:59 +00:00
LocalAI [bot]	5534b13903	feat(swagger): update swagger (#2302 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-05-12 21:00:18 +00:00
fakezeta	5b79bd04a7	add setuptools for openvino (#2301 )	2024-05-12 19:31:43 +00:00
Ettore Di Giacinto	9d8c705fd9	feat(ui): display number of available models for installation (#2298 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 14:24:36 +02:00
Ettore Di Giacinto	310b2171be	models(gallery): add llama-3-refueled (#2297 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:39:58 +02:00
Ettore Di Giacinto	98af0b5d85	models(gallery): add jsl-medllama-3-8b-v2.0 (#2296 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:38:05 +02:00
Ettore Di Giacinto	ca14f95d2c	models(gallery): add l3-chaoticsoliloquy-v1.5-4x8b (#2295 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-12 09:37:55 +02:00
Ikko Eltociear Ashimine	1b69b338c0	docs: Update semantic-todo/README.md (#2294 ) seperate -> separate Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>	2024-05-12 09:02:11 +02:00
cryptk	88942e4761	fix: add missing openvino/optimum/etc libraries for Intel, fixes #2289 (#2292 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-05-12 09:01:45 +02:00
Ettore Di Giacinto	efa32a2677	feat(grammar): support models with specific construct (#2291) When enabling grammar with functions, it might be useful to allow more flexibility to support models that are fine-tuned to return function calls of the form { "name": "function_name", "arguments": {...} } rather than { "function": "function_name", "arguments": {...} }. This might call for a more generic approach later on, but for the moment we can easily support both, as we just have to specify different types. If needed we can expand on this later on. Signed-off-by: mudler <mudler@localai.io>	2024-05-12 01:13:22 +02:00
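
Several of the commits above (#2291, #2328, #2349) concern how a function call is pulled out of raw model output: an optional regex isolates the JSON payload, a replacement map cleans it up, and the call may name the function under either `name` or `function`. The following is a minimal Python sketch of that flow for illustration only; it is not LocalAI's implementation (which is written in Go). The constant and function names are hypothetical, the replacement map uses plain string substitution, and the order of the steps is an assumption.

```python
import json
import re

# Hypothetical settings mirroring the YAML keys named in the commit messages.
# A list is used for the patterns, following "make json_regex_match a list".
JSON_REGEX_MATCH = [r"(?s)<tool_call>(.*?)</tool_call>"]
REPLACE_RESULTS = {"<tool_call>": "", "'": '"'}


def extract_call(llm_output):
    """Return (function_name, arguments) for a call found in llm_output, else None."""
    candidate = llm_output
    # Step 1 (assumed order): take the first regex capture, if any pattern matches.
    for pattern in JSON_REGEX_MATCH:
        match = re.search(pattern, llm_output)
        if match:
            candidate = match.group(1)
            break
    # Step 2: apply literal string replacements to the candidate text.
    for old, new in REPLACE_RESULTS.items():
        candidate = candidate.replace(old, new)
    # Step 3: parse the JSON; anything unparsable is treated as "no call".
    try:
        payload = json.loads(candidate)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    # Step 4: accept both call shapes described in #2291.
    name = payload.get("name") or payload.get("function")
    if name is None:
        return None
    return name, payload.get("arguments", {})
```

For instance, calling `extract_call` on a `<tool_call>`-wrapped payload that uses single quotes and the `name` key, or on a bare JSON object that uses the `function` key, yields the same `(name, arguments)` tuple shape, while free-form text yields `None`. Replacing every single quote is a blunt rule that would corrupt arguments containing apostrophes, which is one reason to prefer grammar-constrained output where it is available.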


1683 Commits