LocalAI/backend/cpp/llama
Ettore Di Giacinto c56b6ddb1c
fix(llama.cpp): disable infinite context shifting (#1704)
Infinite context shifting can trigger an endless generation loop if the model
hallucinates and never stops answering. This has the unpleasant effect that
the prediction never terminates, which happens especially with small models,
which tend to hallucinate.

Works around https://github.com/mudler/LocalAI/issues/1333 by removing
context shifting.

See also upstream issue: https://github.com/ggerganov/llama.cpp/issues/3969
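
A minimal sketch of the decision this fix changes, not the actual grpc-server.cpp code: the names (Slot, on_context_full, allow_context_shift, n_keep, stopped_limit) are hypothetical, loosely modeled on a llama.cpp-style server loop. With context shifting enabled, a full KV cache drops the oldest non-kept tokens and decoding continues; with it disabled, the prediction is terminated, so a model that never emits a stop token cannot loop forever.

#include <cstdio>

// Hypothetical slot state for one in-flight request.
struct Slot {
    int n_ctx;          // context window size
    int n_past;         // tokens already in the KV cache
    int n_keep;         // prompt tokens that must always be kept
    bool stopped_limit; // set when generation is cut off
};

// Returns true if decoding may continue for this slot.
// With allow_context_shift=false (the behavior after this fix), a full
// context ends the prediction instead of shifting tokens out indefinitely.
bool on_context_full(Slot &slot, bool allow_context_shift) {
    if (slot.n_past < slot.n_ctx) {
        return true; // still room, keep decoding
    }
    if (!allow_context_shift) {
        slot.stopped_limit = true; // terminate instead of shifting
        return false;
    }
    // Context shifting: discard half of the non-kept tokens so decoding
    // can continue; this is the path that can repeat forever when the
    // model keeps hallucinating and never stops.
    const int n_discard = (slot.n_past - slot.n_keep) / 2;
    slot.n_past -= n_discard;
    // ...a real server would also remove/shift those cells in the KV cache.
    return true;
}

int main() {
    Slot slot{512, 512, 32, false};
    const bool cont = on_context_full(slot, /*allow_context_shift=*/false);
    std::printf("continue=%d stopped_limit=%d\n", cont, slot.stopped_limit);
}

The trade-off is that generation now stops at the context limit, exchanging unbounded output length for a prediction that is guaranteed to terminate; see the upstream issue linked above for the shifting behavior itself.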
2024-02-13 21:17:21 +01:00
CMakeLists.txt feat(sycl): Add support for Intel GPUs with sycl (#1647) (#1660) 2024-02-01 19:21:52 +01:00
grpc-server.cpp fix(llama.cpp): disable infinite context shifting (#1704) 2024-02-13 21:17:21 +01:00
json.hpp 🔥 add LaVA support and GPT vision API, Multiple requests for llama.cpp, return JSON types (#1254) 2023-11-11 13:14:59 +01:00
Makefile feat(sycl): Add support for Intel GPUs with sycl (#1647) (#1660) 2024-02-01 19:21:52 +01:00
utils.hpp feat(sycl): Add support for Intel GPUs with sycl (#1647) (#1660) 2024-02-01 19:21:52 +01:00