LocalAI/text-generation.md at ab7b4d5ee9448e533a342bd1771393acd2967191

mirror of https://github.com/mudler/LocalAI.git synced 2024-06-07 19:40:48 +00:00

Ettore Di Giacinto c5c77d2b0d

docs: Initial import from localai-website (#1312)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2023-11-22 18:13:50 +01:00

2.1 KiB

+++
disableToc = false
title = "📖 Text generation (GPT)"
weight = 2
+++

LocalAI supports text generation with GPT-style models through llama.cpp and other backends (such as rwkv.cpp). See also [Model compatibility]({{%relref "model-compatibility" %}}) for an up-to-date list of the supported model families.

Note:

You can also specify the model name as part of the OpenAI token.
If only one model is available, the API uses it for all requests.
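
As a rough illustration of the note above, the following Python sketch builds a chat request that leaves `model` out of the JSON body and puts the model name in the token instead. It assumes the token travels in the usual OpenAI-style `Authorization: Bearer ...` header; that header layout, the `build_request_with_model_token` helper, and the base URL are assumptions made for this example, not something this page confirms. The request is only built, not sent.

```python
import json
import urllib.request

# Assumed base URL, matching the curl examples on this page.
BASE_URL = "http://localhost:8080"


def build_request_with_model_token(model_name, user_message):
    """Build (but do not send) a chat request that names the model in the token.

    Assumption: the "OpenAI token" is the value of the standard
    "Authorization: Bearer <token>" header.
    """
    payload = {
        # No "model" key here: the model name is carried by the token.
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {model_name}",
        },
        method="POST",
    )


token_request = build_request_with_model_token(
    "ggml-koala-7b-model-q4_0-r2.bin", "Say this is a test!"
)
```

Passing `token_request` to `urllib.request.urlopen` would send it to a running instance.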

Chat completions

https://platform.openai.com/docs/api-reference/chat

For example, to generate a chat completion, send a POST request to the /v1/chat/completions endpoint with the model name and the chat messages in the request body:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-koala-7b-model-q4_0-r2.bin",
  "messages": [{"role": "user", "content": "Say this is a test!"}],
  "temperature": 0.7
}'

Available additional parameters: top_p, top_k, max_tokens.
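
The same chat request can be sketched in Python using only the standard library, this time with the additional parameters included. The helper names `build_chat_request` and `send`, and the values chosen for `top_p`, `top_k`, and `max_tokens`, are illustrative assumptions; the endpoint, model name, and message come from the curl example above.

```python
import json
import urllib.request

# Assumed base URL, taken from the curl example above.
BASE_URL = "http://localhost:8080"


def build_chat_request(model, user_message, temperature=0.7, **sampling):
    """Build (but do not send) a POST request for /v1/chat/completions.

    Extra keyword arguments such as top_p, top_k, or max_tokens are
    merged into the JSON body next to "temperature".
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    payload.update(sampling)
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Illustrative sampling values; tune them for your model.
chat_request = build_chat_request(
    "ggml-koala-7b-model-q4_0-r2.bin",
    "Say this is a test!",
    top_p=0.9,
    top_k=40,
    max_tokens=64,
)
```

Calling `send(chat_request)` against a running instance performs the actual request; building and sending are kept separate so the payload can be inspected first.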

Edit completions

https://platform.openai.com/docs/api-reference/edits

To generate an edit completion, send a POST request to the /v1/edits endpoint with the instruction and the input text in the request body:

curl http://localhost:8080/v1/edits -H "Content-Type: application/json" -d '{
  "model": "ggml-koala-7b-model-q4_0-r2.bin",
  "instruction": "rephrase",
  "input": "Black cat jumped out of the window",
  "temperature": 0.7
}'

Available additional parameters: top_p, top_k, max_tokens.
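
A minimal Python sketch of the edit request, again building the request without sending it. The function name `build_edit_request` is hypothetical; the fields `model`, `instruction`, `input`, and `temperature` mirror the curl example above.

```python
import json
import urllib.request

# Assumed base URL, taken from the curl example above.
BASE_URL = "http://localhost:8080"


def build_edit_request(model, instruction, input_text, temperature=0.7):
    """Build (but do not send) a POST request for /v1/edits."""
    payload = {
        "model": model,
        "instruction": instruction,  # what to do with the text
        "input": input_text,         # the text to be edited
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/edits",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


edit_request = build_edit_request(
    "ggml-koala-7b-model-q4_0-r2.bin",
    "rephrase",
    "Black cat jumped out of the window",
)
```

Passing `edit_request` to `urllib.request.urlopen` sends it to a running instance.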

Completions

https://platform.openai.com/docs/api-reference/completions

To generate a completion, send a POST request to the /v1/completions endpoint with the prompt in the request body:

curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-koala-7b-model-q4_0-r2.bin",
  "prompt": "A long time ago in a galaxy far, far away",
  "temperature": 0.7
}'

Available additional parameters: top_p, top_k, max_tokens.
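
The completion request can be sketched the same way in Python. The `first_text` helper assumes the response follows the OpenAI completions layout (a `choices` list whose entries carry a `text` field); that layout and the `sample_response` dictionary are assumptions made for illustration, not output captured from a server.

```python
import json
import urllib.request

# Assumed base URL, taken from the curl example above.
BASE_URL = "http://localhost:8080"


def build_completion_request(model, prompt, temperature=0.7, **sampling):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {"model": model, "prompt": prompt, "temperature": temperature}
    payload.update(sampling)  # e.g. top_p, top_k, max_tokens
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_text(response):
    """Return the text of the first choice, or None if there is none.

    Assumes an OpenAI-style body: {"choices": [{"text": "..."}]}.
    """
    choices = response.get("choices") or []
    return choices[0].get("text") if choices else None


completion_request = build_completion_request(
    "ggml-koala-7b-model-q4_0-r2.bin",
    "A long time ago in a galaxy far, far away",
    max_tokens=32,  # illustrative value
)

# Hypothetical response, used only to exercise first_text offline.
sample_response = {"choices": [{"text": " there lived a test."}]}
generated = first_text(sample_response)
```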

List models

You can list all the models available with:

curl http://localhost:8080/v1/models
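
To use the listing from a script, something like the following Python sketch could work. It assumes the endpoint returns an OpenAI-style listing (an object with a `data` array whose entries have an `id`); the helper names and the `sample_listing` dictionary are hypothetical and only exercise the parsing offline.

```python
import json
import urllib.request

# Assumed URL, taken from the curl example above.
MODELS_URL = "http://localhost:8080/v1/models"


def fetch_models(url=MODELS_URL):
    """GET the model listing and return the decoded JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def model_ids(listing):
    """Extract model names from an OpenAI-style listing.

    Assumes the shape {"data": [{"id": "..."}, ...]}.
    """
    return [entry["id"] for entry in listing.get("data", [])]


# Hypothetical listing, used only to exercise model_ids offline.
sample_listing = {
    "object": "list",
    "data": [{"id": "ggml-koala-7b-model-q4_0-r2.bin", "object": "model"}],
}
ids = model_ids(sample_listing)
```

With a running instance, `model_ids(fetch_models())` gives the names that can be passed as `model` in the requests above.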
