LocalAI/embedded/models/codellama-7b-gguf.yaml

name: codellama-7b-gguf
backend: llama-cpp
parameters:
  model: huggingface://TheBloke/CodeLlama-7B-GGUF/codellama-7b.Q4_K_M.gguf
  temperature: 0.5
  top_k: 40
  seed: -1
  top_p: 0.95
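# Use mirostat sampling by default (#1820): with the default temperature sampler,
# some models return too few candidates, so outputs repeat regardless of the seed.
# See https://github.com/mudler/LocalAI/issues/1723
mirostat: 2
mirostat_eta: 1.0
mirostat_tau: 1.0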

context_size: 4096
f16: true
gpu_layers: 90
usage: |
      curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
          "model": "codellama-7b-gguf",
          "prompt": "import socket\n\ndef ping_exponential_backoff(host: str):"
      }'