LocalAI/.env

## Set number of threads.
## Note: prefer the number of physical cores. Oversubscribing the CPU noticeably degrades performance.
# THREADS=14

## Specify a different bind address (defaults to ":8080")
# ADDRESS=127.0.0.1:8080

## Default models context size
# CONTEXT_SIZE=512
#
## Define galleries.
## Models available to install will be visible in `/models/available`
# GALLERIES=[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}]

## CORS settings
# CORS=true
# CORS_ALLOW_ORIGINS=*
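## Example (a sketch, assuming a single trusted front-end origin; the URL below is a placeholder, not a real default):
# CORS=true
# CORS_ALLOW_ORIGINS=https://app.example.com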

## Default path for models
#
MODELS_PATH=/models

## Enable debug mode
# DEBUG=true

## Disables COMPEL (Diffusers)
# COMPEL=0

## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true

## Specify a build type. Available: cublas, openblas, clblas.
## cuBLAS: This is a GPU-accelerated version of the complete standard BLAS (Basic Linear Algebra Subprograms) library. It's provided by Nvidia and is part of their CUDA toolkit.
## OpenBLAS: This is an open-source implementation of the BLAS library that aims to provide highly optimized code for various platforms. It includes support for multi-threading and can be compiled to use hardware-specific features for additional performance. OpenBLAS can run on many kinds of hardware, including CPUs from Intel, AMD, and ARM.
## clBLAS: This is an open-source implementation of the BLAS library that uses OpenCL, a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. clBLAS is designed to take advantage of the parallel computing power of GPUs but can also run on any hardware that supports OpenCL. This includes hardware from different vendors like Nvidia, AMD, and Intel.
# BUILD_TYPE=openblas

## Uncomment and set to true to enable rebuilding from source
# REBUILD=true
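## Example (a sketch, assuming an Nvidia GPU with the CUDA toolkit installed; it is assumed here that a non-default build type only takes effect when rebuilding from source):
# BUILD_TYPE=cublas
# REBUILD=true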

## Enable Go build tags. Available: stablediffusion, tts
## stablediffusion: image generation with stablediffusion
## tts: enables text-to-speech with go-piper 
## (requires REBUILD=true)
#
# GO_TAGS=stablediffusion

## Path where to store generated images
# IMAGE_PATH=/tmp

## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT

## List of external gRPC backends (note: on the container image, this variable is already set to use the extra backends available in extra/)
# EXTERNAL_GRPC_BACKENDS=my-backend:127.0.0.1:9000,my-backend2:/usr/bin/backend.py

### Advanced settings ###
### These are not used by LocalAI itself, but by other components in the stack ###
##
### Preload libraries
# LD_PRELOAD=

### Huggingface cache for models
# HUGGINGFACE_HUB_CACHE=/usr/local/huggingface

### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This controls whether a backend can process multiple requests at once.
# PYTHON_GRPC_MAX_WORKERS=1

### Define the number of parallel llama.cpp workers (defaults to 1)
# LLAMACPP_PARALLEL=1

### Enable processing of parallel requests
# PARALLEL_REQUESTS=true
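
### Example (a sketch with illustrative values, assuming the host has enough memory and cores to serve four llama.cpp requests concurrently; both settings are assumed to be needed together):
# PARALLEL_REQUESTS=true
# LLAMACPP_PARALLEL=4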