Data query example

This example uses LangChain and Chroma to enable question answering over a set of documents.

Setup

Download the models and start the API:

# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI

cd LocalAI/examples/langchain-chroma

wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j

# configure your .env
# NOTE: ensure that THREADS does not exceed your machine's CPU cores
cp .env.example .env

# start with docker-compose
docker-compose up -d --build

# tail the logs & wait until the build completes
docker logs -f langchain-chroma-api-1
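The THREADS note in the setup commands can also be checked programmatically. The following is a minimal sketch, assuming `.env` contains plain `KEY=VALUE` lines and that the variable is named `THREADS` as in the comment above. The helper names `read_env_value` and `threads_ok` are hypothetical and are not part of LocalAI.

```python
import os
from typing import Optional


def read_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from KEY=VALUE text, or None if absent."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines and commented-out settings.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip().strip('"').strip("'")
    return None


def threads_ok(env_text: str, cpu_count: Optional[int] = None) -> bool:
    """True if THREADS is unset or does not exceed the CPU count."""
    # os.cpu_count() reports logical CPUs and may return None.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    value = read_env_value(env_text, "THREADS")
    if value is None:
        return True
    return int(value) <= cores


# Example on an in-memory string; pass the contents of .env in practice.
print(threads_ok("THREADS=4", cpu_count=8))
```

Note that `os.cpu_count()` counts logical CPUs, which can be higher than the number of physical cores, so a passing check may still be a generous upper bound.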

Python requirements

pip install -r requirements.txt

Create a storage

In this step we create a local vector database from the document set so that we can later ask the LLM questions about it.

Note: LocalAI does not require OPENAI_API_KEY. However, the library may fail if no API key is set, so any arbitrary string can be used.
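The same two settings can be supplied from Python instead of the shell. This is a minimal sketch, assuming the API is reachable at the address used in the exports in this section. `configure_local_api` is a hypothetical helper and is not taken from store.py or query.py.

```python
import os


def configure_local_api(base="http://localhost:8080/v1", key="sk-"):
    """Set OPENAI_API_BASE and OPENAI_API_KEY unless already set."""
    # setdefault keeps any value that was already exported in the shell.
    os.environ.setdefault("OPENAI_API_BASE", base)
    # The key is only a placeholder; any arbitrary string can be used.
    os.environ.setdefault("OPENAI_API_KEY", key)
    return os.environ["OPENAI_API_BASE"], os.environ["OPENAI_API_KEY"]
```

Calling `configure_local_api()` before importing code that reads these variables has the same effect as the two `export` lines, and it leaves existing values untouched.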

export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-

wget https://raw.githubusercontent.com/hwchase17/chat-your-data/master/state_of_the_union.txt
python store.py

After it finishes, a directory "db" will be created with the vector index database.
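For intuition about what a vector index holds and how a question is matched against it, here is a toy, standard-library-only sketch. It is not what store.py, query.py, or Chroma do internally: word counts stand in for learned embeddings, and a Python list stands in for the database. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into chunks of `size` words that overlap by `overlap`."""
    words = text.split()
    if not words:
        return []
    step = size - overlap  # assumes overlap < size
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Pair each chunk with its vector; a real store persists this to disk."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, question, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

In a question-answering setup, the chunks returned by a search like this are typically passed to the LLM as context alongside the question, which is why retrieval quality directly affects the answer.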

Query

We can now query the dataset.

export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-

python query.py
# President Trump recently stated during a press conference regarding tax reform legislation that "we're getting rid of all these loopholes." He also mentioned that he wants to simplify the system further through changes such as increasing the standard deduction amount and making other adjustments aimed at reducing taxpayers' overall burden.    

Keep in mind that results are currently hit or miss!