mirror of https://github.com/mudler/LocalAI.git synced 2024-06-07 19:40:48 +00:00

Data query example

This example makes use of Llama-Index to enable question answering on a set of documents.

It loosely follows the Llama-Index quickstart.

Summary of the steps:

1. Prepare the dataset (and store it in the data directory).
2. Prepare a vector index database to run queries on.
3. Run queries.

Requirements

You will need a set of documents to query. Copy them into the data directory.

Setup

Start the API:

# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI

cd LocalAI/examples/query_data

wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O models/bert
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j

# start with docker-compose
docker-compose up -d --build
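
Before moving on, it can help to confirm that the API is answering. The following is a minimal sketch and not part of the example's scripts: it assumes the API exposes an OpenAI-style model listing at /models under the base URL http://localhost:8080/v1 used later in this README, and the helper names models_url and list_models are hypothetical.

```python
import json
import urllib.error
import urllib.request


def models_url(api_base: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL."""
    return api_base.rstrip("/") + "/models"


def list_models(api_base: str = "http://localhost:8080/v1", timeout: float = 5.0):
    """Return the model ids reported by the API, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(models_url(api_base), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
    # Assumption: OpenAI-style list responses keep their entries under "data".
    return [entry.get("id") for entry in payload.get("data", [])]
```

Calling list_models() should return None while the containers are still starting, and a list of model names once the API responds.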

Create a storage

In this step we create a local vector database from the document set, so that we can later ask the LLM questions about it.

Note: OPENAI_API_KEY is not required. However, the library might fail if no API key is set, so an arbitrary string can be used.

export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-

python store.py

After it finishes, a "storage" directory containing the vector index database is created.
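
To make the idea of a vector index concrete, here is a toy sketch using only the Python standard library. It is not what store.py does: word counts stand in for the embeddings a real model would produce, and every name in it (embed, cosine, build_index, top_match, the sample documents) is hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts instead of a model-generated vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs: dict) -> dict:
    """Map each document name to its vector; this plays the role of the stored index."""
    return {name: embed(text) for name, text in docs.items()}


def top_match(index: dict, question: str) -> str:
    """Return the name of the document whose vector is closest to the question."""
    q = embed(question)
    return max(index, key=lambda name: cosine(index[name], q))


# Made-up sample documents, standing in for the files in the data directory.
docs = {
    "cats.txt": "Cats sleep most of the day and hunt at night.",
    "bikes.txt": "A bicycle chain needs regular cleaning and oil.",
}
index = build_index(docs)
```

With this index, top_match(index, "How often should I oil my bicycle chain?") selects bikes.txt, because that is the only document sharing words with the question. A real index works on the same principle, with the closest documents then passed to the LLM as context.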

Query

We can now query the dataset.

export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-

python query.py
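
The same two variables are exported before each script, so the step can also be driven from Python. This is a minimal sketch under that assumption; localai_env and run_script are hypothetical helper names, and the defaults simply mirror the values exported in this README.

```python
import os
import subprocess
import sys


def localai_env(api_base: str = "http://localhost:8080/v1", api_key: str = "sk-") -> dict:
    """Copy the current environment and add the two variables this README exports."""
    env = dict(os.environ)
    env["OPENAI_API_BASE"] = api_base
    env["OPENAI_API_KEY"] = api_key  # placeholder value; an arbitrary string is enough
    return env


def run_script(script: str, **overrides) -> int:
    """Run one of the example scripts with those settings and return its exit code."""
    result = subprocess.run([sys.executable, script], env=localai_env(**overrides))
    return result.returncode
```

For instance, run_script("query.py") would launch the query script with both variables set, without changing the environment of the calling shell.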

Update

To update the vector database, run update.py:

export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-

python update.py