name: gpt-3.5-turbo
context_size: 2048
f16: true
gpu_layers: 90
mmap: true
trimsuffix:
- "\n"
parameters:
  model: huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
template:
  chat: &template |-
    Instruct: {{.Input}}
    Output:
  completion: *template
usage: |
  To use this model, query the API from another terminal, for instance with curl:
  curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
      "model": "gpt-3.5-turbo",
      "messages": [{"role": "user", "content": "How are you doing?"}],
      "temperature": 0.1
  }'