Twinny is a vscode plugin that provides copilot-like functionality with local LLMs, such as code completion, chat and test generation. In contrast to most plugins like this (cody, codegpt etc.), it doesn't seem to be focused on creating its own AI and forcing you to have a remote account just to use your local LLM (yet).

It's still under development, so it's changing a lot.

2024-04-21

openwebui

I set it up to use open-webui as a provider, because I have that on tailscale so it can all work remotely too.

It does have an ollamawebui setting in the provider config (which I think was the old name for openwebui), but I couldn't get that working, so I used openwebui's ollama API instead.

So the twinny provider config, based on the above:


provider: ollama
hostname: openwebui
port: 8080
api key: <API key from Settings > Account > Api keys in openwebui>
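
To sanity-check the connection outside vscode, something like this should list the available models. This is a rough sketch, not part of twinny: it assumes open-webui proxies the ollama API under /ollama (so /ollama/api/tags is the model list), uses the python requests package, and the key is a placeholder.


import requests

OPENWEBUI = "http://openwebui:8080"   # hostname and port from the config above
API_KEY = "sk-..."                    # Settings > Account > Api keys in openwebui

# list models via ollama's tags endpoint, proxied by open-webui (assumed path)
resp = requests.get(
    f"{OPENWEBUI}/ollama/api/tags",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
print([m["name"] for m in resp.json().get("models", [])])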

FIM: codeqwen

The codeqwen code model gets good feedback, so I set that up. I haven't done much testing yet, but I tried copilot for a month, so it will be interesting to see how it compares.

Based on the codeqwen example and the twinny FIM template code, stable-code is the FIM template we want.

So the twinny provider config for FIM, based on the above:


type: FIM
FIM template: stable-code
model: codeqwen:code
API path: /ollama/api/generate
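
For reference, roughly what a FIM request to that API path looks like when sent by hand. This is a sketch, not exactly what twinny sends: the <fim_prefix>/<fim_suffix>/<fim_middle> tokens are my understanding of codeqwen's FIM format, and the key is a placeholder.


import requests

OPENWEBUI = "http://openwebui:8080"
API_KEY = "sk-..."

prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(1, 2))\n"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

resp = requests.post(
    f"{OPENWEBUI}/ollama/api/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # raw=True so ollama passes the FIM tokens through without a chat template
    json={"model": "codeqwen:code", "prompt": prompt, "raw": True,
          "stream": False, "options": {"num_predict": 32}},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])   # expect something like "return a + b"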

Chat: codeqwen

Same setup as FIM, but there's no need to select a template.
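
A quick manual chat test against the same provider, again just a sketch: /ollama/api/chat is assumed from open-webui proxying ollama's chat endpoint, and the key is a placeholder.


import requests

OPENWEBUI = "http://openwebui:8080"
API_KEY = "sk-..."

resp = requests.post(
    f"{OPENWEBUI}/ollama/api/chat",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "codeqwen:code",   # same model as the FIM config above
        "messages": [{"role": "user", "content": "write hello world in python"}],
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])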

2025-02-23

At some point, the twinny openwebui config above broke. I had to press the reset providers button to remove all the providers and reconfigure.

To set twinny up to use my open-webui instance over tailscale, I used the following:

chat config


label: openwebui ollama codeqwen
type: chat
provider: ollama
protocol: http
model name: qwen2.5-coder:7b-instruct
hostname: openwebui
port: 8080
api path: /ollama/v1
api key: <api key from API key section in open webui > profile > account > api keys > api key, starts with sk...>

To test, open a new chat, select openwebui ollama codeqwen, type something in the chat and make sure the response isn't an error.
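
The same test can also be done from outside vscode. Since the api path is /ollama/v1, the request goes via the OpenAI-compatible endpoint. A sketch, assuming /ollama/v1/chat/completions exists (ollama's OpenAI compatibility plus open-webui's proxying), with a placeholder key.


import requests

OPENWEBUI = "http://openwebui:8080"
API_KEY = "sk-..."   # open webui > profile > account > api keys

resp = requests.post(
    f"{OPENWEBUI}/ollama/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "qwen2.5-coder:7b-instruct",
        "messages": [{"role": "user", "content": "say hello in one line of python"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])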

FIM config


type: fim
fim template: codeqwen
provider: ollama
protocol: http
model name: qwen2.5-coder:7b-base
hostname: openwebui
port: 8080
api path: /ollama/api/generate
api key: <api key from API key section in open webui > profile > account > api keys > api key, starts with sk...>
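
Before testing in the editor, a plain (non-FIM) request to the generate path confirms the endpoint, model and key all work. Another sketch with a placeholder key; model and path are the values from the FIM config above.


import requests

OPENWEBUI = "http://openwebui:8080"
API_KEY = "sk-..."

resp = requests.post(
    f"{OPENWEBUI}/ollama/api/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "qwen2.5-coder:7b-base",
          "prompt": "# python hello world\nprint(",
          "stream": False,
          "options": {"num_predict": 16}},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])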

And for a quick little test in the editor itself, create a blank python file and add the following:


# a quick test to print dom's number with hardcoded variables

# print the number 28348
print(f"{name}'s number is {x}")

Then put the cursor after the word "variables" on the first line and press enter.

If FIM is working, it should auto-suggest after a few seconds, sometimes one line at a time:


name = "Dom"
x = 28348

This proves that the LLM is generating suggestions using context before and after the cursor.
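
For the curious, roughly what that FIM request looks like under the hood: the text before the cursor becomes the prefix and the text after it becomes the suffix. A sketch only: the <|fim_prefix|>/<|fim_suffix|>/<|fim_middle|> tokens are my understanding of qwen2.5-coder's FIM format, and twinny's codeqwen template may format things slightly differently.


import requests

OPENWEBUI = "http://openwebui:8080"
API_KEY = "sk-..."

prefix = "# a quick test to print dom's number with hardcoded variables\n"
suffix = "\n# print the number 28348\nprint(f\"{name}'s number is {x}\")\n"
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

resp = requests.post(
    f"{OPENWEBUI}/ollama/api/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # raw=True so the FIM tokens reach the model without a chat template
    json={"model": "qwen2.5-coder:7b-base", "prompt": prompt, "raw": True,
          "stream": False, "options": {"num_predict": 32}},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])   # expect something like: name = "Dom" and x = 28348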