Running local AI code assist

Published on May 24, 2024

I'm intrigued by how effective it is to run code assist models locally, and I'm keen to explore the available IDE extensions and AI models. Let's start with VS Code, the Code GPT extension, and models run locally with Ollama.

I have yet to use AI tools effectively for code completion in my development environment, and I want to improve on that. Yes, I've used ChatGPT extensively to guide me, show me example code and troubleshoot errors, and I've found that incredibly useful. I have also experimented with Copilot in VS Code; however, privacy concerns (especially on enterprise projects) have stopped me from enabling it by default, and that has limited my practice. By running code assist AI tooling locally, I am looking to discover effective developer flows and prompts that can help me throughout all my development work.

Locally run AI models may not currently be as powerful as the commercially available ones, but they may be good enough. If they are not good enough now, that may change given the pace of improvement in open models that can run locally on limited hardware. I expect running local agents to become the norm for certain specialised tasks, and I feel that getting up to speed with this approach will be valuable over the coming months and years.

Running models locally with Ollama

We can run Ollama locally for our AI code assist. After installing it (e.g. with brew install ollama), start the Ollama server so that it is ready to respond to our prompts.

ollama serve
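
To confirm the server is up before going further, one option is to ask it which models are available locally. The sketch below assumes Ollama's default address of http://localhost:11434 and its /api/tags endpoint; the helper names parse_model_names and list_local_models are illustrative, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def parse_model_names(payload: str) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    data = json.loads(payload)
    return [model["name"] for model in data.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama server which models have been pulled.

    Raises urllib.error.URLError if the server is not reachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return parse_model_names(response.read().decode("utf-8"))
```

Calling list_local_models() while ollama serve is running should return the names of whichever models have been pulled, which is a quick way to check that a model such as llama3 is present before prompting it.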

Before we configure VS Code to use this, let's interact with the model directly and see how it fares.

ollama run llama3 \
  "you are a code assistant that gives no explanations " \
  "you write scripts that are runnable from the command line " \
  "the scripts allow text to be piped in and then print the output out " \
  "write a python script that converts markdown to HTML"

Which in my case returned:

#!/usr/bin/env python
import sys
from markdown import markdown

def convert_markdown_to_html():
    if sys.stdin.isatty:
        print("Usage: pipe markdown code to this script")
    text =
    html = markdown(text)

if __name__ == "__main__":

Not too bad. The brackets are missing on sys.stdin.isatty, which should have read sys.stdin.isatty(). That is not difficult to spot when we run the script, and it doesn't take away from the value of the boilerplate code suggested by llama3.
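
To see why the missing brackets matter: without them, sys.stdin.isatty refers to the method itself, which Python always treats as true, so the usage message is printed whether or not anything is piped in. The sketch below shows the difference. It is my own illustration rather than the model's output, read_piped_text is a hypothetical helper name, and io.StringIO stands in for piped input.

```python
import io
import sys


def read_piped_text(stream=sys.stdin):
    """Return the piped-in text, or None if the stream is an interactive terminal."""
    if stream.isatty():  # brackets included: this calls the method
        return None
    return stream.read()


# io.StringIO stands in for piped input; it is never a terminal.
piped = io.StringIO("# Hello")

# Without brackets we test the method object itself, which is always truthy.
print(bool(piped.isatty))      # True
print(piped.isatty())          # False
print(read_piped_text(piped))  # # Hello
```

Returning None for the terminal case keeps the decision about what to print, such as a usage message, with the caller.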

This prompt-and-response approach is useful, but I want to get this feedback as I code, so that I stay in the moment and get appropriate suggestions as I go along. We'll cover that setup in the section below.

Code GPT and VS Code

Open up VS Code and install the Code GPT extension.

Code GPT Extension in VS Code

Click through to the extension settings. Select Ollama as the AI provider, tick Enable CodeGPT Copilot, and select Ollama - deepseek-coder:base as the autocomplete provider. I've also set the autocomplete suggestion delay to 300 ms, so that completions show up more quickly whilst I experiment with code completion. For normal development I'd set this delay a little longer, to reduce the CPU load on my local machine.

I did note that not all the models worked for me. For example, using "Ollama - llama:instruct" led to 500 errors from calls to the Ollama generate API. I didn't work out why this was happening and stuck with the deepseek model for now.
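
When an extension reports errors like this, it can help to reproduce the call outside the editor and read the server's error body. The sketch below posts a prompt to Ollama's /api/generate endpoint using only the Python standard library. It assumes the default address http://localhost:11434, the function names are illustrative, and I have not confirmed that this is exactly the request Code GPT sends.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def build_generate_request(
    model: str, prompt: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Return the model's reply, or the server's error text if the call fails."""
    request = build_generate_request(model, prompt)
    try:
        with urllib.request.urlopen(request, timeout=120) as response:
            return json.loads(response.read())["response"]
    except urllib.error.HTTPError as error:
        # Surface the status code and body, which may explain the failure.
        return f"HTTP {error.code}: {error.read().decode('utf-8', 'replace')}"
```

For example, generate("deepseek-coder:base", "def fib(n):") sends a completion-style prompt to the model configured above. If the server is not running at all, urlopen raises urllib.error.URLError rather than returning an HTTP status.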

Now let's continue to try out the extension. Open up the Code GPT Chat window and ask a similar question to before, to get some code suggestions on how to convert markdown to HTML. This time the chat bot suggests an alternative Python module for processing markdown called mistune, alongside the markdown module. These alternatives are useful for surfacing ideas that may not have been front of mind, and since I hadn't come across mistune before, I took the opportunity to read up on it.

Code GPT Extension in VS Code

I chose to stick with the markdown module this time. I created a new file, explicitly started it with import markdown, then typed out the name of the function and let code completion take it from there. Code completion suggested a body for the function, which I chose to accept.

Code GPT Extension in VS Code

As I kept coding, I got further suggestions along the way, which I accepted or ignored as I saw fit.

Code GPT Extension in VS Code


Although these coding examples are not complex in any way, working through them has let me set up my dev environment, ready to try this out on some more real-world examples in the coming weeks.

I am not expecting the quality of this AI model to be as good as the commercial ones, such as those provided by Copilot, Anthropic, and OpenAI. However, I'll use these locally running models to establish a baseline for quality. When I experiment with the commercial ones, I can then get a sense of how much the commercial services are worth from a quality-of-life and speed-of-development perspective. For now, I'm ready to let this practice settle into my day-to-day dev.