I am using a code-completion model for a project (which will be open sourced very soon).
However, Qwen2.5-coder 1.5b tends to repeat what has already been written, or to change it only slightly. (See the video.)
Is this expected? I am passing the prefix and suffix correctly to Ollama, so the model knows where in the file it currently is. I'm also trimming the number of lines it can see, so that the time-to-first-token isn't too long.
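To make the setup concrete, here is a minimal sketch of what such a request might look like. It is not the actual code from the project. It assumes Ollama's `/api/generate` endpoint on the default local port, with the text before the cursor sent as `prompt` and the text after it as `suffix`. The function names (`trim_context`, `build_request`, `complete`), the line limits, and the option values are all hypothetical and only there for illustration.

```python
import json
import urllib.request


def trim_context(text, cursor, max_before=30, max_after=10):
    """Split text at the cursor offset and keep only the nearest lines.

    The partial line the cursor sits on counts as one line on each side.
    The limits are illustrative values, not tuned recommendations.
    """
    before = text[:cursor].split("\n")
    after = text[cursor:].split("\n")
    prefix = "\n".join(before[-max_before:])
    suffix = "\n".join(after[:max_after])
    return prefix, suffix


def build_request(prefix, suffix, model="qwen2.5-coder:1.5b"):
    """Build the JSON payload for a fill-in-the-middle completion."""
    return {
        "model": model,
        "prompt": prefix,   # text before the cursor
        "suffix": suffix,   # text after the cursor
        "stream": False,
        # Assumed values: a short completion and a low temperature.
        "options": {"num_predict": 64, "temperature": 0.2},
    }


def complete(prefix, suffix, url="http://localhost:11434/api/generate"):
    """Send the request to a locally running Ollama server."""
    body = json.dumps(build_request(prefix, suffix)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a setup like this, one thing to check is whether the trimmed prefix still ends exactly at the cursor and the suffix starts exactly after it; if text that has already been typed ends up on the wrong side of the split, the model may reproduce it.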
Can you recommend a code model that is better suited to this?