LLM self-assessment
Yesterday I wrote about AI integration: Trello to GitHub. Further to this, what about letting the LLM assess its own ability to take on a task? Imagine prompting like this:
You are an Engineering Manager. On your team you have a senior engineer, a mid-level engineer and a junior engineer. Read the following Trello card and decide who should work on it:
[card contents]
Reply in JSON format with the following fields:
- assignee: one of senior, mid or junior
- reason: the reason for the assignment
The LLM would then reply with something like this:
{
  "assignee": "senior",
  "reason": "The card is a complex feature that requires a senior engineer"
}
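A reply like this is only useful if the workflow can trust its shape, so it is worth validating before acting on it. Here is a minimal sketch in Python of building the prompt and checking the reply. The names `build_prompt`, `parse_assessment` and `Assessment` are my own for illustration, and the call to the LLM itself is left out because it depends on which provider you use.

```python
import json
from dataclasses import dataclass

# The three grades the prompt allows.
ALLOWED_ASSIGNEES = {"senior", "mid", "junior"}

# Prompt text taken from the post; {card_contents} is filled in per card.
PROMPT_TEMPLATE = """You are an Engineering Manager. On your team you have a \
senior engineer, a mid-level engineer and a junior engineer. Read the \
following Trello card and decide who should work on it:

{card_contents}

Reply in JSON format with the following fields:
- assignee: one of senior, mid or junior
- reason: the reason for the assignment
"""


@dataclass(frozen=True)
class Assessment:
    assignee: str
    reason: str


def build_prompt(card_contents: str) -> str:
    """Insert the card text into the prompt (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(card_contents=card_contents)


def parse_assessment(reply: str) -> Assessment:
    """Parse the LLM reply and reject anything outside the expected shape."""
    data = json.loads(reply)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    assignee = data.get("assignee")
    if not isinstance(assignee, str) or assignee not in ALLOWED_ASSIGNEES:
        raise ValueError(f"unexpected assignee: {assignee!r}")
    reason = data.get("reason")
    if not isinstance(reason, str) or not reason.strip():
        raise ValueError("reason must be a non-empty string")
    return Assessment(assignee=assignee, reason=reason)
```

Anything that fails validation raises `ValueError`, which a workflow could treat as "leave the card where it is for a human to look at".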
This gives us three grades to work with:
- If the response is junior, the workflow sends it down the automated PR path where an LLM writes the code and submits the PR.
- If it's mid, the LLM just writes ideas for solutions, submits them in a PR with a description and a link to the Trello card, and puts it in the "ready to go" column for a human to work on.
- senior goes to the "to do" column, for a human to start working on it.
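The three grades map naturally onto a small lookup table. This is a sketch only: the `Route` enum and `route_card` function are hypothetical names, and sending unrecognised grades to the human "to do" path is my own assumption about a safe default, not something the workflow requires.

```python
from enum import Enum


class Route(Enum):
    AUTOMATED_PR = "automated_pr"  # LLM writes the code and submits the PR
    IDEAS_PR = "ideas_pr"          # LLM writes ideas; card moves to "ready to go"
    HUMAN_TODO = "human_todo"      # card moves to "to do" for a human


# One route per grade returned by the LLM.
ROUTES = {
    "junior": Route.AUTOMATED_PR,
    "mid": Route.IDEAS_PR,
    "senior": Route.HUMAN_TODO,
}


def route_card(assignee: str) -> Route:
    """Pick the workflow path for a grade.

    Assumption: anything unrecognised falls back to the human path.
    """
    return ROUTES.get(assignee, Route.HUMAN_TODO)
```

Keeping the mapping in one table also makes it easy to see, and change, where each grade ends up.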
Anywhere in this workflow a human can redirect cards if they end up in the wrong place. You start small, with only one card in flight at any time, then as your confidence in the system grows you can dial up the concurrency.
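One way to get that dial is a worker pool whose size is a single setting. The sketch below assumes cards are handled by some function you supply (`handle_card` here is a placeholder, as are `process_cards` and `MAX_IN_FLIGHT`), and uses Python's standard `ThreadPoolExecutor` to cap how many run at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Start with one card in flight; raise this as confidence grows.
MAX_IN_FLIGHT = 1


def process_cards(cards, handle_card, max_in_flight=MAX_IN_FLIGHT):
    """Run handle_card over cards with at most max_in_flight at a time.

    Results come back in the same order as the input cards.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(handle_card, cards))
```

With `max_in_flight=1` the cards are processed strictly one after another, which matches the cautious starting point.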