The Language 😼#
We’ll call Kork’s programming language the Smirking Cat (😼), Kork, or simply 😼. Honestly, it doesn’t really matter (as long as the LLM doesn’t get confused!).
Set the language name in the prompt to whatever you’d like. Keep it short to avoid consuming too many tokens on things that don’t matter. 🙃
The grammar for the language is defined here.
This language isn’t particularly good and is extremely limited (especially at the moment). It was cobbled together in a few hours, is incorrect in some known ways (e.g., const isn’t actually const), and is most definitely incorrect in unknown ways as well. 🤫
The goal is NOT to allow the LLM to write arbitrary Python code. If you need arbitrary code execution, use a real programming language.
Kork is meant to accommodate scenarios in which one wants to produce small, constrained programs that help achieve a specific task phrased in natural language.
It was designed with the following ideas in mind:
Keep the language minimal to limit what the LLM can code
Encourage the LLM to rely heavily on external function invocation
Keep the syntax similar to existing languages, so the LLM can learn how to code in it using information in the prompt
Discourage the LLM from assuming a standard library is available
Alternative Approaches#
Changing syntax: syntax based on S-expressions looks promising based on a few qualitative experiments.
Implementing a lightweight Python/TypeScript interpreter in Python that supports a minimal subset of features.
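To make the second alternative concrete, here is a minimal sketch of what a restricted-subset interpreter could look like when built on Python’s standard `ast` module. It is not part of Kork; the names `run_subset` and `eval_expr` are hypothetical, and the supported subset (numeric constants, variable names, four arithmetic operators, simple assignments) is an arbitrary choice for illustration.

```python
import ast
import operator

# Whitelisted binary operators; any other operator is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_expr(node: ast.expr, variables: dict) -> object:
    """Evaluate a whitelisted expression node against `variables`."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in variables:
            raise NameError(f"Variable `{node.id}` not found")
        return variables[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = eval_expr(node.left, variables)
        right = eval_expr(node.right, variables)
        return _BIN_OPS[type(node.op)](left, right)
    raise ValueError(f"Unsupported syntax: {type(node).__name__}")


def run_subset(code: str) -> dict:
    """Run simple `name = expression` statements and return the variables."""
    variables: dict = {}
    for stmt in ast.parse(code).body:
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            variables[stmt.targets[0].id] = eval_expr(stmt.value, variables)
        else:
            # Imports, function definitions, loops, etc. are all rejected.
            raise ValueError(f"Unsupported syntax: {type(stmt).__name__}")
    return variables


final_state = run_subset("x = 1\nx = x * 10")
```

Because every node type has to be allowed explicitly, anything outside the whitelist (imports, attribute access, function calls) fails with an error instead of executing.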
The interpreter#
Let’s take a look at the Smirking Cat interpreter.
from kork import run_interpreter
result = run_interpreter("var x = 1; x = x * 10")
result
{'environment': Environment(parent=None, variables={'x': 10}), 'errors': []}
The environment reflects the state of the global symbols after the interpreter finished executing the code.
result["environment"].variables
{'x': 10}
Syntax errors lead to the errors key being populated. The current error message isn’t very informative.
run_interpreter("1 = 2")
{'environment': Environment(parent=None, variables={}),
'errors': [lark.exceptions.UnexpectedToken()]}
Runtime exceptions will also appear in the errors key.
run_interpreter("x + 1")
{'environment': Environment(parent=None, variables={}),
'errors': [kork.exceptions.KorkRunTimeException('Variable `x` not found')]}
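Since both kinds of failure are reported through the errors key rather than raised, calling code may want to check that key before trusting the environment. The helper below is a hypothetical sketch (`variables_or_raise` is not a Kork function). It only assumes the result shape shown in the outputs above, and uses `SimpleNamespace` stand-ins so that it runs without Kork installed.

```python
from types import SimpleNamespace


def variables_or_raise(result: dict) -> dict:
    """Return the final variables, or raise if any errors were reported."""
    errors = result.get("errors", [])
    if errors:
        # Report the first error; the full list remains on `result`.
        raise RuntimeError(f"{len(errors)} error(s), first: {errors[0]!r}")
    return result["environment"].variables


# Stand-in results with the same shape as the outputs shown above.
ok_result = {"environment": SimpleNamespace(variables={"x": 10}), "errors": []}
bad_result = {
    "environment": SimpleNamespace(variables={}),
    "errors": [ValueError("Variable `x` not found")],
}

final_vars = variables_or_raise(ok_result)

try:
    variables_or_raise(bad_result)
    failed = False
except RuntimeError:
    failed = True
```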
Environment#
The interpreter can be run with a pre-populated environment:
from kork import Environment
env = Environment()
env.set_symbol("x", 10)
10
Please note that in the following example, y is created even though no let, const, or var keyword is used.
result = run_interpreter("y = x * 3", environment=env)
result
{'environment': Environment(parent=None, variables={'x': 10, 'y': 30}),
'errors': []}
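The `parent=None` field in the output suggests that environments can be nested. The sketch below shows one common way such a scoped symbol table can work. `ToyEnvironment` is a hypothetical stand-in, not Kork’s `Environment` class, and the lookup rule (check the current scope, then walk up the parents) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToyEnvironment:
    """Hypothetical scoped symbol table (not Kork's Environment)."""

    parent: Optional["ToyEnvironment"] = None
    variables: dict = field(default_factory=dict)

    def set_symbol(self, name: str, value: Any) -> Any:
        # Store in the current scope and echo the value back.
        self.variables[name] = value
        return value

    def get_symbol(self, name: str) -> Any:
        # Look in this scope first, then walk up through parent scopes.
        if name in self.variables:
            return self.variables[name]
        if self.parent is not None:
            return self.parent.get_symbol(name)
        raise KeyError(f"Variable `{name}` not found")


global_env = ToyEnvironment()
global_env.set_symbol("x", 10)

# A child scope can read `x` from its parent but writes only to itself.
local_env = ToyEnvironment(parent=global_env)
local_env.set_symbol("y", local_env.get_symbol("x") * 3)
```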
Foreign Functions#
A user can import existing Python functions as foreign functions.
from kork.foreign_funcs import to_extern_func_def
def foo(s: str) -> str:
    """Foo will reverse a string!"""
    return s[::-1]
Below we’ll convert the Python function into an internal representation of an external function definition.
extern_func_def = to_extern_func_def(foo)
extern_func_def
ExternFunctionDef(name='foo', params=ParamList(params=[Param(name='s', type_='str')]), return_type='str', implementation=<function foo at 0x7f5e68873370>, doc_string='Foo will reverse a string!')
Let’s add this function to the environment:
env.set_symbol("foo", extern_func_def)
ExternFunctionDef(name='foo', params=ParamList(params=[Param(name='s', type_='str')]), return_type='str', implementation=<function foo at 0x7f5e68873370>, doc_string='Foo will reverse a string!')
result = run_interpreter('var z = foo("meow")', environment=env)
result["environment"].variables["z"]
'woem'
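To give a rough idea of how a function definition like the one above can be derived from a plain Python function, here is a sketch using the standard `inspect` module. It is not Kork’s implementation of `to_extern_func_def`. `ToyFuncDef` and `describe_function` are hypothetical names, and the fields simply mirror those visible in the `ExternFunctionDef` output.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyFuncDef:
    """Hypothetical function description (not Kork's ExternFunctionDef)."""

    name: str
    params: List[Tuple[str, str]]  # (parameter name, type name) pairs
    return_type: str
    implementation: Callable
    doc_string: str


def _type_name(annotation: object) -> str:
    """Turn an annotation into a short type name; 'Any' if it is missing."""
    if annotation is inspect.Parameter.empty:
        return "Any"
    if isinstance(annotation, str):
        return annotation
    return getattr(annotation, "__name__", str(annotation))


def describe_function(func: Callable) -> ToyFuncDef:
    """Collect name, typed parameters, return type, and docstring of `func`."""
    signature = inspect.signature(func)
    params = [
        (name, _type_name(param.annotation))
        for name, param in signature.parameters.items()
    ]
    return ToyFuncDef(
        name=func.__name__,
        params=params,
        return_type=_type_name(signature.return_annotation),
        implementation=func,
        doc_string=inspect.getdoc(func) or "",
    )


def foo(s: str) -> str:
    """Foo will reverse a string!"""
    return s[::-1]


func_def = describe_function(foo)
```

A description like this carries everything a prompt needs (name, typed parameters, return type, docstring), and it also keeps a reference to the callable that does the real work.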