FlexiPrompt: Convenient creation of dynamic prompts in Python

This article will be useful for Python developers working with language models.

Recently, I needed a tool for generating prompts in Python code. I didn't want to use complex solutions, so I created a small library called FlexiPrompt. Here are its main advantages:

  • Easy to integrate into existing code

  • Allows quick and flexible configuration of a dialogue with an LLM

  • Can split one LLM into several agents, with the communication between them configured through templates (see the sketch after the first example below)

What it looks like in code

Here is a simple example of using FlexiPrompt:

from flexi_prompt import FlexiPrompt

fp = FlexiPrompt()
inner_fp = FlexiPrompt({"another_field1": "nested value1, "})
inner_fp.another_field2 = "nested value2"
inner_fp.another_field1().another_field2()  # select the fields that $inner_fp will expand to

fp.final_prompt = "Here is: $inner_fp, $some_field, $some_callback"
fp.inner_fp = inner_fp
fp.some_field = 42
fp.some_callback = input  # Example: enter "user input"

print(fp.final_prompt().build())  
# Output: Here is: nested value1, nested value2, 42, user input
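
One of the bullet points above mentions splitting a single LLM into several agents through templates. Here is a minimal sketch of that idea; the field names (task, text, instruction, summary) and the llm() call are illustrative, not part of the library:

# "Writer" agent: one template over the shared model
writer = FlexiPrompt()
writer.task = "Summarize the text below in two sentences."
writer.text = "FlexiPrompt is a small templating helper for prompts."
writer.prompt = "$task $text"

# "Critic" agent: a second template that consumes the writer's output
critic = FlexiPrompt()
critic.instruction = "Point out anything missing from the summary."
critic.summary = ""  # filled with the writer's answer at runtime
critic.prompt = "$instruction Summary: $summary"

writer_prompt = writer.prompt().build()
# critic.summary = llm(writer_prompt)  # call your model of choice here
critic_prompt = critic.prompt().build()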

Example of use: Improving LLM responses with self-control

Let's look at an interesting example of using FlexiPrompt: a system in which language models evaluate and improve their own responses. Here's how it works:

  1. Receive a request from the user

  2. Generate a response with the first neural network

  3. Ask two different neural networks to evaluate the response and take the average score

  4. Generate a response with the second neural network

  5. Evaluate the response again

  6. If one of the responses reaches the maximum score, save it as the best and end the process

  7. Repeat steps 2-6 up to 5 times, saving the best response

  8. Provide the best response to the user

Implementation

For this example, we will use the OpenAI and Anthropic APIs. Here is the main structure of the code:

import os

from flexi_prompt import FlexiPrompt
from openai import OpenAI
from anthropic import Anthropic

# Setting up API keys and clients (here the keys are loaded from Google Colab secrets)
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
os.environ["ANTHROPIC_API_KEY"] = userdata.get("ANTHROPIC_API_KEY_TEST1")

openai = OpenAI()
anthropic = Anthropic()

def get_openai_answer(question, openai):
    openai_completion = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question},
        ],
    )
    return openai_completion.choices[0].message.content

def get_anthropic_answer(question, anthropic):
    message = anthropic.messages.create(
        max_tokens=4096,
        temperature=0,
        model="claude-3-haiku-20240307",
        system="You are a helpful assistant.",
        messages=[{"role": "user", "content": [{"type": "text", "text": question}]}],
    )
    return message.content[0].text
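
# get_answer_rate is not shown in this snippet; below is a minimal sketch of
# what it might look like, based on step 3 of the algorithm: both models rate
# the answer from 1 to 9 and the two scores are averaged.
def get_answer_rate(rate_prompt, openai, anthropic):
    openai_rate = get_openai_answer(rate_prompt, openai)
    anthropic_rate = get_anthropic_answer(rate_prompt, anthropic)
    # the rating prompt asks for a single number, so parse and average the two scores
    return (int(openai_rate.strip()) + int(anthropic_rate.strip())) / 2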

fp = FlexiPrompt()

# Setting up prompts
fp.question = "Your question here"
fp.rate_prompt = """
Rate the answer to the question from 1 to 9, where 1 is the worst answer.
Be rigorous in your evaluation. Give back only one number as your answer.

Question: 
 $question
Answer: 
 $answer
"""

# Main loop
MAX_ATTEMPTS = 5
THRESHOLD_SCORE = 9
best_rate = 0
best_answer = ""

for attempt in range(MAX_ATTEMPTS):

    fp.answer = get_openai_answer(fp.question().build(), openai)
    answer_rate = get_answer_rate(fp.rate_prompt().build(), openai, anthropic)

    if answer_rate > best_rate:
        best_rate = answer_rate
        best_answer = fp.answer

    fp.answer = get_anthropic_answer(fp.question().build(), anthropic)
    answer_rate = get_answer_rate(fp.rate_prompt().build(), openai, anthropic)

    if answer_rate > best_rate:
        best_rate = answer_rate
        best_answer = fp.answer

    if best_rate >= THRESHOLD_SCORE:
        break

print(best_answer)
print("The answer rate is:", best_rate)

This approach lets you get higher-quality answers from language models by using their own capabilities for self-assessment and improvement. The full example code is available on GitHub.

Comparison with alternatives

I looked at Haystack, LangChain, and several smaller libraries.

Most out-of-the-box solutions include a lot of functionality besides prompting. Under the hood, almost all of them use Jinja.

Jinja itself is a heavier solution and is not tailored to prompts; it is better suited to large-scale projects.
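
For contrast, roughly the same substitution as in the first example, written with Jinja2 (a sketch for comparison, not FlexiPrompt code):

from jinja2 import Template

template = Template("Here is: {{ inner }}, {{ some_field }}, {{ some_callback() }}")
print(template.render(
    inner="nested value1, nested value2",
    some_field=42,
    some_callback=input,  # Jinja2 can call callables passed into the context
))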

FlexiPrompt is tailored to simple projects: there is no need to build classes and abstractions, but you still get flexibility in the output.

Plans

There are obvious things to add: the ability to escape special characters and to add strings safely.

In the future, I want robust and reliable response parsing that takes into account the liberties LLMs take in their responses. I think it should be a conversion from string to object, or the activation of a trigger.
