PromptLayer is your workbench for AI engineering. Version, test, and monitor every prompt and agent with robust evals and tracing. Empower domain experts to iterate alongside engineers in the visual editor.
This tutorial walks you through building an AI application that generates cake recipes. By the end, you'll know how to:
  • Create and run prompts in the visual editor
  • Track versions with diffs and commit messages
  • Switch between LLMs (OpenAI, Anthropic, etc.)
  • Deploy to production with release labels
  • View logs and debug issues
Create an account to follow along.

Creating Your First Prompt

Prompts are the core IP of any AI application. By managing them in PromptLayer instead of hardcoding them, you can edit prompts without deploying code, track every change, and test new versions safely. From the PromptLayer dashboard, click New Prompt.
Creating a new prompt
A prompt starts with two messages. The System message sets the AI’s behavior. The User message is what gets sent each time you run it.
System message: Defines the AI’s persona, tone, and rules. Think of it as the “instruction manual” that stays constant across all runs. Use it for things like:
  • Setting a persona (“You are a helpful assistant…”)
  • Defining output format (“Always respond in JSON…”)
  • Establishing guardrails (“Never discuss competitors…”)
User message: The input that changes each time the prompt runs. This is where you put the actual request and any input variables. In production, your code will dynamically fill in this message.
Some prompts also use Assistant messages to show example responses, which helps the AI understand the expected format.
Learn more about LLM idioms and why chat formats matter.
Replace the default content with:
System
You are a Michelin-star pastry chef. Generate cake recipes with:

**Overview**: One paragraph about the cake
**Ingredients**: Bullet points with metric and US measurements  
**Instructions**: Numbered steps with temperatures and timing
**Variations**: Optional frostings or substitutions
User
Create a recipe for {{cake_type}} that serves {{serving_size}} people.
Notice {{cake_type}} and {{serving_size}}. These are input variables: think of them like blanks in a Mad Libs. When you run the prompt, you'll fill them in with real values.
Input variables in prompt

Running Your Prompt

To test your prompt:
  1. Click Define input variables in the right panel
  2. Set cake_type to “Chocolate Cake” and serving_size to “8”
  3. Click Run
Running a prompt in the playground
Save your prompt by clicking Save Template.
Your application fetches prompts from PromptLayer at runtime using the SDK or REST API. This keeps prompts out of your codebase and lets you update them without redeploying. It also means PMs and domain experts can edit prompts directly in the dashboard without waiting for engineering.
from promptlayer import PromptLayer
client = PromptLayer()  # reads PROMPTLAYER_API_KEY from the environment

# Fetch and run a prompt by name
response = client.run(
    prompt_name="cake-recipe",
    input_variables={"cake_type": "Chocolate", "serving_size": "8"}
)
See Deployment Strategies for caching options.

Versioning Your Prompt

PromptLayer tracks every change you make to a prompt. Each save creates a new version with a record of what changed, when, and by whom.
A few tips for writing effective prompts:
  • Use headers to structure your prompt (**Section**:)
  • Be specific about output format
  • Include examples when possible
PromptLayer supports Jinja2 templates for more advanced variable logic. For structured outputs, see our guide on tool calling with LLMs.
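For instance, with the Jinja2 format selected, an optional variable could gate an extra instruction in the User message (the dietary_restriction variable here is purely illustrative):
User
Create a recipe for {{ cake_type }} that serves {{ serving_size }} people.
{% if dietary_restriction %}Make it {{ dietary_restriction }}.{% endif %}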

Editing Prompts

To edit a prompt, open it in the editor and make your changes. You can edit any message, add new messages, or change model settings. Let’s try it. Add this line to the end of your System message:
Always include a "Baker's Tip" at the end with advice for beginners.
Click Save Template. Before committing the change, PromptLayer shows a diff of exactly what changed: deletions in red, additions in green. Add a commit message like “Added baker’s tip requirement” and save.
Saving with diff view
Your version history appears in the left panel. You can click any previous version to view it.
Version history
Hover over a version and click View Diff to compare any two versions side by side.
Comparing versions with diff
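If your application ever needs a specific historical version rather than the latest, you can pin to it by number from code. A minimal sketch, assuming the SDK's run method accepts a prompt_version argument (check the SDK reference for the exact name):
from promptlayer import PromptLayer
client = PromptLayer()  # reads PROMPTLAYER_API_KEY from the environment

# Pin to version 2 of the prompt instead of the latest version
response = client.run(
    prompt_name="cake-recipe",
    prompt_version=2,
    input_variables={"cake_type": "Chocolate", "serving_size": "8"}
)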

Writing Prompts with AI

Click the magic wand icon to open the AI prompt writer. It can help rewrite or improve your prompts based on your instructions. Try asking it to “add allergy warnings” to the recipe generator.
AI prompt writer

Switching LLMs

PromptLayer is model-agnostic. You can switch between OpenAI, Anthropic, Google, and other providers without changing your prompt. Click the model name at the bottom of the editor to switch.
Switching models
All prompts work across models, including function calling and tool use. You can also connect private models or custom hosts, and build fine-tuned models.
Any model with a base_url can be added as a custom provider, including self-hosted models, Azure OpenAI, or any OpenAI-compatible API.

Deploying to Prod

Once your prompt is ready, PromptLayer can manage which version goes live. Release labels let you control which version is in production, so you can update prompts without touching code. For engineers: PromptLayer offers several deployment strategies, including the Python and TypeScript SDKs, webhook-driven caching, fully managed agents, and self-hosted deployments.

Release Labels

Release labels control which prompt version is live. You can mark versions as “production”, “staging”, or “testing”. Your code fetches the prompt by label, so you can update the live version without changing any code.
Release labels
Learn more about release labels.
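For example, here is a minimal sketch of fetching whichever version currently carries a label, assuming the run method accepts a prompt_release_label argument (see the release labels docs for the exact parameter name):
from promptlayer import PromptLayer
client = PromptLayer()  # reads PROMPTLAYER_API_KEY from the environment

# Run whichever version is currently tagged "staging"
response = client.run(
    prompt_name="cake-recipe",
    prompt_release_label="staging",
    input_variables={"cake_type": "Carrot", "serving_size": "12"}
)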

A/B Testing

PromptLayer supports A/B testing prompts in production. Common use cases include:
  • Testing a new prompt version on 10% of traffic before full rollout
  • Segmenting beta users to receive an experimental prompt based on user metadata
To start an A/B test, assign release labels to different prompt versions and configure traffic splits in the dashboard.
A/B testing prompts
Learn more about A/B testing.

Building Agents

Agents are multi-step AI workflows. Unlike a single prompt, an agent can chain multiple prompts together, use tools, and make decisions based on intermediate results. For example, you could extend the cake recipe generator into an agent that:
  1. Generates the recipe (using our prompt from earlier)
  2. Scales the ingredients for 100 people
  3. Calculates the total cost based on current grocery prices
Create an agent by clicking New Agent. You can build workflows visually using a drag-and-drop editor. Connect prompts, add conditionals, and loop over data. No code required.
Creating an agent
Like prompts, agents are versioned with commit messages and can be retrieved via the SDK.
Agent version history
Learn more about Agents.
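As a sketch of what running an agent from code might look like, assuming the SDK exposes a run_workflow helper and an agent named "recipe-agent" (both are illustrative; check the Agents reference for the exact signature):
from promptlayer import PromptLayer
client = PromptLayer()  # reads PROMPTLAYER_API_KEY from the environment

# Run the agent by name and pass its input variables
result = client.run_workflow(
    workflow_name="recipe-agent",
    input_variables={"cake_type": "Chocolate", "serving_size": "100"}
)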

Evaluations

Evals test how well your prompts perform. Common use cases include:
  • LLM-as-judge: Use AI to score outputs against criteria like tone, accuracy, or formatting
  • Historical backtests: Compare a new prompt version against real production data
  • Model comparisons: Test the same prompt across GPT-4, Claude, Gemini, etc.
  • Regression testing: Automatically run evals when a prompt is updated to catch edge cases
  • Human grading: Collect feedback from domain experts on prompt quality
Evaluation pipeline
You can build evaluation pipelines visually and connect them to prompts for continuous testing. Learn more about Evaluations.

Logs and Analytics

Every prompt run is logged with full request and response details. You can attach metadata like user IDs, session IDs, or feature flags to each request, making it easy to debug issues for specific users. Analytics show cost, latency, and usage patterns across your prompts.
Analytics dashboard
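For example, here is a minimal sketch of tagging a request with metadata at run time, assuming the run method accepts a metadata dictionary (the keys shown are illustrative):
from promptlayer import PromptLayer
client = PromptLayer()  # reads PROMPTLAYER_API_KEY from the environment

# Attach metadata so this request can be filtered in the logs later
response = client.run(
    prompt_name="cake-recipe",
    input_variables={"cake_type": "Red Velvet", "serving_size": "6"},
    metadata={"user_id": "user_123", "session_id": "sess_456"}
)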

Viewing Logs

Click Logs in the sidebar to see all requests. You can filter by prompt, search by content, and debug errors.
Request logs
You can also view logs for a specific prompt by clicking Analytics & Logs in the prompt editor.
Logs filtered by prompt
From the logs table, you can select historical requests and click Backtest to run a new prompt version against those inputs and compare results.

Traces and Spans

For agents, traces show each step of the workflow as spans. You can see timing, inputs, and outputs for every step. Traces are OpenTelemetry (OTEL) compatible, so you can integrate with your existing observability stack.
Trace visualization
Continue to Quickstart Part 2 to learn about evaluations, backtests, and connecting PromptLayer to your code.
You can also watch our Tutorial Videos for guided walkthroughs.
