native AI syntax for Python
plum
Call any AI model
right inside your code — one line, zero
setup, no libraries.
Works anywhere Python works. Ships as a single file.
our belief
AI operations should feel as natural as 1 + 1.
That's what plum is built for.
the query operator
One symbol. Infinite reach.
The ?[...] operator
embeds a model call anywhere a value can go.
# the prompt is just a string
answer = ?["What is the capital of France?" | claude]
# inject runtime values with f-string syntax
summary = ?[f"Summarize this article: {article}" | claude]
# use the result directly — it's just a value
if ?[f"Is this email spam? Reply yes/no: {email}" | claude] == "yes":
    quarantine(email)
# works in list comprehensions, map, filter — anywhere an expression fits
tags = [?[f"Classify: {p}" | claude] for p in posts]
# ask for structured output — plum handles parsing
class Sentiment:
    score: float  # -1.0 to 1.0
    label: str
    reason: str

result = ?[f"Analyze sentiment: {review}" | claude -> Sentiment]
print(result.score, result.label)  # 0.87, "positive"
model routing
Pick your model. Swap anytime.
The pipe | routes the
prompt. Config lives in one file.
01
Any model, same syntax
Claude, GPT, Gemini, local models. The
?[...] syntax
never changes.
?["..." | claude]
?["..." | gpt4]
?["..." | llama3]
02
Simple config
Alpha runs on Ollama locally — no keys, no config.
The plum.toml file lands right after
v1, bringing full multi-provider support.
# plum.toml
[models]
claude = "claude-3-5-sonnet"
03
Caching built in
The architecture is in place: identical prompts return cached results. Shipping as soon as it's exactly right.
@cache
def translate(text):
    return ?[f"Translate: {text}" | claude]
before & after
Without plum vs. with plum.
Same result. Less noise.
import anthropic

client = anthropic.Anthropic()
msg = client.messages.create(
    model="claude-3-5-sonnet",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Translate to French: " + text,
    }],
)
result = msg.content[0].text
result = ?[f"Translate to French: {text}" | claude]
real programs
Plum in practice.
Full programs that show what plum-first code feels like.
def handle_ticket(ticket: Ticket) -> Response | None:
    urgency = ?[f"Rate urgency 1-5: {ticket.body}" | claude -> int]
    category = ?[f"Classify as billing/shipping/other: {ticket.body}" | claude]
    if urgency >= 4:
        escalate(ticket)
        return None
    draft = ?[f"Write a warm reply to this {category} issue: {ticket.body}" | claude]
    return Response(draft)
# enrich a product catalog with AI-generated copy
enriched = [
    {
        **product,
        "tagline": ?[f"One-line tagline for: {product['name']}" | claude],
        "description": ?[f"SEO description for: {product['name']}" | claude],
        "tags": ?[f"5 search tags for: {product['name']}" | claude -> list[str]],
    }
    for product in catalog.load()
]
questions & answers
Every hard question, answered.
Plum is a Python superset with native AI syntax
— built into the language, not imported as a
library.
Instead of writing 10+ lines
of API boilerplate every time you want to call a
model, you write one line:
result = ?[f"Summarize this: {text}" | claude]
It's as easy to call as print(). That's the whole idea.
Depends on your definition — and that's a fair
debate.
Technically: Plum has its own
syntax, transpiles to Python (like CoffeeScript
to JavaScript, TypeScript to JavaScript), and
adds semantics Python doesn't have. By that
measure, it qualifies.
Practically:
it's a Python superset with a new operator. If
that makes it a DSL in your book, that's a
reasonable take too.
What it's
definitely not: a library. You don't import
Plum. You write in it.
Two groups:
Developers
who reach for a bash script or quick Python file
— not an SDK. If you want AI in that script
without setting up LangChain, Plum is for
you.
Non-technical people
comfortable with basic Python who hit a wall the
moment an API is involved. Plum removes that
wall.
Note: the examples on this page
use Python features like list comprehensions and
dataclasses. If you're completely new to
programming, Plum still has a learning curve —
just a shorter one.
Doesn't ? conflict with Python syntax?
? does not exist anywhere in
Python. It's not an operator, not a reserved
word, not used in any special syntax. Python has
zero use for it.
Compare to
JavaScript where ? is taken for
ternary operators and optional chaining. Or Rust
where it's the error propagation operator.
Python has none of that.
We own it completely.
No ambiguity, no conflict, ever.
Yes, with one rule:
the entry point must be Plum, not
Python.
Same rule as TypeScript: you don't
run .ts files with Node directly;
you use ts-node. Same here.
Your .plum files can import any
.py file freely. Your entire
existing Python codebase works inside Plum
untouched.
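A minimal sketch of that split (file and function names are illustrative):

```
# analytics.py: plain Python, untouched
def load_reviews():
    return ["Great battery life", "Arrived broken"]

# report.plum: the Plum entry point
from analytics import load_reviews

for review in load_reviews():
    print(?[f"One-line summary: {review}" | claude])
```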
Models are listed in priority order separated by
|. Plum tries the first one. If it
fails — rate limit, downtime, error — it
automatically tries the next:
result = ?["do this" | claude | gpt | llama3]
# tries claude → if fails, tries gpt → if fails, tries llama3
You write this once. Plum handles the retry logic, the error handling, the fallback. Zero extra code from you.
LangChain is powerful — but it's a library with
an API to learn, objects to chain, and
dependencies to install. It's built for complex
AI pipelines.
Plum is built for the
opposite use case: you have a script, you want
one AI call, and you don't want to set anything
up.
They're not competing for the
same moment.
Fair point. Here's the honest answer:
AI
writes the boilerplate once.
That code runs thousands of times, gets read,
maintained, and debugged by people who didn't
write it.
Plum makes that code
shorter and clearer — not just to write, but to
read six months later.
That said: if
AI tooling keeps improving, this advantage may
shrink. We're betting on simplicity having
lasting value. Time will tell.
This is a real question. Here's the reframe that
resolves it:
If your problem has a deterministic answer,
you don't need AI.
You need a formula or a lookup. Use normal
code.
When you reach for
?[...] you've already accepted that
the answer requires judgment. You're solving a
non-deterministic problem by definition.
So
you don't test that output is identical every
run. You test that output is
acceptable every run. That's a
different kind of test — and arguably more
honest about what software should do anyway.
Non-determinism
in Plum isn't a bug. It's the whole reason
you're using it.
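One way to write an "acceptable every run" test is to assert properties of the output instead of exact strings. A sketch in plain Python (which is also valid Plum), with the model call stubbed by a random choice so the test runs without a model; in real Plum code the stub body would be the ?[...] expression:

```python
import random

def classify_spam(email: str) -> str:
    # stand-in for ?[f"Is this email spam? Reply yes/no: {email}" | claude],
    # stubbed with a random verdict so this runs anywhere
    return random.choice(["yes", "no"])

def test_spam_classifier_is_acceptable():
    # don't assert an exact answer; assert every answer is usable
    for _ in range(20):
        verdict = classify_spam("WIN A FREE CRUISE!!!")
        assert verdict in {"yes", "no"}, f"unusable verdict: {verdict!r}"
```

The test passes whether the model says yes or no; it fails only when the output stops being something your code can act on.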
Real risk. Plum's answer is a built-in cost
warning whenever ?[...] is detected
inside a loop.
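For illustration, a comprehension like this would trip it (the warning wording below is made up, not the shipped message):

```
summaries = [?[f"Summarize: {doc}" | claude] for doc in corpus]
# plum warning: ?[...] inside a loop makes one model call per
# iteration; 10,000 items means 10,000 billed requests
```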
Your Plum code doesn't change. The runtime
adapter does.
claude in
Plum is not pinned to a specific API version —
it's a Plum model identifier. When Anthropic
changes their API, we update Plum's Claude
adapter. Your ?[...] expressions
stay identical.
This is the same
reason you don't rewrite your Python app when
Python releases a new version. The abstraction
absorbs the change.
Python is the right choice for v1. Not
necessarily forever.
Why Python now:
Entire AI ecosystem already exists. Fastest path
to working prototype. Largest audience.
Non-technical users already learn Python.
Where Python falls short later:
Slow. GIL limits true parallelism. Not
systems-level.
The decision for v2
gets made with evidence from real users — not
theory. Right now the idea is what matters. Not
the implementation language.
open beta
Shape plum with us.
The language is in open beta. Try it now and help decide what plum becomes.