Prompt Engineer (Model Behavior & Evaluation)
Yuna
South Africa
Yuna’s mission is to radically transform how mental health support is accessed and delivered. We provide immediate, private, 24/7 support through empathetic conversational AI—closing gaps created by long wait times, high costs, and limited access to care. Every role at Yuna directly shapes the experience users have in their most vulnerable moments.
This is a foundational hire for our engineering team. As our first model behavior engineer, you will be responsible for how our AI conversations behave in the real world. This role goes beyond writing prompts. You will define what “good” looks like, design and run evaluations, diagnose failures across multi-agent systems, and continuously improve the warmth, safety, usefulness, and alignment of Yuna’s conversations at scale.
You will work at the intersection of product, clinical psychology, and engineering, owning conversational behavior across prompts, context, routing, memory, model choice, and evaluation.
What you’ll do
- Own conversational behavior across a multi-agent, multi-model conversational system. Beyond prompting, this will require working with context architecture, agent routing, memory, and model selection
- Collaborate with clinicians and mental health experts to design and operate evaluation frameworks for conversational quality, empathy, usefulness, and alignment
- Diagnose failures by analyzing real conversation traces, agent routing, memory usage, and system context
- Define your own priorities by using data, stakeholder feedback, and conversational logs to identify the highest-impact improvements and act on them
- Build the continuous improvement loop, ensuring each failure leads to a systemic improvement rather than a one-off patch
Who you are
You are a problem solver at heart: a “jack of all trades” who can define an ambiguous problem, form hypotheses, and ship solutions on your own. You don’t wait for permission; you act first and pivot when necessary. You are also deeply comfortable with AI and practiced at directing it to solve specific problems.
Required
- Experience owning model or agent behavior in a live production environment
- Experience designing and operating evaluation systems (qualitative and quantitative)
- Ability to define what “good” means and defend it with evidence
- Comfort working hands-on with tools like LangGraph, LangSmith, or comparable tracing and evaluation frameworks
- High data literacy: able to reason from logs, traces, metrics, and analytics to isolate root causes
- Clear communicator who can collaborate across product, clinical, and engineering teams in ambiguous problem spaces
Nice to have
- Working knowledge of Python: able to read, debug, and collaborate on agent logic and evaluation code
- Familiarity with alignment concepts (e.g. human values, safety tradeoffs, refusal behavior)
- Proficiency with SQL, Amplitude, and Excel
- Background in psychology, linguistics, neuroscience, education, or adjacent fields
Location: Remote (work from anywhere, with a minimum overlap of working hours from 8am–12pm PST)
Employment Type: Full-time
What We Offer
- Competitive salary (based on experience) + equity options
- Remote-first culture with flexibility
- A fast-growing, talented, and empathetic team dedicated to transforming mental health care
- An opportunity to use AI for good
- A high level of ownership and the opportunity to make a measurable impact, building cutting-edge AI systems that improve lives every day