Doyle Weaver | 34-Year Legal Generalist → AI Alignment & RLHF Expert

“I spent decades aligning resistant human minds with harsh legal reality — that’s exactly what RLHF does to large language models.”

Every day for 34 years I did reinforcement learning from human feedback — except my models were terrified clients facing decades or life in prison.

Expertise I Bring to AI Teams

RLHF & Reward Modeling
Legal Red-Teaming
Prompt Engineering
Safety & Refusal Writing
Constitutional AI (Legal)
High-Precision Annotation
Jailbreak Testing
Reasoning Trace Authoring
Contract/Statute Labeling

“My courtroom was the original adversarial alignment lab.”

Over 1,000 criminal clients, face-to-face, under extreme pressure. I turned non-lawyers into rational legal thinkers in minutes — often while they were handcuffed to a chair. My clients had worse hallucinations than any LLM.

Why This Background Is Extremely Rare

34 years as a legal generalist — criminal, civil, family, probate
Face-to-face work with over 1,000 criminal clients: thousands of hours translating dense law while their freedom hung in the balance
Ran real-time red-teaming against prosecutors, judges, and juries
Saw firsthand the catastrophic consequences of poor reasoning, which is the right mindset for high-stakes RLHF
Licensed by the State Bar of Texas

“I was doing RLHF before it had a name — with human lives on the line.”

Availability — Immediate Remote Start

Open to contract or full-time roles at AI labs, scaling companies, or legal-tech startups:

RLHF / reward modeling (Anthropic, OpenAI, xAI, Cohere, Mistral, Scale, etc.)
Legal red-teaming & constitutional AI alignment
Prompt engineering for legal or general-purpose models
Expert annotation & data quality projects
Training-data strategy consulting

Happy to do paid trials or sample evaluations.

Let’s Talk

“I’m not leaving law — I’m scaling what I did one client at a time to billions of interactions.”

Email me. I respond fast.

weaverlawoffices@gmail.com