Strategic Projects Lead

Pareto.AI

United States | $140,000 - $180,000 / year | Other

🧠 About us

At Pareto.AI, we’re on a mission to enable top talent around the world to participate in the development of cutting-edge AI models.

In the coming years, AI models will transform how we work and create thousands of new AI training jobs for skilled talent around the world. We’ve joined forces with top AI and crowd researchers at Anthropic, Character.AI, Imbue, Stanford, and the University of Pennsylvania to build a fair and ethical platform where AI developers collaborate with domain experts to train bespoke AI models.

🚀 About this role

Pareto builds human training data pipelines for frontier AI labs. As a Strategic Projects Lead, you sit at the center of that work — owning the architecture, execution, and continuous improvement of complex data collection and evaluation workflows from first scoping call to final delivery.

This is a technical operations role, not a project management role. You'll be expected to read code, dig deep into data, reason about LLM internals, design evaluation frameworks, and, increasingly, deploy and iterate on AI agents to automate the work your pipelines do today. We're actively building toward a model where agentic systems handle quality gates, expert routing, and output review, and SPLs are the people designing and operating those systems.

You'll work directly with AI researchers and technical program managers at our client organizations, own delivery against model performance benchmarks, and lead a team of project managers who handle day-to-day execution tracking.

🎯 What you'll do

Pipeline architecture: Design end-to-end data collection and evaluation pipelines for RLVR, RLHF, SFT, red-teaming, and model evaluation workflows. This includes expert sampling strategy, annotation schema, rubric structure, inter-rater calibration, and QA system design. You'll prototype workflows quickly, identify risks, make tradeoff decisions, and communicate with engineering teams about agent interactions.
Agentic system deployment: Build, test, and iterate on AI agents that automate pipeline tasks like quality review, expert matching, output flagging, and throughput anomaly detection. Work closely with the engineering team to scope capabilities, write prompts and evaluation logic, and monitor agent performance.
Quality systems: Define data quality standards; conduct audits using reliability metrics, calibration sets, and statistical sampling; and build preventive systems, such as automated checks and structured output validation, that catch quality issues before they reach delivery.
Client interface: Engage directly with AI researchers and technical program managers to translate requirements into operational workflows, communicate pipeline performance, escalate risks, and contribute to project scoping and pricing.
Research integration: Stay current with advancements in LLM post-training, evaluation methodology, and data tooling. Evaluate new approaches and integrate them into active pipelines for improved quality and efficiency.

🎯 What you'll need

Proficiency in Python and SQL for data manipulation, pipeline monitoring, and quality analysis.
Working knowledge of LLM internals, such as RLHF/SFT training loops, prompt structure, and RL environments.
Hands-on experience with agentic or LLM workflow frameworks like LangChain, DSPy, or equivalent.
Demonstrated ownership of a data or ML pipeline from scoping through delivery, including quality design.
Strong written communication for technical guidelines, rubrics, and pipeline performance reports.
Comfort operating amid ambiguity in fast-moving environments with shifting client priorities.

🌟 You'll stand out if you have

Direct experience with RL environment data pipelines, evaluation framework design, and red-teaming workflows.
Background in data engineering, ML research support, or equivalent roles.
Experience designing or operating agentic systems in production or near-production contexts.
Familiarity with inter-rater reliability methods, calibration set design, and annotation quality frameworks.
Prior client-facing or technical program management experience in AI/ML contexts.
Experience handling projects with fuzzy upfront specs or evolving requirements.

🌟 What we value in candidates

We care less about credentials than about demonstrated ability to own complex technical work and build better systems. A background in software engineering, data science, or ML research is common, but excellent SPLs also come from ML operations, computational linguistics, and applied research support. What matters is your ability to analyze datasets, fix issues, write code, and design preventative workflows.