Note: This position is open exclusively to candidates currently enrolled in the Hiring Our Heroes Fellows Program. Applications from individuals not participating in the program will not be considered.
GenAI Strategic Projects Lead Skillbridge, Public Sector
Scale is at the frontier of the AI industry, improving the world’s leading generative AI and large language models through model evaluations, human-powered supervised fine-tuning datasets, world-class reinforcement learning from human feedback, and more.
Scale AI’s Public Sector team is expanding its Generative AI work, and we’re seeking a GenAI Strategic Projects Lead to run high-impact projects that drive revenue and experimentation. In this role, you’ll work across operations, engineering, and customer engagement, and directly with our clients, to produce world-class test and evaluation data and training data for large language models for our Public Sector customers.
This role offers a rare opportunity to make a meaningful impact at the intersection of AI and national security. You will build human data labeling pipelines from the ground up, create operational processes to manage and optimize an in-house expert data workforce, and develop novel technology-driven approaches (e.g., scripts, prompt engineering, hybrid data) to improve the quality of both our training and evaluation datasets. You will also own the financial and technical viability of your programs by managing project COGS and partnering with Go-to-Market teams to scope customer engagements through taxonomy design and feasibility validation, ensuring deals are scalable, executable, and economically sound. In addition, you will partner directly with our internal machine learning experts and external stakeholders to ensure our data enables the development of mission-critical applications of AI.
Help shape the future of AI by joining a fast-growing team built on exceptional data, tools, and systems.
You Will:
- Develop, build, and maintain the operations infrastructure required to ensure data labeling pipelines are efficient, scalable, and produce high-quality outputs
- Take ownership of day-to-day progress on high-priority data production pipelines, ensuring projects move forward efficiently
- Partner with our Machine Learning and Go-to-Market teams to scope customer engagements through taxonomy design and technical feasibility validation, ensuring deals are scalable and executable
- Drive cross-org collaboration to define and advance human data strategy, influencing technical and non-technical stakeholders to ensure data quality, scalability, and long-term platform leverage
- Own the financial health of programs by managing project-level COGS through workforce planning, tooling decisions, and process optimization
- Partner with subject matter experts to validate the quality of our data and to translate deep domain knowledge into scalable processes and measurable outcomes
- Work closely with customers to understand their requirements and design data taxonomies that optimize model performance
- Utilize analytics and data visualization tools to track progress, identify bottlenecks, and make data-driven decisions to optimize pipeline performance
- Own progressively larger components of our data delivery processes, until you ultimately serve as the full owner of our most visible and high-impact customer pipelines
You have:
- An active Top Secret security clearance
- 2-3 years of experience in product development, data science, or operations
- A history of successful project management and comfort in ambiguity
- Ability to analyze complex operational data, build queries, and identify trends to inform decisions and optimize processes
- Technical aptitude to understand how to produce data for state-of-the-art post-training techniques such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and reinforcement learning with verifiable rewards (RLVR)
Nice to have:
- Experience working in defense tech and/or an AI company
- A technical degree in fields like computer science, data science, or engineering
- A deep understanding of ML operations for generative AI workflows and products