Scale AI
Learn more about Scale AI, the company behind this role.
Open Roles
ML Systems Engineer, Robotics
Scale's Physical AI business unit is dedicated to solving the data bottleneck across Robotics, Autonomous Vehicles, and Computer Vision. This position will be a key contributor in conducting applied research in Physical AI and developing ML pipelines for processing, training, and fine-tuning on data collected by Scale, with a specific focus on optimizing algorithms and pipelines to run efficiently on GPUs in the cloud. In this role, you will have the opportunity to advance research, shape Scale’s offerings, and expand the frontier of data and model evaluation for Physical AI.
The Role
As an ML Systems Engineer on the Physical AI team, you will design and build platforms for scalable, reliable, and efficient serving of foundation models specifically tailored for physical agents. Our platform powers cutting-edge research and production systems, supporting both internal research discovery and external customer use cases for autonomous vehicles and robotics. The ideal candidate combines strong ML fundamentals with deep expertise in backend system design. You’ll work in a highly collaborative environment, bridging the gap between Physical AI research and production engineering to accelerate innovation across the company.
You Will:
- Build & Scale: Maintain fault-tolerant, high-performance systems for serving robotics-related models and foundation models at scale, ensuring low latency for real-time applications.
- Platform Development: Build an internal platform to empower model capability discovery, enabling faster iteration cycles for research teams working on robotics.
- Collaborate: Work closely with Robotics researchers and Computer Vision engineers to integrate and optimize models for production and research environments.
- Design Excellence: Conduct architecture and design reviews to uphold best practices in system scalability, reliability, and security.
- Observability: Develop monitoring and observability solutions to ensure system health and real-time performance tracking of model inference.
- Lead: Own projects end-to-end, from requirements gathering to implementation, in a fast-paced, cross-functional environment.
Ideally, You’d Have:
- Experience: 4+ years of experience building large-scale, high-performance backend systems, with deep experience in machine learning infrastructure.
- Algorithm Optimization: Deep experience optimizing computer vision and other machine learning algorithms for cloud environments, including GPU-level algorithm optimizations (e.g., CUDA, kernel tuning).
- Programming: Strong skills in one or more systems-level languages (e.g., Python, Go, Rust, C++).
- Systems Fundamentals: Deep understanding of serving and routing fundamentals (e.g., rate limiting, load balancing, compute budgets, concurrency) for data-intensive applications.
- Infrastructure: Experience with containers (Docker), orchestration (Kubernetes), and cloud providers (AWS/GCP).
- IaC: Familiarity with infrastructure as code (e.g., Terraform).
- Mindset: Proven ability to solve complex problems and work independently in fast-moving environments.
Nice to Haves:
- Exposure to Vision-Language-Action (VLA) models.
- Knowledge of high-performance video processing (e.g., FFmpeg, NVDEC/NVENC) or 3D data handling (point clouds).
- Familiarity with robotics middleware (e.g., ROS/ROS2) or AV data formats.
Machine Learning Research Engineer, GenAI Applied ML
About This Role
Lead applied ML engineering on Scale's Applied ML team, powering data infrastructure for leading agentic LLMs (ChatGPT, Gemini, Llama). You will build scalable multi-agent systems to validate agentic reasoning and behaviors, scale human expertise, and drive research into why agents that score well on benchmarks still fail in real-world use, then ship production fixes. This role is ideal for exceptional engineers with deep research rigor and a relentless focus on practical, high-impact systems. You will iterate rapidly with data, leverage AI tools to accelerate development, and collaborate tightly across engineering, product, and research. If you excel at turning frontier agent research into reliable deployed systems, we want to hear from you.
You will:
- Build and deploy multi-agent systems for agentic reasoning validation
- Develop pipelines to detect errors and scale human judgment
- Combine classical ML, LLMs, and multi-agent techniques for reliability
- Lead research into agent failure modes and ship fixes
- Use AI tools to speed prototyping and iteration
- Build data-driven evaluations and deploy rapid improvements
- Integrate systems into Scale's platform
Ideally You’ll Have:
- PhD or MSc in Computer Science, Mathematics, Statistics, or related field
- 3+ years shipping scaled production ML systems
- Demonstrated real-world impact
- Mastery of PyTorch, TensorFlow, JAX, or scikit-learn
- Deep expertise in agentic LLMs and multi-agent systems
- Strong software engineering and microservices (AWS/GCP)
- Rapid, data-driven iteration
- Proficiency using AI tools to accelerate work
- Strong research depth with practical bias
- Excellent cross-functional communication
Nice to Have:
- Experience prototyping agent evaluation/reliability systems
- Human-in-the-loop or annotation pipeline work
- Open-source contributions in agents, evaluation, or alignment
- Publications on agent reliability (NeurIPS, ICML, ICLR)
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, th
Forward Deployed Engineer, GenAI
About Scale AI
At Scale AI, our mission is to accelerate the development of AI applications. For 8 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent Series F round, we’re accelerating the abundance of frontier data to pave the road to Artificial General Intelligence (AGI) and building upon our prior model evaluation work with enterprise customers and governments to deepen our capabilities and offerings for public and private evaluations.
About Data Engine
Our Generative AI Data Engine powers the world’s most advanced LLMs and generative models through world-class RLHF (Reinforcement Learning from Human Feedback), human data generation, model evaluation, safety, and alignment. The data we produce is some of the most critical work for how humanity will interact with AI.
About Our FDE Team
Generating high-quality data is the core problem our business solves. We aim to make producing and delivering high-quality data seamless and efficient for operators and customers. Our team is building customer- and operator-specific infrastructure to provide high-quality data with low turnaround time. You'll be exposed to the cutting edge of the Generative AI industry while directly interfacing with the leading model-building organizations in the space, including the top AI research labs and government agencies. Join us in shaping the future of Artificial General Intelligence.
As a Forward Deployed Engineer, you'll be at the forefront of providing the critical data infrastructure that powers the most advanced AI models, directly influencing how humanity interacts with AI. You will work with the world’s leading AI companies and government agencies to solve their most complex AI data-related problems.
Responsibilities:
- Drive Impact: Directly contribute to the advancement of AI by delivering critical data solutions for leading AI innovators and government agencies.
- Customer Collaboration: Interact daily with our technical customers, understanding their unique challenges and translating them into impactful solutions.
- End-to-End Development: Design, build, and deploy features across the entire stack, from front-end interfaces to back-end systems and infrastructure.
- Rapid Experimentation: Deliver high-quality experiments and iterate quickly to meet customer needs and drive innovation.
- Strategic Influence: Play a key role in shaping our engineering culture, values, and processes, contributing to the growth of our team and the evolution of our product.
- Diverse Projects: Engage in a dynamic mix of designing and deploying cutting-edge data solutions, collaborating with leading AI researchers, and directly influencing the product roadmap. You'll work on everything from large-scale system architecture to customer-facing front-end application design.
- Leadership Growth: This role offers a unique opportunity to lead critical projects, shape our engineering culture, and accelerate your career growth in the rapidly evolving field of Generative AI. You'll be positioned to become a future leader in a company defining the next era of technology.
Requirements:
- At least 2 years of relevant experience is preferred.
- Proven track record of shipping high-quality products and features at scale.
- Strong problem-solving skills and the ability to work independently or as part of a collabo
Research Scientist, Agent Robustness
Scale Labs, Research Scientist, Agent Robustness
As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Building on this expertise, Scale Labs has launched a new team focused on policy research that bridges the gap between AI research and global policymakers, so that policymakers can make informed, scientific decisions about AI risks and capabilities. Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. This team collaborates broadly across industry, the public sector, and academia and regularly publishes its findings. We are actively seeking talented researchers to join us in shaping this vision.
As a Research Scientist working on Agent Robustness, you will work on the fundamental challenges of building AI agents that are safe and aligned with humans. For example, you might:
- Research the science of AI agent capabilities with a focus on how they relate to safety, risk factors, and methodologies for benchmarking them;
- Design and build harnesses to test AI agents’ tendency to take harmful actions when pressured to do so by users or tricked into doing so by elements of their environment;
- Design and build exploits and mitigations for new and unique failure modes that arise as AI agents gain affordances like coding, web browsing, and computer use;
- Characterize and design mitigations for potential failure modes or broader risks of systems involving multiple interacting AI agents.
Ideally you’d have:
- Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
- Practical experience conducting technical research collaboratively. You should be comfortable building and leveraging agent scaffolding, designing evaluation harnesses, and quickly turning new ideas from the research literature into working prototypes.
- Experience with post-training and RL techniques such as RLHF, DPO, GRPO, and similar approaches.
- A track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
- Strong written and verbal communication skills to operate in a cross-functional team.
Nice to have:
- Hands-on experience with agent evaluation frameworks such as SWE-bench, WebArena, OSWorld, Inspect, or similar tools.
- Experience with red-teaming, prompt injection, or adversarial testing of AI systems.
Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement.
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and m
Frontier Agent Engineering Manager, Enterprise
About Scale AI
Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities.
Role Overview
As a Forward Deployed AI Engineering Manager on our Enterprise team, you'll be the technical bridge between Scale AI's cutting-edge AI capabilities and our most strategic customers. You'll work with enterprise clients to understand their unique challenges, lead a team that architects specific AI solutions, and ensure successful deployment and adoption of AI systems in production environments. This is a Management role that combines deep engineering and AI expertise, leading a team, and working on customer-facing problems. You'll work directly with customer engineering teams to integrate AI into their critical workflows.
Key Responsibilities
Customer Integration & Deployment
- Partner directly with enterprise customers to understand their technical infrastructure, data pipelines, and business requirements
- Design and implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs)
- Build robust data connectors and ETL pipelines to ingest, process, and prepare customer data for AI workflows
- Deploy and configure AI models and agents within customer security and compliance boundaries
AI Agent Development
- Develop production-grade AI agents tailored to customer use cases across domains like customer support, data analysis, content generation, and workflow automation
- Architect multi-agent systems that orchestrate between different models, tools, and data sources
- Implement evaluation frameworks to measure agent performance and iterate toward business objectives
- Design human-in-the-loop workflows and feedback mechanisms for continuous agent improvement
Prompt Engineering & Optimization
- Create sophisticated prompt engineering strategies optimized for customer-specific domains and data
- Build and maintain prompt libraries, templates, and best practices for customer use cases
- Conduct systematic prompt experimentation and A/B testing to improve model outputs
- Implement RAG (Retrieval Augmented Generation) systems and fine-tuning pipelines where appropriate
Leadership & Collaboration
- Serve as the Engineering Manager and technical point of contact for strategic enterprise accounts
- Lead a team that is collaborating with customer data scientists, ML engineers, and software developers to ensure smooth integration
- Work closely with Scale's product and engineering teams to translate customer needs into product improvements
- Document technical architectures, integration patterns, and best practices
Problem Solving & Innovation
- Debug complex technical issues across the entire stack, from data pipelines to model outputs
- Rapidly prototype solutions to unblock customers and prove out new use cases
- Stay curr
Research Scientist, Frontier Risk Evaluations
Scale Labs, Research Scientist, Frontier Risk Evaluations
As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Building on this expertise, Scale Labs has launched a new team focused on policy research that bridges the gap between AI research and global policymakers, so that policymakers can make informed, scientific decisions about AI risks and capabilities. Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. This team collaborates broadly across industry, the public sector, and academia and regularly publishes its findings. We are actively seeking talented researchers to join us in shaping this vision.
As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses, and datasets for measuring the risks posed by frontier AI systems. For example, you might do any or all of the following:
- Design and build harnesses to test AI models and systems (including agents) for dangerous capabilities such as security vulnerability exploitation, CBRN uplift, and other high-risk activities;
- Work with government agencies or other labs to collectively scope and design evaluations to measure and mitigate risks posed by advanced AI systems;
- Publish evaluation methodologies and write technical reports for policymakers.
Ideally you’d have:
- Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
- Practical experience conducting technical research collaboratively. You should be comfortable building and instrumenting ML pipelines, writing evaluation harnesses, and quickly turning new ideas from the research literature into working prototypes.
- A track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
- Strong written and verbal communication skills to operate in a cross-functional team.
Nice to have:
- Experience in crafting evaluations and benchmarks, or a background in data science roles related to LLM technologies.
- Experience with red-teaming or adversarial testing of AI systems.
- Familiarity with AI safety policy frameworks (e.g., NIST AI RMF, EU AI Act, Korea AI Basic Act).
Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement.
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including jo
Solutions Engineer, Enterprise
Scale plays a vital role in the development of AI applications. Our customer base is growing exponentially, and you will be on the front lines, ensuring that the world's most innovative companies become passionate, lifelong Scale customers.
Solutions Engineers partner closely with AEs, Product, and MLEs to lead prospective customers through pre-sales, delivering customized demos and pilots to secure the “technical win”. Solutions Engineers scope customer technical requirements and develop an actionable SOW. They will work closely with the delivery team to help with initial implementation.
Solutions Engineers are relentlessly curious about customer needs and pain points. They employ their expert Scale product knowledge and GenAI knowledge to design solutions that best address these needs. Solutions Engineers are strong relationship builders and great project managers who provide technical expertise.
You will:
- Partner with Scale AEs on the customer journey, delivering tailored demos and prototypes according to the customer's requirements.
- Develop technical domain expertise in Generative AI / large language model applications for Enterprise use cases, including customers in financial services, insurance, SaaS, and similar enterprises.
- Be accountable for securing the “technical win” by unblocking technical challenges.
- Interact with customers daily to understand their needs and design solutions to better serve them.
- Design and develop “Scopes of Work” by breaking down customer challenges into a project plan.
- Work closely with forward-deployed Software and Machine Learning Engineers to develop agents in the initial post-sales stage.
- Work with AEs and PMs to identify customer-specific feature requests.
- Drive strategic initiatives to improve the efficiency and effectiveness of the Solution Engineering team.
Ideally, you'd have:
- Strong engineering background with prior experience working with clients in a pre- or post-sales capacity to realize business goals.
- Prior experience developing with Python, Java and/or other web development languages.
- Experience working in enterprise SaaS, cloud tech, finance, fintech or similar industries in a technical capacity with end-customer engagement.
- A track record as a self-starter, motivated to independently unblock technical issues in the field with the customer, away from the mothership.
- Presentation skills with a high degree of technical credibility when speaking with executives and front-line engineers.
- High level of comfort communicating effectively across internal and external organizations.
- Intellectual curiosity, empathy, and ability to operate with high velocity.
Nice to haves:
- GenAI Experience
- Forward deployed engineering experience
- Machine Learning Experience
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equ
Infrastructure Software Engineer, Enterprise GenAI
Scale GP (Scale Generative AI Platform) is an enterprise-grade AI platform that provides APIs for knowledge retrieval, inference, evaluation, and more. We are looking for a strong engineer to join our team and help us build and scale our core infrastructure in a fast-paced environment. The ideal candidate will have a strong understanding of software engineering principles and practices, as well as experience with large-scale distributed systems. You will implement solutions across multiple cloud providers (GCP, Azure, AWS) for customers in diverse, highly regulated industries like healthcare, telecom, finance, and retail.
What You’ll Do:
- Architect multi-cloud systems and abstractions to allow Scale GP to run on top of existing cloud providers
- Implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs)
- Collaborate with platform and product teams, and directly with our customers, to develop and implement innovative infrastructure that scales to meet evolving needs
- Deliver experiments at a high velocity and level of quality to engage our customers
- Work across the entire product lifecycle from conceptualization through production
- Be able, and willing, to multi-task and learn new technologies quickly
What We’re Looking For:
- 4+ years of full-time engineering experience, post-graduation
- Experience scaling products at hyper growth startups
- Experience tinkering with or productizing LLMs, vector databases, and the other latest AI technologies
- Proficient in Python or JavaScript/TypeScript, and SQL
- Experience with Kubernetes
- Experience with major cloud providers (AWS, Azure, GCP)
- Excellent communication skills with the ability to explain technical concepts to both technical and non-technical audiences
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located.
For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $216,000 - $270,000 USD
PLEASE NOTE:
Research Scientist, AI Controls and Monitoring
Scale Labs, Research Scientist, AI Controls and Monitoring
As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of AI models and systems and in safeguarding them. Building on this expertise, Scale Labs has launched a new team focused on policy research that bridges the gap between AI research and global policymakers, so that policymakers can make informed, scientific decisions about AI risks and capabilities. Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. This team collaborates broadly across industry, the public sector, and academia and regularly publishes its findings. We are actively seeking talented researchers to join us in shaping this vision.
As a Research Scientist focused on AI Controls and Monitoring, you will design methods, systems, and experiments to ensure that advanced AI models and agents remain aligned with intended goals, even in high-stakes or adversarial environments. For example, you might:
- Develop monitoring techniques and observability methods that track AI behavior in real time to identify and flag deviations, emergent capabilities, or anomalous outputs;
- Research mechanisms for layered control, including fail-safes, oversight protocols, and intervention methods that can halt or redirect AI systems when risks are detected;
- Design red-team simulations to probe weaknesses in oversight and control mechanisms, and build mitigations to close identified gaps;
- Collaborate with policymakers, engineers, and other researchers to establish standards and benchmarks for AI monitoring and escalation.
Ideally you’d have:
- Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance.
- Practical experience conducting technical research collaboratively. You should be comfortable designing control and monitoring experiments for AI systems, building prototype systems, and quickly turning new ideas from the research literature into working prototypes.
- A track record of published research in machine learning, particularly in generative AI.
- At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development.
- Strong written and verbal communication skills to operate in a cross-functional team.
Nice to have:
- Experience with runtime monitoring, anomaly detection, or observability for ML systems.
- Familiarity with AI control or alignment research (e.g., scalable oversight, interpretability, debate).
- Experience with post-training and RL techniques such as RLHF, DPO, GRPO, and similar approaches.
Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement.
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new h
Senior/Staff Machine Learning Engineer, General Agents, Enterprise GenAI
Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities. About the General Agents Team The General Agents team, part of Scale’s Enterprise organization, builds robust general agents for customer use cases and applications. The team sits at the intersection of frontier agent development and real-world deployment, translating state-of-the-art reasoning and agentic capabilities into reliable, production-grade systems that drive real economic value. Our agents are scalable systems built around recurring enterprise problem domains, with a strong emphasis on generalization, extensibility, and deployment across many customers. About the Role As a Senior/Staff Machine Learning Engineer (MLE) on the General Agents team, you’ll play a critical role in designing, building, and deploying production-ready AI agents that solve high-impact enterprise problems. You will work across the full agent lifecycle—from model and system design to evaluation, deployment, and iteration—bridging cutting-edge agentic techniques with the constraints and requirements of real customer environments. You will: - Design and implement end-to-end agent systems that combine LLM reasoning, tool use, memory, and control logic to solve recurring enterprise use cases. - Build scalable, reliable agent architectures that can be deployed across many customers with varying data, tools, and constraints. - Develop evaluation frameworks, datasets, environments, and metrics to measure agent performance, reliability, and business impact in production settings. - Collaborate closely with product managers, customers, data annotators, and other engineering teams to translate enterprise requirements into robust agent designs. 
- Productionize frontier agent techniques (e.g., planning, multi-step reasoning and tool-use, multi-agent patterns) into maintainable, observable systems. - Own deployment, monitoring, and iteration of agent systems, including failure analysis and continuous improvement based on real-world usage. - Contribute to technical direction and architectural decisions for general agent development best practices and methods, with increasing scope and leadership at the Staff level. Ideally you’d have: - 5+ years of experience building and deploying machine learning or AI systems for real-world, production use cases. - Strong engineering fundamentals, supported by a Bachelor’s and/or Master’s degree in Computer Science, Machine Learning, AI, or equivalent practical experience. - Deep understanding of modern LLMs, prompt-, context-, and system-level optimization, and agentic system design. - Proven proficiency in Python, including writing production-quality, testable, and maintainable code. - Experience building systems that integrate models with external tools, APIs, databases, and services. - Ability to operate in ambiguous problem spaces, balancing research-driven approaches with pragmatic product constraints. - Strong communication skills and comfort working in customer-facing or cross-functional environments. Nice-to-haves: - Hands-on experience building AI agents using modern generative AI stacks (OpenAI APIs, commercial or open-source LLMs).
Senior Software Engineer, Full-Stack – Scale GP
Scale GP (Scale Generative AI Platform) is an enterprise-grade Generative AI platform providing APIs for knowledge retrieval, inference, evaluation, and more. We are seeking a strong Senior Full-Stack Engineer to help us build, scale, and refine our rapidly growing product. The ideal candidate is deeply grounded in software engineering best practices and experienced in developing and scaling modern web applications end-to-end. You will work across the stack—from React/TypeScript frontends to Python-based backends—while integrating with LLMs and machine learning systems. You will solve complex challenges in scalability, reliability, and product experience while owning significant product areas in a fast-paced environment. What You’ll Do - Own major full-stack product areas, driving features from design through production deployment. - Build modern frontend experiences using React and TypeScript, ensuring performance, usability, and responsiveness. - Develop reliable backend services in Python, working with distributed systems, data pipelines, and ML/LLM components. - Integrate with LLMs, vector databases, and AI infrastructure to power intelligent product experiences. - Deliver experiments and new features quickly, maintaining high quality and tight feedback loops with customers. - Collaborate across product, ML, and infrastructure teams to shape the direction of Scale GP. - Adapt quickly—learning new technologies, frameworks, and tools as needed across the stack. Ideal Experience - 5+ years of full-time engineering experience, post-graduation. - Strong experience developing full-stack applications using React, TypeScript, and Python. - Experience scaling or shipping products at high-growth startups. - Familiarity with LLMs, vector databases, embeddings, or other modern AI tooling (tinkering or production experience welcome). - Proficiency with SQL and modern API development. - Experience with Kubernetes, containerization, and microservice architectures.
- Experience working with at least one major cloud provider (AWS, GCP, or Azure). Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range f
Product Manager of AI Applications, Global Public Sector
Scale is growing rapidly, and joining the Global Public Sector team is an opportunity to work on one of the most exciting and quickly expanding teams at Scale. This team is responsible for generating, executing, and fostering Scale’s work with governments and government-backed entities outside of the United States. We develop bespoke solutions that leverage our customers’ proprietary data and expertise to transform their organizations with AI. We work with them to understand their pain points and workflows and then forward deploy our team to build cutting-edge solutions. The applications we build are powered by the Scale GenAI Platform, a full stack product to build, test and deploy frontier AI agents. - Developing custom AI applications - Building custom LLMs - Providing high-quality training data for research and government institutions building LLMs - Developing partnerships to foster regional talent growth and AI adoption We are looking for an entrepreneurial and experienced product leader to play a pivotal role in the ideation and development of transformative AI solutions. The ideal candidate has deep experience with AI/ML application development, can think strategically about how to solve a problem, is an excellent listener, is comfortable getting into the weeds operationally, and has a strong understanding of software engineering principles and practices. You will be responsible for owning large AI projects for one or many customers. You will lead a cross-functional team of engineers, MLEs, and operators to build a highly impactful solution for our customers that will drive millions in revenue for our business as well. 
Responsibilities: - Lead design workshops with the client to define custom AI solutions - Scope out new AI application use cases across various government entities - Lead cross-functional development of AI applications and custom LLMs with diverse stakeholders (Engineering + Ops + Go-to-Market) - Consistently engage with future end-users to solicit feedback and ensure we are prioritizing effectively - Stay up to date with the latest research in applied AI and training custom LLMs - Scope out model evaluation sets and performance requirements, consistently review results, and iterate on the solution - Give regular progress updates to the client and Global Public Sector leadership Minimum Qualifications: - 4+ years of experience building products, with specific experience within the last 1-2 years building AI-powered products - Strong technical background (STEM degree) and/or experience building technical software products - Strong understanding of generative AI technologies and their applications in both enterprise and consumer settings - Experience with vibe coding tools (e.g., Replit, Lovable, Bolt) and design tools (e.g., Figma, Canva, Miro) - Exceptional leadership, presentation and communication skills with the ability to influence cross-functional teams Nice to haves: - Coding experience (Python) - Proficiency in Arabic, both written and spoken PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all ap
GenAI Strategic Projects Lead, Public Sector
Scale is at the frontier of the AI industry, improving the world’s leading generative AI and large language models through model evaluations, human-powered supervised fine-tuning datasets, world-class reinforcement learning with human feedback, and more. Scale AI’s Public Sector team is growing in the Generative AI space, and we’re seeking a Strategic Projects Lead to own high-impact projects that drive revenue and experimentation. In this role, you’ll work across operations, engineering, and customer engagement to produce world-class training and test and evaluation data for Large Language Models for our Public Sector customers. This role offers a rare opportunity to make a meaningful impact at the intersection of AI and national security. You will help build Generative AI data-labeling pipelines from the ground up, create operational processes to manage and optimize an in-house expert data workforce, and develop novel technology-driven approaches (e.g., scripts, prompt engineering, hybrid data) to improve the quality of our training and evaluation datasets. In addition, you will partner directly with our internal machine learning experts and external stakeholders to ensure our data enables the development of mission-critical applications of AI. You will: - Develop, build, and maintain the infrastructure required to ensure data pipelines are efficient, scalable, and produce high-quality outputs - Take ownership of day-to-day progress on high-priority data production pipelines, ensuring projects move forward efficiently - Partner with subject matter experts in their fields to validate the quality of our data and to translate deep domain knowledge into scalable processes and measurable outcomes - Work closely with customers to understand their requirements and design data taxonomies that optimize model performance.
- Utilize analytics and data visualization tools to track progress, identify bottlenecks, and make data-driven decisions to optimize pipeline performance - Drive cross-org collaboration to define and advance human data strategy, influencing technical and non-technical stakeholders to ensure data quality, scalability, and long-term platform leverage - Own progressively larger components of our data delivery processes, until you ultimately serve as the full owner of our most visible and high-impact customer pipelines You have: - 5+ years of experience in product development, data science, or operations - A history of successful project management and comfort in ambiguity - Ability to analyze complex operational data, build queries, and identify trends to inform decisions and optimize processes - Technical aptitude to understand how to produce data for state-of-the-art post-training techniques such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and Reinforcement Learning with Verifiable Rewards (RLVR) Nice to have: - Experience working in defense tech and/or an AI company - A technical degree in fields like computer science, data science, or engineering - A deep understanding of ML operations for generative AI workflows / products - An active Top Secret security clearance
Machine Learning Research Scientist, Reasoning
About Scale At Scale AI, our mission is to accelerate the development of AI applications. For 8 years, Scale has been the leading AI data foundry, fueling the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent Series F round, we’re amplifying access to high-quality data to drive progress toward Artificial General Intelligence (AGI). Building on our history of model evaluation with enterprise and government customers, we are expanding our capabilities to set new standards for both public and private evaluations. About This Role This role operates at the forefront of AI research and real-world implementation, with a strong focus on reasoning within large language models (LLMs). The ideal candidate will study the data types critical for advancing LLM-based agents, including browser and software engineering (SWE) agents. You will play a key role in shaping Scale’s data strategy by identifying the most effective data sources and methodologies for improving LLM reasoning. Success in this role requires a deep understanding of LLMs, planning algorithms, and novel approaches to agentic reasoning, as well as creativity in tackling challenges related to data generation, model interaction, and evaluation. You will contribute to impactful research on language model reasoning, collaborate with external researchers, and work closely with engineering teams to bring state-of-the-art advancements into scalable, real-world solutions. Ideally, you’d have: - Practical experience working with LLMs, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be skilled at rapidly interpreting research literature and turning new ideas into working prototypes. - A track record of published research in top ML and NLP venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, CoLLM, etc.).
- At least three years of experience solving complex ML challenges, either in a research setting or product development, particularly in areas related to LLM capabilities and reasoning. - Strong written and verbal communication skills, along with the ability to work effectively across teams. Nice to have: - Hands-on experience fine-tuning open-source LLMs or leading bespoke LLM fine-tuning projects using PyTorch/JAX. - Research and practical experience in building applications and evaluations related to LLM-based agents, including tool-use, text-to-SQL, browser agents, coding agents, and GUI agents. - Experience with agent frameworks such as OpenHands, Swarm, LangGraph, or similar. - Familiarity with advanced agentic reasoning techniques such as STaR and PLANSEARCH. - Proficiency in cloud-based ML development, with experience in AWS or GCP environments. Our research interviews are designed to assess candidates' ability to prototype and debug ML models, their depth of understanding in research concepts, and their alignment with our organizational culture. We do not conduct LeetCode-style problem-solving assessments. Compensation packages at Scale for eligible roles include base salary, equity, and benef
AI Strategy Consultant, Frontier Tech
As a member of our Frontier Tech Consultant team, you will play a critical role in advancing cutting-edge AI innovations by conducting high-impact experiments and ensuring seamless execution at the highest quality standards. Your work will directly contribute to Scale AI’s growth, shaping the future of artificial intelligence. In this role, you will be working on various types of projects, including but not limited to: research experiments, dataset generation, data quality improvements, and in-depth technical analysis. You will tackle complex technical and operational challenges while collaborating closely with Scale’s ML research scientists and SPM team. The ideal candidate is analytical, detail-oriented, and results-driven, with strong problem-solving abilities and excellent communication skills. We are looking for someone who thrives in a fast-paced environment, is proactive in overcoming challenges, and is committed to delivering exceptional outcomes. If you are eager to contribute to the forefront of AI innovation, we encourage you to apply. You will be responsible for: - Designing and executing research experiments - Building and evaluating frontier LLM datasets - Developing training and testing material for frontier pipelines - Improving the quality of existing and new products Ideally you’d have: - Strong machine learning knowledge, either by being in the final years of an ML PhD program or having already graduated - Strong writing and verbal communication skills - An action-oriented mindset that balances creative problem solving with the scrappiness to ultimately deliver results - Analytical, planning, and process improvement capability - Experience working in a fast-paced, entrepreneurial environment - Technical skills including familiarity with Python, GPU, AWS, API, LLM, ML, and SQL Pay: $60-80/hr Commitment: This is a fully remote, US-based part-time (10-20 hours per week), on-going contract position staffed via HireArt.
HireArt values diversity and is an Equal Opportunity Employer. We are interested in every qualified candidate who is eligible to work in the United States. Unfortunately, we are not able to sponsor visas, including CPT/OPT, or employ corp-to-corp. PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants. About Us: At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
Machine Learning Systems Research Engineer, Agent Post-training - Enterprise GenAI
AI is becoming vitally important in every function of our society. At Scale, our mission is to accelerate the development of AI applications. For 9 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent investment from Meta, we are doubling down on building out state-of-the-art post-training algorithms to reach the performance necessary for complex agents in enterprises around the world. The Enterprise ML Research Lab works on the front lines of this AI revolution. We are working on an arsenal of proprietary research and resources that serve all of our enterprise clients. As an ML Sys Research Engineer, you’ll work on building out the algorithms for our next-gen Agent RL training platform, support large-scale training, and research and integrate state-of-the-art technologies to optimize our ML system. Your customers will be other MLREs and AAIs on the Enterprise AI team who are taking the training algorithms and applying them to client use cases ranging from next-generation AI cybersecurity firewall LLMs to training foundation healthtech search models. If you are excited about shaping the future of the modern AI movement, we would love to hear from you! You will: - Build, profile and optimize our training and inference framework. - Post-train state-of-the-art models, developed both internally and from the community, to define stable post-training recipes for our enterprise engagements. - Collaborate with ML teams to accelerate their research and development, and enable them to develop the next generation of models and data curation. - Create a next-gen agent training algorithm for multi-agent/multi-tool rollouts. Ideally you’d have: - At least 1-3 years of LLM training in a production environment - A passion for system optimization - Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO.
- Demonstrated know-how in operating the architecture of a modern GPU cluster - Experience with multi-node LLM training and inference - Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc. - Strong written and verbal communication skills to operate in a cross-functional team environment. - PhD or Masters in Computer Science or a related field Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligibl
Deep Research Agent Tech Lead
Scale AI is seeking a highly technical and strategic Staff / Senior Staff Machine Learning Engineer to act as the Tech Lead (TL) for our next generation of deep research agents for the Enterprise. This high-impact role will drive the technical direction and oversight for Deep Research Agent Development, translating cutting-edge research in Generative AI, Large Language Models (LLMs), and Agentic Frameworks into robust, scalable, and high-impact production systems that enhance enterprise operations, analytics, and core efficiency. The ideal candidate thrives in a fast-paced environment, has a passion for both deep technical work and mentoring, and is capable of setting a long-term technical strategy for a critical domain while maintaining a strong, hands-on delivery focus. Responsibilities Technical Leadership & Vision - Set the Technical Roadmap: Define and own the technical strategy, architecture, and roadmap for Deep Research Agents for the Enterprise, ensuring alignment with Scale AI’s overall AI strategy and business goals. - Drive Breakthrough Research to Production: Lead the end-to-end development, from initial research to production deployment, to landing on customer impact, with a focus on integrating diverse data modalities. - Core Agent Capabilities Development: - Advanced Knowledge Retrieval: Architect and implement state-of-the-art retrieval systems to ensure the agents provide accurate and comprehensive answers from public and proprietary data sources from enterprises. - Data analysis: Design and champion the development of data analysis agents that accurately translate complex natural language queries into executable SQL/code against diverse enterprise data schemas. - Multimodal Intelligence: Lead the integration of Multimodal AI capabilities to process and extract structured information from visual documents, tables, and forms, enriching the agent's knowledge base.
- Architecture & Design: Design and champion highly scalable, reliable, and low-latency infrastructure and frameworks for building, orchestrating, and evaluating multi-agent systems at enterprise scale. - Technical Excellence: Serve as the technical authority for the team, leading design reviews, defining ML engineering best practices, and ensuring code quality, security, and operational excellence for all agent systems. Team Leadership & Mentorship - Lead and Mentor: Technically lead and mentor a team of Machine Learning Engineers and Research Scientists, fostering a culture of innovation, rigorous engineering, rapid iteration, and technical depth. - Recruiting & Growth: Partner with management to hire, onboard, and grow top-tier talent, helping to shape the long-term structure and capabilities of the team. - Cross-Functional Influence: Collaborate effectively with Product Managers, Data Scientists, and other engineering/science teams to translate ambiguous, high-level business problems into concrete, executable technical specifications and impactful agent sol
Machine Learning Research Scientist, Post-Training
Scale works with the industry’s leading AI labs to provide high quality data and accelerate progress in GenAI research. We are looking for Research Scientists and Research Engineers with expertise in LLM post-training (SFT, RLHF, reward modeling). This role will focus on optimizing data curation and eval to enhance LLM capabilities in both text and multimodal settings. In this role, you will develop novel methods to improve the alignment and generalization of large-scale generative models. You will collaborate with researchers and engineers to define best practices in data-driven AI development. You will also partner with top foundation model labs to provide both technical and strategic input on the development of the next generation of generative AI models. You will: - Research and develop novel post-training techniques, including SFT, RLHF, and reward modeling, to enhance LLM core capabilities in both text and multimodal settings. - Design and experiment with new approaches to preference optimization. - Analyze model behavior, identify weaknesses, and propose solutions for bias mitigation and model robustness. - Publish research findings in top-tier AI conferences. Ideally you’d have: - Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or a related field. - Deep understanding of deep learning, reinforcement learning, and large-scale model fine-tuning. - Experience with post-training techniques such as RLHF, preference modeling, or instruction tuning. - Excellent written and verbal communication skills. - Published research in areas of machine learning at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or journals. - Previous experience in a customer-facing role. Compensation packages at Scale for eligible roles include base salary, equity, and benefits.
The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $252,000 - $315,000 USD PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role.
Senior Machine Learning Engineer - Model Evaluations, Public Sector
The Public Sector ML team at Scale deploys advanced AI systems—including LLMs, agentic models, and multimodal pipelines—into mission-critical government environments. We build evaluation frameworks that ensure these models operate reliably, safely, and effectively under real-world constraints. As an ML Engineer, you will design, implement, and scale automated evaluation pipelines that help customers trust and operationalize advanced AI systems across defense, intelligence, and federal missions. You will: - Develop and maintain automated evaluation pipelines for ML models across functional, performance, robustness, and safety metrics, including LLM-judge–based evaluations. - Design test datasets and benchmarks to measure generalization, bias, explainability, and failure modes. - Build evaluation frameworks for LLM agents, including infrastructure for scenario-based and environment-based testing. - Conduct comparative analyses of model architectures, training procedures, and evaluation outcomes. - Implement tools for continuous monitoring, regression testing, and quality assurance for ML systems. - Design and execute stress tests and red-teaming workflows to uncover vulnerabilities and edge cases. - Collaborate with operations teams and subject matter experts to produce high-quality evaluation datasets. - Travel occasionally (approximately 10%) for customer interaction and team needs. This role will require an active security clearance or the ability to obtain a security clearance. Ideally you’d have: - Experience in computer vision, deep learning, reinforcement learning, or NLP in production settings. - Strong programming skills in Python; experience with TensorFlow or PyTorch. - Background in algorithms, data structures, and object-oriented programming. - Experience with LLM pipelines, simulation environments, or automated evaluation systems.
- Ability to convert research insights into measurable evaluation criteria. Nice to haves: - Graduate degree in CS, ML, or AI. - Cloud experience (AWS, GCP) and model deployment experience. - Experience with LLM evaluation, CV robustness, or RL validation. - Knowledge of interpretability, adversarial robustness, or AI safety frameworks. - Familiarity with ML evaluation frameworks and agentic model design. - Experience in regulated, classified, or mission-critical ML domains. Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be elig
Staff Infrastructure Software Engineer, Enterprise AI
Scale GP is building the infrastructure that makes enterprise AI seamless. We are looking for a Senior or Staff Infrastructure Engineer to act as a primary technical lead, engineering the 'paved road' for our knowledge retrieval and inference engines. You won't just be managing resources; you’ll be defining the deployment standards for Agentic workflows at scale. Your mission is to bridge the gap between complex AI orchestration and world-class infrastructure, ensuring our platform remains the most reliable destination for enterprise agents. The ideal candidate thrives in a fast-paced environment, has a passion for both deep technical work and mentoring, and is capable of setting a long-term technical strategy for a critical domain while maintaining a strong, hands-on delivery focus. You will architect and implement solutions across multiple cloud providers (GCP, Azure, AWS) for customers in diverse, highly regulated industries like healthcare, telecom, finance, and retail. What You’ll Do: - Architect multi-cloud systems and abstractions to allow the SGP platform to run on top of existing cloud providers. - Use our own data and AI platform to analyze build and test logs and metrics to identify areas for improvement. - Define the architectural patterns for our multi-cloud infrastructure to support secure, reliable, and scalable Agentic workflows for enterprise customers. - Enhance engineering and infrastructure efficiency, reliability, accuracy, and response times, including CI/CD processes, test frameworks, data quality assurance, end-to-end reconciliation, and anomaly detection. - Collaborate with platform and product teams to develop and implement innovative infrastructure that scales to meet evolving needs. - Design and champion highly scalable, reliable, and low-latency infrastructure and frameworks for building, orchestrating, and evaluating multi-agent systems at enterprise scale.
- Lead the infrastructure roadmap with a strong focus on compliance, privacy, and security standards, including designing change management and data isolation strategies. - Own the development and maintenance of our best-in-class Agentic observability platform (logging, metrics, tracing, and analytics) to proactively ensure system health and enable rapid incident response. - Drive developer efficiency by building automated tooling and championing Infrastructure-as-Code (IaC) paradigms throughout the engineering organization to improve workflows and operational efficiency. What We're Looking For: - Proven experience in a senior role, with 5+ years of full-time software engineering experience. - Deep understanding of modern infrastructure practices, including CI/CD, IaC (e.g., Terraform, Helm Charts), container orchestration (e.g., Kubernetes) and observability platforms (e.g., Datadog, Prometheus, Grafana). - Extensive experience with at least one major cloud provider (AWS, Azure, or GCP). - Strong knowledge of security and compliance in enterprise environments, with a focus on access management, data isolation, and customer-specific VPC setups. - Proficiency in Python or JavaScript/TypeScript, and SQL. - Bonus points: Hands-on experience and a passion for working with Agents, LLMs, vector databases, and other emerging AI technologies. Compensation packages at
Machine Learning Research Engineer, Agent Data Foundation - Enterprise GenAI
AI is becoming vitally important in every function of our society. At Scale, our mission is to accelerate the development of AI applications. For 9 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent investment from Meta, we are doubling down on building out state of the art post-training algorithms to reach the performance necessary for complex agents in enterprises around the world. The Enterprise ML Research Lab works on the front lines of this AI revolution. We are working on an arsenal of proprietary research, tools, and resources that serve all of our enterprise clients. As MLRE on the Data Foundation team, you’ll work on cutting edge research to define the data flywheel that makes the whole machine move. This includes research around synthetic environments from task definitions, building agents for trace analysis, and contributing to a cutting edge framework that automatically hill-climbs agent-building from an eval set. This will involve creating best-in-class Agents that achieve state of the art results through a combination of post-training + agent-building algorithms. If you are excited about shaping the future of the modern GenAI movement, we would love to hear from you! You will: - Build synthetic data pipelines to generate enterprise environments to use for RL post-training - Create agents to convert traces from production into actionable insights to use to improve agents - Contribute to our agent building product which can construct other agents using coding agents + proprietary algorithms - Train state of the art models, developed both internally and from the community, to deploy to our enterprise customers. 
Ideally you’d have: - 3+ years of building with LLMs in a production environment - Clear experience constructing high-quality data to improve an LLM/Agent - Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years - PhD or Masters in Computer Science or a related field Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $250,000
Senior Frontier Agents Engineer
About Scale AI Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities. Role Overview As a Senior Forward Deployed AI Engineer on our Enterprise team, you'll be the technical bridge between Scale AI's cutting-edge AI capabilities and our most strategic customers. You'll work with enterprise clients to understand their unique challenges, architect custom AI solutions, and ensure successful deployment and adoption of AI systems in production environments. This is a hands-on technical role that combines deep engineering expertise with customer-facing problem solving. You'll work directly with customer engineering teams to integrate AI into their critical workflows. Key Responsibilities Customer Integration & Deployment - Partner directly with enterprise customers to understand their technical infrastructure, data pipelines, and business requirements - Design and implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs) - Build robust data connectors and ETL pipelines to ingest, process, and prepare customer data for AI workflows - Deploy and configure AI models and agents within customer security and compliance boundaries AI Agent Development - Develop production-grade AI agents tailored to customer use cases across domains like customer support, data analysis, content generation, and workflow automation - Architect multi-agent systems that orchestrate between different models, tools, and data sources - Implement evaluation frameworks to measure agent performance and iterate toward business objectives - Design human-in-the-loop workflows and feedback mechanisms for continuous agent improvement Prompt Engineering & Optimization - Create 
sophisticated prompt engineering strategies optimized for customer-specific domains and data - Build and maintain prompt libraries, templates, and best practices for customer use cases - Conduct systematic prompt experimentation and A/B testing to improve model outputs - Implement RAG (Retrieval Augmented Generation) systems and fine-tuning pipelines where appropriate Technical Leadership & Collaboration - Serve as the primary technical point of contact for strategic enterprise accounts - Collaborate with customer data scientists, ML engineers, and software developers to ensure smooth integration - Provide technical training and knowledge transfer to customer teams - Work closely with Scale's product and engineering teams to translate customer needs into product improvements - Document technical architectures, integration patterns, and best practices Problem Solving & Innovation - Debug complex technical issues across the entire stack, from data pipelines to model outputs - Rapidly prototype solutions to unblock customers and prove out new use cases
Staff Frontier Agents Engineer
About Scale AI Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities. Role Overview As a Staff Forward Deployed AI Engineer on our Enterprise team, you'll be the technical bridge between Scale AI's cutting-edge AI capabilities and our most strategic customers. You'll work with enterprise clients to understand their unique challenges, architect custom AI solutions, and ensure successful deployment and adoption of AI systems in production environments. This is a hands-on technical role that combines deep engineering expertise with customer-facing problem solving. You'll work directly with customer engineering teams to integrate AI into their critical workflows. Key Responsibilities Customer Integration & Deployment - Partner directly with enterprise customers to understand their technical infrastructure, data pipelines, and business requirements - Design and implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs) - Build robust data connectors and ETL pipelines to ingest, process, and prepare customer data for AI workflows - Deploy and configure AI models and agents within customer security and compliance boundaries AI Agent Development - Develop production-grade AI agents tailored to customer use cases across domains like customer support, data analysis, content generation, and workflow automation - Architect multi-agent systems that orchestrate between different models, tools, and data sources - Implement evaluation frameworks to measure agent performance and iterate toward business objectives - Design human-in-the-loop workflows and feedback mechanisms for continuous agent improvement Prompt Engineering & Optimization - Create 
sophisticated prompt engineering strategies optimized for customer-specific domains and data - Build and maintain prompt libraries, templates, and best practices for customer use cases - Conduct systematic prompt experimentation and A/B testing to improve model outputs - Implement RAG (Retrieval Augmented Generation) systems and fine-tuning pipelines where appropriate Technical Leadership & Collaboration - Serve as the primary technical point of contact for strategic enterprise accounts - Collaborate with customer data scientists, ML engineers, and software developers to ensure smooth integration - Provide technical training and knowledge transfer to customer teams - Work closely with Scale's product and engineering teams to translate customer needs into product improvements - Document technical architectures, integration patterns, and best practices Problem Solving & Innovation - Debug complex technical issues across the entire stack, from data pipelines to model outputs - Rapidly prototype solutions to unblock customers and prove out new use cases
Senior / Staff Machine Learning Research Scientist, Agents
About Scale At Scale AI, our mission is to accelerate the development of AI applications. For 8 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent Series F round, we’re accelerating the abundance of frontier data to pave the road to Artificial General Intelligence (AGI), and building upon our prior model evaluation work with enterprise customers and governments to deepen our capabilities and offerings for both public and private evaluations. About the ACE team The Agent Capabilities & Environments (ACE) team, part of Scale’s Research organization, brings together customer-facing Researchers and Applied AI Engineers. Our core mission includes research on agent environments and RL reward signals, benchmarking autonomous agent performance across real-world scenarios and environments, creating robust data programs to improve the agentic capabilities of Large Language Models (LLMs), and building foundational tools and frameworks for evaluating models as agents. ACE focuses on autonomous agents that dynamically interact with diverse external environments, including code repositories, GUIs, browsers, and more. About This Role This role is at the intersection of cutting-edge AI research and practical application, with a focus on studying the data types essential for building state-of-the-art agents, such as browser and SWE agents. The ideal candidate will explore the data landscape needed to advance intelligent, adaptable AI agents, guiding the data strategy at Scale to drive innovation. This position requires not only expertise in LLM agents and planning algorithms but also creativity in addressing novel challenges related to data, interaction, and evaluation. 
You will contribute to impactful research publications on agents, collaborate with customer researchers, and work alongside the engineering team to translate these advancements into real-world, scalable solutions. Ideally you’d have: - Practical experience working with LLMs, with proficiency in frameworks like PyTorch, JAX, or TensorFlow. You should also be adept at interpreting research literature and quickly turning new ideas into prototypes. - A track record of published research in top ML venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, COLM) - At least three years of experience addressing sophisticated ML problems, either in a research setting or product development. - Strong written and verbal communication skills and the ability to operate cross-functionally. Nice to have: - Hands-on experience with open source LLM fine-tuning or involvement in bespoke LLM fine-tuning projects using PyTorch/JAX. - Hands-on experience and publications in building applications and evaluations related to AI agents such as tool-use, text2SQL, browser agents, coding agents and GUI agents. - Hands-on experience with agent frameworks such as OpenHands, Swarm, LangGraph, etc. - Familiarity with agentic reasoning methods such as STaR and PLANSEARCH - Experience working with a cloud technology stack (e.g., AWS or GCP) and developing machine learning models in a cloud environment. Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any Lee
Machine Learning Fellow - Human Frontier Collective (Canada)
PLEASE NOTE: This is a fully remote, 1099 independent contractor opportunity with an estimated duration of six months and the potential for extension. To be eligible, candidates must be authorized to work in Canada. About the Program The Human Frontier Collective (HFC) Fellowship brings together top researchers and domain experts to collaborate on high-impact work that is shaping the future of AI. As an HFC Fellow, you’ll apply your academic and professional expertise to help design, evaluate, and interpret advanced generative AI systems, while gaining exposure to cutting-edge research and working alongside an interdisciplinary network of leading thinkers. What You'll Do - ML Projects: Get invited to engage in high-impact projects with our partnered AI labs and platforms. Help models understand real-world deep learning workflows by designing, reviewing, and optimizing PyTorch models, evaluating complex ML code and AI-generated implementations for efficiency and correctness, and advising on GPU optimization, scaling, and trade-offs. - HFC Community: Beyond the work, you’ll become part of a supportive, interdisciplinary network of innovators and thought leaders committed to advancing frontier AI across domains. - Contribute to Research Publications: Collaborate with Scale’s research team to co-author technical reports and research papers, boosting your academic visibility and professional recognition (e.g., SciPredict, PropensityBench, Professional Reasoning Benchmark). Who Should Apply - Education: PhD or postdoctoral degree in Computer Science, Computer Engineering, or a related field. - Professional Background: 1-3+ years of experience as a Machine Learning Engineer or Data Scientist. - Skills: Strong proficiency in Python and modern ML frameworks (PyTorch, TensorFlow). Experience with cloud infrastructure (AWS) and MLOps tools (Docker, LangChain) is a plus. 
- Professional Mindset: Detail-oriented, innovative thinker with a passion for applied AI research and a commitment to collaboration. Why Join the HFC? - Professional Development: High-impact experts expand their influence through review projects, advisory roles, and research, while deepening their AI expertise, strengthening analytical and problem-solving skills, and engaging with pioneering AI applications in science and technology. - Join a Top-Tier Network: Collaborate with a global network of engineers and experts to advance responsible AI through impactful, flexible research and training. 80% of our members come from leading institutions. - Flexible Schedule: Set your own schedule, with flexible 10–40 hour weeks that fit around your life and other commitments. - Competitive Pay: Project pay rates vary across platforms and depend on a number of factors, including but not limited to project, scope, skillset, and location.
Machine Learning Research Engineer, Agents - Enterprise GenAI
AI is becoming vitally important in every function of our society. At Scale, our mission is to accelerate the development of AI applications. For 9 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent investment from Meta, we are doubling down on building out state of the art post-training algorithms to reach the performance necessary for complex agents in enterprises around the world. The Enterprise ML Research Lab works on the front lines of this AI revolution. We are working on an arsenal of proprietary research, tools, and resources that serve all of our enterprise clients. As an Agent MLRE, you will be working on applying our Agent RL Training + Building algorithms to real-life enterprise datasets across our clients + benchmarks. This will involve creating best-in-class Agents that achieve state of the art results through a combination of post-training + agent-building algorithms. If you are excited about shaping the future of the modern GenAI movement, we would love to hear from you! You will: - Train state of the art models, developed both internally and from the community, to deploy to our enterprise customers. - Research cutting edge algorithms to integrate directly into our training stack. - Build agents that leverage our proprietary agent-building algorithms to automatically hill climb datasets – including defining highly performant tools, multi-agent systems, and complex rewards. Ideally you’d have: - 1-3 years of building with LLMs in a production environment - Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO etc. - Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years - PhD or Masters in Computer Science or a related field Compensation packages at Scale for eligible roles include base salary, equity, and benefits. 
The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $250,000 - $350,000 USD PLEASE NOTE:
Research Scientist, Safety Post Training
Scale Labs, Research Scientist — Safety Post Training As the leading data and evaluation partner for frontier AI companies, Scale plays an integral role in understanding the capabilities of, and safeguarding, AI models and systems. Building on this expertise, Scale Labs has launched a new team focused on policy research, to bridge the gap between AI research and global policymakers so that they can make informed, scientific decisions about AI risks and capabilities. Our research tackles the hardest problems in agent robustness, AI control protocols, and AI risk evaluations to help governments, industry, and the public understand and mitigate AI risk while maximizing AI adoption. This team collaborates broadly across industry, the public sector, and academia and regularly publishes its findings. We are actively seeking talented researchers to join us in shaping this vision. As a Research Scientist working on Safety Post-Training you will develop and apply post-training methods and interpretability techniques to make frontier AI systems safer and better understood by researchers and policymakers. For example, you might: - Design and run post-training pipelines to study how training choices affect model safety, robustness, and alignment properties; - Develop interpretability-informed evaluations that reveal how and why models produce unsafe, deceptive, or otherwise undesirable behaviors, and use those insights to guide targeted mitigations; - Collaborate with policymakers, engineers, and other researchers to translate post-training and interpretability findings into actionable safety standards, evaluation benchmarks, and best practices. Ideally you’d have: - Commitment to our mission of promoting safe, secure, and trustworthy AI deployments in the industry as frontier AI capabilities continue to advance. - Experience with post-training and RL techniques such as RLHF, DPO, GRPO, and similar approaches. 
- A track record of published research in machine learning, particularly in generative AI. - At least three years of experience addressing sophisticated ML problems, whether in a research setting or in product development. - Strong written and verbal communication skills to operate in a cross-functional team. Nice to have: - Experience with mechanistic interpretability, probing, or other techniques for understanding model internals. - Familiarity with red-teaming or adversarial evaluation of post-trained models. - Experience studying failure modes introduced or masked by post-training, such as reward hacking, sycophancy, or alignment faking. Our research interviews are crafted to assess candidates' skills in practical ML prototyping and debugging, their grasp of research concepts, and their alignment with our organizational culture. We will not ask any LeetCode-style questions. If you’re excited about advancing AI safety and contributing to our mission, we encourage you to apply, even if your experience doesn’t perfectly align with every requirement. Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional facto
Technical Lead Manager, Physical AI
Scale AI is the data engine for the entire AI industry. Our mission is to accelerate the development of AI applications by providing organizations with the high-quality data they need. The Physical AI team at Scale is focused on the next frontier: building general AI that can reason and act in the physical world. By leveraging Scale’s massive data infrastructure, we are helping frontier labs build Foundation Models for Physical AI that will redefine the future of automation. Role Overview As the Technical Lead Manager (TLM) for the Physical AI team at Scale, you will bridge the gap between cutting-edge Machine Learning research and physical robot deployment. You will lead a high-performing team of Research Engineers while remaining a hands-on technical contributor (~60% of your time). Your primary focus will be the development and evaluation of Large-Scale Foundation Models (e.g., VLAs, world models) that allow robots and AVs to generalize across diverse tasks, environments, and morphologies. Key Responsibilities Technical Leadership & Research - Model Scaling: Direct research into scaling laws for Physical AI, determining how to best utilize massive datasets for pre-training and fine-tuning generalist policies. - VLA and World Model Development: Develop novel methods for building and evaluating models, including new Physical AI industry benchmarks. - Hands-on Modeling: Actively write code to implement, train and test SOTA architectures. Conduct research on Physical AI data collection, cross-embodiment training, and policy fine-tuning. - Data Strategy: Collaborate with internal labeling teams to design "robotic-native" data pipelines, including the use of VLMs for automated trajectory annotation and data synthesis. 
- Collaborate closely with customers to drive the industry forward in using Scale data. Team Management & Execution - Mentorship: Lead and grow a team of 4-6 elite Physical AI researchers, fostering a culture of high-velocity experimentation and rigorous evaluation. - Paper-to-Product: Translate the latest research from NeurIPS, ICRA, and CVPR into production-ready features for Scale’s Physical AI partners. - Cross-functional Alignment: Work with cross-functional teams (e.g., Product and Operations) to bring our research breakthroughs into production. Required Qualifications AI/ML Excellence - Deep Learning Mastery: Expert-level proficiency in PyTorch, with deep knowledge of Transformer architectures, Attention mechanisms, and Self-Supervised Learning. - VLM/VLA Experience: Proven track record of working with Vision-Language Models (e.g., CLIP, PaLM-E) and adapting them for spatial reasoning or embodied tasks. - Generative AI: Experience with Diffusion Models for sequence generation or Generative World Models for predictive
Staff Software Engineer, Enterprise GenAI
Scale GP (Scale Generative AI Platform) is an enterprise-grade Generative AI platform that provides APIs for knowledge retrieval, inference, evaluation, and more. We are looking for a strong engineer to join our team and help us build and scale our product in a fast-paced environment. The ideal candidate will have a strong understanding of software engineering principles and practices, as well as experience with large-scale distributed systems. You will be responsible for owning large new areas within our product, working across backend and frontend, and interacting with LLMs and ML models. You will solve hard engineering problems in scalability and reliability. You will: - Own large new areas within our product - Work across backend and frontend, and interact with LLMs and ML models - Deliver experiments at a high velocity and level of quality to engage our customers - Work across the entire product lifecycle from conceptualization through production - Be able, and willing, to multi-task and learn new technologies quickly Ideally you'd have: - 7+ years of full-time engineering experience, post-graduation - Experience scaling products at hypergrowth startups - Experience tinkering with or productizing LLMs, vector databases, and the other latest AI technologies - Proficiency in Python or JavaScript/TypeScript, and SQL - Experience with Kubernetes - Experience with major cloud providers (AWS, Azure, GCP) Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. 
Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $252,000 - $315,000 USD PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants. About Us: At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-sta
Senior Forward Deployed Data Scientist/Engineer
At Scale AI, we help leading enterprises turn AI from a promising capability into reliable systems that improve real workflows and deliver measurable business value. We are hiring a Senior Forward Deployed Data Scientist / Engineer to work directly with customers on ambiguous, high-impact problems at the intersection of data science, product development, and AI deployment. This is not a traditional analytics role. On this team, data scientists do the core statistical and modeling work, but they also build real tools and products: evaluation explorers, operator workflows, decision-support systems, experimentation surfaces, and customer-specific AI/data applications that get used in production. In many cases, the data scientist builds the first usable version of the solution, proves value quickly, and helps drive it into a durable product or platform capability. The right candidate is strong in first-principles problem solving, rigorous measurement, and technical execution. They know how to define metrics, design experiments, diagnose failures, and build systems that people actually use. They are also comfortable using modern AI-assisted development tools to prototype and iterate quickly without sacrificing reliability, observability, or judgment. Python and SQL matter in this role, but as execution fluency in service of building better products and making better decisions. 
What you’ll do - Partner directly with enterprise customers to understand workflows, operational pain points, constraints, and success criteria - Turn ambiguous business and product problems into measurable solutions with clear metrics, technical designs, and deployment plans - Design and build internal and customer-facing data products, including evaluation tools, workflow applications, decision-support systems, and thin product layers on top of data/ML systems - Build end-to-end solutions across data ingestion, transformation, experimentation, statistical modeling, deployment, monitoring, and iteration - Design evaluation frameworks, benchmarks, and feedback loops for ML/LLM systems, human-in-the-loop workflows, and model-assisted operations - Apply rigorous statistical thinking to experimentation, causal inference, metric design, forecasting, segmentation, diagnostics, and performance measurement - Use AI-assisted development workflows to accelerate prototyping and product iteration, while maintaining strong engineering discipline - Diagnose failure modes across data quality, model behavior, retrieval, workflow design, and user experience, and drive fixes into production - Act as the voice of the customer to Product, Engineering, and Data Science, using field learnings to shape roadmap and platform capabilities What we’re looking for - 5+ years of experience in data science, machine learning, quantitative engineering, or another highly analytical technical role - Proven track record of shipping data, ML, or AI systems that delivered measurable business or product impact - Exceptional ability to structure ambiguous problems, define the right success metrics, and translate them into executable technical plans - Strong foundation in statistics, experimentation, causal reasoning, and measurement - Experience building tools or products, not just analyses — for example internal workflow tools, evaluation systems, operator-facing products, experimentation platforms, or 
customer-specific applications - Hands-on fluency in Python, SQL, and modern data/AI tooling; able to inspect d
Product Manager of AI Applications, Global Public Sector
Scale is growing rapidly, and joining the Global Public Sector team is an opportunity to work on one of the most exciting and quickly expanding teams at Scale. This team is responsible for generating, executing, and fostering Scale’s work with governments and government-backed entities outside of the United States. We develop bespoke solutions that leverage our customers’ proprietary data and expertise to transform their organizations with AI. We work with them to understand their pain points and workflows and then forward deploy our team to build cutting-edge solutions. The applications we build are powered by the Scale GenAI Platform, a full stack product to build, test and deploy frontier AI agents. - Developing custom AI applications - Building custom LLMs - Providing high-quality training data for research and government institutions building LLMs - Developing partnerships to foster regional talent growth and AI adoption We are looking for an entrepreneurial and experienced product leader to play a pivotal role in the ideation and development of transformative AI solutions. The ideal candidate has deep experience with AI/ML application development, can think strategically about how to solve a problem, is an excellent listener, is comfortable getting into the weeds operationally, and has a strong understanding of software engineering principles and practices. You will be responsible for owning large AI projects for one or many customers. You will lead a cross-functional team of engineers, MLEs, and operators to build a highly impactful solution for our customers that will drive millions in revenue for our business as well. 
Responsibilities: - Lead design workshops with the client to define custom AI solutions - Scope out new AI application use cases across various government entities - Lead cross-functional development of AI applications and custom LLMs with diverse stakeholders (Engineering + Ops + Go-to-Market) - Consistently engage with future end-users to solicit feedback and ensure we are prioritizing effectively - Stay up to date with the latest research in applied AI and training custom LLMs - Scope out model evaluation sets and performance requirements, consistently review results, and iterate on the solution - Give regular progress updates to the client and Global Public Sector leadership Minimum Qualifications: - 4+ years of experience building products with specific experience within the last 1-2 years building AI-powered products - Strong technical background (STEM degree) and/or experience building technical software products - Strong understanding of generative AI technologies and their applications in both enterprise and consumer settings - Experience with vibe coding tools (e.g., Replit, Lovable, Bolt) and design tools (e.g., Figma, Canva, Miro) - Exceptional leadership, presentation and communication skills with the ability to influence cross-functional teams Nice to haves: - Coding experience (Python) - Proficiency in Arabic, both written and spoken PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all ap
Senior Machine Learning Engineer, Public Sector
The goal of a Senior Machine Learning Engineer at Scale is to leverage techniques in the fields of generative AI, computer vision, reinforcement learning, and agentic AI to improve Scale's products and customer experience in production environments. Our machine learning engineers take advantage of robust internal infrastructure and unique access to massive datasets to deliver improvements to our customers. Our Public Sector Machine Learning team is focused on deploying cutting-edge models to mission-critical government systems through products like Donovan and Thunderforge. Our work spans multiple modalities, with a strong focus on both large language models and computer vision. On the LLM side, we are developing agentic systems that help solve complex operational and planning challenges for government partners. This includes building agent frameworks that integrate with custom retrieval pipelines and production APIs, as well as evaluation tools to benchmark and refine agent behavior. We're also advancing research in areas like reinforcement learning for agentic LLMs, with successful deployment into real-world operational environments. On the computer vision front, we're training advanced models to increase labeling throughput and automate perception tasks. Our efforts include building large-scale fine-tuning pipelines, training models across multiple modalities, and developing generalizable vision foundation models to support a wide range of defense applications. 
You will: - Take state-of-the-art models developed internally and from the community and use them in production to solve problems for our customers and taskers - Improve and maintain production models through retraining, hyperparameter tuning, and architectural updates, while preserving core performance characteristics - Collaborate with product and research teams to identify and prototype ML-driven product enhancements, including for upcoming product lines - Work with massive datasets to develop generic models as well as fine-tune models for specific products - Build scalable machine learning infrastructure to automate and optimize our ML services - Serve as a cross-functional representative and advocate for machine learning techniques across engineering and product organizations - Be comfortable learning new technologies quickly and managing multiple priorities in a fast-paced environment - Be comfortable with light travel (approximately 10%) for customer interaction and team needs - This role will require an active security clearance or the ability to obtain a security clearance Ideally You’d Have: - Extensive experience with GenAI, Agentic AI, natural language processing, deep learning and deep reinforcement learning, or computer vision in a production environment - Solid background in algorithms, data structures, and object-oriented programming - Strong programming skills in Python, experience in TensorFlow or PyTorch Nice to Haves: - Graduate degree in Computer Science, Machine Learning or Artificial Intelligence specialization - Experience working with cloud platforms (e.g., AWS or GCP) and deploying machine learning models in cloud environments - Experience with computer vision, generative AI models, large language models, or agentic systems - Familiarity with ML evaluation frameworks and agentic model design
Manager, Machine Learning Research Scientist, GenAI
Scale AI accelerates the development of AI systems by providing the data, infrastructure, and tooling that power the most advanced models in the world. Our teams operate at the intersection of cutting-edge research, large-scale engineering, and real-world deployment, partnering with leading frontier labs, enterprises, and government agencies to push Generative AI into new capabilities and applications. As AI rapidly evolves from static models to dynamic, agentic systems, Scale is building the foundational research, evaluation methodologies, and agent/RL infrastructure that will define this next era. You’ll join a high-impact research organization driving advances in large-language models, post-training, evaluation, and agentic/RL environments, helping shape how next-generation AI is built, measured, and deployed. As a Research Scientist Manager, you will lead a world-class team of research scientists and engineers, define the research roadmap, and drive execution from early prototyping to deployment. You’ll thrive in a fast-moving environment, balancing deep technical leadership with people management, vision setting, and delivery. You will: - Lead, mentor and grow a team of research scientists and engineers working on GenAI research initiatives (e.g., evaluation, post-training, agents, RL environments). - Define and drive a multi-year research roadmap: identify key scientific questions, set milestones, allocate resources, and ensure rigorous execution. - Collaborate cross-functionally with engineering, product, client-facing teams and external academic or industry partners to translate research into components, insights, and actionable outcomes. - Communicate compellingly: publish research, present at conferences, engage in open-source contributions, and represent the team externally. - Drive an inclusive, high-performing culture: help your team through technical challenges, provide growth opportunities, and attract top talent. 
- Stay deeply connected to the research community, understanding major trends, and helping set them. - Thrive in a high-energy, fast-paced startup environment and are ready to dedicate the time and effort needed to drive impactful results. Ideally you'd have: - 5+ years of hands-on research experience (PhD or equivalent preferred) in machine learning, deep learning, generative models, agent/RL systems or related domains. - A strong track record of research excellence, including publications in top-tier ML/AI venues (NeurIPS, ICML, ICLR, ACL, etc.). - Experience and a track record of landing major research impact in a fast-paced environment. - Experience leading or managing research teams. You’re excited to mentor, coach and develop talent. - Excellent written and verbal communication skills. You are able to articulate research ideas and outcomes to both technical and non-technical stakeholders. Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are
ML Research Engineer, ML Systems
Scale’s ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and operators for fast and automatic training and evaluation of LLMs, as well as evaluation of data quality. Scale is uniquely positioned at the heart of the field of AI as an indispensable provider of training and evaluation data and end-to-end solutions for the ML lifecycle. You will work closely across Scale’s ML teams and researchers to build the foundation platform that supports all our ML research and development. You will be building and optimizing the platform to enable our next generation of LLM training, inference and data curation. If you are excited about shaping the future of AI via fundamental innovations, we would love to hear from you! You will: - Build, profile and optimize our training and inference framework - Collaborate with ML teams to accelerate their research and development and enable them to develop the next generation of models and data curation - Research and integrate state-of-the-art technologies to optimize our ML system Ideally you’d have: - Strong excitement about system optimization - Experience with multi-node LLM training and inference - Experience with developing large-scale distributed ML systems - Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc. - Strong written and verbal communication skills and the ability to operate in a cross functional team environment Nice to haves: - Demonstrated expertise in post-training methods and/or next generation use cases for large language models including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal, etc. Compensation packages at Scale for eligible roles include base salary, equity, and benefits. 
The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $189,600 - $237,000 USD PLEASE NOTE: Our po
AI Applications Ops Lead, GPS
Role Overview Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of: - Creating custom AI applications that will impact millions of citizens - Generating high-quality training data for national LLMs - Upskilling and advisory services to spread the impact of AI As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners. At Scale, we’re not just building AI solutions—we’re enabling the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you. You will: - Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies. - Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment. - Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability. - Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks. - Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again. 
- Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials. - Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases. Ideally, you have: - Experience: 6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector. - Global perspective: Familiarity with international government security standards and the complexities of deploying sovereign AI. - System architecture proficiency: Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core. - Modern AI Stack expertise: Proficiency in coding and modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools. - Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them. - Reliability: You understand that in the public sector, a model failure m
Staff Machine Learning Research Engineer, Agent Post-training - Enterprise GenAI
AI is becoming vitally important in every function of our society. At Scale, our mission is to accelerate the development of AI applications. For 9 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent investment from Meta, we are doubling down on building out state of the art post-training algorithms to reach the performance necessary for complex agents in enterprises around the world. The Enterprise ML Research Lab works on the front lines of this AI revolution. We are working on an arsenal of proprietary research, tools, and resources that serve all of our enterprise clients. As a Staff Agent Post-Training MLRE, you will build out our next-gen Agent RL training platform. You’ll build out the platform that will train best-in-class Agents that achieve state of the art results on real enterprise use-cases. You’ll integrate cutting edge research into our training stack, enabling MLREs on the Enterprise AI team to deploy use-cases ranging from next-generation AI cybersecurity firewall LLMs to training foundation healthtech search models. If you are excited about shaping the future of the modern GenAI movement, we would love to hear from you! You will: - Train state of the art models, developed both internally and from the community, to deploy to our enterprise customers. - Research cutting edge algorithms to integrate directly into our training stack. - Design solutions that enable complex multi-agent systems to directly learn from both process + outcome based rewards. Ideally you’d have: - 5+ years of LLM training in a production environment - Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO etc. 
- Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years - PhD or Masters in Computer Science or a related field Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $250,000 - $350,000 USD
Frontier Agents Engineer
About Scale AI Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities. Role Overview As a Forward Deployed AI Engineer on our Enterprise team, you'll be the technical bridge between Scale AI's cutting-edge AI capabilities and our most strategic customers. You'll work with enterprise clients to understand their unique challenges, architect custom AI solutions, and ensure successful deployment and adoption of AI systems in production environments. This is a hands-on technical role that combines deep engineering expertise with customer-facing problem solving. You'll work directly with customer engineering teams to integrate AI into their critical workflows. Key Responsibilities Customer Integration & Deployment - Partner directly with enterprise customers to understand their technical infrastructure, data pipelines, and business requirements - Design and implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs) - Build robust data connectors and ETL pipelines to ingest, process, and prepare customer data for AI workflows - Deploy and configure AI models and agents within customer security and compliance boundaries AI Agent Development - Develop production-grade AI agents tailored to customer use cases across domains like customer support, data analysis, content generation, and workflow automation - Architect multi-agent systems that orchestrate between different models, tools, and data sources - Implement evaluation frameworks to measure agent performance and iterate toward business objectives - Design human-in-the-loop workflows and feedback mechanisms for continuous agent improvement Prompt Engineering & Optimization - Create 
sophisticated prompt engineering strategies optimized for customer-specific domains and data - Build and maintain prompt libraries, templates, and best practices for customer use cases - Conduct systematic prompt experimentation and A/B testing to improve model outputs - Implement RAG (Retrieval Augmented Generation) systems and fine-tuning pipelines where appropriate Technical Leadership & Collaboration - Serve as the primary technical point of contact for strategic enterprise accounts - Collaborate with customer data scientists, ML engineers, and software developers to ensure smooth integration - Provide technical training and knowledge transfer to customer teams - Work closely with Scale's product and engineering teams to translate customer needs into product improvements - Document technical architectures, integration patterns, and best practices Problem Solving & Innovation - Debug complex technical issues across the entire stack, from data pipelines to model outputs - Rapidly prototype solutions to unblock customers and prove out new use cases
Tech Lead Manager- MLRE, ML Systems
Scale's LLM post-training platform team builds our internal distributed framework for large language model training. The platform powers MLEs, researchers, data scientists, and operators for fast and automatic training and evaluation of LLMs. It also serves as the underlying training framework for the data quality evaluation pipeline. Scale is uniquely positioned at the heart of the field of AI as an indispensable provider of training and evaluation data and end-to-end solutions for the ML lifecycle. You will work closely with Scale’s ML teams and researchers to build the foundation platform that supports all our ML research and development work. You will be building and optimizing the platform to enable our next generation of LLM training, inference and data curation. If you are excited about shaping the future of AI via fundamental innovations, we would love to hear from you! You will: - Build, profile and optimize our training and inference framework. - Collaborate with ML and research teams to accelerate their research and development, and enable them to develop the next generation of models and data curation. - Research and integrate state-of-the-art technologies to optimize our ML system. Ideally you’d have: - A passion for system optimization - Experience with multi-node LLM training and inference - Experience with developing large-scale distributed ML systems - Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO etc. - Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc. - Strong written and verbal communication skills to operate in a cross functional team environment. Nice to haves: - Demonstrated expertise in post-training methods and/or next generation use cases for large language models including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal, etc. 
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $264,800 - $331,000 USD
Applied AI Engineer, Global Public Sector
Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of: - Creating custom AI applications that will impact millions of citizens - Generating high-quality training data for national LLMs - Upskilling and advisory services to spread the impact of AI We are hiring Applied AI Engineers to build custom end-to-end AI applications for our public sector clients using the latest developments in the field of AI. You will also have the opportunity to create custom datasets and evaluations and to fine-tune these models to maximize performance on real-world use cases with global reach. At Scale, we’re not just building AI solutions; we are building repeatable blocks to enable the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a member of our rapidly expanding team, we’d love to hear from you. 
You will: - Partner with public sector clients to deeply understand their challenges and define AI-driven solutions - Build and deploy end-to-end AI applications into production leveraging the latest developments from the biggest AI labs and open source models - Collaborate with cross-functional teams, including data annotation specialists, to create high-quality training datasets - Design and maintain robust evaluation frameworks to ensure the reliability and effectiveness of AI models - Participate in customer engagements, including occasional travel (approximately two weeks per quarter) - Contribute to the scaling of AI capabilities in the public sector through hands-on knowledge sharing Ideally you’d have: - A strong engineering background, with a Bachelor’s degree in Computer Science, Mathematics, or a related quantitative field (or equivalent practical experience) - 7+ years of post-graduation engineering experience, with demonstrated proficiency in languages such as Python, TypeScript/JavaScript, Java, or C++. - 2+ years of experience applying AI/ML in production environments, such as deploying deep learning solutions, building generative/agentic AI applications or setting up evaluation pipelines - Familiarity with cloud-based machine learning tools and platforms (e.g., AWS, GCP, Azure) - Strong problem-solving skills, with a data-driven approach to iterating on machine learning models and datasets - Excellent written and verbal communication skills to collaborate effectively in a cross-functional environment Nice to haves: - Experience working at a startup, particularly as a founding engineer - Experience building and deploying large-scale AI solutions - Strong written and verbal communication skills to operate in a cross-functional team environment - Proficiency in Arabic (if focused on language models) PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. 
This allows us to ensure a fair and thorough evaluation of all applicants. About Us: At Scale, our mission is
Machine Learning Fellow - Human Frontier Collective (US)
PLEASE NOTE: This is a fully remote, 1099 independent contractor opportunity with an estimated duration of six months and the potential for extension. To be eligible, candidates must be authorized to work in the United States; visa sponsorship is not available for this role. About the Program The Human Frontier Collective (HFC) Fellowship brings together top researchers and domain experts to collaborate on high-impact work that is shaping the future of AI. As an HFC Fellow, you’ll apply your academic and professional expertise to help design, evaluate, and interpret advanced generative AI systems, while gaining exposure to cutting-edge research and working alongside an interdisciplinary network of leading thinkers. What You'll Do - ML Projects: Get invited to engage in high-impact projects with our partnered AI labs and platforms. Help models understand real-world deep learning workflows by designing, reviewing, and optimizing PyTorch models, evaluating complex ML code and AI-generated implementations for efficiency and correctness, and advising on GPU optimization, scaling, and trade-offs. - HFC Community: Beyond the work, you’ll become part of a supportive, interdisciplinary network of innovators and thought leaders committed to advancing frontier AI across domains. - Contribute to Research Publications: Collaborate with Scale’s research team to co-author technical reports and research papers, boosting your academic visibility and professional recognition (e.g., SciPredict, PropensityBench, Professional Reasoning Benchmark). Who Should Apply - Education: PhD or postdoctoral degree in Computer Science, Computer Engineering, or a related field. - Professional Background: 1-3+ years of experience as a Machine Learning Engineer or Data Scientist. - Skills: Strong proficiency in Python and modern ML frameworks (PyTorch, TensorFlow). Experience with cloud infrastructure (AWS) and MLOps tools (Docker, LangChain) is a plus. 
- Professional Mindset: Detail-oriented, innovative thinker with a passion for applied AI research and a commitment to collaboration. Why Join the HFC? - Professional Development: High-impact experts expand their influence through review projects, advisory roles, and research, while deepening their AI expertise, strengthening analytical and problem-solving skills, and engaging with pioneering AI applications in science and technology. - Join a Top-Tier Network: Collaborate with a global network of engineers and experts to advance responsible AI through impactful, flexible research and training. 80% of our members come from leading institutions. - Flexible Schedule: Set your own schedule, with flexible 10–40 hour weeks that fit around your life and other commitments. - Competitive Pay: Project pay rates vary across platforms and depend on a number of factors, including but not li
Director, Enterprise Machine Learning & Research
The Enterprise ML team works on the front lines of the AI revolution, partnering deeply with customers to identify high-impact business problems and build cutting-edge AI systems using Scale’s proprietary research, data, and infrastructure, unlocking domain expertise through high-quality data and expert feedback. As Director of Enterprise ML, you will lead a world-class team of research scientists and engineers, define the research roadmap, and drive execution from early prototyping to deployment. You’ll thrive in a fast-moving environment, balancing deep technical leadership with people management, vision setting, and delivery. This role is ideal for a leader who thrives in ambiguity, understands both frontier GenAI capabilities and their limitations, and is motivated by turning research into durable, production-ready systems. What You’ll Do - Lead, mentor, and grow a team of research scientists and engineers working on GenAI research initiatives (e.g., evaluation, post-training, agents, RL environments). - Define and drive a multi-year research roadmap: identify key scientific questions, set milestones, allocate resources, and ensure rigorous execution. - Collaborate cross-functionally with engineering, product, client-facing teams, and external academic or industry partners to translate research into components, insights, and actionable outcomes. - Communicate compellingly: publish research, present at conferences, engage in open-source contributions, and represent the team externally. - Drive an inclusive, high-performing culture: help your team through technical challenges, provide growth opportunities, and attract top talent. - Stay deeply connected to the research community, understanding major trends and helping set them. - Thrive in a high-energy, fast-paced startup environment and be ready to dedicate the time and effort needed to drive impactful results. 
What We’re Looking For Core Qualifications - 8+ years of hands-on research experience (PhD or equivalent preferred) in machine learning, deep learning, generative models, agent/RL systems, or related domains. - A strong track record of research excellence, including publications in top-tier ML/AI venues (NeurIPS, ICML, ICLR, ACL, etc.). - Experience and a track record of landing major research impact in a fast-paced environment. - Experience leading or managing research teams. You’re excited to mentor, coach, and develop talent. - Excellent written and verbal communication skills. You are able to articulate research ideas and outcomes to both technical and non-technical stakeholders. - Exceptional communication and stakeholder management skills, with the ability to influence executives, customers, and cross-functional partners. Nice to Have - Hands-on experience building and deploying agent-based, tool-augmented, or workflow-driven LLM systems in enterprise environments. - Prior ownership of enterprise AI platforms, internal ML products, or customer-facing AI services at scale. - Proven track record of partnering directly with enterprises to identify high-impact use cases and deliver measurable business outcomes. Compensation packages at Scale for eligible roles in
Machine Learning Fellow - Human Frontier Collective (UK)
PLEASE NOTE: This is a fully remote, 1099 independent contractor opportunity with an estimated duration of six months and the potential for extension. To be eligible, candidates must be authorized to work in the country they reside in. About the Program The Human Frontier Collective (HFC) Fellowship brings together top researchers and domain experts to collaborate on high-impact work that is shaping the future of AI. As an HFC Fellow, you’ll apply your academic and professional expertise to help design, evaluate, and interpret advanced generative AI systems while gaining exposure to cutting-edge research and working alongside an interdisciplinary network of leading thinkers. What You'll Do - ML Projects: Get invited to engage in high-impact projects with our partnered AI labs and platforms. Help models understand real-world deep learning workflows by designing, reviewing, and optimizing PyTorch models, evaluating complex ML code and AI-generated implementations for efficiency and correctness, and advising on GPU optimization, scaling, and trade-offs. - HFC Community: Beyond the work, you’ll become part of a supportive, interdisciplinary network of innovators and thought leaders committed to advancing frontier AI across domains. - Contribute to Research Publications: Collaborate with Scale’s research team to co-author technical reports and research papers, boosting your academic visibility and professional recognition (e.g., SciPredict, PropensityBench, Professional Reasoning Benchmark). Who Should Apply - Education: PhD or postdoctoral degree in Computer Science, Computer Engineering, or a related field. - Professional Background: 1-3+ years of experience as a Machine Learning Engineer or Data Scientist. - Skills: Strong proficiency in Python and modern ML frameworks (PyTorch, TensorFlow). Experience with cloud infrastructure (AWS) and MLOps tools (Docker, LangChain) is a plus. 
- Professional Mindset: Detail-oriented, innovative thinker with a passion for applied AI research and a commitment to collaboration. Why Join the HFC? - Professional Development: High-impact experts expand their influence through review projects, advisory roles, and research, while deepening their AI expertise, strengthening analytical and problem-solving skills, and engaging with pioneering AI applications in science and technology. - Join a Top-Tier Network: Collaborate with a global network of engineers and experts to advance responsible AI through impactful, flexible research and training. 80% of our members come from leading institutions. - Flexible Schedule: Set your own schedule, with flexible 10–40 hour weeks that fit around your life and other commitments. - Competitive Pay: Project pay rates vary across platforms and depend on a number of factors, including but not limited to: projects, scope, skillset, and loca
Principal AI Ops Architect, GPS
Role Overview Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of: - Creating custom AI applications that will impact millions of citizens - Generating high-quality training data for national LLMs - Upskilling and advisory services to spread the impact of AI As a Principal AI Ops Architect, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners. At Scale, we’re not just building AI solutions—we’re enabling the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you. You will: - Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies. - Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment. - Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability. - Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks. - Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again. 
- Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials. - Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases. Ideally, you have: - Experience: 6+ years in a high-impact technical role (SRE, FDE, or MLOps) with experience in the public sector. - Global perspective: Familiarity with international government security standards and the complexities of deploying sovereign AI. - System architecture proficiency: Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core. - Modern AI Stack expertise: Proficiency in coding and modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools. - Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them. - Reliability: You understand that in the public sector, a model failu
Software Engineer, Enterprise AI
Scale GP (Scale Generative AI Platform) is an enterprise-grade Generative AI platform that provides APIs for knowledge retrieval, inference, evaluation, and more. We are looking for a strong engineer to join our team and help us build and scale our product in a fast-paced environment. The ideal candidate will have a strong understanding of software engineering principles and practices, as well as experience with large-scale distributed systems. You will be responsible for owning large new areas within our product, working across backend and frontend, and interacting with LLMs and ML models. You will solve hard engineering problems in scalability and reliability. You will: - Own large new areas within our product - Work across backend and frontend, and interact with LLMs and ML models - Deliver experiments at a high velocity and level of quality to engage our customers - Work across the entire product lifecycle from conceptualization through production - Be able, and willing, to multi-task and learn new technologies quickly Ideally you'd have: - 4+ years of full-time engineering experience, post-graduation - Experience scaling products at hypergrowth startups - Experience tinkering with or productizing LLMs, vector databases, and the other latest AI technologies - Proficiency in Python or JavaScript/TypeScript, and SQL - Experience with Kubernetes - Experience with major cloud providers (AWS, Azure, GCP) Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position and may be inclusive of several career levels at Scale; it will be determined during the interview process based on work location and additional factors, including job-related skills, experience, qualifications, interview performance, and relevant education or training. 
Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental, and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend. Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is: $216,000 - $270,000 USD PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants. About Us: At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-sta
Staff Software Engineer, Full-Stack - Enterprise Gen AI
Scale GP (Scale Generative AI Platform) is an enterprise-grade AI platform providing APIs for knowledge retrieval, inference, evaluation, and more. We are looking for a frontend-focused full-stack engineer to help build AI-powered applications that redefine enterprise workflows and push the boundaries of interactive AI. This role is ideal for someone who thrives in a fast-paced environment, enjoys working on a diverse set of projects, and has a passion for crafting high-quality, intuitive user experiences. At Scale, you'll work on a mix of cutting-edge customer-facing AI applications and internal SaaS products. Our engineering team powers projects like TIME’s Person of the Year AI experience (see it in action), where our AI technology helped shape one of the most iconic features in media. You'll also contribute to Scale’s GenAI Platform (SGP), a powerful system that enables businesses to build and deploy AI agents at scale. Whether it’s developing interactive AI assistants, enterprise-grade web applications, or refining our core SaaS platform, you’ll play a crucial role in shaping how AI integrates into real-world applications. 
You Will: - Build and enhance user-facing AI applications for major enterprise customers, including high-profile media and Fortune 500 companies - Develop and refine features for Scale’s GenAI Platform, empowering businesses to build, deploy, and manage AI-driven agents - Design, build, and optimize polished, high-performance UIs using Next.js, React, TypeScript, and Tailwind - Work closely with product managers, designers, and AI/ML teams to create seamless, intuitive, and impactful user experiences - Integrate frontend applications with backend services, working with APIs, authentication systems, and cloud-based infrastructure - Ship features at a rapid pace while maintaining a high level of code quality, performance, and accessibility Ideally, You Have: - 5+ years of experience developing frontend or fullstack applications in a modern tech stack - Strong proficiency in Next.js, React, TypeScript, and Tailwind, with an eye for building polished, user-friendly interfaces - Experience working on high-visibility, customer-facing applications and making trade-offs between speed and quality in fast-paced environments - A passion for AI and experience working on interactive AI applications, agent-based systems, or data-rich web platforms - Familiarity with backend technologies such as FastAPI, PostgreSQL, GraphQL, and cloud infrastructure like AWS, Azure, or GCP - A track record of collaborating cross-functionally with design, product, and ML teams to bring AI-powered applications to life This role is a unique opportunity to shape the future of AI-powered user experiences, working on projects that impact millions of users while developing tools that empower businesses to deploy AI at scale. If you’re excited by the intersection of AI, frontend engineering, and product design, we’d love to hear from you. The base salary range for this f
Senior Staff Frontier Agents Engineer
About Scale AI Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. We partner with leading enterprises and government organizations to accelerate their AI initiatives through our data annotation platform, generative AI solutions, and enterprise AI capabilities. Role Overview As a Senior Staff Forward Deployed AI Engineer on our Enterprise team, you'll be the technical bridge between Scale AI's cutting-edge AI capabilities and our most strategic customers. You'll work with enterprise clients to understand their unique challenges, architect custom AI solutions, and ensure successful deployment and adoption of AI systems in production environments. This is a hands-on technical role that combines deep engineering expertise with customer-facing problem solving. You'll work directly with customer engineering teams to integrate AI into their critical workflows. Key Responsibilities Customer Integration & Deployment - Partner directly with enterprise customers to understand their technical infrastructure, data pipelines, and business requirements - Design and implement custom integrations between Scale AI's platform and customer data environments (cloud platforms, data warehouses, internal APIs) - Build robust data connectors and ETL pipelines to ingest, process, and prepare customer data for AI workflows - Deploy and configure AI models and agents within customer security and compliance boundaries AI Agent Development - Develop production-grade AI agents tailored to customer use cases across domains like customer support, data analysis, content generation, and workflow automation - Architect multi-agent systems that orchestrate between different models, tools, and data sources - Implement evaluation frameworks to measure agent performance and iterate toward business objectives - Design human-in-the-loop workflows and feedback mechanisms for continuous agent improvement Prompt Engineering & Optimization - 
Create sophisticated prompt engineering strategies optimized for customer-specific domains and data - Build and maintain prompt libraries, templates, and best practices for customer use cases - Conduct systematic prompt experimentation and A/B testing to improve model outputs - Implement RAG (Retrieval Augmented Generation) systems and fine-tuning pipelines where appropriate Technical Leadership & Collaboration - Serve as the primary technical point of contact for strategic enterprise accounts - Collaborate with customer data scientists, ML engineers, and software developers to ensure smooth integration - Provide technical training and knowledge transfer to customer teams - Work closely with Scale's product and engineering teams to translate customer needs into product improvements - Document technical architectures, integration patterns, and best practices Problem Solving & Innovation - Debug complex technical issues across the entire stack, from data pipelines to model outputs - Rapidly prototype solutions to unblock customers and prove out new use cases
Company Details
Registered Agents
No registered agents are associated with this company yet.