Senior AI Platform Engineer, ATLAS AI

Cognite

Software Engineering, Data Science
Phoenix, AZ, USA
Posted on Aug 27, 2025
About Cognite
Embark on a transformative journey with Cognite, a global SaaS forerunner in leveraging AI and data to unravel complex business challenges through our cutting-edge offerings including Cognite Atlas AI, an industrial agent workbench, and the Cognite Data Fusion (CDF) platform. We were named the 2022 Technology Innovation Leader for Global Digital Industrial Platforms, and we were recognized as the 2024 Microsoft Energy and Resources Partner of the Year. In the realm of industrial digital transformation, we stand at the forefront, reshaping the future of Oil & Gas, Chemicals, Pharma and other Manufacturing and Energy sectors. Join us in this venture where AI and data meet ingenuity, and together, we forge the path to a smarter, more connected industrial future.
Our values
Impact: Cogniters strive to make an impact in all that they do. We are results-oriented.
Ownership: Cogniters embrace a culture of ownership. We go beyond our comfort zones to contribute to the greater good, fostering inclusivity and sharing responsibilities for challenges and success.
Relentless: Cogniters are relentless in their pursuit of innovation. We are determined and deliberate (never ruthless or reckless), facing challenges head-on and viewing setbacks as opportunities for growth.
The Role
We are seeking a Senior AI Platform Engineer to join the Cognite Atlas AI Product team in Phoenix, AZ, to engineer, build, and operate the production-grade, multi-cloud platform that enables our internal and partner teams to build, deploy, and manage industrial AI agents. You will be responsible for creating the core services, frameworks, and infrastructure for our "agent builder workbench" and agent runtime, focusing on scalability, reliability, cost-efficiency, and security. Your work will directly impact industrial efficiency and sustainability, which is critical to our mission of powering a high-tech, sustainable, and profitable industrial future.

Responsibilities

  • Design, build, and maintain the core Python SDKs and services for the Atlas AI platform. Create clean abstractions that empower Solution Engineers to easily define and test agents and workflows.
  • Build the core agentic runtime, ensuring it is scalable, meets its SLOs, and can reliably manage the state, orchestration, and execution of industrial agents.
  • Develop a robust, governed, and secure framework for AI agent tool-use. Engineer the platform components that allow solution engineers to safely add new tools (e.g., API calls, database queries) and that manage the secure execution, monitoring, and access control for those tools.
  • Manage the LLM serving layer, including deploying and optimizing models for low-latency/high-throughput inference. Build and maintain model routing logic to select the most appropriate model (e.g., performance vs. cost) for a given task.
  • Implement evaluation and observability for all AI services. Create standardized frameworks for systematically evaluating the performance, accuracy, cost, and safety of LLMs and agentic workflows. Drive the implementation of robust, automated testing strategies for LLM-based systems.
  • Own the full development lifecycle for services in a production SaaS environment. This includes establishing automated code coverage goals, conducting rigorous code reviews, defining SLOs, participating in on-call rotations, and ensuring a fast and effective incident response process.
  • Work closely with the Lead Architect to translate the technical vision into implemented, production-grade services. Act as a key partner for the Solution Engineers (your internal customers) to understand their needs and abstract common patterns into reusable, robust platform components.
  • Stay up to date on the latest developments in the field, and mentor junior developers.

What We Are Looking For

  • Bachelor's or Master's degree in Computer Science or a related field, or equivalent practical experience.
  • 8+ years of professional experience in backend software engineering, platform engineering, or MLOps, with a proven track record of architecting and operating complex systems at scale.
  • 2+ years of hands-on experience building applications or platforms on top of AI/ML models or LLMs.
  • Expert-level proficiency in Python and a strong background in software architecture, robust API design, and building maintainable, well-documented SDKs for other developers.
  • Hands-on experience with Kubernetes (K8s) and building services on managed PaaS in a multi-cloud environment (AWS, Azure, GCP). Strong understanding of Infrastructure as Code (e.g., Terraform).
  • Proven experience building and operating production-grade SaaS software. Understanding of the full development life cycle, including CI/CD, monitoring, telemetry, and on-call incident response.
  • Practical experience with LLM platforms and orchestration frameworks (e.g., Amazon Bedrock, Vertex AI, Semantic Kernel, LangChain).
  • Strong verbal and written communication skills, with the ability to articulate complex technical designs and decisions clearly.

Bonus Skills

  • Hands-on experience deploying and managing LLMs in production using high-performance serving frameworks.
  • Experience with MLOps/LLMOps tools for tracing, monitoring, and evaluating LLM applications (LangSmith, Arize, Phoenix, or equivalent).
  • Experience with RAG Infrastructure, embedding generation pipelines, vector database integrations, and high-performance vector similarity search APIs.
Why choose Cognite? 🏆 🚀
* Join us in making a real and lasting impact in one of the most exciting and fastest-growing new software companies in the world.
* We have repeatedly demonstrated that digital transformation, when anchored on strong DataOps, drives business value and sustainability for clients and allows front-line workers, as well as domain experts, to make better decisions every single day.
* Built In 2024 Best Places to Work in Austin, TX and Houston, TX
A snapshot of our many perks and benefits as a Cogniter
* Competitive compensation
* 401(k) with employer matching
* Competitive health, dental, vision & disability coverage for employees and all dependents
* Unlimited PTO
* Paid Parental Leave Program
* Employee Referral Program
* Join a team of 60+ different nationalities 🌐 with Diversity, Equality and Inclusion (DEI) in focus 🤝.
* A highly modern and fun working environment with a sublime culture across the organization. Follow us on Instagram @cognitedata 📷 to learn more.
* Opportunity to work with and learn from some of the best people on some of the most ambitious projects found anywhere, across industries
* Join our HUB 🗣️ to be part of the conversation directly with Cogniters and our partners.
* Paid mobile phone and Wi-Fi
All candidates must be legally authorized to work in the United States without the need for current or future company sponsorship for employment visa status.
Equal Opportunity
Cognite is committed to creating a diverse and inclusive environment at work and is proud to be an equal opportunity employer. All qualified applicants will receive the same level of consideration for employment; everyone we hire will receive the same level of consideration for training, compensation, and promotion.
We ask for gender as part of our application because we want to ensure equal assessment in the recruitment process. Your answer will help us reach this commitment! However, the question about gender is optional and your choice not to answer will not affect the assessment of your application in any way.