
AI Platform Engineer, Applied AI

Remote


No Relocation

Posted: January 30, 2026

Job Description

About the role

The Applied AI team at Circle is at the forefront of leveraging AI to drive innovation for our customers through transformative applications of large language models (LLMs). We build AI-powered features for our customers — AI agents, AI workflows, and an AI copilot. As we scale these features to more customers, we're investing in robust infrastructure for measuring, diagnosing, and improving AI system speed and performance.

We're looking for a strong AI Platform Engineer to help establish the foundation for this infrastructure layer. This is a new focus area for us, and this is the first AI platform engineer hire at Circle, so you'll have significant ownership in shaping how we approach it. This role focuses on production AI systems — building infrastructure to measure and improve them — rather than ML research or model training. If you're excited about making AI systems work reliably at scale, this role is for you.

You'll build evaluation frameworks, observability tooling, and diagnostic infrastructure that tells us whether our AI systems are working well and helps identify where and how to improve them. When you find gaps, you'll prototype solutions, validate them with data, and work with the team to implement these improvements.

What you'll be doing

  • Build evaluation infrastructure to measure AI system speed and accuracy — both offline (during development) and online (in production).
  • Create observability tooling and dashboards that surface quality metrics week-over-week.
  • Diagnose quality gaps. When accuracy drops, trace whether retrieval, ranking, prompting, or something else is causing the issue.
  • Experiment with different models and agent configurations, using data to guide decisions.
  • Prototype and validate improvements to our RAG pipeline — chunking strategies, retrieval methods, re-ranking approaches.
  • Analyze how our customers are using our AI features to help us identify improvements or new areas for development.
  • Work closely with other engineers to give them confidence that their changes improve quality.
  • Help us stay up to date with cutting-edge AI research, techniques, and tools.

What you'll need to be successful

  • Experience building and evaluating AI systems in production, including RAG pipelines, search/retrieval systems, LLM-powered applications, and both offline and LLM-based evaluation frameworks.
  • Strong proficiency in Python for prototyping and experimentation.
  • Openness to learning Ruby on Rails — our production system is built in Rails, and you'll need to integrate with and instrument the existing codebase.
  • Comfortable building infrastructure and tooling (eval pipelines, dashboards, data processing).
  • Deep understanding of RAG architecture: chunking strategies, embeddings, retrieval, re-ranking, context management.
  • Strong experimentation mindset — you’re comfortable designing and running A/B tests, measuring results, and iterating quickly to discover what works and what doesn’t.
  • Strong data analysis skills — you can interpret results, identify patterns, and communicate findings clearly.
  • A desire to work in an environment that values fast iteration and individual autonomy, paired with personal accountability and effective collaboration as part of a dynamic team.
  • Comfort in a fast-paced environment with a degree of ambiguity, including picking up new technologies as projects require.
  • Strong alignment with our values; you can find them on our careers page if you haven't read them yet.
  • Proficiency in English (speaking, writing, and reading) at CEFR Level C2 / ILR Level 5.

Bonus Points

  • Experience with evaluation frameworks (Braintrust, LangSmith, or similar).

$130,000 - $140,000 USD per year

The cash compensation range shown is a starting point. In addition to equity, benefits, and perks, your cash compensation is subject to an annual review and may be increased once per year.
