Senior AI Engineer

ethoslife • Remote US

No Relocation

Posted: January 22, 2026

Job Description

About the Role

We’re building several LLM-powered copilots across critical workflows (e.g., underwriting productivity, agent enablement, customer support, operations/compliance, fraud). We need an AI engineer to own the LLM + retrieval + context layer that makes these copilots accurate, auditable, fast, and cost-efficient.

Typical stack: Python/FastAPI, Postgres + vector (pgvector/Pinecone/Weaviate), OpenSearch, optional graph DB, Kubernetes + GPUs, OTEL/Datadog

Duties and Responsibilities:

Production RAG: indexing, retrieval, hybrid search, reranking, query rewriting, grounding, citations
Context Graph: entity resolution + linking + provenance; graph + vector retrieval; supports multi-hop context
LLM orchestration: tool/function calling, structured outputs, routing across model tiers, failure modes
GPU/inference cost optimization: batching, caching/KV reuse, quantization, autoscaling; optimize $/session + latency
Safety + compliance: PII/PHI handling, redaction, audit logs, deterministic replay, hallucination mitigation
LLMOps: eval harness (golden sets, regression, adversarial), monitoring for quality/cost/drift
Design/ship the end-to-end pipeline: retrieve → assemble context → generate → cite → log/monitor
Improve quality and trust via evaluation, feedback loops, and clear evidence-backed outputs
Partner with product, security, and domain teams; write crisp design docs; raise the engineering bar
Ship RAG v1 with citations + measurable quality metrics
Deliver Context Graph v1 that improves retrieval on real copilot tasks
Reduce cost/latency with a concrete inference optimization plan shipped to prod

Qualifications and Skills:

7+ years building production systems; 2+ years hands-on LLMs/RAG
Proven RAG experience (embeddings, vector DBs, hybrid search, reranking, eval)
Strong backend/distributed systems + observability
Track record shipping in high-stakes environments with auditability/correctness
Knowledge graph / entity resolution / provenance systems
GPU inference optimization (vLLM/TGI/TensorRT-LLM, quantization AWQ/GPTQ, batching)
Regulated domain experience (insurance/fintech/healthcare)

The US national base salary range for this full-time position is $146,000 - $236,000. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

Please note that the compensation details listed in US role postings reflect the base salary only and do not include applicable bonus, equity, or benefits.

You can find further details of our US benefits at https://www.ethoslife.com/careers/
