Senior Backend Engineer (Python)

Expand and maintain our real-time voice pipeline: design, implement, and maintain Python micro-services for conversational AI orchestration—audio capture, streaming transcription, prompt/LLM logic, synthesis, and playback.
Integrate new providers & transports: add plug-ins for emerging ASR, TTS, LLM, and memory services; wire up WebRTC, SIP, or phone endpoints; build adapters that allow hot-swapping components without downtime. Build API endpoints.
Deliver ultra-low latency (<500 ms round-trip): profile async pipelines (asyncio, FastAPI, gRPC), optimize buffering, concurrency, and back-pressure handling.
Instrument & observe every hop: emit structured traces (OpenTelemetry), metrics, and logs for each pipeline stage; define SLOs for first-token latency, end-to-end latency, and streaming reliability.
Harden for production: implement graceful retries, idempotent message passing, circuit breakers, and HIPAA-compliant security (encryption in transit, per-tenant isolation, secrets rotation).
Collaborate cross-functionally with ML, product, data engineering, and client-SDK teams to deliver features such as voice cloning, multimodal hand-offs, and domain-specific memory retrieval.

4+ years building production back-ends in modern Python.
Proven experience with real-time streaming systems—WebRTC, WebSockets, or gRPC streaming—and proficiency with asyncio, FastAPI, or similar async frameworks.
Deep understanding of concurrency, buffering, audio codecs (Opus, PCM), and distributed tracing.
Solid understanding of AWS/GCP/Azure, including container orchestration (Kubernetes, EKS, GKE), message queues (Kafka, SQS, Pub/Sub), and IaC (Terraform).
Solid grasp of relational (PostgreSQL) and in-memory (Redis) data stores; able to model and persist conversational state.
Excellent communication skills and a bias for measured, observable, and continuously deployable software.

B.S./M.S. in CS, EE, or a related field.
Familiarity with voice-agent frameworks.
Hands-on experience with telephony (Twilio, Telnyx), SIP, or PSTN integrations.
Experience integrating multimodal inputs (vision, text chat) into voice agents.
Familiarity with GPU inference and streaming pipelines.
Prior work in regulated industries (healthcare, finance) and comfort preparing for SOC 2 / HIPAA audits.

As a health technology company, we reserve the right to run a background check on any applicant to whom we extend an offer and to re-perform any such check at any time during the course of employment. Please note that there is no set policy for rejecting candidates based on specific background check results, and we consider each candidate as a whole before making any decisions. We comply with all “ban the box” laws in applicable jurisdictions.

We offer competitive salaries and benefits, including 401(k) matching up to a specified percentage of your salary; health, vision, and dental insurance; and flexible paid time off. The typical salary range for this role is $180,000 to $260,000 USD. The amount offered will be determined by a variety of factors, including, but not limited to, your skills, qualifications, and past experience relative to the role’s requirements.

If you have a disability or require accommodations during the application or recruitment process, please contact careers@ellipsishealth.com.