Aryan Sahni · CS, UC Santa Cruz · Cum Laude

Building AI systems that think, speak, and remember.

Engineering real-time voice AI and agentic systems. Founding Engineer at Aura, incubated at South Park Commons. Previously Snowflake.

Selected projects · 2024 — 2026

Things I've built.

Voice-first AI companion app with cycle-phase awareness and persistent semantic memory. Architected the full real-time pipeline: Groq Whisper STT → Llama 3.3 70B → ElevenLabs Flash v2.5 TTS → LiveKit WebRTC, with phase-conditioned memory retrieval using pgvector composite scoring. Built at South Park Commons, SF.

  • React Native
  • Supabase
  • pgvector
  • Groq
  • ElevenLabs
  • LiveKit
  • Expo
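
For illustration, a minimal sketch of what phase-conditioned composite scoring could look like. The weights, the half-life, the `Memory` fields, and the phase labels are all assumptions made for this example, not the app's actual schema or formula; in a pgvector setup the similarity term would typically come from the database (for example via the `<=>` cosine-distance operator) rather than from Python.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list[float]
    phase: str       # hypothetical cycle-phase label stored with the memory
    age_days: float  # how long ago the memory was written


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def composite_score(query_emb, current_phase, mem,
                    w_sim=0.6, w_phase=0.25, w_recency=0.15,
                    half_life_days=30.0) -> float:
    """Blend semantic similarity, phase match, and recency (assumed weights)."""
    similarity = cosine(query_emb, mem.embedding)
    phase_match = 1.0 if mem.phase == current_phase else 0.0
    recency = 0.5 ** (mem.age_days / half_life_days)  # exponential decay
    return w_sim * similarity + w_phase * phase_match + w_recency * recency


def retrieve(query_emb, current_phase, memories, k=3):
    """Return the k memories with the highest composite score."""
    ranked = sorted(
        memories,
        key=lambda m: composite_score(query_emb, current_phase, m),
        reverse=True,
    )
    return ranked[:k]
```

With these assumed weights, two memories with identical embeddings are separated by phase: the one recorded in the user's current phase ranks first.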

8-agent ReAct system collapsing 30-minute outreach into 60 seconds. Resume gap analyzer, warm-path finder, and 0–100 response-rate scorer.

  • React
  • FastAPI
  • Groq
  • Apollo.io
  • Gmail API

Autonomous AI software factory — ideation through deployment on ECS Fargate. Six self-healing maintenance agents replace PagerDuty with semver-aware auto-merge.

  • Next.js
  • AWS ECS
  • Nemotron 120B
  • PostgreSQL
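
As a rough sketch of what a semver-aware auto-merge gate might check, the example below classifies a dependency bump and only approves low-risk ones. The function names, the policy (patch and minor bumps only, with 0.x minor bumps treated as breaking), and the `checks_passed` flag are assumptions for illustration, not the project's actual rules.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release and build tags are rejected
# by this sketch rather than guessed at.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse(version: str) -> tuple[int, int, int]:
    match = SEMVER.match(version.strip())
    if not match:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify an upgrade as 'major', 'minor', 'patch', or 'none'."""
    o, n = parse(old), parse(new)
    if n <= o:
        return "none"  # downgrade or no change
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def should_auto_merge(old: str, new: str, checks_passed: bool) -> bool:
    """Approve only passing patch/minor bumps (assumed policy)."""
    if not checks_passed:
        return False
    kind = bump_kind(old, new)
    if kind == "minor" and parse(old)[0] == 0:
        return False  # 0.x minor bumps may be breaking under semver
    return kind in {"patch", "minor"}
```

Under this policy, `1.4.2 → 1.4.3` with green checks merges automatically, while `1.4.2 → 2.0.0` is left for a human.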

Full-stack agent reviewing GitHub PRs in under 60s with line-level inline comments. Resilient inference layer with model failover and self-healing JSON parser.

  • Next.js
  • FastAPI
  • GitHub API
  • NVIDIA NIM
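
One way a forgiving JSON parser with model failover could be structured is sketched below. It is an assumption-laden illustration: the repair steps (unwrapping a fenced block, trimming to the outermost braces, dropping trailing commas) and the `review_with_failover` helper are hypothetical, and each "model" is just a callable that takes a prompt and returns text.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence, built indirectly to keep this block clean


def parse_model_json(raw: str):
    """Try the raw text first, then progressively more forgiving repairs."""
    candidates = [raw]

    # Repair 1: unwrap a fenced block if the model wrapped its answer in one.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, raw, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))

    # Repair 2: keep only the span between the first "{" and the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        candidates.append(raw[start:end + 1])

    for text in candidates:
        # Repair 3: retry with trailing commas removed before "}" or "]".
        # (Naive: it would also touch such a pattern inside a string value.)
        for attempt in (text, re.sub(r",\s*([}\]])", r"\1", text)):
            try:
                return json.loads(attempt)
            except json.JSONDecodeError:
                continue
    raise ValueError("no parseable JSON in model output")


def review_with_failover(prompt: str, models):
    """Call each model in order; return the first output that parses."""
    errors = []
    for call in models:
        try:
            return parse_model_json(call(prompt))
        except Exception as exc:  # network error, timeout, or bad JSON
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} models failed")
```

Because the unrepaired text is always tried first, well-formed responses pass through untouched and the repairs only apply to output that already failed to parse.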

Multi-agent stock analysis tool. Enter a ticker and 8 specialized agents — price, news, sentiment, technicals, fundamentals, bull case, bear case — stream live over SSE before a Portfolio Manager agent delivers a BUY/HOLD/SELL verdict with confidence score.

  • Python
  • Flask
  • React
  • Claude
  • Finnhub
  • SSE
  • SQLite
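
The streaming pattern described above can be sketched with a plain generator that emits Server-Sent Events frames. This is a simplified illustration under stated assumptions: the agent names are taken from the description, `run_agent` and `decide` are hypothetical callables standing in for the real agents and the Portfolio Manager, and agents run one after another here rather than concurrently.

```python
import json

# Agent names as listed in the project description; identifiers are assumed.
AGENTS = [
    "price", "news", "sentiment", "technicals",
    "fundamentals", "bull_case", "bear_case",
]


def sse_frame(event: str, payload: dict) -> str:
    """Format one SSE message: an event name, a data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_analysis(ticker: str, run_agent, decide):
    """Yield one frame per agent as it finishes, then a final verdict frame."""
    findings = {}
    for name in AGENTS:
        findings[name] = run_agent(name, ticker)  # hypothetical agent call
        yield sse_frame(name, findings[name])
    # The verdict is computed only after every agent has reported.
    yield sse_frame("verdict", decide(findings))
```

In a Flask app, a generator like this would typically be wrapped in a `Response` with the `text/event-stream` MIME type so the browser's `EventSource` can consume each frame as it arrives.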

Hand Gesture Music Control

2024

CV-driven playback control with 95% gesture accuracy and sub-200ms response. Open/closed-hand and thumb gestures drive play/pause and track switching, with timing cut from 1.2s to 850ms.

  • Python
  • OpenCV
  • MediaPipe
  • osascript

Real-Time Facial Detection

2024

Haar Cascade face detection reaching 95% accuracy and 50ms latency on images and live video. Preprocessing tuned for 30% lower latency under variable lighting.

  • Python
  • OpenCV
  • Haar Cascade
  • Computer Vision

Experience

Where I've worked.

Data Science & Gen AI Intern · Aug 2025 — Dec 2025

Snowflake

  • Built a production Snowpark pipeline processing 2,000+ database tables, re-architecting SQL into batched aggregations to collapse multi-hour runtimes.
  • Engineered an automated metadata classification system reaching 91.7% accuracy, reducing manual review from hours to minutes.
  • Partnered with engineers and stakeholders to translate business requirements into production-grade technical solutions.

Currently

Shipping Aura. Somewhere with good coffee.

SF · PT
  • Avg voice response latency: ~800ms
  • Current LLM: Llama 3.3 70B
  • Embedding model: text-embedding-3-small
  • Retrieval architecture: Hybrid RAG · pgvector

Toolkit

What I work with.

Languages

  • Python
  • TypeScript
  • JavaScript
  • SQL
  • Bash

AI & ML

  • LLM integration
  • RAG
  • pgvector
  • Multi-agent orchestration
  • Agentic system design
  • Groq
  • ElevenLabs
  • OpenAI API
  • Supabase

Data & Infrastructure

  • Snowflake
  • Snowpark
  • Docker
  • Vercel
  • Render
  • AWS ECS Fargate
  • GitHub Actions

Frontend & Mobile

  • React
  • React Native
  • Expo
  • Next.js
  • Tailwind CSS
  • Zustand
  • TanStack Query

Backend & APIs

  • FastAPI
  • Flask
  • REST APIs
  • Server-Sent Events
  • GitHub API
  • GraphQL
  • LiveKit
  • WebRTC
  • PostgreSQL

Daily Tools

  • Claude Code
  • Cursor
  • GitHub Copilot

Contact

Let's build something worth remembering.

Open to SWE and AI/ML roles. Building agents, voice systems, and the infrastructure between them.