Alpha.so
by AfterHour Inc.
AI Product Engineer
2025 — Present
Think of Alpha.so as a vibe trading friend — AI bots that users follow, chat with, and let manage their portfolios. Each bot develops its own personality and trading strategy over time. Started as a social network for traders, then pivoted into personal automated trading bots. I own the backend AI engineering — agent infrastructure, persistent memory, sandbox system, and the real-time data layer.
AI Agent Infrastructure
Multi-LLM Agent Orchestration
Core loop: Vercel AI SDK Tool Loop Agent running inside Temporal workflows. The model gets a task, skills, and a prompt that lets it load more skills on demand, and it decides when it has enough information to complete the task. Custom sandbox per bot: persistent virtual filesystem backed by S3 directory buckets, with tools that mimic real fs operations (list, read, write, delete) and virtual CLIs that route commands and return help text the model can iterate on. No processes are spawned; it is just function routing that looks like a terminal to the LLM. On top of that sits a skills system: composable tool+prompt bundles the bot loads depending on context, covering market analysis, portfolio management, news digestion, macro environment, and options flow.
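A minimal sketch of the "function routing that looks like a terminal" idea, under stated assumptions: the filesystem here is an in-memory Map standing in for the S3-backed store, and the names VirtualFs, VirtualCli, and createSandboxCli are hypothetical, not the production code. It shows how a command string can be dispatched to plain functions, with help text returned on unknown commands so the model has something to iterate on.

```typescript
// Hypothetical sketch: in-memory stand-in for the S3-backed virtual filesystem.
type ToolResult = { ok: boolean; output: string };

class VirtualFs {
  private files = new Map<string, string>();

  write(path: string, content: string): ToolResult {
    this.files.set(path, content);
    return { ok: true, output: `wrote ${content.length} bytes to ${path}` };
  }

  read(path: string): ToolResult {
    const content = this.files.get(path);
    return content === undefined
      ? { ok: false, output: `read: ${path}: no such file` }
      : { ok: true, output: content };
  }

  list(prefix: string): ToolResult {
    const names = Array.from(this.files.keys())
      .filter((p) => p.startsWith(prefix))
      .sort();
    return { ok: true, output: names.join("\n") };
  }

  delete(path: string): ToolResult {
    return this.files.delete(path)
      ? { ok: true, output: `deleted ${path}` }
      : { ok: false, output: `delete: ${path}: no such file` };
  }
}

type Handler = (args: string[]) => ToolResult;

// Routes a command line to a registered function. No process is ever spawned.
class VirtualCli {
  private commands = new Map<string, { usage: string; run: Handler }>();

  register(name: string, usage: string, run: Handler): void {
    this.commands.set(name, { usage, run });
  }

  help(): string {
    return Array.from(this.commands.values())
      .map((c) => c.usage)
      .join("\n");
  }

  exec(line: string): ToolResult {
    const parts = line.trim().split(/\s+/);
    const name = parts[0] ?? "";
    const args = parts.slice(1);
    const cmd = this.commands.get(name);
    if (cmd === undefined) {
      // Unknown commands return the help text so the caller can self-correct.
      return { ok: false, output: `unknown command: ${name}\n${this.help()}` };
    }
    if (args.includes("--help")) return { ok: true, output: cmd.usage };
    return cmd.run(args);
  }
}

function createSandboxCli(): VirtualCli {
  const fs = new VirtualFs();
  const cli = new VirtualCli();
  cli.register("list", "usage: list <prefix>", (a) => fs.list(a[0] ?? ""));
  cli.register("read", "usage: read <path>", (a) => fs.read(a[0] ?? ""));
  cli.register("write", "usage: write <path> <text...>", (a) =>
    fs.write(a[0] ?? "", a.slice(1).join(" ")),
  );
  cli.register("delete", "usage: delete <path>", (a) => fs.delete(a[0] ?? ""));
  return cli;
}

const sandbox = createSandboxCli();
sandbox.exec("write notes/aapl.md watch earnings on Thursday");
console.log(sandbox.exec("read notes/aapl.md").output);
console.log(sandbox.exec("trade").output);
```

In a setup like this, each registered command could be exposed to the agent as a single "run command" tool, which keeps the tool surface small while letting new virtual CLIs be added by registration alone.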
RAG & Hybrid Search
Three search dimensions: Meilisearch + Typesense for full-text hybrid search across financial data, Milvus for vector semantic similarity across documents, and FalkorDB for graph relationships between entities, sectors, and market signals. Exa.js for AI-native web search feeding into bot context.
Temporal Workflow Workers
Temporal harness runs everything. Workflows for agent execution, activities for LLM calls and trade ops, scheduling producers for recurring tasks like market open checks and rebalancing. Custom webpack bundling for workflow sandboxing. Workers auto-scale on EKS with custom controllers based on queue depth. Self-hosted Temporal via Pulumi IaC.
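Queue-depth scaling can be illustrated with a small sketch. This is one plausible policy, not the actual controller logic: the ScalingPolicy fields, the desiredWorkers function, and the numbers are all hypothetical. It sizes the worker pool so each worker handles roughly a target backlog, clamped between a minimum and a maximum.

```typescript
// Hypothetical policy knobs; none of these names or numbers come from the production controller.
interface ScalingPolicy {
  targetBacklogPerWorker: number; // tasks each worker is expected to absorb
  minWorkers: number;
  maxWorkers: number;
}

// Returns the replica count for a given task-queue backlog.
function desiredWorkers(queueDepth: number, policy: ScalingPolicy): number {
  const needed = Math.ceil(Math.max(0, queueDepth) / policy.targetBacklogPerWorker);
  return Math.min(policy.maxWorkers, Math.max(policy.minWorkers, needed));
}

const examplePolicy: ScalingPolicy = {
  targetBacklogPerWorker: 10,
  minWorkers: 2,
  maxWorkers: 20,
};

console.log(desiredWorkers(45, examplePolicy)); // 5
```

A real controller would likely add smoothing or a cooldown so short bursts do not cause replica churn.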
Observability & Evals
PostHog for AI tracing — session attribution, full tracking of prompts and outcomes. Evals running on captured traces with LLM-as-a-judge: checking tone, accuracy, and whether bots stay friendly for beginners who don't know what a stop-loss is. Prompt changes rolled out via feature flags with evals comparing control vs experiment groups. E2E tests with real LLM calls, evals as assertions, middleware that caches LLM responses when model + prompts remain unchanged.
Agent-Ready Test Infrastructure
Built the test harness that lets coding agents ship confidently. Full replica of production systems available for E2E runs, middleware that caches LLM calls and responses for deterministic replay when model + prompts remain unchanged. Agents can run tests autonomously, iterate on failures, and ship without human intervention on the testing side. Part of the job now is enabling agents to work — this infrastructure is what makes 10-15 PRs/day possible.
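The caching middleware can be sketched as a wrapper around the LLM call, assuming a simplified request shape. LlmRequest, LlmCall, and withReplayCache are hypothetical names, and the in-memory Map stands in for whatever store the real harness uses. The cache key is derived from the model and prompts only, so changing either one forces a fresh call.

```typescript
// Hypothetical request shape: just the fields that determine the cache key.
type LlmRequest = { model: string; system: string; prompt: string };
type LlmCall = (req: LlmRequest) => Promise<string>;

function cacheKey(req: LlmRequest): string {
  return JSON.stringify([req.model, req.system, req.prompt]);
}

// Wraps an LLM call so identical model + prompt inputs replay the stored response.
function withReplayCache(
  call: LlmCall,
  cache: Map<string, Promise<string>> = new Map(),
): LlmCall {
  return (req) => {
    const key = cacheKey(req);
    const cached = cache.get(key);
    if (cached !== undefined) return cached;

    const pending = call(req);
    cache.set(key, pending);
    // Drop failed calls so a transient error is not replayed on every run.
    pending.catch(() => cache.delete(key));
    return pending;
  };
}
```

Storing the promise rather than the resolved string also deduplicates identical requests that are in flight at the same time, which helps when tests run in parallel.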
Platform Infrastructure
Internet-Scale Chat
Matrix-compatible chat infrastructure built on Java 21 / Spring Boot WebFlux (reactive) with Red Planet Labs Rama, a distributed data processing platform. Designed for massive concurrent connections. Separate from the Node.js backend, purpose-built for chat at scale.
Real-Time Market Data
Go WebSocket proxy for Polygon.io market data feeds. Fastly Compute@Edge WASM proxy for WebSocket fanout (GRIP protocol). Dedicated market-data-ingress service with Redis pub/sub. MCP server integration with Polygon.io for AI agent access to live data. Aurora Limitless PostgreSQL (sharded) for paper brokerage.
Mobile & Brokerage
Expo 54 / React Native 0.81 app with Shopify Skia graphics, Victory Native charts, and MobX state management. Brokerage connectivity via Plaid, SnapTrade, and direct Alpaca API supporting 1,000+ integrations. RevenueCat subscriptions, PostHog + self-hosted RudderStack analytics, Datadog APM.
Architecture
pnpm monorepo. NestJS 11 backend (Prisma 7 + Drizzle ORM). Heavier workloads on EKS with Pulumi IaC — Temporal workers auto-scale with custom controllers based on queue depth. Aurora Limitless PostgreSQL (sharded) with pgvector for AI embeddings, PgBouncer connection pooling. Redis for caching and BullMQ job queues. Milvus and FalkorDB for vector and graph storage.