Knowledge Graph Pipelines & AI-Native Team Enablement
Client: Helix Analytics
Helix Analytics needed a production-grade ingestion and knowledge graph layer to power their next analytics product, and they needed their internal engineering team to keep shipping after we left. We delivered both — a Neo4j-backed pipeline running on Airflow and dbt in 6 weeks, and a hands-on enablement program that made AI coding agents a default part of how their team writes, reviews, and ships code.
The Challenge
Helix's roadmap depended on a knowledge-graph data model that linked customer entities, accounts, and product usage signals — but their existing ingestion was a brittle mix of cron jobs and manual exports, with no graph layer at all.
The core engineering team was small (4 engineers) and already over-allocated. Hiring more was off the table for at least two quarters, so any new system also had to make the existing team faster, not just hand them more surface area to maintain.
Leadership had been experimenting with AI coding tools individually, but adoption was inconsistent — agents were used for autocomplete-style work and abandoned for anything non-trivial. They wanted a real workflow, not a license bump.
Our Approach
We ran the build and the enablement program in parallel tracks, with the enablement piece overlapping the tail of the build so Helix's engineers were learning on the system they were about to own.
Discovery & Graph Modeling
Mapped source systems, defined entities and relationships for the knowledge graph, and agreed on the slice of the model we'd ship in v1.
Pipeline Build
Stood up Airflow DAGs for ingestion, dbt models for transformation, and a loader that materialized the graph into Neo4j. Backfilled 18 months of historical data.
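The shape of that pipeline can be sketched as a three-stage task chain. This is an illustrative pure-Python stand-in, not the actual Airflow DAG: the function names and toy data shape are hypothetical, and in production each stage is an Airflow task (with the transform stage shelling out to `dbt run`).

```python
# Stand-in for the ingestion DAG's task chain
# (extract >> transform >> load_graph in Airflow terms).

def extract(batch: dict) -> list[dict]:
    # Pull raw rows from a source system (stubbed with the batch payload).
    return [{"account_id": i, "usage": i * 10} for i in batch["rows"]]

def transform(rows: list[dict]) -> list[dict]:
    # In the real pipeline this stage is a `dbt run`; here we derive
    # one field to show the shape of a transformation.
    return [
        {**row, "usage_bucket": "high" if row["usage"] >= 20 else "low"}
        for row in rows
    ]

def load_graph(rows: list[dict]) -> int:
    # The production loader writes nodes and relationships to Neo4j;
    # this stub reports how many rows it would have loaded.
    return len(rows)

def run_pipeline(batch: dict) -> int:
    # Airflow enforces this ordering via task dependencies.
    return load_graph(transform(extract(batch)))
```

The same chain applies to backfills: the 18-month historical load ran the identical stages over date-partitioned batches.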
Graph Layer & API
Wrapped Neo4j with a thin internal query API the product team could consume, including caching and basic guardrails on traversal depth.
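A minimal sketch of the traversal-depth guardrail, assuming a Cypher neighborhood query; the label `Entity`, the depth cap, and the cache shape are illustrative, not Helix's actual API.

```python
from functools import lru_cache

MAX_DEPTH = 3  # guardrail: cap variable-length traversals

def neighborhood_query(depth: int) -> str:
    # Clamp the requested depth so a single request can't fan out
    # across the whole graph; the id is bound as a driver parameter.
    bounded = max(1, min(depth, MAX_DEPTH))
    return (
        f"MATCH (e:Entity {{id: $entity_id}})-[*1..{bounded}]-(n) "
        "RETURN DISTINCT n LIMIT 100"
    )

@lru_cache(maxsize=1024)
def cached_neighborhood(entity_id: str, depth: int) -> str:
    # Read-through cache keyed on (entity_id, depth). The real service
    # caches result payloads; here we cache the generated query text.
    return neighborhood_query(depth)
```

Capping the variable-length pattern (`[*1..3]`) in the query itself means a misbehaving consumer degrades to a bounded traversal rather than a graph-wide scan.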
AI Enablement
Worked side-by-side with Helix's engineers on real tickets using AI coding agents. Built a small library of repo-specific skills (codegen patterns, dbt model scaffolding, PR review checks) tuned to their stack.
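A dbt model scaffolding skill can be as small as a template function the agent calls with the table and columns it inferred from the ticket. This sketch uses a generic staging-model layout, not Helix's actual template:

```python
def scaffold_staging_model(source_name: str, table: str, columns: list[str]) -> str:
    # Emit a staging-model skeleton in a typical dbt house style.
    select_list = ",\n    ".join(columns)
    return (
        "with source as (\n"
        f"    select * from {{{{ source('{source_name}', '{table}') }}}}\n"
        ")\n"
        "\n"
        "select\n"
        f"    {select_list}\n"
        "from source\n"
    )
```

The generated SQL would land in something like `models/staging/stg_<table>.sql`, leaving the engineer to review column selection and add tests rather than type boilerplate.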
The Solution
Pipelines run on Airflow with dbt handling transformations; the loader keeps Neo4j in sync incrementally, with a Postgres metadata store tracking lineage and run health. The graph API gives the product team a single, well-bounded entry point so consumers don't need to learn Cypher.
The enablement piece is the multiplier. Helix's engineers now use AI agents with shared, repo-specific skills as part of the default workflow — scaffolding new dbt models, drafting PR descriptions, generating tests, and handling routine refactors. Skills are versioned in their repo so the workflow improves as the codebase does.
Tech Stack
Airflow, dbt, Neo4j, Postgres, Python
Architecture
Airflow on a managed cluster orchestrates ingestion; dbt handles SQL transformations against Postgres; the loader pushes deltas into Neo4j on each run. The graph API is a small Python service with read-through caching. AI agent skills live alongside the codebase under a `skills/` directory and are reviewed like any other code.
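Two ideas make the incremental sync safe to re-run: a high-watermark filter so only changed rows move, and idempotent `MERGE` upserts on the Neo4j side. A sketch, with hypothetical field names (`updated_at`, `synced_at`) standing in for the real schema:

```python
def pending_deltas(rows: list[dict], last_watermark: str) -> list[dict]:
    # High-watermark filter: only rows touched since the last
    # successful run are pushed to Neo4j on this cycle.
    return [r for r in rows if r["updated_at"] > last_watermark]

def merge_statement(label: str) -> str:
    # Idempotent upsert: MERGE matches on the node key, then overlays
    # properties, so replaying a batch never duplicates nodes.
    return (
        f"MERGE (n:{label} {{id: $id}}) "
        "SET n += $props, n.synced_at = $run_ts"
    )
```

Because `MERGE` keys on the node id, a failed run can simply be replayed from the last watermark recorded in the metadata store.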
Results
From kickoff to first product feature consuming the graph.
Measured by tickets shipped per engineer per sprint, sustained over the following quarter.
Avoided contractor spend and deferred hiring, thanks to the team's velocity gains.
Through AI-assisted scaffolding, test generation, and PR review.
“The pipelines were table stakes — what changed our quarter was the way our team works now. AI agents went from a toy on someone's laptop to a normal part of how we ship.”
Want similar results?
Let's discuss how we can apply the same engineering rigor to your project.
Start a Conversation