Amaan Shaikh

@amaans

Your Database Is Downstream of Your Product

Submitted Jun 24, 2026

Session description

For years, our analytics platform served a workflow nobody complained about. We build planning software for out-of-home advertising: the billboards and digital screens you pass on roads, in transit, and in malls. Our users are media planners: they assemble a campaign, kick off an insights run, wait, and review. Behind that sat Postgres for analytics, MongoDB and Redis for proximity search, and a layer of services stitching results across the seams. Some queries took 22 seconds, some took eight minutes, and nobody minded, because nobody was watching them run. Then the product changed shape, for a reason that had nothing to do with database performance. Planners had started doing their real work in an external tool with an interactive map where filters update as you drag them, returning to us only to push execute.

So we reimagined it as a conversational, AI-driven planner: describe what you want in natural language and watch a live map, KPI panel, and data grid update together. That surfaced three forces: sub-2-second responses with many planners querying at once; proximity filtering (“billboards near any of our cafés”) as the default way to plan; and an AI agent that now writes its own queries against the data. The fix wasn’t another cache layer or a bigger cluster. It was rebuilding around ClickHouse, a single columnar engine fast enough on raw data that you don’t need to anticipate the next question, with spatial indexing pushed into that same engine so proximity never has to leave the database. This talk is the postmortem: what the product change demanded of the data layer, what we tried that didn’t work, and the patterns that travel well beyond ad-tech.
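
To make the proximity question concrete, here is a minimal Python sketch of what “billboards near any of our cafés” means as a filter. It is a brute-force great-circle (haversine) check, not the implementation the talk describes; the names `haversine_m` and `near_any`, the sample coordinates, and the 500 m radius are all assumptions for illustration. In a setup like the one above, the same predicate would be evaluated inside the database rather than in application code.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def near_any(billboards, anchors, radius_m):
    """Return ids of billboards within radius_m of at least one anchor point.

    billboards: iterable of (id, lat, lon); anchors: iterable of (lat, lon).
    Brute force: every billboard is compared against every anchor.
    """
    return [
        b_id
        for b_id, lat, lon in billboards
        if any(
            haversine_m(lat, lon, a_lat, a_lon) <= radius_m
            for a_lat, a_lon in anchors
        )
    ]


# Hypothetical sample data: three billboards and two café locations.
billboards = [
    ("bb-1", 12.9716, 77.5946),
    ("bb-2", 12.9352, 77.6245),
    ("bb-3", 13.0827, 80.2707),
]
cafes = [(12.9719, 77.5950), (12.9340, 77.6250)]

result = near_any(billboards, cafes, 500)
```

The brute-force version costs one distance calculation per billboard-anchor pair, which is exactly the work a spatial index exists to avoid.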

Takeaways

  • Your database choice is downstream of your product’s interaction model, not its latency numbers. We didn’t rebuild on ClickHouse for raw performance; we rebuilt because the product became an interactive, conversational map and the old assemble-wait-review workflow no longer existed. Re-evaluate your data layer when the interaction model changes, not when latency complaints arrive.
  • A columnar engine fast on raw data buys you the questions you didn’t anticipate. Caches and pre-aggregation optimise for known queries; when an AI agent (or a user dragging filters) can ask anything, raw speed beats anticipation.

Who is this for

Backend, data, and platform engineers who own a database in production; anyone caught in the “tune what we have vs rebuild” debate; and teams building interactive or AI-driven data products such as real-time dashboards, conversational analytics, and geospatial search. Most useful if you’re comfortable with databases and SQL in production and have hit a performance wall at least once.

Bio

Amaan is a software engineer with four years of experience. He started on the MERN stack building interactive systems for business operations, then moved into large-scale data applications, designing, optimising, and operating data workflows on Spark, Airflow, Scala, and Python. He currently works on data infrastructure for an ad-tech analytics platform.

Hosted by

Jumpstart better data engineering and AI futures