2025-10-02

For years I’ve used InfluxDB as my main database for logging all kinds of time-based data. It’s reliable, lightweight, and well-suited for storing metrics or events. One of the more personal data streams I collect is my own location history, captured through OwnTracks on Android. The app continuously sends geo-coordinates to my server, where they’re stored in InfluxDB.

Until now, my usage was simple: I would pull up a map view of a single day or a couple of weeks and see where I had been. It was more about having a “quantified self” archive than about running complex analysis. Recently, though, I started asking: what’s possible if I treat this data more seriously?


Setting up Geo-Temporal Data in InfluxDB

OwnTracks logs latitude/longitude points every few minutes. In my case:

  • Around 91,000 points per year (every 6 minutes).
  • A single day of data is easy to visualize, but over longer time spans, performance matters.
  • I wanted to test whether InfluxDB’s S2 index could make queries faster when filtering spatially. Thanks, Matthias, for the hint :)

The S2 index is a way of mapping latitude/longitude coordinates onto a hierarchical cell system that’s easier to query than raw floats. Instead of searching through raw coordinates, InfluxDB can quickly filter based on cell IDs.
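
To make this concrete, here is a minimal sketch of how a point could be written with a precomputed cell ID, using the Python s2sphere and influxdb-client packages. The bucket, measurement, and field names are placeholders, not my actual OwnTracks schema.

```python
# Sketch: store one location point with a precomputed S2 cell token as a tag.
# Assumes `pip install s2sphere influxdb-client`; names below are illustrative.
import s2sphere
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

lat, lon = 52.5200, 13.4050  # example coordinates

# Map the coordinate to an S2 cell: start from the leaf cell (level 30)
# and truncate to the level used for indexing.
leaf = s2sphere.CellId.from_lat_lng(s2sphere.LatLng.from_degrees(lat, lon))
cell = leaf.parent(10)        # level-10 cell covering this point
token = cell.to_token()       # compact hex string identifying the cell

point = (
    Point("location")             # hypothetical measurement name
    .tag("s2_cell_id", token)     # precomputed cell ID stored as a tag
    .field("lat", lat)
    .field("lon", lon)
)

with InfluxDBClient(url="http://localhost:8086", token="my-token", org="home") as client:
    client.write_api(write_options=SYNCHRONOUS).write(bucket="owntracks", record=point)
```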


Performance Benchmark

I compared two setups:

  • With s2_cell_id present: data stored with precomputed S2 cell IDs.
  • On the fly: InfluxDB computes S2 cells during query time.

Dataset: one year of data (~91k points).
Typical filter returned ~2,300 points (≈2.5%).
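
The two variants roughly correspond to the following query patterns. This is only a sketch of the idea, using the Flux experimental geo package and placeholder bucket, measurement, and region values, not my exact benchmark code.

```python
# Sketch: time both query variants with the Python influxdb-client.
import time
from influxdb_client import InfluxDBClient

# Variant 1: s2_cell_id was written as a tag at ingest time,
# so geo.filterRows() can use it directly.
FLUX_PRECOMPUTED = '''
import "experimental/geo"

from(bucket: "owntracks")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "location")
  |> geo.filterRows(region: {lat: 52.52, lon: 13.405, radius: 5.0}, strict: true)
'''

# Variant 2: no precomputed cell IDs; geo.shapeData() derives s2_cell_id
# at query time before the same region filter runs.
FLUX_ON_THE_FLY = '''
import "experimental/geo"

from(bucket: "owntracks")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "location")
  |> geo.shapeData(latField: "lat", lonField: "lon", level: 10)
  |> geo.filterRows(region: {lat: 52.52, lon: 13.405, radius: 5.0}, strict: true)
'''

def time_query(client: InfluxDBClient, flux: str) -> float:
    """Run a Flux query and return the wall-clock time in seconds."""
    start = time.perf_counter()
    client.query_api().query(flux)
    return time.perf_counter() - start

with InfluxDBClient(url="http://localhost:8086", token="my-token", org="home") as client:
    for name, flux in [("with S2 index", FLUX_PRECOMPUTED), ("on the fly", FLUX_ON_THE_FLY)]:
        print(f"{name}: {time_query(client, flux):.2f} s")
```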

Results

Query type              With S2 index   On the fly
Raw data (wide form)    7.18 s          13.1 s
Filter @ level 10       7.49 s          20.48 s
Filter @ level 11       8.28 s          21.37 s
Filter @ level 12       11.16 s         24.05 s

Observations

  • Even with S2 indexing, base query cost dominates: fetching raw data is the main bottleneck.
  • The index does help: filtering is roughly 2–3× faster than computing cells on the fly.
  • Increasing resolution (higher S2 levels) increases query time, so there’s a trade-off between spatial precision and performance.

A Short Note on S2 Geometry

S2 is a geometry library originally developed at Google. Its core idea is to represent the Earth’s surface by projecting it onto the faces of a cube; each face is then recursively subdivided into cells, which are enumerated along a space-filling Hilbert curve (see the S2 Geometry project).

  • Each S2 cell is identified by a unique integer ID.
  • Cells can be subdivided hierarchically, so you can choose different levels of resolution (coarse regions vs. fine-grained areas).
  • This makes spatial queries (like “points within this area”) much faster than comparing raw lat/long values.

In short: S2 turns messy geo data into something databases can index efficiently.
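
A tiny sketch of the hierarchy in Python (again assuming the s2sphere package; the coordinate is arbitrary):

```python
# Sketch: one coordinate, three levels of the S2 hierarchy.
import s2sphere

leaf = s2sphere.CellId.from_lat_lng(s2sphere.LatLng.from_degrees(52.5200, 13.4050))

for level in (10, 11, 12):
    cell = leaf.parent(level)  # coarser ancestor cell at this level
    print(level, cell.to_token(), cell.contains(leaf))
    # Average cell area shrinks by ~4x per level:
    # roughly 81 km^2 at level 10, 20 km^2 at level 11, 5 km^2 at level 12.
```

A region query then only needs to compare cell IDs (or ID ranges) instead of doing floating-point comparisons on every stored coordinate.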


And now

So far, my experiments confirm that S2 indexing is useful for speeding up geo-temporal queries in InfluxDB, but the raw data size and query design still matter a lot.

For my personal use case — reviewing trips, walks, and places I’ve been — the current performance is good enough. But this exploration opens doors:

  • Building aggregate views (e.g. heatmaps of most-visited areas); a first query sketch follows after this list.
  • Running long-term analyses (seasonal mobility, time spent in places).
  • Integrating with visualization tools beyond simple day-by-day maps.
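
As a first step towards such aggregate views, here is a sketch that counts points per S2 cell, which would be the basis of a heatmap. It assumes the precomputed s2_cell_id tag from above and the same placeholder names.

```python
# Sketch: count points per S2 cell and convert cell tokens back to coordinates
# for plotting. Assumes the precomputed s2_cell_id tag and placeholder names.
import s2sphere
from influxdb_client import InfluxDBClient

FLUX_HEATMAP = '''
from(bucket: "owntracks")
  |> range(start: -1y)
  |> filter(fn: (r) => r._measurement == "location" and r._field == "lat")
  |> group(columns: ["s2_cell_id"])
  |> count()
'''

with InfluxDBClient(url="http://localhost:8086", token="my-token", org="home") as client:
    tables = client.query_api().query(FLUX_HEATMAP)

for table in tables:
    for record in table.records:
        token = record.values["s2_cell_id"]   # tag value survives the grouping
        count = record.get_value()            # number of points in this cell
        center = s2sphere.CellId.from_token(token).to_lat_lng()
        print(f"{center.lat().degrees:.4f}, {center.lng().degrees:.4f}: {count}")
```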

InfluxDB, OwnTracks, and S2 indexing give me an okayish base. The next step is to think less like a log collector and more like a geospatial analyst. Well, that last sentence I would not have written myself; otherwise, I am fine with creating this blog post with the help of ChatGPT.