Trillion

Data Engineering Manager

Remote · Senior Level

Job Description

Remote (Australasia Time Zone)

Role Summary

We’re hiring a hands-on Data Engineering Manager to build, lead, and actively contribute to Trillion’s AdTech analytics and reporting platform. This role owns the end-to-end data pipeline—from event and log publishing, through streaming transport (Kafka), into analytical storage, rollups, and data warehousing (ClickHouse and related systems).

This is a deeply technical, player-coach role. You will write production code, design schemas, review queries, and build pipelines alongside the team. While the role may become less day-to-day hands-on as the data organization scales, it is never hands-off—technical ownership and architectural involvement are core expectations.

You will be accountable for data correctness, latency, and monetization attribution across an ad-driven platform, supporting metrics such as impressions, clicks, revenue, RPM/CPM, and downstream reporting used by both internal teams and customer-facing products.

Key responsibilities

Technical leadership & hands-on development

Own the architecture and implementation of the AdTech data platform, remaining directly involved in coding and technical decision-making.

Design, implement, and operate ETL/ELT pipelines ingesting high-volume ad and traffic events from real-time streams and legacy sources.

Build and maintain a high-granularity canonical event model (single source of truth) and a set of optimized roll-up tables for analytics, BI, and application use (a rough ClickHouse sketch follows this list).

Architect and implement minute-level and near-real-time aggregation pipelines, including deterministic reprocessing and backfills for late, duplicated, or corrected data.

Actively write and review production code for pipelines, rollups, schema migrations, and operational tooling.

Define standards for schema design, partitioning, clustering, and performance tuning in ClickHouse and related analytical systems.

Review application logging and instrumentation; design and implement new or parallel event publishing where existing logs do not meet reporting or attribution needs.
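
To make the shape of this work concrete, here is a minimal ClickHouse sketch of a canonical event table feeding a minute-level roll-up, with a deterministic partition-level backfill. All table, column, and partition names are hypothetical illustrations, not Trillion's actual schema.

    -- Hypothetical canonical event table: one row per raw ad event.
    CREATE TABLE ad_events
    (
        event_time   DateTime,
        event_type   LowCardinality(String),  -- 'impression' or 'click'
        placement_id UInt64,
        campaign_id  UInt64,
        revenue_usd  Decimal(18, 6)
    )
    ENGINE = MergeTree
    PARTITION BY toYYYYMMDD(event_time)
    ORDER BY (placement_id, campaign_id, event_time);

    -- Minute-level roll-up; SummingMergeTree collapses rows sharing the same key.
    CREATE TABLE ad_events_1m
    (
        minute       DateTime,
        placement_id UInt64,
        campaign_id  UInt64,
        impressions  UInt64,
        clicks       UInt64,
        revenue_usd  Decimal(18, 6)
    )
    ENGINE = SummingMergeTree
    PARTITION BY toYYYYMMDD(minute)
    ORDER BY (placement_id, campaign_id, minute);

    -- Materialized view keeps the roll-up in sync as events arrive.
    CREATE MATERIALIZED VIEW ad_events_1m_mv TO ad_events_1m AS
    SELECT
        toStartOfMinute(event_time) AS minute,
        placement_id,
        campaign_id,
        countIf(event_type = 'impression') AS impressions,
        countIf(event_type = 'click')      AS clicks,
        sum(revenue_usd)                   AS revenue_usd
    FROM ad_events
    WHERE event_type IN ('impression', 'click')
    GROUP BY minute, placement_id, campaign_id;

    -- Deterministic reprocessing: drop the affected day's partition and
    -- rebuild it from the canonical table, so backfills are idempotent.
    -- (Date below is an arbitrary example.)
    ALTER TABLE ad_events_1m DROP PARTITION '20260115';
    INSERT INTO ad_events_1m
    SELECT
        toStartOfMinute(event_time) AS minute,
        placement_id,
        campaign_id,
        countIf(event_type = 'impression'),
        countIf(event_type = 'click'),
        sum(revenue_usd)
    FROM ad_events
    WHERE toYYYYMMDD(event_time) = 20260115
      AND event_type IN ('impression', 'click')
    GROUP BY minute, placement_id, campaign_id;

In practice, the engine choices (e.g. ReplacingMergeTree for deduplication, AggregatingMergeTree for non-additive metrics) and the partition grain are exactly the kinds of trade-offs this role would own.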

Team leadership & execution

Lead and mentor the current data engineers while scaling the team over time.

Serve as the primary technical multiplier: unblock work, set patterns, and raise the technical bar through hands-on examples.

Translate business and reporting requirements into clear technical designs, specs, and execution plans.

Own delivery timelines, operational stability, and quality of data outputs.

Establish best practices for testing, monitoring, documentation, SLAs, and data quality validation.

Lead investigation and resolution of data discrepancies impacting revenue, optimization, or trust in reporting.

Collaboration & stakeholder management

Partner closely with Product, Engineering, Analytics, and Leadership to define monetization and reporting requirements.

Perform gap analysis for current and future reporting needs; propose phased, pragmatic solutions.

Collaborate on BI and semantic models (Metabase, Looker, Tableau, Power BI) to ensure metrics are consistent, explainable, and performant.

Clearly communicate technical trade-offs, risks, and progress to technical and non-technical stakeholders.

Must-have skills & experience

6+ years building and operating production data systems handling high-volume event and AdTech data.

Experience leading or mentoring engineers while remaining hands-on in production systems.

Strong SQL skills and deep understanding of analytical and time-series query patterns.

Hands-on experience with columnar/OLAP databases (ClickHouse strongly preferred).

Practical experience with streaming platforms (Kafka or equivalent) and near-real-time processing.

Strong familiarity with AdTech monetization metrics, including impressions, clicks, revenue attribution, CPM, RPM, CTR, and common failure modes (see the sketch after this list).

Solid understanding of data modeling, partitioning, sharding, and performance trade-offs in distributed systems.

Experience shipping pipelines with CI/CD, orchestration, and safe backfill/reprocessing workflows.

Ability to read and understand application code in order to improve data schemas and instrumentation.

Strong debugging instincts and excellent written and verbal communication.
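
For reference, the metric arithmetic itself is standard: CTR = clicks / impressions, effective CPM = 1000 × revenue / impressions, and RPM = 1000 × revenue / pageviews (or requests, depending on the denominator the business uses). Continuing the hypothetical roll-up sketched above, a reporting query might look like:

    -- Hypothetical daily reporting query over the minute-level roll-up.
    SELECT
        campaign_id,
        sum(impressions)                           AS impressions,
        sum(clicks)                                AS clicks,
        sum(revenue_usd)                           AS revenue,
        sum(clicks) / sum(impressions)             AS ctr,   -- click-through rate
        1000 * sum(revenue_usd) / sum(impressions) AS ecpm   -- effective CPM
    FROM ad_events_1m
    WHERE minute >= now() - INTERVAL 1 DAY
    GROUP BY campaign_id
    ORDER BY revenue DESC;

A production version must also handle the common failure modes the role calls out: zero denominators, duplicate or late events, and reprocessed data changing yesterday's numbers.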

Nice-to-have

Deep ClickHouse cluster operations and performance tuning experience.

Experience modernizing or replacing legacy analytics and reporting pipelines.

Experience with real-time stream processors (Flink, Spark Structured Streaming).

Data science or machine learning experience, particularly as it relates to optimization algorithms, experimentation, forecasting, or monetization performance.

Familiarity with data observability, SLAs, and data quality frameworks.

Experience working in high-scale monetization or traffic-driven platforms.

What success looks like in the first 3–6 months

Completed a comprehensive audit of existing data pipelines, event logs, and legacy reporting systems, identifying gaps, correctness risks, and scalability issues.

Performed a gap analysis against current and future ad-tech reporting requirements, including attribution accuracy and latency expectations.

Produced a clear, phased technical roadmap for achieving accurate, scalable, and trusted reporting.

Designed and implemented new parallel event logging and publishing pipelines containing the required fields and dimensions for efficient, correct roll-ups.

Delivered at least one end-to-end reporting pipeline from raw events to BI-ready tables, with documentation, monitoring, and reprocessing support.

Established technical standards and execution rhythm that the data team consistently follows.

Benefits and Culture

Flexible remote work with optional in-office days.

Fast-moving, collaborative team; strong focus on growth and retention.

Health coverage and 401(k) match.