Lead Data Engineer (Autonomy)

  • Full-time

Company Description

About Grab and Our Workplace

Grab is Southeast Asia's leading superapp. From getting your favourite meals delivered to helping you manage your finances and getting around town hassle-free, we've got your back. At Grab, purpose gives us joy and habits build excellence, and we harness the power of technology and AI to deliver on our mission of driving Southeast Asia forward by economically empowering everyone, with heart, hunger, honour, and humility.

Job Description

Get to Know the Team

You will join a team building production robotics and autonomy systems for urban environments across Southeast Asia. We develop perception, planning, and control capabilities step by step, using safety evidence to guide each release. The team focuses on robust systems, quality code, and technical depth maintained in-house. You'll work with a senior engineering group that values clean interfaces and reproducible development workflows.

Get to Know the Role

As a Lead Data Engineer, you'll report to the Head of Engineering and be based at the Grab One North office in Singapore. You'll lead the data pipeline that transforms raw vehicle logs into training-ready datasets for autonomy and simulation engineers. You'll build systems to ingest, validate, transform, and version multimodal data from cameras, lidar, radar, and vehicle telemetry. Your work enables machine learning engineers to train models reproducibly and at scale.

The Critical Tasks You Will Perform

  • You'll design and maintain ingestion pipelines that transfer vehicle log data from onboard storage to AWS, including coordination of physical SSD offload, upload tooling, and data integrity verification.
  • You'll build automated workflows to validate, synchronize, and transform raw multimodal logs into structured training datasets, detecting corruption and ensuring schema consistency throughout the pipeline.
  • You'll develop systems for dataset versioning and lineage tracking that allow ML engineers to reproduce training runs and trace model inputs back to specific vehicle logs.
  • You'll collaborate with perception, planning, and simulation engineers to define data schemas, synchronization requirements, and quality gates that determine when data is ready for training.
  • You'll maintain AWS-based storage and compute infrastructure for petabyte-scale autonomy datasets, implementing monitoring and fault-tolerant recovery mechanisms for pipeline failures.
  • You'll optimize data pipelines for throughput and cost efficiency, implementing storage tiering, compression strategies, and compute scaling policies to manage infrastructure expenses.

Qualifications

The Essential Skills You Will Need

You have:

  • At least 7 years of experience building data pipelines that handle binary or time-series data from hardware sensors, including experience with edge-to-cloud data transfer and integrity checking mechanisms.
  • Experience with distributed data processing frameworks (e.g., Apache Spark, Ray, or Dask) and implementing data quality checks for multimodal datasets that include camera, lidar, or telemetry data.
  • Experience with data versioning tools (e.g., DVC, Pachyderm, or Git LFS) and ML data formats (e.g., TFRecord, Parquet, or Arrow), including the ability to track dataset lineage from raw logs to training splits.
  • Experience designing data contracts using serialization formats (e.g., Avro, Protobuf, or Parquet schemas) and collaborating with ML engineers to specify training data requirements.
  • Experience with AWS data services (S3, EC2, EMR, or Glue), infrastructure-as-code tools (e.g., Terraform or CloudFormation), and implementing monitoring dashboards and alerting for data pipeline health.
  • Experience implementing data compression algorithms, storage lifecycle policies, and auto-scaling compute resources to manage petabyte-scale data processing within budget constraints.
  • A degree in Computer Science, Data Engineering, or an equivalent field that grounds you in the distributed systems and data architecture principles this role requires.

Additional Information

Life at Grab

We care about your well-being at Grab. Here are some of the global benefits we offer:

  • We have your back with Term Life Insurance and comprehensive Medical Insurance.
  • With GrabFlex, create a benefits package that suits your needs and aspirations.
  • Celebrate moments that matter in life with loved ones through Parental and Birthday leave, and give back to your communities through Love-all-Serve-all (LASA) volunteering leave.
  • We have a confidential Grabber Assistance Programme to guide and uplift you and your loved ones through life's challenges.
  • Balancing personal commitments and life's demands is made easier with our FlexWork arrangements, such as differentiated hours.

What We Stand For at Grab

We are committed to building an inclusive and equitable workplace that enables diverse Grabbers to grow and perform at their best. As an equal opportunity employer, we consider all candidates fairly and equally regardless of nationality, ethnicity, religion, age, gender identity, sexual orientation, family commitments, physical and mental impairments or disabilities, and other attributes that make them unique.

Privacy Notice