Senior Principal Machine Learning Engineer (Fulfilment)

Full-time

Company Description

About Grab and Our Workplace

Grab is Southeast Asia's leading superapp. From getting your favourite meals delivered to helping you manage your finances and getting around town hassle-free, we've got your back with everything. At Grab, purpose gives us joy and habits build excellence, while we harness the power of technology and AI to deliver on our mission of driving Southeast Asia forward by economically empowering everyone, with heart, hunger, honour, and humility.

Job Description

Get to Know the Team

The Fulfilment Tech Family builds the systems that power Grab's marketplaces across Southeast Asia. We design real-time, distributed systems and Machine Learning (ML) solutions that process hundreds of millions of requests each day. Our work drives supply allocation, pricing, and order matching for millions of users and driver-partners. Our mission is three-fold:

Deliver products that work for our driver-partners
Meet consumer demand, regardless of conditions
Build marketplaces that balance experience and cost for everyone involved

We are looking for a Senior Principal Machine Learning Engineer to lead our shift toward automated marketplace optimization. You'll advance how we use data and ML to automate pricing, dispatch, and supply management decisions.

Get to Know the Role

This is a Senior Principal individual contributor role where you'll build the foundation for autonomous, learning-driven marketplace systems. You'll work at the intersection of reinforcement learning, large language models, and production systems that operate at scale.

Your work centres on two areas:

Reinforcement Learning (RL) Systems: You'll develop systems that jointly optimize pricing, dispatching, and supply repositioning. You'll build decision agents that handle multiple objectives and adapt when real-world conditions change.
LLM-Based Behavioural Intelligence: You'll architect systems using fine-tuned language models to predict, explain, and simulate user decision-making at scale. These models will power the next generation of marketplace automation.

You'll serve as the technical lead for a small team, guiding both research direction and production implementation. You'll report to the Head of Data Science and work from Grab's One-North Singapore office.

The Critical Tasks You Will Perform

You'll:

Design and implement end-to-end RL systems that combine model-based RL, offline RL, simulation, and online learning into a unified training pipeline. This includes creating state representations and reward structures that balance short-term results with long-term outcomes.
Build latent world models and marketplace state representations that capture supply-demand interactions, location-based patterns, and behavioural signals from users and drivers.
Develop systems that optimize across multiple marketplace levers simultaneously—pricing, dispatching, and supply repositioning—to expand the set of achievable outcomes for the business.
Create policy evaluation frameworks and establish monitoring systems that allow safe deployment of new decision-making policies in production.
Fine-tune open-source large language models on domain-specific data to build capabilities for prediction, reasoning, and simulation within marketplace applications.
Design and implement training strategies for language models, including supervised fine-tuning, preference-based alignment, and iterative improvement methods.
Work with data engineers and backend engineers to integrate RL and LLM systems into real-time production environments serving millions of users.

Qualifications

What Essential Skills You Will Need

You have a PhD in Computer Science, Operations Research, Applied Mathematics, or a related field, with at least 10 years of experience, to lead complex technical initiatives spanning research and production systems.
You are proficient in RL fundamentals — including Markov Decision Processes, stochastic control, and reward design trade-offs. You'll apply these to build closed-loop systems that make sequential decisions in the marketplace.
You have experience building production ML/RL systems with online learning or simulation-based optimization, to deploy models that learn and adapt in real-time environments.
You have knowledge of world models and sequential modelling, including latent dynamic models (RNNs, transformers, state-space models) and representation learning for complex systems. You'll use these to model marketplace dynamics accurately.
You have hands-on experience fine-tuning large language models in production — including supervised fine-tuning and at least one preference-based alignment method (RLHF, DPO, or GRPO). You'll apply parameter-efficient methods (LoRA, QLoRA, or PEFT) and understand their trade-offs in accuracy, memory, and cost.
You can design evaluation frameworks for generative models — including metrics for factual accuracy and reasoning quality. You'll use these to validate model outputs before deployment.
You have experience with distributed training frameworks — such as DeepSpeed, FSDP, or Megatron-LM. You'll use these to train large models efficiently across multiple GPUs or nodes.
You are proficient in Python and ML frameworks (PyTorch or TensorFlow) — to implement models and integrate with production codebases.
You have experience with scalable computing platforms — such as Spark or Ray. You'll use these to process large datasets and distribute training workloads.
You can translate ambiguous business problems into concrete modelling tasks — to identify what can be solved with ML/RL and define the scope, data requirements, and success criteria.

Additional Information

Life at Grab

We care about your well-being at Grab. Here are some of the global benefits we offer:

We have your back with Term Life Insurance and comprehensive Medical Insurance.
With GrabFlex, create a benefits package that suits your needs and aspirations.
Celebrate moments that matter in life with loved ones through Parental and Birthday leave, and give back to your communities through Love-all-Serve-all (LASA) volunteering leave.
We have a confidential Grabber Assistance Programme to guide and uplift you and your loved ones through life's challenges.
Balancing personal commitments and life's demands is made easier with our FlexWork arrangements, such as differentiated hours.

What We Stand For At Grab

We are committed to building an inclusive and equitable workplace that provides equal opportunity for Grabbers to grow and perform at their best. We consider all candidates fairly and equally regardless of nationality, ethnicity, race, religion, age, gender, family commitments, physical and mental impairments or disabilities, and other attributes that make them unique.
