Data Scientist Intern

  • Full-time

Company Description

BlaBlaCar’s vision is to bring fairness, freedom and fraternity to the world of travel. 

BlaBlaCar was created in 2006 with one dream in mind: fill the millions of empty seats on the road to create an efficient, affordable and friendly way to travel. Today, our global, trusted community counts 84 million members in 22 countries, enabling a smarter, large-scale and truly sustainable transport network.

Born from a simple idea of people sharing their rides, BlaBlaCar’s ambition is to become the go-to marketplace for shared road mobility. We offer a long-distance carpooling service, a commuting carpooling service and, since 2018, a marketplace bus service too. A full bus emits one third of the emissions per passenger-kilometer of an average car. So we want to fill those seats too!

In 2018, 84 million travellers used BlaBlaCar globally, saving 1.6 million tons of CO2. Meanwhile, we enabled human connections, bringing people closer together in more ways than one: 87% of members say that carpooling is an enriching experience.

BlaBlaCar offers a truly unique international environment with a team counting 35 nationalities, serving a global member base from 7 offices in Berlin, Kiev, Madrid, Moscow, Paris (HQ), São Paulo and Warsaw. English is the official spoken language across BlaBlaCar. We are privately held and founder-led. Our team of 500 employees is entrepreneurial, passionate, and fundamentally mission-driven.

Job Description

Why join us?

One of BlaBlaCar’s key strategic goals is to leverage data through Machine Learning in order to trigger growth opportunities. Data Science has been instrumental in successfully launching new services over the past two years, yet the potential for improvement is still massive. If you are looking for exciting challenges, impacting millions of users, and working on a state-of-the-art cloud platform, come and join us!

You’ll join the data science automation team, which aims to rationalize and automate decision-making processes everywhere in the company. Among many other projects, the team is building a simulation tool to design an efficient, data-driven bus transport network. The team is also leading the effort to create a marketing automation platform, on top of our first-class campaign tracking assets, to automate and optimize low-granularity Google Ads spending (bidding, placement, keyword strategy).

BlaBlaCar’s data stack is composed of the Google Cloud Platform suite (BigQuery, Pub/Sub, Compute Engine, ...) and in-house tracking technology based on Kafka, all orchestrated by Airflow. Dashboards and reports are built in Tableau. As a Data Scientist, you will mostly be coding in Python. Models are often deployed on Cloud Run, through a BlaBlaCar custom library called Framework of ML (FraML). However, we are free to use other frameworks when we believe it makes more sense.

What you will directly contribute to:

The current stack for detecting non-compliant business drivers is composed of:

  • Multiple algorithms running in real time, with all the constraints this entails, powered by multiple Google Cloud Platform tools.

  • Multiple ETLs to create datasets for offline prediction. 

  • Multiple POCs including graph learning techniques and technologies. 

We are considering several ways of improving this stack:

  • Enhancing the current algorithms by looking at new sources of signal (e.g. adding features to the linked-account algorithm).

  • Unifying the models to use all available information. Currently, multiple algorithms predict the same outcome using data from different sources; we want to investigate stacking all these scores to improve detection.

  • Deploying machine learning models to detect very different types of fraud on a real-time data stack.

Qualifications

What you will need to be successful:

  • Hold an advanced degree (MSc or PhD) in Data Science, Machine Learning, Computer Science, Mathematics, Statistics, Economics, or Operations Research.

  • Have strong knowledge of Machine Learning theory, statistics and probability.

  • Have a first successful experience in Data Science, building and implementing a model in production.

  • Be fluent in SQL, Python, and at least one ML package: scikit-learn, XGBoost, Keras, etc.

  • Experience working on the Google Cloud Platform is a plus.

  • Experience applying data science techniques to geospatial data is a plus.

  • Possess good communication skills: you are able to explain your models clearly to both analysts and decision makers.

  • Be humble, structured, organized, motivated by innovation, and a relentless doer.

  • Enjoy working as a team player and learning from others.

  • Be fluent in English.

Additional Information

A few practical details about the role

  • Start: As soon as you are ready!
  • Location: Paris HQ
  • Contract: Internship, 6 months

What we offer all of our employees:

  • A start-up spirit that fosters agility, teamwork and impact

  • Challenging career opportunities in a high-growth and fast-paced environment

  • An inspiring working environment including state-of-the-art office spaces

  • Weekly Tech Demos

  • And much more to check out on BlaBlaCar.com/dreamjobs

What is next?

If you are ready to join our exciting journey, please apply below: upload your resume in English (PDF format) and answer our questions in English.

Kindly note that only complete applications will be reviewed by our hiring team and that all your information will be kept confidential.

You can expect us to review your application within 3 weeks. If your application fits our requirements, we will first invite you to an interview call, followed by a case study and 2 onsite interviews.

BlaBlaCar is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.