Data Engineer
- Full-time
Company Description
Vitol is a leader in energy and commodities. Vitol produces, manages and delivers energy and commodities, including metals, to consumers and industry worldwide. Beyond its primary business, trading, Vitol invests in infrastructure globally, with more than $13 billion in long-term assets.
Vitol’s customers include national oil companies, multinationals, leading industrial companies and utilities. Founded in Rotterdam in 1966, today Vitol serves its customers from some 40 offices worldwide. Revenues in 2024 were over $330bn.
Find out more at vitol.com.
Job Description
The Data Engineering team is responsible for a fundamental data system that processes 50+ billion rows of data per day and feeds directly into trading decisions. The Data Engineer will design, implement and maintain this system, keeping it reliable, resilient and low-latency.
We are looking for a highly technical engineer with strong experience working with MPP platforms and/or Spark, with “big data” (e.g., weather forecasts, AIS pings, satellite imagery, …), and in building resilient and reliable data pipelines. You will be responsible for data pipelines end to end: acquisition, loading, transformation, implementation of business rules/analytics, and delivery to the end user (trading desks / data science / AI).
You will partner closely with business stakeholders and engineering teams to understand their data requirements and deliver the necessary data infrastructure to support their activities. Given our scale, you should bring a deep focus on performance optimisation, improving data access times and reducing latency.
This role requires strong coding skills in SQL and Python, and a deep understanding of how to leverage the AWS stack.
Strong communication is essential: you should be comfortable translating technical concepts to non-technical users, as well as turning business requirements into clear, actionable technical designs.
Qualifications
Essential
- 5+ years in the data engineering space
- Proficient with MPP databases (Snowflake, Redshift, BigQuery, Azure DW) and/or Apache Spark
- Proficient at building resilient data pipelines for large datasets
- Deep understanding of AWS (or another major cloud) across core and extended services
- 2+ years' experience working with at least 3 of the following: ECS, EKS, Lambda, DynamoDB, Kinesis, AWS Batch, Elasticsearch/OpenSearch, EMR, Athena, Docker/Kubernetes
- Proficient with Python and SQL, with solid experience in data modelling
- Experience with a modern orchestration tool (Airflow / Prefect / Dagster / similar)
- Comfortable working in a dynamic environment with evolving requirements
Desirable
- Exposure to trading and/or the commodities business
- Snowflake experience
- dbt experience
- Infrastructure as Code (Terraform, CloudFormation, Ansible, Serverless)
- CI/CD pipelines (Jenkins / Git / Bitbucket Pipelines / similar)
- Database/SQL tuning skills
- Basic data science concepts