Senior Data Engineer
- Full-time
- Division or Department: Technology
Company Description
About Us
RepRisk is the world’s most respected Data as a Service (DaaS) company for reputational risks and responsible business conduct. Our mission is to provide transparency on business conduct risks to drive positive change. Combining advanced AI with deep human expertise, and a proven methodology at the core, RepRisk’s solutions bring performance and peace of mind, enabling clients to know more, be sure, and act faster. With our values of intellectual honesty and humility, operational excellence, and openness and respect, our diverse teams of talented experts are pioneering solutions that enable clients to make better informed decisions. Headquartered in Zurich, and with offices in Toronto, New York, London, Berlin, Manila, and Tokyo, we stay close to clients and bring an independent lens to the industry. United by our shared belief in the power of data, our 400 people are proud to be setting the global standard for business conduct data and driving positive and meaningful change through transparency.
We Offer
- Join a growing, diverse, and experienced team that fosters skill development and offers support.
- Work in an agile development ecosystem using state-of-the-art open-source technologies.
- Flexible working hours and arrangements to accommodate your needs.
- Thrive in an entrepreneurial, international, and dynamic work environment.
- Be part of a shared mission to hold companies accountable and encourage responsible behaviour.
- A company that embraces diversity, because life would be boring if we were all the same!
Job Description
About You
Are you looking for an opportunity to build robust, scalable data infrastructure that powers meaningful, cutting-edge machine learning projects? Do you want to work at a company where your contributions have a real, measurable impact, and where you're recognized and rewarded for it?
If you're passionate about data architecture, pipelines, and enabling ethical tech development, then this is the perfect role for you. We value autonomy, giving you the space to bring innovative engineering solutions to life in an inclusive, feedback-oriented environment. Your work will directly support NLP and machine learning initiatives that drive corporate responsibility through technology.
Your Responsibilities
As our new Senior Data Engineer, you will architect, build, and scale a modern data platform leveraging Databricks and lakehouse architecture principles. You will lead the design and delivery of enterprise-grade data infrastructure as part of our global Technology division. You will also:
- Architect and implement end-to-end lakehouse solutions on Databricks, leveraging Delta Lake, Unity Catalog, and the Medallion architecture (Bronze/Silver/Gold)
- Design, build, and maintain scalable, reliable ELT pipelines using Databricks workflows, Delta Live Tables, and Apache Spark
- Develop and optimize high-throughput streaming and batch data pipelines using Spark Structured Streaming and Auto Loader
- Drive data platform performance tuning, cost optimization, and cluster/compute governance across Databricks environments
- Define and enforce data contracts, schemas, and governance standards through Unity Catalog and Delta Lake
- Ensure data quality, observability, and lineage across the platform using tools such as Databricks Data Observability and Great Expectations
- Collaborate cross-functionally with data scientists, analysts, and platform teams to deliver reliable, self-serve data products
- Establish and champion internal data engineering best practices, standards, and reusable frameworks
- Stay current with the Databricks ecosystem, lakehouse trends, and emerging data engineering patterns
- Participate in code reviews to maintain high standards of quality, performance, and security
- Engage actively in Agile/Scrum ceremonies, contributing architectural insights and technical direction to the team
Qualifications
You Offer
- A Bachelor’s degree in computer science or a related STEM field
- 5+ years of hands-on experience in data engineering or a similar role
- Strong proficiency in Python and SQL
- Solid experience with batch processing (e.g. AWS Glue, dbt) and stream processing technologies (e.g. Kafka)
- Proven experience with dimensional data modelling and Data Vault methodologies
- Experience with data orchestration tools such as Airflow or Dagster
- Familiarity with data quality and validation frameworks (e.g. Great Expectations, Soda, or similar)
- Experience integrating with metadata tools such as Collibra or OpenMetadata
- Strong understanding of version control (Git) and CI/CD pipelines
- Experience working with cloud platforms (AWS preferred)
- Practical experience with data lakehouse concepts and technologies such as Databricks and Snowflake
- A proactive mindset with strong ownership, initiative, and drive to push things forward
- Strong communication skills with professional proficiency in English
Additionally, the following are a plus:
- Experience delivering workflow configurations in BPM-based software such as Camunda
- Experience working with machine learning teams and familiarity with ML/DL/NLP concepts
Additional Information
Please note that we will only consider candidates with a valid work permit.