Data Engineer

  • London, UK
  • Employees can work remotely
  • Full-time
  • Department: Data
  • Office: London

Company Description

Genomics England successfully led the 100,000 Genomes Project, a world-leading initiative that compared and analysed individuals’ genetic codes to help diagnose, treat and prevent illness.

We're now accelerating our impact, working with the NHS to further develop and embed genomic healthcare and research in Britain. Our next chapter involves working with patients, doctors, scientists, government and industry to improve genomic testing, and help researchers access the health data and technology they need to make new medical discoveries and create more effective, targeted medicines for everybody.

Job Description

We are looking for experienced data engineers to join our growing data family at Genomics England. You will be part of multidisciplinary squads driving a step change in the way healthcare is delivered, ensuring that patients receive improved diagnosis and care. We use the best in science and technology to deliver genetic insights for personalised medicine.

You will get to work with truly large-scale clinical and genomic data. We currently hold over 40 petabytes of structured and unstructured data. Our organisation is dynamic, agile and full of talented people working on the cutting edge of science. We look forward to having you join us.

Key Responsibilities: 

  • Automating and optimising big data pipelines on structured, semi-structured and unstructured data 
  • Continuous integration and deployment of end-to-end data delivery 
  • Acquisition, ingestion, transformation, curation and productisation of data assets 
  • Thought leadership and innovation for data delivery 
  • Supporting the business in migrating to a scalable cloud architecture (AWS)

Key Skills:

  • Experience with data manipulation tooling: programming languages, low-code ETL tools and/or data visualisation software
  • Strong programming skills for building and maintaining cloud-based pipelines
  • Experience co-designing curated data products from raw data assets, working with business users to meet business needs
  • Experience of data modelling and developing/reverse-engineering data assets and the necessary components for data model conformance 
  • Experience developing, optimising and automating data extract, transform and load routines to create coherent, high-quality, comprehensive curated data
  • Experience with developing scheduled data flows using an integration/workflow engine, including message management and troubleshooting, log integration and complex data lineage 
  • Strong AWS experience preferred 
  • Healthcare experience helpful 

Example tooling:

  • Cloud: AWS, Azure
  • ETL: AWS Glue, Trifacta, KNIME
  • Metadata & master data management: White Rabbit
  • Data models and formats: XML, JSON, HL7 FHIR, OMOP
  • Databases: AWS S3 & Athena, AWS DynamoDB, AWS RDS, AWS Aurora (Postgres)
  • Continuous deployment: Jenkins, AWS Lambda, Docker, Kubernetes
  • Programming languages: Python, R, SQL
  • Visualisation software: Tableau
  • Machine learning: Regression, decision trees, SVM, Bayes, NLP
  • Practices: DMBOK2, Continuous Integration/Continuous Deployment (CI/CD)


Ideally, you will have a Master’s degree or equivalent experience in data management, biostatistics, clinical informatics or data analysis.

Additional Information

Originally conceived as a project, Genomics England has transformed to meet the long-term opportunities created by our scientific breakthroughs in understanding the human genome. Being part of this journey is a reward in itself, but we're also pleased to offer our colleagues a great benefits package including:

  • competitive salary
  • 30 days holiday
  • generous pension scheme
  • individual learning budgets for every colleague
  • a raft of other benefits

Talk to our Talent Team and find out how a career with Genomics England will benefit you.

As part of our recruitment process, all successful candidates are subject to a Standard Disclosure and Barring Service (DBS) check. We therefore require applicants to disclose any previous offences at the point of application, as some unspent convictions may mean we are unable to proceed with your application due to the nature of our work in healthcare.
