- London, UK
- Employees can work remotely
- Department: Data
- Office: London
Genomics England successfully led the world-leading 100,000 Genomes Project, which compared and analysed individuals’ genetic codes to help diagnose, treat and prevent illness.
We're now accelerating our impact, working with the NHS to further develop and embed genomic healthcare and research in Britain. Our next chapter involves working with patients, doctors, scientists, government and industry to improve genomic testing, and help researchers access the health data and technology they need to make new medical discoveries and create more effective, targeted medicines for everybody.
We are looking for experienced data engineers to join our growing data family at Genomics England. You will be part of multidisciplinary squads delivering a step change in the way that healthcare is delivered, ensuring that patients receive improved diagnosis and care. We use the best in science and technology to deliver genetic insights for personalised medicine.
You will get to work with truly large-scale clinical and genomic data. We currently hold over 40 petabytes of structured and unstructured data. Our organisation is dynamic, agile and full of talented people working at the cutting edge of science. We look forward to having you join us.
- Automating and optimising big data pipelines on structured, semi-structured and unstructured data
- Continuous integration and deployment of end-to-end data delivery
- Acquisition, ingestion, transformation, curation and productisation of data assets
- Thought leadership and innovation for data delivery
- Supporting the business in migrating to a scalable cloud architecture (AWS)
- Experience with tooling for data manipulation in programming languages, low-code ETL tooling and/or data visualisation software
- Strong programming skills against cloud-based pipelines
- Experience in co-design of curated data products from raw data assets working with business users to meet business needs
- Experience of data modelling and developing/reverse-engineering data assets and the necessary components for data model conformance
- Experience developing, optimising and automating data extract, transform and load routines to create coherent, high-quality, comprehensive curated data
- Experience with developing scheduled data flows using an integration/workflow engine, including message management and troubleshooting, log integration and complex data lineage
- Strong AWS experience preferred
- Healthcare experience helpful
- Cloud: AWS, Azure
- ETL: AWS Glue, Trifacta, KNIME
- Metadata & master data management: White Rabbit
- Data models: XML, JSON, HL7 FHIR, OMOP
- Databases: AWS S3 & Athena, AWS DynamoDB, AWS RDS, AWS Aurora (Postgres)
- Continuous deployment: Jenkins, AWS Lambda, Docker, Kubernetes
- Programming languages: Python, R, SQL
- Visualisation software: Tableau
- Machine learning: Regression, decision trees, SVM, Bayes, NLP
- Practices: DMBOK2, Continuous Integration/Continuous Deployment (CI/CD)
Ideally, a Master’s degree or equivalent experience working in data management, biostatistics, clinical informatics or data analysis
Originally conceived as a project, Genomics England has transformed to meet the long-term opportunities created by our scientific breakthroughs in understanding the human genome. Being part of this journey is a reward in itself, but we're also pleased to offer our colleagues a great benefits package including:
- competitive salary
- 30 days holiday
- generous pension scheme
- individual learning budgets for every colleague
- a raft of other benefits
Talk to our Talent Team and find out how a career with Genomics England will benefit you.