Data Engineer
- Full-time
Company Description
Lyst is a technology platform that revolutionises the way people shop for fashion. We connect millions of consumers globally with the world’s leading fashion designers and stores, giving them a simpler, more engaging and more effective shopping experience. Lyst has grown over 300% every year since launch in 2011 and has raised over $60M from top-tier investors including Accel, DFJ, Balderton and the teams behind LVMH, Michael Kors and Oscar de la Renta.
Job Description
Lyst are looking for a Data Engineer to help us continue to build out our data pipeline capabilities and improve the way we handle data ingestion and ETL processes at scale.
We record over 100 million rows of data a day and use this information to allow our partners and the business to gain huge levels of insight into consumer behaviour.
To support this volume, we have built our own analytics architecture, predominantly in Python and leveraging the convenience of many cloud services like AWS Redshift, Lambda, CloudFront, S3, etc.
What you'll be working on:
- Designing and implementing different ETL and data jobs using Python, AWS etc.
- Working on challenges to help us build out a more automated analytics platform using a microservices architecture.
- Streamlining the anomaly detection process across all our data, building tools and processes around it to speed up the required actions.
- Redesigning our database table designs where necessary, and constantly optimising them, to make them faster and more efficient on AWS Redshift.
- Creating a development platform with tools and a pipeline that are effective, simple, well documented and scalable.
- Supporting and implementing Looker as the company's BI tool, defining good practices and supporting users' needs.
Qualifications
What you'll need:
- Experience writing large, complex systems in a commercial environment
- Confidence in writing performant code in Python
- Experience with some of the following technologies in production: Django, PostgreSQL, Redis, AWS, Elasticsearch, Docker, Redshift
Bonus points for:
- Experience building and maintaining large-scale data pipelines.
- Experience setting up and running ETL processes and a great understanding of database design.
Additional Information
You will be challenged, supported and have the opportunity to learn a lot. You will work in a fast-paced, autonomous environment with like-minded people who are passionate about what they do.
We care deeply about helping the tech industry become a more inclusive and diverse place and we work hard to lead by example. Our workplace is dynamic, diverse and highly collaborative. Join a company with:
- 50 engineers and data scientists with plans to double the team size in the next 6 months.
- 5M duplicated products detected and merged using product image features (http://www.slideshare.net/ejlbell/fashion-productdeduplication)
- 300k online recommendation model updates per day (http://developers.lyst.com/data/2014/11/11/word-embeddings-for-fashion/)
- 72k crowd-sourced labels generated per day
- 40k product gender classifications per day via deep learning
- 500k recommended products per day
- 120 EC2, 8 RDS, 7 ElastiCache and 10 Redshift instances
- An internal analytics system that collects ~100M data points/day
...and a team with:
- ~10 deployments/day
- 40+ merged pull requests/day
- 20k lines of change/week
- Lots of open source projects - https://github.com/lyst and https://github.com/SSAW
- Invitations to talk at great events (PyCon, EuroPython, PyData, etc.)
- Feature toggling and A/B testing
- Twice-monthly internal engineering meetup events
- Paid attendance at conferences
- A clothing allowance
- Internal training opportunities (want to learn Python, or improve your presentation skills?)
- Desk beer Fridays
- A well-stocked kitchen and fridge
- Things that keep you happy and healthy: regular yoga in the office, football teams, netball teams, board game nights and burger eating clubs.