Data Engineer

  • Full-time

Company Description

DataChef is a boutique big data consultancy based in Amsterdam. We specialize in demystifying data and simplifying information for modern marketing and sales teams.

Job Description

⚡️ The Shortest Possible Description

We are looking for an experienced, pragmatic data engineer to join our team and help design and build a greenfield data mesh platform on AWS.

The mission is to build a scalable data foundation for one of the largest distribution companies in the world. We are looking for A-players who are ready for the challenge and fun of creating such a unique foundation and stream-processing backend using AWS services (Lambda, Glue, Lake Formation, ...), Spark (Scala/Python), and Kafka.

This is a consultant position, so we are looking for an all-round candidate, not just an AWeSome engineer.

🙂 We like it if you ... [a.k.a. must-haves]

  • Have a real click with Our Core Values. If you’re nodding emphatically while reading them, you’ll probably fit right in, in which case we can’t wait to hear from you. If your inner voice says "bla bla bla" while reading them, let’s save both your time and ours by not proceeding with interviews.
  • Are good consultants: have excellent communication skills to simplify and present concepts to other people, show them how the future might look, and help them participate in creating it. Good consultants don’t assume others already know, so they make everything (meetings, decisions, thoughts, code, etc.) explicit and traceable.
  • Are doers, not talkers: we are a small team, and individual performance directly impacts the team’s outcome, so you need to not only take initiative but actually finish what you’ve started. We are looking for a level 4+ problem solver.
  • Can demonstrate solid technical skills in Scala and/or Python (deep understanding of language internals, profiling, and testing methods). Prior experience with ZIO is a big plus (DataChef is a contributor to the project ❤️).
  • Have 2+ years of hands-on experience with AWS services like Lambda, EMR, Elasticsearch, Lake Formation, and Glue.
  • Have a solid understanding of (and preferably experience with) building pub-sub and asynchronous systems using Apache Kafka or another messaging system such as SQS, Kinesis, Celery, RabbitMQ, or ActiveMQ.
  • Design & code defensively for the harsh real world, not for happy-path “Hello World” scenarios. You know that missing, late, and low-quality raw data is a fact of life, and that pipeline failures and replays/re-processing are the norm, not a drama.
  • Can ingest new data sources (via REST APIs, file sharing, etc.) and deal with ever-changing schemas.
  • Can analyze algorithmic complexity and know data structures beyond “List and Stack”, including the pros and cons of using each for a given problem.
  • Have been using Linux/macOS, Docker and git in collaborative workflows for 3+ years.
  • Are fast movers: our culture is “go, go, and go faster”. Of course, you will break things by moving fast; that is understood and even appreciated. Just focus on learning fast and changing fast. And yes, we believe in agility and a distilled interpretation of the Agile Manifesto.

💓 We love it if you ... [a.k.a. nice-to-haves]

  • Have 5+ years of experience, not only developing greenfield projects from scratch but also running them in live, operational environments with strict high-availability requirements.
  • Make quality a requirements issue: it is not enough to deliver something that works sometimes/maybe; we are building a mission-critical data platform. We love people who care about their craft and are proud of the quality of their code. Prior experience with Great Expectations (DataChef is a contributor to the project ❤️) and/or Deequ is a big plus.
  • Write clean code that is testable and maintainable, solves the right problem, and solves it well.
  • Know how to instrument your code for just-enough logging, better monitoring, and easier debugging once it reaches production and operational environments.
  • Believe in DRY! You “Don’t Repeat Yourself” and are allergic to waste of any kind, especially manual, repetitive, non-automated tasks; well… after doing them manually a few times!
  • Understand the CAP theorem and know how to design a resilient, partition-tolerant service, including the associated costs and trade-offs.
  • Keep up with the latest developments in the Big Data community and can decide which of them are most relevant to our business and translate them into opportunities.
  • Contributed to open source: make us happy with those green dots on GitHub!

👨‍🍳 This is us, bulletized.

  • DataChef: We're a small, profitable, self-funded, and growing company based in Amsterdam. If you believe that data can (and must) change the quality of life in companies (and for the humans who run those companies), then you will find your **tribe** here at DataChef.
  • We are a consultancy/agency focused on developing and delivering top-quality Big Data and machine learning projects on the AWS platform.
  • Behind the scenes, we are working on a SaaS product and aim to become a 100% product company sometime in 2023.
  • 100% open and transparent company: our role models are not giant corporations but relatively small yet happily successful companies like Basecamp, Buffer, and Ahrefs.
  • We thrive on technical excellence by hiring only the best, and we see ourselves at the beginning of the same success path as Databricks and Elasticsearch, just 10 years younger!
  • DataChef is an Equal Opportunity Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.

 

🧪 Interview Process

Our master chef will respond to your first email within 2 days of receiving it (you'll always hear from us, even if the answer is a "No"). If you're invited, there are three interview steps:

  1. Initial screening Zoom video call (29 minutes): the focus is on whether our core values are a match. After that, we ask one or two quick and easy technical questions and finish the conversation with live coding: you’ll share your screen in Zoom and write/run a few lines of code, just enough to separate the talkers from the doers. During this call we won’t ask about your resume, past achievements, or future ambitions.
  2. Technical interview (59 minutes): a combination of a technical deep dive and a core-values fit check. On the technical side, you won’t need to code; our focus will be on your prior experience, not necessarily our job description. We’ll ask you, curiously and non-judgmentally, to explain technical decisions made by you or your teams.
  3. Take-home assignment (15-20 hours spent within 7 business days): we'll ask you to work on a mini-project. We don't care whether you know bubble sort by heart or google it (a.k.a. Stack Overflow it), so this is a practical, looks-like-the-real-world kind of implementation.

Qualifications

AWS - Python - Scala - Data management - Big Data