Data Scientist

  • Full-time

Company Description

The Client provides expert advisory and implementation services for open source big data solutions. As the first and only pure-play big data services firm, their Data Scientists and Engineers are trusted advisors to the world's most innovative companies. Their experienced teams combine a distinctive methodology with a proven framework that includes tested design patterns and pre-built components to help clients build applications faster. The Client helps Customers leverage Big Data analytics by integrating open source platforms, such as Hadoop, NoSQL and Streaming Engines, with best-of-breed data warehousing environments. Service offerings include a Big Data roadmap, Data Engineering, Data Lake and Analytic Operations, Training and ongoing Big Data Solution Support.

Job Description

The Client's Data Science team delivers insights and value to clients from heterogeneous data sets with solutions that integrate into engineering and decision-making processes. Additionally, the team enables big data analytics for clients through advisory services including use case prioritisation, tool selection and training, and capability definition. The Client's success as a services firm relies on its experts' ability to be more than technologists and statisticians.

Qualifications

The following is a list of relevant skills expected of the successful candidate:

Must

  • Have a minimum of two years' professional experience
  • Have an excellent understanding of machine learning and statistics
  • Have well-developed quantitative skills and analytical thinking
  • Have demonstrable professional experience in one or more languages, e.g. Python, R, Scala, Java, C ...
  • Be proficient with version control tools
  • Have hands-on work experience with:
      • Data analysis and visualisation tools and workbenches
      • Analysing structured, semi-structured and unstructured data
      • Data query languages, e.g. SQL, HiveQL or similar

  • Have clear written and spoken English and experience in presenting to business stakeholders

Should

  • Have experience working in cross-functional agile software engineering teams
  • Have experience in scaling data science methods and accounting for non-functional requirements
  • Have experience in reliably estimating, planning and meeting deadlines for deliverables
  • Be proficient with version control tools and strategies, ideally Git and Gitflow

Desirable

  • Have hands-on work experience or proficiency with:
      • Spark
      • Distributed systems
      • Hadoop ecosystem
      • Scikit-learn and Pandas

  • Cloud-based machine-learning APIs
  • Have experience with integrating data science into products or enterprise solutions

Additional Information

The Data Scientist’s objective is to deliver insights and value to enterprise Customers from heterogeneous data sets, with solutions that integrate into engineering and decision-making processes.

Specific Responsibilities

Customer workshops:

  • Help define and document business requirements and acceptance criteria
  • Assist in running workshops and documenting relevant outcomes
  • Identify opportunities and appropriate solutions (e.g. algorithms and libraries)
  • Present to both technical and non-technical stakeholders, internally and in a Customer-facing capacity

Agile cross-functional teamwork:

  • Contribute to sprint planning, provide realistic estimates and plan deliverables
  • Attend standups and retrospectives
  • Research, design, evaluate, build, tune and document end-to-end data science solutions
  • Understand and solve scalability and production issues

Documentation & coding standards:

  • Adhere to coding standards and best practices
  • Ensure all models are validated and all business logic is robustly tested