Data Scientist

Full-time

Company Description

The Client provides expert advisory and implementation services for open source big data solutions. The Client is the first and only pure-play big data services firm, and their Data Scientists and Engineers are trusted advisors to the world's most innovative companies. Their experienced teams combine a distinctive methodology with a proven framework, including tested design patterns and pre-built components, to help clients build applications faster. The Client helps Customers leverage Big Data analytics by integrating open source platforms, such as Hadoop, NoSQL and Streaming Engines, with best-of-breed data warehousing environments. Service offerings include a Big Data roadmap, Data Engineering, Data Lake and Analytic Operations, Training, and ongoing Big Data Solution Support.

Job Description

The Client's Data Science team delivers insights and value to clients from heterogeneous data sets with solutions that integrate into engineering and decision-making processes. Additionally, the team enables Big Data analytics for clients through advisory services including use case prioritisation, tool selection and training, and capability definition. Their success as a services firm relies on their experts' ability to be more than technologists and statisticians.

Qualifications

The following is a list of relevant skills expected of the successful candidate:

Must

Have a minimum of two years' professional experience
Have an excellent understanding of machine learning and statistics
Have well-developed quantitative skills and analytical thinking
Have demonstrable professional experience in one or more languages, e.g. Python, R, Scala, Java, C ...
Be proficient with version control tools
Have hands-on work experience with:

Data analysis and visualisation tools and workbenches

Analysing structured, semi-structured and unstructured data

Data query languages, e.g. SQL, HiveQL or similar

Have clear written and spoken English and experience in presenting to business stakeholders

Should

Have experience working in cross-functional agile software engineering teams
Have experience in scaling data science methods and accounting for non-functional requirements
Have experience in reliably estimating, planning and meeting deadlines for deliverables
Be proficient with version control tools and strategies, ideally Git and Gitflow

Desirable

Have hands-on work experience or proficiency with:

Spark

Distributed systems

Hadoop ecosystem

Scikit-learn and Pandas

Cloud-based machine-learning APIs

Have experience with integrating Data Science within products or enterprise solutions

Additional Information

The Data Scientist’s objective is to deliver insights and value to enterprise Customers from heterogeneous data sets, with solutions that integrate into engineering and decision-making processes.

Specific Responsibilities

Customer workshops:

Help define and document business requirements and acceptance criteria
Assist in running workshops and documenting relevant outcomes
Identify opportunities and appropriate solutions (e.g. algorithms and libraries)
Present to both technical and non-technical stakeholders, internally and in a Customer-facing capacity

Agile cross-functional teamwork:

Contribute to sprint planning, provide realistic estimates and plan deliverables
Attend standups and retrospectives
Research, design, evaluate, build, tune and document end-to-end data science solutions
Understand and solve scalability and production issues

Documentation & coding standards:

Adhere to coding standards and best practices
Ensure all models are validated & all business logic is robustly tested
