Data Scientist
- Full-time
Company Description
The Client provides expert advisory and implementation services for open source big data solutions. The Client is the first and only pure-play big data services firm, and their Data Scientists and Engineers are trusted advisors to the world's most innovative companies. Their experienced teams combine a distinctive methodology with a proven framework that includes tested design patterns and pre-built components to help clients build applications faster. The Client helps Customers leverage Big Data analytics by integrating open source platforms, such as Hadoop, NoSQL and Streaming Engines, with best-of-breed data warehousing environments. Service offerings include: a Big Data roadmap, Data Engineering, Data Lake and Analytic Operations, Training and ongoing Big Data Solution Support.
Job Description
The Client's Data Science team delivers insights and value to clients from heterogeneous data sets with solutions that integrate into engineering and decision-making processes. Additionally, the team enables Big Data analytics for clients through advisory services including use case prioritisation, tool selection and training, and capability definition. Their success as a services firm relies on their experts' ability to be more than technologists and statisticians.
Qualifications
The following is a list of relevant skills expected of the successful candidate:
Must
- Have a minimum of two years' professional experience
- Have an excellent understanding of machine learning and statistics
- Have well-developed quantitative skills and analytical thinking
- Have demonstrable professional experience in one or more languages e.g. Python, R, Scala, Java, C ...
- Be proficient with version control tools
- Have hands-on work experience with:
Data analysis and visualisation tools and workbenches
Analysing structured, semi-structured and unstructured data
Data query languages, e.g. SQL, HiveQL or similar
- Have clear written and spoken English and experience in presenting to business stakeholders
Should
- Have experience working in cross-functional agile software engineering teams
- Have experience in scaling data science methods and accounting for non-functional requirements
- Have experience in reliably estimating, planning and meeting deadlines for deliverables
- Be proficient with version control tools and strategies, ideally Git and Gitflow
Desirable
- Have hands-on work experience or proficiency with:
Spark
Distributed systems
Hadoop ecosystem
Scikit-learn and Pandas
Cloud-based machine-learning APIs
- Have experience integrating Data Science into products or enterprise solutions
Additional Information
The Data Scientist’s objective is to deliver insights and value to enterprise Customers from heterogeneous data sets, with solutions that integrate into engineering and decision-making processes.
Specific Responsibilities
Customer workshops:
- Help define and document business requirements and acceptance criteria
- Assist in running workshops and documenting relevant outcomes
- Identify opportunities and appropriate solutions (e.g. algorithms and libraries)
- Present to both technical and non-technical stakeholders, internally and in a Customer-facing capacity
Agile cross-functional teamwork:
- Contribute to sprint planning, provide realistic estimates and plan deliverables
- Attend standups and retrospectives
- Research, design, evaluate, build, tune and document end-to-end data science solutions
- Understand and solve scalability and production issues
Documentation & coding standards:
- Adhere to coding standards and best practices
- Ensure all models are validated & all business logic is robustly tested