Junior Data Scientist

  • Full-time

Company Description

Join the growing team at DataProphet, the only African company to be named by the World Economic Forum as one of 56 emerging tech firms for its 2019 cohort of Technology Pioneers who are shaping their industry and their region in new and exciting ways.

DataProphet is a global leader in Artificial Intelligence (AI) for manufacturing. Our award-winning technology embeds unique adaptations and advancements of deep learning, enabling AI to have a significant, practical impact on the factory floor. DataProphet’s solutions are built to be adapted and integrated into existing environments, making it possible for our digital transformation team to take your operations from zero to AI. We understand manufacturing, and we understand that real impact is achieved with pre-emptive actions because real-time is often too late. For more information, visit www.dataprophet.com

We pride ourselves on working very closely with our clients to develop machine-learning-powered solutions that uncover hidden patterns in their data for commercial benefit. Through a combination of machine learning expertise, management consulting and systems development experience, we deliver simple-to-use services that return accurate, low-level and actionable predictions which can be easily integrated into existing infrastructure. Over the past four years, we have delivered bleeding-edge deep learning environments and models across multiple industries, including financial services, retail, manufacturing, healthcare and gaming.

Our models have repeatedly outperformed those of larger multinational companies, and we recently received further recognition by winning the Mercedes-Benz Innovation Challenge. DataProphet boasts an experienced and qualified team of 30, with diverse skill sets in Data Science, Engineering, Statistics and Computer Science, and with capability and experience across various industries globally.

Job Description

Our team is growing and we are seeking a keen (Graduate) Junior Data Scientist to assist in the development of new products and the monitoring of existing products. The role requires you to get involved in all components of product development and deployment: problem definition, data cleaning, hypothesis generation and testing, model training and testing, and finally monitoring of the predicted outcomes.

Knowing why the data is collected (i.e. what is important when building a statistical model and how that is reflected in the data) is a key component of the role. In this regard, you will frequently be required to communicate effectively with the client to understand their data and database management systems, so that the data they provide us is sound.

As such, it will be important to quickly understand the client’s specific problem, identify practical adaptations to the models given the client’s context, and communicate results to the client effectively.

Qualifications

  • MSc/BBusSci(Hon)/BSc(Hon) in a related field: Engineering, Mathematics, Statistics, Computer Science, Actuarial Science, Astrophysics
  • Ability to handle, interpret and analyse data efficiently and to recognise and apply the appropriate data science tools to a given problem
  • Fluency in Python (knowledge of machine learning libraries, e.g. sklearn, is critical)
  • A full understanding of how ANNs, CNNs, RNNs, autoencoders and variational autoencoders work
  • A solid understanding of fundamental data science concepts: familiarity with linear and logistic regression, SVMs, dimensionality reduction (principal component analysis/t-SNE/UMAP), decision trees (and how they work), gradient boosting, ensemble models, clustering algorithms, etc.
  • Familiarity with implementing the above architectures with deep learning frameworks like Keras and TensorFlow (GitHub should have everything you need). People tend to be much more impressed by those who are proficient in TF, but it is much easier to prototype and learn with Keras, so we consider both quite important. PyTorch is gaining traction as well, but we don't know of any local companies that actually use it in production
  • Database management software, e.g. SQL

NOTE: Please submit CVs in PDF format

Additional Information

We are a diverse and ambitious team who are passionate about what we do and aligned in our vision to create inspiring change in the world around us. We spend our days solving challenging technical problems on the cutting edge of AI and Data Science.

We work together in our beautiful office located in Green Point, Cape Town. It is a professional but supportive and fun environment created to provide a space for everyone to bring their best selves to work every day.

A spirit of curiosity and continuous learning is in our DNA, and we encourage everyone to approach all things in this manner so that we continue to grow and develop both as professionals and as individuals.

Among us are gamers, yogis, series lovers, runners, climbers, gym bunnies and 1 x Iron Mike… ahem, Man. We are committed to supporting our team holistically and therefore have showers to rinse off after lunchtime activities, host games nights for ultimate intercompany showdowns, host screenings of favorite series and provide healthy fruit and snacks to keep everyone fueled up along the way. And don’t forget, a steady flow of delicious coffee!