PySpark Developer with OCR experience

  • Contract

Company Description

John Snow Labs is an award-winning AI and NLP company, accelerating progress in data science by providing state-of-the-art software, data, and models. Founded in 2015, it helps healthcare and life science companies build, deploy, and operate AI products and services. John Snow Labs is the winner of the 2018 AI Solution Provider of the Year Award, the 2019 AI Platform of the Year Award, the 2019 International Data Science Foundation Technology award, and the 2020 AI Excellence Award.

John Snow Labs is the developer of Spark NLP - the world’s most widely used NLP library in the enterprise - and is the world’s leading provider of state-of-the-art clinical NLP software, powering some of the world’s largest healthcare & pharma companies. John Snow Labs is a global team of specialists, of whom 33% hold a Ph.D. or M.D. and 75% hold at least a Master’s degree in disciplines covering data science, medicine, software engineering, pharmacy, DevOps and SecOps.

Job Description

This is an opportunity for a superstar PySpark developer with proven knowledge of Spark, data science, and big data, and with great communication skills. We are the team developing the Spark NLP and Spark OCR libraries, and we are looking to grow the team that builds these libraries and helps customers use them in their projects.

More details about the project are available here: https://nlp.johnsnowlabs.com/docs/en/ocr 

This is a career opportunity that will enable you to expand your knowledge and experience of different tools and techniques, work with an international team of big data and data science experts, and make a positive impact through your work. If you qualify and are interested, please include the words 'John Snow Labs' in your cover letter and explain in detail why you are the best fit for this role.

 

Qualifications

  • Python (3+ years)
  • Apache Spark (3+ years)
  • Data Science (Python, Jupyter, TensorFlow)
  • OCR (using open-source tools such as Tesseract is a plus)
  • Image processing expertise is a plus
  • Scala/Java experience would be a big plus

 

Additional Information

  • We are a fully virtual company, collaborating across 22 countries.
  • Open to candidates worldwide - work remotely from anywhere.
  • This is a contract opportunity, not a full-time employment role.
  • This role requires availability of at least 30 hours per week.