Lead Data Scientist
- Full-time
Company Description
DemandMatrix Inc. is a data company that provides Go-to-Market, Operations, and Data Science teams with high-quality, company-level data and intelligence. DemandMatrix uses advanced data science methodologies to process millions of unstructured data points and produce reliable, accurate technology intelligence, organizational intent, and B2B spend data.
Job Description
We are looking for a Lead Data Scientist to lead a technical team in discovering the information hidden in vast amounts of data and to help us make smarter decisions that deliver even better products. Your primary focus will be to work with the team to drive multiple initiatives: applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products in the domain of technographics, covering problems such as the buyer's journey and Technology Adoption Models (TAM).
Responsibilities:
· Running CRISP-DM projects with a team of 3 data scientists in the following areas:
· Selecting features, and building and optimizing classifiers (logistic regression or random forest (RF) based propensity models) and recommenders using machine learning techniques
· Data mining using state-of-the-art methods: automating scoring using machine learning techniques, building recommendation systems, and improving and extending the features used by recommendation and propensity modeling algorithms
· Extending the company's data with third-party sources of information when needed
· Enhancing data collection procedures to include information that is relevant for building analytic systems
· Processing, cleansing, and verifying the integrity of data used for analysis, and performing deep EDA (we create our own training data for our models)
· Doing ad-hoc analysis and presenting results in a clear manner
· Creating recommendation and propensity models and tracking their performance, especially to compensate for concept drift
Qualifications
· Hands-on experience with machine learning techniques and algorithms, such as k-NN, Naive Bayes, and ensemble methods (XGBoost, decision forests), and working towards deep learning methods using TensorFlow, especially for NLP tasks like word embeddings and topic discovery
· Hands-on experience with common data science toolkits, such as Python, scikit-learn, NumPy, pandas, Plotly, TensorFlow, and Elasticsearch, and with AutoML platforms like AWS SageMaker
· Solid proficiency in query languages such as SQL, and experience with NoSQL databases such as Elasticsearch and graph databases such as Neo4j
· Good understanding of applied statistics, such as distributions, statistical testing, and regression, along with strong EDA skills
· Data-oriented personality with a strong appreciation for the business domain of your work
· Interest in working in the tech domain, e.g. data center analytics, and an understanding of macro factors in the market for computing, software, and cloud adoption
· 5+ years of experience
Additional Information
- Flexible working hours
- Fully work from home
- Birthday Leave
- Remote Work