Data Scientist Solution Architect / Data Analyst

  • Contract

Company Description

Established in 1991, Collabera has been a leader in IT staffing for over 22 years and is one of the largest diversity IT staffing firms in the industry. As a half-billion-dollar IT company with more than 9,000 professionals across 30+ offices, Collabera offers comprehensive, cost-effective IT staffing & IT Services. We provide services to Fortune 500 and mid-size companies to meet their talent needs with high-quality IT resources through Staff Augmentation, Global Talent Management, Value Added Services through CLASS (Competency Leveraged Advanced Staffing & Solutions), Permanent Placement Services, and Vendor Management Programs.


Collabera recognizes the true potential of human capital and provides people with the right opportunities for growth and professional excellence.

Job Description

Location: McLean, VA 22102

Duration: 6+ months (could go beyond)

Description:

• The candidate for this position will provide analytical support to the Data Science Division in the Cyber, Cloud and Data Science Service line.

• The successful candidate will support the enterprise by designing solutions for data collection, preparation, and model building, and by developing end-to-end analytic lifecycles that synthesize actionable information.

• The candidate will determine appropriate tools and methods for specific projects in order to design the analytics solution either as a standalone system or as analytics embedded inside an overall solution.

• The candidate will work in teams that include enterprise architects, intelligence analysts, data and visualization experts, software developers, and system engineers, and will have an excellent opportunity to broaden their skills.

• The candidate must have experience in developing and deploying solutions for customers.


Applicant must have skills applicable to one or more of the following areas:

• Data wrangling, cleansing, and analytics

• The data science process

• Presenting work to both technical and non-technical audiences

• Statistical evaluation

• Machine Learning, Predictive Modeling

Qualifications

Applicant should have skills in one or more of the following areas:

• Machine Learning technologies, such as Natural Language Processing (NLP) - e.g., Jaro-Winkler, Damerau-Levenshtein, Metaphone, string manipulation, etc.

• R libraries: base, MASS, plyr, rpart, randomForest, maps/mapproj/rworldmap, zoo, adabag, animation, ggplot, igraph, jsonlite, mclust, pROC, hexbin 


• Python libraries: numpy, scipy, matplotlib, scikit-learn, etc.

• SPSS, Oracle Data Miner, SAS Base, DataMiner, Dataflux, STAT

• Entity Resolution - Basis Technology Rosette Name Indexer (RNI), Global Name Recognition (GNR), Probabilistic Matching Engine, Trillium Software (TS) Quality

• Apache Hadoop 2.x, MapReduce, Elasticsearch 1.4.x, Sqoop, Pig

• Familiarity with libraries such as ATS SSO, ATS-common framework, Highcharts, Jersey, JTidy, one2team, iText, Spring/Spring STS, JSON, Network Markov Clustering, Topic Modeling Tool, Naïve Bayes, Apache Commons, Google’s Guava, Apache Log4j, OpenCSV, SecondString

• Working in interdisciplinary teams.

Additional Information

To discuss this in more detail, please contact:

Himanshu Prajapat

himanshu.prajapat(at)collabera.com

973-606-3290