Big Data Engineer (Telecom - Data Engineer 2)

  • Contract

Company Description

Job Description:

Big Data Engineers serve as the backbone of the Strategic Analytics organization, ensuring both the reliability and applicability of the team’s data products to the entire Samsung organization. They have extensive experience with ETL design, coding, and testing patterns as well as engineering software platforms and large-scale data infrastructures. Big Data Engineers can architect highly scalable end-to-end pipelines using a variety of open source tools, including building and operationalizing high-performance algorithms.

Big Data Engineers understand how to apply technologies to solve big data problems, with expert knowledge of languages and tools such as Java, Python, Linux, PHP, Hive, Impala, and Spark. Extensive experience with both (1) big data platforms and (2) real-time / streaming delivery of data is essential.

Big Data Engineers implement complex big data projects with a focus on collecting, parsing, managing, analyzing, and visualizing large sets of data to turn information into actionable deliverables across customer-facing platforms. They have a strong aptitude for deciding on the necessary hardware and software design and can guide the development of such designs through both proofs of concept and complete implementations.

Responsibilities include:

Translate complex functional and technical requirements into detailed design.

Design for both current and future success.

Hadoop technical development and implementation.

Loading data from disparate data sets by leveraging various big data technologies, e.g. Kafka.

Pre-processing using Hive, Impala, Spark, and Pig

Design and implement data models.

Maintain security and data privacy in an environment secured using Kerberos and LDAP

High-speed querying using in-memory technologies such as Spark.

Following and contributing to best engineering practices for source control, release management, deployment, etc.

Production support, job scheduling/monitoring, ETL data quality, data freshness reporting

Skills Required:

5+ years of Python development experience

3+ years of demonstrated technical proficiency with Hadoop and big data projects

5-8 years of demonstrated experience and success in data modeling

Fluent in writing shell scripts [bash, korn]

Writing high-performance, reliable, and maintainable code.

Ability to write MapReduce jobs

Ability to set up, maintain, and implement Kafka topics and processes

Understanding and implementation of Flume processes

Good knowledge of database structures, theories, principles, and practices.

Understand how to develop code in an environment secured using a local KDC and OpenLDAP.

Familiarity with and implementation knowledge of loading data using Sqoop.

Knowledge of and ability to implement workflows/schedulers within Oozie

Experience working with AWS components [EC2, S3, SNS, SQS]

Analytical and problem-solving skills applied to the Big Data domain

Proven understanding of and hands-on experience with Hadoop, Hive, Pig, Impala, and Spark

Good aptitude in multi-threading and concurrency concepts.

B.S. or M.S. in Computer Science or Engineering

Primary Skills:

5+ years of Python development experience

3+ years of demonstrated technical proficiency with Hadoop and big data projects with Kafka

5-8 years of demonstrated experience and success in data modeling

Fluent in writing shell scripts [bash, korn]

Pre-processing using Hive, Impala, Spark, and Pig

Knowledge of loading data using Sqoop

Ability to implement workflows/schedulers within Oozie

Experience working with AWS components [EC2, S3, SNS, SQS]

Industry:

Telecommunications, network side: LTE (4G), 5G

Education:

B.S. or M.S. in Computer Science or Engineering

Additional Information

All your information will be kept confidential according to EEO guidelines.