Senior Data Engineer
- 1675 Broadway, New York, NY 10019, USA
Part of Publicis Groupe (Euronext Paris Exchange: FR0000130577; CAC 40 index), Publicis Spine was established in 2017 to serve the data, analytics, and technology needs of Publicis Groupe agencies and their clients. Its mission is to grow clients’ businesses through the transformative application of data, and it is home to Publicis Groupe’s proprietary technology platform, Publicis PeopleCloud. It offers a consistent, transparent, best-in-class approach to data, analytics solutions, partnerships, and technology via a closely joined network of engineers, technology experts, product designers, analysts, and data scientists, all empowering marketing and digital business transformation.
Our global platform is Publicis PeopleCloud. It starts with a simple principle: the more we know about people, the more we can drive brand growth by converting people at scale. Publicis PeopleCloud is an end-to-end platform that organizes data into Publicis IDs, integrating first-, second-, and third-party data and using machine learning to match cookies, devices, and offline behaviors to actual people in a privacy-compliant manner. Aggregating these IDs provides a basis for planning growth, building audiences, creating content, and managing and measuring consumer campaigns. The platform sits within Publicis Spine and supports the full spectrum of marketing and digital business transformation for our clients.
We are built on the foundation of Trust, Talent, Transformation:
Trust is the cornerstone upon which we build our relationships. We hold ourselves to the highest standards of how a partner should behave.
We treat our people and our clients with respect, transparency and honesty.
This is first and foremost a people business. We are committed to making Publicis Media a destination for the best talent in our industry. We value people as individuals, growing ourselves as we grow our clients’ businesses.
True transformation comes when we stop managing change and instead initiate change. We believe in our purpose to be the admired force for business transformation. We believe that focusing on performance and results has the power to transform our clients’ businesses.
We are looking for a talented Data Engineer for an exciting opportunity on the data engineering team. You would design workflows for the data and analytics tools that are a major part of the 2019 road-map, and manage the data and infrastructure needed to efficiently query datasets in the billions of records. Candidates will be considered based on their ability to design large distributed technical solutions and to manage and optimize data pipeline projects that deliver actionable data and pipelines supporting the larger organization.
This position can be based in New York City, Chicago or Boston.
- Architect, design, and maintain data pipelines through the lifecycle of the product.
- Optimize and monitor existing data pipelines using AWS infrastructure.
- Write Python/Scala applications for data processing and job scheduling.
- Understand and manage massive data stores.
- Integrate products from data pipelines into APIs built in Ruby on Rails.
- Expose large data sets.
- Enjoy being challenged and solving complex problems on a daily basis.
- Design efficient and robust ETL workflows.
- Manage real-time streaming applications and data flows.
- Investigate, evaluate, and ramp up on new technologies.
- Work in teams and collaborate with others to clarify requirements.
- Build analytics tools that use the data pipelines to provide meaningful insights into the data.
- Bachelor's degree in Mathematics, Computer Science/Engineering, or Statistics.
- 4-7 years of data engineering or data science experience (Scala required; Python a plus).
- Strong programming and software engineering background.
- Proficient understanding of distributed computing principles.
- Strong experience with relational SQL and NoSQL databases.
- Knowledge of Big Data architectures: Hive/Hadoop, Redis, etc.
- Experience with Big Data tools and concepts: Spark, HDFS, MapReduce, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift, Lambda, Kinesis.
- Experience with fast search and analytics engines: Elasticsearch, Lucene, etc.
- Experience with data streams (Kinesis or Kafka).
- Experience with data pipeline workflow management tools such as AWS Data Pipeline (Airflow a plus).
- Experience with machine learning concepts a plus.
- Excellent oral and written communication skills.
All your information will be kept confidential according to EEO guidelines.