PySpark (Python)/Scala Spark

  • Full-time

Job Description


  • Performing ETL jobs in batch mode (see the batch ETL sketch after this list).
  • Performing ETL using real-time Spark Streaming (see the streaming sketch after this list).
  • Python/Scala programming (intermediate level).
  • Hands-on experience with Spark 1.6 and versions 2.x and above.
  • Working with Hive tables and different file formats (Parquet, CSV, JSON, ORC, Avro, etc.), including compression techniques.
  • Integrating PySpark with different data sources over JDBC, e.g. Oracle, PostgreSQL, MySQL, MS SQL Server (as in the batch sketch below).
  • Spark SQL, DataFrames, and Datasets.
  • Performance tuning techniques.
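
For illustration, a minimal PySpark batch-ETL sketch covering the JDBC-integration, Spark SQL, and file-format items above (Spark 2.x+ assumed; the PostgreSQL host, table, credentials, and output path are hypothetical placeholders, not from this posting):

    # Batch ETL sketch: JDBC source -> Spark SQL transform -> Parquet sink.
    # Host, database, table, credentials, and paths are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

    # Read a PostgreSQL table over JDBC (the driver jar must be on the classpath).
    orders = (spark.read.format("jdbc")
              .option("url", "jdbc:postgresql://db-host:5432/sales")
              .option("driver", "org.postgresql.Driver")
              .option("dbtable", "public.orders")
              .option("user", "etl_user")
              .option("password", "etl_password")
              .load())

    # Transform with Spark SQL via a temporary view.
    orders.createOrReplaceTempView("orders")
    daily = spark.sql(
        "SELECT order_date, SUM(amount) AS total_amount "
        "FROM orders GROUP BY order_date"
    )

    # Write Parquet with Snappy compression, partitioned by date.
    (daily.write.mode("overwrite")
          .option("compression", "snappy")
          .partitionBy("order_date")
          .parquet("/data/out/daily_orders"))

The same read/write pattern applies to the other JDBC sources listed (Oracle, MySQL, MS SQL Server), differing mainly in the JDBC URL and driver class.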
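A corresponding real-time sketch using Structured Streaming (the Spark 2.x+ successor to the 1.6-era DStream API); the input directory, schema, and checkpoint location are made up for illustration:

    # Streaming ETL sketch: JSON file source -> running aggregation -> console sink.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("streaming-etl-sketch").getOrCreate()

    # Streaming file sources require an explicit schema (hypothetical here).
    schema = StructType([
        StructField("event_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    events = spark.readStream.schema(schema).json("/data/incoming")

    # Maintain a running sum per event_id; "complete" mode re-emits the full table.
    query = (events.groupBy("event_id").sum("amount")
             .writeStream
             .outputMode("complete")
             .format("console")
             .option("checkpointLocation", "/tmp/checkpoints/etl")
             .start())

    query.awaitTermination()

In production the console sink would typically be replaced by a Kafka, file, or foreachBatch sink.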


Additional Information

Good to Have:

  1. Basic ML techniques in Spark (optional for Data Engineering; see the MLlib sketch below).
  2. Working with Hive and NoSQL databases such as HBase and Cassandra (see the Hive sketch below).
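
For the optional ML item, a minimal Spark MLlib sketch; the toy rows and feature columns are invented for illustration:

    # MLlib sketch: assemble features, fit a basic classifier.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

    # Tiny made-up training set: a label plus two numeric features.
    df = spark.createDataFrame(
        [(0.0, 1.2, 0.7), (1.0, 3.1, 2.2), (0.0, 0.4, 0.1), (1.0, 2.8, 1.9)],
        ["label", "f1", "f2"],
    )

    # Assemble raw columns into a feature vector, then fit the model.
    assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
    train = assembler.transform(df)

    model = LogisticRegression(featuresCol="features", labelCol="label").fit(train)
    model.transform(train).select("label", "prediction").show()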
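And a minimal Hive-integration sketch, assuming a Spark build with Hive support and a reachable metastore; the database and table names are hypothetical:

    # Hive sketch: query an existing table, write results back as a managed table.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-sketch")
             .enableHiveSupport()   # requires a configured Hive metastore
             .getOrCreate())

    sales_by_region = spark.sql(
        "SELECT region, SUM(amount) AS total FROM mart.sales GROUP BY region"
    )

    sales_by_region.write.mode("overwrite").saveAsTable("mart.sales_by_region")

HBase and Cassandra access require external connector packages (e.g. the Spark Cassandra Connector), so no sketch is given for them here.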