Data Engineer

  • Full-time
  • Work Model: Hybrid

Company Description

Metro Global Solution Center (MGSC) is the internal solution partner for METRO, a €31.5 billion international wholesaler with operations in 32 countries through 625 stores and a team of 85,000 people globally. Metro operates in a further 10 countries with its Food Service Distribution (FSD) business and is thus active in a total of 34 countries.

MGSC has locations in Pune (India), Düsseldorf (Germany), and Szczecin (Poland). We provide Finance, HR, IT, and business operations support to 31 countries, speak 24+ languages, and process over 18,000 transactions a day. We are setting tomorrow’s standards for customer focus, digital solutions, and sustainable business models. For over 12 years, we have been providing services and solutions from our two locations in Pune and Szczecin. This has given us extensive experience in serving our internal customers with high quality and passion. We believe that we can add value, drive efficiency, and satisfy our customers.

Website: https://www.metro-gsc.in

Company Size: 1,200–1,300

Headquarters: Pune, Maharashtra, India

Type: Privately Held

Inception: 2011

Job Description

Role Summary

We are looking for a Data Engineer responsible for operating, monitoring, and enhancing data ingestion and processing pipelines within a Google Cloud–based data platform. The role combines day-to-day platform reliability, data validation, and development activities, supporting continuous platform evolution, new data onboarding, and automation initiatives. The ideal candidate is hands-on, detail-oriented, and comfortable working across production and pre-production environments.

Key Responsibilities

A) Data Ingestion Operations & Monitoring

  • Monitor and ensure successful daily and monthly data ingestion across multiple upstream systems.
  • Oversee scheduled jobs and queries, validating successful execution and investigating failures or delays.
  • Maintain and enhance data ingestion monitoring checks, including:
    • Updating checks when new datasets or tables are introduced.
    • Adapting monitoring logic based on frequency, country, or business dimensions.
  • Track and investigate unprocessed or delayed files in cloud storage, performing root cause analysis and coordinating resolution with relevant stakeholders.
  • Manage and follow up on incident or service requests related to missing or delayed data until fully resolved.
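To illustrate the kind of daily ingestion check described above, here is a minimal Python sketch. The feed name, country list, and file-naming convention are hypothetical examples for illustration only, not the platform's actual configuration:

```python
from datetime import date

def find_missing_files(expected: set[str], arrived: set[str]) -> list[str]:
    """Return expected ingestion files that have not arrived, sorted for reporting."""
    return sorted(expected - arrived)

# Hypothetical daily feed: one sales file per country for today's load.
countries = ["DE", "PL", "IN"]
load_date = date(2024, 1, 15)
expected = {f"sales_{c}_{load_date:%Y%m%d}.csv" for c in countries}

# In practice this set would come from listing the cloud storage bucket;
# here it is hard-coded to show the check itself.
arrived = {"sales_DE_20240115.csv", "sales_IN_20240115.csv"}

missing = find_missing_files(expected, arrived)
# Each entry in `missing` would trigger an investigation or incident ticket.
```

In a real pipeline, the monitoring logic would also account for frequency (daily vs. monthly) and business dimensions, as noted in the responsibilities above.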

B) Automation & Platform Reliability

  • Support and troubleshoot automation workflows used to capture ingestion exceptions and feed monitoring logic.
  • Ensure automation processes run reliably and intervene quickly in case of failures.
  • Handle ad-hoc automation requests to reduce manual effort and improve operational efficiency.
  • Proactively identify recurring issues and implement improvements to strengthen pipeline stability.

C) Data Validation & Reporting

  • Perform data validation and reconciliation activities by comparing datasets across environments or systems when required (typically on a recurring or monthly basis).
  • Develop, maintain, and execute Python-based validation scripts to identify discrepancies and data quality issues.
  • Prepare recurring operational and reconciliation reports in line with agreed schedules.
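As a sketch of the Python-based validation scripting mentioned above, the following compares two keyed datasets and reports discrepancies. The invoice keys, tolerance, and report shape are assumptions made for this example, not the team's actual scripts:

```python
def reconcile(source: dict[str, float], target: dict[str, float],
              tolerance: float = 0.01) -> dict[str, list]:
    """Compare two keyed datasets and report discrepancies.

    Returns keys missing on either side and keys whose values
    differ by more than the given tolerance.
    """
    missing_in_target = sorted(source.keys() - target.keys())
    missing_in_source = sorted(target.keys() - source.keys())
    value_mismatch = sorted(
        k for k in source.keys() & target.keys()
        if abs(source[k] - target[k]) > tolerance
    )
    return {
        "missing_in_target": missing_in_target,
        "missing_in_source": missing_in_source,
        "value_mismatch": value_mismatch,
    }

# Hypothetical monthly totals keyed by invoice id, e.g. from two systems.
src = {"INV-1": 100.00, "INV-2": 250.50, "INV-3": 75.00}
tgt = {"INV-1": 100.00, "INV-2": 250.75}
report = reconcile(src, tgt)
```

A report like this would then feed the recurring reconciliation deliverables noted above, with the discrepancy lists routed to the owning stakeholders.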

D) Development & Enhancements

  • Support environment setup and migration activities within Google Cloud, including configuration changes between production and pre-production environments.
  • Create, test, and deploy new tables and datasets following platform standards.
  • Implement schema changes and enhancements to existing tables while ensuring backward compatibility and data integrity.
  • Enable onboarding of new countries, business units, or data domains into existing ingestion and monitoring frameworks.
  • Continuously refine ingestion and validation logic to align with evolving business requirements.

Technologies & Tools

  • Google Cloud Platform (GCP)
  • SQL
  • Python
  • ODBC
  • Postman
  • Workflow and automation tools
  • Issue tracking tools (e.g., ticketing systems)

Qualifications

Required Skills & Competencies

2–6 years of experience

Technical Skills

  • Experience with data engineering operations: ingestion monitoring, troubleshooting, and reliability ownership.
  • Hands-on exposure to GCP administration (environment handling, scheduled workloads, monitoring).
  • Strong SQL skills for investigation, validation, and operational queries.
  • Proficiency in Python for scripting, automation, and data validation.
  • Ability to work confidently across pre-production and production environments.
  • Solid understanding of system architecture and end-to-end data flows.

Professional Skills

  • Strong analytical and problem-solving capabilities with a structured approach to root cause analysis.
  • Effective communication and stakeholder management skills, including coordination with cross-functional teams.
  • Ability to balance BAU operations with development and enhancement work.
  • High level of ownership, accountability, and attention to detail.

Nice to Have

  • Experience with cloud data storage patterns and file-based ingestion.
  • Familiarity with incident management and operational SLAs.
  • Interest in improving observability, monitoring, and data quality frameworks.