Senior Data Engineer

  • Full-time
  • Legal Entity: Bosch Global Software Technologies Private Limited

Company Description

Bosch Global Software Technologies Private Limited is a 100% owned subsidiary of Robert Bosch GmbH, one of the world's leading global suppliers of technology and services, offering end-to-end Engineering, IT and Business Solutions. With over 27,000 associates, it is the largest software development center of Bosch outside Germany, making it the Technology Powerhouse of Bosch in India, with a global footprint and presence in the US, Europe and the Asia Pacific region.

Job Description

Roles & Responsibilities:

Data Architecture & Engineering

  • Design and implement end-to-end data pipelines for ingestion, transformation, and storage of structured, semi-structured, and time-series data.

  • Build both real-time and batch processing frameworks using Databricks, supporting scalable analytics and AI workloads.

  • Develop and maintain ETL/ELT workflows using Python and SQL, ensuring reusability and maintainability.

  • Architect and optimize data lakes/lakehouses (Azure Synapse, Delta Lake, BigQuery, or Snowflake) for efficient querying and cost control.

  • Design and manage NoSQL databases (MongoDB) and time-series databases (InfluxDB, TimescaleDB, Azure Data Explorer) for sensor and operational data.

  • Enable AI/ML readiness by developing feature pipelines, managing datasets, and integrating with model inference systems.

Cloud & Integration

  • Orchestrate and monitor data pipelines using Azure Data Factory, Azure Functions, and Event Hubs for real-time ingestion and transformation.

  • Build serverless, event-driven applications using Azure Functions (Python-based), AWS Lambda, or GCP Cloud Functions.

  • Implement hybrid data integration between edge, on-prem, and cloud using secure APIs, message queues, and connectors.

  • Integrate data from IoT devices, ERP, MES, PLM, and simulation tools to enable enterprise-wide digital twin insights.

  • Develop containerized microservices using Docker and Kubernetes to support portable, cloud-agnostic deployments across Azure, AWS, and GCP.

Performance, Security & Governance

  • Implement frameworks for data quality, lineage, and observability (Great Expectations, Azure Purview, OpenMetadata).

  • Enforce data governance, privacy, and compliance with standards such as GDPR, ISO 27001, and industry regulations.

  • Optimize resource utilization and cost across compute, storage, and database layers.

  • Establish data retention, access control, and lifecycle policies across multi-tenant environments.

Collaboration & Strategy

  • Collaborate with cloud architects, AI/ML engineers, and domain experts to align data architecture with Industry 4.0 and Digital Twin goals.

  • Evaluate and introduce emerging technologies such as vector databases, streaming analytics, and data mesh frameworks.

  • Mentor junior engineers and promote best practices in Pythonic coding, DevOps, and GitOps workflows.

  • Develop and maintain data engineering accelerators and reusable frameworks for internal adoption.

Qualifications

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.

  • 8+ years of experience in data engineering, analytics, or big data systems.

Mandatory skills:

  • Strong programming skills in Python and SQL for data transformation, orchestration, and automation.

  • Expertise in Azure data services (Synapse, Data Factory, Event Hubs, Azure Functions, Databricks).

  • Hands-on experience with MongoDB, Cosmos DB, and time-series databases such as InfluxDB, TimescaleDB, or Azure Data Explorer (ADX).

  • Proven experience with streaming frameworks (Kafka, Event Hubs, Kinesis) and workflow orchestrators (Airflow, Argo, or Prefect).

  • Proficiency in Docker and Kubernetes for containerization and scalable deployment.

  • Familiarity with data lake/lakehouse architectures, NoSQL models, and cloud-agnostic patterns.

  • Knowledge of CI/CD pipelines and infrastructure-as-code tools (Terraform, Bicep, ARM templates).

Preferred Skills

  • Experience with industrial IoT, Digital Twin data models, and protocols such as OPC-UA and MQTT.

  • Exposure to edge-to-cloud data flows and predictive maintenance or anomaly detection solutions.

  • Knowledge of data quality, governance, and metadata management tools.

  • Strong communication and analytical skills to align data solutions with business and operational KPIs.

Additional Information

Position Overview

We are seeking a highly skilled Senior Data Engineer to design, build, and optimize large-scale, cloud-native data platforms that power Digital Twin and Industrial AI solutions. This role focuses on developing high-performance data ingestion and transformation pipelines that unify IoT, enterprise, and AI/ML data, enabling real-time insights, scalability, and interoperability across hybrid environments.
