Sr Data Engineer

  • Full-time
  • Legal Entity: Bosch Global Software Technologies Private Limited

Company Description

Bosch Global Software Technologies Private Limited is a 100% owned subsidiary of Robert Bosch GmbH, one of the world's leading global suppliers of technology and services, offering end-to-end Engineering, IT and Business Solutions. With over 27,000 associates, it is the largest software development center of Bosch outside Germany, making it the Technology Powerhouse of Bosch in India, with a global footprint and presence in the US, Europe and the Asia Pacific region.

Job Description

Experience Summary

5–10 years of experience as a Data Engineer specializing in building document- and knowledge-oriented data pipelines for regulatory/compliance domains, with strong capabilities in structured transformations, knowledge graphs, and containerized platform integration.

Core Responsibilities / Focus

  • Build and operate data ingestion and transformation pipelines for legal/regulatory content

  • Normalize and transform heterogeneous source formats (e.g., XML/HTML/structured exports) using tools such as XSLT

  • Implement pipelines for embeddings generation, indexing, and enrichment for downstream AI/RAG systems

  • Design and manage RDF-based knowledge representations and SPARQL-accessible datasets

  • Integrate storage and processing components across containerized/cloud environments

  • Support event-driven or integration-heavy workflows (e.g., via Apache Camel, message brokers)

  • Ensure reproducibility, maintainability, and operational handover of data pipelines

 

Core Skills (Must-Have)

  • Python/Java

  • Docker / Docker Compose

  • Kubernetes

  • Knowledge Graphs (RDF)

  • SPARQL

  • XSLT

  • Embeddings pipelines / vector preparation

  • Azure Storage (or equivalent cloud storage services)

  • Apache Camel

  • Git

 

Preferred / Nice-to-Have

  • Docling (or similar document conversion tools)

  • CloudEvents

  • Kafka (or other message brokers)

  • Event-based systems / event-driven architecture

  • Dev Containers

  • GitOps

  • Documentation practices

 

Domain Advantage

Experience processing legal/regulatory source documents and preserving semantic structure / provenance

Familiarity with content domains such as EU regulation, privacy, ESG, and compliance frameworks

Qualifications

Educational qualification:

BE/B.Tech or Equivalent Degree

Experience:

5-10 Years

Mandatory/Required Skills:

Strong hands-on expertise in Python/Java, Docker / Docker Compose, Kubernetes, Knowledge Graphs (RDF), SPARQL, XSLT, embeddings pipelines / vector preparation, Azure Storage (or equivalent cloud storage services), Apache Camel, and Git

Preferred Skills:
