Data Architect
- Full-time
- Legal Entity: Bosch Global Software Technologies Private Limited
Company Description
Bosch Global Software Technologies Private Limited is a 100% owned subsidiary of Robert Bosch GmbH, one of the world's leading global suppliers of technology and services, offering end-to-end Engineering, IT and Business Solutions. With over 22,700 associates, it is the largest software development center of Bosch outside Germany, making it the Technology Powerhouse of Bosch in India, with a global footprint and presence in the US, Europe and the Asia Pacific region.
Job Description
Do you have a passion for building the high-performance data backbone for cutting-edge AI? Are you an expert in designing data architectures that leverage technologies such as vector databases and infrastructure as code? If so, please join our team and play a pivotal role in shaping the future of generative AI! We're seeking a forward-thinking Data Architect to collaborate closely with our Generative AI Engineers. You'll be responsible for designing, developing, and implementing the data infrastructure that fuels our generative AI solution development, with a focus on high performance and scalability.
Responsibilities:
- Partner with Generative AI Engineers and architects to understand their data requirements and design a highly scalable, secure, and efficient data architecture, leveraging vector databases for efficient similarity search and retrieval tasks.
- Design and implement data pipelines for ingesting, processing, and storing massive datasets for training and running generative models, utilizing Terraform (or similar technologies) for infrastructure as code (IaC) to ensure infrastructure automation and repeatability.
- Select and implement cutting-edge data storage solutions, considering factors like scalability, performance, cost, and suitability for vector data (e.g., specialized vector databases).
- Ensure data quality by implementing data cleansing, transformation, and validation processes.
- Develop data governance policies and procedures to ensure data security, compliance, and accessibility.
- Automate data pipelines and workflows using tools and techniques optimized for high-performance data processing.
- Monitor and optimize data infrastructure performance for efficiency and scalability, focusing on optimizing vector database usage for generative AI workloads.
- Collaborate with Data Scientists and Machine Learning Engineers to understand broader data needs and ensure alignment.
- Stay up to date on the latest big data technologies, vector databases, and best practices for data management in AI environments.
Qualifications:
- At least 10 years of experience in data architecture design and implementation, with a focus on high-performance data solutions.
- Strong understanding of data management principles, data modeling techniques, data governance practices, and distributed systems.
- Experience working with big data technologies (e.g., Kafka, PostgreSQL, MongoDB) and familiarity with vector databases (e.g., Pinecone, FAISS, LanceDB).
- Proficiency in SQL and experience with data warehousing solutions (e.g., Snowflake, Redshift) are an added advantage.
- Experience with Azure and AWS cloud platforms and Terraform.
- Excellent communication and collaboration skills to effectively interact with technical and non-technical stakeholders.
- Strong problem-solving and analytical skills with a data-driven approach.
- Ability to work independently and manage multiple projects simultaneously.
Qualifications
B-Tech/BE/ME/M-Tech
Additional Information
10 to 15 years of experience