Technologist, Software Development - Agentic AI, Kubernetes, AI/ML Frameworks, Python
- Full-time
- Job Type (exemption status): Exempt position - Please see related compensation & benefits details below
- Business Function: Software Development (Sys)
- Work Location: Helios Office (IBH)
Company Description
Sandisk understands how people and businesses consume data, and we relentlessly innovate to deliver solutions that enable today’s needs and tomorrow’s next big ideas. With a rich history of groundbreaking innovations in Flash and advanced memory technologies, our solutions have become the beating heart of the digital world we’re living in and that we have the power to shape.
Sandisk meets people and businesses at the intersection of their aspirations and the moment, enabling them to keep moving and pushing possibility forward. We do this through the balance of our powerhouse manufacturing capabilities and our industry-leading portfolio of products that are recognized globally for innovation, performance and quality.
Sandisk has two facilities recognized by the World Economic Forum as part of the Global Lighthouse Network for advanced 4IR innovations. These facilities were also recognized as Sustainability Lighthouses for breakthroughs in efficient operations. With our global reach, we ensure the global supply chain has access to the Flash memory it needs to keep our world moving forward.
Job Description
Position Overview
We are seeking a seasoned Senior AI Infrastructure Engineer specializing in AI-driven automation, Kubernetes orchestration, and containerized AI/ML workloads. This role focuses on architecting intelligent systems that use agentic AI workflows and advanced container orchestration to automate product validation and testing at scale.
Key Responsibilities
AI Workflow Architecture & Implementation
- Design and implement AI-driven test creation systems that automatically generate comprehensive test cases using machine learning models and historical data analysis 
- Architect intelligent monitoring solutions leveraging AI/ML algorithms to predict system failures, optimize test coverage, and enable proactive issue resolution 
- Develop autonomous testing agents capable of adaptive test execution, self-healing infrastructure, and intelligent decision-making 
- Integrate large language models (LLMs) and AI decision-making frameworks into automated test orchestration workflows
- Build AI pipelines that analyze product performance data to continuously improve validation strategies 
Kubernetes & Container Orchestration
- Design, implement, and optimize Kubernetes clusters specifically for AI/ML workloads and containerized testing environments 
- Develop custom Kubernetes operators and controllers for AI workflow management and automated container lifecycle operations 
- Architect scalable container orchestration solutions supporting both AI model inference and training workloads across diverse hardware configurations 
- Implement advanced Kubernetes patterns including multi-cluster management, resource optimization, and intelligent workload scheduling 
- Design container-native AI platforms with automated scaling, resource allocation, and performance optimization 
- Design and implement comprehensive monitoring solutions using Prometheus, Grafana, and related CNCF tools 
Technical Leadership & AI System Architecture
- Lead a team of 3-5 engineers specializing in AI/ML infrastructure, container platforms, and intelligent automation systems 
- Drive architectural decisions for complex AI-driven projects involving containerized workloads and intelligent system integration 
- Establish best practices for AI workflow development, container security, and MLOps processes 
- Mentor team members on advanced Kubernetes patterns, AI system design, and containerization strategies for ML workloads 
Qualifications
Experience & Technical Skills
- BE/B.Tech/ME/M.Tech degree in Electronics & Electrical Engineering, Computer Engineering, or a related field
- 10-14 years of overall professional experience 
- 5+ years of hands-on experience with Kubernetes in production, including custom operators, controllers, and AI/ML workload management 
- 4+ years of experience implementing AI/ML workflows in production environments, particularly for automation and intelligent testing
- 3+ years of experience in containerized application development and container orchestration for AI/ML systems 
- Proven track record of leading technical teams (3-5 people) and delivering AI-driven infrastructure projects 
Core Technical Competencies
- Expert-level Kubernetes skills: Kubernetes (K8s), Helm, custom operators, resource management, and ML-specific configurations
- Advanced AI/ML framework proficiency: TensorFlow, PyTorch, LangChain, MLflow, or similar tools, along with AI model deployment patterns
- Strong programming skills: Python, Go, or Rust, with a focus on AI workflow development and container/Docker orchestration
- AI system architecture: Experience with agentic AI systems, LLM and RAG integration, and intelligent automation frameworks
Leadership & Communication
- Demonstrated ability to lead cross-functional teams focused on AI infrastructure and intelligent system development 
- Strong project management skills with experience delivering AI-driven automation projects 
- Excellent technical communication skills, with the ability to explain complex AI and container concepts to diverse stakeholders
- Experience mentoring engineers in AI system design, Kubernetes administration, and containerization best practices
Additional Information
Sandisk thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.
Sandisk is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at [email protected] to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.