Vision-Language-Action Models - Intern
- Legal Entity: Robert Bosch LLC
Company Description
The Bosch Research and Technology Center North America, with offices in Sunnyvale, California; Pittsburgh, Pennsylvania; and Cambridge, Massachusetts, is part of the global Bosch Group (www.bosch.com), a company with over 70 billion euros in revenue, 400,000 employees worldwide, a very diverse product portfolio, and a history spanning over 125 years. The Research and Technology Center North America (RTC-NA) is dedicated to providing technologies and system solutions for various Bosch business fields, primarily in the fields of artificial intelligence, energy technologies, internet technologies, circuit design, semiconductors and wireless, as well as advanced MEMS design.
As part of Bosch's global research, our AI research in Silicon Valley focuses on Foundation Models, Big Data Visual Analytics, Explainable AI (XAI), Natural Language Processing, Computer Vision & Mixed Reality, Cloud Robotics, Data Science, AI System Engineering, and Time-series Analysis. We develop scalable, intelligent, and trustworthy AIoT solutions for Bosch products and services in application areas such as automated driving, advanced driver assistance systems (ADAS), robotics, smart manufacturing, enterprise AI, health care, and smart home and building solutions.
Originating from the AI research in Silicon Valley, our Intelligent Autonomous Systems group is responsible for enabling future autonomous Bosch products by pushing the boundaries of automated driving, advanced driver assistance systems (ADAS), robotics, and automation through key innovations that encompass system architecture and AI components. These include methods for motion planning, high-level task planning, and decision making, as well as frameworks that build on reliable distributed computing to make these technologies work on real products. We work with internal partners in different Bosch business units to transfer our solutions into future products. We also actively collaborate with leading groups in academia and industry to promote research ideas and publish research findings in internationally renowned conferences and journals such as CVPR, ICRA, IROS, RSS, NeurIPS, and CoRL.
Job Description
As an intern in the field of Vision-Language-Action Models, your responsibilities will be the following:
- Conduct advanced research on LLMs/VLMs for autonomous driving.
- Design and implement supervised and reinforcement fine-tuning algorithms to optimize LLMs/VLMs for autonomous driving tasks.
- Collaborate with mentors and team members to refine research goals, discuss technical challenges, and explore extensions such as closed-loop fine-tuning and RL integration.
- Regularly report research progress through meetings, written updates, and technical presentations.
- Analyze experimental results, document methodologies, and summarize findings in clear and reproducible formats.
- Contribute to the preparation of research papers, technical reports, or potential submissions to top conferences.
Qualifications
Basic Qualifications
- Ph.D. student in Computer Science, Robotics or related fields.
- Hands-on experience developing algorithms, with a focus on at least two of the following areas: multimodal foundation models, 3D scene understanding, autonomous driving, reinforcement learning, and robotic navigation or planning.
- Solid Python skills and proficiency with libraries such as PyTorch.
- Minimum GPA of 3.0
Preferred Qualifications
- Publication record in top venues such as CVPR, ICCV, ECCV, and ICLR.
- Familiarity with CARLA or NavSim.
- Ability to work independently, with strong research and problem-solving skills.
- Good communication and teamwork skills.
Additional Information
The U.S. base salary range for this intern position is $39.00 - $64.00. Within the range, individual pay is determined based on several factors, including, but not limited to, type of degree, work experience and job knowledge, complexity of the role, type of position, job location, etc. Your Hiring Manager can share more details about the specific salary range for this position during the interview process.*
BOSCH is a proud supporter of STEM (Science, Technology, Engineering & Mathematics) initiatives:
- FIRST Robotics (For Inspiration and Recognition of Science and Technology)
- AWIM (A World In Motion)
Equal Opportunity Employer, including disability / veterans.
*Bosch adheres to Federal, State, and Local laws regarding drug-testing. Employment is contingent upon the successful completion of a drug screen and background check. Candidates who have been offered the position must pass both screenings before their start date.