Sr. Staff Software Engineer, Systems Infrastructure
- Full-time
- Workplace Type: Hybrid
- Career Track & Grade: IC5/10
- Department: Engineering
Company Description
LinkedIn is the world’s largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make powerful connections, discover exciting opportunities, build necessary skills, and gain valuable insights every day. We’re also committed to providing transformational opportunities for our own employees by investing in their growth. We aspire to create a culture that’s built on trust, care, inclusion, and fun – where everyone can succeed. Join us to transform the way the world works.
Job Description
This role will be based in Sunnyvale or Mountain View, CA.
At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.
Join LinkedIn’s Serving Foundations team and work on one of the most critical layers of our AI platform—powering large-scale model inference across all major AI use cases. This team sits at the center of LinkedIn’s AI stack and is responsible for making our models faster, more efficient, and more scalable at production scale.
This is a deeply technical, systems-focused position at the intersection of machine learning, compilers, and hardware. You will work across the full stack—from model graphs and optimization techniques to runtime systems, kernels, and GPU execution—to push the limits of performance and efficiency.
You will lead efforts to optimize large-scale inference systems serving billions of requests, driving improvements in latency, throughput, and cost. This includes advancing GPU utilization, designing custom kernel and operator optimizations, improving model efficiency through quantization and compression, and shaping how models are compiled and executed in production environments.
As a Sr. Staff Engineer, you will operate with a high degree of autonomy and influence, identifying bottlenecks across the system and driving end-to-end solutions across model, runtime, and infrastructure layers. Your work will directly impact how AI systems perform at LinkedIn scale and will help define the future of our AI serving platform.
Responsibilities
Lead end-to-end optimization of large-scale AI inference systems across model, runtime, and hardware layers
Design and implement GPU-efficient solutions, including kernel optimization, operator fusion, and memory optimization
Apply model optimization techniques such as quantization, pruning, and mixed precision to improve performance and efficiency
Optimize model execution using ML compilers and runtimes (e.g., TensorRT, XLA, TVM, Triton)
Build and scale low-latency, high-throughput inference systems for both online and offline workloads
Identify and resolve bottlenecks across distributed systems and model serving pipelines
Set technical direction and influence best practices for AI performance and efficiency across teams
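As a flavor of the model-optimization work listed above, here is a minimal sketch of symmetric per-tensor int8 post-training quantization, one of the techniques named. This is an illustrative example only, not LinkedIn's production implementation; the function names and the use of NumPy are assumptions for the sketch.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization.

    Maps float32 weights into [-127, 127] using a single scale factor,
    cutting weight memory 4x at a small accuracy cost.
    """
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 weights."""
    return q.astype(np.float32) * scale

# float32 weights -> int8: 4x smaller, rounding error bounded by ~scale/2
w = np.random.randn(1024).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
```

Production systems typically go further (per-channel scales, calibration data, quantization-aware training), but the core latency/memory win comes from this same reduced-precision representation.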
Qualifications
Basic Qualifications
BS/BA in Computer Science or related technical field or equivalent experience
8+ years of experience in software engineering with a focus on systems and performance
Experience building or optimizing large-scale production ML systems
Experience programming in Python, C++, or similar languages
Experience working with distributed systems and large-scale infrastructure
Preferred Qualifications
Deep expertise in GPU programming and optimization (CUDA, Triton, or similar)
Experience with model optimization techniques such as quantization, pruning, and compression
Experience with ML compilers or runtimes such as TensorRT, XLA, TVM, TorchInductor, or similar
Hands-on experience with kernel-level or operator-level optimization
Experience building or scaling high-performance inference systems, including LLM serving
Understanding of latency, throughput, and cost tradeoffs in production ML systems
Background in high-performance computing or hardware-aware optimization
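To make the latency/throughput/cost tradeoff mentioned above concrete, a back-of-envelope sketch (all numbers hypothetical, chosen only for illustration): batching amortizes fixed per-batch overhead, raising throughput and lowering cost per query, at the price of higher per-request latency.

```python
def serving_tradeoff(batch_size: int, per_batch_ms: float, gpu_cost_per_hour: float):
    """Back-of-envelope serving economics.

    per_batch_ms is the measured end-to-end GPU time for one batch;
    in practice it grows sublinearly with batch size, which is where
    the throughput win comes from.
    """
    throughput_qps = batch_size / (per_batch_ms / 1000.0)
    cost_per_million = gpu_cost_per_hour / (throughput_qps * 3600.0) * 1_000_000
    return throughput_qps, cost_per_million

# Hypothetical measurements: batch 1 takes 10 ms, batch 8 takes 24 ms,
# on a GPU costing $2/hour.
single = serving_tradeoff(1, 10.0, 2.0)   # 100 QPS at 10 ms latency
batched = serving_tradeoff(8, 24.0, 2.0)  # higher QPS, but 24 ms latency
```

The batched configuration serves more queries per second at a lower cost per query, but each request waits longer; tuning that balance per workload is the heart of the tradeoff.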
Suggested Skills
AI/ML Systems and Infrastructure
GPU and Performance Optimization
Model Serving and Inference Systems
Distributed Systems
Technical Leadership
LinkedIn is committed to fair and equitable compensation practices.
The pay range for this role is $198,000 to $326,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. The range may differ in other locations due to differences in the cost of labor.
The total compensation package for this position may also include annual performance bonus, stock, benefits and/or other applicable incentive compensation plans. For more information, visit https://careers.linkedin.com/benefits.
Additional Information
Equal Opportunity Statement
We seek candidates with a wide range of perspectives and backgrounds and we are proud to be an equal opportunity employer. LinkedIn considers qualified applicants without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.
LinkedIn is committed to offering an inclusive and accessible experience for all job seekers, including individuals with disabilities. Our goal is to foster an inclusive and accessible workplace where everyone has the opportunity to be successful.
If you need a reasonable accommodation to search for a job opening, apply for a position, or participate in the interview process, connect with us at [email protected] and describe the specific accommodation requested for a disability-related limitation.
Reasonable accommodations are modifications or adjustments to the application or hiring process that would enable you to fully participate in that process. Examples of reasonable accommodations include but are not limited to:
- Documents in alternate formats or read aloud to you
- Having interviews in an accessible location
- Being accompanied by a service dog
- Having a sign language interpreter present for the interview
A request for an accommodation will be responded to within three business days. However, non-disability related requests, such as following up on an application, will not receive a response.
LinkedIn will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by LinkedIn, or (c) consistent with LinkedIn's legal duty to furnish information.
San Francisco Fair Chance Ordinance
Pursuant to the San Francisco Fair Chance Ordinance, LinkedIn will consider for employment qualified applicants with arrest and conviction records.
Pay Transparency Policy Statement
As a federal contractor, LinkedIn follows the Pay Transparency and non-discrimination provisions described at this link: https://lnkd.in/paytransparency.
Global Data Privacy Notice for Job Candidates
Please follow this link to access the document that provides transparency around the way in which LinkedIn handles personal data of employees and job applicants: https://legal.linkedin.com/candidate-portal.