Staff HPC Infrastructure Engineer

  • Full-time

Company Description

Guardant Health is a leading precision oncology company focused on helping conquer cancer globally through use of its proprietary tests, vast data sets and advanced analytics. The Guardant Health oncology platform leverages capabilities to drive commercial adoption, improve patient clinical outcomes and lower healthcare costs across all stages of the cancer care continuum. Guardant Health has commercially launched Guardant360®, Guardant360 CDx, Guardant360 TissueNext™, Guardant360 Response™, and GuardantOMNI® tests for advanced stage cancer patients, and Guardant Reveal™ for early-stage cancer patients. The Guardant Health screening portfolio, including the Shield™ test, aims to address the needs of individuals eligible for cancer screening.

Job Description

About the Role:

Guardant’s HPC team builds and operates the computational technology backbone of the company. 

This includes scalable data storage that holds petabytes of genomics data, high-performance compute clusters running a custom bioinformatics pipeline in production and R&D environments, and the software infrastructure that hosts an ecosystem of services for internal data processing and external data integration. To support Guardant Health’s fast growth over the next few years, the HPC team is looking for a strong technical engineer who can help maintain and grow the HPC infrastructure during its aggressive expansion while working with corporate IT, SQA, and DevOps/SRE teams.

Although we prefer someone local to the San Francisco Bay Area who can work on premises in Redwood City and Palo Alto, this role can be performed mostly remotely. While on rotation, and during maintenance windows and cluster deployments, presence at the work location is required.

Essential Duties and Responsibilities:

  • Act as a technical lead in day-to-day operations

  • Help manage the HPC interconnects

  • Help integrate the HPC systems with the bandwidth on-demand system

  • Help integrate the HPC system with the single namespace storage system

  • Help integrate cloud bursting as part of the HPC abstraction work

  • Work with the networking infrastructure team to manage and optimize the connectivity to and from the HPC systems and locales

  • Help manage multiple HPC clusters and cluster file systems

  • Help research, develop and implement the next generation HPC solution

  • Troubleshoot the production system stack down to the source-code level (e.g., shell scripts, Python, and others)

  • Maintain, monitor, and support the infrastructure environment and/or facilities.

  • Use and maintain enhanced production monitoring and additional capability.

  • Support improvements for increased system reliability and performance.

  • Support multiple systems or applications of medium to high complexity (complexity defined by size, technology used, and system feeds and interfaces) with multiple concurrent users, ensuring control, integrity, and accessibility

  • Support systems at remote locations, including internationally

  • Work with offsite consultants to maintain the infrastructure

  • Work with vendors to troubleshoot, upgrade and repair systems as needed

  • Participate in a 24/7 on-call rotation

Qualifications

You enjoy an agile, very fast-paced, and highly technical environment. You are a self-driven, accomplished technologist who strives to continually improve your skills, your value to the company, and the computational infrastructure. You are dedicated to engineering excellence yet pragmatic and flexible. You can maintain the day-to-day support SLA while running various key projects that move the business forward.

  • B.S. in Computer Science or related field

  • 4+ years of TCP/IP networking experience

  • 2+ years of RDMA networking experience

  • 4+ years of Linux/Unix administration, knowledge of Unix network protocols, TCP/IP network fundamentals, core infrastructure technologies and virtualization 

  • 2+ years of experience with large-scale data storage and compute cluster (HPC) infrastructure

  • 2+ years working in and with on-premises and cloud-based (AWS, Google, IBM and Azure) data centers

  • 2+ years of building software release and ops processes and automation toolsets

  • 2+ years providing documentation of system administration

Preferred Skills:

  • Cisco Certified Network Professional certification

  • Experience with Arista and compatible networking, up to and including 400 Gb/s links

  • Experience with Mellanox InfiniBand fabric

  • Experience administering IBM’s General Parallel File System

  • Experience administering the Slurm scheduler

  • Experience using Warewulf

  • Experience with cloud bursting technologies

  • Experience with wide area file systems 

  • Experience with Docker and container technologies

  • Experience with Kubernetes 

  • Experience operating infrastructure compliant with HIPAA and SOX standards

Additional Information

Hybrid Work Model: At Guardant Health, we have defined days for in-person/onsite collaboration and work-from-home days for individual-focused time. All U.S. employees who live within 50 miles of a Guardant facility will be required to be onsite on Mondays, Tuesdays, and Thursdays. We have found aligning our scheduled in-office days allows our teams to do the best work and creates the focused thinking time our innovative work requires. At Guardant, our work model has created flexibility for better work-life balance while keeping teams connected to advance our science for our patients.

The US base salary range for this full-time position is $174,700 to $235,900. The range does not include benefits, and if applicable, bonus, commission, or equity. The range displayed reflects the minimum and maximum target for new hire salaries across all US locations for the posted role with the exception of any locations specifically referenced below (if any).

For positions based in Palo Alto, CA or Redwood City, CA, the base salary range for this full-time position is $174,700 to $235,900. The range does not include benefits, and if applicable, bonus, commission, or equity.

Within the range, individual pay is determined by work location and additional factors, including, but not limited to, job-related skills, experience, and relevant education or training. If you are selected to move forward, the recruiting team will provide details specific to the factors above.

Employee may be required to lift routine office supplies and use office equipment. Majority of the work is performed in a desk/office environment; however, there may be exposure to high noise levels, fumes, and biohazard material in the laboratory environment. Ability to sit for extended periods of time.

Guardant Health is committed to providing reasonable accommodations in our hiring processes for candidates with disabilities, long-term conditions, mental health conditions, or sincerely held religious beliefs. If you need support, please reach out to [email protected]

Guardant Health is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or protected veteran status and will not be discriminated against on the basis of disability.

All your information will be kept confidential according to EEO guidelines.

To learn more about the information collected when you apply for a position at Guardant Health, Inc. and how it is used, please review our Privacy Notice for Job Applicants.

Please visit our career page at: http://www.guardanthealth.com/jobs/
