Site Reliability Engineer (SRE/DevOps) - Engineering Productivity

  • Full-time

Company Description

Arista Networks was founded to pioneer and deliver software-driven cloud networking solutions for large data center storage and computing environments. Arista’s award-winning platforms, ranging in Ethernet speeds from 10 to 400 gigabits per second, redefine scalability, agility and resilience. Arista has shipped more than 20 million cloud networking ports worldwide with CloudVision and EOS, an advanced network operating system. Committed to open standards, Arista is a founding member of the 25/50GbE consortium. Arista Networks products are available worldwide directly and through partners.

Additional information and resources can be found at:
www.arista.com
www.twitter.com/aristanetworks
www.facebook.com/AristaNW
www.youtube.com/user/AristaNetworks

Job Description

Working in Engineering Productivity (EngProd), you will collaborate with other engineers to design, build, scale, and operate the systems that the rest of Arista’s development teams use. The EngProd team uses industry-standard systems like Ansible, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, Elasticsearch, Google Cloud, and Varnish, as well as internal systems that we’ve built from the ground up to automate CI/CD, testing, analysis, and visualization.

Responsibilities:

  • Keep production status green at all times

  • Proactively monitor, respond to, and enhance alerts

  • Build automated responses to the most common alerts, or work with the rest of the EngProd team to build them

  • Create and maintain incident response runbooks in collaboration with the service development teams

  • Debug and resolve issues that affect the developer experience and infrastructure stability

  • Develop patterns to support system reliability and socialize them within the EngProd team

  • Review and contribute to the specifications and implementations written by other team members

  • Work with Arista’s software engineers to identify bottlenecks and limitations in our workflows, tooling, and infrastructure, and provide fixes for those problems

  • Support Arista’s development teams in using our tools and infrastructure

Qualifications

  • At least a BS in Computer Science or Engineering plus 5 years’ experience, an MS in Computer Science or Engineering plus 4 years’ experience, a Ph.D. in Computer Science, or equivalent work experience.

  • Knowledge of one or more of Go, Python, JavaScript, or shell scripting.

  • Knowledge of Linux (or UNIX).

  • Experience operating software systems at scale.

  • Strong understanding of the fundamentals of storage and networking.

  • Comfortable with Ansible and GitOps.

  • Applied understanding of software engineering principles.

  • Strong problem-solving and software troubleshooting skills.

  • Ability to design a solution and implement features independently. Ability to work in small teams.

Additional Information

All your information will be kept confidential according to EEO guidelines.
