Senior Site Reliability Engineer (SRE) - REMOTE USA
- Atlanta, GA, USA
- Employees can work remotely
As part of our global SRE organization supporting our cloud email service, you will join a team responsible for delivering advanced security services via the cloud. You will proactively analyze cloud services for performance improvements and automation opportunities as we continue to push toward more cloud-native services.
What You Will Do:
- Design, write, and deliver software to improve the availability, scalability, latency, and efficiency of FireEye’s cloud services.
- Solve problems relating to mission-critical services, using automation to prevent recurrence, with the goal of automating the response to all non-exceptional service conditions and building better technology rather than resolving issues manually.
- Influence and create new designs, architectures, standards, and methods for large-scale distributed systems.
- Engage in service capacity planning and demand forecasting, software performance analysis, and system tuning.
- Participate in an on-call rotation, responding with urgency to incidents as they arise.
- Work as part of a team serving multiple stakeholders, balancing priorities to deliver on time while communicating status to internal customers.
- Take initiative to identify and address opportunities for improvement within the organization.
Requirements:
- 3+ years of relevant experience.
- Systematic problem-solving approach coupled with a strong sense of ownership and drive.
- Experience in one or more of: C, C++, Java, Python, Go, Ruby, Scala, NodeJS.
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
- Familiarity with running web services at scale; understanding of cloud-native technologies and networking.
- Experience developing tools and APIs to reduce manual interaction with systems and applications using a variety of coding and scripting standards.
- Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
- Networking: knowledge of network theory and protocols (TCP/IP, UDP, DNS, routing, OSI layers, load balancing, etc.).
- Experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) technology stacks such as Amazon Web Services (AWS).
- Strong written and verbal communication skills.
- US Citizenship required due to product compliance requirements.
- Familiarity with container solutions (Kubernetes, Docker, etc.) or configuration management solutions (Ansible, Consul, Terraform, etc.).
- Experience with algorithms, data structures, complexity analysis, and software design.