Site Reliability Engineer

  • Full-time
  • Position Type: Permanent

Job Description

Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following:

  • Work with containers and container orchestration systems such as Kubernetes 

  • Perform capacity planning to determine the resources your service needs to be scalable, efficient, and reliable

  • Identify and troubleshoot availability and performance issues at multiple layers of the deployment, from hardware and operating environment to network and application

  • Collaborate with other engineers to implement operational solutions while defining and adhering to industry best practices 

  • Participate in a weekly on-call rotation

Qualifications

  • 3 to 5+ years of experience in Cloud Operations
  • Proficiency with Infrastructure as Code technologies such as Terraform, CloudFormation, or ARM
  • Experience developing and deploying resources with a cloud provider (e.g., Azure, AWS, Cloudflare, GCP)
  • Understanding of networking concepts (load balancing, TCP/IP, HTTP, gRPC, DNS) and troubleshooting tools (Wireshark, command-line utilities, BPF)
  • Experience with version control systems (GitHub, GitLab, Bitbucket)
  • Comfortable with scripting and programming languages such as Python, Bash, and Go
  • Familiarity with container technologies like Docker and Kubernetes
  • Knowledge of cloud-native architecture, cloud tooling, and current trends and practices
  • Relevant Linux, Kubernetes, and cloud certifications are a plus

 
