Site Reliability Engineer
- Full-time
- Position Type: Permanent
Job Description
Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following:
- Work with containers and container orchestration systems such as Kubernetes
- Perform capacity planning to determine the resources your service needs to be scalable, efficient, and reliable
- Identify and troubleshoot availability and performance issues at multiple layers of the deployment, from hardware to operating environment, network, and application
- Collaborate with other engineers to implement operational solutions while defining and adhering to industry best practices
- Participate in a weekly on-call rotation
Qualifications
- 3 to 5+ years of experience in Cloud Operations
- Proficiency with Infrastructure as Code technologies such as Terraform, CloudFormation, or ARM templates
- Experience developing and deploying resources with a cloud provider (e.g., Azure, AWS, Cloudflare, GCP)
- Understanding of networking concepts (load balancing, TCP/IP, HTTP, gRPC, DNS) and troubleshooting tools (Wireshark, command line, BPF)
- Experience with version control systems (GitHub, GitLab, Bitbucket)
- Comfortable with scripting and programming languages such as Python, Bash, and Go
- Familiarity with container technologies like Docker and Kubernetes
- Knowledge of cloud-native architecture, cloud tooling, and current trends and practices
- Relevant Linux, Kubernetes, and cloud certifications are a plus