Site Reliability Engineer

  • Full-time

Company Description

Located in the heart of the Rocky Mountains, we are a team that loves to play hard and work harder. As an Outside Magazine Best Place to Work 8 years running, we believe success is powered by our team’s happiness. We’re just a talented bunch of innovative developers, creative artists, social experts, email ninjas, search enthusiasts, anglers, alpine skiers, trail runners and environmentalists who genuinely care about our clients’ success.

Our vision is to be the premier digital agency in the hospitality industry by driving quality conversions and creating digital experiences that inspire and motivate travelers to place. Our mission is to create value and deliver measurable results to our clients through innovation and quality in the digital space. We focus on sustainable growth driven by the success of our clients, the strength of our team and a culture that encourages excellence in both our professional and personal lives. 

Job Description

Bluetent’s DevOps team is growing! The Site Reliability Engineer will work with the CTO and Cloud Systems Engineer to ensure that our global service platform is always ready to meet growing business needs and opportunities. This position combines systems engineering and software skills to build and run applications on a cutting-edge, cloud-native infrastructure using Kubernetes, Docker and more.

The Site Reliability Engineer will help improve and maintain our platform’s SLOs, support the development team with CI/CD tools, and run the production environment by monitoring the availability and health of workloads.

In this role, you will: 

  • Support the development, testing, deployment, monitoring and maintenance of Bluetent’s large-scale, distributed, fault-tolerant platform software, marketing and eCommerce services.
  • Participate in the automation, monitoring and maintenance of Bluetent’s cloud-native and multi-cloud Kubernetes clusters.
  • Develop tools and automated solutions in support of platform services.
  • Monitor and manage internal DevOps cases from intake to resolution in support of client services, software implementation, technical support and product/platform engineering.
  • Troubleshoot performance, reliability, and scalability issues.
  • Collaborate with application developers in the improvement of developer experience and toolsets.
  • Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless postmortems.
  • Maintain and administer data stores ensuring proper backup, replication and failover strategies.
  • Ensure proper security, monitoring, alerting and reporting for production infrastructure.
  • Take broad, conceptual ideas and turn them into functional architecture and software designs to solve customers’ use cases.
  • Troubleshoot and resolve issues related to application development, deployment and operations.

Qualifications

Minimum qualifications:

  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • Two years of full-time professional experience, or equivalent, with at least a few of the following:
      • Linux/Unix server administration
      • Cloud systems administration (AWS, Google Cloud, Azure, etc.)
      • CI/CD automation (Jenkins, Jenkins X, CircleCI, etc.)
      • Kubernetes cluster administration in AWS or Google Cloud (kubectl, Helm, Minikube, etc.)
      • Application containers (Docker, Docker for Mac)
      • Scripting/programming languages (PHP, Bash, Go, Node.js, Perl, Python, etc.)
      • Databases, NoSQL, queues, pub/sub and cache (MySQL, MariaDB, AWS Aurora DB, Google Cloud SQL, Scylla, Apache Solr, ElastiCache, Redis, Kafka, Amazon SQS, etc.)
      • Websites, web services, microservices (HTTP(S), Drupal, WordPress, Nginx, Apache HTTPD, Kong, SOAP, OAS/Swagger, JSON, CSS, JS, CDN, gRPC, Protocol Buffers, AWS Lambda, Serverless, Let's Encrypt, etc.)
      • Application monitoring and profiling (CloudWatch, Stackdriver, Graylog, Grafana, Prometheus, New Relic, etc.)

Other qualifications:

  • BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics).
  • Ability to debug, profile and optimize code and application design and automate routine tasks.
  • Experience creating and maintaining Jenkins pipelines.
  • Experience with algorithms, data structures, complexity analysis and software design.
  • Excellent verbal and written communication skills.

 

Additional Information

All your information will be kept confidential according to EEO guidelines.

Bluetent is currently unable to offer visa sponsorship.