[VCT] Senior SRE Engineer
- Full-time
Company Description
We are Software Mind, an awesome team of engineers who are ready to ramp up any top-notch company’s projects! Our aim? To always be one step ahead. Become part of a multicultural company in constant growth with an excellent work environment certified by Great Place To Work!
Job Description
Project - the aim you'll have
We're looking for a skilled Senior SRE Engineer to join a team that works on a complex distributed architecture spanning physical machines, virtualized on-prem hosts, and cloud computing. Our Client develops and deploys systematic financial strategies across a variety of asset classes and global markets, and our teams work collaboratively to drive the production of high-quality predictive signals and financial strategies, which are the foundation of a sustainable, global investment platform.
If you enjoy working with cutting-edge technologies in a fast-paced environment, this opportunity is for you!
Qualifications
Expectations - the experience you need
- 5+ years of proven experience in SRE.
- Deep expertise and hands-on experience working with Linux-based systems, with a focus on optimization and troubleshooting.
- Strong skills in Python for scripting, automation, and system management.
- In-depth knowledge of container orchestration technologies such as Kubernetes (K8S). Experience with other cluster management tools like Slurm is a plus.
- Hands-on experience with tools like Helm, Terraform, and Ansible to manage infrastructure in a scalable and automated way.
- Strong working knowledge of Docker, Podman, or other containerization systems to enable efficient and consistent deployment.
- Experience working with Git-based CI/CD platforms such as GitLab (preferred) or GitHub to ensure smooth and rapid delivery cycles.
- Experience with monitoring and logging solutions such as Prometheus, Grafana, and the ELK stack to provide comprehensive insights into system performance and health.
- Understanding of relational databases, their performance tuning, and management in distributed systems.
- Familiarity with Agile development methodologies, with a focus on continuous improvement and collaboration.
- Exposure to cloud technologies such as AWS or Google Cloud (GCP) is a strong plus.
Position - how you'll contribute
- Architecture and Automation: Design and deploy as-a-service solutions using open-source software to automate system management, scaling, and monitoring.
- System Optimization: Develop tools to streamline deployment, monitoring, and incident management for large-scale, distributed environments.
- Collaboration Across Teams: Work with development and operations teams to design and implement software solutions that enhance the overall reliability of services. Contribute to the ongoing DevOps and Agile transformation.
- Monitoring & Incident Response: Set up, configure, and maintain monitoring and alerting systems to ensure real-time visibility into system performance. Participate in on-call rotations to respond to incidents and mitigate downtime.
- CI/CD & Infrastructure Management: Continuously improve CI/CD pipelines using tools like GitLab, Helm, Terraform, and Ansible, ensuring fast, safe, and reliable deployments.
- Container Orchestration: Leverage container orchestration platforms like Kubernetes (K8S) to manage distributed systems at scale. Experience with Slurm or similar cluster management tools is a plus.
- Cloud and Automation Tools: Use cloud infrastructure (AWS, GCP, etc.) and Infrastructure as Code (IaC) tools to automate the provisioning and scaling of resources.
Our Benefits
- Educational resources.
- Flexible schedule and Work From Anywhere.
- Referral Program.
- Supportive and chill atmosphere.
We are accepting applications from LATAM countries.
Position at: Software Mind LATAM