Senior Application Production Support
- Full-time
Job Description
As a member of the Application Production Support team, you will play a key role in ensuring the stability, availability, and performance of critical business applications in a production environment. You will manage incidents, collaborate with internal and external stakeholders, and drive long-term solutions to maintain operational excellence.
Main Responsibilities:
Application Stability & Availability
- Monitor and maintain applications to ensure high availability and optimal performance.
- Actively participate in incident management (e.g., Situation Rooms for P1/P2 incidents, root cause analysis) and problem resolution, identifying trends and implementing permanent fixes.
- Ensure compliance with ITIL governance and SLAs within IT Production.
- Execute change requests and deployments using ITIL and DevOps tools and processes.
- Proactively identify and resolve technical issues to guarantee smooth business operations.
- Participate in on-call rotations to ensure 24/7 support for critical applications.
Technical Support & Collaboration
- Act as a primary contact for Development teams, troubleshooting issues and coordinating fixes.
- Collaborate with Scrum teams to design, deploy, and enhance systems.
- Implement upgrades, patches, and new functionalities while minimizing user impact.
Documentation & Knowledge Sharing
- Maintain and update technical documentation for processes, configurations, and troubleshooting guides.
- Share best practices and knowledge with global support teams to improve efficiency.
Platform Monitoring
- Implement and optimize monitoring tools (e.g., Dynatrace) within the production environment.
- Collaborate with Development teams and Centers of Expertise to define observability and monitoring practices.
- Promote awareness of observability for early detection and resolution of potential issues.
Qualifications
API, Application Servers & Kubernetes (Expert):
- Java application servers (Red Hat JBoss EAP) and Java knowledge (heap/thread dump analysis, performance optimization).
- OpenShift & Cloud (Kubernetes).
- API gateway integration (Axway/Apigee).
Operating Systems: Red Hat Enterprise Linux (RHEL) (Expert).
Tooling (Expert):
- Monitoring/observability tools (Dynatrace, Jaeger, Grafana, Prometheus, ELK).
- CI/CD setup & optimization (GitLab, ArgoCD, Jenkins, Sonatype Nexus).
- Automation tools (Ansible, Terraform).
Databases: Microsoft SQL Server, PostgreSQL (Expert).
Soft Skills
- Problem-solving & critical thinking (Expert).
- Team collaboration and communication skills.
- Resilience and adaptability.
- Stress management and accountability.
- Autonomy, time management, and prioritization.
- Goal-oriented and detail-focused.
Additional Information
Language Skills
- Portuguese: Mastery.
- English: Advanced.
Work Model: 50-50 hybrid.