Site Reliability Engineering (SRE) Intern
- Full-time
- Compensation: USD 30 - USD 34 - hourly
Company Description
The AWP Safety IT Internship Program provides a hands‑on, high‑impact learning experience designed for early‑career professionals who want to build a future in IT Site Reliability Engineering. In this role, you won't just be watching application performance monitoring dashboards; you will be building the observability pipelines that keep our infrastructure and applications resilient, highly available, and robust. You will work at the intersection of Software Engineering and Systems Operations, using Dynatrace as your primary lens to diagnose performance bottlenecks and automate "toil" out of existence.
While this internship is primarily project‑based and can be remote depending on location, interns will also have opportunities to collaborate closely with cross‑functional teams to understand how technical insights drive real‑world business outcomes.
What You’ll Experience
- Full‑Stack Observability: Trace requests from browser to code to database.
- Incident Lifecycle: Join blameless post‑mortems and help implement “never‑twice” fixes.
- AIOps: Use Dynatrace’s predictive AI to find “the needle in the haystack” before an outage occurs.
- Scalable Infrastructure: Learn how to manage monitoring for thousands of hosts without manual intervention.
Professional & Team‑Building Activities
- Attend workshops, panels, and intern networking events.
- Participate in our “Journey‑to‑the‑Job” series to hear seasoned executives share their diverse career paths within the organization.
Job Description
This 10-week internship places interns at the center of our IT operations, offering meaningful work with real organizational impact. You’ll thrive if you have a passion for “measuring everything”. You’ll collaborate closely with Platform, AppDev, and Security teams on production‑grade outcomes for our business.
Core Responsibilities
- Observability‑as‑Code: Help deploy and configure Dynatrace OneAgent and ActiveGates with automated tooling.
- SLI/SLO Implementation: Define and instrument user‑centric metrics and objectives in Dynatrace.
- AI‑Assisted Troubleshooting: Combine Davis® AI with Copilot/Claude to identify root causes and reduce MTTR.
- Dashboard Engineering: Build actionable, real‑time dashboards for application and cloud health.
- Automation & Scripting: Write Python/Bash to trigger self‑healing or response playbooks from alerts.
Qualifications
- Rising junior/senior or current master’s student.
- Clear communication and teamwork skills in fast‑moving ops environments.
- Systems Thinking: Understand how web apps, databases, and networks interact.
- SRE Mindset: Care deeply about reliability, scalability, and error budgets.
- Scripting Proficiency: Familiarity with Python, Go, or PowerShell.
- Cloud Basics: Exposure to containers (Docker/Kubernetes) and microservices patterns.
- Data Fluency: Read metrics, logs, and traces to tell a story about system health.
Additional Information
- Full‑time, 10‑week temporary internship; non‑benefits eligible
- Compensation: $30 to $34 per hour, based on location
Join us for an IT internship that strengthens your technical abilities, builds your professional confidence, and prepares you for a future in high‑impact SRE roles. Apply today and help shape the technical insights that power AWP Safety.
AWP Safety is an Equal Opportunity Employer (EOE). Women, minorities, veterans, and individuals with disabilities are encouraged to apply. Qualified applicants will receive consideration for employment without regard to their race, color, age, religion, national origin, sex, sexual orientation, gender identity, protected veteran status or disability.