Calling all innovators – find your future at Fiserv.
We’re Fiserv, a global leader in Fintech and payments, and we move money and information in a way that moves the world. We connect financial institutions, corporations, merchants, and consumers to one another millions of times a day – quickly, reliably, and securely. Any time you swipe your credit card, pay through a mobile app, or withdraw money from the bank, we’re involved. If you want to make an impact on a global scale, come make a difference at Fiserv.
Job Title
Tech Lead, Infrastructure Engineering
Cloud Site Reliability Engineer (SRE) – L2
We are looking for a Cloud Site Reliability Engineer (SRE) – L2 to drive reliability, scalability, and performance across cloud-based infrastructure in a distributed, dynamic environment. This role blends software engineering and advanced systems operations to improve availability, reduce operational toil through automation, and strengthen observability and incident response across AWS, Azure, and GCP.
What does a successful Cloud Site Reliability Engineer (SRE) – L2 do at Fiserv?
You will design and operate reliable cloud platforms and services by applying SRE principles, automation-first practices, and strong operational discipline. As a hands-on engineer, you will partner closely with Engineering, Architecture, DevOps, and Security to improve uptime, accelerate detection and recovery, and continuously harden systems through capacity planning, monitoring, and blameless post-incident learning.
What you will do:
Design and maintain fault-tolerant, highly available architectures across AWS, Azure, and GCP, including redundancy, load balancing, and automated failover.
Deploy, manage, and optimize cloud resources using Infrastructure as Code (IaC) tools such as Terraform and Ansible.
Implement and improve monitoring, alerting, and logging using tools such as Splunk, Azure Monitor, Dynatrace, AWS CloudWatch, or similar.
Lead incident response for service outages and degraded performance, including real-time triage, mitigation, root cause analysis, and post-incident reviews.
Drive capacity planning and scaling improvements by forecasting demand, optimizing utilization, and enforcing autoscaling and performance best practices.
Build automation and internal tooling to reduce manual toil and improve operational consistency using Python, PowerShell, Bash, or similar.
Collaborate with Security teams to implement secure infrastructure practices including encryption, role-based access control, auditing, and vulnerability management.
Support cloud data migrations using relevant cloud migration tools and best practices.
Work across engineering and DevOps teams to promote reliability best practices and contribute to mentoring and a blameless culture.
What you will need to have:
Hands-on programming/scripting experience in Python, PowerShell, Bash, or equivalent for automation and operational tooling.
Strong experience with one or more cloud platforms (AWS, Azure, or GCP) and core services including VPC/VNet, IAM, serverless patterns, and managed Kubernetes services.
Experience with containers and orchestration technologies including Docker and Kubernetes.
Proficiency in Infrastructure as Code (IaC) using Terraform and/or Ansible.
Experience with observability stacks and operational monitoring using Splunk, Azure Monitor, Dynatrace, AWS CloudWatch, or similar.
Practical experience using cloud data migration tools.
Advanced knowledge of Windows and Linux/Unix environments, including system administration and networking fundamentals.
Proven incident response capability with strong troubleshooting skills under pressure.
Strong collaboration and communication skills, with the ability to coordinate across teams and document clear outcomes.
What would be great to have:
Cloud certifications such as AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer, or Azure DevOps Engineer.
Experience with chaos engineering or resilience testing frameworks.
Experience supporting multicloud and/or hybrid cloud deployments.
Familiarity with SLOs, SLIs, and error budgets, and how to apply them to improve service reliability.
Experience gathering operational feedback and using it to drive improvements with Azure services.
How you'll work:
This role is on-site Monday through Friday. Fiserv considers in-person collaboration to be an essential part of this role, as in-person office experiences support your overall onboarding and lead to stronger productivity.
Thank you for considering employment with Fiserv. Please:
- Apply using your legal name.
- Complete the step-by-step profile and attach your resume (either is acceptable, both are preferable).
Our commitment to Diversity and Inclusion:
Fiserv is proud to be an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, gender, gender identity, sexual orientation, age, disability, protected veteran status, or any other category protected by law.
Note to agencies:
Fiserv does not accept resume submissions from agencies outside of existing agreements. Please do not send resumes to Fiserv associates. Fiserv is not responsible for any fees associated with unsolicited resume submissions.
Warning about fake job posts:
Please be aware of fraudulent job postings that are not affiliated with Fiserv. Fraudulent job postings may be used by cyber criminals to target your personally identifiable information and/or to steal money or financial information. Any communications from a Fiserv representative will come from a legitimate Fiserv email address.