About this role:
Wells Fargo is seeking a Lead Site Reliability Engineer in Technology as part of Wealth and Investment Management Technology who thinks systematically about reliability, can translate business requirements into technical implementations, and thrives on making complex systems more robust. Learn more about the career areas and lines of business at wellsfargojobs.com.
The Site Reliability Engineering team is fundamental to ensuring that our platform delivers consistent, reliable service to our client base. This role works at the intersection of software engineering and operations, applying engineering principles to infrastructure challenges. You will design and implement scalable systems, create observability solutions that offer actionable insights, and develop automation to improve our platform's reliability.
In this role, you will:
- Work alongside developers and business stakeholders to automate acceptance criteria
- Maintain high reliability and availability for software applications
- Automate mundane tasks to reduce human error
- Define SLIs (service level indicators) and SLOs (service level objectives) in collaboration with product owners
- Lead incident response efforts and post-mortem analysis to prevent future occurrences
- Write incident root cause analyses that identify the underlying cause of each issue and how to prevent it from happening again
- Document procedures, best practices, and troubleshooting FAQs
- Debug the system and fix production-related issues
- Escalate and follow up on permanent fixes for development-related issues
- Handle complex operational tasks and recommend process and technology changes
- Provide global support, including troubleshooting production-related issues and performing checkouts
Required Qualifications:
- 5+ years of Technology Infrastructure Engineering and Solutions experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 5+ years of Site Reliability Engineering experience or related experience
Desired Qualifications:
- Strong understanding of REST APIs
- Strong working knowledge of troubleshooting tools such as Splunk, AppDynamics, and Elastic APM
- Strong experience with API management tools such as Apigee
- Working knowledge of databases such as MongoDB and Oracle
- Strong foundation in reliability engineering principles and distributed systems behavior
- Experience defining and implementing SLOs/SLIs and using them to drive system improvements
- Demonstrated ability to design and implement observability solutions that provide actionable insights while minimizing alert fatigue
- Understanding of modern observability practices and experience implementing and maintaining monitoring solutions such as Prometheus/Grafana, Splunk, New Relic, CloudWatch, and ELK in the cloud
- Strong incident response skills with experience leading incident retrospectives and driving improvements
- Excellent problem-solving abilities and experience debugging distributed systems
- Track record of successfully automating operations and reducing toil
- Strong communication skills with ability to explain complex technical concepts to diverse audiences
- Ability to work both independently and collaboratively in an energetic and diverse team environment
Job Expectations:
- Ability to work weekends
- Participate in on-call rotations to ensure 24/7 system availability and support
Posting End Date: 24 Apr 2025
*Job posting may come down early due to volume of applicants.
We Value Diversity
At Wells Fargo, we believe in diversity, equity and inclusion in the workplace; accordingly, we welcome applications for employment from all qualified candidates, regardless of race, color, gender, national origin, religion, age, sexual orientation, gender identity, gender expression, genetic information, individuals with disabilities, pregnancy, marital status, status as a protected veteran or any other status protected by applicable law.
Employees support our focus on building strong customer relationships balanced with a strong risk-mitigating and compliance-driven culture, which firmly establishes those disciplines as critical to the success of our customers and company. They are accountable for execution of all applicable risk programs (Credit, Market, Financial Crimes, Operational, Regulatory Compliance), which includes effectively following and adhering to applicable Wells Fargo policies and procedures, appropriately fulfilling risk and compliance obligations, timely and effective escalation and remediation of issues, and making sound risk decisions. There is emphasis on proactive monitoring, governance, risk identification and escalation, as well as making sound risk decisions commensurate with the business unit's risk appetite and all risk and compliance program requirements.
Candidates applying to job openings posted in US: All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other legally protected characteristic.
Candidates applying to job openings posted in Canada: Applications for employment are encouraged from all qualified candidates, including women, persons with disabilities, aboriginal peoples and visible minorities. Accommodation for applicants with disabilities is available upon request in connection with the recruitment process.
Applicants with Disabilities
To request a medical accommodation during the application or interview process, visit Disability Inclusion at Wells Fargo.
Drug and Alcohol Policy
Wells Fargo maintains a drug-free workplace. Please see our Drug and Alcohol Policy to learn more.
Wells Fargo Recruitment and Hiring Requirements:
a. Third-party recordings are prohibited unless authorized by Wells Fargo.
b. Wells Fargo requires you to directly represent your own experiences during the recruiting and hiring process.