Lead reliability assessments and improvement initiatives across electrical, mechanical, and controls-related infrastructure.
Review facility acceptance and commissioning activities to ensure systems are delivered in a reliable, supportable, and maintainable condition.
Design and improve maintenance strategies that enhance uptime, reduce maintenance complexity, and optimize lifecycle performance.
Lead root cause analysis and corrective action development for significant or recurring reliability issues.
Analyze asset utilization, maintenance history, performance trends, and failure patterns to drive data-backed improvements.
Develop risk mitigation plans for operational vulnerabilities, latent failure modes, obsolescence risks, and supportability gaps.
Provide senior technical support and guidance to Site Operations during critical failures, abnormal events, and high-consequence troubleshooting.
Influence design, construction, and turnover decisions by providing reliability and maintainability feedback on new and modified systems.
Support development of standard methodologies for lifecycle planning, spare parts analysis, RAM analysis, and site response procedures.
Mentor less experienced reliability engineers and help raise the technical bar across the function.
Ideal Candidate Profile
6–10 years of experience in critical facilities, reliability engineering, commissioning, maintenance engineering, or uptime-critical infrastructure environments.
Broad familiarity with mission-critical electrical, mechanical, and controls systems and how they interact operationally.
Experience leading RCAs, maintenance strategy improvements, and technical support in live facility environments.
Bachelor’s degree in engineering or a related field preferred; equivalent industry experience also valued.
Preferred Skills / Certifications
Experience with commissioning reviews, reliability analytics, lifecycle cost analysis, or maintenance strategy design.
Familiarity with BMS/EPMS/controls data, asset health modeling, or fleet reliability practices.
Advanced knowledge of critical environment operations or design is a plus.
CMRP, CRE, or related reliability certification is a plus.
Internal Responsibilities
Skills and Competencies
Strong technical judgment and systems thinking.
Strong data analysis and structured problem-solving capability.
Ability to influence cross-functional teams without formal authority.
Strong communication with technical and operational stakeholders.
Ability to translate operational issues into scalable reliability improvements.
Physical Demands / Work Environment
This role supports mission-critical data center environments and may require regular travel to active sites, turnover events, and operational reviews. You must be able to walk sites, climb stairs, and work safely in industrial environments, with or without reasonable accommodation. Expect travel of roughly 25% to 50%, depending on specialty and business need.