This is a unique opportunity to lead a key part of OCI's Observability stack: the Logging system, which is essential to ensuring the performance, availability, and trustworthiness of all Oracle Cloud services. Our mission is to deliver a world-class Integrated Observability and Management platform that seamlessly supports OCI, hybrid, and multi-cloud environments.
Our platform combines Monitoring, Alarming, Logging, Events, Auditing, and SIEM capabilities to give customers and internal teams a unified, actionable view into their infrastructure and applications. This role specifically focuses on the Logging platform, which provides the foundation for real-time log ingestion and querying.
We are looking for a Senior Engineering Manager to lead an exceptionally talented team of software engineers in advancing this critical part of OCI’s platform. You will drive innovation and scale to ensure our Logging systems remain among the most reliable, performant, and intelligent in the modern cloud landscape.
Internal Responsibilities
Career Level - M3
Responsibilities
- Own the design, development, and operation of a high-scale, distributed logging platform that processes petabytes of logs across OCI regions.
- Ensure the reliability, availability, and operational excellence of services responsible for Monitoring, Alarming, and Canary-based health checks, supporting mission-critical infrastructure.
- Provide technical leadership, direction, and strategic vision for a team of senior and principal engineers, fostering a culture of innovation, accountability, and continuous improvement.
- Define and execute a clear, prioritized roadmap of features, platform investments, and operational improvements, delivering on commitments on time and with high quality.
- Collaborate cross-functionally with Product Management, other OCI service teams, and Oracle-wide stakeholders to align goals, manage dependencies, and drive integrated solutions.
- Drive and mature engineering processes, including design reviews, operational readiness reviews, quality standards, and incident postmortems.
- Represent the team in executive-level updates and strategic planning discussions, articulating technical direction, risks, and delivery status.
- Proactively monitor the health and performance of services in the global OCI fleet, identifying trends, mitigating risks, and ensuring fault-tolerant, scalable logging infrastructure.