In this role within the DC Software and Automation organization, you will help design, build, and operate software that powers the full data center lifecycle: planning, design, build, and operations. We are rapidly scaling our data center footprint to power next-generation AI infrastructure, and the software and automation you deliver will accelerate the delivery of AI-capable data centers and simplify day-to-day operations at global scale.
This position blends architecture and hands-on delivery. You will lead end-to-end execution for major features, guide technical direction across teams, and raise the bar on reliability, operability, and engineering practices. This is not only about shipping functionality. It is also about improving how we build and run systems. You will influence standards for code quality, testing, deployment automation, and service operations, partnering closely with engineering peers, product management, and leadership to deliver durable outcomes aligned to our mission: Accelerate builds and democratize operations.
Internal Responsibilities
As a Consulting Member of Technical Staff (IC5) on the DC Software and Automation team, you will provide technical leadership for software and services that enable data center delivery and operations at scale across OCI Data Centers. You will shape system architecture and drive execution for high-impact initiatives that improve scalability, resiliency, and developer and operator productivity.
You will:
- Architect, design, and operate distributed services supporting data center lifecycle workflows, including planning, design, build, and operations. This includes defining APIs, data models, scalability strategies, and failure-mode behavior.
- Own feature delivery from design through production rollout, including implementation, testing strategy, deployment planning, and operational readiness.
- Operate in a builder-operator model. You will build, deploy, and participate in on-call rotations. You will drive service reliability improvements based on incidents, retrospectives, and operational data.
- Lead technical reviews, including architecture and functional design reviews. You will write and maintain clear documentation, and you will ensure designs are secure by default and supportable for long-term operations.
- Improve engineering velocity and quality by strengthening code review practices, increasing automated test coverage, and enhancing continuous integration and continuous delivery, test automation, and deployment automation.
- Partner with Product Managers and leadership to translate requirements into roadmaps and execution plans. You will communicate trade-offs, risks, and milestones clearly.
- Mentor engineers across levels. You will set expectations for operational excellence, observability, and proactive service ownership, and you will raise engineering standards across multiple teams.
- Reduce operational toil by investing in automation, self-service capabilities, and observability such as metrics, logs, and tracing. You will improve incident response and runbooks.
- Collaborate across dependent teams such as networking, security, identity, observability, storage, and release engineering to deliver integrated solutions for data center delivery and operations tooling.
- Contribute to longer-term technical direction for data center software platforms by identifying opportunities to modernize systems, standardize interfaces, and improve the end-to-end lifecycle experience for builders and operators.
Preferred Qualifications
- A BS, MS, or PhD in Computer Science or Computer Engineering, or equivalent practical experience.
- 10+ years of full-stack software development, including design, implementation, and operation of distributed services.
- Demonstrated experience owning production services end-to-end in a builder-operator model, including design, build, deploy, on-call, and continuous improvement.
- Strong product development experience in Java, Python, Go, and/or JavaScript.
- Expertise designing REST APIs and building multi-tenant services with clear isolation, quota, and lifecycle management.
- Experience leading architecture and functional reviews, strong written documentation habits, and the ability to partner with Product Managers across the full launch lifecycle.
- Good understanding of databases, NoSQL systems, storage concepts, and distributed persistence technologies.
- Proven technical leadership, including delivering multiple initiatives in parallel, unblocking teams, and driving execution while maintaining high engineering quality.
- Excellent verbal and written communication skills, with the ability to articulate trade-offs, risks, and operational considerations to diverse stakeholders.