Oracle Cloud Infrastructure (OCI) is redefining the cloud for the world’s largest enterprises. We operate with the agility and innovation of a startup while delivering the scale, security, and reliability expected from one of the world’s leading technology companies.
OCI powers mission-critical workloads for customers globally, offering a comprehensive cloud platform built for high performance, distributed systems, and enterprise-grade reliability. Our engineering culture is grounded in OCI Values — emphasizing integrity, inclusion, innovation, customer focus, and operational excellence. We invest deeply in our people and foster an environment where diverse perspectives, collaboration, ownership, and continuous learning drive breakthrough results.
At OCI, you’ll work alongside exceptional engineers solving some of the most complex distributed systems challenges at cloud scale.
The OCI Limits Team owns the foundational platform that manages service limits, quotas, and capacity governance across OCI. The team enables customers and internal OCI services to scale reliably and securely by providing automated limit management, quota enforcement, and high-scale control plane integrations. We work closely with service teams across OCI to support rapid cloud growth, operational stability, and enterprise-grade resource governance. The organization operates highly distributed, mission-critical systems that directly impact customer onboarding, expansion, and cloud consumption experiences.
Who We’re Looking For
We are seeking a Principal Software Development Engineer with deep experience in distributed systems, cloud infrastructure, and large-scale service design. You are a hands-on technical leader who has successfully designed and launched major platform features and services into production while operating highly available systems at scale.
You thrive on solving difficult infrastructure challenges and have a strong sense of ownership across the full software lifecycle, from architecture and development to operational excellence and long-term scalability. You are comfortable driving initiatives independently, mentoring engineers, and influencing technical direction across teams.
The ideal candidate combines strong technical depth with pragmatic decision-making, excellent collaboration skills, and a passion for building simple, reliable, and scalable systems.
Responsibilities
In this role, you will:
- Design, build, and operate highly scalable distributed services for the OCI Limits platform.
- Lead architecture and technical design for major features, services, and platform initiatives.
- Drive end-to-end execution from design and development through deployment and operational support.
- Partner with OCI service teams to deliver foundational cloud governance and quota management capabilities.
- Improve service scalability, resiliency, observability, and operational excellence across the platform.
- Write high-quality, maintainable, and performant production code.
- Lead technical reviews and architecture discussions, and promote engineering best practices across teams.
- Mentor engineers and provide technical leadership in system design, troubleshooting, and operational readiness.
- Drive automation for testing, deployment, monitoring, and incident response workflows.
- Collaborate closely with product managers, architects, and engineering leadership to define roadmap priorities and deliver customer-focused solutions.
- Participate in on-call rotations and help resolve complex production issues across distributed systems environments.
- Proactively identify reliability risks, performance bottlenecks, and operational inefficiencies before they impact customers.
This team is seeking U.S.-based candidates who can work onsite in Nashville, TN (priority location) or Austin, TX (secondary location). Relocation assistance is provided. This is not a remote position.
Minimum Qualifications
- BS or MS in Computer Science or equivalent experience.
- 10+ years of experience designing, building, and operating large-scale distributed systems and cloud services.
- Experience developing and operating services on public cloud platforms such as OCI, AWS, Azure, or GCP.
- Strong programming experience in Java, Go, Python, C++, or similar modern programming languages.
- Deep understanding of distributed systems fundamentals, scalability, fault tolerance, and service-oriented architectures.
- Hands-on experience building and operating highly available cloud-native services in production environments.
- Strong understanding of REST API design and multi-tenant service architectures.
- Experience with databases, NoSQL systems, storage technologies, and distributed persistence systems.
- Familiarity with networking fundamentals including TCP/IP, HTTP, and standard cloud network architectures.
- Experience with observability, monitoring, debugging, and performance tuning in large-scale systems.
- Experience driving technical design reviews, architecture discussions, and cross-team engineering initiatives.
- Strong written and verbal communication skills with the ability to influence technical direction across organizations.
Preferred Qualifications
- Experience building infrastructure control plane services, quota management systems, or cloud governance platforms.
- Familiarity with Infrastructure as Code tools such as Terraform, CloudFormation, or similar technologies.
- Experience with compliance-aware distributed systems operating across multiple geographic regions.
- Experience improving developer productivity through automation, tooling, and operational process improvements.
- Proven ability to lead complex technical initiatives across multiple teams and organizations.