Join Oracle's Health Data Intelligence (HDI) team as a Principal Software Engineer, where you will design and build the next generation of cloud-native platforms, distributed systems, and intelligent automation solutions that power large-scale healthcare analytics.
This role is ideal for engineers who enjoy solving complex software engineering challenges at scale. You will develop highly available services, reliability platforms, observability systems, automation frameworks, and AI-powered operational tooling that enable mission-critical analytics workloads across Oracle Cloud Infrastructure and multi-cloud environments.
You will partner with product, platform, data, and reliability teams to build scalable software systems that process massive datasets, improve developer productivity, automate operational workflows, and enhance platform resilience.
As Oracle continues investing in AI-native infrastructure, you will help drive the adoption of Generative AI and agent-based technologies to build intelligent operational platforms, self-service infrastructure solutions, and autonomous reliability capabilities.
U.S. citizenship is required for this position, as the successful candidate must obtain and maintain a U.S. government security clearance after hire.
Internal Responsibilities
Required Skills
Software Engineering
Strong software development experience in Python, Java, Go (Golang), or similar languages
Strong hands-on system design experience with the ability to architect and build large-scale distributed systems
Demonstrated expertise writing high-quality, maintainable, testable, and production-grade code
Strong understanding of software architecture, design patterns, and engineering best practices
Experience developing cloud-native applications, microservices, and platform services
Experience leading technical design discussions, architecture reviews, and complex engineering initiatives
Distributed Systems & Platform Engineering
Experience building highly available, fault-tolerant distributed systems at scale
Strong understanding of scalability, concurrency, resiliency, performance optimization, and reliability patterns
Experience developing platform services, shared frameworks, developer tooling, and self-service platforms
Knowledge of event-driven architectures, service-oriented systems, and asynchronous processing patterns
AI-Native Engineering
Hands-on experience building solutions using Generative AI, Agentic AI, Large Language Models (LLMs), and intelligent automation technologies
Experience integrating AI frameworks and platforms such as LangChain, AutoGen, CrewAI, Semantic Kernel, OpenAI, or equivalents
Experience building AI-powered automation for:
Incident investigation and root cause analysis
Operational intelligence and observability
Infrastructure lifecycle management
Engineering productivity and developer experience
Experience designing APIs, services, and platforms that incorporate AI capabilities
Experience building AI-assisted operational tooling, autonomous remediation systems, or intelligent platform services is highly desirable
Cloud & Infrastructure Engineering
Strong experience with OCI, AWS, Azure, or multi-cloud environments
Experience building cloud-native services using Kubernetes, Docker, and other container orchestration platforms
Strong understanding of cloud architecture, networking, security, compliance, and cost optimization
Deep experience with Infrastructure as Code (IaC) using Terraform, Ansible, and related automation frameworks
Experience building infrastructure automation, deployment tooling, and platform engineering solutions
Data Engineering
Experience building data-intensive applications and analytics platforms
Knowledge of ETL pipelines and large-scale data processing frameworks
Familiarity with data warehouse technologies such as Snowflake, Vertica, or equivalent platforms
Understanding of distributed storage systems, columnar databases, and large-scale analytics architectures
Reliability Engineering
Strong understanding of SRE principles and operational excellence practices
Experience implementing observability solutions using Prometheus, Grafana, OpenTelemetry, or similar technologies
Experience analyzing production issues and implementing durable engineering solutions
Knowledge of monitoring, alerting, reliability engineering, performance tuning, and self-healing systems
What You Bring
10+ years of hands-on software engineering experience designing, building, and operating large-scale distributed systems
Proven experience delivering production software in cloud-native environments
Strong track record of leading complex technical initiatives from architecture and design through deployment and operations
Experience building platform services, developer tooling, infrastructure automation frameworks, or large-scale analytics platforms
Core Technical Expertise
Large-scale distributed systems architecture and hands-on system design
Software engineering with strong coding proficiency in Python, Java, and/or Go
Cloud-native application development and microservices architecture
Infrastructure as Code (Terraform, Ansible) and automation engineering
Platform engineering and developer productivity tooling
Large-scale data processing and analytics systems
Performance optimization, scalability, resiliency, and reliability engineering
AI-powered platforms, intelligent automation, and agent-based system development
AI-Native Experience
Experience building AI-powered software products, engineering platforms, or operational tooling
Experience integrating LLMs, agent frameworks, RAG architectures, and intelligent automation systems into production environments
Understanding of emerging AI engineering patterns and practical applications within software engineering, infrastructure, and operations
Technical Skills
Python, Java, Go (Golang)
Terraform, Ansible, Infrastructure as Code (IaC)
Kubernetes, Docker
CI/CD and DevOps platforms
Prometheus, Grafana, OpenTelemetry
Cloud platforms (OCI preferred)
Generative AI, Agentic AI, LLM frameworks, and AI-powered automation platforms