Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Internal Responsibilities
You will provide cloud operations for Oracle National Security Realms. You’ll be part of a dynamic team with a broad knowledge of how Oracle’s cloud platform works. You’ll partner with customer support, service owners, and engineering teams around the globe to ensure high-quality service for customers.
Note: this is not a Monday-to-Friday, core-hours role. It involves working a 24/7 shift rotation with on-call duties, including nights, weekends, and public holidays.
- Serve as an escalation point for junior Service Operations Engineers (SOEs) during complex or high-impact incidents.
- Manage and execute complex manual Change Management tickets, working closely with the service teams to ensure safety and minimal disruption to services.
- Support the onboarding of new services and tools, ensuring they are operationally ready and properly integrated.
- Provide mentorship and training to SOEs, helping build team capability and confidence.
- Create and maintain clear, useful documentation for operational processes and system support.
- Identify areas of manual work and drive automation to reduce toil and improve efficiency.
- Automate tasks to enable continuous delivery and ensure continuous availability with minimal human overhead
- Recognize unsafe or inefficient practices and work with teams to design safer, more effective solutions.
- Complete change requests to enable new functionality and maintain realm compliance
- Ensure timely resolution of incidents, service requests, and change requests
- Collaborate with global service and engineering teams
- Define and drive change management, continuous integration, and deployment best practices
- Help create and maintain real-world production architectures, scalability, and system design
- Use a methodical approach to troubleshoot large, complex, interconnected systems
We also use…
- Linux and Unix operating systems
- Docker, Kubernetes, and Terraform
- Scripting languages such as Shell, Perl, Python, Java, and Go
- Citizenship/location requirements: U.S. citizenship; possess and maintain a TS/SCI w/Poly security clearance; reside in Seattle, WA.
- Technology-related bachelor’s degree and/or equivalent work experience
- A desire to learn and keep up with modern technologies
- Proficiency in writing services and task automation in Python, Bash, Ruby, Perl, JavaScript, or Java
- Familiarity with core protocols (DNS, DHCP, HTTP, TCP)
- Deep knowledge of Linux internals and host-based networking
- Knowledge of Linux and/or Unix operating systems
- Familiarity with configuration management solutions such as Chef, Puppet, etc.
- Experience devising, managing, and extending monitoring solutions for large-scale environments.
- Knowledge of cloud computing concepts
- Experience working in a mission-critical environment (Operations, Technical Support, NOC, etc.)
- Strong communication skills (writing, organization, learning exchange)
- Experience executing tasks under change management procedures
- Experience resolving auto-cut and manual alarms following runbooks
- A focus on customer satisfaction