The Incident Manager is responsible for overseeing the end‑to‑end response to high‑priority incidents that impact business‑critical applications, systems, or services. This role ensures timely resolution, clear communication, and effective coordination across technical and business teams to minimize downtime and restore normal operations as quickly as possible.
What you will do:
• Lead and coordinate the response to major incidents, ensuring rapid triage, investigation, and service restoration.
• Serve as the primary point of contact during incidents, facilitating communication across engineering, operations, leadership, and business stakeholders.
• Drive incident timelines, escalation procedures, bridge calls, and follow‑the‑sun handoffs.
• Document incident details, actions taken, and resolution steps in real time.
• Perform root cause analysis (RCA) and partner with technical teams to identify corrective and preventive actions.
• Track post‑incident remediation work and ensure accountability for long‑term fixes.
• Establish and enforce incident response processes, SLAs, and communication standards.
• Prepare incident reports, executive summaries, and customer‑facing updates when required.
• Monitor system health dashboards and alerts to proactively address emerging issues.
• Contribute to the continuous improvement of incident management frameworks, playbooks, and operational readiness.
What you will need to have:
• Experience managing high‑severity incidents in production or enterprise environments.
• Strong understanding of application, infrastructure, and operations concepts.
• Excellent communication skills with the ability to convey complex technical information to non‑technical audiences.
• Ability to remain calm, structured, and decisive in high‑pressure situations.
• Strong analytical and problem‑solving skills.
• Familiarity with incident management tools, ticketing systems, and monitoring platforms (e.g., ServiceNow, Jira, PagerDuty).
• Experience coordinating cross‑functional technical teams during outages or critical events.
• Proficiency in Japanese, both spoken and written.
What would be great to have:
• Background in IT operations, application support, SRE, or infrastructure engineering.
• Knowledge of ITIL processes, particularly Incident, Problem, and Change Management.
• Experience with on‑call scheduling, crisis management, or operational leadership.
• Exposure to cloud environments (AWS, Azure, GCP) and modern observability tooling.
• Strong documentation and process‑improvement skills.