What you'd actually do

Architect and deploy large-scale GPU/HPC infrastructure on OCI using tools like Terraform, Ansible, Slurm, and Kubernetes.

Build automated solutions for cluster provisioning, software deployment, and infrastructure as code.

Collaborate with Oracle’s largest enterprise customers to define and tailor solutions that meet high-performance compute and AI requirements.

Support LLM-based solutions, agentic AI systems, and robotic AI platforms from design through deployment.

Act as a trusted technical advisor, guiding customers on best practices, cloud migration strategies, and deployment patterns.

Skills

Required

HPC infrastructure
GPU infrastructure
AI platform engineering
Scripting and automation (Python, Bash, PowerShell)
Terraform
Ansible
Kubernetes
Cluster managers (Slurm, PBS, Bright)
Container orchestration
RDMA
InfiniBand
MPI
Distributed file systems
Cloud Native experience
AI/ML platforms
Large language models (LLMs)
Inference serving stacks
Pre-sales technical consulting
Solution architecture
Communication and presentation skills

Nice to have

Oracle Cloud Infrastructure (OCI) or similar cloud platforms
Bachelor’s or Master’s degree in Computer Science, Engineering, Mathematics, or related field
Thought leadership through publications, speaking engagements, or community contributions

What the JD emphasized

deep expertise in HPC, GPU infrastructure, and AI platform engineering

design and deploy large-scale accelerated computing solutions

lead customer engagements

drive adoption of cutting-edge AI workloads

architect and deploy complex HPC and GPU clusters, AI platforms, and intelligent agentic solutions

pre-sales technical consulting, solution engineering, and AI transformation strategy

deep technical skills

consultative approach

develop scalable AI architectures

large-scale GPU/HPC infrastructure

AI platforms

intelligent agentic solutions

LLM-based solutions, agentic AI systems, and robotic AI platforms

trusted technical advisor

best practices, cloud migration strategies, and deployment patterns

technical gaps

influence product roadmaps

key AI Partners

Hands-on expertise with GPU and HPC architecture

Proficiency in scripting and automation

Experience with cluster managers

Knowledge of RDMA, InfiniBand, MPI, and distributed file systems

Core Cloud Native experience

Familiarity with AI/ML platforms, large language models (LLMs), and inference serving stacks

5+ years in pre-sales, technical consulting, or customer-facing solution architecture

Strong communication and presentation skills

deliver innovative cloud solutions

translate complex technical capabilities into business-aligned strategies

design, deployment, and support of large-scale AI, GPU, and HPC infrastructure solutions

partners closely with customers throughout the entire engagement lifecycle

solution architecture and Proof of Concept (POC) through production deployment, optimization, and ongoing operational support

technical leadership

developing reusable assets, automation, reference architectures, and technical enablement content

Other signals

design and deploy large-scale accelerated computing solutions

drive adoption of cutting-edge AI workloads on Oracle Cloud Infrastructure (OCI)

architect and deploy complex HPC and GPU clusters, AI platforms, and intelligent agentic solutions

support LLM-based solutions, agentic AI systems, and robotic AI platforms from design through deployment

familiarity with AI/ML platforms, large language models (LLMs), and inference serving stacks

Principal Software Development Engineer

**Oracle** Tokyo, Japan (Hybrid)

We are looking for a hands-on Principal Core Software Development Engineer with deep expertise in HPC, GPU infrastructure, and AI platform engineering to join our growing team. In this role, you will design and deploy large-scale accelerated computing solutions, lead customer engagements, and drive adoption of cutting-edge AI workloads on Oracle Cloud Infrastructure (OCI). This is an exceptional opportunity for someone with strong technical acumen, customer focus, and a passion for cloud-native innovation.

As a Principal Core Software Development Engineer, you will be at the forefront of designing and implementing next-generation accelerated computing and AI solutions on Oracle Cloud Infrastructure (OCI). You will engage directly with customers ranging from startups to strategic accounts, helping them architect and deploy complex HPC and GPU clusters, AI platforms, and intelligent agentic solutions across POC and production environments. You will play a pivotal role in pre-sales technical consulting, solution engineering, and AI transformation strategy.

This is a highly visible and influential role. It combines deep technical skills with a consultative approach to support customers ranging from emerging AI startups to Fortune 500 companies, develop scalable AI architectures, and contribute to Oracle’s strategic vision for cloud and AI adoption.

Key Responsibilities

Architect and deploy large-scale GPU/HPC infrastructure on OCI using tools like Terraform, Ansible, Slurm, and Kubernetes.
Build automated solutions for cluster provisioning, software deployment, and infrastructure as code.
Collaborate with Oracle’s largest enterprise customers to define and tailor solutions that meet high-performance compute and AI requirements.
Support LLM-based solutions, agentic AI systems, and robotic AI platforms from design through deployment.
Act as a trusted technical advisor, guiding customers on best practices, cloud migration strategies, and deployment patterns.
Conduct customer training, workshops, and technical deep dives to enable successful cloud adoption.
Collaborate cross-functionally with product, support, and engineering to close technical gaps and influence product roadmaps.
Develop and share technical assets including competitive differentiators, code samples, demos, blogs, and white papers.
Identify and work with key AI partners to support customer requirements from design to deployment.

Required Technical Skills

Hands-on expertise with GPU and HPC architecture in cloud and on-prem environments.
Proficiency in scripting and automation: Python, Bash, PowerShell, Terraform, Ansible.
Experience with cluster managers (Slurm, PBS, Bright), Kubernetes, and container orchestration.
Knowledge of RDMA, InfiniBand, MPI, and distributed file systems.
Core cloud-native experience.
Familiarity with AI/ML platforms, large language models (LLMs), and inference serving stacks.

Business & Leadership Skills

5+ years in pre-sales, technical consulting, or customer-facing solution architecture.
Strong communication and presentation skills for both technical and executive audiences.
Passion for working with top-tier customers and partners to deliver innovative cloud solutions.
Ability to translate complex technical capabilities into business-aligned strategies.

Preferred Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, Mathematics, or related field.
Demonstrated thought leadership through publications, speaking engagements, or community contributions.
Experience working with Oracle Cloud Infrastructure (OCI) or similar cloud platforms.

The Principal Core Software Development Engineer is responsible for leading the design, deployment, and support of large-scale AI, GPU, and HPC infrastructure solutions on Oracle Cloud Infrastructure (OCI). The role partners closely with customers throughout the entire engagement lifecycle, from solution architecture and Proof of Concept (POC) through production deployment, optimization, and ongoing operational support. As a trusted technical advisor, the engineer provides guidance on cloud-native architectures, Kubernetes, Slurm, AI platforms, automation, and best practices while working closely with Product Management, Engineering, Support, Sales, and partners to deliver successful customer outcomes. In addition, the role contributes to Oracle's technical leadership by developing reusable assets, automation, reference architectures, and technical enablement content that accelerate customer adoption and strengthen Oracle's position in AI and cloud infrastructure.

Career Level - IC4