Principal Platform Software Engineer

Oracle · Enterprise · Bengaluru, Karnataka, India

Seeking a Principal Engineer to design and build Agentic AI capabilities for Oracle Cloud Infrastructure's Developer Tools. This role involves creating AI agents to assist developers with code, debugging, automation, and reasoning over cloud systems, utilizing LLMs, RAG, tool use, and orchestration. The position requires strong distributed systems experience, hands-on LLM application development, and leadership in designing safe and reliable AI execution patterns, including evaluation frameworks.

What you'd actually do

Design, build, and operate Agentic AI-powered developer tools for the OCI Developer Tools organization.
Develop AI agents that assist with code authoring, debugging, test generation, build failure analysis, deployment guidance, infrastructure automation, cloud diagnostics, and developer workflow optimization.
Build systems that combine LLMs, retrieval-augmented generation, tool calling, workflow orchestration, code intelligence, structured outputs, and OCI service APIs.
Create agent workflows that can reason across source code, SDKs, APIs, CLI commands, documentation, build logs, telemetry, repositories, deployment artifacts, and cloud resource metadata.
Design safe and reliable agent execution patterns, including human-in-the-loop approval, guardrails, access control, audit logging, tool-use constraints, error recovery, and policy-aware automation.

Skills

Required

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Artificial Intelligence, Machine Learning, or a related technical field, with 10+ years of experience.
Strong professional experience designing and building large-scale distributed systems, developer platforms, cloud services, or enterprise software products.
Hands-on experience building applications using large language models, including prompt design, structured outputs, function calling, tool use, retrieval-augmented generation, or AI workflow orchestration.
Practical understanding of Agentic AI patterns, including planning, reasoning loops, task decomposition, tool invocation, memory, context management, agent state, and autonomous or semi-autonomous execution.
Strong programming experience in one or more languages such as Java, Python, Go, or similar.
Experience building developer-facing tools such as CLIs, SDKs, APIs, IDE extensions, build systems, CI/CD platforms, testing frameworks, observability tools, infrastructure-as-code tooling, or cloud development platforms.
Strong understanding of modern software development workflows, including source control, code review, testing, build automation, deployment pipelines, release management, and production operations.
Experience with cloud-native architecture, including microservices, APIs, containers, distributed systems, asynchronous workflows, authentication, authorization, and service observability.
Familiarity with AI/ML infrastructure components such as embedding models, vector databases, model serving, model evaluation, telemetry, and experimentation frameworks.
Ability to reason about risks in AI-powered developer tools, including incorrect code generation, hallucinated APIs, prompt injection, unsafe tool execution, data leakage, permission misuse, and unreliable automation.
Demonstrated ability to lead complex technical projects independently, influence architecture across teams, and deliver high-quality production systems.
Strong written and verbal communication skills, with the ability to explain complex technical decisions to engineering, product, and leadership audiences.

Nice to have

Experience building AI coding assistants, developer copilots, autonomous debugging agents, test generation systems, build failure analyzers, cloud troubleshooting agents, or AI-powered DevOps tools.
Experience with agent frameworks or orchestration technologies such as LangChain, LangGraph, CrewAI, or custom agent runtimes.
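Experience with commercial or open-source LLM ecosystems such as OpenAI, Anthropic, Google Gemini, Cohere, Meta Llama, Mistral, or enterprise-hosted models.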

What the JD emphasized

Agentic AI capabilities
design and build
large-scale distributed systems
developer platforms
cloud services
enterprise software products
large language models
prompt design
structured outputs
function calling
tool use
retrieval-augmented generation
AI workflow orchestration
Agentic AI patterns
planning
reasoning loops
task decomposition
tool invocation
memory
context management
agent state
autonomous or semi-autonomous execution
developer-facing tools
source control
code review
testing
build automation
deployment pipelines
release management
production operations
cloud-native architecture
microservices
APIs
containers
distributed systems
asynchronous workflows
authentication
authorization
service observability
embedding models
vector databases
model serving
model evaluation
telemetry
experimentation frameworks
risks in AI-powered developer tools
incorrect code generation
hallucinated APIs
prompt injection
unsafe tool execution
data leakage
permission misuse
unreliable automation
lead complex technical projects independently
influence architecture across teams
deliver high-quality production systems
AI coding assistants
developer copilots
autonomous debugging agents
test generation systems
build failure analyzers
cloud troubleshooting agents
AI-powered DevOps tools
agent frameworks
orchestration technologies

Other signals

Agentic AI capabilities
LLMs
RAG
tool use
workflow orchestration
enterprise-grade safety controls
evaluation frameworks

Read full job description

Oracle Cloud Infrastructure is seeking a **Principal Engineer** to join the OCI Developer Tools Team and help shape the next generation of AI-powered developer experiences.

The OCI Developer Tools organization builds products and platforms that improve how developers interact with OCI across the software development lifecycle. This includes tools and experiences for cloud application development, command-line workflows, SDKs, APIs, IDE integrations, CI/CD, infrastructure automation, diagnostics, documentation discovery, and operational troubleshooting.

In this role, you will design and build Agentic AI capabilities that help developers understand complex cloud systems, generate and improve code, troubleshoot failures, automate repetitive workflows, reason over logs and documentation, interact with OCI services, and safely execute multi-step development tasks. These systems will combine large language models, retrieval-augmented generation, tool use, workflow orchestration, code intelligence, cloud APIs, and enterprise-grade safety controls.

This is a senior individual contributor role for an engineer who can operate independently in ambiguous technical spaces, define architecture, influence product direction, mentor engineers, and deliver high-impact capabilities for OCI customers and internal engineering teams.

Key Responsibilities

Design, build, and operate Agentic AI-powered developer tools for the OCI Developer Tools organization.
Develop AI agents that assist with code authoring, debugging, test generation, build failure analysis, deployment guidance, infrastructure automation, cloud diagnostics, and developer workflow optimization.
Build systems that combine LLMs, retrieval-augmented generation, tool calling, workflow orchestration, code intelligence, structured outputs, and OCI service APIs.
Create agent workflows that can reason across source code, SDKs, APIs, CLI commands, documentation, build logs, telemetry, repositories, deployment artifacts, and cloud resource metadata.
Design safe and reliable agent execution patterns, including human-in-the-loop approval, guardrails, access control, audit logging, tool-use constraints, error recovery, and policy-aware automation.
Partner with product managers, UX designers, developer relations, cloud service teams, security, and infrastructure teams to translate developer pain points into scalable AI product capabilities.
Build evaluation frameworks for developer-facing AI systems, including task success, code correctness, grounding quality, tool-call accuracy, hallucination detection, latency, cost, safety, and regression metrics.
Contribute to platform architecture for AI-assisted development, including agent runtimes, context management, prompt orchestration, model routing, evaluation pipelines, telemetry, and feedback loops.
Ensure AI-powered developer tools meet OCI standards for security, privacy, reliability, compliance, operational readiness, scalability, and enterprise-grade quality.
Provide technical leadership through design documents, architecture reviews, code reviews, mentoring, prototyping, and cross-team technical alignment.
Stay current with advances in Agentic AI, LLM application design, AI coding assistants, developer productivity tools, cloud-native development, and responsible AI, and apply them pragmatically to OCI products.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Artificial Intelligence, Machine Learning, or a related technical field, with 10+ years of experience.
Strong professional experience designing and building large-scale distributed systems, developer platforms, cloud services, or enterprise software products.
Hands-on experience building applications using large language models, including prompt design, structured outputs, function calling, tool use, retrieval-augmented generation, or AI workflow orchestration.
Practical understanding of Agentic AI patterns, including planning, reasoning loops, task decomposition, tool invocation, memory, context management, agent state, and autonomous or semi-autonomous execution.
Strong programming experience in one or more languages such as Java, Python, Go, or similar.
Experience building developer-facing tools such as CLIs, SDKs, APIs, IDE extensions, build systems, CI/CD platforms, testing frameworks, observability tools, infrastructure-as-code tooling, or cloud development platforms.
Strong understanding of modern software development workflows, including source control, code review, testing, build automation, deployment pipelines, release management, and production operations.
Experience with cloud-native architecture, including microservices, APIs, containers, distributed systems, asynchronous workflows, authentication, authorization, and service observability.
Familiarity with AI/ML infrastructure components such as embedding models, vector databases, model serving, model evaluation, telemetry, and experimentation frameworks.
Ability to reason about risks in AI-powered developer tools, including incorrect code generation, hallucinated APIs, prompt injection, unsafe tool execution, data leakage, permission misuse, and unreliable automation.
Demonstrated ability to lead complex technical projects independently, influence architecture across teams, and deliver high-quality production systems.
Strong written and verbal communication skills, with the ability to explain complex technical decisions to engineering, product, and leadership audiences.

Preferred Qualifications

Experience building AI coding assistants, developer copilots, autonomous debugging agents, test generation systems, build failure analyzers, cloud troubleshooting agents, or AI-powered DevOps tools.
Experience with agent frameworks or orchestration technologies such as LangChain, LangGraph, CrewAI, or custom agent runtimes.
Experience with commercial or open-source LLM ecosystems such as OpenAI, Anthropic, Google Gemini, Cohere, Meta Llama, Mistral, or enterprise-hosted models.
Experience designing systems that reason over large codebases, dependency graphs, APIs, SDKs, cloud service documentation, build artifacts, logs, metrics, traces, and runtime telemetry.
Deep understanding of developer experience, developer productivity, software engineering workflows, and cloud application development.
Experience with OCI, especially in areas such as developer tools, DevOps, cloud automation, identity, observability, infrastructure provisioning, networking, compute, storage, or managed AI services.
Experience with Kubernetes, containers, Terraform, CI/CD systems, workflow engines, message queues, distributed job execution, or service deployment platforms.
Knowledge of secure software supply chain practices, including artifact integrity, dependency scanning, secrets handling, policy enforcement, provenance, and deployment governance.
Experience building RAG systems with semantic retrieval, hybrid search, reranking, chunking strategies, grounding validation, citation-aware responses, and access-controlled retrieval.
Experience evaluating LLM and agentic systems using golden datasets, synthetic test generation, human review, automated scoring, red teaming, online experimentation, and regression testing.
Experience optimizing AI-powered systems for latency, throughput, reliability, token efficiency, cost, model selection, and service availability.
Track record of technical leadership through architecture ownership, patents, publications, open-source contributions, platform delivery, or high-impact developer tooling initiatives.
Experience mentoring engineers and raising the engineering bar across a team, platform, or organization.

Career Level - IC4
