Principal Software Engineer, Agent Policy Fabric

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

This role focuses on building the core platform for governed agent action, specifically the Agent Policy Fabric (APF). Responsibilities include developing services for runtime policy verification, signed policy bundle verification, trust-root handling, and authorization APIs. The role also involves designing policy projection into runtime policies, building conformance and verification suites, and collaborating with other engineering teams on runtime integration surfaces. The goal is to mature the APF into a robust core platform for enterprise agent governance.

What you'd actually do

  1. Own APF Core Services: Build and harden the Runtime Policy Verifier, signed policy bundle verification, trust-root handling, freshness, rollback protection, subject binding to attested runtime context, revocation checks, and authorization APIs used by APF-compatible enforcement points.
  2. Design Policy Projection: Implement deterministic projections from the canonical APF policy into OpenShell-native runtime policy, adapter constraints, credential constraints, audit requirements, and model-visible tool hints, while preserving the atomic projection-admission contract.
  3. Build Conformance and Verification: Create golden fixtures, compatibility tests, negative tests, fuzz/property tests, and conformance suites that prove APF-compatible runtimes and adapters honor the same contract.
  4. Collaborate with Runtime Owners: Work with OpenShell and Infrastructure engineers on public runtime interfaces for projection consumption, runtime context attestation, approved adapter paths, direct egress verification, and admission/rejection semantics.
  5. Land Runtime Integration Surfaces: Own the cross-team work with OpenShell and other runtime owners to land the public substrate interfaces that APF composes against (runtime-context attestation, approved adapter path declaration, projection acceptance and rejection semantics, quarantine, and stop-session hooks). Land each as a public RFC or PR.

Skills

Required

  • Rust, Go, C++, or Python
  • Designing production services, APIs, schemas, policy engines, authorization systems, or signed artifact pipelines
  • Linux systems, IPC or service-to-service APIs, protobuf/gRPC or equivalent wire formats, CI, test automation, release engineering, and cloud or enterprise deployment environments
  • Authorization, cryptographic signatures, trust roots, revocation, subject binding, rollback protection, secure-by-default failure handling, and zero-trust architecture patterns
  • Ability to write concise technical specifications, align multiple engineering owners, defend bounded claims, and turn working-draft architecture into buildable interfaces without over-scoping the runtime

Nice to have

  • OPA/Rego, Cedar, Zanzibar-style authorization, policy compilers, sandbox policy, or runtime enforcement systems
  • Agent frameworks, tool-call governance, sandboxed execution, OpenShell-like runtime substrates, MCP-style tool routing, or credential isolation for agents
  • Sigstore, TUF, in-toto, HSM-backed signing, package provenance, signed configuration, or enterprise trust-root distribution
  • Property testing, model checking, symbolic execution, red-team findings, or bounded verification to constrain security claims
  • Contributing to RFCs in identity, supply-chain, or policy spaces (IETF, OpenID Foundation, FIDO Alliance, CNCF, NIST)

What the JD emphasized

  • policy engines
  • authorization systems
  • authorization
  • agent frameworks
  • tool-call governance
  • credential isolation for agents

Other signals

  • building the foundations for the signed policy
  • Runtime Policy Verifier
  • agentic systems
  • governed agent action