Principal Software Engineer, Security AI

Microsoft · Big Tech · Mountain View, CA +2 · Software Engineering

This Principal Software Engineer role focuses on building AI-powered security systems for Microsoft's cloud environment. It involves designing, building, and operating production AI services that combine large language models (LLMs), agentic workflows, retrieval-augmented generation (RAG), knowledge graphs, and multi-modal signal processing, with a strong emphasis on evaluation, responsible AI, and scalability. The engineer leads the architecture, design, and delivery of these systems and works with engineering, applied science, product, and security operations teams to translate AI advances into practical security solutions.

What you'd actually do

  1. Design, build, and operate AI-powered software services that support security engineering across Microsoft’s cloud environment.
  2. Develop AI-enabled workflows that help engineering and security teams analyze information, retrieve relevant context, summarize findings, and make faster, higher-quality decisions.
  3. Build scalable systems that use large language models, retrieval-augmented generation, embeddings, semantic search, knowledge graphs, and related AI techniques to support security scenarios.
  4. Create evaluation, measurement, and monitoring approaches that help assess AI system quality, reliability, safety, and effectiveness in production environments.
  5. Partner with engineering, applied science, product, security operations, and other teams to translate AI advances into practical, secure, durable and reliable platform capabilities.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Microsoft Cloud Background Check

Nice to have

  • Master’s degree or PhD in Computer Science, Machine Learning, Artificial Intelligence, or related technical field, OR equivalent industry experience.
  • 5+ years of hands-on experience building AI, machine learning, or large language model-enabled systems, including model or agent development, retrieval and knowledge systems, data pipelines, evaluation, safety, experimentation, and productionization in large-scale cloud environments.
  • Experience designing reliable and scalable software systems with strong fundamentals in APIs, service architecture, data modeling, testing, debugging, observability, incident response, and secure software development.
  • Experience building multi-agent systems, tool-use frameworks, orchestration layers, autonomous workflows, or AI copilots in production environments.
  • Experience with vector databases, embeddings, semantic search, knowledge graphs, entity resolution, ranking, summarization, or context-grounding systems.

What the JD emphasized

  • production AI services
  • large language models
  • agentic workflows
  • retrieval-augmented generation
  • knowledge graphs
  • multi-modal signal processing
  • rigorous evaluation frameworks
  • production AI or ML systems
  • large-scale cloud services
  • reliable, measurable, and trustworthy AI-driven solutions
  • AI-powered systems
  • AI system quality, reliability, safety, and effectiveness
  • AI advances into practical, secure, durable and reliable platform capabilities
  • responsible AI, privacy, security, and compliance
  • production readiness
  • architecture, APIs, reliability, scalability, observability, cost efficiency, incident response, and continuous improvement
  • AI capabilities, system reliability, and platform impact
  • building AI, machine learning, or large language model-enabled systems
  • model or agent development
  • retrieval and knowledge systems
  • data pipelines
  • evaluation
  • safety
  • experimentation
  • productionization in large-scale cloud environments
  • reliable and scalable software systems
  • APIs, service architecture, data modeling, testing, debugging, observability, incident response, and secure software development
  • multi-agent systems
  • tool-use frameworks
  • orchestration layers
  • autonomous workflows
  • AI copilots in production environments
  • vector databases
  • embeddings
  • semantic search
  • entity resolution
  • ranking
  • summarization
  • context-grounding systems

Other signals

  • AI-powered security systems