Cyber AI Data Engineer Senior Consultant

This role focuses on building and operating governed data foundations for cyber risk, compliance evidence, and agentic AI-enabled cyber workflows. The engineer will design production-grade pipelines and services for risk reporting, controls monitoring, and AI-assisted security operations, with a strong emphasis on governance, lineage, privacy, and auditability in regulated enterprise environments. Key responsibilities include building data pipelines, designing data models for risk and controls, implementing governance controls, developing AI-enabled capabilities using agentic patterns (RAG, tool calling, orchestration), engineering secure integrations, and partnering with stakeholders.

What you'd actually do

  1. Building scalable batch and stream processing pipelines that ingest security telemetry, control evidence, and compliance artifacts into governed data stores (lakehouse/warehouse); see the first sketch after this list.
  2. Designing data models for risk and controls domains (KRIs, issues/defects, risk acceptance, control testing outcomes, audit evidence, policy exceptions) and enabling self-service analytics and dashboards.
  3. Implementing data quality checks, lineage, metadata, and access controls to support auditability, regulatory defensibility, and repeatable evidence generation (also illustrated in the first sketch below).
  4. Developing AI-enabled capabilities that accelerate GRC and cyber operations, such as evidence summarization, control-testing assistance, policy Q&A, investigation copilots, ticket triage, and exception reasoning, using agentic patterns including tool/function calling, workflow orchestration, and Retrieval-Augmented Generation (RAG); see the second sketch after this list.
  5. Engineering secure integrations between data platforms, GRC workflows, and enterprise systems (APIs, event patterns, connectors), with observability and runbooks for production support.
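
To give a concrete flavor of items 1 and 3, here is a minimal, self-contained Python sketch of a governed batch ingestion step. Everything in it is hypothetical (the EvidenceRecord model, the quality_check rules, the ALLOWED_CLASSIFICATIONS set, the demo artifact paths); a production pipeline would run on an engine such as Spark or Flink and write to a governed lakehouse and a real audit sink rather than stdout.

```python
"""Sketch: validate a raw evidence batch, quarantine failures, and emit
audit-log lines carrying a content-hash lineage identifier."""
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical classification scheme; a real one comes from governance policy.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}


@dataclass
class EvidenceRecord:
    control_id: str
    artifact: str        # pointer to the evidence artifact
    classification: str  # data classification label
    collected_at: str    # ISO-8601 timestamp from the source system


def quality_check(raw: dict) -> list[str]:
    """Return the list of data-quality failures for one raw record."""
    failures = []
    for key in ("control_id", "artifact", "classification", "collected_at"):
        if not raw.get(key):
            failures.append(f"missing_field:{key}")
    if raw.get("classification") not in ALLOWED_CLASSIFICATIONS:
        failures.append("invalid_classification")
    return failures


def ingest(batch: list[dict]) -> tuple[list[EvidenceRecord], list[dict]]:
    """Split a raw batch into accepted records and a quarantine list."""
    accepted, quarantined = [], []
    for raw in batch:
        failures = quality_check(raw)
        audit = {
            "event": "ingest",
            "at": datetime.now(timezone.utc).isoformat(),
            # A content hash gives a stable lineage identifier for the record.
            "lineage_id": hashlib.sha256(
                json.dumps(raw, sort_keys=True).encode()
            ).hexdigest()[:16],
            "status": "quarantined" if failures else "accepted",
            "failures": failures,
        }
        print(json.dumps(audit))  # stand-in for a real audit-log sink
        if failures:
            quarantined.append(raw)
        else:
            accepted.append(EvidenceRecord(**raw))
    return accepted, quarantined


if __name__ == "__main__":
    demo = [
        {"control_id": "AC-2", "artifact": "s3://evidence/ac2.pdf",
         "classification": "confidential", "collected_at": "2024-05-01T00:00:00Z"},
        {"control_id": "AC-3", "artifact": "", "classification": "secret",
         "collected_at": "2024-05-01T00:00:00Z"},
    ]
    ok, bad = ingest(demo)
    print(f"accepted={len(ok)} quarantined={len(bad)}")
```

The quarantine-plus-audit shape is the point: every record's disposition is traceable through its lineage_id, which is what makes evidence generation repeatable and defensible in an audit.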
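Item 4's agentic patterns (RAG plus tool/function calling) can be sketched without committing to a framework. The Python below is an illustrative toy, not a real implementation: retrieval is naive word overlap instead of a vector database, the model's tool-call decision is hard-coded, and names such as POLICY_CORPUS, lookup_ticket, and INC-1042 are invented. In practice the tool schemas and retrieved passages would be sent to a hosted LLM, with an orchestrator such as LangGraph managing the loop.

```python
"""Sketch: keyword retrieval over a tiny policy corpus, plus a tool
registry the agent can dispatch into."""
from typing import Callable

# Hypothetical policy corpus; a real system would query a vector store.
POLICY_CORPUS = {
    "PW-01": "Passwords must be rotated every 90 days for privileged accounts.",
    "LG-04": "Audit logs must be retained for 365 days in immutable storage.",
}


def retrieve(question: str, k: int = 1) -> list[str]:
    """Rank corpus passages by word overlap with the question (toy RAG)."""
    words = set(question.lower().split())
    scored = sorted(
        POLICY_CORPUS.items(),
        key=lambda kv: len(words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [f"[{pid}] {text}" for pid, text in scored[:k]]


# Tool registry: the functions the "model" is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {}


def tool(fn: Callable[[str], str]) -> Callable[[str], str]:
    TOOLS[fn.__name__] = fn
    return fn


@tool
def lookup_ticket(ticket_id: str) -> str:
    """Stand-in for a GRC/ITSM connector."""
    return f"{ticket_id}: open, severity=medium, owner=iam-team"


def answer(question: str) -> str:
    """Stubbed agent step: ground the answer in retrieved policy text and
    dispatch a tool call when the question references a ticket."""
    context = retrieve(question)[0]
    if "ticket" in question.lower():
        # A real LLM would emit a structured tool call; we hard-code one.
        return f"Context: {context}\nTool result: {TOOLS['lookup_ticket']('INC-1042')}"
    return f"Context: {context}\n(An LLM would generate the answer from this context.)"


if __name__ == "__main__":
    print(answer("How long must audit logs be retained?"))
    print(answer("What is the status of ticket INC-1042 on password rotation?"))
```

Even in this toy form, the separation matters: retrieval grounds answers in citable policy text, and the tool registry is the natural place to enforce least-privilege access on what an agent may touch.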

Skills

Required

  • 4+ years of hands-on experience in data engineering and software development
  • Python and SQL
  • Building production data pipelines and data models, including both batch and stream processing
  • Strong engineering discipline (CI/CD, testing, monitoring, incident response)
  • Implementing governance controls in data and AI systems: data classification, PII handling, least-privilege access, encryption and secrets management, retention, audit logging, and lineage/metadata
  • Supporting GRC workflows and evidence needs

Nice to have

  • Additional languages: Java, Go, or JavaScript
  • Agent frameworks: LangChain/LangGraph, CrewAI, AutoGen, Semantic Kernel
  • Vector databases (Pinecone, Weaviate, Elastic), knowledge graphs, and RAG pipelines
  • LLMOps/MLOps tooling
  • Cloud and platform: AWS, Azure, or GCP; Kubernetes; Docker; Terraform/IaC; GitOps CI/CD
  • GRC platforms: ServiceNow GRC, Archer, OneTrust, BigID
  • Security data sources: SIEM/SOAR data, vulnerability data, identity logs

What the JD emphasized

  • governed data foundations
  • agentic AI-enabled cyber workflows
  • Governance, Risk, and Compliance (GRC)
  • regulated enterprise environments
  • data governance
  • auditability
  • regulatory defensibility
  • agentic patterns
  • tool/function calling
  • workflow orchestration
  • Retrieval-Augmented Generation (RAG)
  • secure integrations
  • governance controls

Other signals

  • AI-enabled cyber defense
  • risk reporting
  • continuous controls monitoring
  • AI-assisted security operations
  • LLMOps/MLOps tooling