Senior Software Engineer - AI for Security, Data/Application

ByteDance · Big Tech · San Jose, CA · R&D

Senior Software Engineer focused on AI for Security: building and refining AI security datasets, researching LLM performance in security contexts, developing interpretability-based evaluation standards, performing Red Teaming, and building RAG evaluation systems with interpretability and traceability tooling.

What you'd actually do

  1. Build and refine AI security datasets: Design and develop comprehensive, in-depth, and challenging datasets for AI-for-Security across different security scenarios.
  2. Explore model consistency and performance prediction in security contexts: Conduct in-depth research on how LLM performance on security tasks evolves during training, and assess the performance limits of models in security applications.
  3. Develop security data and evaluation standards from an interpretability perspective: Propose interpretability-based standards grounded in model mechanisms to assess transparency and reliability of LLMs in security decision-making and remediation.
  4. Red Teaming and model optimization: Perform Red Teaming from an evaluation perspective to systematically identify weaknesses of LLMs in security contexts and propose targeted optimization strategies.
  5. Build RAG evaluation systems: Design end-to-end evaluation metrics and benchmarks for security-specific RAG systems, create automated evaluation workflows, and develop interpretability and traceability tools for RAG systems.
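The RAG evaluation work in item 5 could start from metrics like the two below. This is a minimal Python sketch under stated assumptions: the function names, data structures, and toy document IDs are hypothetical illustrations, not any actual ByteDance tooling.

```python
# Two toy RAG evaluation metrics: retrieval recall (did the retriever find
# the relevant security docs?) and citation traceability (do the answer's
# citations trace back to retrieved docs?). All names here are hypothetical.

def retrieval_recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents found in the top-k retrieved set."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def citation_traceability(answer_citations, retrieved_ids):
    """Fraction of the answer's citations grounded in retrieved documents,
    a simple traceability check for a security RAG pipeline."""
    if not answer_citations:
        return 0.0
    grounded = sum(1 for c in answer_citations if c in set(retrieved_ids))
    return grounded / len(answer_citations)

# Example: a query about a vulnerability, with doc IDs standing in for a corpus.
retrieved = ["cve-2021-44228", "log4j-advisory", "unrelated-blog"]
relevant = ["cve-2021-44228", "log4j-advisory"]
citations = ["cve-2021-44228", "external-ref"]  # one citation is ungrounded

print(retrieval_recall_at_k(retrieved, relevant, k=2))  # → 1.0
print(citation_traceability(citations, retrieved))      # → 0.5
```

An automated evaluation workflow would run metrics like these over a benchmark of security queries with labeled relevant documents, which is where the dataset work in item 1 feeds in.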

Skills

Required

  • Python
  • Java
  • C++
  • NLP
  • CV (computer vision)
  • ML technologies
  • LLM-related stacks
  • Reward Model
  • GRPO/PPO/DPO
  • SFT/RFT
  • CT
  • PE

Nice to have

  • Published research papers at mainstream conferences or journals in the CV/NLP/security domains
  • Experience with security-related models (e.g., vulnerability detection models, malicious code analysis models)
  • Experience leading impactful projects or publishing significant papers in the LLM or AI security domains

What the JD emphasized

  • interpretability
  • evaluation
  • security

Other signals

  • AI for Security
  • LLM performance
  • interpretability
  • Red Teaming
  • RAG evaluation