Senior Machine Learning Engineer

Axon · Enterprise · Sterling, VA · 2024 Dedrone R&D

This Senior Machine Learning Engineer role focuses on designing and implementing high-performance C++ software for real-time computer vision and tracking algorithms on edge devices. It involves optimizing CUDA kernels, building parallel processing pipelines, and integrating ML models into production, with a strong emphasis on performance and memory management in constrained environments.

What you'd actually do

  1. Design and implement high-performance C++ software that runs computer vision and tracking algorithms in real time on edge devices.
  2. Work closely with computer vision / self-supervised learning engineers to integrate their models into production pipelines, including pre/post-processing, I/O, and system orchestration.
  3. Build and optimize multithreaded and parallel processing pipelines for ingesting, synchronizing, and processing data from a networked system of cameras.
  4. Implement and tune CUDA kernels and GPU-accelerated components to maximize throughput and minimize latency for inference, tracking, and search.
  5. Design robust data structures and memory management strategies for handling large volumes of video, sensor, and metadata streams under tight compute and power constraints.

Skills

Required

  • 5+ years of professional experience in modern C++ (C++14/17 or later)
  • Strong object-oriented and generic programming skills
  • Deep understanding of multithreading and concurrency
  • Experience building robust, concurrent systems
  • Hands-on experience with parallel processing frameworks or patterns
  • Strong command of data structures and algorithms
  • Proven experience with memory management and performance optimization in C++
  • Practical experience with CUDA (or similar GPU programming frameworks)
  • Strong debugging and profiling skills across CPU and GPU
  • Methodical approach to benchmarking and regression testing
  • Excellent collaboration and communication skills
  • Track record of working closely with research or ML teams to move algorithms from prototype to production

Nice to have

  • Experience integrating machine learning or computer vision inference engines (e.g., TensorRT, OpenVINO, ONNX Runtime)
  • Familiarity with Linux-based development (build systems like CMake, unit testing frameworks, containerization and/or cross-compilation for edge devices)

What the JD emphasized

  • high-performance C++ software
  • computer vision
  • tracking algorithms
  • edge devices
  • real time
  • multithreaded and parallel processing pipelines
  • CUDA kernels
  • GPU-accelerated components
  • inference
  • tracking
  • search
  • data structures and memory management strategies
  • large volumes of video, sensor, and metadata streams
  • tight compute and power constraints
  • modern C++ (C++14/17 or later)
  • multithreading and concurrency
  • parallel processing
  • data structures and algorithms
  • memory management and performance optimization
  • CUDA
  • machine learning or computer vision inference engines
  • Linux-based development
  • debugging and profiling skills
  • benchmarking and regression testing
  • working closely with research or ML teams

Other signals

  • Deploying ML models to edge devices
  • Real-time computer vision and tracking algorithms
  • Optimizing CUDA kernels for inference