Lead Software Engineer - AI/ML Deep Learning & GPU ML Serving

JPMorgan Chase · Banking · Palo Alto, CA +1 · Commercial & Investment Bank

Lead Software Engineer focused on AI/ML Deep Learning and GPU ML Serving within a Commercial and Investment Banking team. Responsibilities include designing, developing, and troubleshooting software solutions, writing production code, producing architecture artifacts, analyzing data, and optimizing deep learning models for inference. The role requires experience with ML systems, Python, ML frameworks (TensorFlow, PyTorch), container and cloud technologies (Docker, Kubernetes, AWS, GCP), ML model serving frameworks, GPU workloads, low-latency systems, NoSQL databases, GPU resource management, and microservices architecture. Preferred qualifications include an advanced degree, proficiency in multiple programming languages, and experience with graph neural networks, GPU programming, model monitoring, MLOps tools, and serving large-scale models.

What you'd actually do

  1. Lead the design, development, and troubleshooting of software solutions, applying innovative approaches to complex technical challenges.
  2. Write secure, high-quality production code and maintain algorithms integrated with firm systems.
  3. Produce architecture and design artifacts for advanced applications, ensuring compliance with design constraints.
  4. Analyze and visualize large, diverse data sets to improve software applications and systems.
  5. Identify and resolve hidden issues and patterns in data to enhance code quality and system architecture.

Skills

Required

  • software engineering concepts
  • ML systems
  • Python
  • TensorFlow
  • PyTorch
  • Docker
  • Kubernetes
  • EKS
  • AWS
  • GCP
  • TorchServe
  • TensorFlow Serving
  • Triton Inference Server
  • GPU workloads
  • web services
  • APIs
  • NoSQL databases
  • Cassandra
  • GPU resource management
  • cost optimization
  • microservices architecture
  • large-scale systems design

Nice to have

  • MS/PhD in Computer Science, Machine Learning, or a related field
  • Java
  • Scala
  • C++
  • graph neural networks
  • graph processing frameworks
  • DGL
  • PyTorch Geometric
  • NetworkX
  • CUDA
  • model monitoring
  • A/B testing
  • ML observability tools
  • MLOps tools
  • MLflow
  • Kubeflow
  • SageMaker
  • serving large-scale models
  • performance optimization

What the JD emphasized

  • ML systems
  • ML model serving frameworks
  • GPU workloads
  • low-latency systems
  • GPU resource management
  • large-scale models

Other signals

  • optimize deep learning models for production inference
  • deploy and manage GPU workloads in Kubernetes environments
  • build scalable, low-latency systems using web services and APIs