Sr Lead Software Engineer - AI/ML

JPMorgan Chase · Banking · New York, NY +1 · Consumer & Community Banking

Senior Lead Software Engineer focused on building, scaling, and maintaining robust machine learning platforms and infrastructure. The role involves designing and optimizing tools for the end-to-end ML lifecycle, including data engineering, feature management, model training, deployment, monitoring, and serving. It emphasizes secure, high-quality production code, MLOps practices, and collaboration with data scientists and ML engineers to accelerate ML development and operations within an enterprise environment.

What you'd actually do

  1. Design, build, and maintain scalable machine learning platforms and infrastructure to support end-to-end ML workflows.
  2. Develop and optimize tools for model training, deployment, monitoring, and lifecycle management.
  3. Integrate data engineering, feature management, and model serving capabilities into unified ML platform solutions.
  4. Implement secure, high-quality production code for platform services, APIs, and automation pipelines.
  5. Collaborate with data scientists, ML engineers, and product teams to understand requirements and deliver platform features that accelerate ML development and operations.

Skills

Required

  • software engineering concepts
  • building, deploying, and maintaining machine learning platforms or infrastructure
  • Python
  • ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn)
  • data processing frameworks and tools (e.g., Spark, Pandas, SQL)
  • cloud-based ML platforms (e.g., AWS SageMaker, GCP AI Platform, Azure ML) or on-prem ML infrastructure
  • MLOps practices
  • developing APIs and platform services for ML workflows
  • software development life cycle
  • agile methodologies

Nice to have

  • Databricks
  • Snowflake
  • Snorkel AI
  • containerization and orchestration tools (e.g., Docker, Kubernetes, Airflow)
  • feature stores
  • model registries
  • ML metadata management
  • infrastructure-as-code tools (e.g., Terraform, CloudFormation)
  • RESTful APIs
  • microservices architectures

What the JD emphasized

  • building, deploying, and maintaining machine learning platforms or infrastructure
  • MLOps practices, including CI/CD for ML, model versioning, and monitoring

Other signals

  • building scalable machine learning platforms
  • deploy and monitor models efficiently and securely
  • implementing critical technology solutions
  • ML platform capabilities
  • end-to-end ML workflows
  • model training, deployment, monitoring, and lifecycle management
  • feature management and model serving capabilities
  • secure, high-quality production code for platform services, APIs, and automation pipelines
  • accelerate ML development and operations
  • platform reliability, scalability, and performance
  • automating infrastructure provisioning, configuration, and CI/CD pipelines for ML platform services