Senior Software Engineer (AI/ML Platform)

Autodesk · Enterprise · Pune, India

Senior Software Engineer focused on designing and implementing scalable AI/ML serving systems within a hybrid cloud architecture. Responsibilities include managing model deployment, versioning, performance monitoring, and optimization, while ensuring security and compliance. The role requires strong software engineering skills, familiarity with AI/ML frameworks, and experience with cloud technologies and DevOps practices.

What you'd actually do

  1. Design and Implement Scalable AI/ML Serving Systems: Develop scalable and efficient systems for serving AI/ML models, ensuring that these systems can handle varying loads and perform with low latency across diverse environments
  2. Hybrid Cloud Architecture Management: Architect and manage a hybrid cloud environment that uses both on-premises resources and multiple cloud platforms (e.g., AWS, Azure, GCP) to optimize performance, cost, and scalability
  3. Model Deployment and Versioning: Oversee the deployment of AI/ML models into production, including the setup of CI/CD pipelines for model deployment and versioning, ensuring smooth and reliable model updates and rollbacks
  4. Performance Monitoring and Optimization: Implement monitoring tools and practices to track the performance of AI/ML models in production, identifying bottlenecks and optimizing system and model performance for better efficiency and reduced costs
  5. Security and Compliance: Ensure that the AI/ML serving systems follow industry standards and regulatory requirements for data security and privacy, including the management of data encryption, access controls, and audit trails

Skills

Required

  • Python
  • TensorFlow
  • PyTorch
  • AWS
  • Azure
  • GCP
  • Docker
  • Kubernetes
  • CI/CD
  • DevOps
  • latency optimization

Nice to have

  • AWS Certified Solutions Architect
  • Google Cloud Professional Cloud Architect
  • Microsoft Certified: Azure Solutions Architect Expert
  • Hadoop
  • Spark
  • Kafka
  • MLflow
  • Kubeflow
  • TensorBoard

What the JD emphasized

  • low latency
  • hybrid cloud
  • model deployment
  • performance monitoring
  • data security and privacy

Other signals

  • design and implement scalable AI/ML serving systems
  • hybrid cloud architecture management
  • model deployment and versioning
  • performance monitoring and optimization
  • collaboration and leadership