Senior/Staff Software Engineer - Pakistan

Vectara · Data AI · Remote · Platform Engineering and ML

Senior/Staff Software Engineer role focused on developing and managing the core infrastructure for an enterprise RAG and Agentic AI platform. Responsibilities include infrastructure automation, CI/CD, deploying and optimizing ML/NLP workloads (especially model inference), and supporting customer deployments. The role requires strong backend development, DevOps, and containerization skills, with experience in ML platforms and observability tools.

What you'd actually do

  1. Drive infrastructure automation, CI/CD, and monitoring/alerting pipelines.
  2. Collaborate with Field Engineering teams to support PoCs and platform deployments in customer cloud VPCs and on-prem environments.
  3. Deploy, scale, and optimize ML/NLP workloads, especially model inference.
  4. Lead initiatives to improve system reliability, scalability, and developer experience.
  5. Contribute to architecture and infrastructure decisions as we scale our platform.

Skills

Required

  • 5+ years of experience as a software engineer with a focus on backend systems and platform engineering.
  • Deep experience with all computing environments (GCP, AWS, Azure, or on-prem).
  • Strong understanding of containerization and orchestration (Docker, Kubernetes).
  • Experience with observability tools (Prometheus, Grafana, ELK/EFK, etc.).
  • Proficiency in languages like Go, Python, or Java; experience with infrastructure-as-code (Terraform, Pulumi, etc.).

Nice to have

  • Experience working on ML platforms or supporting ML workloads in production.
  • Familiarity with data infrastructure (e.g., Kafka, Spark, Airflow).
  • Experience providing customers with technical support for custom-developed systems.

What the JD emphasized

  • Accuracy, Security, and Explainability
  • production ready

Other signals

  • Deploy, scale, and optimize ML/NLP workloads, especially model inference.
  • Develop IaC and Helm charts for deploying and managing the core infrastructure that powers our retrieval-augmented generation and agentic AI platform.