Software Developer 4

Oracle · Enterprise · Santa Clara, CA +1

Software Developer 4 role at Oracle focused on building advanced AI applications for network automation, optimization, and security. Responsibilities include designing and implementing AI/ML systems for serving and training models, incorporating research on AI agents and inference, optimizing training and inference workloads, leading initiatives in RAG and LLM fine-tuning, and developing GPU-accelerated AI pipelines. Requires strong Python, ML frameworks, distributed systems, and MLOps experience.

What you'd actually do

  1. Design and implement scalable orchestration for serving and training AI/ML models.
  2. Explore and incorporate contemporary research on AI, agents, and inference systems into the software stack for designing, monitoring, troubleshooting and deploying networks.
  3. Evaluate, integrate, and optimize technologies across the stack for latency, throughput, and resource utilization in training and inference workloads.
  4. Lead initiatives in AI systems design, including Retrieval-Augmented Generation (RAG) and LLM fine-tuning.
  5. Design and develop scalable services and tools to support GPU-accelerated AI pipelines, using Python/Go and observability frameworks.

Skills

Required

  • Python
  • ML frameworks (PyTorch, TensorFlow)
  • LLMs
  • embeddings
  • vector search
  • RAG pipelines
  • fine-tuning
  • Data engineering: Spark, Kafka, Flink, OCI Streaming/Data Flow
  • Distributed systems
  • large-scale training/inference
  • Handling network telemetry (NetFlow, packet captures, streaming telemetry)
  • Network automation frameworks (Terraform, Ansible, NAPALM; Batfish is a plus)
  • Containerization
  • model serving
  • GPU workflows
  • CI/CD
  • MLOps tools
  • Writing design docs
  • scoping features
  • owning delivery end-to-end
  • 7+ years of experience building software systems
  • prior experience building AI applications and training models

Nice to have

  • MSEE, MSCS, or MSCE
  • Batfish

What the JD emphasized

  • building advanced AI applications powered by AI models
  • training AI models
  • building and optimizing large-scale AI systems
  • development and deployment of AI solutions
  • serving and training AI/ML models
  • training and inference workloads
  • LLM fine-tuning
  • large-scale training/inference
  • model serving
  • MLOps tools
  • building software systems
  • building AI applications and training models

Other signals

  • design and development team to build advanced AI applications powered by AI models
  • use AI/ML to automate, optimize, and secure networks
  • training AI models
  • building and optimizing large-scale AI systems
  • development and deployment of AI solutions
  • serving and training AI/ML models
  • contemporary research on AI, agents, and inference systems
  • training and inference workloads
  • LLM fine-tuning
  • GPU-accelerated AI pipelines
  • large-scale training/inference
  • model serving
  • MLOps tools