Systems Development Engineer

Amazon · Big Tech · NSW, Australia +1 · Systems, Quality, & Security Engineering

This role is for a Systems Development Engineer on the AWS Region Services team in Sydney, Australia. The primary focus is on supporting and refining system requirements and on developing and delivering operability features such as monitoring, diagnostics, and self-healing automation. The role involves managing and improving operations for scalable, high-availability cloud services, including participating in on-call rotations and incident resolution. While the team works with AI/ML, this specific role is centered on the engineering and operational aspects of secure cloud infrastructure, not on direct AI/ML model development or research. It requires experience with automation, at least one programming language, and Linux/Unix, and ideally with distributed systems, performance tuning, and infrastructure as code.

What you'd actually do

  1. Support the refinement of system requirements and participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation
  2. Develop or extend application and system management tools and processes that reduce manual effort and increase overall efficiency
  3. Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic
  4. Participate in the design and execution of production acceptance tests and new hardware evaluations
  5. Monitor the health of the fleet, automating system health, maintenance tasks, and reporting systems as needed

Skills

Required

  • 1+ years of experience contributing to automation for new and existing systems
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
  • Experience with Linux/Unix

Nice to have

  • Experience operating 24x7 high-availability distributed software applications, tuning application performance, and optimizing fleet utilization
  • Understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding) and experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar)
  • Experience scripting operating system tasks in Bash, Python, etc., and with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)

What the JD emphasized

  • Australian citizens
  • hold or be eligible to obtain an Australian Government Security Clearance
  • successfully complete an Organisational Suitability Assessment