AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the trn* and inf* servers that use them. This position is for a Software Engineer who will lead the development of services that aid in the optimization, analysis, and release of machine learning workloads and artifacts. The candidate must have experience leading distributed systems and machine learning projects, preferably from architecture through several generations of delivery to customers. Deep knowledge of optimization, resource management, and scheduling is needed. The ideal candidate will have experience working on services such as EC2, EKS, and Lambda in AWS, or on similar services from other cloud providers.
Key job responsibilities
- Lead the design and implementation of new tools, pipelines, and automation, working with developers, system architects, hardware engineers, and users both within and external to Amazon to ensure compatibility of this new toolset with existing and next-generation AI accelerators.
- Design, implement, and maintain CI/CD pipelines to automate the software release process.
- Collaborate with development teams to integrate new software releases.
- Infrastructure Management: Manage and automate infrastructure provisioning. Ensure high availability and scalability of systems through effective infrastructure management.
- Monitoring and Optimization: Implement monitoring solutions to track system performance. Identify bottlenecks and optimize system performance.
- Security and Compliance: Implement security best practices in the DevOps pipeline. Conduct regular vulnerability assessments and risk management.
A day in the life
As you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You’ll also:
- Build high-impact solutions to deliver to our large customer base.
- Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
- Work cross-functionally to help drive business decisions with your technical input.
- Work in a startup-like development environment, where you’re always working on the most important stuff.
About the team
The Neuron Infra Services team fosters a builder’s culture where experimentation is encouraged and impact is measurable. We emphasize collaboration, technical ownership, and continuous learning. Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help you develop your engineering expertise so you feel empowered to take on more complex tasks in the future.
Basic Qualifications
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience programming with at least one software programming language
- Knowledge of system performance, memory management, and parallel computing principles
- Experience debugging, profiling, and implementing software engineering best practices in large-scale systems
- Experience with AWS or cloud technologies
Preferred Qualifications
- 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Knowledge of fundamentals of networking, security, databases (relational or NoSQL), operating systems (Unix, Linux, and/or Windows)
- Knowledge of the fundamentals of machine learning and LLMs, including their architecture, along with work experience with specific LLMs
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, CA, Cupertino - 165,200.00 - 223,600.00 USD annually