AI Benchmarking Specialist, Spanish Support, International Seller Growth

Amazon · Big Tech · IN, KA, Bengaluru · Editorial, Writing, & Content Management

This role focuses on evaluating AI systems, specifically LLMs, by designing and executing benchmarking and audit activities. It involves assessing model quality, compliance, robustness, and fairness, as well as handling annotations used to train, measure, and improve AI models. The role also includes preparing audit reports and ensuring data quality.

What you'd actually do

  1. Assist in planning and executing benchmarking exercises for AI models, including defining test plans, metrics, and acceptance criteria across accuracy, robustness, bias, and reliability
  2. Support content accuracy, relevance, and privacy checks by reviewing datasets, model outputs, and data handling practices, escalating potential regulatory risks
  3. Validate data against specific annotation guidelines, ensuring the accuracy and quality of the collected information
  4. Prepare clear audit and benchmarking reports, including error ratings, root-cause analysis, and recommendations, and contribute to presentations for senior stakeholders
  5. Maintain organized audit documentation, evidence, and benchmarking datasets to support internal review

Skills

Required

  • Fluent Spanish (speak, read, write)

Nice to have

  • Experience with machine learning models

What the JD emphasized

  • Spanish
  • AI Benchmarking Specialist
  • AI auditing
  • quality assurance
  • traditional audit-style documentation
  • stakeholder communication
  • regulatory risks
  • annotation guidelines
  • audit documentation
  • benchmarking datasets
  • process efficiencies
  • automation
  • AI audit methodologies
  • checklists
  • test frameworks
  • regulations
  • best practices evolve
  • annotations for training, measuring, and improving Artificial Intelligence (AI) and Large Language Models (LLMs)
  • seller experience
  • accuracy
  • robustness
  • bias
  • fairness
  • content accuracy
  • relevancy
  • privacy checks
  • data handling practices
  • quality of the collected information
  • error ratings
  • root-cause analysis
  • recommendations
  • senior stakeholders
  • internal review
  • team members
  • managers
  • drive process efficiencies
  • explore opportunities for automation
  • enhance the productivity and effectiveness of data generation
  • contributing to the development and continuous improvement of AI audit methodologies

Other signals

  • evaluating AI systems
  • benchmarking and audit activities
  • model quality, compliance, robustness, and fairness
  • annotations for training, measuring, and improving AI and LLMs