Senior Data Engineer, WW FBA Central Analytics

Amazon · Big Tech · IN, KA, Bengaluru · Data Science

Senior Data Engineer role focused on building and maintaining the data infrastructure behind AI-powered insights within Amazon's Fulfillment by Amazon (FBA) division. On the platform side, the role involves architecting a data lakehouse, migrating the existing architecture to it, establishing metadata management, designing data ingestion patterns, creating a centralized metrics repository, and implementing data quality frameworks. On the AI side, it involves designing semantic data models optimized for AI retrieval, developing intelligent orchestration, implementing federated query capabilities, architecting vector database infrastructure, integrating AI-accessible data contracts, and building monitoring frameworks. The position sits at the intersection of data engineering and AI, with the goal of making GenAI insights trustworthy, fast, and scalable.

What you'd actually do

  1. Architect and implement a scalable, cost-optimized S3-based Data Lakehouse that unifies structured and unstructured data from disparate sources across 8 WW FBA metrics domains.
  2. Lead the strategic migration from Redshift-centric architecture to a flexible lakehouse model, targeting query performance improvement from 60–300 seconds to under 10 seconds.
  3. Establish metadata management with automated data classification and lineage tracking.
  4. Design and enforce standardized data ingestion patterns with built-in quality controls and validation gates.
  5. Architect a centralized metrics repository that becomes the single source of truth for all FBA metrics across various time grains.

Skills

Required

  • SQL
  • AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
  • data warehouse technical architectures, data modeling, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures and hands-on SQL coding experience
  • programming/scripting (Batch, VB, PowerShell, Java, C#, Chef, Perl, Ruby and/or PHP), or experience in any big data architecture, together with strong analytical skills, attention to detail, and effective communication abilities
  • building and maintaining data flows and pipelines
  • Data & AI related technologies, including, but not limited to, AI/ML, GenAI, Analytics, Database, and/or Storage experience

Nice to have

  • mentoring team members on best practices
  • big data technologies such as: Hadoop, Hive, Spark, EMR
  • operating large data warehouses
  • training and deploying machine learning systems to solve large-scale optimizations, or experience with data infrastructures: relational analytic DBMS, Elasticsearch, and big data EMR/EC2/Glue/Lambda
  • data mining, ETL, etc. and using databases in a business environment with large-scale, complex datasets
  • machine learning, data mining, information retrieval, statistics or natural language processing, or experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware
  • building analytic or scientific data products or solutions
  • big data architecture, or experience in Redshift and experience in managing firewalls
  • leading technical initiatives and key deliverables
  • leading large teams, with demonstrable ability to hire, develop, and manage high-performing technical teams

What the JD emphasized

  • integrating LLM-powered solutions with robust backend systems
  • own the data foundation that determines whether GenAI-powered insights are trustworthy, fast, and scalable
  • deliver proactive, AI-generated insights across FBA metrics to business leadership worldwide
  • optimizing for AI retrieval patterns
  • AI retrieval requirements for LLM-powered insight generation
  • Architect vector database infrastructure
  • AI-accessible data contracts

Other signals

  • Design extensible metrics schemas that support complex analytical queries while optimizing for AI retrieval patterns
  • Architect vector database infrastructure capable of managing large-scale embeddings with consistent low-latency retrieval
  • Integrate schema definitions through MCP service calls to enable automated, AI-accessible data contracts