Data Scientist

Apple · Big Tech · Hyderabad, India · Software and Services

This role focuses on building and scaling automated insight pipelines for the sales organization, developing ML models for opportunity detection and performance diagnosis, and embedding these insights into AI agents, dashboards, and GenAI tools. The role involves end-to-end insight development, including data preparation, statistical analysis, LLM prompt engineering, and deploying ML models for forecasting, anomaly detection, attribution, and causal inference. It also includes building RCA and recommendation engines, analyzing agent interactions, implementing LLM evaluation pipelines, and supporting experimentation. The role requires partnering with AI engineers and PMs, acting as a data translator, influencing upstream data model design, and driving KPI definitions.

What you'd actually do

  1. Lead end-to-end insight development: from data preparation and statistical analysis to LLM prompt engineering that translates findings into sales-ready insights.
  2. Design and deploy ML models for forecasting, anomaly detection, attribution modeling, and causal inference, either by building custom solutions or by adapting Apple's existing ML services.
  3. Build RCA and recommendation engines that enhance summarization and chatbot capabilities.
  4. Analyze agent interactions and implement LLM evaluation pipelines to measure factual accuracy, latency, and user satisfaction.
  5. Support experimentation and A/B testing for new insight types and interaction methods.

Skills

Required

  • 4+ years of experience in a Data Science, Data Analysis, or Data Visualization role.
  • Hands-on experience with LLMs, RAG architectures, and prompt engineering.
  • Strong proficiency in Python and ML/data science libraries.
  • Applied knowledge of statistical data analysis, predictive modeling, classification, Time Series techniques, sampling methods, multivariate analysis, hypothesis testing, and drift analysis.
  • Proficiency in SQL and experience with cloud data platforms (Snowflake, Spark, BigQuery, etc.).
  • Expertise with data visualization tools (such as Tableau, d3, plotly, etc.) for data analysis and presentation.
  • Experience with Git and collaborative development workflows.
  • Familiarity with deployment frameworks and tools (Docker, Kubernetes, FastAPI, or similar).
  • Comfort with ambiguity and the ability to structure complex problems through data analysis and strategy research.
  • Proven ability to translate business problems into technical solutions and communicate findings to non-technical stakeholders.
  • Experience co-developing with data scientists and software engineers in production environments.
  • Strong time management skills with the ability to collaborate across multiple teams.
  • Able to balance competing priorities, long-term projects, and ad hoc requirements.
  • Bachelor’s degree in Computer Science, Statistics, Mathematics, Engineering, Economics, Applied Mathematics, Machine Learning, or a related field.

Nice to have

  • Experience with Tableau Server, TabPy, and Extensions.
  • Production experience with GenAI frameworks (LangChain, LlamaIndex, Haystack, etc.).
  • Familiarity with LLM observability and evaluation tools (LangSmith, Weights & Biases, TruLens, etc.).
  • Experience with vector databases, embedding models, and retrieval algorithms.
  • Knowledge of agent architectures and knowledge graphs for LLM applications.
  • Experience with CI/CD pipelines and MLOps practices.
  • Experience with drift detection and model monitoring in production.
  • Track record of presenting insights to senior leadership and influencing business strategy.
  • Sound communication skills: adept at conveying domain and technical content at a level appropriate for the audience, with a strong ability to gain the trust of stakeholders and senior leadership.
  • Advanced Degree (MS or Ph.D.) in Economics, Electrical Engineering, Statistics, Data Science, or a similar quantitative field.

What the JD emphasized

  • Hands-on experience with LLMs, RAG architectures, and prompt engineering.
  • LLM evaluation pipelines
  • agent interactions

Other signals

  • build and scale automated insight pipelines
  • develop ML models that detect opportunities, diagnose performance issues, and recommend actions
  • embed these insights into AI agents, dashboards, and GenAI-powered tools
  • LLM prompt engineering
  • LLM evaluation pipelines