Senior Data Scientist I

This Senior Data Scientist I role focuses on designing, building, and maintaining ETL/ELT pipelines, on cloud (Azure) or on-prem, for data collection, ingestion, and storage. It involves monitoring, optimizing, and troubleshooting data pipelines while ensuring data quality, security, and compliance. The role also requires collaborating with partners across the organization, mentoring Data Engineers, and leading design and solutioning. Technical skills include SQL, Python/Scala, NoSQL and distributed databases, Big Data frameworks (Spark, Hadoop, Hive), and cloud platforms (Azure).

What you'd actually do

  1. Design, build, and maintain robust ETL/ELT pipelines on cloud (Azure) or on-prem to collect, ingest, and store large volumes of structured and unstructured data for batch and real-time processing
  2. Monitor, optimize, and troubleshoot data pipelines to ensure reliability, scalability, and performance
  3. Ensure that guidelines, policies, and standards for data processing, quality, security, and compliance are followed
  4. Collaborate with multiple partners from Business, Technology, Operations, and D&A capabilities (Data Governance, Data Quality, Data Modeling, Data Architecture, Data Science, DevOps, BI & Insights)
  5. Mentor Data Engineers

Skills

Required

  • SQL
  • Python/Scala
  • NoSQL and distributed databases (HBase, Cosmos DB)
  • ETL pipeline design and development
  • Solutioning and estimation
  • Big Data frameworks: Apache Spark, Hadoop, Hive
  • Cloud platforms: Azure Data Factory, Event Hubs, Azure Functions, Synapse, Databricks
  • Data warehouses, data marts, data lakes
  • Medallion architecture
  • Performance tuning, optimization, and data quality validation
  • Real-time and batch data processing, streaming pipelines with Spark
  • Communication skills
  • Analytical skills
  • Structured problem-solving skills
  • Mentorship skills and experience
  • Storytelling skills
  • Partner and stakeholder engagement experience

Nice to have

  • DevOps practices: Git, Azure DevOps, CI/CD pipelines
  • Unix shell scripting
  • Kafka
  • MongoDB
  • NiFi
  • Exposure to Gen AI technology and tools
  • Banking, Financial Services, and Insurance (BFSI) domain knowledge