Data Engineer (Python, Spark)

Autodesk · Enterprise · Bangalore, India

Autodesk is seeking a Data Engineer to build and maintain data engineering pipelines and data models within the Access domain. The role involves improving data quality, reliability, and observability, modernizing legacy workflows, and developing ETL/ELT processes using technologies like Spark, Flink, and cloud services on AWS. The engineer will collaborate with data scientists and product managers, and leverage AI-assisted development tools.

What you'd actually do

  1. Bring a product-focused mindset: understand business requirements and architect systems that scale and extend to accommodate those needs
  2. Break down complex problems, define technical solutions, and sequence work to enable fast, iterative improvements
  3. Design, build, and maintain scalable data pipelines and data models across Access
  4. Modernize legacy data workflows and infrastructure, including migrations from platforms such as Hive to Iceberg
  5. Develop reliable ETL/ELT workflows to ingest, transform, and serve data for analytics and operational use cases

Skills

Required

  • SQL
  • Python
  • Spark
  • Airflow
  • Snowflake
  • AWS
  • Azure
  • GCP
  • Looker
  • Power BI
  • Git
  • Jenkins CI
  • Flink
  • data modeling
  • pipeline reliability
  • large-scale data processing
  • Jupyter
  • EMR Notebooks
  • Apache Zeppelin
  • AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code)

Nice to have

  • AI-powered tools, agents, or automation in data workflows
  • Model Context Protocol (MCP)
  • data platform enhancements, optimizations, or small-scale migrations
  • data observability and monitoring practices

What the JD emphasized

  • product-focused mindset
  • understand business requirements
  • architect systems that scale
  • accommodate those needs
  • Break down complex problems
  • define technical solutions
  • sequence work
  • fast, iterative improvements
  • Design, build, and maintain scalable data pipelines
  • data models
  • Modernize legacy data workflows
  • infrastructure
  • migrations
  • Develop reliable ETL/ELT workflows
  • ingest, transform, and serve data
  • analytics and operational use cases
  • Interface with data engineers, data scientists, product managers, and other stakeholders
  • understand their needs
  • promote best practices
  • growth mindset
  • identify business challenges and opportunities for improvement
  • solve them using data analysis and data mining
  • make strategic and tactical recommendations
  • Enable analytics
  • provide critical insights
  • product usage, campaign performance, funnel metrics, segmentation, conversion, and revenue growth
  • partner with different teams
  • understand business needs and requirements
  • Own critical data pipelines end-to-end
  • contribute to improving the overall data platform
  • Strong proficiency in SQL
  • programming language such as Python
  • Experience building data pipelines using modern data technologies
  • Spark, Airflow, Snowflake, or similar
  • Experience with cloud-based data architectures
  • AWS, Azure, or GCP
  • Experience building dashboards and analytics in Looker and/or Power BI
  • Experience with version control and CI/CD tools like Git and Jenkins CI
  • Experience with streaming architectures and Flink-based processing
  • Strong understanding of data modeling
  • pipeline reliability
  • large-scale data processing
  • Experience working with notebook solutions like Jupyter, EMR Notebooks, or Apache Zeppelin
  • Experience leveraging AI-assisted development tools
  • GitHub Copilot, Cursor, Claude Code
  • improve productivity
  • Strong problem-solving and communication skills
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • Exposure to building or using AI-powered tools, agents, or automation in data workflows
  • Familiarity with concepts like Model Context Protocol (MCP) or similar approaches for integrating AI with data systems
  • Experience working on data platform enhancements, optimizations, or small-scale migrations
  • Understanding of data observability and monitoring practices