Lead Software Engineer - Data Engineering

Caterpillar · Industrial · Chennai, Tamil Nadu

This Lead Software Engineer role in Data Engineering focuses on developing Caterpillar's next-generation Digital Manufacturing Data Platform. Responsibilities include architecting scalable AWS data platforms, designing ingestion frameworks, leading data engineering efforts, and collaborating with Data Science and AI teams to operationalize ML models and AI capabilities. The role requires strong expertise in data engineering, AWS, Snowflake, Python, and SQL.

What you'd actually do

  1. Lead and mentor a team of data engineers and platform developers
  2. Architect scalable and secure data platforms on AWS
  3. Design robust data ingestion frameworks for batch and near real-time pipelines
  4. Lead design and development of scalable ingestion pipelines (structured and unstructured data)
  5. Build and optimize Snowflake-based data platforms for performance and cost

Skills

Required

  • 10+ years of experience in Data Engineering / Data Platform roles
  • Strong experience in AWS data ecosystem (S3, Glue, Lambda, EMR, Redshift)
  • Deep expertise in Snowflake (architecture, optimization, data modeling)
  • Strong programming skills in Python and SQL
  • Extensive experience with data ingestion pipelines and ETL/ELT frameworks
  • Exposure to real-time streaming (Kafka, Spark Streaming)
  • Experience with CI/CD tools (GitHub, Jenkins, AWS CloudFormation, etc.)
  • Solid understanding of distributed systems and scalable architectures
  • Strong foundation in software engineering principles (Git, testing, design patterns)
  • Experience working in Agile teams

Nice to have

  • Experience with Graph Databases (Neo4j, Neptune)
  • Experience with Vector Databases (Milvus, OpenSearch)
  • Knowledge of NVIDIA ecosystem and RAPIDS (cuDF, cuML, cuGraph)
  • Experience integrating AI/ML pipelines or GenAI workflows