Software Engineer - Data Engineering

Caterpillar · Industrial · Chennai, Tamil Nadu +1

Software Engineer focused on data engineering, responsible for designing, developing, and deploying data pipelines, data lakes, and data warehouses. The role involves working with big data technologies, cloud platforms (AWS/Azure), and message brokers, with a requirement for strong Python and SQL skills. While AI concepts are mentioned as a plus, the core focus is on data infrastructure and movement.

What you'd actually do

  1. Contribute to development and deployment of Caterpillar’s state-of-the-art digital platform.
  2. Design and implement end-to-end near real-time data movement solutions.
  3. Work independently on complex systems or infrastructure components that may be used by one or more applications or systems.
  4. Drive application development focused on delivering features with business value.
  5. Maintain high standards of software quality within the team by establishing good practices and habits.

Skills

Required

  • Software development experience
  • Designing and developing data pipelines in Python
  • Python as an object-oriented programming language
  • Data pipelines in big data environments
  • Large data lakes and data warehouses (Snowflake preferred)
  • End-to-end near real-time data pipelines for OLTP & OLAP
  • Robust data platform architecture
  • SQL and NoSQL databases
  • Message brokers and AWS services (Kafka/Kinesis, AWS SQS, AWS SNS, Lambda, API Gateway, DynamoDB, Aurora, AWS RDS PostgreSQL)
  • Deploying software using CI/CD tools (Azure DevOps, Jenkins, GoCD, AWS CloudFormation)
  • Deploying and maintaining software using public clouds (AWS or Azure)
  • Test-driven development
  • Behaviour-driven development
  • Analytical skills
  • Computer science fundamentals (data structures, algorithms, object-oriented design)
  • Ability to work under pressure and within time constraints
  • Leadership on medium to large-scale projects

Nice to have

  • Good understanding of AI concepts and latest developments (Gen AI, MCP, ATA, etc.)

What the JD emphasized

  • 5+ years of software development experience
  • 5+ years of experience designing and developing data pipelines in Python
  • Proven experience in many of the required skills listed above
  • At least 2 years of deploying and maintaining software using public clouds such as AWS or Azure
  • Bachelor’s degree in computer science, electrical engineering, or a related field is required