Director Software Eng

Honeywell · Industrial · Bengaluru, Karnataka, India

This is a Director of Software Engineering role focused on Site Reliability Engineering (SRE) at Honeywell. The role involves strategic direction, leadership, and oversight of SRE practices to ensure system reliability, operational excellence, and customer satisfaction. Key responsibilities include managing SRE teams and projects, collaborating with cross-functional teams, and ensuring adherence to best practices in reliability, incident management, and automation. The position requires a deep understanding of cloud platforms (Azure/GCP), hybrid architectures, data platforms, MLOps, Kubernetes, infrastructure as code, and programming/scripting for automation. Experience in Platform Engineering and Developer Enablement is also crucial.

What you'd actually do

Provide strategic direction and leadership to the site reliability engineering function.
Oversee the planning, execution, and delivery of SRE projects and initiatives.
Collaborate with cross-functional teams and stakeholders to enhance system reliability and performance.
Ensure adherence to best practices in site reliability, incident management, and operational excellence.
Manage and optimize SRE resources, budgets, and timelines.

Skills

Required

Site Reliability Engineering leadership
SRE team and project management
Azure/GCP and hybrid architectures
Data platforms
MLOps
Model deployment strategies
Reliability engineering principles
Incident management
Monitoring
Automation
Cloud platforms
Kubernetes
Infrastructure as code
Programming and scripting languages
Platform Engineering
Developer Enablement
CI/CD capabilities

Nice to have

Bachelor’s degree in Computer Science, Engineering, or a related field
8+ years of experience in site reliability engineering or related fields with leadership responsibilities
Strong problem-solving skills
Continuous improvement
DevOps practices
Cloud-native technologies
Collaboration, innovation, and operational excellence culture

What the JD emphasized

Extensive experience in site reliability engineering leadership roles with proven success in managing SRE teams and projects.
Deep understanding of Azure/GCP and hybrid architectures; able to review designs for scalability, resilience, and cost optimization.
Hands-on experience with data platforms and MLOps; able to guide model deployment strategies.
Strong expertise in reliability engineering principles, incident management, monitoring, and automation to ensure system uptime and performance.
Proficiency with cloud platforms, container orchestration (such as Kubernetes), and infrastructure as code tools.
Experience with programming and scripting languages used in automation and tooling development.
Experience with Platform Engineering and Developer Enablement, including building an internal development platform consisting of templates, guardrails, and self-service CI/CD capabilities.

In this role, you will impact Honeywell’s ability to maintain high system reliability and operational excellence, supporting business continuity and customer satisfaction through effective site reliability engineering practices.

KEY RESPONSIBILITIES

Provide strategic direction and leadership to the site reliability engineering function.
Oversee the planning, execution, and delivery of SRE projects and initiatives.
Collaborate with cross-functional teams and stakeholders to enhance system reliability and performance.
Ensure adherence to best practices in site reliability, incident management, and operational excellence.
Manage and optimize SRE resources, budgets, and timelines.
Identify and implement process improvements to enhance efficiency and productivity.
Lead and develop a high-performing site reliability engineering team.

YOU MUST HAVE

Extensive experience in site reliability engineering leadership roles with proven success in managing SRE teams and projects.
Deep understanding of Azure/GCP and hybrid architectures; able to review designs for scalability, resilience, and cost optimization.
Hands-on experience with data platforms and MLOps; able to guide model deployment strategies.
Strong expertise in reliability engineering principles, incident management, monitoring, and automation to ensure system uptime and performance.
Proficiency with cloud platforms, container orchestration (such as Kubernetes), and infrastructure as code tools.
Experience with programming and scripting languages used in automation and tooling development.
Experience with Platform Engineering and Developer Enablement, including building an internal development platform consisting of templates, guardrails, and self-service CI/CD capabilities.

WE VALUE

Bachelor’s degree in Computer Science, Engineering, or a related field.
8+ years of experience in site reliability engineering or related fields with leadership responsibilities.
Strong problem-solving skills and ability to drive continuous improvement in complex systems.
Experience with DevOps practices, CI/CD pipelines, and cloud-native technologies.
Ability to foster a culture of collaboration, innovation, and operational excellence.
