Senior Linux Systems Engineer - Object Storage (2pm - 11pm IST)

CrowdStrike · Enterprise · Bangalore, India

Senior Linux Systems Engineer focused on designing, building, and maintaining large-scale distributed object storage systems across hybrid cloud environments. Responsibilities include Linux administration, performance optimization, automation, and troubleshooting complex storage and network issues. The role also involves collaborating with cross-functional teams and mentoring junior engineers.

What you'd actually do

  1. Perform Linux engineering and administration for thousands of bare metal and virtual machines
  2. Engineer large-scale cloud environments for clustered object storage solutions while demonstrating proficiency with public cloud and cloud administration concepts
  3. Troubleshoot server hardware issues while monitoring, maintaining, and operating a production environment
  4. Automate complex routine tasks and then deploy using an IaaS model with tools like Chef or Ansible
  5. Configure and optimize Linux kernel parameters, file systems, and storage subsystems to maximize object storage performance across diverse hardware configurations

Skills

Required

  • 11+ years of professional experience working on large-scale distributed systems
  • BS/MS degree in Computer Science or related field (or equivalent work experience)
  • Extensive expertise in Linux internals including virtual memory management, process scheduling, and storage I/O subsystems with proven ability to diagnose and resolve kernel-level performance issues
  • Proficiency with Linux performance analysis and monitoring tools and/or custom metrics collection to proactively identify and resolve system bottlenecks before they impact production
  • Experience with on-premise storage (block and object) technologies, including hands-on usage with Kubernetes-native, AWS S3-compatible object storage solutions
  • Strong networking fundamentals with hands-on experience in TCP/IP, routing protocols, network security, and/or high-bandwidth data transfer optimization for distributed storage environments
  • Proficient with observability tools (Prometheus, Grafana, ELK stack) for large-scale monitoring
  • Automation experience using Python/Go scripting and Chef config management
  • Meticulous attention to detail and the ability to make good, timely decisions
  • A strong focus on security when managing systems and/or developing/reviewing code
  • A passion for documentation and a desire to constantly improve knowledge transfer across teams
  • Proficiency with project and program tooling (e.g., Jira, GitLab)
  • Excellent written and verbal communication skills
  • Comfort collaborating professionally with distributed, cross-functional teams spanning multiple divisions around the globe
  • A proactive, can-do attitude and the ability to excel both when working independently and when collaborating as part of a team
  • A passion for getting into the weeds with data by using standard analytics and forecasting methods

Nice to have

  • Mentorship and guidance to junior engineers
  • Drive improvements to the processes, systems, and tooling used to forecast use of our solutions
  • Custom monitoring solutions
  • Simple to moderately complex scripts and programs for automation, tools, frameworks, dashboards, and alarms
  • Raise the technical IQ of the team by being passionate about learning and sharing the newest technologies & tricks with others
  • Manage project timelines and keep tabs on progress from the big picture down to the nitty-gritty, all while juggling inter-team dependencies
  • Proactively track the status of project activities and ensure that schedules and priorities are being met
  • Ensure critical issues are identified, tracked through resolution, and escalated if necessary
  • Collaborate with leadership to develop tools for analyzing and forecasting both organic and surge-based growth within the operating environment
  • Represent the development team as a technical leader both vertically and horizontally across workgroups
  • Share on-call rotation with other team members
  • Ansible

What the JD emphasized

  • custom S3-compatible object store solution
  • 2,000 Linux servers
  • hybrid cloud 24x7 system
  • critical production environment
  • large-scale distributed systems
  • Linux internals
  • kernel-level performance issues
  • on-premise storage (block and object)
  • Kubernetes-native, AWS S3-compatible object storage solutions
  • high-bandwidth data transfer optimization