You’ll contribute to the development and upkeep of our configuration management systems and bring deep expertise in Linux kernel internals, networking, and performance tuning.
What You’ll Do
- Maintain and support the firm’s Linux platform as part of a global engineering team.
- Take ownership of tasks and initiatives, acting as a bridge across regional teams.
- Troubleshoot complex issues including packet loss, storage/network latency, and system instability.
- Understand internal client needs and communicate them effectively to leadership across regions.
- Identify potential risks, develop mitigation strategies, and implement solutions.
- Monitor the performance and health of critical computing environments.
- Resolve advanced system and network issues involving Linux, routing, switching, and network protocols, working with tools and platforms such as Slurm, Kubernetes, Salt, and Grafana.
- Participate in monthly on-call rotations and provide support to on-call staff.
- Continuously learn and grow, both independently and through mentorship, while maintaining clear documentation and training materials.
What You’ll Bring
- Excellent communication skills with the ability to engage professionally across all levels of the organization.
- A calm, professional demeanor when handling incidents with clients and leadership.
- Deep understanding of Linux internals, including kernel operations, memory management, sockets, and interrupts.
- Strong knowledge of storage technologies such as Pure, Lightbits, GPFS, NFS, and NetApp.
- Solid grasp of networking and storage fundamentals.
- Proficiency in network protocols and how they are managed by the Linux kernel.
- Experience automating tasks using Python, shell scripting, or similar scripting languages.
- A proactive approach to infrastructure automation and a willingness to contribute to development efforts.
- A collaborative mindset and a commitment to continuous learning and cross-regional teamwork.