MartinFed · 2 days ago
Senior HPC Linux Systems Engineer
XCEL Engineering, Inc. is an award-winning small business providing IT and engineering solutions. They are seeking a Senior HPC Linux Systems Engineer to improve the security, performance, and reliability of computing environments at the National Center for Computational Sciences.
Consulting, Corporate Training, Information Services, Information Technology, Software
Responsibilities
Install, integrate, and administer HPC Linux clusters and high-speed networks
Diagnose system operational problems quickly and effectively
Coordinate with vendors to resolve hardware and software problems
Recommend, plan, and coordinate hardware and software changes with customer participation using change management processes
Port and write system management tools
Document system administration procedures for routine and complex tasks
Participate in a 24-hour, 7-day on-call support rotation and off-hours maintenance windows
Handle system implementation and integration into the NCCS environment, as well as systems performance
Lead system deployment, integration, and troubleshooting of a large-scale computer
Participate in discussions of relevant systems topics with the internal and external community of peers, contributing experiences and solutions
Mentor junior-level staff as they join the team
Deliver ORNL's mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service
Qualifications
Required
Bachelor's Degree in a scientific or technical field
8+ years of Linux systems experience
An equivalent combination of education and experience will be considered
Preferred
Experience managing Linux operating systems in a large-scale system environment
Solid understanding of networked computing environment concepts
Experience with Linux Cluster Administration
Ability to develop and maintain programs and scripts that aid in the operation and automation of administrative tasks using various shell and scripting languages (bash, Python, Go)
Experience with Lustre and GPFS file systems
Experience with batch schedulers (particularly Slurm)
Experience deploying and maintaining automated configuration management software such as Puppet
Strong interpersonal and communication skills
Ability to work as a team player
Proactive and solution-oriented problem solver
Prior project and/or team leadership experience
Company
MartinFed
Welcome to MartinFederal! For over a decade, MartinFederal has provided the U.S.
Funding
Current Stage
Growth Stage
Recent News
Washington Technology
2025-09-30
Company data provided by Crunchbase