Research Systems Administrator Gemini Systems

October 10, 2024
Industries Healthcare, social assistance
Categories System administrator
Toronto, ON • Full time
Research Systems Administrator GEMINI Systems (Job ID: 7526)

GEMINI Medicine is at the forefront of medical research and innovation, providing cutting-edge computational resources to researchers through our high-performance computing infrastructure. We operate a 100% Linux environment and are deeply committed to automating our infrastructure to deliver seamless, efficient services to our users.

GEMINI Medicine is currently looking for a Research Systems Administrator.

The primary role of the Research Systems Administrator is to ensure that all systems for the research program are running at optimal technical levels. The successful candidate will have a strong background in Linux systems administration, experience managing Slurm clusters, and a passion for automating infrastructure deployment and management.

This role offers the opportunity to work with advanced technologies and contribute to critical medical research.

Duties/Responsibilities

Performs System Maintenance and Administration Responsibilities (80% of work time)

  • Infrastructure Management: Manage and maintain our HPC infrastructure located at St. Michael's Hospital and the HPC4Health Datacenter at SickKids, including both GPU and CPU nodes.
  • Slurm Cluster Administration: Oversee and optimize the Slurm workload manager, ensuring efficient scheduling and resource allocation across our high-performance computing nodes.
  • Systems Automation: Develop, implement, and enhance automation workflows using Ansible and Python to streamline infrastructure deployment, configuration, and management.
  • Database Administration: Manage and optimize our PostgreSQL databases, ensuring high availability and performance for critical applications.
  • Security and Compliance: Implement and maintain security best practices across all systems, ensuring compliance with industry standards and regulations.
  • User Support: Provide technical support to users, troubleshoot issues, and ensure smooth operation of our computing resources.
  • Collaboration: Work closely with cross-functional teams, including researchers, developers, and other IT staff, to deliver high-quality services and support.

Performs Support, Troubleshooting, and Leadership Responsibilities (20% of work time)

  • Provide high-level technical guidance to the team in the design of new systems and solutions to streamline workflows and operations.
  • Assist other members of the team with resolving technical issues when the expertise needed exceeds their skill set.
  • Actively participate in meetings and provide regular updates and feedback to the team on actions being taken and blockers encountered.
  • Maintain active involvement in designated activities of new projects going live.
  • Provide knowledge transfer and technical training of our systems to partners and stakeholders as needed.
  • Provide leadership in problem-solving and in incident identification and resolution.

Qualifications

  • An undergraduate degree in Computer Science, or equivalent relevant experience, and 2 years of experience in a similar position, OR a demonstrable equivalent combination of specialized education and experience.

  • Experience: Minimum 3-5 years of experience in Linux systems administration, with a focus on high-performance computing environments.
  • Technical Skills:
    • Proficiency in Slurm workload manager administration.
    • Strong experience with Ansible for automation and configuration management.
    • Solid understanding of PostgreSQL database management and optimization.
    • Proficient in Python scripting for automation and system integration tasks.
  • Problem-Solving: Strong analytical and troubleshooting skills, with the ability to resolve complex technical issues.
  • Communication: Excellent verbal and written communication skills, with the ability to convey technical concepts to non-technical audiences.
  • Team Player: Ability to work collaboratively in a team environment and contribute to a culture of continuous improvement.

Please Note: Registering and creating an account with Unity Health does not mean you have submitted an application for the position. Please ensure that you register and create an account with Unity Health AND apply to the position. Both steps must be completed for your application to be considered.

If you are an internal employee, please apply through the Intranet for your application to be considered.

    Thank you for applying.
