Advanced Grid Engine Training
Course Outline

Overview

This course is designed to extend the system administrator's and end user's knowledge of Grid Engine by covering the detailed aspects of its functionality and commands. The course empowers administrators to translate business goals into a Grid Engine configuration.

The workshops provide practical experience with gathering site-defined shared resources such as licenses, configuring job submission and execution environments, working with dynamic cluster configurations, and more.

Hands-on exercises and practical troubleshooting tips are integrated throughout the course.

Audience

This advanced course is designed for system administrators and advanced end users who are responsible for extending the role of Grid Engine in site-defined cluster resource management and who need to implement job and resource controls.

The course content is applicable to all versions of Grid Engine.

Prerequisites

  • Basic knowledge of the Linux or Unix operating system
  • Basic knowledge of a Unix shell (such as bash, csh, or ksh) and the vi editor
  • Basic knowledge of system administration concepts and parallel programming models (shared memory/distributed memory)
  • Practical Grid Engine (or similar) administration skills or advanced Grid Engine user experience are advantageous but not required

What Customers are Saying

Training brought us many new insights... Just a few weeks later, when we experienced a small issue, we were able to solve it instantly!

Ralf Nolte, Systems Administrator, CeBiTec, Bielefeld University

The benefits of training are significant, especially in managing the risk the business is exposed to.

Mike Twelves, Supply Chain Solutions, Tata Steel

Course Outline

  • Concepts Review

    • Grid Engine concepts and components

  • Advanced Configurations

    • Global configuration
    • Host configuration
    • Queue configuration
    • Load sensors and resources

  • Job Types and Environments

    • Parallel jobs and environments
      • Multi-threaded, MPI, etc.
      • Loose vs. tight integration
    • Array jobs
    • Interactive jobs

  • Diagnostics and Performance Tuning

    • Debugging and failure diagnosis
    • Tuning for high throughput
    • Data spooling and implications

  • Scheduler Policies and Features

    • Preemption
    • Scheduling policies
    • Resource reservations and backfilling
    • Introduction to advance reservations
    • Resource quota sets
    • Managing different types of workloads

  • Job Submission Verifiers and Job Classes

    • Managing job submission (JSV)
      • Client/Server side JSVs
    • Job classes - advanced cases

  • Core/Memory Binding, Cgroups and GPU Support

    • Core and memory binding comprehensive
    • Cgroups support
    • GPU support - RSMAP complex type

  • Using Docker with Univa Grid Engine

    • Concepts of integration
    • Submitting Docker jobs
    • Requesting Docker run options

  • Questions and Answers

Resources

Diamond Light Source Increase Efficiency with Advanced Grid Engine Training

Learn how Diamond Light Source staff benefited from Advanced Grid Engine training
