This course extends the system administrator's knowledge by covering in detail all aspects of the Grid Engine functionality and commands needed to deploy a cluster. It enables administrators to translate business goals into a Grid Engine configuration. The workshops provide practical experience with gathering site-defined shared resources such as licenses, configuring job submission and execution environments, setting up dynamic cluster configurations, and more. Students perform hands-on exercises and develop fundamental troubleshooting skills.
This advanced course is designed for system administrators and advanced end users who are responsible for extending the role of Grid Engine in site-defined cluster resource management and who need to implement job and resource controls.
This course is applicable to all versions of Grid Engine.
A basic knowledge of Linux/UNIX operating systems is required. This course assumes that you have taken Grid Engine Intermediate Training and/or are proficient in key Grid Engine concepts and configurations.
Course Outline - 3 Days
- Grid Engine concepts and components
Advanced Configurations - Level II
- Global configuration
- Host configuration
- Queue configuration (queue-wise subordination)
- Load sensors and resources
Job Types and Environments - Level II
- Parallel jobs and environments
- Multi-threaded, MPI, etc.
- Loose vs. tight integration
- Array jobs
- Interactive jobs
Scheduler Policies and Features - Level II
- Scheduling policies
- Resource reservations and backfilling
- Introduction to advance reservations
- Resource quota sets
- Managing different types of workloads
Job Submission Verifiers and Job Classes - Level II
- Managing job submission (JSV)
- Client-side and server-side JSVs
- Job classes - advanced cases
Core/Memory Binding, Cgroups and GPU Support
- Comprehensive core and memory binding
- Cgroups support
- GPU support - RSMAP complex type
Diagnostics and Performance Tuning
- Debugging and failure diagnosis
- Tuning for high throughput
- Data spooling and implications
Introduction to Univa Tools
- License Orchestrator
Questions and Answers