How to check CPU, GPU, and memory on those specific Slurm nodes?

sinfo --state=idle

sinfo -N -o "%.15N %.6D %.9P %.11T %.4c %.8z %.6m %.8d %.6w %.8f %.20G" -n g[009,015-017,019,025]

scontrol show node g[009,015-017,019,025] | grep -E "NodeName|CPUTot|RealMemory|Gres"

Here are several ways to check CPU, GPU, and memory on those specific Slurm nodes:

1. Using sinfo (Quick Overview)

sinfo -N -l -n g[009,015-017,019,025]

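This prints one line per node (once per partition the node belongs to) with its state, CPU count, sockets:cores:threads layout, memory in MB, and temporary disk space.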

2. Using scontrol show node (Detailed Information)

scontrol show node g[009,015-017,019,025]

This provides comprehensive details including:

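  • CPUs (total and allocated: CPUTot, CPUAlloc)
  • Memory in MB (total and allocated: RealMemory, AllocMem)
  • GPU information (the Gres field, if configured)
  • Node state and features (State, AvailableFeatures, ActiveFeatures)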

3. For GPU-specific Information

sinfo -N -o "%.15N %.6D %.9P %.11T %.4c %.8z %.6m %.8d %.6w %.8f %.20G" -n g[009,015-017,019,025]

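The %G field prints each node's generic resources (GRES), which is where GPUs are listed if any are configured; the .20 prefix right-justifies the column at a width of 20 characters.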
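The GRES column is a compact string such as gpu:a100:4, so counting GPUs per node takes a little parsing. The following is a minimal sketch, assuming input lines of the form "node gres" as produced by sinfo -h -N -o "%N %G"; the function name gpu_count_per_node is made up for illustration, and GRES strings can look different depending on the Slurm version and site configuration.

```shell
# Reads "node gres" lines on stdin and prints "node gpu_count".
# Assumed GRES shapes: gpu:4, gpu:a100:4, gpu:a100:4(S:0-1), (null),
# or several such entries separated by commas.
gpu_count_per_node() {
  awk '
    {
      gres = $2
      gsub(/\([^)]*\)/, "", gres)        # drop suffixes such as (S:0-1)
      n = split(gres, items, ",")
      total = 0
      for (i = 1; i <= n; i++) {
        m = split(items[i], parts, ":")
        # the last colon-separated part of a gpu entry is its count
        if (parts[1] == "gpu" && m >= 2) total += parts[m]
      }
      print $1, total
    }'
}

# Usage on a cluster (not run here):
#   sinfo -h -N -o "%N %G" -n g[009,015-017,019,025] | gpu_count_per_node
```

Nodes without GPUs report a count of 0 rather than being dropped, which makes missing GRES configuration easy to spot.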

4. Running Commands Directly on Nodes

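To check real-time usage, run a short job step on the nodes themselves. This only works if the nodes are available to you, since srun has to obtain an allocation on them first: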

CPU and Memory:

srun -w g[009,015-017,019,025] -N 6 --ntasks-per-node=1 bash -c "hostname; lscpu | grep -E 'CPU\(s\)|Model name'; free -h"

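GPU (if available; on clusters that restrict device access, nvidia-smi may see no GPUs unless the step also requests them with --gres):

srun -w g[009,015-017,019,025] -N 6 --ntasks-per-node=1 bash -c "hostname; nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv"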
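When one step spans several nodes, the output of the tasks can interleave. One way around that is to query the nodes one at a time. This is a rough sketch, assuming the nodes are schedulable for your user; report_nodes is a hypothetical helper name, while scontrol show hostnames and the srun options used are standard.

```shell
# Expands a Slurm hostlist and runs one short single-task step per node,
# printing a header before each node's report.
report_nodes() {
  for node in $(scontrol show hostnames "$1"); do
    echo "== $node =="
    srun -w "$node" -N1 -n1 bash -c 'lscpu | grep -E "^CPU\(s\)|Model name"; free -h'
  done
}

# Usage on a cluster (not run here):
#   report_nodes 'g[009,015-017,019,025]'
```

Each srun call waits until its node can be allocated, so a busy node will stall the loop at that point.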

5. Custom Format for Specific Info

scontrol show node g[009,015-017,019,025] | grep -E "NodeName|CPUTot|RealMemory|Gres"
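Because grep prints whole lines, each match drags along the other fields that share its line. For one row per node, the one-line output mode of scontrol (-o) is easier to parse. A minimal sketch, assuming the Key=Value layout of scontrol show node -o; node_hw_table is a made-up name.

```shell
# Reads `scontrol show node -o` output (one node per line) on stdin and
# prints a tab-separated row per node: name, total CPUs, memory in MB, GRES.
node_hw_table() {
  awk '
    {
      name = cpus = mem = gres = "-"
      for (i = 1; i <= NF; i++) {
        eq = index($i, "=")
        if (eq == 0) continue            # skip tokens that are not Key=Value
        key = substr($i, 1, eq - 1)
        val = substr($i, eq + 1)
        if (key == "NodeName")        name = val
        else if (key == "CPUTot")     cpus = val
        else if (key == "RealMemory") mem  = val
        else if (key == "Gres")       gres = val
      }
      if (name != "-") printf "%s\t%s\t%s\t%s\n", name, cpus, mem, gres
    }'
}

# Usage on a cluster (not run here):
#   scontrol show node -o g[009,015-017,019,025] | node_hw_table
```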

6. Check Current Usage

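sstat reports on running job steps, not on nodes, and its -n flag only suppresses the header. First find the jobs running on those nodes, then query a job by its ID (<jobid> is a placeholder):

squeue -w g[009,015-017,019,025]

sstat -j <jobid> --format=JobID,NodeList,AveCPU,AveRSS,MaxRSS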
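For a node-level view of usage, the allocation and load fields of scontrol show node can be summarized directly. This is a sketch under the assumption that the CPUAlloc, CPUTot, CPULoad, AllocMem, and RealMemory fields are present in the one-line (-o) output; node_usage is a hypothetical name.

```shell
# Reads `scontrol show node -o` output on stdin and prints, per node,
# allocated vs. total CPUs, the reported load, and allocated vs. total memory.
node_usage() {
  awk '
    {
      split("", f)                       # reset the per-node field map
      for (i = 1; i <= NF; i++) {
        eq = index($i, "=")
        if (eq > 0) f[substr($i, 1, eq - 1)] = substr($i, eq + 1)
      }
      # skip lines that lack the totals needed for the percentages
      if (!("NodeName" in f) || f["CPUTot"] + 0 == 0 || f["RealMemory"] + 0 == 0) next
      printf "%s cpu_alloc=%d/%d (%.0f%%) load=%s mem_alloc=%d/%dMB (%.0f%%)\n",
        f["NodeName"], f["CPUAlloc"], f["CPUTot"], 100 * f["CPUAlloc"] / f["CPUTot"],
        f["CPULoad"], f["AllocMem"], f["RealMemory"], 100 * f["AllocMem"] / f["RealMemory"]
    }'
}

# Usage on a cluster (not run here):
#   scontrol show node -o g[009,015-017,019,025] | node_usage
```

The percentages describe what the scheduler has allocated, which can differ from what jobs actually consume; CPULoad is the closer indicator of real activity.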

The scontrol show node command is usually the most comprehensive for getting all the hardware specifications you need in one go.
