I have a cluster of 10 nodes, all running the same operating system (Ubuntu 10.4). I want to monitor the performance of each node, basically capturing CPU, memory, etc. at a given time. How can I capture this on each node and aggregate the results into a combined figure, for example the average CPU usage of the entire cluster?
Is there a command I can run to get these results?
Thanks in advance.
If the cluster runs a PBS/TORQUE batch system, you can use the output of the pbsnodes command to capture this information. Each node's entry includes a status line like this:
status = rectime=1319751989,varattr=,jobs=,state=free,netload=904408724,gres=,loadave=0.63,ncpus=6,physmem=8193856kb,availmem=14823060kb,totmem=16581436kb,idletime=362,nusers=1,nsessions=15,sessions=1788 1171 19146 19183 19197 19207 19217 19282 19329 19553 19617 20238 20292 20535 20601,uname=Linux napali 2.6.38-12-generic #51-Ubuntu SMP Wed Sep 28 14:27:32 UTC 2011 x86_64,opsys=linux
You can see there that it has the load average for the machine (loadave), as well as several pieces of information about its memory state (physmem, availmem, totmem). By writing a script that parses that output and performs the calculations you're looking for, you can solve your problem.
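As a rough sketch of such a script (not a definitive implementation), the Python below parses status lines in the format shown above and computes cluster-wide figures. The function names (`status_lines`, `parse_status`, `summarize`) are made up for this example. Note that loadave is a load average rather than a CPU percentage, so the sketch reports load per CPU as a stand-in for CPU usage, and it assumes availmem and totmem are always reported in kb as in the sample.

```python
import re


def status_lines(pbsnodes_output):
    """Return the 'status = ...' lines found in pbsnodes output text."""
    return [line.strip() for line in pbsnodes_output.splitlines()
            if line.strip().startswith("status =")]


def parse_status(line):
    """Turn one 'status = k=v,k=v,...' line into a dict of strings."""
    _, _, body = line.partition("status =")
    fields = {}
    for item in body.strip().split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip fragments that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def kb(value):
    """Convert a value such as '8193856kb' to an integer number of kB."""
    match = re.fullmatch(r"(\d+)kb", value)
    if match is None:
        raise ValueError("unexpected memory value: %r" % value)
    return int(match.group(1))


def summarize(lines):
    """Aggregate load and memory figures over all nodes' status lines."""
    nodes = [parse_status(line) for line in lines]
    total_load = sum(float(n["loadave"]) for n in nodes)
    total_cpus = sum(int(n["ncpus"]) for n in nodes)
    avail = sum(kb(n["availmem"]) for n in nodes)
    total = sum(kb(n["totmem"]) for n in nodes)
    return {
        "nodes": len(nodes),
        "mean_loadave": total_load / len(nodes),
        "load_per_cpu": total_load / total_cpus,
        "mem_used_fraction": 1 - avail / total,
    }
```

To use it, capture the text printed by `pbsnodes -a` (for example by redirecting it to a file), then call `summarize(status_lines(text))`. Nodes that are down may have no status line, so they are simply left out of the averages.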