[Lustre-discuss] How to detect process owner on client
John Hammond
jhammond at tacc.utexas.edu
Tue Feb 15 14:17:37 PST 2011
On 02/10/2011 09:16 PM, Satoshi Isono wrote:
> Dear members,
>
> I am looking for a way to detect the userid or jobid on a Lustre client. Assume the following conditions:
>
> 1) Users run jobs through a scheduler such as PBS Pro, LSF, or SGE.
> 2) One user's processes dominate Lustre I/O.
> 3) The Lustre servers (MDS? OSS?) can detect the high I/O load on each server.
> 4) But the Lustre servers cannot map the heavy I/O to a jobid/userid, because the servers have no userid information.
> 5) I would like Lustre to monitor this and make the mapping.
> 6) If (5) is possible, we can write a script that launches a scheduler command such as qdel.
> 7) The heavy user's job will then be killed by the job scheduler.
>
> I want (5) as a Lustre capability, but I guess the current Lustre 1.8 cannot do (5). Alternatively, to map Lustre processes to a userid/jobid, is there a way to use something like rpctrace or nid stats? Could you please give me your advice or comments?
I've written a utility called lltop which gathers I/O statistics from
Lustre servers, along with job assignment data from cluster batch
schedulers, to give a job-by-job accounting of filesystem load. Here's
its output with names changed to protect the innocent:
$ sudo tacc_lltop work
JOBID    WR_MB  RD_MB  REQS  OWNER  WORKDIR
1823815   2101      0  4176  al     /work/000/al/job1
1823060    774      0  1570  bob    /work/000/bob/fftw
1823634    323      3  3244  chas   /work/000/chas/boltzeq
1823768    289      0  5108  deb    /work/000/deb/mesh-08
1823085     55      0   110  ed     /work/000/ed/jumble
login3      18      3  2961
We use it on several systems, only with SGE so far, but it's hookable to
other schedulers.
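
To make the general idea concrete, here is a rough Python sketch of the accounting step: per-client I/O counters are summed into per-job totals using a host-to-job mapping. This is not lltop's actual code. The function name, the host names (c101, c102, c201), and the shape of both input dictionaries are assumptions made up for illustration. Sorting by written MB is also just a choice that happens to match the sample output above.

```python
from collections import defaultdict

def account_by_job(client_stats, host_to_job):
    """Sum per-client I/O counters into per-job totals.

    client_stats: dict of host -> (wr_mb, rd_mb, reqs); hypothetical input,
                  standing in for statistics gathered from the servers.
    host_to_job:  dict of host -> job id; hypothetical input, standing in
                  for job assignment data from the batch scheduler.
    Hosts with no job assignment (e.g. login nodes) are reported under
    their own host name.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for host, (wr_mb, rd_mb, reqs) in client_stats.items():
        key = host_to_job.get(host, host)
        totals[key][0] += wr_mb
        totals[key][1] += rd_mb
        totals[key][2] += reqs
    # Heaviest writers first (an assumed ordering, chosen for this sketch).
    return sorted(((k, *v) for k, v in totals.items()),
                  key=lambda row: row[1], reverse=True)

# Made-up example data: two compute nodes share one job, one login node has none.
client_stats = {
    "c101": (1000, 0, 2000),
    "c102": (1101, 0, 2176),
    "c201": (774, 0, 1570),
    "login3": (18, 3, 2961),
}
host_to_job = {"c101": "1823815", "c102": "1823815", "c201": "1823060"}

rows = account_by_job(client_stats, host_to_job)
for jobid, wr_mb, rd_mb, reqs in rows:
    print(jobid, wr_mb, rd_mb, reqs)
```

Any scheduler that can report which hosts belong to which job could supply the second dictionary.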
See https://github.com/jhammond/lltop for source and documentation.
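
For step (6) of the original question, a script could read a report like the one above and pick out jobs over some write threshold. The Python sketch below assumes the report is whitespace-separated exactly as in the sample. The function name and the 500 MB limit are hypothetical, and it only prints candidate qdel commands rather than running them.

```python
SAMPLE = """\
JOBID WR_MB RD_MB REQS OWNER WORKDIR
1823815 2101 0 4176 al /work/000/al/job1
1823060 774 0 1570 bob /work/000/bob/fftw
1823634 323 3 3244 chas /work/000/chas/boltzeq
1823768 289 0 5108 deb /work/000/deb/mesh-08
1823085 55 0 110 ed /work/000/ed/jumble
login3 18 3 2961
"""

def heavy_jobs(report, wr_mb_limit):
    """Return (jobid, owner) pairs for batch jobs whose WR_MB exceeds the limit."""
    found = []
    for line in report.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            # Rows without OWNER and WORKDIR (e.g. login nodes) are not
            # batch jobs, so there is nothing for the scheduler to kill.
            continue
        jobid, wr_mb, owner = fields[0], int(fields[1]), fields[4]
        if wr_mb > wr_mb_limit:
            found.append((jobid, owner))
    return found

# Print candidates only; a real script would need site policy before killing jobs.
for jobid, owner in heavy_jobs(SAMPLE, 500):
    print(f"candidate: qdel {jobid}  (owner {owner})")
```

With the sample data and a 500 MB limit this selects jobs 1823815 and 1823060.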
Best,
John
--
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
jhammond at tacc.utexas.edu