
Multiple nVidia GPU transcoding (NOT computing) bottleneck

I'm doing some nVidia multi-GPU testing. However, these tests are not about compute workloads but about transcoding, using nvenc/nvdec.

I have a setup with 3 Quadro GPUs (max 4 on this motherboard), and I'm running some transcoding jobs using ffmpeg.

The thing is, up to 8 concurrent jobs everything is fine, with tolerable GPU and CPU stats. But when the 9th job starts, the nVidia cards' metrics begin to drop. What's troublesome is that the job distribution doesn't matter: sending 8 jobs to GPU0 and 1 job to GPU1 behaves the same as 4-5, or 4-4-1, or 0-4-5, etc.
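For reference, the kind of job pinning described above can be sketched as follows. This is a hypothetical launch script, not my exact command line: the input files, output names, and encoder options are placeholders, and it simply assigns job i to GPU (i mod 3).

```shell
#!/bin/sh
# Sketch: distribute transcode jobs round-robin across 3 GPUs.
# Filenames and codec options are placeholders.
NUM_GPUS=3
i=0
for src in in1.mp4 in2.mp4 in3.mp4 in4.mp4; do
  gpu=$((i % NUM_GPUS))                 # job i goes to GPU (i mod NUM_GPUS)
  ffmpeg -hwaccel cuda -hwaccel_device "$gpu" -i "$src" \
         -c:v h264_nvenc -gpu "$gpu" "out$i.mp4" &
  i=$((i + 1))
done
wait                                    # block until all background jobs finish
```

Varying the loop or the `gpu=` line reproduces any of the distributions above (8-1, 4-5, 4-4-1, ...), since the only GPU selection happens through `-hwaccel_device` for decode and `-gpu` for nvenc.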

What I see beginning with the 9th job is:

  1. CPU usage gets to about 60% (versus 30% with up to 8 jobs) and doesn't rise much as more jobs are added.
  2. The DECODING metric falls from about 75% (when a single card has 8 jobs) to about 20%.
  3. Every card behaves the same once the problem starts, no matter how many jobs each one has.

And the last strange thing I see: when the problem happens and I kill all the jobs, the cards keep working for a while (sometimes even minutes).

All this points to a bottleneck somewhere on the motherboard, maybe the PCIe bus, maybe some CPU subsystem; I'm not sure. It also suggests buffering is happening somewhere. I'm using the usual popular tools to watch high-level metrics and curves (top/htop, nvidia-smi, nvtop, etc.).
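To check the PCIe-bus theory specifically, a few commands beyond the high-level tools can help. The sketch below is a suggestion, not a confirmed diagnosis: `dmon` shows per-GPU decoder utilization and PCIe throughput per second, the sysfs paths show the negotiated link speed/width (the bus ID `0000:01:00.0` is a placeholder), and the arithmetic gives a rough ceiling to compare the observed throughput against.

```shell
#!/bin/sh
# Per-GPU utilization (incl. enc/dec) and PCIe rx/tx throughput, 1 s samples:
#   nvidia-smi dmon -s ut
# Negotiated PCIe link of one card (substitute your GPU's bus ID):
#   cat /sys/bus/pci/devices/0000:01:00.0/current_link_speed
#   cat /sys/bus/pci/devices/0000:01:00.0/current_link_width

# Rough per-direction PCIe bandwidth ceiling: Gen3 is ~985 MB/s per lane
# after 128b/130b encoding overhead. With 3-4 cards, boards often drop
# slots to x8 or x4, which shrinks this ceiling per card.
lanes=8
per_lane=985
echo "$((lanes * per_lane)) MB/s"   # x8 Gen3 ceiling per direction
```

If the summed rxpci/txpci figures from `dmon` flatten near that ceiling as the 9th job starts, the bus (or the shared uplink the slots hang off) is the likely culprit.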

My question: does anybody know of common bottlenecks in multi-GPU setups that could lead to a problem like this?

Any tip would be nice.

Thanks in advance.
