Hello edtriplett,
To find out whether this is a hardware-related issue, I can suggest several tests:
- physically remove all GPUs but one and check whether the processing works or a similar issue occurs; you can also try moving the GPU to another slot,
- if you have the option, after the first test try putting the GPUs (one by one) into another computer and check whether a similar issue is observed on the second machine.
It may also be worth running some CUDA-based stress tests on your configuration.
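
Before opening the case, a rough software-side version of the one-GPU-at-a-time check may help narrow things down. The sketch below is only an illustration and assumes the processing uses CUDA: it runs a command once per GPU with the CUDA_VISIBLE_DEVICES environment variable set, so that only that device is visible to the process. The function name run_on_each_gpu, the script name and the command you pass in are hypothetical placeholders, and this does not replace the physical tests, since power or slot problems will not show up this way.

```python
import os
import subprocess
import sys


def run_on_each_gpu(command, gpu_ids):
    """Run `command` once per GPU, exposing only that GPU to CUDA.

    Returns a dict mapping each GPU id to the exit code of the run.
    """
    results = {}
    for gpu_id in gpu_ids:
        env = os.environ.copy()
        # CUDA_VISIBLE_DEVICES limits which devices a CUDA application can see.
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        completed = subprocess.run(command, env=env)
        results[gpu_id] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical usage: python per_gpu_check.py <gpu_count> <command> [args...]
    gpu_count = int(sys.argv[1])
    command = sys.argv[2:]
    for gpu_id, code in run_on_each_gpu(command, range(gpu_count)).items():
        print(f"GPU {gpu_id}: exit code {code}")
```

If the command fails only when one particular GPU is exposed, that card is the first candidate for the slot swap and the second-machine test.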