Hello Mak,
Thank you for providing the additional information.
We have checked the procedure: the CPU may be used if the block currently being processed cannot be allocated in VRAM. In that case, however, the processing should run in multithreaded mode. Unfortunately, version 1.6.1 build 10009 does not contain the multithreaded implementation for these cases of CPU usage. If you would like to compare the behavior, I can send you a pre-release of the next version with the fix implemented.