Does anyone know more about this? It sounds like distributing tasks to other processors that aren't really designed for the job. Articles are making it out to be a miracle, and I'm not sure whether to believe it.
The MSN article links to a press release, which links to the paper: https://dl.acm.org/doi/10.1145/3613424.3614285
The idea seems to be if your computer has several kinds of hardware accelerators, there is a systematic way to use them simultaneously. I only read the beginning but it’s hard to see a big breakthrough.
Yeah, they’re basically talking about a virtual machine (they call it a “framework” but it sounds more in line with the JVM or the .NET CLR) that can automatically offload work to GPUs, tensor cores and the like.
Programs would basically compile to a kind of bytecode that describes what calculations need to be done and the data dependencies between them, and then it’s up to the runtime to decide how to schedule that on available hardware, while compensating for differences, e.g. in supported precision for floating point math.
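To make that concrete, here is a toy sketch of the general idea, not the paper's actual algorithm or bytecode format. The op names, the device names and all the cost numbers are made up for illustration; a real runtime would have to measure or model them. It takes a dependency graph of operations and greedily places each one on whichever device would finish it earliest.

```python
from graphlib import TopologicalSorter

# Hypothetical op graph: each op maps to the ops whose results it needs.
DEPS = {
    "load": [],
    "fft": ["load"],
    "filter": ["fft"],
    "stats": ["load"],
    "merge": ["filter", "stats"],
}

# Invented cost estimates in ms, per op and per device that can run it.
# Ops with a single entry only have a CPU implementation in this toy setup.
COST = {
    "load":   {"cpu": 2.0},
    "fft":    {"cpu": 8.0, "gpu": 2.0},
    "filter": {"cpu": 6.0, "gpu": 1.5, "tpu": 1.0},
    "stats":  {"cpu": 3.0},
    "merge":  {"cpu": 1.0},
}

def schedule(deps, cost):
    """Greedy list scheduling: put each op on the device where it ends soonest."""
    device_free = {}  # device -> time at which it becomes idle
    finish = {}       # op -> time at which its result is available
    placement = {}    # op -> chosen device
    # static_order() yields every op after all of its dependencies.
    for op in TopologicalSorter(deps).static_order():
        # An op cannot start before all of its inputs are ready.
        ready = max((finish[d] for d in deps[op]), default=0.0)
        best_dev, best_end = None, float("inf")
        for dev, ms in cost[op].items():
            end = max(ready, device_free.get(dev, 0.0)) + ms
            if end < best_end:
                best_dev, best_end = dev, end
        device_free[best_dev] = best_end
        finish[op] = best_end
        placement[op] = best_dev
    return placement, max(finish.values())

if __name__ == "__main__":
    placement, makespan = schedule(DEPS, COST)
    print(placement, makespan)
```

With these invented numbers the whole graph finishes in 6 ms, versus 20 ms if everything ran back to back on the CPU, because "stats" runs on the CPU while "fft" is on the GPU. The sketch ignores data transfer costs and precision differences between devices, which is presumably where the hard part of a real system lies.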
They’re quoting like 4x speedups but their benchmarks are all things that already have efficient GPGPU implementations, mainly signal processing and computer vision, so the computation is already highly parallelized.
I could see this being useful for that kind of Big Data processing where you have a ton of stuff to churn through and want to get the most bang for your buck on cloud hardware. It helps that that kind of processing is already coded at a pretty high level, where you just lay out the operations you want done and the system handles the rest. This runtime would be a great target for that.
I don’t really see this revolutionizing computing on a grand scale, though. Most day-to-day workloads (consumer software and even most business applications like web servers) are not CPU bound anyway. I guess memory bound workloads could benefit from being offloaded to a GPU, but those aren’t all that common either.
Also, it’s not even a novel idea.
OpenMP has been around for over 20 years; it’s just a lot less “magical”.
macOS has been doing something similar for years with something Apple calls Grand Central Dispatch.