Today, while reading some materials about parallel programming, partitioning datasets and so on, some old echoes from the past came to mind.
In most of the simple samples I've seen lately, a bunch of data gets loaded into memory, and then the in-memory dataset is partitioned and each thread reads data from one of the partitions. We do all the loading into memory before launching the threads because otherwise different threads would be accessing the disk simultaneously, and the I/O bottleneck would keep threads stuck while another thread's I/O operation completed.
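Just to make the pattern concrete, here's a minimal sketch in Java (the class name and the synthetic data are mine, purely for illustration; the samples I was reading may well have used something else): everything is materialised in memory first, the dataset is split into one slice per core, and each worker thread only ever reads from its own slice.

    import java.util.ArrayList;
    import java.util.List;

    public class PartitionDemo {
        public static void main(String[] args) throws InterruptedException {
            // Do all the "I/O" (faked here with synthetic data) up front,
            // so the worker threads never touch the disk.
            List<Integer> data = new ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) data.add(i);

            int numThreads = Runtime.getRuntime().availableProcessors();
            int chunkSize = (data.size() + numThreads - 1) / numThreads;
            List<Thread> workers = new ArrayList<>();

            for (int t = 0; t < numThreads; t++) {
                int from = Math.min(t * chunkSize, data.size());
                int to = Math.min(from + chunkSize, data.size());
                List<Integer> partition = data.subList(from, to);
                Thread worker = new Thread(() -> {
                    long sum = 0;
                    // Each thread reads only its own partition; no disk, no sharing.
                    for (int value : partition) sum += value;
                    System.out.println("partial sum: " + sum);
                });
                workers.add(worker);
                worker.start();
            }
            for (Thread worker : workers) worker.join();
        }
    }

Since the partitions don't overlap, no locking is needed; the threads only coordinate at the final join.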
OK, I've said bottleneck. There's one that's been well known for a long time: the memory bottleneck. Years ago, as CPUs got faster and faster, memory access technology was unable to keep up, and we ended up with fast CPUs on hold waiting for data to be retrieved from memory... We have caches and branch prediction to soften the problem, but anyway, it's still there.
For some years now CPU speed has barely increased, but we have several cores, and more and more cores to come... This means that it's no longer just one operation waiting for memory to provide the data it needs; it's multiple parallel operations waiting for the same memory to provide that data through the same data bus we had years ago.
This sounds messy, so I thought surely there was some workaround in place for this that I was missing (it's been a long while since the days when I was a hardware freak following the latest improvements in chipsets and buses...), but after some googling it doesn't seem so, and in fact we can say that multicore makes the memory bottleneck worse.
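If you want to see it for yourself, a quick-and-dirty experiment along these lines (hypothetical code of mine, not from any of the materials I was reading, and the numbers will vary wildly with your hardware) is to have each thread stream over its own big private array, so there are no locks and no shared data, and watch whether the total time stays flat as threads are added:

    import java.util.ArrayList;
    import java.util.List;

    public class MemoryBusDemo {
        // 16M ints (64 MB) per thread: far too big for any cache,
        // so every pass is served from main memory.
        static final int SIZE = 16 * 1024 * 1024;

        public static void main(String[] args) throws InterruptedException {
            int cores = Runtime.getRuntime().availableProcessors();
            for (int n = 1; n <= cores; n *= 2) {
                run(n);
            }
        }

        static void run(int numThreads) throws InterruptedException {
            List<Thread> workers = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                int[] data = new int[SIZE]; // private array: the only shared resource is the memory bus
                Thread w = new Thread(() -> {
                    long sum = 0;
                    for (int v : data) sum += v;
                    if (sum == 42) System.out.println(); // keep the JIT from removing the loop
                });
                workers.add(w);
            }
            long start = System.nanoTime();
            for (Thread w : workers) w.start();
            for (Thread w : workers) w.join();
            System.out.println(numThreads + " thread(s): "
                    + (System.nanoTime() - start) / 1_000_000 + " ms");
        }
    }

If the cores were limited only by their own speed, every line should take roughly the same time; however much the time grows instead is the memory bottleneck showing up.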
Wednesday, 22 September 2010