
Exercise 5.8

Derive the cost of your processor time by dividing its price by its expected productive lifetime. Find, calculate, or measure the time required to access a byte at each level of your system's memory hierarchy, and calculate the monetary cost of each access. Also calculate the cost of each byte in the memory hierarchy (you can approximate the cost of a processor's internal caches by assuming a fixed price for each of the processor's transistors, and working from the published transistor counts for the processor). List your results in a table, and discuss whether the prices you found are fair, or whether a given memory technology is over- or undervalued.

Deviating somewhat from the letter of the exercise, we can still get a feeling for how different memory types compare in practice. I've calculated some numbers for a fairly typical configuration, based on some currently best-selling mid-range components: an AMD Athlon XP 3000+ processor, a 256 MB PC2700 DDR memory module, and a 250 GB 7200 RPM Maxtor hard drive. The results appear in the table below. I obtained the component prices from TigerDirect.com on January 19, 2006. I calculated the cost of the cache memory by multiplying the processor's price by the fraction of the processor die occupied by the corresponding cache, that is, the cache's area divided by the total die area (I measured the sizes on a die photograph). The worst case latency column lists the time it would take to fetch a byte under the worst possible scenario: for example, a single byte from the same bank and following a write for the DDR RAM; a maximum seek, rotational latency, and controller overhead for the hard drive. On the other hand, the sustained throughput column lists numbers where the devices operate close to ideal conditions for pumping out bytes as fast as possible: 8 bytes delivered at double the bus speed for the DDR RAM; the maximum sustained outer diameter data rate for the hard drive. In all cases, the sustained bandwidth exceeds the bandwidth implied by the worst case latency by at least one order of magnitude, and it is this difference that allows our machines to deliver the performance we expect. In particular, the ratio is 27 for the level 1 cache, 56 for the level 2 cache, 76 for the DDR RAM, and 1.8 million for the hard drive. Note that as we move away from the processor there are more tricks we can play to increase the bandwidth, and we can also tolerate more of the factors that increase the latency.
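The ratios quoted above can be roughly re-derived from the latency and throughput columns of the table. The following is a minimal sketch, not the original calculation: it assumes that one byte is fetched per worst case latency period and that MB in the throughput column means 2^20 bytes (an assumption that appears to match the published ratios best). The names `devices`, `bandwidth_ratio`, and `ratios` are introduced here purely for illustration.

```python
# Assumption: MB in the table's throughput column means 2**20 bytes.
MB = 2**20

# name: (worst case latency in seconds per byte, sustained throughput in MB/s),
# both copied from the table.
devices = {
    "L1 D cache": (1.4e-9, 19022),
    "L2 cache": (9.7e-9, 5519),
    "DDR RAM": (28.5e-9, 2541),
    "Hard drive": (25.6e-3, 67),
}


def bandwidth_ratio(latency_s, throughput_mb_s):
    """Sustained bandwidth divided by the bandwidth implied by the latency."""
    # Bytes per second if every single byte pays the full worst case latency.
    latency_bandwidth = 1.0 / latency_s
    return throughput_mb_s * MB / latency_bandwidth


ratios = {name: bandwidth_ratio(*figures) for name, figures in devices.items()}
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:12.0f}")
```

Because the table's figures are rounded, the results only approximate the numbers in the text; the level 1 cache ratio, for instance, comes out slightly higher than the quoted 27.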

The byte cost of the different kinds of memory varies by roughly three orders of magnitude from one level to the next: with one dollar we can buy KBs of cache memory, MBs of DDR RAM, and GBs of disk space. However, as one would expect, cheaper memory has a higher latency and a lower throughput. Things get more interesting when we examine the productivity of the various memory types. Productivity is typically measured as output per unit of input; in our case I calculated it as the number of bytes read per second divided by the dollar cost of one byte. As you can see, if we look at the best case scenarios (the device operating at its maximum bandwidth), the hard drive's bytes are the most productive. In the worst case (latency-based) scenarios the productivity of the disk is abysmal, and this is why disks are nowadays furnished with abundant amounts of cache memory (8 MB in our case). The most productive device in the worst case latency-based measurements is the DDR RAM. These results are what we would expect from an engineering point of view: the hard disk, which is the workhorse used for storing large amounts of data at the minimum cost, should offer the best overall productivity under ideal (best case) conditions, while the DDR RAM, which is used for satisfying a system's general purpose storage requirements, should offer the best overall productivity even under worst case conditions. Also note the low productivity of the level 1 and level 2 caches. This factor easily explains why processor caches are relatively small: they work admirably well, but they are expensive for the work they do.

Component	Nominal size	Worst case latency	Sustained throughput (MB/s)	$1 buys	Productivity, worst case (bytes read / s / $)	Productivity, best case (bytes read / s / $)
L1 D cache	64 KB	1.4ns	19022	10.7 KB	7.91·10¹²	2.19·10¹⁴
L2 cache	512 KB	9.7ns	5519	12.8 KB	1.35·10¹²	7.61·10¹³
DDR RAM	256 MB	28.5ns	2541	9.48 MB	3.48·10¹⁴	2.65·10¹⁶
Hard drive	250 GB	25.6ms	67	2.91 GB	1.22·10¹¹	2.17·10¹⁷

Performance and cost of various memory types.
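The productivity columns can be approximately reproduced from the other columns of the table. This is a sketch under stated assumptions rather than the original spreadsheet: it takes productivity to be the read rate in bytes per second divided by the dollar cost of one byte, and it treats KB, MB, and GB as powers of two. The names `devices`, `productivity`, and `table` are hypothetical helpers introduced for this example.

```python
# Assumption: KB, MB, and GB in the table are binary units.
KB, MB, GB = 2**10, 2**20, 2**30

# name: (worst case latency in s per byte, sustained MB/s, bytes bought by $1),
# all copied from the table.
devices = {
    "L1 D cache": (1.4e-9, 19022, 10.7 * KB),
    "L2 cache": (9.7e-9, 5519, 12.8 * KB),
    "DDR RAM": (28.5e-9, 2541, 9.48 * MB),
    "Hard drive": (25.6e-3, 67, 2.91 * GB),
}


def productivity(bytes_per_second, bytes_per_dollar):
    """Bytes read per second divided by the dollar cost of one byte."""
    dollars_per_byte = 1.0 / bytes_per_dollar
    return bytes_per_second / dollars_per_byte


table = {}
for name, (latency_s, throughput_mb_s, bytes_per_dollar) in devices.items():
    # Worst case: one byte per latency period; best case: sustained throughput.
    worst = productivity(1.0 / latency_s, bytes_per_dollar)
    best = productivity(throughput_mb_s * MB, bytes_per_dollar)
    table[name] = (worst, best)
    print(f"{name:12s} {worst:10.2e} {best:10.2e}")
```

The computed values agree with the published ones to within a few percent; the remaining differences are presumably due to rounding in the table's input columns.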


Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.
Last modified: 2006-01-04