When we declare a structure, its size usually does not match the sum of the sizes of its individual members. The size of a structure is at least the sum of its members' sizes, and often more. Why is that so?
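To make the question concrete, here is a minimal sketch in C. The structure `sample` and the helper names are hypothetical, chosen only for illustration, and the sizes in the comments are typical values rather than guarantees.

```c
#include <assert.h>
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical structure used only for illustration. */
struct sample {
    char  c;   /* 1 byte                          */
    int   i;   /* typically 4 bytes               */
    short s;   /* typically 2 bytes               */
};

/* Sum of the member sizes, ignoring any padding. */
size_t sample_member_sum(void)
{
    return sizeof(char) + sizeof(int) + sizeof(short);
}

/* Number of padding bytes the compiler added to the structure. */
size_t sample_padding(void)
{
    /* The structure can never be smaller than its members combined. */
    assert(sizeof(struct sample) >= sample_member_sum());
    return sizeof(struct sample) - sample_member_sum();
}
```

On a typical 32-bit or 64-bit compiler, `sample_padding()` returns a non-zero value, and `offsetof(struct sample, i)` shows where the first gap was inserted. The exact numbers depend on the compiler and target.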
I consulted many books for information, but almost none of them were helpful. In the end I was able to trace the reasons through web searches. Thanks to search engines, the Art of Assembly Programming book, and my colleagues.
Processor architectures have advanced and are tuned for performance. In the early days, memory was byte addressable. To maintain backward compatibility, later versions of many architectures also had to support byte addressing.
Here is the question...
A byte is 8 bits. Present-day processors have a 32-bit or 64-bit word length, which means they can process 4 or 8 bytes at a time. The integer is usually the most prominent data type on any processor, and it typically occupies one processing word (although on many 64-bit platforms a C int is still 4 bytes).
If memory were accessed strictly one byte at a time for the sake of backward compatibility, the processor would need to issue 4 read cycles to read one integer, which is time consuming. Instead, the hardware manufacturer arranges memory in banks that are one byte wide each. A 32-bit processor therefore needs 4 banks of memory and can read one byte from every bank in a single cycle. (This assumes that kind of hardware support; some processors do not have address pins for the two least significant address bits.)
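The bank arrangement can be sketched as simple address arithmetic. This is a simplified model, not a description of any particular chip: it assumes 4 byte-wide banks, and the names `BANK_COUNT`, `bank_of`, `row_of`, and `read_cycles` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_COUNT 4u  /* assumption: 32-bit data bus, one byte-wide bank per lane */

/* The masking below only works when the bank count is a power of two. */
static_assert((BANK_COUNT & (BANK_COUNT - 1u)) == 0,
              "bank count must be a power of two");

/* Bank that holds the byte at addr: the two least significant address bits. */
unsigned bank_of(uint32_t addr)
{
    return addr & (BANK_COUNT - 1u);
}

/* Row inside every bank: the remaining upper address bits. */
uint32_t row_of(uint32_t addr)
{
    return addr / BANK_COUNT;
}

/* Bus cycles needed to read a 4-byte value that starts at addr.
 * An aligned value sits in one row; a misaligned one straddles two rows. */
unsigned read_cycles(uint32_t addr)
{
    return bank_of(addr) == 0 ? 1u : 2u;
}
```

In this model an integer at address 0x1000 is read in one cycle, while the same integer placed at 0x1002 spans two rows and needs two. That cost is what alignment is meant to avoid.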
The fun starts here: how do we read other data types, such as char (one or two bytes, depending on the language), double (8 bytes on most platforms), vector types (GCC), and so on? To handle this, processors provide byte read instructions (e.g. LDRB on ARM). These instructions identify the bank in which the byte is stored and automatically shift the data to the least significant byte position during the fetch. This extra work has a performance cost.
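What a byte read does can be emulated in software. The sketch below is not how LDRB is implemented in hardware; it only mirrors the idea of selecting a bank and shifting its byte down. It assumes bank 0 drives bits 0..7 of the word (little-endian lane order), and `load_byte` and `rows` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates a byte load on a word-organised memory.
 * rows[] models the memory, one 32-bit row per element. */
uint8_t load_byte(const uint32_t *rows, size_t row_count, uint32_t byte_addr)
{
    uint32_t row  = byte_addr >> 2;   /* which 32-bit row to fetch     */
    unsigned bank = byte_addr & 3u;   /* which byte lane in that row   */

    assert(row < row_count);          /* stay inside the modelled memory */

    /* Shift the selected lane down to the least significant byte. */
    return (uint8_t)(rows[row] >> (bank * 8u));
}
```

With `rows` holding `0x44332211` and `0x88776655`, `load_byte(rows, 2, 3)` selects bank 3 of row 0 and yields `0x44`.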
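To get better performance characteristics, some architectures do not allow misaligned accesses at all. The processor raises a platform exception, which the system programmer must handle; otherwise the results are unreliable. This is why compilers insert padding between structure members: each member is placed at an address that is a multiple of its alignment, which makes the structure larger than the sum of its members.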
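One common way to stay clear of misaligned accesses in portable C is to check the address first, or to copy the bytes with `memcpy` instead of dereferencing a cast pointer. The following is a minimal sketch under that assumption; the function names are hypothetical, and `_Alignof` requires a C11 compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero when p is suitably aligned for a 32-bit access. */
int is_aligned32(const void *p)
{
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}

/* Direct 32-bit read; only valid when p is aligned and points at a uint32_t. */
uint32_t read_u32_aligned(const void *p)
{
    assert(is_aligned32(p));
    return *(const uint32_t *)p;
}

/* Reads 32 bits from any address by copying bytes, so no misaligned
 * word access is ever issued by this code itself. */
uint32_t read_u32_any(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

`read_u32_any` is the safer choice when data comes from a packed buffer such as a network packet or a file. Compilers usually turn the `memcpy` into a single load on targets that tolerate misalignment.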
I have published a much better article on GeeksforGeeks.