[Java8源碼]HashMap的key的哈希值為什么是(h = key.hashCode()) ^ (h >>> 16)
源碼如下:
為什么不是直接使用key.hashCode(),而是(h = key.hashCode()) ^ (h >>> 16)?
官方的解釋:Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide.?So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.
翻譯過來:table的容量都是2的N次方,如果有一組哈希值總是在當(dāng)前table容量之上的bit位變化,則總是會發(fā)生hash碰撞。在同時權(quán)衡速度、效用和性能的情況下,將高位和低位進(jìn)行異或讓高位的變化能影響到table索引的計算
舉例說明:
比如:table容量是2的4次方,即16。那么總是在table容量之上的bit位變化,就應(yīng)該滿足公司16 * N + X ,其中N變,X不變

如上表發(fā)生了hash碰撞,那么(h = key.hashCode()) ^ (h >>> 16)就能解決問題嗎?

從上表可看出結(jié)果沒有變化,原因是什么?
首先,我們需要了解h>>>16是將哈希值進(jìn)行無符號右移16位
比如 h(二進(jìn)制表示)= 0101 1010 1111 0000 1111 1011 0101 1001?
h >>> 16之后的值:0000 0000 0000 0000?0101 1010 1111 0000
其次,我們從異或的計算規(guī)則
0 ^ 0 = 0, 1 ^ 0 = 1, 0 ^ 1 = 1, 1 ^ 1 - 0
可以得出
X ^ 0 = X
進(jìn)而知道,如果h(二進(jìn)制表示)的高16位都是0,那么h>>>16之后的值還是h,即只有高16位不全是0,才會有效果
個人總結(jié):(h = key.hashCode()) ^ (h >>> 16) 只能在盡量少的影響運(yùn)算速度下,一小部分情況下減少hash碰撞,同時也有可能在另一小部分情況下增加hash碰撞。直觀感覺上,有點(diǎn)雞肋。那么如果官方是基于現(xiàn)實(shí)的大量數(shù)據(jù)測試結(jié)果之后,選擇這個方案,就另當(dāng)別論了