I just started studying DHT theory and implementation, and I'm stuck on one part: how is a node's ID generated when the node starts up and connects to the network? I read that the ID is a random hash from some range of hashes, but is it a unique hash? And is the hash generated close to the data that this node stores? Please help me with this.
Self-generation of the node ID using a good hash function over a large space of values is a common technique in DHT/P2P systems. Since a good hash function spreads its outputs close to uniformly over that space, the probability of a collision is very small. Statistically, the ID will (almost always) be unique.
The hash is independent of the data stored on the node.
import hashlib
import os

def newID():
    # Take 20 random bytes from the OS and hash them with SHA-1,
    # which yields a 160-bit (20-byte) node ID.
    seed = os.urandom(20)
    return hashlib.sha1(seed).digest()
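To put a rough number on "almost always unique", the standard birthday approximation can be used. This is only a sketch: the function name `collision_probability` and the default of 160 bits (the SHA-1 output size) are assumptions made for illustration.

```python
import math

def collision_probability(num_nodes, id_bits=160):
    """Approximate chance that at least two of num_nodes random IDs collide."""
    space = 2 ** id_bits
    # Birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-num_nodes * (num_nodes - 1) / (2 * space))

# Example: a network of one million nodes with 160-bit IDs.
p = collision_probability(10**6)
```

Even for a million nodes, the estimate stays far below any practical threshold, which is why nodes can pick their IDs without coordinating with each other.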
As said in the previous answers, the ID of a node is generated by hashing its IP address (generally speaking; this is the case in a DHT like Chord) or other uniquely identifying information.
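A minimal sketch of deriving an ID from the node's address could look like this. The helper name `node_id_from_address` and the choice to hash the string "ip:port" are assumptions for illustration; a real DHT may hash the IP alone or some other identifier.

```python
import hashlib

def node_id_from_address(ip, port):
    """Derive a 160-bit integer node ID by hashing an 'ip:port' string."""
    # Assumption: the address string is the node's identifying information.
    identity = f"{ip}:{port}".encode("utf-8")
    digest = hashlib.sha1(identity).digest()
    # Interpret the 20-byte digest as one big integer on the ID space.
    return int.from_bytes(digest, "big")

# The same address always maps to the same ID; a different port gives a different ID.
a = node_id_from_address("192.0.2.10", 4000)
b = node_id_from_address("192.0.2.10", 4001)
```

Unlike a random ID, this one is reproducible: the node gets the same position in the ID space every time it rejoins from the same address.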
And since it uses Consistent Hashing, when a node joins or leaves a network of n nodes, only about 1/n of the keys need to be remapped on average. This makes it well suited to highly dynamic network topologies, such as peer-to-peer.
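The remapping behaviour can be checked with a small experiment. The sketch below assumes a simple ring where each key belongs to the first node ID at or after the key's hash (wrapping around); the names `h`, `build_ring`, `owner`, and `moved_fraction` are hypothetical helpers, and no virtual nodes are used, so the measured fraction can deviate noticeably from 1/n.

```python
import bisect
import hashlib

def h(value):
    """Hash a string onto the 160-bit ring."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")

def build_ring(names):
    """Return the nodes as a list of (node_id, name), sorted by ID."""
    return sorted((h(name), name) for name in names)

def owner(ring, key):
    """Return the name of the first node clockwise from the key's hash."""
    ids = [node_id for node_id, _ in ring]
    # bisect_left finds the first node ID >= key hash; modulo wraps past the end.
    index = bisect.bisect_left(ids, h(key)) % len(ring)
    return ring[index][1]

def moved_fraction(n_nodes=10, n_keys=10000):
    """Fraction of keys whose owner changes when one extra node joins."""
    names = [f"node-{i}" for i in range(n_nodes)]
    keys = [f"key-{i}" for i in range(n_keys)]
    before = build_ring(names)
    after = build_ring(names + ["node-new"])
    moved = sum(owner(before, k) != owner(after, k) for k in keys)
    return moved / n_keys
```

Every key that moves ends up on the newly joined node; keys owned by the other nodes stay where they were, which is the property that keeps churn cheap.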
Technically, the hash generated for a node doesn't convey any information about the data stored on that node. Rather, the hash for a given key (or entry in a data store, if used for that purpose) comes from hashing the keyword (or the filename, or the file contents).
As a direct consequence of Consistent Hashing, an abstract notion of distance between keys emerges. A node owns all the keys to which its own identifier (ID) is the closest according to the distance metric.
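As an illustration of "closest according to the distance metric", here is a sketch that assumes an XOR metric (the one Kademlia uses); Chord instead uses clockwise distance on a ring. The names `h`, `xor_distance`, and `responsible_node` are hypothetical.

```python
import hashlib

def h(value):
    """Hash a string (a node address or a data key) into the 160-bit ID space."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")

def xor_distance(a, b):
    """Assumed metric: bitwise XOR of two IDs, read as an integer."""
    return a ^ b

def responsible_node(node_ids, key):
    """Return the node ID closest to the hash of the key."""
    key_id = h(key)
    return min(node_ids, key=lambda node_id: xor_distance(node_id, key_id))

# Node IDs and data keys live in the same space, so they can be compared directly.
nodes = [h("10.0.0.1:4000"), h("10.0.0.2:4000"), h("10.0.0.3:4000")]
target = responsible_node(nodes, "some-file.txt")
```

Note that the node's ID is computed without looking at any data; it is the key's hash that determines which node the data lands on.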