Resource description: Playing with data structures for managing link graphs
Someone gave me the problem of working out an efficient data structure for modelling links from and to URLs. I had a number of 'gut feeling' ideas for how to model this, but wanted to write some actual code to test my thoughts.

Problem:
- set of nodes labeled with URLs
- nodes connected by (directed) edges (url1 - linksto - url2)
- how should this be modelled to enable efficient querying about who links to who...

My build plan:
* Write some code to grab example data from the web
* Create a naive implementation based on hashmaps
* Implement a tree based approach (possibly using a trie to exploit the fact I have string keys that have a hierarchical nature)
* Write some performance tests

Good implementation as a trie

To write:
1) wget scripts to harvest some data
2) python scripts to build up a trie
3) query functions

'Clever' implementation using a trie and surrogate keys

Performance comparisons
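To make the plan a little more concrete, here is a rough Python sketch of two of the ideas: a naive hashmap-based link graph, and a trie that hands out integer surrogate keys for URLs. None of this comes from the actual source package. The names `LinkGraph`, `UrlTrie`, `add_link`, `links_from`, `links_to` and `key_for` are made up for illustration, as are the example URLs. Splitting URLs on "/" is an assumed way of exploiting their hierarchical nature, and it ignores trailing slashes.

```python
from collections import defaultdict


class LinkGraph:
    """Naive directed link graph backed by two dicts of sets (hypothetical)."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # node -> nodes it links to
        self.incoming = defaultdict(set)  # node -> nodes that link to it

    def add_link(self, source, target):
        # Store the edge in both directions so either query is one lookup.
        self.outgoing[source].add(target)
        self.incoming[target].add(source)

    def links_from(self, node):
        # .get avoids creating an empty entry for unknown nodes.
        return set(self.outgoing.get(node, ()))

    def links_to(self, node):
        return set(self.incoming.get(node, ()))


class UrlTrie:
    """Trie over URL segments that assigns integer surrogate keys (hypothetical)."""

    def __init__(self):
        self.root = {}
        self.next_id = 0

    @staticmethod
    def segments(url):
        # Assumed split: break on "/" and drop empty parts, so
        # "http://a.example/x" becomes ["http:", "a.example", "x"].
        return [part for part in url.split("/") if part]

    def key_for(self, url):
        # Walk (and create) one trie level per segment.
        node = self.root
        for part in self.segments(url):
            node = node.setdefault(part, {})
        # The None entry marks "a URL ends here" and holds its surrogate key.
        if None not in node:
            node[None] = self.next_id
            self.next_id += 1
        return node[None]


# Example: store edges between surrogate keys rather than full URL strings.
trie = UrlTrie()
graph = LinkGraph()
example_links = [
    ("http://a.example/", "http://b.example/x"),
    ("http://c.example/", "http://b.example/x"),
]
for src, dst in example_links:
    graph.add_link(trie.key_for(src), trie.key_for(dst))

inbound = graph.links_to(trie.key_for("http://b.example/x"))
```

Keeping both an outgoing and an incoming index doubles the memory used for edges, but it makes "who links to this URL" as cheap as "what does this URL link to". Whether the trie with surrogate keys actually beats plain string keys is exactly what the performance tests would need to show.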