webgraph
文件大小: unknow
源码售价: 5 个金币 积分规则     积分充值
资源说明:Playing with datastructures for manging link graphs
Someone gave me the problem of working out an efficient data structure for modelling links from and to URLs. I had a number of 'gut feeling' ideas for how to model this, but wanted to write some actual code to test my thoughts.

Problem: 
 - set of nodes labeled with URLs
 - nodes connected by (directed) edges (url1 - linksto - url2)
 - how should this be modelled to enable efficient querying about who links to who...

My build plan:

* Write some code to grab example data from the web
* Create a naive implementation based on hashmaps
* Implement a tree based approach (possibly using a trie to exploit the fact I have string keys that have a hierarchical nature)
* Write some performance tests


Good implementation as a trie

To write

1) wget scripts to harvest some data
2) python scripts to build up a trie
3) query functions

'Clever' implementation using a trie and surrogate keys

performance comparisons



本源码包内暂不包含可直接显示的源代码文件,请下载源码包。