Abstract: Traditional web page clustering methods suffer from low accuracy and high computational complexity. This article proposes a new web page clustering method based on URL similarity and a simple DOM tree, denoising with a tree matching algorithm, then using s