An extensible PHP web spider. Example code:
use VDB\Spider\Spider;
use VDB\Spider\Discoverer\XPathExpressionDiscoverer;

$spider = new Spider('https://www.oschina.net');

Features:
supports two traversal algorithms: breadth-first and depth-first
supports depth limiting and queue size limiting
supports adding custom URI discovery logic, based on XPath, CSS selectors, or plain old PHP
comes with a useful set of URI filters, such as domain limiting
supports custom URI filters, both prefetch (URI) and postfetch (resource content)
supports custom request handling logic
comes with a useful set of persistence handlers (memory, file; Redis soon to follow)
supports custom persistence handlers
collects statistics about the crawl for reporting
dispatches useful events, allowing developers to add even more custom behavior
supports a politeness policy
will soon come with many default discoverers: RSS, Atom, RDF, etc.
will soon support multiple queueing mechanisms (file, memcache, redis)
will eventually support distributed spidering with a central queue
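
To make the first two features concrete, here is a minimal sketch of how breadth-first and depth-first traversal differ, and what a depth limit and a queue size limit do to a crawl. It is not the library's implementation and uses none of its classes: `crawlOrder`, its parameters, and the `$links` graph are hypothetical names made up for illustration, and the "pages" are an in-memory array rather than fetched resources.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of VDB\Spider: walks an in-memory link graph
 * instead of fetching pages, so the visiting order is easy to inspect.
 *
 * @param array<string, string[]> $links    page URI => URIs discovered on that page
 * @param string                  $seed     URI to start from
 * @param string                  $strategy 'bfs' (breadth-first) or 'dfs' (depth-first)
 * @param int                     $maxDepth how many links away from the seed to follow
 * @param int                     $maxQueue cap on URIs ever enqueued, seed included (0 = no cap)
 * @return string[] URIs in the order they would be visited
 */
function crawlOrder(array $links, string $seed, string $strategy, int $maxDepth, int $maxQueue = 0): array
{
    $pending = [[$seed, 0]];  // each entry is [uri, depth]
    $seen = [$seed => true];  // every URI that has ever been enqueued
    $visited = [];

    while ($pending !== []) {
        // Breadth-first takes the oldest entry (FIFO), depth-first the newest (LIFO).
        [$uri, $depth] = $strategy === 'bfs' ? array_shift($pending) : array_pop($pending);
        $visited[] = $uri;

        if ($depth >= $maxDepth) {
            continue; // depth limit reached: do not discover links on this page
        }

        $children = $links[$uri] ?? [];
        if ($strategy === 'dfs') {
            // Reverse so that the first link on the page is popped first.
            $children = array_reverse($children);
        }

        foreach ($children as $child) {
            if (isset($seen[$child])) {
                continue; // already enqueued once, skip duplicates
            }
            if ($maxQueue > 0 && count($seen) >= $maxQueue) {
                break; // queue size limit reached: stop enqueueing new URIs
            }
            $seen[$child] = true;
            $pending[] = [$child, $depth + 1];
        }
    }

    return $visited;
}

// A tiny made-up site: '/' links to '/a' and '/b', and so on.
$links = [
    '/'    => ['/a', '/b'],
    '/a'   => ['/a/1', '/a/2'],
    '/b'   => ['/b/1'],
    '/a/1' => ['/a/1/x'],
];

echo implode(' ', crawlOrder($links, '/', 'bfs', 2)), "\n";    // / /a /b /a/1 /a/2 /b/1
echo implode(' ', crawlOrder($links, '/', 'dfs', 2)), "\n";    // / /a /a/1 /a/2 /b /b/1
echo implode(' ', crawlOrder($links, '/', 'bfs', 2, 3)), "\n"; // / /a /b
```

With a depth limit of 2, '/a/1/x' is never visited in either mode because it sits three links from the seed. With the queue capped at 3, the crawl stops discovering new URIs once the seed and its two direct links have been enqueued.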









