Bot-trap


Common techniques used are:

There is no algorithm to detect all spider traps. Some classes of traps can be detected automatically, but new, unrecognized traps arise quickly.
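As an illustration of detecting one class of trap automatically, the sketch below flags URLs that look suspicious to a crawler. The heuristics (very long URLs, very deep paths, path segments that repeat many times) and the thresholds are assumptions chosen for the example, not rules taken from any particular crawler, and by the argument above no such check can catch every trap.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical thresholds; a real crawler would tune these.
MAX_URL_LENGTH = 2000
MAX_DEPTH = 12
MAX_REPEATS = 3


def looks_like_trap(url: str) -> bool:
    """Return True if the URL matches one of the assumed trap heuristics."""
    if len(url) > MAX_URL_LENGTH:
        return True
    # Split the path into its non-empty segments.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) > MAX_DEPTH:
        return True
    # The same segment appearing many times suggests a looping path.
    counts = Counter(segments)
    return any(n > MAX_REPEATS for n in counts.values())
```

A crawler could call `looks_like_trap` before queueing a newly discovered link and skip the link when it returns True.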

A spider trap causes a web crawler to make an effectively unbounded number of requests, much like an infinite loop. This wastes the spider's resources, lowers its productivity, and can crash a poorly written crawler. Polite spiders alternate requests between different hosts and do not request documents from the same server more than once every several seconds, so a polite web crawler is affected to a much lesser degree than an impolite one.
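The polite behaviour described here can be sketched as a small scheduler that rotates between hosts and enforces a minimum delay per host. The class name `PoliteScheduler`, the `min_delay` default of 5 seconds, and the explicit `now` argument are assumptions made for the example; no network requests are made.

```python
import collections
from urllib.parse import urlsplit


class PoliteScheduler:
    """Rotate between hosts and enforce a minimum delay per host."""

    def __init__(self, min_delay: float = 5.0):
        self.min_delay = min_delay
        self.queues = collections.OrderedDict()  # host -> deque of URLs
        self.last_hit = {}                       # host -> time of last request

    def add(self, url: str) -> None:
        host = urlsplit(url).netloc
        self.queues.setdefault(host, collections.deque()).append(url)

    def next_url(self, now: float):
        """Return a URL whose host may be contacted at time `now`, else None."""
        for host in list(self.queues):
            elapsed = now - self.last_hit.get(host, float("-inf"))
            if elapsed >= self.min_delay:
                url = self.queues[host].popleft()
                self.last_hit[host] = now
                # Send this host to the back so other hosts are served first.
                self.queues.move_to_end(host)
                if not self.queues[host]:
                    del self.queues[host]
                return url
        return None
```

With this design, a host that serves a trap can consume at most one request per `min_delay` interval, while the other hosts in the queue keep being crawled.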

In addition, sites with spider traps usually have a robots.txt file telling bots to stay out of the trap, so a legitimate, polite bot does not fall into it, whereas an impolite bot that disregards the robots.txt settings is caught.
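A minimal sketch of the robots.txt check, using Python's standard `urllib.robotparser` module. The robots.txt content, the `/trap/` path, the `example.com` URLs, and the `ExampleBot` user agent are all made up for the example; the rules are parsed from a string so nothing is fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all bots out of the trap directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)
```

A polite crawler would call `allowed` before every request and skip any URL for which it returns False, which keeps it out of the disallowed trap path.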


