Thread: search robots

  1. #1 search robots 
    Forum Freshman
    Join Date
    Jun 2008
    Location
    Poland
    Posts
    38
    Do search engines' robots search the entire web each time somebody clicks the search button, or do they search indexes stored on, say, Google's servers?
    My guess is that it's impossible to search the whole internet in a few seconds.
    How does it work in reality?



  3. #2  
    Forum Ph.D.
    Join Date
    Apr 2008
    Posts
    956
    What happens is that the search bots “crawl” through the Web, taking snapshots of as many Web pages as they can find. The info they gather is then cached and indexed. When someone makes a Web search, the engine will go through the cache of Web pages and display results that match the search terms.
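
To make the crawl, index, search sequence above concrete, here is a minimal sketch of an inverted index, which is one common way to organize such a store. The page URLs and texts are made up for illustration, and the function names (`tokenize`, `build_index`, `search`) are my own; real engines are far more elaborate.

```python
from collections import defaultdict
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of URLs whose stored snapshot contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs whose snapshot contains every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical snapshots a crawler might have stored (made-up URLs and text).
pages = {
    "example.com/a": "Search robots crawl the web",
    "example.com/b": "Robots build an index of web pages",
    "example.com/c": "Cooking recipes for the weekend",
}
index = build_index(pages)
print(sorted(search(index, "web robots")))  # ['example.com/a', 'example.com/b']
```

The point is that a query only touches the prebuilt index, never the live pages, which is why an answer can come back in a fraction of a second.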

    This is a simplified account of what basically happens; what actually happens may be much more complicated. For example, some engines can recognize typos or spelling errors and display close-matching results rather than exact-matching results.
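
As a rough illustration of close-matching, the sketch below uses Python's standard `difflib` module to map a misspelled term onto the nearest word in a small vocabulary. The vocabulary, the `suggest` helper and the 0.7 cutoff are assumptions chosen for the example; I don't know how any real engine actually does its spelling correction.

```python
import difflib

def suggest(term, vocabulary, cutoff=0.7):
    """Return the term if it is indexed, else the closest indexed word, else None."""
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical list of words that appear in the index.
vocabulary = ["search", "robots", "index", "engine"]
print(suggest("serach", vocabulary))  # search
```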

    You may notice that when Google displays a search result, it includes a brief quote from the Web page showing some or all of your search terms. When you click the link to go to the Web page, however, you may find that the contents of the page are different from the quote accompanying the search result. This shows that Google does not search the Web directly, but only its cache of stored pages. Between the time when the crawler took a snapshot of the Web page and the time you made your search, the Web page might have been updated.

    Cached pages are also not permanently stored with the search engine. Each cached page has a limit on how long it’s stored; when the time limit has expired, the page is deleted from the cache. This is to prevent the engine’s servers from becoming overloaded with outdated cached pages.
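
The time limit described above could look something like the sketch below. The `SnapshotCache` class, the `max_age` parameter and the URL are hypothetical; this only illustrates the idea of expiry and is not how any particular engine manages its storage.

```python
import time

class SnapshotCache:
    """Stores page snapshots and forgets them after max_age seconds."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._entries = {}  # url -> (time stored, page text)

    def store(self, url, text, now=None):
        """Save a snapshot together with the time it was taken."""
        now = time.time() if now is None else now
        self._entries[url] = (now, text)

    def get(self, url, now=None):
        """Return the snapshot, or None if it is missing or has expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, text = entry
        if now - stored_at > self.max_age:
            del self._entries[url]  # expired: drop it to free space
            return None
        return text

# The `now` argument is passed explicitly so the example does not have to wait.
cache = SnapshotCache(max_age=60)
cache.store("example.com/a", "old text", now=0)
print(cache.get("example.com/a", now=30))  # old text
print(cache.get("example.com/a", now=90))  # None
```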



  4. #3 thank you for your clarification 
    Forum Freshman
    Join Date
    Jun 2008
    Location
    Poland
    Posts
    38
    As above

  5. #4  
    Forum Freshman
    Join Date
    Jul 2008
    Location
    U.S.A.
    Posts
    19
    Well, for a large-scale engine like Google, I haven't had the chance to learn yet. But for a flavor of the direction you might head, say you have a search engine for a small website. You make a matrix with one column per page and one row per keyword, where each column is a normalized vector of that page's keyword frequencies. The search query is turned into a normalized vector of keywords in the same way. You multiply the transpose of the matrix by the query vector, which takes the dot product of each column with the query; because all the vectors have unit length, the result is a vector of the cosines of the angles between each column and the search vector. Order the pages by angle, smallest first (largest cosine), to see which pages match the keyword query best.
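
Here is a small sketch of that calculation in plain Python. The keywords, page names and page texts are invented for the example, and `rank_pages` is just a name I picked; treat it as an illustration of the idea rather than a finished search engine.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec] if length else list(vec)

def keyword_vector(text, keywords):
    """Count how often each keyword occurs in the text, then normalize."""
    words = text.lower().split()
    return normalize([words.count(k) for k in keywords])

def rank_pages(keywords, pages, query):
    """Return (page, cosine) pairs, best match first."""
    q = keyword_vector(query, keywords)
    scores = {}
    for name, text in pages.items():
        column = keyword_vector(text, keywords)  # one column of the matrix
        scores[name] = sum(a * b for a, b in zip(column, q))  # dot product = cosine
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical keywords and page contents for a small website.
keywords = ["robots", "index", "cache"]
pages = {
    "crawl": "robots robots index",
    "store": "cache cache index",
    "mixed": "robots index cache",
}
for name, cosine in rank_pages(keywords, pages, "robots index"):
    print(name, round(cosine, 3))
# crawl 0.949
# mixed 0.816
# store 0.316
```

A cosine near 1 means the page's keyword mix points in almost the same direction as the query, so "crawl" comes out on top here.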

    References:
    Steven J. Leon, Linear Algebra with Applications, 2002, pp. 230-232.
    Celtic Mad Scientist
    Celtic Mad Scientist's MetaCafe Channel - Science & Fun How-to Videos
    celticmadscientist.com
