2007년 11월 8일 목요일

PHP Snoopy Readme

NAME:

    Snoopy - the PHP net client v1.2.2
    
SYNOPSIS:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;
    
    $snoopy->fetchtext("http://www.php.net/");
    print $snoopy->results;
    
    $snoopy->fetchlinks("http://www.phpbuilder.com/");
    print $snoopy->results;
    
    $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
    
    $submit_vars["q"] = "amiga";
    $submit_vars["submit"] = "Search!";
    $submit_vars["searchhost"] = "Altavista";
        
    $snoopy->submit($submit_url,$submit_vars);
    print $snoopy->results;
    
    $snoopy->maxframes=5;
    $snoopy->fetch("http://www.ispi.net/");
    echo "<PRE>\n";
    echo htmlentities($snoopy->results[0]);
    echo htmlentities($snoopy->results[1]);
    echo htmlentities($snoopy->results[2]);
    echo "</PRE>\n";

    $snoopy->fetchform("http://www.altavista.com");
    print $snoopy->results;

DESCRIPTION:

    What is Snoopy?
    
    Snoopy is a PHP class that simulates a web browser. It automates the
    task of retrieving web page content and posting forms, for example.

    Some of Snoopy's features:
    
    * easily fetch the contents of a web page
    * easily fetch the text from a web page (strip html tags)
    * easily fetch the the links from a web page
    * supports proxy hosts
    * supports basic user/pass authentication
    * supports setting user_agent, referer, cookies and header content
    * supports browser redirects, and controlled depth of redirects
    * expands fetched links to fully qualified URLs (default)
    * easily submit form data and retrieve the results
    * supports following html frames (added v0.92)
    * supports passing cookies on redirects (added v0.92)
    
    
REQUIREMENTS:

    Snoopy requires PHP with PCRE (Perl Compatible Regular Expressions),
    which should be PHP 3.0.9 and up. For read timeout support, it requires
    PHP 4 Beta 4 or later. Snoopy was developed and tested with PHP 3.0.12.

CLASS METHODS:

    fetch($URI)
    -----------
    
    This is the method used for fetching the contents of a web page.
    $URI is the fully qualified URL of the page to fetch.
    The results of the fetch are stored in $this->results.
    If you are fetching frames, then $this->results
    contains each frame fetched in an array.
        
    fetchtext($URI)
    ---------------    
    
    This behaves exactly like fetch() except that it only returns
    the text from the page, stripping out html tags and other
    irrelevant data.        

    fetchform($URI)
    ---------------    
    
    This behaves exactly like fetch() except that it only returns
    the form elements from the page, stripping out html tags and other
    irrelevant data.        

    fetchlinks($URI)
    ----------------

    This behaves exactly like fetch() except that it only returns
    the links from the page. By default, relative links are
    converted to their fully qualified URL form.

    submit($URI,$formvars)
    ----------------------
    
    This submits a form to the specified $URI. $formvars is an
    array of the form variables to pass.
        
        
    submittext($URI,$formvars)
    --------------------------

    This behaves exactly like submit() except that it only returns
    the text from the page, stripping out html tags and other
    irrelevant data.        

    submitlinks($URI)
    ----------------

    This behaves exactly like submit() except that it only returns
    the links from the page. By default, relative links are
    converted to their fully qualified URL form.


CLASS VARIABLES:    (default value in parenthesis)

    $host            the host to connect to
    $port            the port to connect to
    $proxy_host        the proxy host to use, if any
    $proxy_port        the proxy port to use, if any
    $agent            the user agent to masqerade as (Snoopy v0.1)
    $referer        referer information to pass, if any
    $cookies        cookies to pass if any
    $rawheaders        other header info to pass, if any
    $maxredirs        maximum redirects to allow. 0=none allowed. (5)
    $offsiteok        whether or not to allow redirects off-site. (true)
    $expandlinks    whether or not to expand links to fully qualified URLs (true)
    $user            authentication username, if any
    $pass            authentication password, if any
    $accept            http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
    $error            where errors are sent, if any
    $response_code    responde code returned from server
    $headers        headers returned from server
    $maxlength        max return data length
    $read_timeout    timeout on read operations (requires PHP 4 Beta 4+)
                    set to 0 to disallow timeouts
    $timed_out        true if a read operation timed out (requires PHP 4 Beta 4+)
    $maxframes        number of frames we will follow
    $status            http status of fetch
    $temp_dir        temp directory that the webserver can write to. (/tmp)
    $curl_path        system path to cURL binary, set to false if none
    

EXAMPLES:

    Example:     fetch a web page and display the return headers and
                the contents of the page (html-escaped):
    
    include "Snoopy.class.php";
    $snoopy = new Snoopy;
    
    $snoopy->user = "joe";
    $snoopy->pass = "bloe";
    
    if($snoopy->fetch("http://www.slashdot.org/"))
    {
        echo "response code: ".$snoopy->response_code."<br>\n";
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";
        
        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";



    Example:    submit a form and print out the result headers
                and html-escaped page:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;
    
    $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
    
    $submit_vars["q"] = "amiga";
    $submit_vars["submit"] = "Search!";
    $submit_vars["searchhost"] = "Altavista";

        
    if($snoopy->submit($submit_url,$submit_vars))
    {
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";
        
        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";



    Example:    showing functionality of all the variables:
    

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->proxy_host = "my.proxy.host";
    $snoopy->proxy_port = "8080";
    
    $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
    $snoopy->referer = "http://www.microsnot.com/";
    
    $snoopy->cookies["SessionID"] = 238472834723489l;
    $snoopy->cookies["favoriteColor"] = "RED";
    
    $snoopy->rawheaders["Pragma"] = "no-cache";
    
    $snoopy->maxredirs = 2;
    $snoopy->offsiteok = false;
    $snoopy->expandlinks = false;
    
    $snoopy->user = "joe";
    $snoopy->pass = "bloe";
    
    if($snoopy->fetchtext("http://www.phpbuilder.com"))
    {
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";
        
        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";


    Example:     fetched framed content and display the results
    
    include "Snoopy.class.php";
    $snoopy = new Snoopy;
    
    $snoopy->maxframes = 5;
    
    if($snoopy->fetch("http://www.ispi.net/"))
    {
        echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";


COPYRIGHT:
    Copyright(c) 1999,2000 ispi. All rights reserved.
    This software is released under the GNU General Public License.
    Please read the disclaimer at the top of the Snoopy.class.php file.


THANKS:
    Special Thanks to:
    Peter Sorger <sorgo@cool.sk> help fixing a redirect bug
    Andrei Zmievski <andrei@ispi.net> implementing time out functionality
    Patric Sandelin <patric@kajen.com> help with fetchform debugging
    Carmelo <carmelo@meltingsoft.com> misc bug fixes with frames