iSearch2 Free
-------------

iSearch homepage: http://www.isearchthenet.com/isearch


COPYRIGHT/LICENSE NOTICE
------------------------

Copyright 2002-2010 Z-Host. All rights reserved.

This script is free to use on any website, commercial or individual. A small
donation would be much appreciated and would encourage me to put more effort
into developing iSearch2 further.

For more information on making a donation, please visit the following web page:

    http://www.isearchthenet.com/isearch/donate.php

Removal of copyright messages is expressly forbidden.

Removal of "Powered by iSearch from Z-Host" messages and associated links is
expressly forbidden. The "Powered by iSearch from Z-Host" message must be clear
and legible to your site visitors. You may not obscure this message by placing
other objects over the top, or changing the style to make it difficult to see.
Should you wish to not display these messages on your web site you must purchase
the professional version of iSearch.

Attempting to reverse engineer, disassemble or modify protected source code of
the free version of iSearch is expressly forbidden. Permission to modify any
unprotected source code for use on your own web site is granted.

By using this script you agree to take full responsibility for it. Z-Host is
in no way accountable for any damage caused or losses suffered.

Reselling or distributing this code in original or modified form without prior
written consent is expressly forbidden.

If you have any questions about this copyright or license, please contact
isearch@z-host.com

For more information about purchasing the professional version of iSearch,
please visit:

http://www.isearchthenet.com/pro


INTRODUCTION
------------

iSearch2 is a tool that lets visitors to a website search the contents of the
site. Unlike other such tools, the spidering engine is written in PHP, so no
binaries need to be run on the server to generate the search index.

iSearch2 takes note of the following data from the HTML <HEAD> section of each
page:

<TITLE>page title</TITLE>
<META name="keywords" content="word1,word2,word3...">
<META name="description" content="a description of the page">
<META name="robots" content="nofollow,noindex">

In addition, all words from the page body are put into the search index.
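
The following is a minimal sketch of this kind of <HEAD> extraction. It is
written in Python for illustration only and is not the iSearch2 PHP code; the
names HeadMetadataParser and extract_head_metadata are hypothetical.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects the <title> text and selected <meta> tags from a page."""

    # Meta names of interest, as listed in the text above.
    WANTED = {"keywords", "description", "robots"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in self.WANTED:
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_head_metadata(html):
    """Return title, keywords, description and robots flags for one page."""
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    robots = {
        d.strip().lower()
        for d in parser.meta.get("robots", "").split(",")
        if d.strip()
    }
    keywords = [
        k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()
    ]
    return {
        "title": parser.title.strip(),
        "keywords": keywords,
        "description": parser.meta.get("description", ""),
        "noindex": "noindex" in robots,
        "nofollow": "nofollow" in robots,
    }
```

In this sketch, a spider would skip indexing a page when "noindex" is set and
would not follow its links when "nofollow" is set.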

iSearch2 performs simple page match scoring. Keywords score highly, and some
<BODY> words (those in <H1> to <H5> headings) are given higher relevance in
search scoring.
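
As a rough illustration of weighted match scoring of this kind, here is a
sketch in Python. The weights and the function names are assumptions made for
the example; this README does not document the actual values or the algorithm
used by iSearch2.

```python
import re

# Illustrative weights only (assumed, not taken from iSearch2).
KEYWORD_WEIGHT = 10   # words from the META keywords
HEADING_WEIGHT = 5    # words inside <H1> to <H5> headings
BODY_WEIGHT = 1       # all other <BODY> words


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_page(query, keywords, heading_text, body_text):
    """Sum a weight for every page word that matches a query word."""
    terms = set(tokenize(query))
    score = 0
    for keyword in keywords:
        for word in tokenize(keyword):
            if word in terms:
                score += KEYWORD_WEIGHT
    for word in tokenize(heading_text):
        if word in terms:
            score += HEADING_WEIGHT
    for word in tokenize(body_text):
        if word in terms:
            score += BODY_WEIGHT
    return score
```

Pages would then be listed in descending order of score, so a page that lists
a search word as a keyword ranks above one that only mentions it in body text.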


REQUIREMENTS
------------

iSearch2 has the following requirements:

1. A server that supports PHP4. This must include file operations on URLs (i.e.
   the allow_url_fopen option must be enabled).
2. A server that supports MySQL.


UPGRADE
-------

To upgrade from a previous version you must reset the URL tables. Once
installed, click on the "Reset URL Index" button on the configuration page, then
the "Spider" button.


INSTALLATION AND SUPPORT
------------------------

Please visit http://www.isearchthenet.com/isearch



REVISION HISTORY
----------------

iSearch2 is based on iSearch version 1.9j.

2.0     -   14th October, 2004
        -   First release of iSearch2


2.1     -   22nd October, 2004
        -   Minor bugs fixed.
        -   Fix for HTML accented characters in search descriptions.
        -   Added "Hide" and "Space-Replace" regular expressions.

2.2     -   1st November, 2004
        -   Fix for Parse error: on inc/core.inc.php line 72
        -   Fix for Hide and Space-Replace when highlighting.
        -   Improved highlighting

2.3     -   15th November, 2004
        -   First iSearch pro release


2.4     -   15th December, 2004
        -   Fixes for online conversion (pro version).
        -   Support for searching groups (pro version).
        -   Added "Stop Words Length" for stopping short words from being indexed.
        -   Added configuration of search box display.
        -   Added ability to suppress descriptions in search results.
        -   Added ability to suppress URLs in search results.
        -   Added ability to suppress titles in search results.
        -   Added ability to suppress page sizes in search results.
        -   Added error level configuration.
        -   Fixed bug when "+" and "-" were not surrounded by whitespace.
        -   Fixed "max page" limit bug.
        -   Fixed bug with ignoring default pages.
        -   Fixed character set conversion bug.
        -   Added support for JavaScript window.open links.

2.5     -   21st December, 2004
        -   Fixed warning when spidering
        -   Added parsing of JavaScript
        -   Fixed inline frames

2.6     -   4th Feb, 2005
        -   Fixed header problems when viewing cache.
        -   Added MySQL index on url field (speeds up spidering)
        -   Removed some auto_spider messages.
        -   Fixed group search next/previous page links
        -   Fixed multiple group selections.
        -   Fixed ALT image tags
        -   Fixed quote characters in anchor tags
        -   Added menu to spider allowing pause/reset/copy

2.7     -   3rd Mar 2005
        -   Fixed bug with command line detection in reindex.php
        -   Added scoring on title words
        -   Added partial matching
        -   Added "Must Match All" configuration
        -   Added advanced search form
        -   Added automatic creation of .htaccess and .htpasswd files
        -   Added backup and restore of settings
        -   Fixed MySQL connection bugs
        -   Fixed bug with + and quoted search strings
        -   Added highlighting of found words in page titles
        -   Removed some PHP notice and warning messages
        -   Added support for documents without a <body> tag
        -   Fixed anchor link searching with attributes before the href
        -   Added support for href and single quotes (requires aggressive link
            search to be enabled)

2.7a    -   4th Mar 2005
        -   Fixed notice message in viewcache.php
        -   Added admin password authentication to reindex.php and log file
            viewers
        -   Added "Allow Dashes" configuration
        -   Fixed bug which caused results titles to be lowercased and stripped
            of punctuation

2.8     -   18th Apr 2005
        -   Added read only MySQL access for searching
        -   Added groups to sitemap (pro version)
        -   Added extra links to display in results (pro version)
        -   Added timeout to update checking
        -   Added JavaScript checking for empty search strings
        -   Fixed "partial match" not returning results bug
        -   Prevented highlighting of stopped words
        -   Fixed "basedir" bug when parsing robots.txt
        -   Added support for &#x escaped chars
        -   Fixed bug deleting not found URLs
        -   Removed error messages if tables could not be locked

2.9     -   3rd June 2005
        -   Fixed ALT tags in images
        -   Fixed "Must Match All" setting
        -   Fixed redirect base locations when using fopen on URLs
        -   Fixed pagination of browsing search index
        -   Fixed bug evaluating &#x escaped chars
        -   Added suggestions (pro version)
        -   Added "Smart Log" feature to allow you to enter suggestions and links (pro version)
        -   Added max title length
        -   Added max description length
        -   Added matching (and scoring) of words within URLs
        -   Added automatic addition of pdf and doc to file extensions when enabled
        -   Added ability to change style of extra links (pro version)
        -   Added logging of admin events (login and config changes)
        -   Added automatic fixing of URLs with spaces

2.10    -   5th July 2005
        -   Fixed bug which caused multiple indexes on tables.
        -   Fixed index browsing multiple pages bug.
        -   Fixed MySQL read-only configuration.
        -   Fixed PHP warning caused by automatic fixing of URLs with spaces.
        -   Added ability to ignore image alt tags.
        -   Added ability to select reading mechanism.
        -   Added configuration of previous and next links.
        -   Added German help file.

2.11    -   6th July 2005
        -   Fixed problem regarding readonly MySQL configuration in config file
        -   Fixed bug with displaying results when optionally showing previous/next links.

2.12    -   14th July 2005
        -   Fixed search logging and statistics
        -   Fixed iconv warnings
        -   Fixed base URL parsing when reading via fopen
        -   Fixed location of robots.txt when using port numbers in URL
        -   Fixed default previous/next link type to show below results
        -   Added localisation of decimal point
        -   Added ability to hide display of search time
        -   Added configuration of the number of results pages to show in previous/next bar
        -   Added test mode to allow link following only
        -   Added ability for URL lists to contain files which list the URLs
        -   Reformatted admin page
        -   Added Portuguese help file
        -   Updated Portuguese language file

2.13    -   9th August 2005
        -   Added capability for "Sounds Like" matching
        -   Added ability to perform "Sounds Like" and/or "Partial" searches if no results are found.
        -   Added Google Sitemap generation (experimental)
        -   Added support for HTTP proxy
        -   Added "Allow Colons" to allow colons in search words
        -   Added ability to add/respider/remove individual pages without needing to respider the whole site.
        -   Added Dutch, German and Spanish help files.
        -   Added total number of searches to stats page.
        -   Updated Dutch and Spanish language files.

2.14    -   20th September, 2005
        -   Added auto_spider_img.php for better autospidering.
        -   Added SPAM hack detection to form submission.
        -   Added tooltips to admin interface
        -   Changed look and feel of admin interface
        -   Added chdir for cron jobs with php5
        -   Added ability to disable JavaScript checking for empty searches
        -   Added limiting of max sections from MySQL for efficiency
        -   Added character set specific html entity conversion
        -   Added limiting max results display to the maximum that can be displayed.
        -   Added logging of date that spidering is started
        -   Fixed (harmless) divide by zero error message.
        -   Fixed bug finding suggestions for multiple words and quoted strings
        -   Fixed bug with not closing tags when highlighting words in results.
        -   Fixed bug with sockets reading mechanism

2.15    -   23rd September, 2005
        -   Removed advert from admin page in professional version.
        -   Added "Remember My Password" option to admin login.
        -   Show advanced search form in results if user used it for the search.
        -   Fixed selection of group on search form.
        -   Fixed version compare bug with 16 bit character sets.
        -   Fixed problem with multiple group selections.

2.16    -   20th April, 2006
        -   Made XHTML compatible
        -   Fixed quote bug with 16 bit charsets
        -   Fixed description truncation bug
        -   Fixed bug deleting terms with special characters using smart log
        -   Fixed bug with ranking when number of results exceeded the maximum number that could be displayed
        -   Fixed bug which caused duplicate suggestions to be displayed
        -   Fixed bug with special characters (e.g. &) in Google sitemap
        -   Fixed non-transparent pixels on "i" image on admin page
        -   Do charset conversion on Link title and descriptions
        -   Added "Follow Meta Refresh" option to follow meta refresh headers
        -   Added "Allow Dots" and "Allow Commas" options to allow dots and commas in search words
        -   Extended "Allow Dashes" and "Allow Colons" options
        -   Added confirmation dialogs to "Reset URLs" and "Reset Settings" buttons
        -   Prevent following of links with "rel=nofollow"
        -   Strip embedded CSS code when spidering
        -   Enhanced SPAM mail form detection on form submission.
        -   Added ability to get arrays of search results (pro version)

2.17    -   8th March, 2007
        -   Fixed problem with stripping slashes when magic_quotes was disabled
        -   Search form on stats and sitemap now works
        -   Fixed help links on internal pages
        -   Detect iconv failure and use unconverted text
        -   Fixed authentication using fopen reading mechanism
        -   Store meta data regarding PDF files.
        -   Perform &#nn; and &#xnn; conversion in PHP4.3.0 to 4.4.2
        -   Replace spaces with "+" on URLs that are found.
        -   Corrected iso-8859-1 typo in admin help

2.18    -   18th October, 2007
        -   Added warning to character set admin page if iconv not installed
        -   Fixed action and target of internal search form
        -   Fixed divide by 0 warning
        -   Added parsing of absolute URLs in content-location (legal) and location (not legal but are used and followed by IE) HTTP headers
        -   Handle &shy; entity as a dash
        -   Handle &lsquo; &ldquo; &rsquo; and &rdquo; entities as single and double quotes
        -   Fixed bug that prevented <!-- ISEARCH_BEGIN_INDEX --> and <!-- ISEARCH_END_INDEX --> working
        -   Added handling of <!-- ISEARCH_END_INDEX --> without a <!-- ISEARCH_BEGIN_INDEX -->
        -   Added html_entity_decode handling of utf-8 character set with PHP4.
        -   Added https over sockets support
        -   Fixed bugs with data conversion of PDF/Word files
        -   Fixed bug with spidering text/plain files
        -   Fixed bug when spaces occurred in URLs
        -   Limit number of search terms when adding suggestions

2.19    -   23rd April, 2008
        -   Added ability to respider pages without following links.
        -   Correctly set utf-8 encoding for MySQL queries.
        -   Improved Google-style extraction to search for complete phrases
        -   Use content-type meta data to determine pdf/msword content (pro version)
        -   Added support for "revisit-after", "sitemap-changefreq", and "sitemap-priority" meta data for sitemap generation.
        -   Handle &sbquo; and &bdquo; entities
        -   Added content-type header to sitemap to show correctly in browsers
        -   Interpret HTML entities in links correctly
        -   Fixed duplicate prevention
        -   Fixed links on the stats page
        -   Fixed cross-site scripting vulnerability
        -   Added paging when browsing word index
        -   Break textareas on lines in admin page
        -   Fixed Powered By links on help pages
        -   Split large MySQL queries to reduce query length
        -   Updated Italian language file
        -   Fixed bug that prevented headings from being recognised
        -   Made heading tags a breaking character

2.20    -   9th May 2008
        -   Fixed isearch_utf8_chr problem
        -   Fixed behaviour when PHP magic_quotes_sybase configuration enabled
        -   Changed behaviour of read-only MySQL configuration so that both connections are always opened.
        -   Added ability to prefix spider log entries with the current time.
        -   Added ability to email log files to admin email address from admin page.
        -   Added ability to automatically email spider log to admin when spidering has finished.

2.21    -   Non-public release

2.22    -   16th April, 2009
        -   Added configuration of whether to ignore apostrophes
        -   Added passing of the search string as a GET variable when results are clicked on. This could allow matching words to be highlighted on the page.
        -   Added multibyte module support
        -   Added further version checking on Admin page.
        -   Fixed some MySQL warnings on the admin page
        -   Fixed magic_quotes_sybase detection
        -   Fixed &# escapes in PHP5
        -   Fixed encoding of URLs in utf-8 character set
        -   Fixed encoding of ampersands in URLs
        -   Fixed problem with header end tags being included in indexed words
        -   Fixed processing of utf-8 search strings
        -   Fixed "partial" checkbox
        -   Fixed a MySQL injection vulnerability

2.23    -   20th May, 2009
        -   Fixed international character set problems introduced by move to utf-8
        -   Fixed admin page hanging when settings changed and large index present
        -   Tables are created with utf8 collation
        -   Fixed sitemap-priority meta tag
        -   Disabled magic_quotes_runtime


2.24    -   8th June, 2010
        -   Removed calls to functions deprecated in PHP 5.3
        -   Added missing body tags
        -   Fixed entities on several pages
        -   Fixed character set on reindex frame
        -   Fixed partial matching with special characters
        -   Fixed highlighter when words contain special characters
        -   Changed order of reading mechanism auto-detection
        -   Fixed proxy use when accessing secure URLs
        -   Improved iframe matching
        -   Improved recursion protection on URL fetching
        -   Improved MySQL resource freeing
        -   Added deletion of orphaned words (reduces table size)
        -   Improved optimisation of tables after spidering