Step 5: "Check + Control Shadow Domain (sdcontrol-e.cgi)"
After creating the phantom pages and crosslink files for your
Shadow Domain in Step 4, and after uploading them to the
SD, you will now want to implement the "Central Keyword Switch"
(CKS) file "X.cgi".



-------------------------------------
ADJUSTMENTS IN FILE
"X.cgi"
(please edit this file with a plain
ASCII text editor, e.g. Notepad,
only!)
-------------------------------------

System Path
-----------
* Please check your system's path to location of Perl. The
default path in the script is "/usr/bin/perl".

If you don't know this path, you can look it up via telnet
by entering the Unix command "whereis perl". The system path
will then be displayed so you can copy it if required (see
the example below).

If your system path to Perl is not "/usr/bin/perl", you will
have to adjust the first line accordingly in the following
script:

- X.cgi
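
For example, if "whereis perl" reports "/usr/local/bin/perl"
(a hypothetical path used here for illustration only), the first
line of "X.cgi" would have to be changed from

   #!/usr/bin/perl

to

   #!/usr/local/bin/perl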

Configuration of Script Parameters (Variables)
----------------------------------------------
The following adjustments are related exclusively to
the "Central Keyword Switch" (CKS) file "X.cgi".

* Please adjust the following variables to your requirements:

- "$standard"
   Variable "$standard" denotes the core domain you want to
   redirect your "normal" visitors to (i.e. NO machines, 
   searchbots, etc.)

- "$keyword_flag"
   As a rule, human visitors will enter some keyword or search term
   on a search engine's main page and will then click on a listed
   phantom page's link which will transfer them to that page's
   Shadow Domain(TM).
   By defining variable "$keyword_flag" as 1, the visitor's search term
   can be included as an info field in the standard URL.
   This information can then be used in statistical analysis of your
   Core Domain's traffic.
   If you opt for this feature, the "$standard" variable's syntax
   requires some adjustment.
   Two examples:
      $standard        = "http://www.coredomain.com/index.html?<>"
   or:
      $standard        = "http://www.coredomain.com/affiliate.cgi?keyword=<>"
   Thus, the character string <> is included at some position within the URL.
   The exact form of the URL depends on the manner in which it will be analyzed on
   the Core Domain, i. e. which script is assigned this task.
   Default: the variable "$keyword_flag" is commented out, i. e. not active.
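
   A minimal configuration sketch (the domain name and the placement
   of the <> placeholder are examples only; the exact variable layout
   in your copy of "X.cgi" may differ):

      # redirect target for "normal" (human) visitors
      $standard     = "http://www.coredomain.com/index.html?<>";

      # pass the visitor's search term to the Core Domain ("1" = on)
      $keyword_flag = 1;

   With "$keyword_flag" active, the <> placeholder is presumably
   replaced by the search term extracted from the Referrer, so a
   search for "blue widgets" might result in a redirect such as
   "http://www.coredomain.com/index.html?blue+widgets" (the exact
   encoding depends on the script and on how your Core Domain
   analyzes the URL).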

-  RSS Feed Inclusion
   - "use LWP::Simple;"
     This call will integrate Perl module "LWP::Simple" in the overall process.
   - "use XML::RSS;" 
     This call will integrate Perl module "XML::RSS" in the overall process.
     Both modules are freely available for any standard Perl 5 installation.
     Should one or both be missing from your system, they will have to be
     installed, as the RSS Feed functionality will not work otherwise.
     (Note that this function is not mandatory: the overall fantomas shadowMaker(TM)
     will work fine, too, if you choose not to make any use of it.)
   - "$rss_flag"
     Set this variable to "1" if you want to include an RSS feed in any of your phantom
     pages.
   - "$rss_items"
     This variable lets you determine how many items of the RSS feed shall be included in
     your phantom pages.
     If this variable is commented out, all the RSS feed's items will be included.
     However, this is not recommended as it can dramatically blow up the phantom pages' size.
     By default, all these 4 variables are commented out, i. e. the RSS Feed Inclusion
     feature is not activated.
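
     As a rough illustration, the two modules could be combined as
     follows (a sketch only, NOT the actual code of "X.cgi"; the feed
     URL and the print statement are placeholders for illustration):

        use LWP::Simple;
        use XML::RSS;

        $rss_flag  = 1;    # activate RSS feed inclusion
        $rss_items = 5;    # include at most 5 feed items per phantom page

        # fetch the feed (URL is an example only) and parse it
        my $feed_xml = get("http://www.example.com/feed.xml");
        if (defined $feed_xml) {
            my $rss = XML::RSS->new;
            $rss->parse($feed_xml);
            my @items = @{ $rss->{'items'} };
            # honor the $rss_items limit, if set
            if ($rss_items && @items > $rss_items) {
                @items = @items[0 .. $rss_items - 1];
            }
            foreach my $item (@items) {
                print "$item->{'title'}: $item->{'link'}\n";
            }
        }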

- "$main_dir"
   Main directory (DocumentRoot) for your HTML pages. The absolute path is required.
   Examples:
   "/usr/www/htdocs/"
   "/var/www/html_public/"

- "$stats_dir"
   Directory for log files and admin files. The absolute path is required.
   Examples:
   "/usr/www/htdocs/cgi-bin/stats/"
   "/var/www/html_public/cgi-bin/stats/"

- "$hits_log_file"
   Log file listing SD hits
  (Default name is: "hits.log") 

- "$humans_log_file"
   Log file listing SD hits from human visitors
  (Default name is: "human-hits.log") 

- "$links_list_file"
   Links list file name as generated in step 4
  (Default name is: "links.txt") 

- "$selist_file"
   File containing the search engine referrer parsing rules
  (Default name is: "selist.txt")

- "$botbase_dir"
   Directory of fantomas spiderSpy(TM) botBase file.
   The absolute path is required.
   Examples:
   "/usr/www/htdocs/cgi-bin/stats/"
   "/var/www/html_public/cgi-bin/stats/"

- "$botbase_file"
   File containing spider robots list
  (Default name is: "spiderspy.txt")
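
Taken together, a typical configuration block in "X.cgi" might look
like this (all paths below are examples only; adjust them to your own
server layout, and note that the actual layout of the script may
differ):

   $main_dir        = "/usr/www/htdocs/";                # DocumentRoot
   $stats_dir       = "/usr/www/htdocs/cgi-bin/stats/";  # log and admin files
   $hits_log_file   = "hits.log";                        # all SD hits
   $humans_log_file = "human-hits.log";                  # human/non-SE hits only
   $links_list_file = "links.txt";                       # generated in Step 4
   $selist_file     = "selist.txt";                      # SE referrer parsing rules
   $botbase_dir     = "/usr/www/htdocs/cgi-bin/stats/";  # spiderSpy(TM) botBase directory
   $botbase_file    = "spiderspy.txt";                   # spider robots list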


The ".htaccess" file
--------------------
The file ".htaccess" should include as a minimum the
following entries:

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteRule ^$ /cgi-bin/X.cgi?%{REQUEST_URI} [L]
RewriteCond %{REQUEST_URI} !/.*/
RewriteRule ^.*\.html$ /cgi-bin/X.cgi?%{REQUEST_URI} [L]

However, it is recommended to expand the .htaccess file
by including the following condition:

RewriteCond %{REQUEST_FILENAME} -f

Explanation:
This condition checks whether the called page actually exists
on the domain.
If not, a 404 error message will be triggered.
If you don't include this condition, the page index.html
will be displayed instead.

Thus, the complete expanded code is:

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteRule ^$ /cgi-bin/X.cgi?%{REQUEST_URI} [L]
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_URI} !/.*/
RewriteRule ^.*\.html$ /cgi-bin/X.cgi?%{REQUEST_URI} [L]

(A sample ".htaccess" file featuring these entries is
included with our package.)

This .htaccess file offers two functionalities:

1. All calls for HTML pages, whether generated by search engine
   spiders or by human visitors (web browsers), will first be
   redirected to the Central Keyword Switch (CKS).

2. All HTML pages in subdirectories will be displayed as
   normal HTML pages.
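
The rewrite rules pass the originally requested URI to "X.cgi" as
the query string. Inside the script, this value would be available
roughly as follows (a sketch; the actual code of "X.cgi" may differ):

   # a request for http://www.shadowdomain.com/blue-widgets.html
   # is rewritten to /cgi-bin/X.cgi?/blue-widgets.html,
   # while /subdir/page.html is served as a normal HTML page
   my $requested_page = $ENV{'QUERY_STRING'};   # e.g. "/blue-widgets.html"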


Uploading Files to Your Web Server
----------------------------------
* The following files must be copied into the
Shadow Domain's main directory ("DocumentRoot"):

- .htaccess

- robots.txt
  A generic robots.txt file is included in our package
  which permits all spiders to crawl the phantom pages.
  The use of a robots.txt file is not mandatory.

* The following script must be copied into the
  directory for execution of CGI scripts.
  Usually, this will be the directory /cgi-bin/.

- X.cgi


Creating Subdirectories
-----------------------
* Next, create the directories defined by the following
  variables BELOW your main directory:

- "$stats_dir"
- "$botbase_dir"

Set directory permissions to:
"chmod 777" [drwxrwxrwx]

Typically, the two variables above will point to the same
directory.
This, however, is not mandatory. E.g. if you wish to implement
several Shadow Domains on a single server, you might want to
feed the fantomas spiderSpy(TM) botBase from a central directory
while storing the individual Shadow Domains' log files decentrally
in dedicated directories.


Uploading the SE List File
--------------------------
* Now, copy the following file into the directory
  defined under variable "$stats_dir":

- selist.txt


Uploading the Links List File
-----------------------------
* Next, copy the following file into the directory
  defined under variable "$stats_dir":

- links.txt

This will only be required if you actually generated a matching
Links List file during Step 4.


Uploading empty Log Files
-------------------------
* Now, copy the following blank files into the directory
  defined under variable "$stats_dir":

- hits.log
- human-hits.log


Uploading the fantomas spiderSpy(TM) botBase
--------------------------------------------
* Finally, copy the following file into the directory
  defined under variable $botbase_dir:

- spiderspy.txt


FTP Upload Mode
---------------
* When uploading via FTP, make sure to transfer ALL files in
ASCII mode (including the ".htaccess" file!).
This is quite critical as about 90% of all installation problems
are related to incorrect upload modes!


Assigning Proper File Permissions
---------------------------------
* Assign the following required file permissions:

.htaccess:         "chmod 444"  [-r--r--r--]
X.cgi:             "chmod 755"  [-rwxr-xr-x]
spiderspy.txt:     "chmod 666"  [-rw-rw-rw-]
hits.log:          "chmod 666"  [-rw-rw-rw-]
human-hits.log:    "chmod 666"  [-rw-rw-rw-]
selist.txt:        "chmod 444"  [-r--r--r--]
links.txt:         "chmod 444"  [-r--r--r--]


UNINSTALLING THE CKS
---------------------
For complete uninstall, delete the following files:

.htaccess
X.cgi

Also, delete the following directory, or whichever directories
you defined under variables "$stats_dir" and "$botbase_dir",
including all content:

- stats

WARNING
-------
Uninstallation should always include the whole Shadow Domain!
E.g. if you were to delete the Central Keyword Switch (CKS) only,
the phantom pages could be read by any human visitor and no redirection
to your Core Domain would be effected.
Following uninstallation, you may also want to adjust your ".htaccess"
file, restore it to its previous version, delete it altogether or
whatever may be most pertinent to your system setup.



Functionality of the Central Keyword Switch (CKS)
-------------------------------------------------
All visitors' IP addresses will be checked by the CKS.
If found belonging to a search engine spider, the phantom page
will be read internally and fed to the spider. In this case, no
redirection will take place and the spider will not notice the
difference: it will crawl and index the phantom page just like
any other web page.

If no established search engine spider IP is detected, the
visitor's Referrer data will be parsed for keywords/search
phrases.
If keywords/search phrases are found for which redirection
instructions have been defined in the Links List, the visitor
will be redirected to the predefined target URL.

If no Referrer is detected, or if no specific target URL has
been defined for the keywords/search phrases found, the
visitor will be redirected to the defined standard URL.

All hits are logged in the log file "hits.log".
Search engine spider hits are marked by two preceding
exclamation marks: "!!".

The file "human-hits.log" logs only hits from human visitors and
spiders not assigned to search engines. (The latter may include
whackers, extractor bots, etc.)
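
In summary, the decision flow of the CKS can be sketched as follows
(a simplified illustration only, NOT the actual code of "X.cgi"; the
helper functions ip_is_spider(), parse_referrer_keywords(),
lookup_links_list(), serve_phantom_page() and log_hit() are
hypothetical placeholders):

   if (ip_is_spider($ENV{'REMOTE_ADDR'})) {
       # spider detected: log the hit with the "!!" marker and feed
       # the phantom page directly - no redirection takes place
       log_hit("!! $ENV{'REMOTE_ADDR'} $ENV{'QUERY_STRING'}");
       serve_phantom_page($ENV{'QUERY_STRING'});
   }
   else {
       # human visitor (or unlisted robot): try to extract a search
       # term from the Referrer using the rules in selist.txt
       my $keyword = parse_referrer_keywords($ENV{'HTTP_REFERER'});
       # look up a keyword-specific target URL in links.txt, if any
       my $target  = defined $keyword ? lookup_links_list($keyword) : undef;
       # otherwise fall back to the standard URL
       $target = $standard unless defined $target;
       # such hits are also written to human-hits.log
       log_hit("$ENV{'REMOTE_ADDR'} $ENV{'QUERY_STRING'}");
       print "Location: $target\n\n";   # redirect to the target/Core Domain
   }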