
This is my .htaccess file, pulled together from various websites and from iamawolf.com. As an aside, that site has the most bizarre site logo I’ve ever seen. I’m tempted to email and ask why.
Anyway, back to htaccess. For this to work, the server must be running Apache and have all of the required options enabled. .htaccess can go either in the root of your site, where it will affect your whole site, or in a specific directory, where it sets options for just that directory and everything below it.
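As a rough sketch of what "required options enabled" can mean in practice: Apache only reads .htaccess files where the main server config allows overrides, and the rewrite and shtml parts of my file also need mod_rewrite and mod_include loaded. The path /var/www/html below is a made-up document root, not mine, and on shared hosting you usually can't edit this file at all, so treat it as an illustration rather than something to paste:

```apache
# Goes in the main server config (e.g. httpd.conf), NOT in .htaccess.
# /var/www/html is an assumed document root; substitute your own.
<Directory "/var/www/html">
    # Let .htaccess files in this tree override server settings.
    # "All" is the broadest choice; a narrower list such as
    # "FileInfo Indexes Limit Options" should cover the directives used in this post.
    AllowOverride All
</Directory>
```

If AllowOverride is set to None for your directory, Apache ignores .htaccess completely and none of the code in this post will do anything.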
Remember to secure the file and to call it:
.htaccess
For the code to copy and paste, see the extended entry. Remove every line starting with *Comment before you use it.
Feel free to steal the htaccess code lock, stock and barrel. If you want to, just give my site a link back: www.creationrobot.com. Cheers.
Addendum: This is the first hit on Google when you search for “parsing html with php” or something similar. That’s cool and all, as it brings people in, but I don’t want you lot feeling cheated that this isn’t a site just about htaccess. If you want me to expand on what’s here, let me know; my MSN/email link is at the bottom of the page.
*Comment From www.creationrobot.com
*Comment None of your visitors can read .htaccess
<Files .htaccess>
order allow,deny
deny from all
</Files>
*Comment Parse html as php
AddType application/x-httpd-php .html .php .htm
*Comment Send readers to specific pages on an error
ErrorDocument 403 /errors/forbidden.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/internalerror.html
ErrorDocument 408 /errors/timedout.html
*Comment Make sure shtml is parsed on the server
AddType text/html .shtml
AddHandler server-parsed .shtml
Options FollowSymLinks Includes
*Comment Prevent anyone stealing bandwidth by hotlinking to images
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?creationrobot\.com/.*$ [NC]
RewriteRule \.(gif|jpg)$ - [F]
*Comment Do not allow listing of your directories from a browser. Options -Indexes turns listings off altogether; IndexIgnore * hides every file if a listing is ever shown.
IndexIgnore *
Options -Indexes
*Comment Prevent email address harvest bots
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Bullseye.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^fastlwspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWebPage.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^SurfWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01
RewriteRule ^.*$ - [F]
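
One note for newer servers: the order/deny lines above are the Apache 2.2 way of doing things. On Apache 2.4 they only work through the mod_access_compat module, and the usual replacement is Require. Here is a minimal sketch of how the .htaccess protection and the hotlink rules might look in 2.4 style. It assumes mod_authz_core is loaded and that your host allows AuthConfig overrides; the png extension and the IfModule wrapper are my additions for illustration, not part of the file above:

```apache
# Apache 2.4 style: nobody can read .htaccess over HTTP
<Files ".htaccess">
    Require all denied
</Files>

# Only apply the hotlink rules if mod_rewrite is actually loaded,
# so a missing module does not turn every request into a 500 error
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Allow requests with no referer (direct visits, some proxies)
    RewriteCond %{HTTP_REFERER} !^$
    # Allow requests coming from this site, with or without www
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?creationrobot\.com/ [NC]
    # Everything else asking for an image gets 403 Forbidden
    RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
</IfModule>
```

Swap creationrobot\.com for your own domain, keeping the backslash before the dot so it only matches a literal dot.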







