
Protecting Your Websites From Search Engines


Author: Ben Cortese

There are many scenarios in which you should protect parts of your website from the search engines. If you've developed a website along with a personal administrative section for it, as many of us do, you probably don't want that administrative URL showing up in Google or Yahoo search results.

If you accept documents from clients, such as Word or PDF files, and store them in a directory on your web server, those files can be reachable by direct URL even when the section of the website that links to them is password protected. They are exposed simply because they sit in a directory under the web root that the server will serve to anyone who requests it. And thanks to the power of Google, you can find out which Word and PDF documents have been indexed for a website with a simple search such as this (substitute the site's domain for example.com):

site:example.com filetype:doc

Experiment with this format in a Google search; the results may surprise you. They surprised me once when I searched for PDF documents on a bank's website and found files that were designated for client eyes only. I wasn't a client, yet I was able to view their documents.
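To run the same check against your own site, it can help to generate one query per document type. The sketch below is only an illustration: the function name is made up for this example, example.com stands in for your own domain, and it assumes Google's site: and filetype: operators are combined in a single query.

```python
def document_search_queries(domain, filetypes=("doc", "pdf")):
    """Build one Google query per file type, restricted to a single domain.

    The domain and the default file types are placeholders for illustration.
    """
    return [f"site:{domain} filetype:{ext}" for ext in filetypes]


# Print the queries so they can be pasted into a Google search box.
for query in document_search_queries("example.com"):
    print(query)
```

Pasting each printed query into Google lists the documents of that type it has indexed for the domain.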

The website itself was protected, and the area for accessing the documents required a username and password. What wasn't protected was the directory in which the institution stored the documents. Adding insult to injury, the bank had not followed the common web practice of placing a robots.txt file in the root directory of its website.

Several levels of security simply must be in place. I’m certainly not a security expert, but I know that you have to protect your website at the network level and at the application level as well. Your network operations team needs to do its part by locking down your directories, installing patches, and configuring and monitoring firewalls.

And developers need to do their part in securing the applications that require it. A very simple step is to include a robots.txt file in the root directory of the web server. I’m not speaking of the root directory of the application; I'm talking about the root of the web server.

The robots.txt file must reside in the root of the domain and must be named "robots.txt". A robots.txt file located in a subdirectory isn't valid, because bots only check for this file in the root of the domain. For instance, "http://www.example.com/robots.txt" is a valid location, but "http://www.example.com/mysite/robots.txt" is not.
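The location rule can be illustrated with a short Python sketch. The helper name robots_txt_url is hypothetical; it only shows that, whatever page you start from, the file a bot looks for sits at the root of that page's domain.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the only robots.txt location a bot checks for this page's domain."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/mysite/index.html"))
# prints http://www.example.com/robots.txt
```

A page under /mysite/ still maps to the robots.txt at the domain root, which is why a copy placed in /mysite/ is never consulted.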

A robots.txt file can take many forms, and you have a lot of flexibility in choosing which directories or files you want search engines to index and which you do not. You do not use a robots.txt file to ensure that pages are indexed by search engines; you use it to define the files and directories you do not want them to crawl. Keep in mind that the file is a request that well-behaved crawlers honor, not an access control, so it complements locking down the directory on the server rather than replacing it. This is one simple but very important way to keep someone from finding the "For Your Eyes Only" documents on your company website through a search engine.
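As a rough illustration, here is what a small robots.txt might look like, together with a check of how a compliant crawler would read it, using Python's standard urllib.robotparser module. The directory names /admin/ and /client-documents/ are assumptions invented for this sketch, as is the helper is_crawlable; substitute the paths on your own server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the two directory names are placeholders.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /client-documents/
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Report whether a compliant crawler may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(SAMPLE_ROBOTS_TXT, "http://www.example.com/admin/login.html"))  # False
print(is_crawlable(SAMPLE_ROBOTS_TXT, "http://www.example.com/index.html"))  # True
```

The "User-agent: *" line applies the rules to every crawler, and each Disallow line names a path prefix that crawlers are asked to skip. Anything not listed stays open to indexing.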

About the Author:
Ben Cortese is a developer and business analyst for the financial industry and enjoys developing websites through MerchantWeb Marketing.

Copyright 2007.



Posted on Wednesday, March 14 @ 10:13:04 EDT by Admin
 