Blocking content from Google.
Several options are available for blocking content from Google: password protection, robots.txt, and the noindex meta tag (not to mention deleting the page!).
Blocking Google via password-protection.
Google and other spiders can't access password-protected pages, so this is a very effective way to keep your personal information from appearing in Google search or, for that matter, any search engine. If you are using Apache Web Server, you can achieve this by editing your .htaccess file. There are also many tools available that will do this for you.
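To illustrate why password protection keeps spiders out, here is a minimal Python sketch of the check a server performs under HTTP Basic authentication. It is only an illustration, not how Apache implements it: the function names `is_authorized` and `status_for` and the sample credentials are hypothetical. A crawler sends no credentials, so it receives a 401 response and never sees the page content.

```python
import base64
import hmac
from typing import Optional


def is_authorized(auth_header: Optional[str], username: str, password: str) -> bool:
    """Return True only if the Authorization header carries the expected Basic credentials."""
    if not auth_header:
        return False  # spiders send no credentials, so they end up here
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        # Basic credentials are "username:password", base64-encoded
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    expected = f"{username}:{password}"
    # Constant-time comparison of the supplied and expected credentials
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


def status_for(auth_header: Optional[str], username: str, password: str) -> int:
    """HTTP status a protected page would answer with (hypothetical helper)."""
    return 200 if is_authorized(auth_header, username, password) else 401
```

A request without an Authorization header, which is what a spider sends, gets `status_for(None, "alice", "s3cret") == 401`.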
Blocking Google via robots.txt.
Files and directories can be blocked via robots.txt. The robots.txt file is uploaded to the root of your host (you will naturally need access). Remember that robots.txt is not a guarantee: if Google discovers the URL on other sites, it may still add it to the index.
Example of robots.txt that allows all spiders to crawl all files and directories.
User-agent: *
Disallow:
Example of robots.txt that stops all spiders from crawling all files and directories.
User-agent: *
Disallow: /
The forward slash disallows all robots (it blocks all files and directories from all robots).
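One way to check how a spider would read the two rule sets described above is Python's standard `urllib.robotparser` module. The sketch below is only illustrative: the constants `ALLOW_ALL` and `BLOCK_ALL`, the helper `can_crawl`, and the example.com URL are assumptions made for this example, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value blocks nothing
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# A single forward slash matches every path on the host
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/private/page.html"  # hypothetical URL
    print(can_crawl(ALLOW_ALL, "Googlebot", page))
    print(can_crawl(BLOCK_ALL, "Googlebot", page))
```

Keep in mind that this only tells you whether crawling is permitted. It says nothing about whether a URL discovered elsewhere ends up in the index.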
Blocking Google via robots meta tag.
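When Google comes across the noindex meta tag, it will drop that page from its search results completely, regardless of whether other sites link to it. Google has to be able to crawl the page to see the tag, so the page must not also be blocked in robots.txt.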
Example of robots meta tag, blocking all spiders from indexing and following links.
<meta name="robots" content="noindex,nofollow" />
Example of robots meta tag, allowing all spiders to index and follow links.
<meta name="robots" content="index,follow" />
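To confirm which directives a page actually serves, the robots meta tag can be read back with Python's standard `html.parser` module. This is a rough sketch under simple assumptions: the class `RobotsMetaParser` and the helper `robots_directives` are hypothetical names, and only the generic `robots` name is handled, not crawler-specific names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here as well
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex,nofollow"
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

For example, `"noindex" in robots_directives(page_html)` indicates that the page asks spiders not to index it.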