Google 200+ ranking and crawling signals
Some would go as far as to say it is a known fact that Google use over 200 signals when weighing a site in the search results, and that it's a known fact they tweak and update the algorithms around 500 to 600 times a year in their ongoing quest to keep ahead of other search engines and stay ahead of black hat SEOs.
SEO myths and misconceptions aside, I personally think there are far more than 200 factors Google use when crawling your site; their algorithm updates alone exceed that figure.
Below I have compiled a list of the so-called 200+ factors Google use in their algorithm when ranking/weighing your site in the search results.
Responsive web design
Responsive web design has now become a ranking signal (actually there are a bunch of mobile signals) which is now part of the mobile algorithm and will come into effect on 21 April. Google plan on displaying only mobile-friendly sites in the mobile search results, which could have a huge impact on your traffic if you're reliant on mobile users.
Below are signals they will be looking for.
- Configure the viewport - Elements outside the viewport result in the content being wider than the screen, forcing the user to zoom. To fix the issue, place <meta name="viewport" content="width=device-width, initial-scale=1"> in the head section of your document and configure your CSS accordingly.
- Avoid fixed px widths; rather use %, em or other relative CSS units. Floating and absolutely positioned elements should also be avoided on small screens for usability reasons, since they may trigger scrolling in two directions
- Avoid landing page redirects
- Minify CSS
- Minify HTML
- Optimize images
- Prioritize visible content
- Use legible font sizes
- Size tap targets appropriately for touch screen phones
- Avoid plugins
- Compressing resources with gzip or deflate
- Reduce server response time
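The viewport and relative-unit points above can be sketched as follows (the class name and widths are illustrative, not prescriptive):

```html
<!-- Viewport tag in the head, so the page renders at the device width -->
<meta name="viewport" content="width=device-width, initial-scale=1">

<style>
  /* Relative units instead of fixed px widths */
  .content {
    width: 100%;       /* fills the screen, never wider than it */
    max-width: 60em;   /* keeps a readable line length on desktop */
    margin: 0 auto;
    font-size: 1em;    /* legible, scalable font size */
  }
  /* Images scale down instead of forcing horizontal scrolling */
  img { max-width: 100%; height: auto; }
</style>
```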
Use the tools below to test your site.
Mobile-Friendly Tester - https://www.google.com/webmasters/tools/mobile-friendly/
PageSpeed and user experience test - https://developers.google.com/speed/pagespeed/insights/
Or view the Google help article - http://googlewebmastercentral.blogspot.com/2015/02/finding-more-mobile-friendly-search.html
Google are now nailing sites that create doorway pages for the sole purpose of funnelling traffic; sites that aggregate content will also be nailed, and so will sites made solely for drawing affiliate traffic with no unique content. Google article - http://googlewebmastercentral.blogspot.com/2015/03/an-update-on-doorway-pages.html
Copied or scraped content from other sites is one of the main crawling factors Google use when weighing your site in the search results, and it will no longer rank (the copy days are long gone; the Panda algorithm targets sites with copied/scraped content and thin sites that have very little to offer)
If you have internal duplicate content, it is suggested you use rel=canonical. That by itself tells Google which is the preferred/original version.
And if quoting a site, it is suggested you use block quotes and link to the original source (preferably using nofollow on the link). Don't make the mistake of quoting an entire article or page; simply quote a snippet of the article, then ADD VALUE to what you have quoted. If you can't add value to a quote, DON'T QUOTE!!
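A minimal sketch of the quoting advice above (the URL and text are placeholders):

```html
<!-- Quote only a snippet, credit the source with a nofollow link, then add value -->
<blockquote cite="http://example.com/original-article">
  <p>Only a short snippet of the original article goes here.</p>
</blockquote>
<p>Source: <a href="http://example.com/original-article" rel="nofollow">the original article</a></p>
<p>Your own commentary, adding value to the quote, goes here.</p>
```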
Google have become extremely strict when it comes to copied content, even going as far as to create the Panda algorithm, which mostly focuses on thin/copied/scraped content and quality versus junk. If you can't write your own content, then get someone else to write it for you. Copy sites will simply no longer rank; Google will send the user to the original source while filtering your site in the search results.
And if excessive (duplicate sites, spam scraping), Google will even take manual spam action against your site, totally removing it from the index. SO DON'T SCRAPE OR COPY!!
Thin pages with very little or no content are likely to be a drag on your entire site and need to be avoided. Google's motto is "content is king" (original, informative, useful, unique content)
Title tags are without a doubt an important factor when crawling and weighing each page. Title tags are more likely to appear in the search results if they are relevant to the query used by the visitor.
It is advised to use around 60 characters in the title and to concentrate on words and variants rather than keyword density or repeating words or phrases.
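A hedged example of a descriptive title within the suggested length (the wording is illustrative):

```html
<head>
  <!-- Around 60 characters, descriptive, unique per page, no repeated keywords -->
  <title>Responsive Web Design Checklist: Viewport, Fonts and Speed</title>
</head>
```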
This brings me to an earlier update, 12 January 2012 to be exact: the algorithm update being "better page titles in search results". Google have always stated that titles and descriptions should be unique, descriptive and on the page, giving the visitor a true reflection of the page.
The "better page titles in search results" algorithm determines which title and description to display by generating alternative titles for the user to recognize relevant pages to the queries used.
Usually they will display alternative titles if the Webmaster forgets to use them or if they are spammy and non-descriptive. An example of non-descriptive would be "home". They also tend to display alternative titles when the Webmaster repeats them or uses only minor variations. Furthermore, they will replace titles with more concise, descriptive titles if the original is difficult to read, spammy or unnecessarily long.
Google article - http://googlewebmastercentral.blogspot.com/2012/01/better-page-titles-in-search-results.html
Description meta tag
Google can generate descriptions and titles for search snippets, but at times they display non-relevant snippets in the search results. It is advised to use informative descriptions on important pages, always making sure the description keywords are relevant to the page content.
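A sketch of an informative description meta tag (the wording is illustrative; keep it relevant to the actual page content):

```html
<head>
  <!-- Informative, page-specific description for the search snippet -->
  <meta name="description"
        content="A checklist of the mobile-friendly signals Google look for: viewport configuration, legible fonts, tap targets and page speed.">
</head>
```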
Keyword meta tag
Google no longer use the keyword tag, but could use it as a spam indicator. It is a known fact that Bing use the meta keyword tag as a spam indicator. So DON'T spam the meta keyword tag.
The same would apply to spamming the link title and any other non-visible tag, whether it be in the head or body section of your document.
Google have on numerous occasions advised Webmasters to use header tags; they have, however, also warned about misusing them. Forms of misuse include excessive repetition, keyword stuffing, and disguising the H1 tag with CSS to make it look like normal text, or for that matter, invisible.
Keywords in the conversation
On-page keywords play an important role in optimization, specifically when they match the visitor's query. It is also advised to use latent semantic keywords whilst targeting the page niche. Remember that excessive keywords could trigger an algorithm filter. Write for the user and NOT the crawler, and ignore keyword density.
Page site speed and load time
Page speed is one of Google's 200 algorithm ranking factors and will affect your site's rankings. Bad page speed could result from excessive scripts or from your server.
Matt Cutts has confirmed site speed as one of the 200 crawling factors/signals, saying the following:
"I pointed out that we still put much more weight on factors like relevance, topicality, reputation, value-add, etc. — all the factors that you probably think about all the time. Compared to those signals, site speed will carry much less weight"
Google weigh your images on the information received (the spider can't see images as we do) and are reliant on the following: alternative text, titles, captions, image URLs and the text around the image.
The more information you give, the more chance you stand of ranking in the image search; the more you spam the keywords, the greater the filter.
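A sketch of an image marked up with the signals listed above (the filename, alt text and caption are illustrative; describe the image, don't stuff keywords):

```html
<figure>
  <!-- Descriptive alt text, title and caption give the crawler context -->
  <img src="/images/red-running-shoes.jpg"
       alt="Pair of red running shoes on a wooden floor"
       title="Red running shoes">
  <figcaption>The red running shoes from our spring catalogue.</figcaption>
</figure>
```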
Yes, you read that correctly: Google give weight to iframes (believe it or not). Google credit you for accommodating the blind visitor who can't view iframes, so use it to your advantage.
Google pull snippets from iframes and display them in the search results. Use it wisely, taking advantage of them weighing those keywords and displaying them in the snippets, but at the same time, don't be excessive with the targeted keywords. AN IFRAME THOUGHT FOR THE DAY!! ;)
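One way to accommodate visitors and crawlers that can't render the frame is fallback content between the iframe tags, sketched below (the URL is a placeholder; note that browsers which do support iframes simply ignore this fallback text):

```html
<iframe src="http://example.com/weekly-specials.html" width="600" height="400">
  <!-- Fallback for user agents that cannot display iframes -->
  <p>This frame shows our weekly specials.
     <a href="http://example.com/weekly-specials.html">View the weekly specials here.</a></p>
</iframe>
```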
Spelling and grammar
Spelling is most definitely one of the 200 factors used when weighing your site, specifically when it comes to spun content that doesn't make sense.
Fresh content and updating
Google can detect updated pages and fresh content; a dormant site could slip in the rankings, while an updated site would climb. Freshness is extremely important and a big part of the algorithm updates.
How frequently pages are updated with significant content will determine their authority over other pages on your site and in the search results.
The more you link internally to a page, the more important the page becomes in relation to less linked pages.
In general, high authority sites push more link juice than low PR sites. But that is not always the case; it depends on the relevance of the link. Furthermore, PageRank is one of the most misunderstood and misused metrics when it comes to SEO and search results. PageRank is only one of over 200 signals that Google use to determine your site's authority in search.
Outbound followed links placed purely to pass PageRank are unnatural and against the Google guidelines, specifically excessive reciprocal links. Your site's standing on Google is based on analysis of the sites you link to and the sites that link back to you, meaning their quality and relevance count either towards or against your rankings in the search.
If by any chance you link to bad neighbourhoods (banned, malware, poor quality sites, etc), that by itself could and would reflect back on your website and perhaps even result in an unnatural link penalty from the Webspam team. Furthermore, engaging in link exchange schemes simply for the sake of cross linking, disregarding the quality of those links and the neighbourhood, is a violation of the Google guidelines.
Illicit unnatural linking and redirects
When you read text aloud and the links don't make sense in the "conversation" (links in a specific paragraph or sentence that are unrelated and completely different to the content), that would be an unnatural link. It is also very irritating for the visitor to be redirected to an advertising or porn site, or for that matter, one of your internal pages.
Meta refreshes are also used for sneaky redirects. Pages like those will be perceived as doorway pages, especially if the time set is less than 5 seconds. Redirects should instead be achieved with a 301 redirect, which involves configuring the .htaccess file on an Apache web server. Meta redirects should be your last option to redirect.
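A minimal sketch of 301 redirects in an Apache .htaccess file (the paths and domain are placeholders):

```apacheconf
# Permanently redirect an old page to its new location (301 = moved permanently)
Redirect 301 /old-page.html http://www.example.com/new-page.html

# Or, with mod_rewrite, send the whole non-www host to the www version
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```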
Avoid all unnatural links that are intended to manipulate PageRank. Lets have a look at a few more unnatural links.
- Participating in link schemes
- Buying links to pass PageRank
- Excessive reciprocal links (you link to me, I link to you)
- Excessive percentage of repetitive anchor text targeting your keywords
- Multiple interlinking between your own sites, creating doorway pages
- Auto generated links from link directories, paid links
- Spammy forum links, duplicating forum posts in multiple forums
- Directory links that are spammed on multiple pages, irrelevant to your content
If you have too many affiliate links, they could hurt your site.
Links from bad neighbourhoods
Bad neighbourhood links can hurt your site. In Webmaster Tools you can disavow untrusted links by using the disavow tool.
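The disavow tool accepts a plain text file listing domains or individual URLs, one per line; a sketch of the format (the domains are placeholders):

```
# Spammy directory that refused to remove our links
domain:spammy-directory.example.com
# A single untrusted page
http://bad-neighbourhood.example.com/paid-links.html
```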
Avoid excessively long URLs; they tend to look spammy and are difficult to remember. It is, however, SEO friendly to use keywords in URLs.
Link relevant pages together, the more relevant links pointing to a page, the higher its authority.
Excessive links on a page
Avoid excessive links on a page; they are distracting and they obscure the page content.
Google have repeatedly mentioned this. It's best to have sub-navigation menus rather than one big menu on each page, the sub-navigation links naturally all being relevant to each other.
Sitewide links no longer count individually; they are compressed to count as a single link.
Forum profile links
Due to forum spam, it is believed Google now devalue forum links, and it is a known fact they take manual action against sites that participate in such behaviour.
Keep pages close to the root of your domain (your index page); pages hidden deep may not be found and hence will carry less authority. Use a sitemap for hard-to-find pages.
Franchise and store owners linking
Large companies and franchises tend to hide information behind a search form or POST request; don't do it (search engines can't read forms).
If you have a bunch of stores or franchises, consider making a web page for each store that lists the store's address, phone number, etc. (a landing page for search engines and visitors).
Make sure you have an HTML sitemap pointing to those pages with regular HTML links.
If not used, or if misused, canonical can result in pages being dropped or classed as copy. Canonical allows you to specify the preferred version of your website, or for that matter, of similar pages on your website.
You would have a canonical issue if Google index both the www and non-www versions of your website. Most people would consider these URLs the same, but they are not.
Technically speaking, those URLs are different and Google treat them as different URLs. When that happens, it is advised you 301 redirect to a preferred version rather than using rel=canonical.
By 301 redirecting, you consolidate link popularity to the preferred version. Pages that have similar content can specify the canonical version by using rel="canonical" in the head section of those pages. The pages need to point to the preferred version; PageRank and other signals will also be transferred to the canonical page.
Google also take rel="canonical" as a strong hint, taking your preference into account when determining the most relevant page to show in search. Updating internal links to point to a single canonical page is also important; that by itself ensures optimal canonicalization.
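A sketch of the canonical link element as described above (the URL is a placeholder; it goes on the duplicate or similar page and points at the preferred version):

```html
<head>
  <!-- Tell Google which version of this content is preferred -->
  <link rel="canonical" href="http://www.example.com/preferred-page.html">
</head>
```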
Avoid using crappy platforms like Joomla and WordPress, and if you do use them, keep the plugins and the platform updated. In Webmaster Tools, Google now warn Webmasters to update due to possible malware and hacking.
Page layout is without a doubt one of the 200 factors and is extremely important. The main content needs to be immediately visible and not obscured by pop-ups, ads, roll-overs, navigation menus and other crap. And the page needs to load fast.
Hence the Page Layout Algorithm, which is part of the 200+ factors Google use when ranking your site. This factor affects sites that persistently push content down by placing blocks of ads above the page content, demoting MFA (made for advertising) sites that concentrate on ads and not content.
Contact us page
The contact us page needs to have a substantial amount of contact info that matches Whois registration.
Terms of service and privacy page
This page tells Google that the site is trustworthy; it helps to write your own terms and privacy policy rather than copy them.
Make sure your site is mobile friendly.
YouTube does without a doubt get preference in the search due to Google owning it.
Reputable review sites
You can gain from reputable review sites.
Google do crawl parked domains; they just don't index them. From an SEO point of view, it's best to add content to the parked domain prior to launching; this should speed up the indexing process.
HTTPS as a ranking signal
Google have now started to use HTTPS as one of the 200+ ranking signals when crawling your site. It does, however, carry very little weight in comparison to the other crawling factors, but that could change in the near future.
Google article - http://googlewebmastercentral.blogspot.com/2014/08/https-as-ranking-signal.html
Webmaster SEO Help
AdWords are NOT one of the factors related to weighing your site in the search results, AdWords customers do not receive special treatment in organic search.
Myth: Human algorithm filters
It is a known fact that humans evaluate websites before and after an algorithm update. Most Webmasters assume these quality raters are part of the algorithm filter that filters sites from search.
That is not true. The quality raters only evaluate the sites and the algorithm results, giving feedback to the algorithm team; that feedback can be, and does get, used in future algorithms, along with info from feedback forms and spam reports.
Myth: Algorithm reconsideration required
Reconsideration requests for algorithm updates/filters are not necessary and will not work; algorithms are handled by programs, not humans.
The program itself algorithmically determines your rankings.
Myth: Immediate recovery after an algo filter
Recovery will not occur immediately after you make changes to your site; it will only occur after an algorithm refresh or update, which could take a month to 3 months, depending on when they re-run the algorithm.
A site very seldom fully recovers; your new rankings will depend on the changes made.
Myth: robots.txt prevents pages from being indexed
That is incorrect. robots.txt only prevents crawling; if the crawler picks up links to those pages on other sites, it can still index the URLs.
If you want to permanently block pages from the index, use the noindex meta tag in the head section of your document (and make sure the page is not blocked by robots.txt, or the crawler will never see the tag).
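The noindex tag looks like this (it must sit in the head of the page you want kept out of the index, and the page must remain crawlable so the tag can be seen):

```html
<head>
  <!-- Keeps this page out of the search index -->
  <meta name="robots" content="noindex">
</head>
```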
Myth: Dynamic URLs cannot be crawled
That is a myth; Google crawl both static and dynamic URLs and can interpret the different parameters.
Crawling issues will only arise if you remove information by hiding parameters and making URLs look static.
Myth: 301 redirect 404 not found URLs
No, that is a crawling myth. Blanket-redirecting 404 pages (for example, sending them all to your homepage) will be seen as soft 404s and not real 404s.