Google today announced its search engine has started indexing HTTPS pages by default. More specifically, Google Search now crawls HTTPS equivalents of HTTP pages, even when the former are not linked to from any page.
HTTPS is a more secure version of the HTTP protocol used on the Internet to connect users to websites. Secure connections are widely considered a necessary measure to protect users from eavesdropping, man-in-the-middle attacks, and other forms of content tampering. In August 2014, Google’s search algorithm started prioritizing encrypted sites in search results with a slight ranking boost.
Google’s decision to index more HTTPS pages is thus a natural progression. When two URLs from the same domain appear to have the same content, and the “s” is the only difference, Google will choose to index the HTTPS URL as long as:
- It doesn’t contain insecure dependencies.
- It isn’t blocked from crawling by robots.txt.
- It doesn’t redirect users to or through an insecure HTTP page.
- It doesn’t have a rel="canonical" link to the HTTP page.
- It doesn’t contain a noindex robots meta tag.
- It doesn’t have on-host outlinks to HTTP URLs.
- The sitemap lists the HTTPS URL, or doesn’t list the HTTP version of the URL.
- The server has a valid TLS certificate.
These eight requirements make perfect sense. Google doesn’t want to send users to an HTTPS page that the owner is pointing away from, nor does it want to send users to an HTTPS page that isn’t properly secured.
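Two of the signals in the list above live directly in a page’s HTML: the rel="canonical" link and the noindex robots meta tag. As a rough illustration of what those signals look like, here is a minimal Python sketch that scans a page’s markup for them. The function name and regexes are hypothetical helpers for this article, not a description of how Googlebot actually parses pages.

```python
import re

def https_blocking_signals(html: str) -> list:
    """Return the on-page signals (hypothetical checker, not Googlebot)
    that would keep an HTTPS URL out of the index."""
    signals = []
    # A rel="canonical" link pointing back at the HTTP version of the page.
    # (Sketch only: assumes rel appears before href in the tag.)
    canonical = re.search(
        r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\'](http://[^"\']+)',
        html, re.IGNORECASE)
    if canonical:
        signals.append("canonical points to HTTP: " + canonical.group(1))
    # A robots meta tag containing "noindex".
    if re.search(
            r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex',
            html, re.IGNORECASE):
        signals.append("noindex robots meta tag")
    return signals
```

A page that returns an empty list from a checker like this would clear those two hurdles; the other conditions (robots.txt, redirects, mixed content, the TLS certificate) have to be checked outside the HTML.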
Google is also encouraging webmasters to redirect their HTTP site to the HTTPS version and to implement the HTTP Strict Transport Security (HSTS) header on their server, which instructs browsers to always request the HTTPS version. This will tell “other search engines” that don’t prioritize HTTPS over HTTP to do just that.
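Concretely, those two recommendations mean the HTTP URL should permanently redirect to its HTTPS equivalent, and the HTTPS response should carry a Strict-Transport-Security header. The sketch below, a hypothetical helper written for this article, checks both given a response’s status, Location header, and the headers of the HTTPS page:

```python
def follows_https_best_practice(http_status, http_location, https_headers):
    """Check the two recommendations above (hypothetical helper):
    the HTTP URL should permanently redirect to an HTTPS URL, and the
    HTTPS response should carry a Strict-Transport-Security header."""
    redirects = (http_status in (301, 308)
                 and http_location is not None
                 and http_location.startswith("https://"))
    # A typical HSTS header: "Strict-Transport-Security: max-age=31536000"
    hsts = "max-age=" in https_headers.get("Strict-Transport-Security", "")
    return redirects and hsts
```

For example, a site that answers HTTP requests with a 301 to `https://…` and serves `Strict-Transport-Security: max-age=31536000` on the HTTPS side would pass this check.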
The end goal is of course to only point to HTTPS sites and eventually not have any HTTP websites at all. With its dominant position on the web and mobile, Google has the power to slowly but surely tip the scales towards such a future.