EsheleD Marketing & Technology

16Oct/140

Best practices for XML sitemaps & RSS/Atom feeds

Webmaster level: intermediate-advanced

Submitting sitemaps can be an important part of optimizing websites. Sitemaps enable search engines to discover all pages on a site and to download them quickly when they change. This blog post explains which fields in sitemaps are important, when to use XML sitemaps and RSS/Atom feeds, and how to optimize them for Google.

Sitemaps and feeds

Sitemaps can be in XML sitemap, RSS, or Atom formats. The important difference between these formats is that XML sitemaps describe the whole set of URLs within a site, while RSS/Atom feeds describe recent changes. This has important implications:

  • XML sitemaps are usually large; RSS/Atom feeds are small, containing only the most recent updates to your site.
  • XML sitemaps are downloaded less frequently than RSS/Atom feeds.

For optimal crawling, we recommend using both XML sitemaps and RSS/Atom feeds. XML sitemaps will give Google information about all of the pages on your site. RSS/Atom feeds will provide all updates on your site, helping Google to keep your content fresher in its index. Note that submitting sitemaps or feeds does not guarantee the indexing of those URLs.

Example of an XML sitemap:

<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
   <loc>http://example.com/mypage</loc>
   <lastmod>2011-06-27T19:34:00+01:00</lastmod>
   <!-- optional additional tags -->
 </url>
 <url>
   ...
 </url>
</urlset>

Example of an RSS feed:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
 <channel>
   <!-- other tags -->
   <item>
     <!-- other tags -->
     <link>http://example.com/mypage</link>
     <pubDate>Mon, 27 Jun 2011 19:34:00 +0100</pubDate>
   </item>
   <item>
     ...
   </item>
 </channel>
</rss>

Example of an Atom feed:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 <!-- other tags -->
 <entry>
   <link href="http://example.com/mypage" />
   <updated>2011-06-27T19:34:00+01:00</updated>
   <!-- other tags -->
 </entry>
 <entry>
   ...
 </entry>
</feed>

“Other tags” refers to both the optional and the required tags defined by the respective standards. We recommend that you specify the required tags for Atom/RSS, as they will help you to appear on other properties that might use these feeds, in addition to Google Search.

Best practices

Important fields

XML sitemaps and RSS/Atom feeds are, at their core, lists of URLs with metadata attached to them. The two most important pieces of information for Google are the URL itself and its last modification time:

URLs

URLs in XML sitemaps and RSS/Atom feeds should adhere to the following guidelines:

  • Only include URLs that can be fetched by Googlebot. A common mistake is including URLs disallowed by robots.txt, which Googlebot cannot fetch, or URLs of pages that don't exist.
  • Only include canonical URLs. A common mistake is to include URLs of duplicate pages. This increases the load on your server without improving indexing.

Last modification time

Specify a last modification time for each URL in an XML sitemap and RSS/Atom feed. The last modification time should be the last time the content of the page changed meaningfully. If a change is meant to be visible in the search results, then the last modification time should be the time of this change.
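
One way to keep the last modification time tied to meaningful changes is to store a digest of each page's main content and move the timestamp only when that digest changes. The following is a minimal sketch under that assumption; the function name `next_lastmod` and the whitespace normalization are illustrative choices, not something the post prescribes.

```python
import hashlib
from datetime import datetime, timezone


def next_lastmod(prev_digest, prev_lastmod, main_content, now):
    """Return (digest, lastmod); lastmod moves only when the content digest changes."""
    # Collapse whitespace so purely cosmetic reformatting does not count as a change.
    normalized = " ".join(main_content.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    if digest == prev_digest:
        return prev_digest, prev_lastmod
    return digest, now


first_seen = datetime(2011, 6, 27, 18, 34, tzinfo=timezone.utc)
later = datetime(2011, 7, 1, 9, 0, tzinfo=timezone.utc)

# First crawl of the page: there is no previous digest, so lastmod is set.
digest, lastmod = next_lastmod(None, None, "Hello  world", first_seen)
# Only whitespace differs: the digest and lastmod stay the same.
same_digest, same_lastmod = next_lastmod(digest, lastmod, "Hello world\n", later)
# The text itself changed: lastmod moves to the time of the change.
new_digest, new_lastmod = next_lastmod(digest, lastmod, "Hello again", later)
```

What counts as "meaningful" depends on the site; a real implementation would probably hash only the main content and skip navigation, ads, and timestamps.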

  • XML sitemap uses <lastmod>
  • RSS uses <pubDate>
  • Atom uses <updated>

Be sure to set or update last modification time correctly:

  • Specify the time in the correct format: W3C Datetime for XML sitemaps, RFC3339 for Atom and RFC822 for RSS
  • Only update modification time when the content changed meaningfully
  • Don’t set the last modification time to the current time whenever the sitemap or feed is served.
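
The date formats listed above can be produced with Python's standard library. This is a small sketch rather than a required approach; the timestamp is the one used in the examples earlier in this post. Note that `email.utils.format_datetime` emits the RFC 2822 form with a four-digit year, which matches the `<pubDate>` example above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# A timezone-aware timestamp: 27 June 2011, 19:34 at UTC+01:00.
modified = datetime(2011, 6, 27, 19, 34, 0, tzinfo=timezone(timedelta(hours=1)))

# W3C Datetime for <lastmod> in XML sitemaps; the same string is valid RFC 3339 for Atom <updated>.
w3c_or_rfc3339 = modified.isoformat()

# RFC 822 style date for RSS <pubDate>.
rfc822 = format_datetime(modified)

print(w3c_or_rfc3339)  # 2011-06-27T19:34:00+01:00
print(rfc822)          # Mon, 27 Jun 2011 19:34:00 +0100
```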

XML sitemaps

XML sitemaps should contain URLs of all pages on your site. They are often large and update infrequently. Follow these guidelines:

  • For a single XML sitemap: update it at least once a day (if your site changes regularly) and ping Google after you update it.
  • For a set of XML sitemaps: maximize the number of URLs in each XML sitemap. The limit is 50,000 URLs or a maximum size of 10MB uncompressed, whichever is reached first. Ping Google for each updated XML sitemap (or once for the sitemap index, if that's used) every time it is updated. A common mistake is to put only a handful of URLs into each XML sitemap file, which usually makes it harder for Google to download all of these XML sitemaps in a reasonable time.
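
Filling each file up to the URL limit can be sketched as follows. This is an illustrative helper, not an official tool: `build_sitemaps` is a hypothetical name, and the sketch only enforces the 50,000-URL limit, so the 10MB size limit would still need a separate check on the length of each returned document.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def build_sitemaps(entries, max_urls=MAX_URLS_PER_SITEMAP):
    """Split (url, lastmod) pairs into as few <urlset> documents as possible."""
    documents = []
    for start in range(0, len(entries), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for loc, lastmod in entries[start:start + max_urls]:
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = loc
            ET.SubElement(url, "lastmod").text = lastmod
        # Serialize to bytes, including the XML declaration.
        documents.append(ET.tostring(urlset, encoding="utf-8", xml_declaration=True))
    return documents


# Small demonstration with an artificially low limit of 2 URLs per file.
entries = [
    (f"http://example.com/page{i}", "2011-06-27T19:34:00+01:00") for i in range(5)
]
docs = build_sitemaps(entries, max_urls=2)
print(len(docs))  # 3
```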

RSS/Atom

RSS/Atom feeds should convey recent updates of your site. They are usually small and updated frequently. For these feeds, we recommend:

  • When a new page is added or an existing page meaningfully changed, add the URL and the modification time to the feed.
  • In order for Google to not miss updates, the RSS/Atom feed should have all updates in it since at least the last time Google downloaded it. The best way to achieve this is by using PubSubHubbub. The hub will propagate the content of your feed to all interested parties (RSS readers, search engines, etc.) in the fastest and most efficient way possible.
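
As a rough illustration of notifying a hub, the sketch below builds (but does not send) a publish notification. The `hub.mode` and `hub.url` parameter names follow the PubSubHubbub publishing convention as commonly documented; the hub and feed URLs are hypothetical placeholders, and your hub's documentation should be treated as the authority.

```python
from urllib.parse import urlencode
from urllib.request import Request

HUB_URL = "https://hub.example.com/"       # hypothetical hub endpoint
FEED_URL = "http://example.com/feed.atom"  # hypothetical feed (topic) URL


def build_publish_ping(hub_url, feed_url):
    """Build the form-encoded POST that tells a hub the feed has new content."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


ping = build_publish_ping(HUB_URL, FEED_URL)
# To actually send the notification: urllib.request.urlopen(ping)
```

The feed itself also needs to declare its hub so that subscribers can find it.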

Generating both XML sitemaps and Atom/RSS feeds is a great way to optimize crawling of a site for Google and other search engines. The key information in these files is the canonical URL and the time of the last modification of pages within the website. Setting these properly, and notifying Google and other search engines through sitemap pings and PubSubHubbub, will allow your website to be crawled optimally, and represented accordingly in search results.

If you have any questions, feel free to post them here, or to join other webmasters in the webmaster help forum section on sitemaps.

Posted by Alkis Evlogimenos, Google Feeds Team

11Sep/140

An update to the Webmaster Tools API

Webmaster level: advanced

Over the summer the Webmaster Tools team has been cooking up an update to the Webmaster Tools API. The new API is consistent with other Google APIs, makes it easier to authenticate for apps or web-services, and provides access to some of the main features of Webmaster Tools.

If you've used other Google APIs, getting started with the new Webmaster Tools API will be easy! We have examples for Python, Java, as well as OACurl (for fans of command lines).

This API allows you to:

  • list, add, or remove sites from your account (you can currently have up to 500 sites in your account)
  • list, add, or remove sitemaps for your websites
  • get warning, error, and indexed counts for individual sitemaps
  • get a time-series of all kinds of crawl errors for your site
  • list crawl error samples for specific types of errors
  • mark individual crawl errors as "fixed" (this doesn't change how they're processed, but can help simplify the UI for you)

We'd love to see what you're building with our APIs! Feel free to link to your projects in the comments below. Should you have any questions about the usage of the API, feel free to post in our help forum as well.

Posted by John Mueller, fan of long command lines, Google Zürich

9Sep/140

Webmaster Academy now available in 22 languages

Webmaster level: Beginner

Today, the new Webmaster Academy goes live in 22 languages! New or beginner webmasters speaking a multitude of languages can now learn the fundamentals of making a great site, providing an enjoyable user experience, and ranking well in search results. And if you think you’re already familiar with these topics, take the quizzes at the end of each module to prove it :) .

So give Webmaster Academy a read in your preferred language and let us know in the comments or help forum what you think. We’ve gotten such great and helpful feedback after the English version launched this past March so we hope this straightforward and easy-to-read guide can be helpful (and fun!) to everyone.

Let’s get great sites and searchable content up and running around the world.

Posted by Mary Chen, Webmaster Outreach

6Sep/140

An improved search box within the search results

Webmaster level: All

Today you’ll see a new and improved sitelinks search box. When shown, it will make it easier for users to reach specific content on your site, directly through your own site-search pages.

What’s this search box and when does it appear for my site?

When users search for a company by name—for example, [Megadodo Publications] or [Dunder Mifflin]—they may actually be looking for something specific on that website. In the past, when our algorithms recognized this, they'd display a larger set of sitelinks and an additional search box below that search result, which let users do site: searches over the site straight from the results, for example [site:example.com hitchhiker guides].

This search box is now more prominent (above the sitelinks), supports Autocomplete, and—if you use the right markup—will send the user directly to your website's own search pages.

How can I mark up my site?

You need to have a working site-specific search engine for your site. If you already have one, you can let us know by marking up your homepage as a schema.org/WebSite entity with the potentialAction property of the schema.org/SearchAction markup. You can use JSON-LD, microdata, or RDFa to do this; check out the full implementation details on our developer site.
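
A JSON-LD version of this markup might be generated as in the sketch below. The `target` and `query-input` property names are given as best recalled from the developer documentation, and the example.com URLs and the helper name are placeholders, so check the full implementation details before deploying.

```python
import json


def sitelinks_searchbox_jsonld(site_url, search_url_template):
    """Return a <script> block describing the site's own search as a SearchAction."""
    markup = {
        "@context": "http://schema.org",
        "@type": "WebSite",
        "url": site_url,
        "potentialAction": {
            "@type": "SearchAction",
            # {search_term_string} is the placeholder for the user's query.
            "target": search_url_template,
            "query-input": "required name=search_term_string",
        },
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(markup, indent=2)
        + "\n</script>"
    )


snippet = sitelinks_searchbox_jsonld(
    "https://www.example.com/",
    "https://www.example.com/search?q={search_term_string}",
)
print(snippet)
```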

If you implement the markup on your site, users will have the ability to jump directly from the sitelinks search box to your site’s search results page. If we don’t find any markup, we’ll show them a Google search results page for the corresponding site: query, as we’ve done until now.

As always, if you have questions, feel free to ask in our Webmaster Help forum.

Posted by Mariya Moeva, Webmaster Trends Analyst, and Kaylin Spitz, Software Engineer

5Sep/140

Optimizing for Bandwidth on Apache and Nginx

Webmaster level: advanced

Everyone wants to use less bandwidth: hosts want lower bills, mobile users want to stay under their limits, and no one wants to wait for unnecessary bytes. The web is full of opportunities to save bandwidth: pages served without gzip, stylesheets and JavaScript served unminified, and unoptimized images, just to name a few.

So why isn't the web already optimized for bandwidth? If these savings are good for everyone then why haven't they been fixed yet? Mostly it's just been too much hassle. Web designers are encouraged to "save for web" when exporting their artwork, but they don't always remember.  JavaScript programmers don't like working with minified code because it makes debugging harder. You can set up a custom pipeline that makes sure each of these optimizations is applied to your site every time as part of your development or deployment process, but that's a lot of work.

An easy solution for web users is to use an optimizing proxy, like Chrome's. When users opt into this service their HTTP traffic goes via Google's proxy, which optimizes their page loads and cuts bandwidth usage by 50%.  While this is great for these users, it's limited to people using Chrome who turn the feature on and it can't optimize HTTPS traffic.

With Optimize for Bandwidth, the PageSpeed team is bringing this same technology to webmasters so that everyone can benefit: users of other browsers, secure sites, desktop users, and site owners who want to bring down their outbound traffic bills. Just install the PageSpeed module on your Apache or Nginx server [1], turn on Optimize for Bandwidth in your configuration, and PageSpeed will do the rest.

If you later decide you're interested in PageSpeed's more advanced optimizations, from cache extension and inlining to the more aggressive image lazyloading and defer JavaScript, it's just a matter of enabling them in your PageSpeed configuration.

Learn more about installing PageSpeed or enabling Optimize for Bandwidth.

Posted by Jeff Kaufman, Make the Web Fast

[1] If you're using a different web server, consider running PageSpeed on an Apache or Nginx proxy.  And it's all open source, with porting efforts underway for IIS, ATS, and others.

26Aug/140

#NoHacked: a global campaign to spread hacking awareness

Webmaster level: All

This June, we introduced a weeklong social campaign called #NoHacked. The goals for #NoHacked are to bring awareness to hacking attacks and offer tips on how to keep your sites safe from hackers.

We held the campaign in 11 languages on multiple channels including Google+, Twitter and Weibo. About 1 million people viewed our tips and hundreds of users used the hashtag #NoHacked to spread awareness and to share their own tips. Check them out below!

Posts we shared during the campaign:

https://plus.google.com/+GoogleWebmasters/posts/1BzXjgJMGFU

https://plus.google.com/+GoogleWebmasters/posts/TMhfwQG3p8P

https://plus.google.com/+GoogleWebmasters/posts/AcUS4WhF6LL

https://plus.google.com/+GoogleWebmasters/posts/DUTpSGmkBUP

https://plus.google.com/+GoogleWebmasters/posts/UjZRbySM5gM

Some of the many tips shared by users across the globe:

  • Pablo Silvio Esquivel from Brazil recommends that users not use pirated software (source)
  • Rens Blom from the Netherlands suggests using different passwords for your accounts, changing them regularly, and using an extra layer of security such as two-step authentication (source)
  • Дмитрий Комягин from Russia says to regularly monitor traffic sources, search queries and landing pages, and to look out for spikes in traffic (source)
  • 工務店コンサルタント from Japan advises everyone to choose a good hosting company that's knowledgeable in hacking issues and to set email forwarding in Webmaster Tools (source)
  • Kamil Guzdek from Poland advocates changing the default table prefix in wp-config to a custom one when installing a new WordPress site, to lower the risk of the database being hacked (source)

Hacking is still a surprisingly common issue around the world so we highly encourage all webmasters to follow these useful tips. Feel free to continue using the hashtag #NoHacked to share your own tips or experiences around hacking prevention and awareness. Thanks for supporting the #NoHacked campaign!

And in the unfortunate event that your site gets hacked, we’ll help you toward a speedy and thorough recovery:

Posted by your friendly #NoHacked helpers

7Aug/140

HTTPS as a ranking signal

Webmaster level: all

Security is a top priority for Google. We invest a lot in making sure that our services use industry-leading security, like strong HTTPS encryption by default. That means that people using Search, Gmail and Google Drive, for example, automatically have a secure connection to Google.

Beyond our own stuff, we’re also working to make the Internet safer more broadly. A big part of that is making sure that websites people access from Google are secure. For instance, we have created resources to help webmasters prevent and fix security breaches on their sites.

We want to go even further. At Google I/O a few months ago, we called for “HTTPS everywhere” on the web.

We’ve also seen more and more webmasters adopting HTTPS (also known as HTTP over TLS, or Transport Layer Security) on their websites, which is encouraging.

For these reasons, over the past few months we’ve been running tests taking into account whether sites use secure, encrypted connections as a signal in our search ranking algorithms. We've seen positive results, so we're starting to use HTTPS as a ranking signal. For now it's only a very lightweight signal — affecting fewer than 1% of global queries, and carrying less weight than other signals such as high-quality content — while we give webmasters time to switch to HTTPS. But over time, we may decide to strengthen it, because we’d like to encourage all website owners to switch from HTTP to HTTPS to keep everyone safe on the web.

In the coming weeks, we’ll publish detailed best practices (we’ll add a link to it from here) to make TLS adoption easier, and to avoid common mistakes. Here are some basic tips to get started:

  • Decide the kind of certificate you need: single, multi-domain, or wildcard certificate
  • Use 2048-bit key certificates
  • Use relative URLs for resources that reside on the same secure domain
  • Use protocol relative URLs for all other domains
  • Check out our Site move article for more guidelines on how to change your website’s address
  • Don’t block your HTTPS site from crawling using robots.txt
  • Allow indexing of your pages by search engines where possible. Avoid the noindex robots meta tag.
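
The two URL tips above (relative URLs on the same domain, protocol-relative URLs elsewhere) can be illustrated with a small rewriting helper. This is a sketch under simple assumptions: `scheme_neutral` is a hypothetical name, hosts are compared by exact match, and the same-domain case produces a root-relative URL.

```python
from urllib.parse import urlsplit, urlunsplit


def scheme_neutral(resource_url, page_host):
    """Rewrite an absolute http(s) URL so it no longer hard-codes the scheme."""
    parts = urlsplit(resource_url)
    if parts.scheme not in ("http", "https"):
        return resource_url  # already relative, or not a web URL
    if parts.netloc.lower() == page_host.lower():
        # Same host: drop scheme and host, keeping a root-relative URL.
        return urlunsplit(("", "", parts.path or "/", parts.query, parts.fragment))
    # Other host: drop only the scheme, giving a protocol-relative URL.
    return urlunsplit(("", parts.netloc, parts.path, parts.query, parts.fragment))


print(scheme_neutral("http://www.example.com/img/logo.png", "www.example.com"))
print(scheme_neutral("http://cdn.example.net/lib.js?v=2", "www.example.com"))
```

The first call prints `/img/logo.png` and the second prints `//cdn.example.net/lib.js?v=2`.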

If your website is already serving on HTTPS, you can test its security level and configuration with the Qualys Lab tool. If you are concerned about TLS and your site’s performance, have a look at Is TLS fast yet?. And of course, if you have any questions or concerns, please feel free to post in our Webmaster Help Forums.

We hope to see more websites using HTTPS in the future. Let’s all make the web more secure!

Posted by Zineb Ait Bahajji and Gary Illyes, Webmaster Trends Analysts

17Jul/140

Testing robots.txt files made easier

Webmaster level: intermediate-advanced

To crawl, or not to crawl, that is the robots.txt question.

Making and maintaining correct robots.txt files can sometimes be difficult. While most sites have it easy (tip: they often don't even need a robots.txt file!), finding the directives within a large robots.txt file that are or were blocking individual URLs can be quite tricky. To make that easier, we're now announcing an updated robots.txt testing tool in Webmaster Tools.

You can find the updated testing tool in Webmaster Tools within the Crawl section:

Here you'll see the current robots.txt file, and can test new URLs to see whether they're disallowed for crawling. To guide your way through complicated directives, it will highlight the specific one that led to the final decision. You can make changes in the file and test those too; you'll just need to upload the new version of the file to your server afterwards for the changes to take effect. Our developers site has more about robots.txt directives and how the files are processed.
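
For a quick local check outside Webmaster Tools, Python's standard `urllib.robotparser` can test URLs against a robots.txt file. This is only a rough stand-in: its matching rules may differ from Googlebot's, it does not highlight the deciding directive, and the robots.txt content below is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, user_agent="Googlebot"):
    """Return True if this parser considers the URL crawlable for the user agent."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("http://example.com/index.html"))           # True
print(is_allowed("http://example.com/private/report.html"))  # False
```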

Additionally, you'll be able to review older versions of your robots.txt file, and see when access issues block us from crawling. For example, if Googlebot sees a 500 server error for the robots.txt file, we'll generally pause further crawling of the website.

Since there may be some errors or warnings shown for your existing sites, we recommend double-checking their robots.txt files. You can also combine it with other parts of Webmaster Tools: for example, you might use the updated Fetch as Google tool to render important pages on your website. If any blocked URLs are reported, you can use this robots.txt tester to find the directive that's blocking them, and, of course, then improve that. A common problem we've seen comes from old robots.txt files that block CSS, JavaScript, or mobile content — fixing that is often trivial once you've seen it.

We hope this updated tool makes it easier for you to test & maintain the robots.txt file. Should you have any questions, or need help with crafting a good set of directives, feel free to drop by our webmaster's help forum!

Posted by Asaph Arnon, Webmaster Tools team

15Jul/140

Promoting modern websites for modern devices in Google search results

Webmaster level: all

A common annoyance for web users is when websites require browser technologies that are not supported by their device. When users access such pages, they may see nothing but a blank space or miss out on a large portion of the page's contents.

Starting today, we will indicate to searchers when our algorithms detect pages that may not work on their devices. For example, Adobe Flash is not supported on iOS devices or on Android versions 4.1 and higher, and a page whose contents are mostly Flash may be noted like this:

Developing modern multi-device websites

Fortunately, making websites that work on all modern devices is not that hard: websites can use HTML5 since it is universally supported, sometimes exclusively, by all devices. To help webmasters build websites that work on all types of devices regardless of the type of content they wish to serve, we recently announced two resources:

  • Web Fundamentals: a curated source for modern best practices.
  • Web Starter Kit: a starter framework supporting the Web Fundamentals best practices out of the box.

By following the best practices described in Web Fundamentals you can build a responsive web design, which has long been Google's recommendation for search-friendly sites. Be sure not to block any Googlebot from crawling the page assets (CSS, JavaScript, and images), whether through robots.txt or otherwise. Being able to access these external files fully helps our algorithms detect your site's responsive web design configuration and treat it appropriately. You can use the Fetch and render as Google feature in Webmaster Tools to test how our indexing algorithms see your site.

As always, if you need more help you can ask a question in our webmaster forum.

Posted by Keita Oda, Software Engineer, and , Webmaster Trends Analyst

15Jul/140

Troubleshooting hreflang annotations in Webmaster Tools

If you are targeting users in more than one country, chances are you've already heard about rel-alternate-hreflang. If you haven't: in short, this annotation enables Google and other search engines to serve the correct language or regional version of pages to searchers, which can lead to increased user satisfaction.


Making sure the deployed annotations are usable by search engines can be rather difficult, especially on sites with many pages, and site owners all around the world haven’t been shy telling us about this. Today we're releasing a feature that should make debugging rel-alternate-hreflang annotations much easier.


The Language Targeting section in the International Targeting feature enables you to identify two of the most common issues with hreflang annotations:
  • Missing return links: annotations must be confirmed from the pages they are pointing to. If page A links to page B, page B must link back to page A, otherwise the annotations may not be interpreted correctly.
    For each error of this kind we report where and when we detected them, as well as where the return link is expected to be.
  • Incorrect hreflang values: The value of the hreflang attribute must either be a language code in ISO 639-1 format such as "es", or a combination of language and country code such as "es-AR", where the country code is in ISO 3166-1 Alpha 2 format.
    In case our indexing systems detect language or country codes that are not in these formats, we provide example URLs to help you fix them.
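
Both kinds of issue can be approximated offline before deploying annotations. The sketch below is illustrative only: the function names and the input structure (a mapping from each page URL to the set of alternate URLs it points to) are assumptions, and the value check tests only the shape of the code ("es" or "es-AR"), not whether the codes actually appear in the ISO lists.

```python
import re

# Two-letter language code, optionally followed by "-" and a two-letter country code.
HREFLANG_SHAPE = re.compile(r"[A-Za-z]{2}(-[A-Za-z]{2})?")


def malformed_hreflang_values(values):
    """Return the values that do not look like 'es' or 'es-AR'."""
    return [v for v in values if not HREFLANG_SHAPE.fullmatch(v)]


def missing_return_links(annotations):
    """Return (page, expected_target) pairs where page fails to link back."""
    missing = []
    for page, alternates in annotations.items():
        for alternate in alternates:
            # The alternate must list the original page among its own alternates.
            if page not in annotations.get(alternate, set()):
                missing.append((alternate, page))
    return sorted(missing)


annotations = {
    "http://example.com/en": {"http://example.com/es"},
    "http://example.com/es": set(),  # no link back to the English page
}
print(missing_return_links(annotations))
print(malformed_hreflang_values(["es", "es-AR", "es_AR", "spa"]))
```

Here the first call reports that the Spanish page should link back to the English one, and the second flags `es_AR` and `spa`.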


Additionally, we've moved the geographic targeting setting to this part of Webmaster Tools, so that you can find all information relevant to international and multilingual targeting in the same place.


We hope you'll find this new feature useful and that it helps you to identify issues with the rel-alternate-hreflang implementation on your site. If you have comments or questions about the feature, please post in our Webmaster Help Forum.

Posted by Gary Illyes, Webmaster Trends
