
Google sitemap standard format

From : Unknown, View : 199, 2009-10-09 21:17:14
The Google SiteMap Protocol is a site map protocol introduced by Google itself. It builds on the earlier robots.txt protocol and extends it. According to Google's official guidelines, a site that adds a Google SiteMap file is easier for Google's robots to crawl and index, which improves the efficiency and accuracy of indexing the site's content. The file uses a simple XML format with six tags in total; the key tags are the link address, the update time, the update frequency and the index priority.
The Google SiteMap file format is as follows:
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>http://homepage.yesky.com</loc>
<lastmod>2005-06-03T04:20-08:00</lastmod>
<changefreq>always</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://homepage.yesky.com/300687.html</loc>
<lastmod>2005-06-02T20:20:36Z</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
</urlset>
XML tags
changefreq: how frequently the page content is updated
lastmod: when the page was last modified
loc: the permanent link address of the page
priority: the priority of the page relative to other pages
url: the parent tag of the previous four tags
urlset: the parent tag of the previous five tags
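
The nesting of these six tags can also be produced by code. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; the helper name build_sitemap and the tuple layout of its input are assumptions made for illustration, not part of any Google tool.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example file above.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(entries):
    """Build a sitemap from (loc, lastmod, changefreq, priority) tuples.

    build_sitemap is a hypothetical helper name chosen for this sketch.
    Returns the document as UTF-8 encoded bytes.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in element text when serializing.
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Calling build_sitemap([("http://homepage.yesky.com", "2005-06-03T04:20-08:00", "always", 1.0)]) returns bytes that can be written directly to the sitemap file opened in binary mode.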

Below I break the XML file down and explain each tag:
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
This line defines the XML namespace of the file; it plays the same role as the <html> tag in a web document.

<url>
This is the entry for one particular link. Every link you want to list in the SiteMap file must be enclosed in <url> and </url>; this tag is required.

<loc>http://homepage.yesky.com</loc>
<loc> holds the address of a specific link. Note that certain special characters in the address must be converted to the escape sequences defined by XML (HTML), as shown in the following table:
Character                  Symbol    Escape sequence
Ampersand                  &         &amp;
Single quotation mark      '         &apos;
Double quotation mark      "         &quot;
Greater than               >         &gt;
Less than                  <         &lt;
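
As a rough illustration of this escaping step, the sketch below uses escape() from Python's standard xml.sax.saxutils module. The wrapper name escape_loc is an assumption made for this example; escape() itself handles &, < and > by default, and the two quotation marks are supplied through its optional mapping argument.

```python
from xml.sax.saxutils import escape

# escape() already converts &, < and >; the quotation marks are added here.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_loc(url):
    """Return url with the five special characters replaced by XML entities.

    escape_loc is a hypothetical helper name used only in this sketch.
    """
    return escape(url, QUOTE_ENTITIES)
```

For example, an address containing ?a=1&b=2 would be written into <loc> with &amp; in place of the ampersand.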

<lastmod>2005-06-03T04:20:32-08:00</lastmod>
<lastmod> specifies when the link was last updated, and it is very important. Before indexing the link, Google's robot compares this time with the last-updated time it recorded when it last indexed the link, and skips the link if the time has not changed. So if the content of the link has changed since Google last indexed it, you should update this time so that Google re-indexes the content and repeats its analysis and keyword extraction. The time must be given in ISO 8601 format; the accepted forms are:
Year: YYYY (2005)
Year and month: YYYY-MM (2005-06)
Date: YYYY-MM-DD (2005-06-04)
Date, hours and minutes: YYYY-MM-DDThh:mmTZD (2005-06-04T10:37+08:00)
Date, hours, minutes and seconds: YYYY-MM-DDThh:mm:ssTZD (2005-06-04T10:37:30+08:00)
Pay attention to TZD: it is the local time zone designator; for China it is +08:00.
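
The fullest of these forms can be produced with Python's standard datetime module. This is a minimal sketch; the names CHINA_TZ and format_lastmod are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed +08:00 offset, matching the China example above.
CHINA_TZ = timezone(timedelta(hours=8))


def format_lastmod(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ssTZD.

    format_lastmod is a hypothetical helper name used only in this sketch.
    """
    if moment.utcoffset() is None:
        # Without an offset there is no TZD part to emit.
        raise ValueError("lastmod needs a timezone-aware datetime")
    return moment.isoformat(timespec="seconds")
```

With this helper, format_lastmod(datetime(2005, 6, 4, 10, 37, 30, tzinfo=CHINA_TZ)) gives 2005-06-04T10:37:30+08:00.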

<changefreq>always</changefreq>
Use this tag to tell Google how often this link is likely to be updated. For example, the home page would certainly use always, while links that are very old or whose content will no longer be updated can use yearly. The values that can be used here are: "always", "hourly", "daily", "weekly", "monthly", "yearly" and "never"; their meanings are clear from the words themselves.

<priority>1.0</priority>
<priority> specifies the priority of this link relative to the other links; the value ranges from 0.0 to 1.0.
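
Because <changefreq> accepts only a fixed set of words and <priority> only a fixed range, both are easy to check before the file is written. The sketch below is one way to do that in Python; check_url_hints is a hypothetical helper name, and the allowed set also includes "never", which the Sitemap protocol defines as a changefreq value.

```python
# Allowed <changefreq> values, including "never" from the Sitemap protocol.
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_url_hints(changefreq, priority):
    """Validate changefreq and priority and return them as tag text.

    check_url_hints is a hypothetical helper name used only in this sketch.
    """
    if changefreq not in CHANGEFREQ_VALUES:
        raise ValueError(f"unknown changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority out of range 0.0-1.0: {priority!r}")
    # One decimal place, as in the examples above (1.0, 0.8).
    return changefreq, f"{priority:.1f}"
```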
Finally there are </url> and </urlset>, which close the XML tags, just as </body> and </html> do in HTML.
Also note that the XML file must be encoded in UTF-8. Whether you generate it by hand or with code, it is advisable to check that the XML file really is UTF-8 encoded. The easiest way is to open the XML file in Notepad, choose Save As, and select UTF-8 as the encoding (or use a converter).
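
The same check can be done with code instead of Notepad. The following is a minimal Python sketch that works on the raw bytes of the file; the helper names is_utf8 and reencode_to_utf8 are assumptions made for illustration, and the source encoding passed to the second helper (for example "gbk") has to be known in advance.

```python
def is_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def reencode_to_utf8(data, source_encoding):
    """Convert bytes in a known source encoding to UTF-8 bytes.

    Both helper names are hypothetical and used only in this sketch.
    """
    return data.decode(source_encoding).encode("utf-8")
```

In practice the bytes would come from reading the sitemap file in binary mode, and the converted result would be written back the same way.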

Finally, submit your SiteMap file to Google so that Google starts crawling. Open http://www.google.com/webmasters/sitemaps/; if you have not registered or are not logged in to Google, log in with your own account first. After logging in you are taken to the Your Sitemaps Status page, where you can click Add a Sitemap + to go to the Sitemap submission page. Note that the Sitemap file should be placed in your site's root directory. After you submit the Sitemap URL to Google it appears in the list, but it is not yet in effect; it formally takes effect a few hours later, when the Status column changes to OK. If the status is not OK, check the status message Google gives to see the reason.