Thread: sitemap.xml appears in Google site search results

  1. #1

    sitemap.xml appears in Google site search results

    I routinely submit the sitemap.xml file generated by NOF to Google on their Webmaster site. I also use the search box feature that NOF provides with both web and site options. Many times when I search my site for a term, I see among the site search results the sitemap.xml file. Two questions:
    1) Why is sitemap.xml showing up in search results?
    2) We changed our site structure over a year ago and have submitted updated sitemap.xml files to Google. Yet they still crawl, and report 404 errors, for pages that no longer exist and are not in the sitemap. Why?

  2. #2
    gotFusion (Senior Member)
    Join Date
    Jan 2010
    Location
    www.gotHosting.biz
    Posts
    4,529

    You should be able to exclude any file using a robots.txt file in your domain root.

    More details can be found here: http://www.gotfusion.com/tutorials/tut.cfm?itemID=10

    As for why and what Google indexes, that would need to be asked of Google.

    To avoid 404 errors you should create a 404 redirect page in your hosting control panel or redirect 404 errors via code.
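
As a rough illustration of the robots.txt suggestion above, the sketch below checks whether a draft robots.txt would keep a crawler away from sitemap.xml. It uses Python's standard `urllib.robotparser`; the file contents and the helper name `is_allowed` are assumptions made for the example, not something NOF generates, and the check only models standard robots.txt matching, not how Google decides what to index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the domain root (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /sitemap.xml

Sitemap: http://www.berkeleyandbeyond.com/sitemap.xml
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.berkeleyandbeyond.com"
    print(is_allowed(ROBOTS_TXT, base + "/sitemap.xml"))  # sitemap is disallowed
    print(is_allowed(ROBOTS_TXT, base + "/index.html"))   # other pages are not
```

Running a check like this before publishing is a cheap way to confirm the Disallow rule matches the path you intended.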
    NetObjects Fusion Cloud Linux enabled Web Hosting, support + training starts at $14.95
    NetObjects Fusion web Hosting and support + ASP + PHP + ColdFusion + MySQL + MS SQL
    FREE NetObjects Fusion Support & training comes with all web hosting accounts
    NetObjects Fusion Web Hosting: http://www.gotHosting.biz

  3. #3

    Originally posted by gotFusion:
    You should be able to exclude any file using a robots.txt file in your domain root.

    More details can be found here: http://www.gotfusion.com/tutorials/tut.cfm?itemID=10

    As for why and what Google indexes, that would need to be asked of Google.

    To avoid 404 errors you should create a 404 redirect page in your hosting control panel or redirect 404 errors via code.

    I already have this robots.txt file in my home page:
    Sitemap: http://www.berkeleyandbeyond.com/sitemap.xml
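
For context, a robots.txt that contains only a `Sitemap:` line announces the sitemap but carries no `Disallow` rule, so nothing is excluded. A minimal sketch using Python's standard `urllib.robotparser` shows this; the variable names and the choice of "Googlebot" as the user agent are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# The one-line robots.txt quoted above, as a string.
CURRENT_ROBOTS_TXT = "Sitemap: http://www.berkeleyandbeyond.com/sitemap.xml\n"

parser = RobotFileParser()
parser.parse(CURRENT_ROBOTS_TXT.splitlines())

sitemap_url = "http://www.berkeleyandbeyond.com/sitemap.xml"

# With no User-agent/Disallow rules present, every URL is permitted.
crawl_allowed = parser.can_fetch("Googlebot", sitemap_url)
print(crawl_allowed)  # True
```

In other words, the existing file tells crawlers where the sitemap is but does not ask them to stay away from it.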

  4. #4
    gotFusion (Senior Member)
    Join Date
    Jan 2010
    Location
    www.gotHosting.biz
    Posts
    4,529

    You need to properly format your robots.txt file to disallow spiders from indexing the page:

    http://www.berkeleyandbeyond.com/robots.txt

  5. #5

    Originally posted by gotFusion:
    You need to properly format your robots.txt file to disallow spiders from indexing the page:

    http://www.berkeleyandbeyond.com/robots.txt

    It looks like that file gets regenerated whenever I publish. I manually changed it to use the correct format, but after publishing, my changes were overwritten. How can I set the contents of the robots.txt file within the NOF UI?

  6. #6
    Senior Member
    Join Date
    Sep 2010
    Posts
    144

    You can add robots.txt to your assets so that it is always published. I have not, BUT....

    Okay, this is weird: I just noticed NOF 2015 is uploading a robots.txt file, overwriting my correct file with a one-line file pointing to the sitemap. I didn't ask for or want NOF to do that, although I do want it to publish the sitemap.xml.
    Does turning on publish sitemap also publish the robots.txt file?
