How can I hide certain text from search engines?

Developer https://www.devze.com 2023-03-17 06:32 Source: Internet
In my WordPress blog, I have "Posted ? days ago" on every post. I have 10 posts on my homepage. So according to most keyword analysis tools, "days ago" is a top keyword on my blog, but I don't want it to be. How can I hide those words from search engines?

I don't want to use JavaScript. I can easily use PHP and the $_SERVER variable, but I'm afraid I might get penalized for cloaking. Is there an HTML tag or an attribute like rel="nofollow" that I can use?


From "Is there any way to have search engines not index a certain section of a page?":

Supposedly you can add the class robots-nocontent to elements on your page, like this:

<div class="robots-nocontent">

    <p>Ignore this stuff.</p>

</div>

Yahoo respects this, though I don't know whether other search engines do. It appears Google does not support it at this time. I suspect that if you load your content via AJAX you would get the same effect, since the content would not be present in the page source.

and

There's no general way to do that and personally I wouldn't bother with it. Search engines are pretty good at recognizing relevant content on a page, and even though that content might show up in the keywords that search engines have found, it doesn't mean that it would make the page relevant for those keywords.

If you have a page about "Fish" and a page about "Dogs" (that has the link to the page about "Fish" somewhere in the sidebar), search engines will generally be able to recognize that the page about "Fish" is much more relevant for "Fish" than the page about "Dogs" that mentions "Fish" in the sidebar. It's possible that both pages might be found at some point, but generally given that mostly one page from the site is shown in the search results, that's not something worth worrying about.

There's no need to be fancy with that, and search engines are likely to just get more confused if you try (e.g., if you use JavaScript to hide the content, you never know when search engines will start to find that content regardless). Similarly, using iframes with robots.txt disallows or AJAX will frequently degrade the quality of your pages for users (slowing them down or making them less usable on a variety of devices), so unless there is a very, very strong and proven reason that you need to do this, I would strongly recommend not bothering with it.


What I have found on wiki:

For Yandex:

<!--noindex-->Don't index this text.<!--/noindex-->

For Yahoo:

<div class="robots-nocontent">Don't index this text.</div>

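For Google (these tags are documented for the Google Search Appliance; Google web search is not known to honor them):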

<!--googleoff: index--> Don't index this text.<!--googleon: index-->


Linksku, I'm fairly sure you shouldn't be worried about that particular piece of text. Our algorithms do a relatively good job detecting boilerplate text. As far as I can tell from your question, this text is boilerplate and we likely already know that.

As for detecting Googlebot and not serving this text to it, you're right, that would be cloaking and you should never do it. In this case, if you hide that text from us, we will also have a hard time detecting that it's boilerplate, and you would end up doing exactly what you're trying to avoid :)


I worked this out and posted it up at: http://www.scivillage.com/thread-2580.html

This should work; however, more testing and feedback would be appreciated.

   .x:before{
      content:attr(title);
      display:inline;
   }
			
<ul>
  <li><a href="#"><span class="x" title="Homepage"></span></a></li>
  <li><a href="#"><span class="x" title="Contact" /></a></li>
</ul>

(I kept the class name short to reduce mark-up creep)

Search engines should ignore HTML tags with empty values when it comes to looking for keywords, which should mean that they ignore what is written in the title attribute. (The assumption is that the value is what's important; if it's empty, there is no point in checking the attributes.)

It was suggested that the closing tag can be omitted in HTML5 because of its reduced strictness; however, there are counter-suggestions that end tags are still required.

I'd suggest not using it directly on a (anchor) tags, since they can be used for sitemaps (using #), which means they would likely have the title spidered.

It is also possible that search engines might assume any title content is there to inflate keywords through hidden elements; however, I cannot confirm this.


To exclude specific text from the snippets shown in Google search results, you can add the data-nosnippet attribute.

https://developers.google.com/search/reference/robots_meta_tag#data-nosnippet-attr

From Google's documentation:

You can also prevent certain parts of the page text content from being shown in a snippet by using data-nosnippet.
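
Applied to the "Posted ? days ago" line from the question, that might look like the sketch below. The surrounding markup and the post-meta class name are made up for illustration; only the data-nosnippet attribute comes from the documentation quoted above. Note that it only affects what is shown in the snippet, not whether the text is indexed.

```html
<!-- Sketch only: the article structure and class name are hypothetical. -->
<article>
  <h2>Post title</h2>
  <!-- The wrapped text should be left out of the search result snippet. -->
  <p class="post-meta">Posted <span data-nosnippet>3 days ago</span></p>
</article>
```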


HTML:

<div class="hasHiddenText">_</div>

It is important that you leave a non-whitespace character inside the element that carries the hidden text.

External CSS:

.hasHiddenText {
  content: "Your hidden text here...";
  /* This overwrites the default content of the div, but it isn't supported by all browsers. */
}
.hasHiddenText::before {
  content: " Your hidden text here...";
  /* Places the hidden text before the div's own content. */
}

Here, "hidden text" means content that is hidden from search engines but visible to visitors.
You can also use line breaks and all sorts of Unicode characters by escaping them as \XXXX, where XXXX is the hexadecimal code point (CSS strings do not use the \uXXXX form; a line break is \A). To display line-break characters correctly, be sure to add the

white-space:pre-line;

property.
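
As a rough sketch of how a line break in generated content could be put together: the page below reuses the hasHiddenText class from above, while the "Posted 3 days ago" text is just a placeholder taken from the question. It assumes the CSS string escape \A for a line feed.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Generated content demo</title>
<style>
  /* The text is injected by CSS, so it does not appear in the HTML source. */
  .hasHiddenText::before {
    /* \A is the CSS escape for a line feed (U+000A). The space right after
       it ends the escape and is not displayed; without it, "\A3" would be
       read as a different code point. */
    content: "Posted\A 3 days ago";
    /* Render the line feed as an actual line break. */
    white-space: pre-line;
  }
</style>
</head>
<body>
<!-- The underscore is the non-whitespace placeholder mentioned above. -->
<div class="hasHiddenText">_</div>
</body>
</html>
```

Whether a given search engine ignores generated content like this is not something the markup can guarantee, so it is worth testing before relying on it.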
