LessWrong search traffic doubles

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles

LessWrong search traffic doubles… despite Google thinking our site is a pro-family pro-democracy astrology blog! More on that in a minute.

First, The Good News: Since I started doing SEO on LessWrong (10 months ago), search traffic from Google has doubled! It took researching more than 200 different techniques and actually implementing 14 of them (with help from Tricycle), 2 of which I think are responsible for most of the improvement:

The analytics make me believe that this improvement is due to structural changes and not just generally increased traffic. But it certainly hasn’t hurt that people have been writing new content and that HP:MoR exists. Anyway, I’m *really happy* about this! This was the explicit goal I set for myself 10 months ago. It’s nice to achieve goals… especially unreasonably ambitious ones.

So… YAY!! :D

OK, Now, The Bad News: I was trying to figure out why we never get any traction for search terms like "rationality", so I looked through Google Webmaster Tools. This is what Google thinks our site is about, keyword-wise:

| Keyword | Occurrences |
| --- | --- |
| vote | 196504 |
| points | 152881 |
| permalink | 95106 |
| children | 84578 |
| parent | 56374 |
| people | 37047 |
| it’s | 27082 |
| march | 21846 |
| february | 21520 |
| january | 20425 |
| human | 19587 |
| december | 18005 |
| september | 15695 |
| august | 15667 |
| password | 15377 |
| april | 14714 |
| october | 14011 |
| seem | 12822 |
| november | 11546 |
| july | 11265 |
| june | 9283 |
| world | 8542 |
| post | 8496 |
| actual | 8251 |
| probability | 8114 |
| child | 7828 |
| moral | 7787 |
| work | 7143 |
| might | 6250 |
| new | 6156 |
| theory | 5827 |
| argument | 5639 |
| read | 5278 |
| utility | 5206 |
| account | 5002 |
| evident | 4777 |
| belief | 4749 |
| remember | 4691 |
| recent | 4584 |
| intelligent | 4582 |
| science | 4424 |
| eliezer | 4384 |
| doesn’t | 4339 |
| rationality | 4188 |
| brain | 3969 |
| decision | 3904 |
| life | 3795 |
| username | 3732 |
| mind | 3721 |

All the keywords that I **bolded** are purely structural elements of the Less Wrong site layout. And it appears Google actually is punishing our site for this keyword density imbalance. Google really does think our site is about voting, parenting, and astrology. And while I find it somewhat hilarious that our top source of Google impressions (27,000/mo) is for the keyword "babies", I also lament that the keyword "rationality" is our #3955 source of traffic. We should invert this.

So does anyone have any ideas? How do other sites solve this problem?

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=JRcF2XkvuxKSubDiW

[joke] Change the names of the structural elements to keywords we consider important! For instance,

  • "Vote up /​ down" → "rationality up /​ down"

  • "points" → "paperclips"

  • "permalink" → "timeless commenting decision"

  • "password" → "the teacher’s password"

  • "username" → "code name in the Bayesian conspiracy"

EDIT: You know, I actually like the "points" → "paperclips" change for real.

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=LLGs9qsinCMHup6WR

+1 to points → paperclips :-D

I have previously suggested "Vote up/​down" to "More like this/​Less like this", to generally positive reception.

parent/​children → above/​below? There should be something suitable.

When I put the word "rationality" into Google, the first hit is Wikipedia, the second is "Twelve Virtues of Rationality" and the third is LessWrong. How much of LW’s low traffic on the word can be attributed to people just not searching on the word much? Edit: This was an artifact of searching while logged in. When not logged in, LessWrong is not even on the front page.

Bending one’s site out of shape for an idiot Googlebot sorta sucks, really. But on my own sites, Google supplies 97% of the search engine traffic. So I suppose one must do what one has to if traffic is a goal.

RationalWiki doesn’t give a hoot about SEO, so has an accordingly poor showing and terrible pagerank. RW’s hit articles tend to be stuff that it covers well that doesn’t rate a Wikipedia article, e.g. Poe’s law, Project Blue Beam, European Union Times. The whole answer to succeeding as a wiki is "provide something Wikipedia can’t or won’t."

Comment

When I put the word "rationality" into Google, the first hit is Wikipedia, the second is "Twelve Virtues of Rationality" and the third is LessWrong. How much of LW’s low traffic on the word can be attributed to people just not searching on the word much?

Are you signed into google or not? When you’re signed in, it tailors the results to your search history.

Comment

D’oh! Well spotted—not logged in, LessWrong is not on the front page.

Comment

On the plus side, Harry Potter and the Methods of Rationality is the fourth result for "rationality", even signed out.

Comment

And yudkowsky.net is result #6.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=Nx5bd9WWuMrfctqq2

I am completely clueless about SEO, but the tag line "a community blog devoted to refining the art of human rationality" is part of an image file and as such invisible to Google, right? Making it equally prominently visible to Google as it is to humans seems like the sort of thing that would help. I don’t know what the best way to do that would be though, alt text?

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=ynySM59kQfxyjNXZS

Yes, looking at the source HTML, the image has the alt text "Less Wrong"/"Less Wrong Discussion", but does not include the tag line, which it should.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=mip6rKgr3q2B2Yrdi

Google is smart enough to know about this kind of "trick", and trying it will actually decrease your PageRank. Do not meddle in the ways of Google… ;)

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=7WCb5j7MhAQKXSRsx

This is all inherited from Reddit, right? Does Reddit get a lot of search traffic for babies?

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=4Wa276PMuAdfzmyb7

My best SEO advice would be to turn the structural links (vote, edit, etc.) into buttons (i.e. POST instead of GET). AFAIK, Google doesn’t consider buttons to be as "contenty" as ordinary links.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=dSnCyFnn4TgrNiqGS

Actually, Less Wrong does have a fair amount of discussion about babies (mainly about killing them). And I would guess searches about babies are several orders of magnitude more frequent than searches about rationality.

Edit: Continuing this line of thought, maybe an effective strategy would be to figure out what potentially receptive people are searching for and write some posts about how to apply rationality to those things.

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=6TvdsQxXQnYaDWGDt

If someone wrote something like "Babies: A Rational Analysis", our site’s current structuring would help it be unreasonably popular in Google. This would be analogous to Less Wrong "doing what it’s best at".

CarlShulman’s articles about voting are overly-popular for the same reason… probably by accident.

Comment

Does "Babies and Bunnies: A Caution About Evo-Psych" show this effect?

Comment

yes

This would be analogous to Less Wrong "doing what it’s best at".

I suggest you make a post of suggested topics that spring to mind. You don’t have to write all the posts, but then someone inspired by the title can.

Can people please not write articles simply to improve Google ranking? That’s dark sidish and also easily leads to a decline in content quality.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=hqpPM4Rmuidwb83zq

It looks to me like this is just a raw count of word occurrences rather than what Google thinks are the most relevant keywords, because I wouldn’t expect the latter to contain words like "it’s". If I’m right, then the list isn’t very informative.

Regarding words like "vote" and "parent", I think one way to hide them would be to put them in buttons rather than links.
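To illustrate the "raw count of word occurrences" reading, here is a minimal Python sketch that counts every word in the visible text of a page. The markup in `SAMPLE_PAGE` is a hypothetical comment thread, not the real LessWrong template, and the counting rule is an assumption; nothing here claims to reproduce what Webmaster Tools actually does.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects page text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def raw_word_counts(html):
    """Return a Counter of lowercase words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z']+", " ".join(parser.chunks).lower())
    return Counter(words)


# Hypothetical markup: one short article plus two comments with their controls.
SAMPLE_PAGE = """
<h1>Applying rationality</h1>
<p>Rationality is about winning.</p>
<div class="comment"><p>Nice post.</p>
  <a href="/vote/up">Vote up</a> <a href="/vote/down">Vote down</a>
  <span>3 points</span> <a href="/c/1">Permalink</a></div>
<div class="comment"><p>I disagree.</p>
  <a href="/vote/up">Vote up</a> <a href="/vote/down">Vote down</a>
  <span>1 points</span> <a href="/c/2">Permalink</a></div>
"""

counts = raw_word_counts(SAMPLE_PAGE)
print(counts.most_common(5))
```

Even with only two comments, the per-comment controls already put "vote" ahead of "rationality" in a raw count, which is the pattern in the keyword list above.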

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=5fKS9ndc275swkMy2

Google does do some word-ranking. From memory:

  1. If it’s in the URL, it’s more important.

  2. If it’s in headings (h1/h2 etc. tags), it’s more important. The bigger the tag the better, but in descending order down the page (i.e. an h3 right at the top may be considered more important than an h1 at the bottom of the page).

  3. Google starts at the top of the page and works down. Stuff at the top is more important than stuff below it.

  4. If it occurs more frequently, it’s probably more relevant (thus vote and parent).

  5. If other links that point at this site contain the same keywords, then those keywords are more important.

Plenty of other stuff goes into this, most of which Google keeps secret, and it changes on a day-by-day basis. There are people who make whole careers (lucrative ones!) out of figuring it all out.
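As a purely illustrative toy, three of the signals listed above (heading tags, position in the source, and frequency) can be combined into one score. Everything in this Python sketch is an assumption: the weights in `TAG_WEIGHT`, the decay formula, and the names `ToyScorer` and `toy_scores` are made up for the example and are not Google's algorithm.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Made-up weights for illustration only; the real factors are not public.
TAG_WEIGHT = {"title": 5.0, "h1": 4.0, "h2": 3.0, "h3": 2.0}


class ToyScorer(HTMLParser):
    """Scores words by enclosing tag, position in the source, and frequency."""

    def __init__(self):
        super().__init__()
        self.scores = defaultdict(float)
        self._open_tags = []
        self._position = 0  # running word index, in source order

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open_tags:
            # Pop back to (and including) the matching open tag.
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        weight = max((TAG_WEIGHT.get(t, 1.0) for t in self._open_tags), default=1.0)
        for word in re.findall(r"[a-z']+", data.lower()):
            # Earlier words count more; repeated words accumulate score.
            self.scores[word] += weight / (1 + self._position / 100)
            self._position += 1


def toy_scores(html):
    scorer = ToyScorer()
    scorer.feed(html)
    scorer.close()
    return dict(scorer.scores)


# One heading word versus a structural word repeated 2 and then 10 times.
few = toy_scores("<h1>Rationality</h1>" + "<p>vote</p>" * 2)
many = toy_scores("<h1>Rationality</h1>" + "<p>vote</p>" * 10)
print(few["rationality"] > few["vote"])
print(many["vote"] > many["rationality"])
```

Under these made-up weights, a heading at the top beats a couple of repetitions further down, but enough repetitions of a structural word eventually swamp it.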

Comment

Are ‘Top’ and ‘Bottom’ defined as on the unstyled page? If so, sidebars may be getting undue weight...

Comment

Yes, defined as on the unstyled page. However, if you’re talking about the right-hand sidebar, it appears below the content on the page (I checked). The only things that appear "above" the content are the header image, the top tabbed navigation and that discussion blurb.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=EMA6vKh83fcFvmcP4

This probably would be bad for performance, but purely structural sections of the site could be loaded in no-indexed iframes.

If we were dealing with certain Russian search engines, structural sections could be no-indexed inline:

Russian search engines Yandex and Rambler introduce a new tag, <noindex>, which only prevents indexing of the content between the tags, not a whole Web page.

Do index this text block.

<noindex>Don’t index this text block</noindex>

Unfortunately, I don’t see any indication that Google honors such a thing.

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=58erYTJHS3NjwuHR7

If HTML is supposed to be about semantics of the page, the NOINDEX tag should have been a part of every HTML specification, at least since server-side scripting became popular.

There is a lot of repeated text on each page of many websites, that really isn’t part of the content, such as: "write your comment here", "next page", "previous page", "username /​ password", "permalink", etc.

I wonder: if your website contains the word "permalink" on each page and comment, and there is one page that is really about permalinks, can Google tell the difference?

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=FqS7jJibDWaQs8osp

Your SEO problem with "votes" and "points" keywords is not entirely due to the comment-voting sections. It’s also because of the short blurb above the main article-title.

Google ranks things literally from the top down (in the HTML), and that blurb starting "This part of the site is for the discussion of topics" (class = infobar) appears on most pages, above the H1 tag containing the article’s title. Thus Google thinks it’s MORE important than the main content of the article.

If you want that kind of thing to appear above the title… you can actually do funky things with CSS-positioning that will keep it below the article in the html, but appear to the humans as being at the top of the page.
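One way to check the source-order point is to list whatever text comes before the first H1 in the raw HTML. This is a minimal Python sketch; the two markup strings are hypothetical, the helper `text_before_title` is a name invented here, and the CSS that would visually move the blurb back to the top is not shown.

```python
from html.parser import HTMLParser


class TextBeforeFirstH1(HTMLParser):
    """Records text that occurs in the HTML source before the first <h1>."""

    def __init__(self):
        super().__init__()
        self.before = []
        self._seen_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._seen_h1 = True

    def handle_data(self, data):
        text = data.strip()
        if text and not self._seen_h1:
            self.before.append(text)


def text_before_title(html):
    parser = TextBeforeFirstH1()
    parser.feed(html)
    parser.close()
    return parser.before


BLURB = "This part of the site is for the discussion of topics"

# Hypothetical markup: blurb first in the source, as described above.
blurb_first = f'<div class="infobar">{BLURB}</div><h1>Article title</h1><p>Body.</p>'
# Same elements with the blurb moved after the article in the source.
blurb_last = f'<h1>Article title</h1><p>Body.</p><div class="infobar">{BLURB}</div>'

print(text_before_title(blurb_first))
print(text_before_title(blurb_last))
```

Running this against a saved copy of a real page would show which text a top-down reader meets before the title.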

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=b6rwaTTimJiJgqq8c

I just noticed that in the recent comments feed, article links on comment replies to "Philosophy: A Diseased Discipline" go to http://lesswrong.com/r/lukeprog-drafts/lw/4zs/philosophy_a_diseased_discipline/ , which is a broken link because it’s no longer a draft. That’s probably bad for their rank, and it might be a more general problem.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=GvyH8PxkyNt5ooFs6

It’s a content vs. formatting issue. Words like vote, march, reply, points, etc are really formatting, but Google reads them as content.

To fix this, you could do a lot of JavaScript hacking so that the timestamps, etc are displayed using DHTML. The search engine robots won’t run JavaScript, so they’ll only see the content.

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=kb59airTvx3DEfose

JS hacking will also make the page less stable, less accessible and more annoying to maintain. So it’s possible, but there is a significant cost involved.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=dtWz8jFzdFMgsnF5d

Well done, sir.

Unfortunately, I know very little about SEO.

Would it do anything to make the title be:

Article Title - Less Wrong: a community blog devoted to refining the art of human rationality

Comment

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=DLMNPqZj7N7tiEwmJ

googlehacking is a fine art… and too much can be just as detrimental as too little.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=mk3f38yxB8gKDgGsZ

Utility, belief, intelligent, brain, decision and mind are also topical, aren’t they? Arguably moral, argument, theory and science as well. Except for the structural elements, and rationality being too low, it doesn’t look too bad.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=x8wEtG9mMbv6HtJg3

From https://sites.google.com/site/webmasterhelpforum/en/faq--webmaster-tools :

Q: Why do my Webmaster Tools stats show common phrases such as "buy now" that are not directly related to my site?

A: While some common words and phrases are filtered by Webmaster Tools, there may be some that you use which are not. Having these words or phrases listed in your Webmaster Tools account does not mean that our algorithms will view your site as being only relevant for those keywords. While Webmaster Tools mostly counts the occurrences of words on your site, our web-search algorithms use well over 200 other factors for crawling, indexing and ranking. In other words: don’t worry if you see keywords like this listed in your Webmaster Tools account.

I couldn’t find a more detailed estimation of the impact of such keywords, but we should consider the option of just ignoring the issue. Especially since according to this the only effective options are JavaScript or frames tricks, both of which would make LW significantly more annoying or slow to use.

taryneast’s idea of using CSS to pretend-shove the opening blurb to the bottom of the page could be rather painless, though.

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=5c8yKRM6XnGf79x4h

Great job!

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=hTMzeE9XHEeBdcmTN

It occurs to me that the most frequent structural words are embedded in anchors that have URLs pointing back to LessWrong itself. That seems like a decent heuristic for peeling apart structure and ignoring it?

Edit: I suppose my theory is that Google would make efforts to ignore structural terms in analyzing topic, that this wouldn’t be all that hard, and that the ‘babies’ effect is a coincidence.
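The anchor heuristic can be sketched in a few lines of Python: drop the text of any link whose target is on the same host as the page, then count what is left. The host `lesswrong.example`, the sample markup and the class name `ContentText` are hypothetical, and whether any search engine does something like this is exactly the open question in the comment.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import re


class ContentText(HTMLParser):
    """Collects page text, dropping text inside links that point to the same host."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.chunks = []
        self._in_internal_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Resolve relative links against the page URL before comparing hosts.
            target = urlparse(urljoin(self.page_url, href))
            self._in_internal_link = target.netloc == self.host

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_internal_link = False

    def handle_data(self, data):
        if not self._in_internal_link:
            self.chunks.append(data)


def content_word_counts(html, page_url):
    parser = ContentText(page_url)
    parser.feed(html)
    parser.close()
    return Counter(re.findall(r"[a-z']+", " ".join(parser.chunks).lower()))


# Hypothetical page URL and markup, for illustration only.
PAGE_URL = "http://lesswrong.example/lw/1/some_post/"
SAMPLE = """
<p>Rationality beats astrology, see
  <a href="http://en.wikipedia.org/wiki/Rationality">rationality</a>.</p>
<a href="/vote/up">Vote up</a> <a href="/lw/1/some_post/#c1">Permalink</a>
<a href="/lw/1/some_post/#c0">Parent</a>
"""

filtered = content_word_counts(SAMPLE, PAGE_URL)
print(filtered.most_common(3))
```

One obvious cost of this heuristic is that it also throws away the anchor text of genuine in-content links to other posts on the same site.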

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=RqALgWxx39FMsowKc

For the months: fix the date display so that the month isn’t written out.
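As a small sketch of what that change looks like, here are two ways to format the same timestamp in Python. The timestamp and both format strings are arbitrary examples, not the site's actual date code.

```python
from datetime import datetime

posted = datetime(2011, 3, 29, 14, 5)  # arbitrary example timestamp

# Month written out: puts a month name on the page for every comment.
spelled_out = posted.strftime("%d %B %Y %I:%M%p")

# Numeric form: digits and punctuation only, no month word.
numeric = posted.strftime("%Y-%m-%d %H:%M")

print(spelled_out)
print(numeric)
```

The numeric form contributes no month names to a raw keyword count, at some cost in readability for humans.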

https://www.lesswrong.com/posts/j7fKMzgB3Y6XDiuWx/lesswrong-search-traffic-doubles?commentId=hGsBbqDn6dkdhWZKj

I assume both the right and left will think that we support their cause because we’re "rational".