Friday, June 01, 2007

Which words do I "own"? Caveblogem evaluates Second Effort

Two weeks ago I mentioned that I'd volunteered this blog for an ongoing study by Caveblogem of Pretty Good on Paper.

Well, late yesterday, Second Effort became the 21st blog to undergo Caveblogem's word analysis. Here's a link to Caveblogem's post.

(And the post where Caveblogem announced and explained his study can be found here.)

As I understand it, Caveblogem was inspired when he read that estimates of the average college graduate's vocabulary range from 20,000-25,000 words up to 60,000 'active' words and 75,000 'passive' words. This got him thinking about how many distinct words he used in his own blog. He took a sample of 15,000 words -- reduced to 13,000 by stripping out proper names, misspellings, emoticons and other non-words -- and determined how many different words he actually used. Then he extrapolated from that sample to come up with an approximation of the number of words that constituted a working vocabulary for his blog.
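For the curious, the counting step might look something like the Python sketch below. This is only a guess at the general idea, not Caveblogem's actual method, which is not described here. The function name `distinct_words`, the `exclude` set (standing in for the proper names and other non-words he stripped out) and the sample sentence are all made up for illustration, and the sketch does not attempt his extrapolation step at all.

```python
import re
from collections import Counter

def distinct_words(text, exclude=frozenset()):
    """Count each word in text, skipping anything listed in exclude.

    exclude is a hypothetical stand-in for the proper names,
    misspellings and other non-words stripped from the sample.
    """
    # Lowercase the text and pull out runs of letters, allowing one
    # internal apostrophe so that "didn't" stays a single word.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    kept = [t for t in tokens if t not in exclude]
    return Counter(kept)

# Made-up sample text, with team names treated as proper names.
sample = "The Sox won. The Cubs lost, and the Sox fans cheered."
counts = distinct_words(sample, exclude={"sox", "cubs"})

total = sum(counts.values())   # words left in the sample after stripping
unique = len(counts)           # different words among them
print(total, unique)
```

The same two numbers, total words and different words, are the ones reported for each blog in the study.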

He then took the experiment into the Blogosphere: How many words are people really using out here? This is the 21st blog that Caveblogem has evaluated. (Linda's blog, Are We There Yet? was the 18th. Reading about Caveblogem on Linda's blog was what inspired me to volunteer.)

Caveblogem says I added 717 new words to his growing word base, using 4,490 different words out of a total of 28,931 words published between March 15 and May 14 of this year.

(And you eagerly devoured each of them, didn't you, Dear Reader?)

Caveblogem said the 4,490 different words used here "may be some kind of record, actually. It seems pretty far from the norm, but I would have to check the records to say for sure." (Please. I blush.) He added that "the sample I took from my own eclectic and highly literate blog yielded only 3,100 unique words." Of course, this is not fair: The sample from Caveblogem's blog was less than half as big as the sample he took from here. And, in any event, it's a long way from even the low estimate for a college graduate's supposed vocabulary.

Clearly, there is much about this that I don't understand. But -- no matter -- there are cool pictures: Caveblogem makes a "word cloud" showing words sized according to frequency of use. Here's the one he did for Second Effort:

This "word cloud" is in a font called "Curlmudgeon." I think we can guess why Caveblogem chose it.

Looking at it now... apparently I do talk a lot about baseball. Hmmmmmmm. In my defense I can only say that the sample included Opening Day. I suppose I could try and argue that "Sox" is also a common abbreviation for the Sarbanes-Oxley Act -- which, as it happens, it is -- but this is yet another legal topic about which I am entirely ignorant.

And one more cool picture. Caveblogem generated a Venn diagram showing, on the left, words I used that so far no one else had, "sized relative to the frequency of use;" in the middle, words that everybody else has used "sized according to how much more frequently The Curmudgeon used them in the sample;" and, on the right, words that everyone else used... but that I did not:
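The three regions of a diagram like that boil down to simple set operations. Here is a minimal Python sketch, assuming that "everyone else" means the combined words of all the other blogs in the study; the function name `venn_regions` and the tiny word lists are invented for illustration and are not Caveblogem's data or code.

```python
def venn_regions(mine, others):
    """Split two word sets into the three regions of a Venn diagram."""
    only_mine = mine - others      # left: words no one else had used
    shared = mine & others         # middle: words both sides used
    only_others = others - mine    # right: words I did not use
    return only_mine, shared, only_others

# Invented examples, not taken from the actual study.
mine = {"sox", "cubs", "barrister", "the"}
others = {"the", "garden", "knitting", "cubs"}

only_mine, shared, only_others = venn_regions(mine, others)
print(sorted(only_mine), sorted(shared), sorted(only_others))
```

On this reading, the "new words" a blog adds to the word base would be the size of the left-hand region, `len(only_mine)`.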

My thanks to Caveblogem for allowing me to participate in this study.

13 comments:

Mother Jones RN said...

"Caveblogem says I added 717 new words to his growing word base, using a total of 4,490 different words in total of 28,931 words published between March 15 and May 14 of this year."

Leave it to an attorney to be long winded and wordy:-) And yes, I devoured each and every word.

MJ

Empress Bee (of the High Sea) said...

sigh. you mean to tell me i read that many words curmy? i've got to get a hobby honey! ha ha

take your vitamin now.

smiles, bee

Shelby said...

venntastic post :)

Smalltown RN said...

how intriguing....I love the pictorial demonstration of the mathematical calculations used to show your word usage....awesome....

Dave said...

Cubs is almost as big as Sox?

Star8278 said...

Very cool! Though I am not sure I followed all of the logistics, I am still impressed.

I always knew yours was a good read!

The Curmudgeon said...

Dave -- yes, that would be the case because you can't talk for too long about the White Sox without mentioning the Cubs somehow -- like the very true story I told about the nice, grandmotherly Cub fan who slapped me in the face after the Sox won a Crosstown Classic game... because I said "hello." "Ja," she said, "but I knew vat chu vas thinking!"

sari said...

Look! I'm famous!!

:-)

Dave said...

Ach! That explains it. I really want to know what the Cubs battery were saying as they had at it today in the dugout.

Linda said...

Somehow I just knew that you were going to add a lot more words than the average blogger to Caveblogem's word base! Leave it to a barrister to be wordy and also to use words that the "average bear" does not!

I don't understand the logistics of this experiment either but I think it's a rather cool thing - for lack of a fancier word!

Linda said...

I went back over and double-checked and I added 650 new words to Cave's database at the time of my post so I guess that's not too bad and puts me only 67 words behind a world-renowned attorney! That actually makes me feel almost smart! I knew I should have been a lawyer!

Ben & Bennie said...

Leave it to an attorney to be long winded and wordy:-) And yes, I devoured each and every word.

Darn it! Someone beat me to it!

BTW, following Dave's train of thought I expect a comment or 717 about the Cubs clubhouse yesterday.

Where fibers meet mud said...

This makes blogging sound too much like WORK! If I am going to work that much I would rather be in the garden or knitting something to wear. Blogging is a deposit for the loose thoughts that accumulate in the head and cannot get spoken sometimes and at other times just a plethora of facts...

What is up with Lou? Guess he never got a way to keep a lid on it. Sorry about that, Crum!