Wednesday, September 30, 2009

Surprising Top Wikipedia Articles

Here's a list of the top 100 most visited Wikipedia pages from the first 8 months of 2009. These were the ones that surprised me:
  • Favicon.ico (#4) = This is the little icon to the left of the URL in your web browser. Maybe a lot of people wanted to learn how to put one into their own website?
  • Deaths in 2009 (#8) = This shows that Wikipedia is often the go-to source for current events. The other day I was watching Monday Night Football and checked out the page for the Wildcat formation, and the stats for the game I was watching were already updated. Weird.
  • India (#18) and Australia (#33) = These were the first two countries. What makes them stand out--up and coming regions perhaps?
  • Scrubs (TV series) (#20) = This is the first TV series! It's surprising because the show isn't necessarily plot heavy nor is it necessarily even that good. Shows you how random things can be big on the internet for no good reason.
  • Naruto (#32) = Surprising because I had never heard of it. But the idea is actually pretty sweet -- "the story of Naruto Uzumaki, an adolescent ninja who constantly searches for recognition and aspires to become a Hokage, the ninja in his village that is acknowledged as the leader and the strongest of all."
  • Henry VIII of England (#67) = I guess he has a pretty cool story with all of those wives, but I don't see what makes him cooler than, say, Attila the Hun or Otto von Bismarck.
What did you find surprising?

Tuesday, September 22, 2009

Evolutionary Psyc and the Internet

Using the internet to further your relationship via dating or even social network "stalking" is big and getting bigger. According to Socialnomics, 1 out of 8 couples married in the U.S. last year met through social media websites, and as of 2008 social media has overtaken porn as the #1 activity on the web.

Although the medium has changed, there are more similarities between our interactions online and in the real world than you might assume. For example, male college students edit their communications more (i.e., have more insertions, deletions, and backspaces) when they think they are typing a message online to females of the same age (49.50 +/- 24.72) rather than other males (24.00 +/- 16.15). Likewise females edit their communications more when talking to males of the same age (70.00 +/- 42.12) rather than other females (16.80 +/- 31.5, see Walter 2007, doi:10.1016/j.chb.2006.05.002 for the study). Plus, online daters value physical attractiveness in a partner just as much as offline ones.
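For a rough sense of scale, the edit counts above can be turned into a standardized effect size. The sketch below computes Cohen's d, assuming the +/- values are standard deviations and that the two groups being compared were the same size. Neither assumption is confirmed here, and the function name is just illustrative.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d for two groups, using a pooled SD that assumes equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


if __name__ == "__main__":
    # Edit counts quoted in the post (mean, +/- value treated as an SD).
    males = cohens_d(49.50, 24.72, 24.00, 16.15)    # writing to females vs. to males
    females = cohens_d(70.00, 42.12, 16.80, 31.5)   # writing to males vs. to females
    print(f"male writers:   d = {males:.2f}")
    print(f"female writers: d = {females:.2f}")
```

Under those assumptions, both differences come out larger than one pooled standard deviation, which is big by the usual rules of thumb.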

Besides dating, evolutionary psyc might help to explain the online disinhibition effect. In the tribes of 150 close-knit people in which humans have spent most of their evolutionary history, guarding your reputation was huge because everyone knew one another and gossip was commonplace. Piazza and Bering (2009, see doi:10.1016/j.chb.2009.07.002) argue that without human eyes, voice, and faces, the urge to behave altruistically and conceal secrets that developed due to our evolutionary history will be lost.

Robin Hanson thinks that what makes our era unique is that talking to other people who can talk is as easy as it will ever be. This implies that if humans will ever be able to destroy the in group / out group mentality, it will be now. Let the great social experiment begin.

Monday, September 21, 2009

Chemo Not Therapy

My friend Jon has a blog up called Chemo Not Equal Therapy about his fight against cancer. You can find it here. It's pretty powerful stuff. Check it out if you have a chance.

Friday, September 18, 2009

What Humans Need to Learn

John Langford wants to create a machine learning algorithm to solve problems at least as complicated as anything that a human can do. Today he explained the six characteristics of such a system that would be essential:
  • Access to large data sets. This is clearly necessary because when you deprive a human (or a cat!) of sensory inputs he won't develop properly.
  • Prediction making and immediate feedback on the efficacy of these predictions. Also known as "online" learning.
  • An exchange between the learner and the verifier, also known as interactive learning. For humans, the verifier is reality and it is always trusted.
  • A system that can be broken down into components and can be integrated back into the whole. One might imagine a learning system based on the evolutionary approach, but one basic research goal is a faster and more efficient design than randomness.
  • A large input context that includes tons of information bits and allows for multiple ways to reach the same conclusion.
  • Non-linear input representations can and often must be used.
Machine learning algorithms to automate the reconstruction of neural connectivity matrices following serial section transmission electron microscopy would be a great leap for neuroscience. A human could technically do the job by hand, but a whole brain reconstruction would take roughly 87 million work hours, given 40 hours per cubic mm (as given here) and an average brain width of 140 mm, length of 167 mm, and height of 93 mm. The only connectivity matrix that has currently been mapped is that of C. elegans, which took one intrepid neuroscientist 15 years to map, despite the fact that the worm contains only 302 neurons! The point I am trying to make is that a "reasonable" ML algorithm could change the world in at least one concrete way, so keep up the good fight.
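The back-of-the-envelope estimate can be checked in a few lines. This is only a sketch: it treats the brain as a rectangular box (which overstates the real volume), reads the quoted rate as 40 hours per cubic mm, and assumes a 2,000-hour work year for the person-years conversion. Multiplying the three quoted dimensions by the quoted rate gives a little under 87 million hours.

```python
# Figures quoted in the post.
WIDTH_MM = 140
LENGTH_MM = 167
HEIGHT_MM = 93
HOURS_PER_CUBIC_MM = 40  # assumption: the quoted rate is per cubic mm of tissue


def reconstruction_hours(width_mm, length_mm, height_mm, hours_per_mm3):
    """Hours to trace a box-shaped volume by hand at a fixed rate per cubic mm."""
    volume_mm3 = width_mm * length_mm * height_mm
    return volume_mm3 * hours_per_mm3


def person_years(hours, hours_per_year=2000):
    """Convert work hours to person-years; 2000 h/year is an assumed schedule."""
    return hours / hours_per_year


if __name__ == "__main__":
    total = reconstruction_hours(WIDTH_MM, LENGTH_MM, HEIGHT_MM, HOURS_PER_CUBIC_MM)
    print(f"{total:,} work hours")
    print(f"{person_years(total):,.0f} person-years")
```

Either way you slice it, that is tens of thousands of person-years, which is the whole case for automating it.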

Thursday, September 17, 2009

Refreshing Plausibility

One popular snowclone is to say that "If I had an X for every time that I've heard Y, I'd have enough X's to be/do Z." Usually these are ridiculous exaggerations once broken down. For example, "If I had a penny for every time you said you'd clean the dishes later, I'd be a billionaire." Totally untrue and impossible.

In one of my classes today someone said, "If I had a dollar for every time I heard X, I could buy us all beer tonight. But I don't, so you will all have to buy your own." This managed to be actually funny. On closer inspection I think it's because what he described is a legitimately plausible scenario, especially coming from a snowclone in which we are habituated to hyperbole. There were about twenty people in the class. It is reasonable for him to have heard this particular argument thirty or so times, and that probably adds up to enough cash to get us all tipsy on tall cans of PBR.

This plausibility humor runs directly contra to the explanation of humor as exaggeration, which claims 400k+ Google hits. The website My Life is Average also is contra to the exaggeration hypothesis, humorously parodying the excesses of FML. It seems that the balance of what is funny at a given moment really does wax and wane like a sine wave.

Tuesday, September 15, 2009

Searching for the Imdb of Books

Here are the contenders:

Amazon Reviews: Upside: Tons of traffic. Rating the raters in terms of helpfulness and having a top 1000 reviewer list both give incentives for people to take their ratings seriously. Plus, most serious internet users already have Amazon accounts with demographic info, so they don't have to pressure people into joining. Thus all rating requires is one click. Downside: Amazon's team has all the tools, but they just don't seem to want it. They're like the Carmelo Anthony of online book rating. They don't really take their raters' reviews seriously, they refuse to expand to a much more informative 10 star system, and they don't give out actual numbers. According to their system, The Shawshank Redemption VHS Tape is the #6 best book of all time. Listen, I'm willing to accept Amazon has much bigger fish to fry. They're trying to take over the world, after all, and they don't have the energy to create a silly top books of all time list. But what that means is that although Amazon may be the best right now, the true Holy Land will have to be found elsewhere.

The Internet Book List: Upside: Pretty solid system in place with cool ratings breakdown on a book by book basis, and a large emphasis on the actual ratings, which is basically the only point of the site. Downside: Scale! The most rated book on the site is Tolkien's The Fellowship of the Ring, with 541 votes. Compare that to the 5,100 who rated HPGOF on Amazon, or the 440,000 who have voted on Shawshank on imdb. Some of this is inherent to the medium, as it takes more time and effort to read a book than it does to watch a movie. But there are more book readers out there than 500, and in order to get them to the site there's going to have to be some other kind of attraction aside from the ability to rate. Of course, the internet wouldn't be in a Nash equilibrium if building online communities were easy. I recognize that achieving scale is tough.

ISBNdb.com: Upside: Tons of web crawlers on various library web sites have created a massive database that categorizes books by subject and offers a link to the lowest possible price. It's a very cool tool, and tries to explicitly model itself after imdb. Downside: No actual rating system. Included in the list anyways because that isn't the hardest problem in the world to fix.

Metacritic Books: Upside: Standard metacritic methodology applied to books yields a solid if unspectacular rating system. The opinions of critics may be slightly better in some ways but never enough to outweigh the benefits of pure bias-defeating scale. Attractive interface, easy to browse web-site, and a best of all-time list, in theory. Downside: Stopped aggregating ratings for books in 2007, presumably because it wasn't profitable. This raises some troublesome questions. Most importantly, why do people seem to care less about the ratings of books than of movies? One has to put such a big time investment into reading a book that from a time-optimization perspective research into its quality would be of huge value. Anyway, metacritic's book rating section is done. Put a fork in 'em.

Complete Review: Upside: Breaks down the top books into tiers by categories. Editorial board is willing to take a stand and submit their independent opinion. Downside: No user input and the lack of a quantifiable system means that it will never scale. At best, it's another data point in a more encompassing effort.

Reviews of Books: Upside: Aggregates reviews by various important book reviewers and gives details regarding nominations for major prizes, like the Booker or the National Book Award. Downside: Nothing quantifiable. No overall list of the best books.

Internet Book Database: Upside: Ranks authors, books, and series according to both number of votes and average rating. Has some good ideas for how a website ultimately could be done, despite technical issues. Downside: For some reason they've bought into the stupid Amazon model of 5 star rankings. Just like Amazon they don't give the actual numerical average rating. But they go further and don't even show the total number of raters. This makes me uneasy about their methodology--does the site even weight the top rated authors by the number of votes in a Bayesian fashion? Moreover the lack of transparency is troubling--it's hard to tell where exactly their data comes from. The fact that Lara Leigh is apparently the #2 author of all time yet doesn't even have her own Wikipedia page lends legitimacy to these worries.
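To make the "Bayesian fashion" worry concrete, here is a minimal sketch of a vote-weighted rating. It is not any site's actual formula, just the standard shrinkage idea: each item's average is pulled toward a site-wide prior mean, and the pull fades as votes accumulate. The prior mean, prior weight, and the two example authors are made-up numbers for illustration.

```python
def bayesian_average(mean_rating, num_votes, prior_mean, prior_weight):
    """Shrink an item's mean rating toward prior_mean.

    prior_weight acts like a number of phantom votes cast at prior_mean,
    so items with few real votes stay close to the prior.
    """
    total = num_votes * mean_rating + prior_weight * prior_mean
    return total / (num_votes + prior_weight)


if __name__ == "__main__":
    PRIOR_MEAN = 3.5    # hypothetical site-wide average on a 5 star scale
    PRIOR_WEIGHT = 50   # hypothetical number of phantom votes

    # Hypothetical authors: (name, raw average, number of votes)
    authors = [("Obscure Author", 5.0, 3), ("Popular Author", 4.4, 800)]
    ranked = sorted(
        authors,
        key=lambda a: bayesian_average(a[1], a[2], PRIOR_MEAN, PRIOR_WEIGHT),
        reverse=True,
    )
    for name, raw, votes in ranked:
        adjusted = bayesian_average(raw, votes, PRIOR_MEAN, PRIOR_WEIGHT)
        print(f"{name}: raw {raw:.2f} from {votes} votes, adjusted {adjusted:.2f}")
```

With a raw average alone, three perfect votes beat 800 very good ones. With the adjustment, the ordering flips, which is the behavior you would want from a top authors list.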

LibraryThing: Upside: Lots to like here. Tons of statistics on books and authors, including the most highly rated, the most reviewed, and the most number of times a book appears in a user's library. They actually show the numerical rating that a book gets and the pages for individual books are useful. Since they allow half stars and allow us to see the actual numerical rating, I can't even really complain about their use of a five star system. Downside: A little bit too hipster in that they refuse to actually number their lists and seem to be going for the minimalist look. But hey, if it gets raters, that's all that matters, right?

All of these sites may one day blossom into something useful. If it's legal, there would be value in some site aggregating all of these ratings plus a few others to become a meta-rater. But for now, I'm still searching...

Monday, September 14, 2009

Implications of Illegal Parking

Parking your car facing against the flow of traffic on that side of the street is a ticketable offense in most cities. The traffic cops are allowed to infer that you must have broken the law in order to get your car into that orientation.

This technicality has broad philosophical implications, because it means that the state can arrest people based on inferences in general. For example, shouldn't we give marijuana possession citations to anyone playing ultimate frisbee? Don't we have to have a congressional hearing for any non-Olympian that can bench press 400+ pounds? Mustn't we arrest all current politicians in Chicago and New Jersey on the basis that there's no way they could have achieved their position without bribing a few people during their ascent? I say, perhaps regrettably, that we must. Consistency and honesty demand it.

Sunday, September 13, 2009

Against Political Moderatism

In his long essay on the issues with the Golden State, Troy Senik notes that although the media loves to sympathize with centrism, California's moderates are in fact the most responsible for the state's budget crisis. They make the politically expedient moves characteristic of both sides, never saying no to a spending increase or yes to a tax hike. This stance endears them to the short-term interests of voters but hurts the state in the process.

"Moderate" itself is a tantalizingly vague word. Does it mean that one stands in the middle of Dems and Repubs on all issues, always going for the compromise? Does it mean that one stands on the Dem side on some issues but the Repub side on the other, in around a 50-50 ratio? Or does it mean that one is so free thinking as to be completely untethered to party opinions? Although it is very sexy to call oneself a moderate, we should be wary of those who do so, and especially wary of those who are unable to elaborate on which type of moderate they are.

Saturday, September 12, 2009

What I've Been Reading

1) Cities of the Plain, by Cormac McCarthy. The overall quality of reading experience is good as always with Cormac, even if the actual story doesn't have as much to offer. The best line was, "Where do we go when we die? he said. I don't know, the man said. Where are we now?"

2) Creativity in Science, by Dean Keith Simonton. He uses statistics and historical records to explain the developmental and career factors underlying scientific creativity, based on publication data. It's a quant approach and at the beginning you must slog through theoretical underpinnings, learning about stuff like the equal odds rule, Price's law, and the backwards J-shaped curve of publications within a lifetime. But the qualitative observations he derives from this background knowledge are well worth the entry costs. See my notes here for more.

3) The Illusion of Conscious Will, by Daniel Wegner. This book argues that although you have the sensation of walking around and making decisions on a day-to-day basis, there is good reason to expect that this sensation is contrived. For example, when a decision is surreptitiously made for a subject, that person will still inevitably find some reason to rationalize that action. This is clearly outrageously counter-intuitive, but at the same time it's hard to pick holes in his argument. It's a terminology laden book, but for me it was worth the effort to understand.

4) The Crying of Lot 49, by Thomas Pynchon. Super weird and windy plot that has some seriously impressive moments. Conspiracies within conspiracies. I found myself underlining tons of words I didn't know, so much so that it became more of a jigsaw puzzle than a novel. And at 160 pages, the opportunity cost is not too cumbersome.

5) Grooming, Gossip, and the Evolution of Language, by Robin Dunbar. His thesis in this book is that social communication is the hominid form of the constant grooming seen in other primates. Instead of merely stating that "humans are social creatures" like nearly everyone else does, he proposes some explanations for why that might be. Among his evidence for his argument is that sixty to seventy percent of conversations focus on purely social topics, like personal relationships, personal likes and dislikes, personal experiences, the behavior of other people, and such. There's also a really sweet section that describes how the limit to a conversation size is about four people, above which the conversation will falter and eventually break down into multiple groups.

6) Synaptic Self: How Our Brains Become Who We Are, by Joseph LeDoux. He explains some interesting experiments about emotions, especially with respect to the amygdala. But I wanted more detail, more methodology, and more numbers, and ultimately this book was written with a more casual fan in mind.

7) Bonfire of the Vanities, by Tom Wolfe. I was annoyed by the beginning of the book because I thought it was too predictable, but once the predictable event occurred near the middle the book gained steam. Indeed, I couldn't put it down for the last 200 pages and stayed awake until 4 AM to finish it, a remarkable feat given how much I value my sleep. The core theme of this book is as relevant in 2009 as it was in 1988, and many of the details haven't even changed. Recommended if you're willing to make a long-term commitment.

Friday, September 11, 2009

Expanding Medical Education

Interesting article by Richard Cooper in favor of eliminating the caps on funded residency positions, the limiting factor on the expansion of graduate med education. He argues that this change is not broadly supported because of the systemic myth that more specialty physicians lead to worse health outcomes, which is an artifact of faulty statistics. In reality more spending leads to better outcomes, especially among higher-income patients, because the care consists of a broader spectrum of services. Cooper indicates as much in this interview.

Here's one particularly juicy segment of his article, discussing the methods of the Dartmouth group's “30% solution” study:
The surprising observation was that outcomes were the same in all of the quintiles. The lowest-spending quintile was like the highest. How was that possible? The answer lies in a third problem—the aggregation error. By including a diverse range of hospital regions with diverse total health care spending (despite similar Medicare spending) and a diversity of subpopulations in each, the averages within each quintile regressed to the mean. The extremes of affluence and poverty in the highest-spending quintile came to resemble middle America. However, had meaningful subpopulations within each quintile been compared, striking differences in utilization and outcomes would have been observed, and their strong relationship to communal wealth, individual income, and clinical risk would have been appreciated.

The Dartmouth group apparently saw no need to disaggregate their quintiles to reveal the effects of income and risk. They were content with the knowledge that nothing was “necessarily better,” nor was anything worse. And, because nothing was better in the highest-spending quintile, the added spending was assumed to have been wasted; and because this waste was unexplained, it must have been due to the excess use of “supply-sensitive services”; and these services must have resulted from an oversupply of specialists.
This may seem esoteric and ivory tower-esque, but it's not--this Dartmouth study is the research cited and supported by the CBO, NYT, and lots of politicians.
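The aggregation error is easier to see with toy numbers. The sketch below uses entirely made-up outcome scores (not the Dartmouth data) for two spending quintiles, each split into an affluent and a poor subpopulation. The quintile-level averages come out identical even though the subpopulations inside them differ a lot.

```python
# Hypothetical records: (spending_quintile, subpopulation, patients, outcome_score)
RECORDS = [
    ("lowest", "affluent", 500, 80.0),
    ("lowest", "poor", 500, 60.0),
    ("highest", "affluent", 500, 85.0),
    ("highest", "poor", 500, 55.0),
]


def quintile_mean(records, quintile):
    """Patient-weighted mean outcome for a whole quintile (the aggregated view)."""
    rows = [r for r in records if r[0] == quintile]
    patients = sum(r[2] for r in rows)
    return sum(r[2] * r[3] for r in rows) / patients


def subgroup_means(records, quintile):
    """Mean outcome per subpopulation within a quintile (the disaggregated view)."""
    return {r[1]: r[3] for r in records if r[0] == quintile}


if __name__ == "__main__":
    for quintile in ("lowest", "highest"):
        print(quintile, "overall:", quintile_mean(RECORDS, quintile))
        print(quintile, "by subgroup:", subgroup_means(RECORDS, quintile))
```

In this toy setup a quintile-level comparison would report "no difference," while the gap between affluent and poor patients is wider in the highest-spending quintile. Whether the real data look like this is exactly what disaggregating would reveal.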

In his interview, Cooper favors the creation of an insurance system that includes everyone and leaves the rest of the system alone. But he notes that this will likely lead to a doctor shortage as in Massachusetts, which leads us back to the need for expanding graduate med education.

Monday, September 7, 2009

Save Your Energy

It's common at the beginning of some endeavor like a semester or a sport season to be filled with adrenaline and want to dominate the whole thing all at once. This is healthy and moreover it's a good sign for whatever you're going to be doing, so rejoice. And then calm down.

These moments at the beginning of your endeavor are unlikely to be important compared to the moments in the middle or towards the end of it. So, you want to capture your naive energy and bottle it up like a potion of Felix Felicis. Wait until you're prepping for your third final in a row, or bouncing back after the night of your most crushing defeat, or chugging through the fourth hour of a brutal five hour exam. Only then do you dig into your deepest reserves. Before then, it's just a waste.

Thursday, September 3, 2009

My Definition of Consciousness

Like any other part of a biological system, our consciousness has been molded via natural selection. In humans, any divergences in consciousness as compared to other animals likely serve the purpose of presenting oneself as an individual that is stable as well as predictable, and therefore able to cooperate in the reciprocal altruism scenarios that were essential to survival in the Pleistocene. Although it is often cast as an intractable problem, we all know what it is like to experience consciousness. Relativistic philosophers often ponder whether someone could perceive a color in a different way and not know so, but this is highly improbable. Our three types of cone cells respond to ranges of electromagnetic wavelengths in such a specific way that there is little reason to expect that there might be variation in their function between members of the same species, aside from the few color blind exceptions that we already know of. All of the other environmental inputs that we detect are also transduced through physical processes that leave little room for variation or suggest any potential benefit from it. Not all of these sensory inputs reach the level of consciousness. Those that come close compete for attention in the brain and the few that we are able to focus on or recall past instances of at any given moment are what constitute the experience of consciousness.

I don't see why this is so often branded as a "mystery." Certainly the mechanisms for some of the necessary processes are currently quite undetermined but there is no reason that they will remain so forever. There is also no need to suggest anything unphysical or nonclassical.

Wednesday, September 2, 2009

The Alien Nation

It has been said that before you judge someone you should walk a mile in their shoes, because then you will be a mile away and they won't be able to run after you without potentially hurting their feet. Hodson et al designed an experiment that is based on this principle as an intervention to fight overt anti-homosexual biases.

Participants imagine that they have crash landed on a foreign planet. Humans exist on the alien planet but face constraints similar to those inadvertently experienced by homosexuals on ours. That's because the aliens on the planet look like humans but don't allow any PDA, live in same sex housing, and reproduce via artificial insemination. Basically, these aliens are total prudes. The participants get into groups and discuss strategies for how they would cope with such a situation.

By taking the perspective of a gay person in a non-explicit way, the participants were exposed to the side of the out-group while their guard was down. Participants in the Alien Nation group had average "intergroup perspective-taking" scores on a post-study questionnaire of 4.03 out of 7, as compared to students that listened to an educational lecture on homophobia, who had scores of 3.07 out of 7. Moreover, Alien Nation group members had attitudes towards homosexuals scored at 75.87 out of 100 as compared to 64.26 out of 100, another positive and significant difference. You can see the effect sizes for yourselves, which remained steady when attitudes were retested after one week.

There are a number of ways to explain the efficacy of this study, but I think its main advantage is that it minimizes negative backlash. It is very easy to tune out a lecture or dismiss a speaker, but when participants are forced to take a perspective themselves, they have no choice but to engage. As virtual reality tech improves and as more creative forms of therapy are devised, we should try to make such attitude interventions both more active and less antagonistic.

(HT: BPSRD. Ref: Hodson et al, 2009, doi: 10.1016/j.jesp.2009.02.010)