Which points out that while you can go from a bunch of individual case examples to a generality, you can't go from a generality to an individual case example very well ;-).
Does 80% accuracy mean that it will work for everyone? No. But it does mean that over hundreds of cases with hundreds of people, gender can be identified to a fairly high degree of accuracy.
Even so, if you look at the figures presented on the site itself, it's closer to 60% accurate than 80%.
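To put rough numbers on that, here is a minimal sketch in Python. Everything in it is made up for illustration: the 500 samples, the `tally` helper, and the hit patterns are assumptions, not figures or code from the site. It only shows that an overall accuracy figure still leaves a lot of individual writers guessed wrong.

```python
# Illustrative only: hypothetical sample size and hit rates, not the site's data.

def tally(n_cases, hits_per_ten):
    """Return (accuracy, misses) for a guesser that gets
    `hits_per_ten` out of every ten cases right."""
    # True marks a correct guess, False a wrong one.
    results = [(i % 10) < hits_per_ten for i in range(n_cases)]
    misses = results.count(False)
    accuracy = (n_cases - misses) / n_cases
    return accuracy, misses

# The claimed figure (80%) next to the figure the site's own numbers suggest (60%).
acc_claimed, misses_claimed = tally(500, 8)
acc_observed, misses_observed = tally(500, 6)

print(f"80% case: {misses_claimed} of 500 writers guessed wrong")
print(f"60% case: {misses_observed} of 500 writers guessed wrong")
```

Either way the headline number says nothing about which particular writers end up in the "wrong" pile, and at 60% it isn't all that far from a coin flip.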
What it comes down to is that you cannot apply mathematics to something that isn't mathematical.
There are a number of things that contribute to how a person writes... Their nationality, their social class, their level of education, whether or not English is a second language for the writer, how old they are, what mood they were in, whose point of view they were writing from... and I really could go on. It's a very long list. Gender is probably about last on the list of contributing factors. IMHO there is no way an algorithm is going to be able to take all of that into consideration. Sure, this one takes "genre" into account - that being fiction, non-fiction, blog entry. There is a difference between formal and informal language, so it's good they look at that... but there are just so many other factors... They should, at the very least, be asking country of origin and level of education!!
Call me skeptical, but there you go.