A friend of mine tells a story that's about as close as real life gets to this kind of language trick. He was helping a friend in college study for her Test of English as a Foreign Language. She was working on past tenses and, sincerely attempting to explain a mistake she'd made, he told her: "If you had had 'had' here, you would have had to have had 'had' there as well." Whereupon she screamed.
As English, this stuff is totally incomprehensible and unusable. Absolutely nothing is conveyed to actual human English speakers by saying the word 'buffalo' 400 times in a row.
If it is 'grammatical' then it is grammatical by virtue of conforming to some idealized grammar. But when this grammar is so far off not just from anything people say, but anything they can actually understand, it really only means that the idea that this grammar models real English has been reduced to total absurdity.
They're tricks, but they're far from incomprehensible. In my experience, both with this sentence and the buffalo one, there's a certain mental click when one "gets it", after which the sentence makes sense and one can "feel" its grammatical structure. It's a curious and rather Chomskyan experience. Before that, of course, the notion that such a string of words might mean anything is absurd. "Getting it" is much like those 3D visual puzzles where at first you see only a noise pattern, but when you hold it at the right distance and let your eyes refocus a certain way, a picture leaps out at you.
That's prosody. The difference in meaning isn't a consequence of the presence of the question mark, although that's what people think of as "grammatical." The presence of the question mark and the difference in meaning are both consequences of the prosodic differences between the two sentences.
This shows one of many challenges inherent in computational linguistics: "the written word" only encapsulates a small part of what it means to "speak English" or "understand English."
As a nice benefit, it makes the sort of grammarian who obsesses over the written word look (rightly) like they're missing the forest for the trees.
While you have a point in there, some things are still missing. In my opinion the buffalo sentence doesn't have enough prosody to ever make the verbal version intelligible without explanation. The sentence in the OP does, but it's also missing mandatory punctuation. Punctuation conveys a solid fraction of the information prosody does, and sometimes even contains information prosody doesn't.
Actually, I find it perfectly comprehensible when said aloud with the right emphasis and pacing - it's only difficult to understand when written down with no punctuation.
The trick to all of these sentences is to omit otherwise necessary punctuation. The rare exceptions are sentences describing recursive concepts. For instance, if you have a radar detector, the police will catch you because they have a radar detector detector, which makes it imperative that you own a radar detector detector detector.
Here's another example: who polices the police? If there were any one agency in charge of that, certainly we would call them the police police. But who polices the police police? Clearly, the police police police police the police police.
There may be a lack of pronunciation but written English has clear rules demanding punctuation. The use of quoted words without quotation marks in the OP 'sentence' is ridiculous.
Buffalo buffalo and radar detector detector detector are much more valid.
The article title is ungrammatical without correct punctuation. With correct punctuation, it's fine - and it is, after all, talking directly about an error in the very grammatical construct it is highlighting. It's a perfectly natural sentence that could easily come about in normal discussion.
The 'buffalo' one is just nonsensical - the word 'buffalo' just isn't used that way, and even with punctuation, needs to be separately explained for people to understand it - even if they are aware of the regional dialect that uses the word 'buffalo' as a verb.
If English were a workable language, English majors would have nothing to base their theses on. The 'had had had' exercise highlights the absurd nature of English. This philosophy on English is why it stopped evolving after its peak - post English after the death of the world's greatest playwright, Shakespeare (who wrote phonetically, may I add).
On the other hand it does give an insight into syntax trees and parsing.
Every language has absurdities. Genders for non-gendered things is one example.
What's really absurd about English is the contempt for diacritical marks. Other languages give you a clue as to how the word is pronounced, whereas in English, if I write 'wind', you don't know if I'm talking about air blowing or charging a mechanical clock unless you have context - which may come later in the sentence.
> absurdities: Genders for non-gendered things is one example.
This. I never understood how e.g. Spanish speakers think that a door is female or a clock is male. I mean, it's not like there are any body parts you can examine for a definitive answer, or clothing and mannerisms which let you make a pretty good guess...I never really got a satisfactory answer other than "it's usually -o or -a, but not always; really, you just have to memorize it." Seriously...WTF?
> diacritical marks
Other European languages love them. To me as an English speaker, they look like misplaced inkspots or dirt on my monitor. I never had any class in school or college that taught what they mean [1]. I blithely type "fiancee," "naive" and "Geiger-Muller," since I don't want to get out Character Map or whatever the Linux equivalent is [2], and I'm not really sure which marks to use or where to put them. I pretty much pretend they don't exist, unless they cause compiler errors [3], in which case I terminate them with extreme prejudice.
[1] I did once learn that an overline (a line above a character; I don't know if that's actually what it's called) means a long vowel sound, and an upside-down e means schwa. I haven't seen either of these used outside dictionary pronunciation keys.
[2] In my current operating system, Linux Mint, I don't even know how to get those characters other than copy-and-pasting the Unicode text somebody else has put on a webpage, or spending an hour or two sitting down with the RFCs that specify UTF-8 encoding and a hex editor. The only reason I know on Windows is that I eventually stumbled on Character Map by curiously exploring all the menus. This may give you a clue how often I deal with international text.
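For what it's worth, the hex-editor route isn't strictly needed. Here is a minimal Python sketch that prints the code point and UTF-8 bytes of each marked letter in the three words above. The helper name utf8_hex is made up for illustration, and the spellings with diacritics are the usual dictionary ones, not something from the comment itself.

```python
import unicodedata

def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of one character as space-separated hex."""
    return ch.encode("utf-8").hex(" ")

# The three words from the comment, with their diacritics written as
# escapes so the source file itself stays plain ASCII.
words = ["fianc\u00e9e", "na\u00efve", "Geiger-M\u00fcller"]

for word in words:
    for ch in word:
        if ord(ch) > 127:  # report only the non-ASCII letters
            print(f"{word}: U+{ord(ch):04X} "
                  f"{unicodedata.name(ch)} -> {utf8_hex(ch)}")
```

Each of these letters sits in the Latin-1 range, so each one encodes to two bytes in UTF-8.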
"I never understood how e.g. Spanish speakers think that a door is female or a clock is male. I mean, it's not like there are any body parts you can examine for a definitive answer..."
But "gender," as a term in grammar, just means an arbitrary classification of words for grammatical purposes. It's only a few languages which, absurdly, map these grammatical tags to biological or cultural sex distinctions. English compounds the absurdity by retaining this grammatical distinction only in this one bizarre case.
(My favourite example of the arbitrary nature of grammatical gender is Dutch. Dutch has two genders, common and neuter, which, if you wanted to map them to sex, would mean "either male or female" and "neither male nor female", respectively.)
English used to have diacritical marks, specifically the diaeresis, as can be seen in names like Zoë or in the surname Brontë (as in the family of English authors).
The New Yorker loves the diaeresis to this day, and frequently uses it in words like "coöperate".
I made the realisation when I went to Vietnam, where basically you write a sentence then shake a bagful of diacritics over it. I thought "Heh, English doesn't require any of that nonsense... hey... wait a minute..."
The story behind this sentence is of two students comparing their texts, the one with "had" (version 1), the other with "had had" (version 2), yielding 11 "had"s in a row. Call that story S_0. Now imagine a story S_(n+1) where two students write texts discussing story S_n, with version 1 replaced by the consecutive row of "had"s from S_n, and version 2 replaced by a similar row of "had"s, but one longer.
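A rough sketch of how fast that row grows, in Python. It assumes the usual form of the sentence (James, while John had had "had", had had "had had"; "had had" had had a better effect on the teacher), in which version 1 is quoted once and version 2 twice. The function names are made up for illustration.

```python
def had_row(v1_len: int) -> int:
    """Length of the run of "had"s when version 1 is v1_len "had"s long.

    Assumed layout: had had + v1 + had had + v2 + v2 + had had,
    where version 2 is one "had" longer than version 1.
    """
    v2_len = v1_len + 1
    return 2 + v1_len + 2 + v2_len + v2_len + 2

def story_row(n: int) -> int:
    """Run length for story S_n, starting from version 1 = one "had"."""
    length = had_row(1)           # S_0
    for _ in range(n):
        length = had_row(length)  # S_(k+1) quotes the row from S_k
    return length

for n in range(4):
    print(n, story_row(n))
```

Under that assumption each step roughly triples the row (h_(n+1) = 3*h_n + 8), so S_0 has 11 "had"s, S_1 has 41 and S_2 has 131.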
An example which itself suggests a possible recursion:
"Wouldn't the sentence 'I want to put a hyphen between the words Fish and And and And and Chips in my Fish-And-Chips sign' have been clearer if quotation marks had been placed before Fish, and between Fish and and, and and and And, and And and and, and and and And, and And and and, and and and Chips, as well as after Chips?"
The correct way to do this sentence is "I want to put hyphens between the words Fish, And,* and Chips in my Fish-And-Chips sign"
*The Oxford comma makes a lot more sense because it reflects how we speak, but even if you omit it, this is the correct way to present a list of items - you don't use 'and' between every one.
Edit: I was wondering which other rules I was breaking with that sentence - I'm sure there's more :)
I'm pretty sure the correct way to write any of these grammatical puzzler sentences removes the grammatical puzzle, because grammatical puzzles are difficult and impede understanding, which is presumably the entire purpose of language to begin with.
Or that episode of the Simpsons where Marge saw "SUGAR FREE DONUTS" until Apu added a comma between "sugar" and "free", and explained that it was actually sugar with free donuts.
Wouldn't the sentence 'I want to put a hyphen between the words Fish and And and And and Chips in my Fish-And-Chips sign' have been clearer if quotation marks had been placed before Fish, and between Fish and and, and and and And, and And and and, and and and And, and And and and, and and and Chips, as well as after Chips?
Wouldn't the sentence, "Wouldn't the sentence 'I want to put a hyphen between the words Fish and And and And and Chips in my Fish-And-Chips sign' have been clearer if quotation marks had been placed before Fish, and between Fish and and, and and and And, and And and and, and and and And, and And and and, and and and Chips, as well as after Chips?" have been clearer if quotation marks had been placed before Fish, and between Fish and and, and between and and between, and between between Fish, and between Fish and and, and and and and, and and and and, and and and and, and and and and, and and and And, and and And and and, and and and And, and And and and, and and and and, and and and and, and and and and, and and and and, and and and And, and And and and, and and and And, and And and and, and and and and, and and and and, And and and and, And and and and, And and and Chips, as well as after Chips?
I do not know where family doctors acquired illegibly perplexing handwriting, nevertheless, extraordinary pharmaceutical intellectuality counterbalancing indecipherability transcendentalizes intercommunication's incomprehensibleness.
word 1 = 1 letter, word 20 = 20 letters. A friend didn't like "intercommunication's", but hasn't replied on the worth of 'intercommunicationy'...
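The "word n has n letters" claim is easy to check mechanically. A small Python sketch follows; it counts only alphabetic characters, so the apostrophe in "intercommunication's" and the commas are ignored, which is an assumption about how the author meant the count to work.

```python
sentence = ("I do not know where family doctors acquired illegibly "
            "perplexing handwriting, nevertheless, extraordinary "
            "pharmaceutical intellectuality counterbalancing "
            "indecipherability transcendentalizes intercommunication's "
            "incomprehensibleness.")

def letter_counts(text: str) -> list[int]:
    """Count the alphabetic characters in each whitespace-separated word."""
    return [sum(ch.isalpha() for ch in word) for word in text.split()]

counts = letter_counts(sentence)
print(counts)
# True when word 1 has 1 letter, word 2 has 2 letters, and so on.
print(counts == list(range(1, len(counts) + 1)))
```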
No, crntaylor has it right: we're supposed to understand that James wrote what he did because it had had a better effect on the teacher in the past. That's how you can justify squeezing that one last "had" in there.
It’s is not, it isn’t ain’t, and it’s it’s, not its, if you mean it is. If you don’t, it’s its. Then too, it’s hers. It isn’t her’s. It isn’t our’s either. It’s ours, and likewise yours and theirs.
If you are using this as a mnemonic about its/it's, this is fine; but your statement of it as a rule ("the apostrophe replaces the missing letters in contractions") is misleadingly incomplete. Apostrophes do that, but they also serve as a possessive marker in the general case ("Sam's"), which is of course why its/it's causes so much trouble in the first place.
No need; I think the original formulation---as a mnemonic---is just fine. We don't really need a "general rule" here anyway, and to be honest, English orthography and "general rules" don't really go well together.
The only reason I posted at all is because linguistics is an area where a lot of quite intelligent people hold some extremely unexamined (and incorrect) beliefs, and there is furthermore a common tendency to propagate those beliefs as if they were fact. As a result, whenever I see someone articulating anything that is formulated like a general rule about a language (or about language in general), I try to make corrections where I can.
You took "the apostrophe replaces the missing letters in contractions" as me postulating that all apostrophes are only used for contractions, which is jumping to conclusions, IMO.
Even if that statement is taken as a general rule, I can't think of any contractions that don't use an apostrophe, and it certainly doesn't state that contractions are the only place where apostrophes are used.
In my Portuguese class we were given this ambiguous phrase as a riddle:
"Maria toma banho porque sua mãe disse ela traga a toalha."
We had to make those words make sense only by adding punctuation and without changing the order of any words. Can you figure it out?
For those who don't speak Portuguese, the phrase above translates to: "Maria takes a bath because her mom said to her bring the towel." Doesn't make much sense!
The trick is that "sua," which means "her" when the following noun is feminine, is also the present third-person singular of the verb "suar," meaning "to sweat." Thus, with a few commas and quotation marks, it suddenly makes sense:
Maria toma banho porque sua. "Mãe," disse ela, "traga a toalha."
=
Maria takes a bath because she sweats. "Mom," she said, "bring the towel."
Not nearly as ambiguous as the "had had had had" example, but a similar lesson regarding the need for punctuation.
Edit: as personlurking pointed out, suar is "to sweat," not (as I put originally) "to smell." Thanks for catching that!
Oh, yes, I remember that one. Portuguese is not my first language, but I am fluent. I've been given that phrase before (and failed on the suar part despite knowing that verb). Tricky, indeed. Just a small correction, suar is to sweat.
I recall a protest sign that said "Veta Dilma!", or "Veto Dilma!" (the President of Brazil), but the protester meant to put a comma in there, as in "Veto, Dilma!" because the protest was about a bill running through congress.
Another one was "Mesmo sujo, governo quer rio Pinheiros sem cheiro" (Even though it's dirty, the government wants the Pinheiros river to be rid of the bad smell). The problem is the wording which makes it seem like the government is dirty, and not the river. Better would have been "Governo quer rio Pinheiros sem mau cheiro, mesmo que sujo" (The government wants the Pinheiros river without the bad smell, even though it's dirty.).
How so? The problem comes from having a highly context-sensitive grammar, which is hardly something I associate specifically with functional languages; C++ is the usual language that people mock for having an all but Turing-complete grammar. I guess the other obvious candidate is Lisp, but that's a different beast altogether.
I agree with you that the context-sensitive grammar makes this sentence hard to parse. However, if you look at the clarifications of meaning, they break things into bits.
In some languages (e.g.: Java without lambdas), you can't write a function without giving it a name. You have to break things into bits and give them names.
In the functional languages you can just create a lambda and use it.
In a functional style conditionals return values you can use directly. In languages like Java you end up having to assign to temporaries in the branches of the if.
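To make the contrast concrete, here is a small sketch in Python, which is not a functional language as such but supports both styles. The word list and function names are invented for illustration.

```python
# Anonymous function used in place, without first binding it to a name.
words = ["had", "buffalo", "police", "and"]
by_length = sorted(words, key=lambda w: len(w))

# Conditional as an expression: its value is used directly.
def describe(n: int) -> str:
    return f"{n} " + ("word" if n == 1 else "words")

# Statement style, as in pre-lambda Java: each branch assigns a
# temporary, and the temporary is used afterwards.
def describe_with_temp(n: int) -> str:
    if n == 1:
        noun = "word"
    else:
        noun = "words"
    return f"{n} {noun}"

print(by_length)
print(describe(1), "/", describe_with_temp(11))
```

The second style forces you to break the thought into named bits, which is roughly what the clarified versions of the sentence do.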
I can't resist mentioning the Lion-Eating Poet in the Stone Den poem I came across recently. The poet plays with the many tones and many variants of the sound 'sh' in Chinese:
« Shī Shì shí shī shǐ »
Shíshì shīshì Shī Shì, shì shī, shì shí shí shī.
Shì shíshí shì shì shì shī.
Shí shí, shì shí shī shì shì.
Shì shí, shì Shī Shì shì shì.
Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shìshì.
Shì shí shì shí shī shī, shì shíshì.
Shíshì shī, Shì shǐ shì shì shíshì.
Shíshì shì, Shì shǐ shì shí shì shí shī.
Shí shí, shǐ shí shì shí shī shī, shí shí shí shī shī.
Shì shì shì shì.
Translation:
« Lion-Eating Poet in the Stone Den »
In a stone den was a poet called Shi, who was a lion addict, and had resolved to eat ten lions.
He often went to the market to look for lions.
At ten o'clock, ten lions had just arrived at the market.
At that time, Shi had just arrived at the market.
He saw those ten lions, and using his trusty arrows, caused the ten lions to die.
He brought the corpses of the ten lions to the stone den.
The stone den was damp. He asked his servants to wipe it.
After the stone den was wiped, he tried to eat those ten lions.
When he ate, he realized that these ten lions were in fact ten stone lion corpses.
Try to explain this matter.
Note that this poem isn't in any "real" Chinese language. Basically, it's written in Classical Chinese, but pronounced as if it were Mandarin.
Old Chinese (the spoken language on which the written Classical form was based) had a much more complex phonology, which was simplified in Mandarin, resulting in many Old Chinese words becoming homophones or near homophones. To reduce the ambiguity, Mandarin uses many more compound words than Classical Chinese.
For example, a poet is called shīrén in Mandarin (literally, "poet person") while this poem just uses shī.
"buffalo" can function as a verb (bully) and a plural noun (bison).
So we can write "Bison bully bison." as "Buffalo buffalo buffalo."
I'm going to tag my groups of bison with numbers.
> Bison(1) bully bison(0).
> Buffalo(1) buffalo buffalo(0). (3 words)
Now, we can qualify bison(1), by stating that they are bullied by another group of bison, which I'll call bison(2). We can rewrite "who are bullied by bison(2)" as "bison(2) bully" (similar to rewriting "food that is liked by me" as "food I like") so we get:
> Bison(1) who are bullied by bison(2) bully bison(0).
> Buffalo(1) buffalo(2) buffalo buffalo buffalo(0). (5 words)
Each further qualification of this kind adds two more words, so that gives us sentences of lengths 3, 5, 7, 9, 11, ... words. We can add a word to each sentence by just adding the qualifier "Buffalo" (meaning "from the city of Buffalo") before buffalo(0). This gives us sentences of lengths 4, 6, 8, ...:
> Bison(1) bully bison(0) from the city of Buffalo.
> Buffalo(1) buffalo Buffalo buffalo(0). (4 words)
So now we just need the sentences of lengths 1 and 2, which we can do by using the imperative. I.e. we give the general instruction "Bully." or the more specific instruction "Bully bison.":
> Buffalo. (1 word)
> Buffalo buffalo. (2 words)
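The construction above can be sketched as a short Python program that produces one reading for any length n. This is only an illustration of the argument: the function names are made up, the qualification is always nested on the subject, and the relative clauses are written in the reduced "bison bison bully" form so that each gloss word lines up with exactly one "buffalo".

```python
def noun_phrase(depth: int) -> list[str]:
    """'bison' qualified by `depth` nested 'bison ... bully' clauses."""
    if depth == 0:
        return ["bison"]
    return ["bison"] + noun_phrase(depth - 1) + ["bully"]

def gloss(n: int) -> list[str]:
    """An n-word reading using 'bison', 'bully' and the city 'Buffalo'."""
    if n < 1:
        raise ValueError("need at least one word")
    if n == 1:
        return ["bully"]             # imperative
    if n == 2:
        return ["bully", "bison"]    # imperative with an object
    depth, even = divmod(n - 3, 2)
    obj = (["Buffalo"] if even else []) + ["bison"]
    return noun_phrase(depth) + ["bully"] + obj

def as_buffalo(words: list[str]) -> str:
    """Replace every gloss word by 'buffalo', keeping the city capitalised."""
    out = [w if w == "Buffalo" else "buffalo" for w in words]
    out[0] = out[0].capitalize()
    return " ".join(out) + "."

for n in range(1, 8):
    print(n, " ".join(gloss(n)), "=>", as_buffalo(gloss(n)))
```

Whether the deeper nestings count as comprehensible is, of course, exactly what the rest of the thread argues about.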
The issue remains: can we call a sentence "valid" if it is entirely incomprehensible?
The sentence is nonsensical, because it refers to the same group three times - Bison from Buffalo. If bison from Buffalo bully other bison from Buffalo, it's tautological to again repeat the point that they bully bison from Buffalo.
You'd need more specific demographics to make the sentence valid, methinks, and when that happens, you're adding in a new word.
You're making the classic nerd error of assuming that natural languages should be as precisely defined as mathematics or computer languages. You're going to tell me that there's no such thing as deceleration next!
(To be fair, "Buffalo buffalo buffalo buffalo buffalo." isn't exactly a "natural" sentence.)
And in any case, the sentence is still syntactically valid, which is all I'm really concerned about here.
I'm not operating from logic, but from a 'natural' perspective. I've no problems with the syntax ("French citizens German citizens bully, bully Belgian citizens") though it's quite an awkward sentence in that format and any editor would require a rewrite, but it's just not something that someone would say if all the demographics were the same ("French citizens French citizens bully, bully French citizens") - ie, it's nonsense for that reason.
While the sentence is technically ambiguous, we say ambiguous things all the time and still manage to understand one another. This is the domain of the study of Pragmatics.
In natural communication, we START by assuming that the speaker had a meaningful intention, and then attempt to infer that meaning from context and assumptions about shared knowledge. I can easily think of contexts in which your "French citizens" sentence makes sense.
> Subd. 14. C. Notwithstanding the language contained in Subdivision 6 of this section, all “potentially dangerous animals,” as defined in this Ordinance, which are outside of the owner’s residence, must be kept on a suitable leash, or in an enclosure which restricts the animal’s ability to egress from the owner’s property.
Buffalo buffalo are not buffaloed by fences, but it might buffalo a Buffalo buffalo to see another Buffalo buffalo on a leash.