Friday 21 October 2016

Just an Essay Justifying "Universal Grammar" in a non-technical way

21. How successful, in your view, is Chomsky’s attempt to interpret the theory of grammar as an investigation of a human biological capacity?

The research programme that Noam Chomsky began in linguistics, starting with the publication of his monograph, Syntactic Structures, in 1957, and properly expounded in his 1965 work Aspects of the Theory of Syntax, is not very well understood by the majority of its popular and even academic critics. Chomsky’s frequent failure to hedge his claims about the postulated “language organ” (and his penchant for overly strong pronouncements) has contributed to this failure of understanding, since few critics bother to understand how he explicates these terms in his methodological framework or where they sit in his metaphysics of mind. Of course, at the same time, he has himself ignored some issues in biology that ought to have compelled him to slightly change his tune on “innateness”. Nevertheless, in this essay, I claim that, if one does properly understand the methodological and philosophical foundations of the generative-grammar-based research programme into the “Universal Grammar”, two things are evident: 1.) that the programme has been about as successful as its progenitors hoped, if not more (with no degeneration, in Lakatos’ terms), and 2.) that the programme has been successful in the independent sense that it has led to genuine insights about the human mind.

One of the major things that critics of ‘UG’ are wont to gloss over is that, strictly speaking, there is no one “theory of Universal Grammar”. Instead, there is only a general ‘UG’-research programme based on the construction of generative grammars. These grammars are held to capture the innate implicit knowledge which is falsifiably (and each model ought to be falsifiable individually) held to be required for the achievement of human linguistic competence (the generative grammar in the human mind).[1] As this implies, the empirical support is not (and could not be) uniform for ‘UG’, understood as the UG-research programme. The approaches to generative grammar that have been developed as part of this overall research programme since Chomsky’s exposition of a Transformational Grammar approach in Aspects of the Theory of Syntax (1965) vary in the level of ‘innate’ ‘knowledge’ they postulate. Different evidence is therefore required to falsify each of these approaches, and to confirm any one of them relative to the others. In particular, a stronger version of the famous “Poverty of the Stimulus” argument (one which draws on more evidence of linguistic universals, infant reliance on rules and examples of “structure-dependence” that are hard to explain empirically) is needed to support the early Transformational Grammar models than to support the models proposed since the Principles and Parameters paradigm shift. The Minimalist Program, for its part, starts from the postulation that the generative system is simpler than a grammar (per se), and is therefore probably more prone to rejection by evidence of too many universals and such (this would, of course, be rejection relative to another model of generative grammar).
It is very important to acknowledge, of course, that there are some very sensible philosophical critics of the UG research-programme, like the Australian-born philosopher of biology, Fiona Cowie [1999], who see Chomsky’s very use of the word “innate” as problematic from a scientific viewpoint. One of Cowie’s motivations for her, in the end, quite moderate critique of “nativist” theories in What’s Within? is the fact that the word has no clear scientific interpretation. She shares this view with her fellow philosopher of biology, Paul Griffiths (of the University of Sydney), who in his 2002 paper “What is innateness?” points out that the word “innate” is no longer used in any field of biology except cognitive science, and argues that it ought to be discarded in cognitive science, too, since innateness is a “folk-biological” concept which yokes together three biologically separate notions: species-typicality, developmental fixity and intended design [Griffiths, 2002: 2].
Whilst I myself agree with this critique, I do not think it is in the least bit destructive of Chomsky’s research-programme, because I also think (and Griffiths does not explicitly deny) that the generativist talk of an “innate language faculty” can simply be replaced by one of two other phrases: a “developmentally canalised, species-typical language faculty” (in the case of Chomsky and his fellow non-adaptationists (spandrelists) about the evolution of the language faculty) or “an adaptive, environmentally canalised, species-typical language faculty” (in the case of Pinker, Jackendoff and the other adaptationists about the evolution of the language faculty).[2] One small criticism I have of Chomsky (along with fellow generativists like Charles Yang, Robert Berwick, Steven Pinker, Ray Jackendoff, etc.) is that he hasn’t made this terminological alteration himself, still using the language of Cartesian or Humboldtian rationalism, of which he has always regarded his work as a direct descendant [Chomsky, 1965, 1986]. However, since I don’t think that the use of this unscientific language is a significant problem for the UG-research programme, I will myself, in the rest of this essay, always type the folk term in single inverted commas while still assuming that I am defending Chomsky.
Past this terminological hurdle, the single biggest reason why I think that Chomsky’s research programme has been a success in both the senses I outlined in my introduction is empirical validation. It seems to me quite clear that the theory that Homo sapiens does have a developmentally canalised, species-typical language faculty has not been falsified over the decades, but is instead clearly still the best (indeed, the only) explanation for the empirical evidence of rapid language acquisition and human linguistic competence.
Despite the impression evoked by some critics, Chomsky’s generative research-programme did not begin as a scholastic, a priori enterprise, but was motivated by empirical considerations of a fairly fundamental kind. The reason that Chomsky’s famous 1957 monograph, Syntactic Structures, is often heralded as the founding document of modern cognitive science, despite not framing itself as a work of mentalistic investigation (and containing no argument for the existence of an ‘innate’ “language faculty”) is that Chomsky’s formal conclusions in SS directly entail the powerful scientific conclusion that at least one language (English) cannot be understood in behaviourist terms, and that English speakers must ‘possess’ (in some vague sense, leaving representational and acquisitional issues aside) what Chomsky would later call, in Aspects of the Theory of Syntax, “a system of generative processes” [1965: 4]. The most pivotal part of Chomsky’s monograph is his chapter 3 proof that a finite-state or Markov model (a model which involves the mono-directional chaining of words) is simply inadequate to generate all the grammatical sentences of English, and the implication, explored in the next two chapters (“Phrase Structure Grammar” and “Limitations of Phrase Structure Description”, in which he introduces the notion of a “transformation” to complement the inadequate phrase structure grammar), that an adequate generative grammar must be a hierarchical or syntax-based grammar of some kind.[3] Although the inadequacy of the finite-state model perhaps should have been obvious, it was no trifling result, because it refutes the strongest empiricist view about language production. 
If it is impossible to formally generate English sentences by a finite-state model, then it is also impossible to generate English sentences by the kind of finite-state model one might want to implement in a computer, or one might imagine existing in a brain: a complicated word-chain device relying on transitional probabilities. One instead needs generative principles, and this fact literally precludes the strongest empiricist understanding of language production.[4]
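A toy sketch in Python may make the contrast concrete. It is an illustration only, not Chomsky’s chapter 3 proof: the miniature language (S → "rain" | "if" S "then" S), the three training sentences and all the function names are assumptions invented for this example. The word-chain device here is the simplest possible one, recording only which word may follow which, and the sketch shows just one way such a device goes wrong: having seen properly nested sentences, it also accepts a string with an unmatched "then", whereas a recursive rule does not.

```python
from collections import defaultdict


def train_word_chain(sentences):
    """Record which word may follow which: a minimal word-chain device."""
    follows = defaultdict(set)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for left, right in zip(words, words[1:]):
            follows[left].add(right)
    return follows


def chain_accepts(follows, sentence):
    """Accept a sentence if every adjacent word pair was seen in training."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return all(
        right in follows.get(left, set())
        for left, right in zip(words, words[1:])
    )


def parse_s(tokens, i=0):
    """Parse one S starting at index i; return the index after it, or None.

    Hypothetical toy grammar:  S -> 'rain' | 'if' S 'then' S
    """
    if i < len(tokens) and tokens[i] == "rain":
        return i + 1
    if i < len(tokens) and tokens[i] == "if":
        j = parse_s(tokens, i + 1)
        if j is not None and j < len(tokens) and tokens[j] == "then":
            return parse_s(tokens, j + 1)
    return None


def grammar_accepts(sentence):
    """Accept a sentence only if the whole string is exactly one S."""
    tokens = sentence.split()
    return parse_s(tokens) == len(tokens)


training = [
    "rain",
    "if rain then rain",
    "if if rain then rain then rain",
]
chain = train_word_chain(training)

nested = "if if rain then rain then rain"   # well formed: each 'if' has its 'then'
unmatched = "if rain then rain then rain"   # ill formed: one 'then' too many

print(chain_accepts(chain, nested), grammar_accepts(nested))
print(chain_accepts(chain, unmatched), grammar_accepts(unmatched))
```

The chain accepts both strings because it sees only neighbouring words, while the recursive rule keeps track of which "if" is still waiting for its "then". A larger finite-state device could handle nesting up to some fixed depth, but not to arbitrary depth, and that is the gap the formal argument turns on.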
Evidently, behaviourists wanted to avoid talking about mentation at all, but the theory of language production found within works like Skinner’s Verbal Behavior (the subject of Chomsky’s famously destructive 1959 review) clearly rules out the possibility that human language is produced by a generative grammar. As Steven Pinker writes in The Language Instinct, the finite-state model is directly “congenial to stimulus-response theories: a stimulus elicits a spoken word as a response, then the speaker perceives his or her own response, which serves as the next stimulus, eliciting one out of several words as the next response, and so on” [1994: 93].
Of course, in Syntactic Structures itself, Chomsky uses his chapter 3 proof to motivate a more purely methodological conclusion: that linguistics ought to shift from static, structural description of corpora – the behaviourist methodology of American Structuralism[5] – towards the construction of the kind of descriptively adequate generative grammar he attempts to construct for English in SS (a transformational phrase structure grammar). Yet it is not hard to see how this led to the far bolder scientific shift expounded in Aspects of the Theory of Syntax. In order to make the case for an entirely new vision of linguistics in Aspects, Chomsky extended his insight about the necessity of a creative model for English sentence-production to all languages (on the strongly empirically based contention that no other human language can be adequately described by a finite-state model either, supplemented by the empirically based assumption of strong human cognitive universality), and combined this with an argument in support of (essentially) 18th Century Rationalism. His new claim was that the aim of the discipline should be to construct generative grammars that are descriptively adequate for all natural languages, and explanatorily adequate as “universal grammars”. That is to say, linguists should attempt to describe the ‘innate’ linguistic competence or “system of generative processes” held to be necessary for the acquisition of any natural language by a child [1965: 6].
At this point, I should note that, whilst I am claiming that Chomsky’s research-programme was empirically motivated at its inception, I am not claiming that all its empirical presuppositions were thoroughly confirmed in 1965. It clearly might have been the case that the evidence collected after 1965 strongly disconfirmed these empirical presuppositions, and thus rendered the research programme untenable. If it had turned out that there were significant cognitive group-differences in Homo sapiens – that some populations don’t have (and couldn’t have) language with any recursion (as Daniel Everett claimed to show, falsely[6]) – then that would have been a kind of falsification of the universalist aspect of the programme (although, presumably, that wouldn’t have sounded the death-knell for the investigation of the linguistic competence of the human populations with the developmentally canalised language faculty). Similarly, if “Nim Chimpsky” (or, for that matter, some other animal) had proven capable of learning actual grammar, rather than mere sign-strings, then that would have been a blow to Chomsky’s pretty significant working assumption that the language faculty is unique to humans (and it would probably have forced him to attribute to the ‘language faculty’ a lower degree of developmental canalisation).
More interestingly, it might have been the case that the evidence gathered after 1965 strongly favoured the hypothesis that language acquisition occurs by means of solely domain-general cognitive processes, as Michael Tomasello’s “usage-based theory of language” holds (a theory of language which, despite its domain-generality, avoids getting wrecked on Chomsky’s SS proof about the inadequacy of Markov word-chains by positing that speakers learn entire grammatical constructions by means of a powerful “theory of mind” and contextual-awareness) [Tomasello, 2003]. If it had turned out that the condition known as “Specific Language Impairment” was in all cases actually just a misdiagnosed general cognitive impairment, or that all fully articulate older children diagnosed with autism or an autism-spectrum disorder have either been misdiagnosed or had full cognitive empathy in the critical period for acquisition, then that would lend strong support to a Tomasello-type theory over one which postulates a language faculty.
As it stands, however, things have not turned out this way. Thus, the UG-research programme has been completely justified in continuing to exist and grow.

In more recent years, some subtler empirical objections to the UG research-programme have come from the AI community, which has had far more success with trained statistical models (for example, probabilistic context-free grammars) than categorical models (any of the ‘pure’ models created by generativists). The schism between these two worlds – one practical, one theoretical – came to the fore in 2011 after Chomsky made some highly disdainful comments about statistical approaches to “various linguistic problems” (accusing statistical modellers of doing ‘butterfly-collecting’ rather than “science”) at the Brains, Minds and Machines symposium held on MIT’s 150th Anniversary. Soon after, the Director of Research at Google, Peter Norvig, published an essay on Chomsky online in which he argued that the father of linguistics was entirely in the wrong, since probabilistic models have been far more successful in actual implementations than any categorical ones [Norvig, 2011]. Nevertheless, as Chomsky’s student Charles Yang has argued in response to similar critiques, whilst there is now “a good deal of evidence against the ‘triggering’ model of learning” (which, Yang happily concedes, deserves to be replaced by a “probabilistic model”, which is domain-general) “one needn’t, and shouldn’t, abandon the categorical theory of GRAMMAR”, which is domain-specific [Yang, 2007: 215]. In other words, it is still perfectly cogent to study the abstract generative processes, because, as Chomsky has always claimed, they represent the underlying linguistic competence and nothing more.

There is one kind of critique of Chomsky’s research-programme which concerns not the empirics of ‘UG’ but its metaphysics. Towards the end of her critique of ‘UG’ in What’s Within?, Fiona Cowie brings up an objection of exactly this kind. Tapping into a broader philosophical doubt many hard-line connectionists and ‘Churchlandian’ eliminativists have about the computational theory of mind in general, Cowie claims that it is deeply problematic that Chomsky cannot specify what the implicit “knowledge of language” [7] he theorises about actually is [Cowie, 1999: 274]. In particular, it is unclear, she claims, in what sense grammar actually could be “represented” [1999: 274].
Despite the seeming importance of this line of objection, Chomsky is simply a deflationist about this question of “representation”, and I think he is right to be so. The reality is that there is no possible theory of linguistic-competence – no possible explanatory scientific theory of language – other than one which posits ‘knowledge’ in the form of a system of “rules and representations” [Chomsky, 1980]. This means that Chomsky’s UG research-programme is the only possible scientific research programme into the faculty of language. The fact that, as Chomsky himself says in a reply to the philosopher Georges Rey (quoting Randy Gallistel) “we clearly do not understand how the nervous system computes,” or even “the foundations of its ability to compute,” should not put a halt to the only scientific investigation into the human language faculty, just as it shouldn’t put a halt to any other cognitive scientific studies (including in other species) [Chomsky in Chomsky and his Critics, 2003: 276]. As Chomsky says in that same reply, “surely no one expects that some isolable part of the organism is dedicated to digestion, or navigation, or language, or any other component that is singled out for investigation in any rational approach to the study of a complex system” [Chomsky, 2003: 276].

In summary, I believe that Chomsky’s attempt to interpret the theory of grammar as an investigation of a human biological capacity has been highly successful, according to any reasonable criteria for such things. The Universal Grammar-research programme he began in the early 1960s has enjoyed significant internal development while its core presuppositions have been strongly confirmed. This has, in turn, revealed to us important insights about the human mind.




[1] Of course, the “theory of Universal Grammar” might then be understood as the empirical presuppositions necessary for the tenability of this programme in general. However, across the history of the UG-research programme, it seems to me that the only constant empirical presuppositions are: 1.) human cognitive universality (no significant cognitive group-differences in Homo sapiens), and 2.) the existence, in all humans without severe impairment, of a developmentally canalised, domain-specific, computational ‘module’ (in the vaguest possible sense) which explains human “linguistic competence” (and possibly several other human abilities) for which we can construct generative models (not even grammars per se) by investigating the syntax of the world’s languages. It seems to me that many critics who mount general attacks on “Universal Grammar” believe they are attacking a stronger thesis than the conjunction of these two propositions.
[2] Strictly speaking, it would be best to rephrase Chomsky’s usage of “innate language faculty” with “a developmentally canalised, species-typical language faculty which originated as a spandrel but proved adaptive and was then selected for” [Chomsky, 2012: 14].  
[3] He doesn’t use the word “hierarchical” in SS.
[4] Of course, this in itself shows nothing about the ‘innateness’ (more properly, developmental canalisation, etc.) of the generative processes necessary for language production in an adult speaker. In itself, it also clearly doesn’t rule out the possibility of probabilistic language models more sophisticated than Markov chains (i.e. probabilistic generative grammars), or, arguably, Michael Tomasello’s usage-based theory of language [2003], which I’ll discuss later.
[5] Of which one of the major proponents was Chomsky’s teacher and mentor, Zellig Harris.
[6] Pirahã does have recursion, and (more fundamentally) Pirahã speakers can learn Portuguese, so Everett’s ‘argument’, such as it is, poses no problem for the research programme at all [Nevins, Pesetsky, Rodrigues: 2009].
[7] Which he has tried to re-term the “cognizance of language” in an attempt to stop the philosophical controversy.
