Language—like reading—may not be innate

August 12, 2023

Language is a uniquely human phenomenon that develops in children with remarkable ease and fluency. Yet questions remain about how we acquire language. Is it innately wired in our brain, or do we learn all facets rapidly from birth?

Two books – Rethinking Innateness and The Language Game – provide us with some fascinating perspectives on language learning that bear implications for how we think about learning to read and write and, furthermore, for how we talk about the power and limitations of AI.

A Review of Where We’ve Been

In a previous series, we pursued an interesting debate about whether learning to read is more unnatural than learning oral or signed languages. We also investigated the notion, frequently stated by “science of reading” proponents, that “our brains were not born to read,” while our brains are “hard-wired” for language.

While I agree with researchers Gough, Hillinger, Liberman and others that written language is more complex and abstract than oral language and, hence, more difficult to acquire, I’m not convinced that calling it unnatural is the most accurate description. Instead, I suggest terming it effortful.

In one of the earlier papers we examined, Liberman argued that oral language is pre-cognitive, meaning that it requires no cognition to learn and thus is more natural to acquire. He used this claim to counter the Goodmans’ assertion that oral and written language were largely synonymous, and that kids therefore could learn to read merely through exposure to literacy, rather than through explicit instruction in the alphabetic principle (“whole language”). While I most definitely don’t agree with the Goodmans, I paused on Liberman’s claim with some skepticism, as there is a subset of kids who struggle to develop speech and language skills, just as there is a subset of kids who struggle to develop reading and writing skills.

Liberman also made another strong claim that I paused on: that the evolution of oral language is biological, while that of written language is cultural. This parallels arguments that language is “biologically primary” while reading and writing are “biologically secondary.” I have questioned that distinction too, since it is harder to draw than it seems when social and cultural advancements have been deeply interwoven with human existence over generations. But I mostly accepted the premise, as it seemed self-evident that language is baked into our brains. After all, babies begin to attune to the languages spoken around them while still in the womb.

Liberman does not stand on his own in these assertions, I should hasten to add. I just bring one of his papers up because we spent time with it here. Noam Chomsky, for example, has long argued for a universal grammar, which is taught in foundational courses on linguistics, and the related study of generative grammars is alive and well.

Why is this important? It’s important because whether we consider language “natural” or written language “unnatural” bears implications for how we decide to teach them (or not). If we think of language as completely innate, then perhaps we don’t think it requires much teaching that is explicit, systematic, or diagnostic. Conversely, if we think of written language as wholly unnatural, we may not consider how to strategically design opportunities for implicit learning, volume, and exposure.

Yet I have just read two books, written in two different decades, that provide some really interesting critiques of the widely adopted supposition that language is innate.

Language Models

The first book, Rethinking Innateness: A Connectionist Perspective on Development, by Elizabeth Bates, Jeffrey Elman, Mark H. Johnson, Annette Karmiloff-Smith, Domenico Parisi and Kim Plunkett, was published in 1996, and approaches language through the lens of neuroscience, explaining connectionist models and their implications for neural development and learning. These models are not only part of the lineage of the current renaissance of Large Language Models, such as ChatGPT, but also part of a lineage of models that have informed our theoretical understanding of how children learn to read, and may continue to inform explorations of “statistical learning.”

I was led to this book by a recommendation from Marc Joanisse, a researcher at Western University, when he commented on my tweet (are we still calling them that?) about research on artificial neural networks that suggests they can accurately model language learning in human brains.

It was a great recommendation, and I found the book extremely relevant to ongoing conversations about AI and LLMs today, in addition to providing key insights from connectionist models into language and literacy development that challenge assumptions around innateness, such as:

Simulations show that simple learning algorithms and architectures can enable rapid learning and sophisticated representations, such as those underlying the competencies seen in young infants, without any innate knowledge.
U-shaped learning and discontinuous change also occur in neural networks without innate knowledge, due to architecture, input, and time spent on learning. This parallels studies of the development of linguistic abilities in children, such as the learning of the past tense and pronouns.
The way in which neural networks learn new things can be simple, yet the learning yields surprisingly complex results. This complexity emerges as the product of many simple interactions over time (this point, written in 1996, seems incredibly prescient to me as a reader in 2023 using Claude 2 to distill and summarize my notes from each book for this post).
Connectionist models show global effects can emerge from local interactions rather than centralized control. Connectionist models also show how structured behaviors can emerge in neural networks through exposure to and interactions with the environment, without explicit rules or representations programmed in (which makes me think of statistical learning).
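
As a rough illustration of the points above (not a model taken from the book), here is a minimal sketch of a single artificial neuron trained with the classic perceptron rule. The task, the function names, and the training settings are all assumptions made for this example. The neuron is never told the rule "fire when at least two of three inputs are on"; it only sees examples and nudges its weights whenever it guesses wrong.

```python
from itertools import product


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay off (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(examples, max_epochs=200):
    """Perceptron rule: adjust the weights only when a prediction is wrong."""
    weights, bias = [0, 0, 0], 0  # start with no built-in knowledge
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Local update: each weight changes based only on its own input.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:  # a full pass with no errors: the pattern is learned
            return weights, bias, epoch
    return weights, bias, max_epochs


# Hypothetical toy data: all eight on/off combinations of three inputs,
# labeled 1 when at least two inputs are on.
examples = [(bits, 1 if sum(bits) >= 2 else 0) for bits in product([0, 1], repeat=3)]

weights, bias, epochs = train(examples)
learned = [predict(weights, bias, bits) for bits, _ in examples]
targets = [target for _, target in examples]

print("epochs needed:", epochs)
print("weights:", weights, "bias:", bias)
print("behaves as if it knew the rule:", learned == targets)
```

The rule itself appears nowhere in `train`; the rule-like behavior ends up stored in a handful of numbers shaped by exposure to examples. Real connectionist models are far larger and use different learning procedures, so treat this only as a toy.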

Language Games

The second book, The Language Game: How Improvisation Created Language and Changed the World, by Morten H. Christiansen and Nick Chater, was published last year in 2022, and focuses more on cultural evolution and social transmission of language, arguing that language is akin to a game of charades that is honed and passed on from generation to generation. I happened to check it out from the library and read it concurrently with Rethinking Innateness, and there was some great synergy between the two, especially around challenging the notion that language is innate. Some of the key points of the book:

Language relies on and recruits existing cognitive mechanisms, becoming increasingly specialized through extensive practice and use.
Language evolves culturally to fit the human brain, not the reverse.
Language is shaped for learnability and for coordinating with other learners, not for abstract principles and rules. Children follow paths set by previous generations.
This cultural transmission across generations shapes language to be more learnable through reuse of memorable chunks (“constructions”).
Due to working memory limitations, the more memorable chunks survive, producing design without a designer. These chunks become increasingly standardized over time.
Language input must be processed immediately before it is lost (what the authors call the “Now-or-Never” bottleneck).
Chunking sounds into words and phrases buys more time to process meaning.
Gaining fluency with increasingly larger and more complex constructions of language requires extensive practice.
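
The chunking idea in the list above can be made concrete with a toy sketch. This is not the authors' model; it is one simple way a learner could find chunks from statistics alone, by tracking how reliably one syllable follows another and placing a boundary wherever that reliability drops. The made-up words, the fixed word order, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from collections import Counter

# Three made-up "words", each a fixed sequence of syllables (hypothetical data).
WORDS = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]
# A fixed word order in which every word is followed by more than one other word.
ORDER = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]


def transitional_probabilities(stream):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def chunk(stream, probs, threshold=0.9):
    """Start a new chunk wherever the next syllable is poorly predicted."""
    chunks, current = [], [stream[0]]
    for prev, nxt in zip(stream, stream[1:]):
        if probs[(prev, nxt)] < threshold:
            chunks.append("".join(current))
            current = []
        current.append(nxt)
    chunks.append("".join(current))
    return chunks


# The learner only ever sees an unbroken stream of syllables, with no pauses.
stream = [syllable for i in ORDER for syllable in WORDS[i]]
probs = transitional_probabilities(stream)
chunks = chunk(stream, probs)

print(chunks)
```

Inside a word, each syllable is always followed by the same next syllable, so the estimated probability is 1.0. Across word boundaries the next syllable varies, so the probability falls below the threshold and a boundary is placed there. The stream is rebuilt as a sequence of word-sized chunks without any word list being supplied to the learner.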

Across Connectionism and Charades

Together, these books provide a picture of language as an emergent, complex cultural and statistical phenomenon that has evolved from simple learning mechanisms across generations. Rather than an innate universal grammar baked into children’s brains, language itself has adapted and molded over time to become essential to our human inheritance, as with clothing, pottery, or fire. Language emerges through social human communication and interaction. It becomes increasingly complex, yet also streamlined and standardized, without any explicit rules governing it beyond the constraints of our brains, tongues, and cognition.

This isn’t to say there isn’t something unique about human brain architecture in comparison to our closest animal brethren (there clearly is), but rather that language has adapted symbiotically to that architecture, like a parasite to its host, rather than depending on specific parts of our brain that are genetically pre-determined for language.

Like reading, using language drives increasing specialization of our brain—and this specialization, in turn, drives greater cognitive ability and communicative reach.

There’s a lot here to unpack and synthesize, but I wanted to begin bringing these together, because just as I feel myself pushing against the zeitgeist when I argue that calling learning to read “unnatural” isn’t quite right, so too are arguments that language is not “innate” swimming against the tide. These two counterclaims are interwoven, and I think they are worth exploring further.

Consider this post the first in an exploratory series. We’ll geek out on language development and its similarities and differences to literacy development, maybe dig into the relation of cognition and language and literacy a little, and riff on the implications for AI, ANNs, and LLMs.

#language #literacy #natural #innateness #unnatural #reading #neuralnetworks #research #brains #linguistics #models