Although I’ve recently transitioned from the grueling job market to a comfortable tenure-track position, I still find myself compulsively checking job postings on LinguistList every day. There are, of course, professional reasons for this habit, but the real motivation is more peculiar. I seem to take a strange pleasure in things that should normally cause stress, anxiety, or frustration—like the job search process for computational linguists, which has plenty of all those elements.
If you also follow that well-known linguistics blog, you’re likely familiar with their annual analysis of the linguistics job market. Computational linguistics often shines in these reports, with job openings far exceeding the number of new PhDs. My issue, however, isn’t with the quantity of available jobs but with the nature of the jobs being advertised. The problem isn’t the industry positions; those are straightforward NLP roles requiring skills that you would naturally acquire in a computational linguistics MA/MS or PhD program focused on NLP. Startups may demand more than larger companies like Google or Nuance, but they’re also often more flexible in hiring. Not every computational linguist is interested in NLP, though, and the complications begin once such a linguist turns to job searches within linguistics departments.
Linguists’ Affinity for Numbers
Having observed countless job postings over the past five years, I’ve concluded—without any outrage—that linguistics departments don’t seem particularly interested in computational linguistics as such; they’re more interested in linguistics that involves numbers.
If faced with a choice between someone studying Optimality Theory (OT) through formal language theory and someone applying Stochastic OT with a MaxEnt algorithm to model gradience effects, they’ll choose the latter. If it’s a choice between someone working with Minimalist grammars or Tree Adjoining Grammars and someone doing corpus-based sociolinguistics, again, they’ll pick the latter. The same goes for choosing between someone exploring the learnability of formal language classes and someone presenting a Bayesian model for word segmentation. (These are hypothetical examples.) Exceptions do exist—after all, I landed a job—but they are rare, and a job search that explicitly favors formal computational linguistics over others is virtually unheard of.
Even for those who consider quantitative research essential, this situation is unfortunate. It makes it much harder to assure students that they stand a good chance of finding a job if they delve into computational linguistics. More significantly, it means that linguistics departments—and the field at large—are missing out on a wealth of fascinating work. This is where I’m supposed to ramp up the rhetoric, criticizing the narrow-mindedness of linguists and their resistance to rigorous work. But I prefer to avoid ranting, as my attempts tend to come across more like a clumsy assault than a precise critique. So instead, let’s rationally explore why linguists don’t seem to value theoretical computational linguistics.
The reason is straightforward: theoretical computational linguistics is as approachable as a porcupine on the defensive. This is due to three interconnected factors that create a situation where most linguists can’t even begin to grasp what computational linguistics is really about.
Reason 1: Abstract Nature and Limited Empirical Results
Theoretical work in computational linguistics often deals with “big picture” issues like the generative capacity of language, linguistic formalisms, and their memory requirements or parsing performance. This kind of research is appealing because it highlights broad, theory-independent properties of language, showing how formalisms differ when abstracting away from specific linguistic universals. While this eventually leads to empirical claims, such as predicting certain string patterns in natural language or identifying contexts where linguistic principles might break down, the connection between the theoretical groundwork and these empirical applications is often difficult to discern. Moreover, many computational linguists are not trained linguists, so they may be hesitant to step outside their formal expertise. In contrast, those working in Bayesian or probabilistic models often focus on specific empirical phenomena, using tools that are accessible and produce tangible results. When asked about the significance of their work, they can offer a brief summary and a plethora of publications for reference. A theoretical computational linguist, by comparison, needs much longer to explain their work and often has few accessible papers to share.
Reason 2: Complexity and Lack of Accessible Resources
Computational linguistics is not easy to approach for beginners. Before even starting, one must understand mathematical notation and proof strategies, and that’s just the baseline. The necessary math and computer science concepts—like formal language theory, complexity theory, abstract algebra, mathematical logic, and others—are typically unfamiliar and challenging. This isn’t the kind of math most people encounter in high school or undergraduate courses. As a result, many might prefer to focus on more accessible areas like Bayesian statistics, which require only basic arithmetic to get started.
Furthermore, most of this math must be self-taught, as few linguistics departments offer relevant courses, and the ones available in math or CS departments often don’t align with what computational linguists need. Good textbooks are also scarce, so students often have to piece together their own learning resources from various materials—an unappealing task. It’s no wonder that many choose instead to stick to more approachable topics, or simply to relax with some entertainment.
Reason 3: Lack of Visibility and Support
Computational linguistics is already a niche area within linguistics, and its theoretical aspect is even more so. Formal papers are rare in mainstream journals, and presentations on such topics are uncommon at major conferences. Without enough visibility, the field struggles to gain recognition, leading to fewer job opportunities, publications, and students, which further reduces visibility—a classic vicious circle.
Progress and Hope
The above paints a rather bleak picture of computational linguistics: a field with a steep learning curve, minimal rewards, and limited recognition. However, the situation is not entirely grim. People do land good jobs, although the path may be less straightforward than in other areas like syntax or phonology. As a theoretical computational linguist aiming for a position in a linguistics department, you’ll need to carve out your niche, develop your own research program, and figure out how to market it in a field where many lack even basic knowledge of computational linguistics.
This situation may be akin to what generativists faced in the 1950s and 1960s, when linguistics departments were dominated by descriptivists who were unfamiliar with or even hostile toward Transformational grammar. Yet Transformational grammar succeeded by offering a new and insightful perspective. Similarly, theoretical computational linguistics is making strides, with significant advances in learnability, grammar formalisms, and other areas, suggesting a bright future ahead. On a sociological level, there are still challenges, but progress is being made.
The issues mentioned above—complexity, lack of resources, and limited visibility—can be addressed by producing more work that solves empirical problems using computational tools that are easier to understand intuitively. Thanks to the efforts of scholars like Robert Berwick, Aravind Joshi, Paul Smolensky, Ed Stabler, Mark Steedman, and their students, such work has become more common. For instance, Jeff Heinz and Jim Rogers have done impressive work in phonology that is relatively easy to grasp, and Tim Hunter has used Minimalist grammars (MGs) to unify various syntactic phenomena. There are also practical applications, like using the Stabler parser for MGs to model and predict processing difficulties. I, too, have shifted from abstract constraints to applying these results to empirical issues like binding and island constraints. These topics can be taught in advanced graduate courses without requiring extensive formal training, something I am currently doing in my computational seminar to engage students and encourage them to explore more challenging material.
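To give a flavor of how approachable this line of work can be, consider the kind of strictly local phonotactic grammar studied in subregular phonology: a word is well-formed if and only if it contains no forbidden pair of adjacent symbols. The sketch below is purely illustrative, not drawn from any particular paper; the toy constraint (no word-final voiced obstruent, roughly as in German final devoicing) and the symbol inventory are my own simplifying assumptions.

```python
def bigrams(word, edge="#"):
    """Return the set of adjacent symbol pairs in a word,
    with edge markers so word boundaries are visible to the grammar."""
    padded = edge + word + edge
    return {(padded[i], padded[i + 1]) for i in range(len(padded) - 1)}

def is_well_formed(word, banned):
    """A strictly 2-local grammar: a word is licit iff none of its
    bigrams appears in the banned set."""
    return bigrams(word).isdisjoint(banned)

# Toy constraint: ban voiced obstruents immediately before the right
# word edge (a crude stand-in for final devoicing).
BANNED = {(c, "#") for c in "bdgvz"}

print(is_well_formed("hund", BANNED))  # False: ends in voiced 'd'
print(is_well_formed("hunt", BANNED))  # True
```

The point of the formal work is that a large share of attested phonotactic patterns can be stated in exactly this restrictive format, which is what makes the approach both mathematically clean and easy to demonstrate in a classroom.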
While it would be beneficial to have textbooks or online courses for students in departments without a computational linguist, these resources take time to develop. I’m working on some of these, but for now, I’ll continue to share updates on interesting computational work. Perhaps, in time, this kind of work will carry some weight when departments search for a computational linguist.