A Century Late on Proto-Finnic sibilants

There are broadly two commonly seen ways of thinking about progress in science. The first is the “naive” Science Marches On narrative where we have an ever-increasing aggregation of solid Results; the archetype is mathematics, where results indeed stay around once they have been established, but a good part of the natural sciences today follow this as their main narrative as well (not without reason, I feel). The second is the Kuhnian succession-of-paradigms narrative where most of the time scientists can go around aggregating results, but every once in a while some basic assumption is declared to have been wrong, quite a lot of stuff ends up discarded and work is started over. Hence even the results we continue to accept should still not be thought of as unchanging truths, but rather as temporary, provisional even. The archetypes for this seem to come from the humanities, where theories of how to understand even the main forces of history or literature or psychology still seem to be in quite a bit of flux and views are often split between battling schools.

In historical linguistics, as really in most even vaguely empirical sciences, we clearly have aspects of both around. Etymology and reconstruction generally turn up ever more results as time passes, though some individual results occasionally turn out to have been built on sand. We are lucky to have avoided drastic paradigm shifts though: there clearly do not exist any examples of things like language families that were first set up in detail and later abandoned entirely. [1]

These two attitudes also have similarities, not just differences. Above all, both are forward-looking: they hold that science is something that continues to be done and will have something new to say ten, a hundred, probably a thousand years from now (no matter if built on top of or beside the things it says today). Yet another alternative exists as well though, the “golden age” narrative, according to which knowledge is not created (anymore?): it is or has been already out there, and what we can accomplish amounts to either preserving or rediscovering it. “Nothing new under the sun” & its restatements in various forms (probably this sentiment is itself ages-old too).

In fields like Uralistics, with a “long-and-thin” history, occasionally this also rings true. To quote here my colleague Niklas Metsäranta in the foreword of his recent PhD thesis (English translation mine):

“The best aspects of etymological research are doubtlessly those fleeting moments when, while reading dictionaries, the stars align and one notices, or at least thinks one has noticed, a new connection between words that no one has noticed before. Occasionally the initial buzz turns to disappointment though, when upon more careful browsing of etymological references one realizes that one has not found anything new, but has only dusted off an old comparison advanced already by E. N. Setälä or Yrjö Wichmann.” [2]

Metsäranta’s work in Mari and Permic etymology indeed has plenty of precursors and predecessors in the late 19th and early 20th centuries. Most progress in Uralic etymology in the second half of the 20th century has not come from extending the corpus of comparisons, but rather from trimming it down, trying to find which parts of it are actually reliable and which of them might have other, better explanations, e.g. as Indo-European loanwords. This issue has been particularly obvious during the work that led to my recent paper on a sound change *i > *i̮ in Permic, which consists almost entirely of the rehabilitation of old etymological comparisons, most of them rejected later on for one reason or another (but generally without any detailed critique). Only time will tell if this idea will lead to any all-new etymologies, too. Probably yes, however, if the numerous 21st-century works that again also seek entirely novel Uralic etymologies are anything to go by; after all, I already cite one applicable new etymology from a preprint by Aikio.

The early pioneers of Uralic of course did not just work on etymology. The development of general Uralic historical phonology also shows a similar broad outline: a “brainstorming” phase pre-WW1 eventually turning into a “consolidation” phase post-WW2. Here the situation also seems much more precarious in the details, really. There are several major etymological dictionaries out there by now, all household names to the historical Uralicist (FUV, SKES, KESK, DEWOS, TESz, UEW, YSS, SSA…). Most early etymologies worth consideration have been caught by at least one of them, even if they do not necessarily conclude in their favor. [3] By contrast, studies of historical phonology have remained more data-driven / less literature-driven. Small details can often be re-derived as needed as long as their underlying etymologies remain known, sidelining credit to their first discoverers; or they may end up forgotten entirely.

Getting finally to the topic of my post’s title: about five years ago I sketched an observation about a distinct reflex of Proto-Finnic *c in Karelian. This is quite noteworthy in that *c is the first new phoneme to be added to Setälä’s 1890s reconstruction of Proto-Finnic that seems likely to stick, first properly consolidated as recently as 2007 by Kallio. Here we would then have evidence that this has been retained not only in the previously marginal South Estonian (its proper importance to Finnic reconstruction was not realized before the 70s at the earliest) but also in the long-researched Karelian. There is quite a bit of noise in the Karelian data though, e.g. due to secondary affective affrication and some evident dialect mixing in the complex reflexes of *s. I wouldn’t blame earlier generations for not catching this idea.

But caught it has been. Earlier this year I noticed that yet another old journal relevant to Uralic studies is available online by now: Le Monde Oriental, published in Uppsala from 1906 to 1947, still turning up regularly in bibliographies thanks to several contributions from K. B. Wiklund. The early issues are by now in the public domain and can be found at least in part in the archive.org collections. I usually follow up these kinds of finds by taking a brief look over the contents of the back issues in general. Vol. 6 from 1912 turned out to contain an article from one N. Moosberg (not a previously familiar name to me at all), “Om utvecklingen af samfinskt s i den ryskkarelska dialekten i Vuonninen”. This contains pretty much exactly my observation, just more than a century earlier: while North Karelian (in his article: just from the village of Vuonninen in the parish of Vuokkiniemi) reflects *s as /š/ by default, it also maintains instances of /s/ that cannot be explained by any regular secondary conditioning factors. In particular, this holds for the assibilated reflex of *t before *i, where we today reconstruct *c per the South Estonian evidence. Moosberg too concludes that the result of this assibilation must have been a consonant distinct from plain *s. I don’t know, however, what to make of his suggestion of a “probably more spirantic sound” (“troligen mera spirantiskt ljud”): should this be read as suggesting something like a nonsibilant *θ?

Moosberg’s primary data also behaves more cleanly than what I was able to scrape together. In particular he finds *c > /s/ just fine also in kaksi ‘2’, kuusi ‘6’, kyⁿsi ‘nail’, uusi ‘new’, varsi ‘shaft’. Several preterite stems like kokosi ‘collected’, läksi ‘left’, löysi ‘found’, makasi ‘lay (down)’, tuⁿsi ‘felt’ are also adduced, a topic that I did not look into at all; some of these are even confirmed by the KKS data. My own preliminary suggestion that (*uc, *rc >) *us₂, *rs₂ > *us₁, *rs₁ (> , ) could of course still hold for some other varieties upon closer investigation, but I am now less confident of it.

The other typical position where we (at Helsinki at least) now reconstruct PF *c is the cluster *cr. Moosberg notes a reflex /s/ in these as well, but he follows E. N. Setälä’s influential reconstruction with *str and is unable to treat this as the exact same sound change, instead assuming a distinct cluster change *str > *s₂r. Yet these cases, too, had an affricate reconstruction advanced for them early on. This, I believe, was first proposed by Frans Äimä in a 1921 article in Virittäjä, as I found out already somewhat earlier, in the spring of 2018, soon after my previous blog post. Äimä in fact refers to an outright palatalized pronunciation with [źr] or [śr] from the dialects of Rugajärvi, Jyvöälahti and partly Tver. This has not been recorded in the macrophonemic transcription of Karjalan Kielen Sanakirja, but luckily, scans of the original field records are already partly available too, and they do show this unexpected palatalization: Rj. aźrain, keźrä, Pistojärvi aśroan, Tver ḱeźŕä, ildaḱeźro (the latter still there also in 1958) and even Vuokkiniemi keśrä (1956). Äimä also builds here on a suggestion made slightly earlier by Ojansuu (Karjala-aunuksen äännehistoria, 1918 [4]) to reconstruct *st > *ts > *ćć just for Karelian, but takes a step further and proposes a very modern-looking reconstruction *tsr already for Proto-Finnic. Unfortunately, it appears that no one has before now brought their proposal(s) together with Moosberg’s. The nascent discussion on what exactly to reconstruct behind the correspondence NKrl sr ~ SKrl, Ludian–Veps zr ~ Western Finnish hr ~ EFi and southern Finnic *tr simply seems to have been dropped post-WW2, with overviews defaulting to Setälä’s *str almost up to the present day. Even the current reconstruction with *cr is still not really prominent, having been proposed, again by Petri Kallio, merely in a lengthy footnote #9 of his 2012 article “The Prehistoric Germanic Loanword Strata in Finnic”. A bit more visibility seems to be warranted here, and I would propose introducing the name “Moosberg’s law” for the North Karelian retention of /s/ from *s₂ < *c.

These finds taken together do not amount to merely rediscovering lost earlier wisdom, but the flavor is certainly there, and it’s hard not to wonder what other small but potentially crucial notes on Uralic historical phonology might be already out there, theoretically available to the reader but not signposted by any modern back-references. [5] With this in mind I have, in fact, thought about starting work on a Uralic analogue of N. E. Collinge’s 1985 monograph The Laws of Indo-European, or maybe first some more limited analogue similar to e.g. Nathan W. Hill’s 2011 paper “An Inventory of Tibetan Sound Laws”.

[1] The closest is maybe the defunct Ural-Altaic hypothesis, with its succession in the Altaic wars on the one hand (restricting the family by the exclusion of Uralic and perhaps other parts) and the Nostratic hypothesis on the other (widening it by the inclusion of e.g. Indo-European, Kartvelian and Yukaghir). All early defenses of Ural-Altaic are however obviously sketchy and often admit as much. There are no systematic reconstructions of grammar or phonology or lexicon, only take-it-or-leave-it collections of parallels, many of them by now reinterpretable as areal or typological rather than genealogical; hence these have not, strictly speaking, been abandoned as such.
[2] “Etymologisen tutkimustyön parhaimpia puolia ovat epäilemättä ne ohikiitävät hetket, kun sanakirjoja lukiessaan tähdet asettuvat linjaan, ja sitä huomaa löytäneensä tai ainakin luulee löytäneensä yhteyden sanojen väliltä, jota kukaan muu ei ole ennen huomannut. Välillä ensihuuma muuttuu pettymykseksi, kun tarkemmin etymologisia sanakirjoja selailtuaan tajuaa, ettei todellisuudessa olekaan löytänyt mitään uutta, vaan on tomuttanut esiin vain jonkin vanhan pölyisen jo E. N. Setälän tai Yrjö Wichmannin esittämän rinnastuksen.”
– Seconded on the buzz as well, which you might get a glimpse of from my previous post.
[3] The biggest remaining gaps are probably among words not found in the four “key languages” to have been covered by dedicated etymological dictionaries already in the 20th century (i.e. Hungarian, Finnish, Komi and Khanty). Newer etymological or etymologically-minded comparative dictionaries exist also for Estonian, Mordvinic, Mari and Selkup at least, but these do not pay much attention to early literature.
[4] Earlier in the shorter overview “Karjalan äänneoppi” (1905; p. 30) Ojansuu still follows Setälä in positing one-step *str > *sr.
[5] A search in my digital literature collection indeed turns up zero references to this article of Moosberg’s, only a handful of mentions of his other work on Ume Sami.
