Details of some vulpine words in Uralic

A recent open access paper by half a dozen Leiden Indo-Europeanists, Palmér, Jakob, Thorsø, van Sluis, Swanenvleugel & Kroonen, “Proto-Indo-European ‘fox’ and the reconstruction of an athematic *ḱ-stem”, presents a very thorough analysis of various core IE words for medium-sized carnivores (h/t Languagehat). The main conclusion is that these constitute two etyma rather than just one: *h₂lop-eḱ- ‘fox’ ≠ *wl̥p-i- ‘wildcat’ (surely not **ulp-i-?), even though some reflexes of the latter do end up with the meaning ‘fox’, namely Latin vulpēs and Albanian dhelpër. The latter has been included here thru a dissimilation *v > dh / _V(C)p (another entry on the already lengthy list of Weird-Ass Albanian Sound Changes™, but the other mentioned examples dhampir ‘vampire’ and dialectal dhespër ‘evening’ do look watertight to me).

The paper also includes a lengthy digression on loanword reflexes of the former etymon in Uralic. Despite the unusually-large-for-linguistics author team, however, none of the writers seem to be Uralic specialists. They have had some good help on this at least: Petri Kallio is thanked for consultation, and Sampsa Holopainen’s 2019 thesis treatment of these loanwords is also referred to repeatedly. I would still add a few details to the account of the Uralic data though, as they seem to illustrate several novel or less-known phenomena in phonology and morphology.

1. Finnic

Palmér et al. start their discussion of Finnic by asserting a back-harmonic Proto-Finnic **rpoi behind North Finnic *repoi. I however do not see any grounds for this. Second-syllable *o was neutral with respect to vowel harmony in PF; key data for this phonological interpretation comes from two corners of the southern part of the Finnic language area, where we still find even an explicitly disharmonic vowel combination ä–o in languages that otherwise follow vowel harmony. The first is Votic, showing e.g. tšäko ‘cuckoo’ (< *käkoi), pääsko ‘swallow’ (< *pääskoi), sälko ~ śalko ‘foal’. Note that these also cannot be explained as later loanwords, since their cognates in North Finnic do end up re-asserting harmony (Fi. käkö, Ing. käkö(i), Krl. Lud. kägöi; Ing. pääsköi, Livvi piäsköi ~ piätšköi; Fi. sälkö, Krl. proper šälkö ~ šäľgö). Secondly this vowel combination has been retained also in South Estonian. Besides pääsokõnõ ‘swallow’ (no reflexes of *käkoi, *sälko), cf. at least näio ‘maiden’ from PF *näito(i) (> core Finnic *neito(i) > Fi. Vt. neito, Ing. neitoi, Krl. Lud. ńeidoi, Veps ńeidō) and räbo ‘junk’ ~ Est. räbu; also Fi. räp-eä, Krl. räp-äkkä, Veps räb-ed ‘brittle’ (different derivatives but affirming original *ä). Vt. repo and also SE rebo ‘fox’, neglected in the paper, can be therefore taken to directly continue PF disharmonic *repoi.
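For readers who like to see such things mechanically, here is a minimal sketch of the contrast at stake. It is only an illustration under stated assumptions: the vowel classes are a simplified Finnish-type system (e, i and õ simply treated as neutral), and the function name harmony_class and its flag neutral_noninitial_o are hypothetical, invented for this example. With the flag off, ä–o words count as disharmonic; with it on, any o after the first vowel is ignored, which models the claim that second-syllable *o was harmony-neutral.

```python
import unicodedata

# Simplified, assumed vowel classes (Finnish-type harmony; not a full
# description of any one Finnic language).
FRONT = set("äöüy")
BACK = set("aou")
NEUTRAL = set("eiõ")  # assumption: treated as harmony-neutral here


def harmony_class(word, neutral_noninitial_o=False):
    """Classify a word as 'front', 'back', 'neutral' or 'disharmonic'.

    If neutral_noninitial_o is True, every 'o' after the first vowel
    letter is skipped, i.e. treated as neutral.
    """
    word = unicodedata.normalize("NFC", word.lower())
    classes = set()
    seen_vowel = False
    for ch in word:
        if ch not in FRONT | BACK | NEUTRAL:
            continue  # consonants and other symbols play no role
        skip_o = neutral_noninitial_o and seen_vowel and ch == "o"
        seen_vowel = True
        if skip_o:
            continue
        if ch in FRONT:
            classes.add("front")
        elif ch in BACK:
            classes.add("back")
    if classes == {"front", "back"}:
        return "disharmonic"
    if classes:
        return classes.pop()
    return "neutral"


if __name__ == "__main__":
    # Votic and South Estonian forms vs. their North Finnic cognates
    for w in ["tšäko", "pääsko", "näio", "käkö", "sälkö"]:
        print(w, harmony_class(w), harmony_class(w, neutral_noninitial_o=True))
```

Under the strict setting the Votic and South Estonian forms come out as disharmonic while Fi. käkö, sälkö are plain front-harmonic; under the neutral-o setting all five are simply front-vocalic, which is the reading argued for above.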

The “clipped” derivation *rebäs → *repoi is certainly unproblematic: this is very typical for *oi-diminutives in Finnic, already found among the oldest examples such as *jänis ‘hare’ → *jänoi > NF *jänöi ‘bunny’, *kaunis ‘beautiful’ → *Kaunoi ‘name of a cow’, and perhaps (the semantics seem off) *talas ‘platform, shed’ → *taloi ‘house’. In later, more localized examples we find all sorts of stem-final or even root material dropping off, like Ingrian hanoi ‘goose’ ← han[hi], Ludian ohtoi ‘thistle’ ← oht[ikaz], South Ostrobothnian Fi. Torstoo ‘name of a cow born on Thursday’ ← torst[ai] ‘Thursday’. [1] There is also some minor evidence of stem-final *-(a)s : *-aha- being reanalyzed as a suffix eventually at least, since we find it sometimes secondarily attached to native stems, e.g. Fi. lippa ‘overhang, visor, etc.’ → lipas : lippaa- ‘chest’. This leaves some space for an analysis similar to Hungarian (cf. below).

Other stem variants present two other problems, which to me appear to largely cancel each other out however. For one, while scarcely attested PF *rebäs could indeed regularly continue earlier *rebäś < *repäć(ə), to me this would not seem to predict an inflectional stem **repäh(e)-: there is no positive evidence that the early lenition *-s- > *-h- between unstressed syllables applied to secondary *s from palatalized *ś < *ć. I believe an explicit counterexample is at least the North Finnic conditional mood marker -isi-, which I would derive from pre-PF *-j-śə- < *-j-ćə- (*-j- from the imperfect stem); not from a suffix *-ŋćə- with an original nasal (the Samic potential mood marker *-ńće̮- I would consider to get its nasal from the PU potential mood marker *-nə-). For two, the authors note that forms like Estonian rebane could continue a diminutive *repäh-inen, but that Veps rebāńe does not quite support this, pointing instead to PF *repäinen. This is not a problem though if the PF paradigm of *rebäs originally did not have forms with *-h-! I would instead consider an earlier West Uralic *repäć(ə) first giving *repäś : *repäśə-, evolving by late Proto-Finnic into a paradigm *rebäs : *repäise-, with *-i- by palatal unpacking. The latter would then have been readily interpretable as the oblique stem of a diminutive *repäinen, motivated also by the fact that by far most bisyllabic nouns ending in *-s had either an oblique stem in *-hE- (if from pre-PF simple *s) or *-ksE- (if with the PU noun-deriving suffix *-ksə).

A similar reshuffling of an unalternating *s-stem into two different paradigms seems to have taken place also in the other example where we can clearly reconstruct a noun ending in pre-PF *-Ać(ə). This is the word for ‘male pig’: Fi. oras ~ orainen ~ oraisa, Krl. orattšu, Veps oraž(a-) ~ oratš(u-). These have their origin in West Uralic *worać(ə) ← Indo-Iranian *warādźa- (cf. Holopainen 2019: 313–314); whence also Moksha /urəś/, dim. /urəź-i/ (with voicing alternation pointing to a pre-Mo. consonant stem *oraś : *oraśə-). From this it seems to me that “reconstructing forwards” would yield PF *oras : *oraise̮-; the first form of these then later gaining an analogous inflected stem *oraha-, the second an analogous *orainen. This last-mentioned form would have been further folk-etymologically interpretable as a derivative of ora ‘awl’, leading to the creation of two further variants ora-isa, ora-ttšu.

Tangentially, I think this mechanism also explains the two different shapes of the word for ‘crow’ in Finnic: Fi. Ing. Krl. varis (: varikse-), Lud. Veps variž, Livonian vaŗīkš ending in *-is, versus Est. and SW Fi. vares (: Est. varese-, Fi. varekse-), Vt. varõz (: varõ(h)sõ-), SE varõs (: varõ(s)sõ-) ending in *-e̮s. While words for ‘crow’ display a wide variety of different suffixes across Uralic altogether (e.g. Erzya /varaka/, Hungarian varjú, Southern Khanty /wărŋaj/ < pseudo-PU ? *wara-kka, *warV-ja, *warV-N-woj), evidence for a suffix with *-ć- can be found in both Samic (*vōre̮ć) and Mordvinic (? *varśəŋ > Er. /varćej ~ varśej ~ varkśij/, Mk. /varśi ~ varći/). It would seem to be possible to reconstruct already a common West Uralic *warə-ć(ə). From this I would expect to see in PF a paradigm *vare̮s : *varise̮-, again with clean depalatalization syllable-finally vs. palatal cheshirization medially. The stem was then maybe reworked to *varikse̮- already early on; there seems to be no evidence for a reanalysis as a diminutive **varinen (maybe avoided due to the crow being a relatively large bird).

2. Samic

The Samic reflexes do not receive a separate discussion in the article. The main question raised is if a suggested Proto-Samic *reapēš should be considered a recent loanword, and if so, where from.

At least the suggestion of *š being a substitute of North Karelian š in a lost **reväš seems anachronistic to me. PS dates to ca. 2500 BP, the shift of *s > š in NKrl. to ca. 1000 BP at the earliest, if taking place right around the split-up of Old Karelian. The distribution of this variant of the word in Samic (South thru North, with no Eastern Samic reflexes) does not match a Karelian origin either, whether old or more recent. Examples that Palmér et al. bring up of the type PS *še̮lmē ‘eye of an axe’ ~ PF *silmä ‘eye’ (~ inherited PS *če̮lmē ‘eye’) or PS *še̮ltē ~ silta < PF *cilta ‘bridge’ (← Baltic), where PS *š seems to continue a Finnic *s, probably represent mostly an allophonic palatalized realization of Proto-Finnic *s as [sʲ] when adjacent to *i. To me the simplest loan source would therefore seem to be the Finnic inflected stem *repäise- (whether or not it already had *repäinen as its nominative). The suffix *-ise- indeed later regains phonemic palatalization in quite many Finnic varieties, already so in Karelian and Eastern Finnish. This interpretation also accounts for the retention of *-p-, as in a hypothetical late loan from an unattested NKrl. **reväš we’d probably expect reflexes like Lule Sami **rievij rather than the attested riebij.

3. Permic

Following Holopainen, Palmér et al. consider Permic *rući̮ an independent back-vocalic loan. As a preface before continuing: I write the vowels here as they would be in the classic reconstruction of Itkonen and Lytkin; the paper instead follows Zhivlov’s most recent sketch of Proto-Permic reconstruction in reconstructing *roću̇, of which, suffice to say, I am not especially convinced. I do not wish to get bogged down in details of PP vowel reconstruction schemes here though, as I agree with the point that Komi /u/ would regularly reflect a PU non-open back vowel *a/*e̮/*o and not front *e. This would be itself a sufficient reason to not derive PP *rući̮ from the preform *repäć(ə) indicated by Finnic, Mordvinic and Mari.

The authors however also advance the claim that medial *-ć- should have been voiced and that therefore a preform with a geminate is required, along the lines of *rApaćća. I believe this is an overreach. An underappreciated fact of Permic historical phonology is that word-medial lenition only fully applies post-tonically! The best-known examples of the development later in a word are the possessive suffixes: cf. Komi-Permyak 2PS /-ɨt/, 3PS /-ɨs/ << PU *-(n)tə, *-(n)sa, and the ordinal suffix: KP /-ət/, Udmurt /-et/ << PU *-mtə, which remain voiceless (with secondary voicing of *t in Zyrian Komi /-ɨd/, /-əd/). The possessive suffixes do end up as /-ɨd/, /-ɨz/ in Ud., possibly originating e.g. as positional variants after secondary stress; but in any case note that despite voicing, we do not find this feeding into further lenition *-d- > *-ð- > ∅ as is the fate of root-medial *-t-. Some derivational suffixes show this same development too, most clearly the adjectival suffix /-ɨt/ << PU *-ətA, as in examples like Ud. /peľmɨt/ << PU *piďm-ətä >> Fi. pimeä ‘dark’; perhaps also the adjectival suffix Ud. /-eś/, K. /-e̮ś/ (from PU *-ća?). There also appear to be examples among the few trisyllabic word roots that can be reconstructed for PU, such as K. /rɨnɨš/ < PP *ri̮ŋi̮š ‘threshing ground’ < PU *riŋəšə, PP *ľaŋes ‘birch bark vessel’ < PU *ďäŋäsə. [2]

Lack of voicing of the affricate in *rući̮ is therefore no problem even if going back to something like *rApaća, borrowed already roughly from Proto-Indo-Iranian. We do need to date it as younger than the deaffrication *ć > *ś that is represented in oldest II loans like late common Uralic *ćarwə > *śarwə >> PP *śur ‘horn’, though. This “new” *ć that survives into modern Permic probably also should be able to continue not just a PII *ć but also a slightly later Proto-Iranian depalatalized *c. Permic has never had a native dental affricate, and even some early Russian loans into Komi end up substituting ц as /ć/ (IIRC including in nonpalatalized positions, but I don’t have a list of these readily around).

4. Hungarian

In Hungarian, ravasz ‘cunning’ (OHu. ‘fox’) and róka ‘fox’ represent additional clearly independent loanwords. Following Holopainen, who in turn follows early less assertive suggestions by Sköld and Joki, we can easily agree that at least the former is likely to come from later Alanic, instead of arising by any kind of ad hoc backing development from *repäć(ə).

I would indeed also rule out an early loan with PU *s. The example of fészek ‘nest’ < PU *pesä is not really itself well-explained enough to make a precedent for retention of *s as sz /s/. The only real suggestion that has been advanced for this is a somewhat ad hoc blocking of *s > *h before a word-initial fricative f-, which is itself not clear without knowing how early *p- > f- is exactly, nor does it strike me as clear if voiced -v- or slightly earlier *-β- could be assumed to have had the same effect as voiceless *f-. There is one seemingly exact parallel to this dissimilation, fasz ‘penis’ ← PII *pásas (a loan etymology re-defended by Holopainen, 185–186); but this has an apparent Samic cognate *pōče̮, pointing to PU *ć and not *s, which IMO leaves also the loan etymology uncertain. For this word I would actually not even entirely rule out the suggestion of Rédei, who in one of his last papers [3] suggested relatively recent loaning from an archaic but unattested Old High German reflex *fas; which is certainly at a disadvantage though since only a derived reflex, in OHG fasal ‘offspring’ (> modern German Fasel) seems to be actually attested in Germanic, and with not much trace of the meaning ‘penis’.

(I have also wondered if all this is maybe barking up the wrong phoneme and fészek should not be segmented as fész-ek, but rather fé-szek; where the second component could then perhaps represent a reduced reflex of szék ‘chair, seat’, cf. in Indo-European nest << *ni-sd-os ≈ ‘down-seat’. However this is not quite matched by the oblique stem fészke-, demonstrating that also the nominative singular continues earlier *fészk < *fēskĭ. Dialect forms such as fécek with an affricate might be an additional problem, though really equally also for any proposal that sz < PU *s.)

Back to foxes though: for modern róka it is indeed easy to analyze -ka as a diminutive suffix added to an earlier *raw-. This would on first look seem to represent similar “clipped” derivation as Finnic *rep-oi. While this is not the typical application of -ka in Hungarian, there are still examples, say Jóska ← József, this usage perhaps motivated by the homographic and “homophonological” (even if not exactly homophonic) Slavic diminutive -ka. But I do like Palmér et al.’s proposal via reanalysis: ravasz would have been analyzable as containing the rareish suffix -asz and would have allowed *raw-ka to be formed by suffix alternation instead. [4] If I’m not mistaken, most examples of -asz and also the front variant -esz are nouns though (and the phonologically closest match is maybe tavasz ‘spring’), so dating this change specifically after the shift ‘fox’ > ‘cunning’ in ravasz does not strike me as necessary at all. For that matter, this might be also too late for root-medial *aw > ó to be operative even analogically anymore, since the sense ‘fox’ is still attested for ravasz as late as 1403, ‘cunning’ only from about there on out.

Postscript: ‘Wildcat’ in Uralic (?)

After finding the Indo-European ‘fox’ borrowed, thru Indo-Iranian, directly or indirectly into half a dozen Uralic branches (including also relatively straightforward reflexes in Mordvinic and Mari that I don’t comment on specifically here), it is interesting to note that probably also *wl̥pi- ‘wildcat’ seems to have made the leap. These are the Samic and Finnic words for ‘lynx’: PS *e̮lpe̮s (narrowly distributed: North albbas, Lule albas) ~ PF *ilbes (pan-Finnic: Es. SE. Fi. Krl. ilves, Vt. ilvez, Lud. Veps ilbez, Liv. īlbõks). This time, retained *l and front-vocalism seem to point towards Baltic (Lith. vilpišys) and not Indo-Iranian. Only the loss of *w- would readily create a problem.

To my knowledge the comparison of these with Indo-European remains unpublished, but I’ve heard it from a couple of colleagues (for the time being please do not cite me on this). Its first public presentation might have been by Mikko Heikkilä at the 2017 conference Contextualizing historical lexicology — narrowly missed by Kroonen, who was scheduled to participate but IIRC had to cancel entirely. I’m not sure if his proposed routing thru an additional Uralic substrate in the northwest is at all necessary though. If the word was originally loaned as *wülpəs/š- or the like, the Samic word would reflect this entirely natively (*wü- > *ü- feeding into *ü > *i > *e̮ is known also in *wülä- > *e̮lē- ‘up, above’ — possibly the only word in Samic that retains a trace of PU *i/*ü contrast). In light of the apparently rare suffix -iš- in Lithuanian, final *-s maybe more likely continues earlier *-(k?)š, which again regularly gives Samic *-s, but would be expected to give **-h in Finnic. (One of Palmér et al.’s two other examples of this suffix is takišys ‘weir’, whose preform has also been borrowed into Finnic as *tokəš > *toge̮h > Fi. Ing. toe (: tokee-), Vt. tõgõ, Es. tõke ~ tõge, Liv. to’ggõd; or, since apparently there’s no good IE or even Balto-Slavic etymology, is it perhaps a loan from (pre-)Finnic into Baltic instead? [5])

A Samic loan already into Proto-Finnic would be unexpected though. All known words of Samic origin in Estonian have made it there late thru the mediation of Finnish, and I don’t think any are known in Livonian at all. The same is the case for “Language X” of some supposed non-Samic and non-Finnic hydronyms however, which (by the current evidence) seems to have arrived in Karelia / inner Finland / Sápmi via a northwestern route, not thru the Baltic. It’s also not clear to me why Finnic wouldn’t have simply borrowed the word itself from Baltic straight away, when it’s commonly thought that even Baltic loans in Samic were mostly mediated by early Finnic.

Word-initial *wi- does in general survive in Finnic, e.g. *viici ‘five’, *viimä(-) ‘end’, *viska- ‘to throw’, which at first seems to weigh against direct derivation from IE. However I wonder if there could simply have been a conditional loss here. Proto-Uralic is known to have lacked word roots of the shape *PV(C)PV with two bilabials *m, *p in consecutive onsets; and also *w…m seems to lack any good examples for it. (Note that PF *viimä is a derivative with a contracted long vowel < *wiŋə-mä; ditto for e.g. *vaima ‘heart’ < *wajŋ(ə)-ma.) Perhaps by early Finnic, this constraint was then further extended to *w…p. This sequence still occurs natively in PU *woppə- ‘to observe’, but then *wo- simplifies to *o- in Finnic anyway, indeed also Samic, Mordvinic and in most cases Mari. Thus it does not seem out of the question to me that we have simply an early Baltic *wilpi(k)ši- borrowed into pre-Finnic as *ilpəksə. It would be also possible to then treat the Northern and Lule Sami words as (earlyish?) loans from Finnic rather than archaisms dating already to pre-Proto-Samic times. But this remains a hypothesis that could still use parallels, especially since there are also some Baltic loans into Finnic that do retain *w…p or similar pairings, e.g. *virpi ‘branch, rod’.

[1] For loads more examples, see e.g. Rapola, Martti (1920), “Kantasuomalaiset pääpainottomain tavujen i-loppuiset diftongit suomen murteissa“.
[2] I would suspect that this general point has been made before, but offhand I can only find partial statements, e.g. the classic Uotila, T. E. (1933), Zur Geschichte des Konsonantismus in den permischen Sprachen only really discusses the case of PU *t (p. 92 on).
[3] Rédei, Károly (2005), “Szófejtések 351–358“, Nyelvtudományi Közlemények 102.
[4] Another point that I also could believe to have been made before already, but I’ve not gone digging into the Hungarian literature.
[5] There is one nominally compatible PU root that could be considered as a source for this: *čoka ‘shallow, dry’; weirs are best built in relatively shallow rivers. This is reflected only in Samic and Selkup though, the latter reflex also being *če̮kə- ‘to dry’ rather than expected *čwe̮kə(-), which leaves almost everything in this comparison not especially compelling.
