Tuesday, January 29, 2013

An Old Classic


This is an old post of mine from another blog, slightly edited. 

Cartesian Product Conlangs


This will be familiar to anyone who has taken algebra 101, but since I am writing for an audience consisting of humanities hobbyists, I feel I need to introduce a definition here.

Definition: Cartesian Product

The cartesian product of two sets A = {a1, a2, a3, ... an} and B = {b1, b2, b3, b4, ... bm}, written A x B, is a new set consisting of all pairs {a1b1, a1b2, a1b3, ... a1bm, a2b1, ... anbm}.
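
For readers who prefer code to set notation, here is a minimal sketch of the same definition in Python. The two feature sets (three cases, two numbers) are made up purely for illustration and are not taken from any particular language.

```python
from itertools import product

# Hypothetical feature sets, chosen only for illustration.
cases = ["nom", "gen", "part"]
numbers = ["sg", "pl"]

# A x B: every case paired with every number.
cells = list(product(cases, numbers))

for case, number in cells:
    print(case, number)

print(len(cells))  # 3 * 2 = 6
```

The size of the product is always the product of the sizes of the sets, which is why these paradigms grow so quickly once more categories are multiplied in.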

Now, what does this have to do with conlanging, and why do I complain about such things appearing in conlangs?

Well, one natural place to use this set-theoretical notation is morphology (although it is applicable elsewhere as well). Oftentimes, conlangs follow a formula along these lines: an agglutinating language with, e.g., two to four numbers, four to a dozen cases, etc., and for every number there is every case; more generally, for every pair of accidents, every combination is possible. With verbs, you get voice * tense * aspect * person * number * mood * other thingy * other thingy; with nouns you get case * number * gender * ...


Naturally, this does occur in agglutinating languages, but what annoys me a lot is when people claim that, for instance, Finnish essentially is a cartesian product. In Finnish, you have about 15 cases, two numbers, and possessive suffixes: {3 persons x 2 numbers} + Ø (that is, unmarked). There we already find an exception to the cartesian product style: the plural and singular 3rd person possessive suffixes are conflated, which can be expressed as {{five distinct persons} + Ø}.

But this gives an incomplete and too regular picture of Finnish:

  • Nominative and accusative are fully conflated in the plural; plural nominative morphology is used with personal pronouns (and the wh-pronoun kuka, "who") as a separate accusative case.
  • Non-partitive objects in the singular appear in either the accusative I or II, which in modern Finnish are identical to the nominative and genitive. As mentioned in the previous item, there is only one accusative in the plural
  • The singular nominative, genitive, accusatives, and the plural nominative and accusative are identical whenever they have a possessive suffix stacked on them (so, {cases} x {numbers} x {possessive suffixes} - ({sg} x {nom, acc, gen} x {five distinct persons} + {plural} x {nom, acc} x {five distinct persons}))
  • The comitative lacks distinct singular forms
  • The comitative noun phrase never appears without possessive suffixes (but the adjectives in a noun phrase only take the case desinence -i-ne). Other cases get six forms, the full {[empty], 1sg, 1pl, 2sg, 2pl, 3sg/pl}; comitatives lack [empty] in that set.
Estonian presents another interesting uncartesian thing in its verbs - the negative. For normal indicative present and past verbs, you have personal inflections on the verb, but in the negative, you don't. (Finnish is similar, but Finnish indicates the person on the negating auxiliary, whereas Estonian does not). Estonian doesn't have three persons x two numbers x {positive, negative}, but rather ({three persons} x {two numbers}) + negative.
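
The difference between the two Estonian formulas can be counted out in a few lines. This is only a sketch of the arithmetic, assuming the simplified picture given above (one uninflected negative form); the labels are placeholders, not Estonian morphemes.

```python
from itertools import product

persons = ["1", "2", "3"]
numbers = ["sg", "pl"]

# Full cartesian product: every person/number cell in both polarities.
cartesian = list(product(persons, numbers, ["pos", "neg"]))

# The pattern described above: person/number cells in the positive only,
# plus a single negative form with no personal inflection.
attested = [(p, n, "pos") for p, n in product(persons, numbers)]
attested.append(("-", "-", "neg"))

print(len(cartesian), len(attested))  # 12 vs 7
```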

Finnish also seems to have various complications, in various dialects, where e.g. the plural partitive or the genitive might have several surface forms, which are used slightly differently (e.g. ratkaisuja vs. ratkaisuita, or elukoita, elukkoja) and some nouns have more than one essive form (lapsena, lasna) which can be used in slightly different contexts.

It is not hard to come up with any number of other examples, cf. Russian tenses. Russian has a rather natural verb system with basically three tenses (present, past and future), two aspects (perfective and imperfective, here called perfect and imperfect), three persons, two numbers, three genders. The cartesian product would consist of 3 x 2 x 3 x 2 x 3 forms, which is 3^3*4 = 108.

However, the numbers pattern more like genders, as the genders are not distinguished in the plural (i.e., no cartesian product gender x number; Polish does this even worse, by distinguishing part of one of the genders in the plural and conflating the rest into one; Russian only does this as far as nominal case morphology goes), so let's for the sake of honesty instead simplify this to
{3 genders + plural} x {imp, perf} x {pres, fut, past} x {three persons} = 72 forms.

What Russian in reality has, is:

  • imperfect past marks no person, only gender/plural
  • perfect past marks no person, only gender/plural
  • imperfect future marks person and number, no gender
  • perfect future marks person and number, no gender
  • perfect present doesn't exist at all
  • imperfect present marks person and number, no gender

This gives a total of 1 * 4 + 1 * 4 + 1 * 3 * 2 + 1 * 3 * 2 + 1 * 3 * 2 = 26 distinct forms. Yes, cartesian products do occur everywhere: basically anywhere you see multiplication above, there is a cartesian product. But each such product is limited to its particular context only and does not go on exponentially combining with other cartesian products; the language isn't just one huge cartesian product.
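
The three counts for Russian can be checked mechanically. The sketch below just encodes the category lists and the bullet list above as data; the labels are my own shorthand, and the model is the same simplified one used in this post (no participles, infinitives, imperatives and so on).

```python
from itertools import product

persons = ["1", "2", "3"]
numbers = ["sg", "pl"]
genders = ["m", "f", "n"]
tenses = ["pres", "fut", "past"]
aspects = ["imp", "perf"]

# Naive product of all five categories.
full = len(list(product(tenses, aspects, persons, numbers, genders)))

# Gender and number collapsed into {m, f, n, pl}.
gender_or_plural = genders + ["pl"]
simplified = len(list(product(gender_or_plural, aspects, tenses, persons)))

# What each aspect/tense combination actually marks.
person_number = list(product(persons, numbers))
actual = {
    ("imp", "past"): gender_or_plural,
    ("perf", "past"): gender_or_plural,
    ("imp", "fut"): person_number,
    ("perf", "fut"): person_number,
    ("imp", "pres"): person_number,
    # ("perf", "pres") is missing entirely.
}
total = sum(len(forms) for forms in actual.values())

print(full, simplified, total)  # 108 72 26
```

Each entry in `actual` is still a small cartesian product of its own, but the entries are added together rather than multiplied, which is the whole point.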

Generally, a cartesian product conlang will be more boring because:

  •  it's predictable: you see the list of distinctions and get the full idea of how it adds up, with no fun quirks to think about
  • seeing huge paradigms where all the features are predictable is boring
  •  it does not really permit for much creativity - you're letting a mathematical operator do the creation for you, rather than being the conductor of an orchestra of operators that add and multiply things in various ways
  • reading the grammar of yet another conlang that is just A x B x C x D for verbs and M x N x Y x Z for nouns ... is not that exciting really. Got that already?

Languages often fail to distinguish something somewhere, and this removes entire rows or columns out of a regular table of combinations. E.g. English fails to mark person on past tense verbs (with some exceptions). These failures bring a language to life in a way that this huge regular table doesn't.
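
The English case is small enough to spell out in full. The sketch below fills in the regular table for the verb "walk" (tense x person x number) and then counts how many distinct surface forms actually occur; the function name `walk_form` is just a helper made up for this illustration.

```python
from itertools import product

persons = [1, 2, 3]
numbers = ["sg", "pl"]
tenses = ["pres", "past"]

def walk_form(tense, person, number):
    """Surface form of regular English 'walk' for one cell of the table."""
    if tense == "past":
        return "walked"  # no person or number marking at all
    if person == 3 and number == "sg":
        return "walks"
    return "walk"

cells = list(product(tenses, persons, numbers))
paradigm = {cell: walk_form(*cell) for cell in cells}
distinct = set(paradigm.values())

print(len(cells), len(distinct))  # 12 cells, 3 distinct forms
```

A table with twelve cells and three forms is a lot more interesting to describe than one with twelve cells and twelve forms.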

Another example: in English, some nouns lack singular forms.
Another example: in Russian, in every gender, some cases coincide - the feminine merges dative and locative, the masculine either genitive and accusative or nominative and accusative, the neuter merges accusative and nominative, the plural merges genitive and accusative.
Another example: in Russian, for a few masculines, the locative is split into two different subcases, the -e and the -u case, where the -u case is used with adpositions when the semantics of the situation is concrete.

All of this helps create an impression of realism. Sometimes, the conflations might be rather random - having appeared through sound changes - or reflect the historical grammatical background of a construction or something about the distinctions themselves.

The latter is the case with the lack of one entire column in the Russian tense/aspect combination: present perfect just doesn't make much sense in light of the Russian semantics of tenses and aspects. What is formed using the same tense morpheme as the present imperfect, but with perfect aspect instead, is parsed as future perfect, so there is no need to use the synthetic future that the imperfect calls for.


I will quote one of Tom H Chappel's posts from the ZBB on how to, at least to some extent, avoid 'cartesianness':


  •  Not every root verb needs to have all of the cells in its conjugation filled in. Not every root verb needs to have all the distinct cells in its conjugation be filled in with distinct-sounding surface-forms. Not every root needs to have the same cells of its paradigm filled in as every other root.
  •  Don't fill in a cell in a conjugation just because it is a cell in a conjugation. Instead, actually come up with a sample sentence in which that meaning is actually needed.
  •  Don't make two cells in a conjugation sound different just because they're two different cells. Instead, actually come up with a sample sentence in which both meanings are actually needed and it's important to tell which is which; and come up with a pair of sentences, one using one meaning and one using the other, in which it's important to tell which is which.


I would personally even go so far as to claim that even if you perceive a need for distinction between two cells, it's not necessarily the case that such a distinction actually is needed. Lots of Finnish verbs conflate present and past forms throughout the active voice non-negative paradigm, and I bet that would strike most of you as a necessary distinction, no?

Anyway, I have no good way of wrapping up this post so here goes.

[Slightly edited, and in for more editing; I'd like to add sources, and more examples to it, as well as more in-depth functional musings about it.].

Detail #25: Interrogatives and verbs

I have been thinking about very simple interrogative systems, something along these lines:


  • one interrogative particle, which I for now will call Q (from "question")
  • some kind of verbal marking that has to do with some notion of definiteness, with more than two degrees
The combination of Q and full indefiniteness correlates with wh-interrogatives; Q with incomplete indefiniteness and a noun correlates with wh-interrogatives as determiners; finally, Q and complete definiteness correlates with yes/no-questions. Q before a noun with full definiteness 
"Q came.indef here" - "Who came here?"
"Q came.def here" - "Did someone come here?"
"Q he came.indef here" - "Did he come here?"
"Q weather.½def is it?" - "What weather is it?"
Some verbs would have a specific, likely interpretation, so if there's a verb for weather conditions in general, no marking beyond Q would be triggered, and the same may go with nouns. Of course, the definiteness could be marked on nouns as well.

This definiteness-marking on verbs could possibly be used for aspect or intensity or something with non-interrogatives, thus removing that distinction in interrogative sentences. I figure this would make it unlikely that weak orders would be phrased as questions.

Tuesday, January 22, 2013

Detail #16, addendum #1: Wackernagel exceptions

(This post provides further development of this idea.)

In a language with some Wackernagel adpositions, adpositions are prohibited from entering the Wackernagel position if the NP has an extracted pre-nominal attribute, e.g.

many.loc he went in towns.loc - he went to many towns
In this language, extraction of adjectives and some quantifiers can serve as a kind of emphasis.

Friday, January 18, 2013

Detail #24: Voice Morphology on occasional nouns

I came up with an idea that kind of provides a possible road down which to take the question of challenge #3.

Reusing voice morphology on some nouns. It should be somewhat restricted which nouns do this, and there's no need that each noun take every form. In a language with this feature, it's no fun if the nouns and verbs tend to share a lot of roots already.

boss.passive = employee, servant
boss.reflexive = self-employed, independent
boss.mediopassive =  some kind of neurotic?

Tuesday, January 15, 2013

Challenge #3: Verbal morphology occasionally affecting nouns

In English, past participle-like morphology sometimes is used in combination with nouns to create a kind of adjective pertaining to possession or such:
a six-legged creature
a two-pronged approach
turreted walls
What other things could one do with things like this?

Wednesday, January 9, 2013

Detail #23: Conjunctions with dummy pronouns

This idea has some similarities to this old post, but with quite a different locus of marking and basically a different function altogether: this time, conjunctions no longer serve to mark the relation between two verb phrases (or even two nouns, by way of implicitly coordinated verb phrases), but mark something about a single verb phrase or noun phrase.

In some language (I would especially like it in one where conjunctions are followed by an oblique case along the lines of what English does, only with non-pronouns as well; alternatively, case marking is sensitive to context), there could be a commonly used construction with dummies carrying grammatical information about the VP or NP in which they occur.

It would be a bit like if "... and shit" were grammaticalized, but for a wider range of meanings: TAM, case/adposition-like things, adjectival things for nouns, indefinite determiners and quantifying expressions, ... Some other expressions could be formed likewise with and + noun. For instance:
(obviative) it and (obviative) it : some 
 ran around and it.obl : ran around intently, ran around (perfective)
stone and them : stones (if the language lacks, say, morphological plural for inanimates or somesuch)
bean and it: some beans (with uncountable nouns, to express large quantity) 
we thought and at it (locative case of some kind) : we considered, but did not reach a conclusion
 we thought and in it (locative case of some other kind): we considered carefully
we thought and it.acc : we concluded
we traveled the land and it.acc: we passed through the land
This would probably go at the end of the vp?
Examples with nouns instead of dummies:
bean and bag : a bagful of beans
 fish and net: a catch of fish
water and pitcher: a pitcherful of water
Obtaining the meaning that "and" would usually entail would require reversing the order of the nouns:
net and fish: some/a fish and a/some nets
bag and bean: a bag and some beans
 ...
Over time, the conjunction easily could turn into an affix, and sound changes could obscure the relation to the conjunction.

Sunday, January 6, 2013

Detail #22: More flexible morphology

Consider the role syntax has, in Russian or Latin, for determining which argument is the object or whether a certain noun is definite. Imagine a similarly unreliable and highly variable use for the morphology of some language. This idea is not very far developed yet, but it seems there would be a lot of possibilities for it.

Thursday, January 3, 2013

Challenge #2: A Source of quick grammatical change

In many Australian languages and other languages thereabouts, there has been a historical taboo against words sounding like the names of recently deceased people. This has led to great changes in vocabulary over short times, especially as loans have been a common way of solving the problem of taboo words. Apparently, tribe elders have planned ahead for the death of different tribesmen and women in at least some tribes.

 Could a taboo likewise arrange for quick grammar change? I am not just thinking of replacing a morpheme with another, but of readjusting the use of a morpheme, the structure of a phrase, where markings go, typological details, merging cases, merging tenses, splitting cases, readjusting how the case-space is split, etc etc.

How would such a taboo work? Would something other than a taboo be better at achieving this effect? The social context in which I imagine this would be a pre-literate, probably even hunter-gatherer-type tribe. A larger community would probably not be able to keep up with the pace of grammar change I envision without a very efficient modern school system, and that's a bit beyond what I'd like for this.

Detail #21: Implicit negation vs. Double negation

Certain verbs tend to have a sort of negative component to their meaning:

  • lack
  • forget
  • miss (both main meanings)
  • ...
In a language where double negation does not cancel out, how about having negation not affect these at all (since the negation doesn't cancel an implicit negation assumed to be part of the verb), and instead requiring other verbs to mark the genuine negation of them?