miniatureconlangs: April 2017

Sunday, April 30, 2017

The Basics of Ćwarmin Folk Dances

The Ćwarmin tribes are increasingly adopting sedentism, but many tribes still retain very nomadic lifestyles. Many of the traditions and rituals among the sedentist tribes stem from nomadic rituals, and many of the dances have ritual use.

A very common type of dance happens before setting up a new camp, or building additional tents or structures (and in the sedentist tribes, before building a new house). One style of dance consists of placing sticks in geometric patterns on the ground: crosses, "asterisks", and sometimes tiled patterns are made. Dancers place themselves in symmetric or pseudosymmetric patterns, and make sequences of steps over and back across the sticks, often in various symmetric patterns, although with certain layouts, the ends of the shape may require some different step sequence. Oftentimes, these step sequences are the length of two regular step sequences.

Dances in large crosses or asterisks often move in and out along the diameter - often they have a "chorus" part, where the dancers close to the middle grip each other's hands (in a variety of ways, depending on the dance) and do some moves slightly different from those in the "verse", often moving in some quite simple fashion around the centre.

Sometimes, the final chorus part also differs from the earlier choruses. In some dances, the sticks are moved partway through the dance (often really a bit into the second half, length-wise). This is often carried out as a special version of the second main genre of stick dance, where the dancers hit each other's sticks, generating percussive sounds as they do so. These often also contain parts where the sticks are hit together in various ways that often feature a significant amount of syncopation.

The final part of such a three-part dance often has the same music as the first part, but with a new arrangement of steps and sticks.

At weddings, married couples often dance either opposite each other or next to each other; unmarried dancers do not dance next to or opposite anyone; and the couple who are building their first common tent move, during the dance, from not being opposite or next to one another to being so. Depending on the region, this may require short movements (just getting one step ahead) or large movements (moving slowly around a large segment of the figure). In other dances, the married couple will dance a 'duet' in an exceptional spot - in the middle of a large figure - and often in an exceptional pattern.

Typologies:

Groupings and symmetries: oftentimes, everyone moves in the same pattern. Such dances are fully symmetrical. Some may permit gaps. Sometimes, men and women dance in different patterns. Sometimes, one of the genders is divided into two groups with different patterns each. It is unusual, but not unattested, for a dance to have two separate groups per gender. Finally, some dances are only for one gender, and such dances also sometimes divide the dancers into as many as three groups.

These are single, double and triple dances (depending on the number of groups), and they usually have rotational or transpositional symmetry.

Finally, four kinds of dances exist where one must consider there to be several 'bunches' of people that move together, rather than separate people moving in similar patterns. In three of these, the bunches all behave similarly. In the final one, the groups start out behaving similarly, until one group breaks ranks; after a while another group also breaks ranks, and ends up converging on the first group that did so. Finally, the whole set converges.

Stick setups: polygons, combs ( |_|_|_|_| ), often symmetrical across the spine ( -|-|-|- ), squares-with-combs (#, possibly with more arms), stars (pentagrams, hexagrams; larger ones are uncommon), crosses and asterisks (+, *, with odd or even numbers of arms permitted). For the second style of dance, there are two main setups: a short stick in each hand, or a long stick held in both hands. Sometimes, different dancers may have different setups in such dances.

Stick transitions: Most dances that include stick transitions go between a comb or square-with-comb layout and a cross or asterisk layout.

Rhythms: Mostly the rhythms consist of short slightly lilting pulses, basically alternating strong and weak, short and long beats. Length and strength combine both ways.

Tuesday, April 25, 2017

Detail #338: A Voice - Dereflexive

Consider a language where the only pronominal way of distinguishing third persons is the distinction between reflexive and regular third persons.

In cases where only one third person is prominent, this is not widely used but sometimes the distinction is used outside its origin in reflexivity.

Here, we can consider a situation where a basic voice marker and the reflexive marker - be they of whatever kind you want - combine forces to form a "dereflexive" - a voice that lacks a proper subject, but instead has a reflexive marking that is the real subject of the thing.

Monday, April 17, 2017

Interrogatives in Sargaĺk

[This post was accidentally deleted, and retrieved from the LCC aggregator]
The interrogatives in Sargaĺk have a few interesting properties, and there are also both gaps and additions in the case system that differ from the case system elsewhere in the language.

Two pronouns correspond to English what: səre and bəre. Səre is for count noun-like things, bəre for mass noun-like things. Both lack the pegative form, but səre has several additional locative forms, and bəre has an ergative form, exceptionally enough. Səre invariantly takes masculine case morphology, even when being a determiner for a feminine noun. Bəre invariantly takes feminine case morphology.

The additional cases for səre are allative -lu, illative -li, elative -rsas.

For persons, the interrogative pronoun is t'əre. T'əre has an accusative form, t'əra. It can take feminine case markers when the answer is assumed or required to be female. The feminine accusative, t'ərat, is falling out of use in favour of the absolutive. The feminine absolutive is t'əri.

For questions such as 'which X', the pronoun is suffixed to the noun. Otherwise interrogative pronouns are the first element of the NP, or even fronted to sentence-initial position but possibly leaving the NP behind. Usually they are in-situ, though.

A fourth stem, zər-, has only two forms: the absolutive zəre and the instrumental-comitative zərmai. The first is basically a way of asking someone what they think or what they'd say; the second is the main way of asking a speaker to repeat what he said because you didn't hear it. Zərmai is seldom used within actual phrases, but rather as a stand-alone word. Essentially, zər- is a sort of interrogative for elements of the set of possible utterances.

The interrogative root zər- appears in derived verbs and nouns and adjectives, in ways that parallel the other interrogative pronouns. More about those in a later post.

The Interrogatives in Sargaĺk

In Sargaĺk, there are four interrogative pronouns -

t'əre: who
səre: what (for count-noun-like things)
bəre: what (for mass-noun-like things)
zəre: what (for utterances and thoughts)

These sometimes appear in a variety of verbs, adjectives and nouns.
The interrogative pronouns are inflected for case (and number), but t'əre is most often masculine except if a) the expected answer is feminine, or b) the answer is required by the speaker to be feminine. The three others are always inflected in the masculine except if they are adjectives; as adjectives, they are generally suffixed to the noun.

Adjectives

Adjectives are generally formed from other stems by adding suffixes. These suffixes further are inflected for case.

brəsep - full (of liquid or indifferentiable mass)
zrəsep - full of things to say
t'rəsep - full of people

The suffix -sep means 'full of' or 'saturated with'. The -e- turns to -ə- when case suffixes are added.

sərkuy - 'whatless', insignificant
bərkuy - 'empty'
zərkuy - 'silent' but also 'unthinking' depending on context
t'ərkuy - 'one of a kind' (of a person)

The meaning of -kuy generally is '-less'. -y- disappears before consonant-initial case suffixes.

From these, nominalizations can be formed, but the usual Sargaĺk discourse structure seldom calls for abstract nouns like '-lessness' or '-fulness'. -kir, however, is the usual abstract nominalization for adjectives: brəsefkir: fullness, t'ərkukir: quirkiness, zərkukir: silence, stupidity.

These can be used with appropriate case inflections to signify 'a X one', including the uninflected form for the absolutive case. 'Brəsep' thus can also signify a full container, 'zrəsep' a person with an issue to speak of, or a story-teller, or somesuch, and a t'rəsep can be a full house or a legion or anything like that.

Nouns

There are only a few derivations from these that give nouns without an intermediate adjective or verb in the derivation chain. Three primary examples, however, are

srənki (f) - a question (as to what (səre or bəre))
t'rənki (f) - a question (as to whom)
zrənki (f) - a question (as to what the listener is thinking)

Verbs

Verbs for asking are obvious contenders for this, and include

zrənoj, t'rənoj, brənoj, srənoj

Brənoj and srənoj both are used when the question pertains to time, place, etc, depending on the size and type of the expected answer: spans of time or large locations or maritime locations often are asked for with brənoj, specific days or times of day or weeks or months are asked for with srənoj.

There is also a verb k'yenoj which signifies asking a binary question. K'ye also is the particle that indicates tag questions, and can be initial or final in the tag question.

Other verbs for asking specific types of questions also exist, such as

bnaroj - ask for permission

The idea that asking for whom is somehow the primary type of question can be found in the following verbs, which can refer to any kind of question:

t'rəgrošaj: to overwhelm with questions
t'rəkoŋpoj: to ask questions with the intention of misleading the listener
t'rəroroj: to ask stupid questions
t'rəksturij: to ask a question without an intention to listen to the answer
t'rəksomaj: to ask the same question repeatedly
t'rəparuj: to ask a question in order to embarrass someone

Thursday, April 13, 2017

Detail #337: A System for Encoding Numbers

Consider a positional system of numbers based on some form of ordinal thinking. We assume for now a decimal system.

1 is really the first number in the first decad, in the first centad, in the first millenniad, and so on; this means 1 is really ...1111, but we omit leading ones and thus obtain 1. The range ten to nineteen is the second decad. Thus ...11121, ...11122, ..., or as we'd rather have them: 21, 22.

I am not particularly interested in forcing a particular base onto this either, any integer would do... it's just that I want a system where you get the following kind of pattern, given that Z = the base (which also needs a symbol of its own)

1	2	3	4	...	Z
21	22	23	24	...	2Z
				...
Z1	Z2			...
211	212	213	214	...	21Z

We don't need a zero, since we're not interested in those at all: the second '1' may very well come directly after the first '7' for all I am concerned, as long as the pattern is kept intact.

This is fairly similar to bijective numeration in some way, but adds the twist of being slightly off.
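One way to read the table above, sketched in Python: take the n-th number (counting from one), write n - 1 in base Z, and shift every digit up by one, so that the digits run from 1 to Z and omitted leading positions are implicit 1s. This mapping is an assumption drawn from the table rows, not something stated outright, and the function names `encode`, `decode` and `show` are made up for the illustration.

```python
def encode(n, base=10):
    """Digits (each in 1..base) of the n-th number, most significant first."""
    if n < 1:
        raise ValueError("the system has no zero; n must be at least 1")
    m = n - 1
    digits = []
    while True:
        m, r = divmod(m, base)
        digits.append(r + 1)  # shift 0..base-1 up to 1..base
        if m == 0:
            break
    return digits[::-1]


def decode(digits, base=10):
    """Inverse of encode; leading 1s contribute nothing, as in ...1111."""
    m = 0
    for d in digits:
        m = m * base + (d - 1)
    return m + 1


def show(digits, base=10):
    """Render digits as text, using 'Z' for the base itself (base <= 10 only)."""
    return "".join("Z" if d == base else str(d) for d in digits)


print([show(encode(n)) for n in (1, 10, 11, 20, 100, 101)])
# ['1', 'Z', '21', '2Z', 'ZZ', '211']
```

Under this reading the first row of the table holds Z numbers, so 21 stands for eleven, and padding with extra 1s on the left never changes the value: `decode([1, 2, 1])` and `decode([2, 1])` both give 11.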

Fun thing: there's always an infinite string of 1s to the left of any 'regular' number. One could, however, imagine exceptional numbers where, for instance, there's an infinite string of some other numbers to the left, or an infinite regular pattern (e.g. ...123123123), or even an infinite irregular pattern (reverse your favourite irrational number and drop the decimal mark).

Challenge: develop easy rules for arithmetic for this, without involving conversion back to and from regular numbers.

Saturday, April 8, 2017

Conlanger Lore: Reasoning about Grammar

In part, this intermittent series of posts will deal with reasoning and knowledge in linguistics, when applied in such an unusual way as conlanging is.

One notion that forms part of the backbone of conlanging thought is the idea that we can just apply reasoning at a very basic level to reach conclusions about typology.

Consider, for instance, pro-drop. Common wisdom is that pro-drop and verbal marking for subjects (and possibly objects) go together. Superficially, this seems reasonable, but we know there are languages that have subject congruence, but do not permit pro-drop. Likewise, we know there are languages that have pro-drop, but don't mark their verbs.

Common wisdom is that lack of case (and/or verbal marking) forces word order to be fixed. But many languages without case marking permit some amount of word order rearrangement - Swedish, for instance, permits both SVO and OVS, without any explicit marking. This to the extent that I have been in situations where people have parsed what I have said (SVO) as OVS, because they have parsed contextual cues and salient features of the words involved in the utterances differently than I would have expected.

Yes, of course Swedish doesn't have strictly free word order - SVO still probably accounts for close to a majority of utterances, followed by AVSO (where A = adverbial), followed by some oddities like VO (with omitted subject), followed by OVS fairly far down the line. The point is, you don't need case marking to free the word order; what you need it for is to obtain very free word order, that is, word order where the different orders don't significantly differ statistically, and thus make it hard even to learn what is what.

The point I am trying to reach is that ultimately, we cannot rely on ideas like "IFF X is marked in one way, then X can be left unmarked in other ways". Some languages simply structure their utterances in ways where who or what the subject is is irrelevant. In some languages, discourse tends to focus more on events than on the people involved; in others, the discourse is more interested in the who-does-what aspect of it. This is much like how some languages don't have tense. Further, with subjects and objects, oftentimes there is a significant bunch of additional knowledge the speaker and listener can be assumed to share, and this makes it insufficient to look only at whether the subject can be retrieved from other markings (with regards to pro-drop) or resolved from other markings (with regards to case).

Thus, when we reason about language, we need to acknowledge that the actual form is not IFF X, then Y, but rather if any X out of a huge bunch of unknowns, then maybe Y.

Thursday, April 6, 2017

Detail #336: Modelling Restrictions on Compounds

Languages with compounds can have restrictions on what compounds are permitted. Describing such a system of restrictions in some depth could be a nice way of getting an impressive grammar done. Let us consider some ways of 'modelling' such a system. There's a difference between modelling and exhaustively describing, in some sense.

Giving an exhaustive description is possible for a conlanger: we inform the reader how it works and since we're the creators, our fiat holds. However, this might be somewhat uninteresting. Models are interesting in that they attempt to catch what happens, but might simplify some stuff and therefore be mistaken about things as well.

Given the natural scope of a language - spoken over generations, spoken by lots of speakers in varying relations with one another (all the way from family to never having interacted at all, due to not even living in the same century or geographically all that close) - it is likely for there to be a lot of variation in some parts of the language, and thus a model makes a lot of sense: it'll be wrong some of the time, but it'll catch the main traits of the system.

So, let's consider compounding and how we could model restrictions on it. First, we can recognize two types of edges of a compound: the left edge and the right edge. We can imagine a compound that does not permit any added morpheme to the left, and the same goes for the right. We call these 'left-saturated' and 'right-saturated' compounds, respectively. A compound that is saturated at both edges is simply saturated.

Another thing about modelling is that it'd be good if it also helped parse the compounds. Thus, a good model should tell us whether an element in a compound is a left-branch or a right-branch by looking at the word. It should even, probably, tell us whether two neighbouring elements are only "superficial" neighbours.

[Figure: left- and right-branching]

This gives our model some actual usefulness beyond its 'descriptive' power. Now we come to the nitty-gritty stuff. We of course want to have some way of quantifying whether a word accepts compounding. Let's simply use numbers for this - we could put it in a range [0, 1], where 1 is 'accepts compounding', 0 is 'saturated', and values in between are probabilistic estimates as to how likely it is to accept compounding. So, we have, for any word, two values left and right ∈ [0, 1]. I'll write left and right as a single vector C = (x, y), where x is the left and y is the right edge. Subscript text comes in four varieties: full words represent themselves. Thus C_DonauSchiff is the vector of the compound of Donau and Schiff. One-letter capital variables represent an arbitrary word. Small letters

Let us take two words, Donau and Schiff. These have associated vectors C_Donau and C_Schiff. The resulting Donauschiff too has an associated vector, C_DonauSchiff, which is derived from the vectors of the two elements. The interesting thing, of course, is the function that takes C_Donau and C_Schiff and produces C_DonauSchiff. It should be clear that order is relevant - we wouldn't expect Schiffdonau and Donauschiff to have the same properties. A very simple model would do something like this:

C_EF = (E_l, F_r), where l and r as subscripts mark "left edge value" and "right edge value".

In such a model, the properties at the edges carry over unchanged. However, there's no a priori reason why AB_l = A_l and AB_r = B_r. In other words, there's no reason why a compound's edges should have the same compounding properties as the elements that occupy those edges - shoemaker need not have the same left-edge property as shoe and the same right-edge property as maker - in fact, we'd sort of expect, in English, that shoemaker would be more similar at the left edge to maker than to shoe (though maybe not entirely so). The compound is a new word, possibly a word of a different word-class (with regards to at least one of its parts), and thus it seems unjustified to expect the compoundability to be conserved at edges.
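The simple edge-inheriting model, C_EF = (E_l, F_r), can be sketched in a few lines of Python. The class `Word`, the helper names, and above all the numeric values given to Donau and Schiff are invented for the illustration; nothing here claims these are the right numbers for German.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str
    left: float   # likelihood of accepting material at the left edge, in [0, 1]
    right: float  # likelihood of accepting material at the right edge, in [0, 1]


def compound(e: Word, f: Word) -> Word:
    """Simple model: the compound inherits E's left edge and F's right edge."""
    return Word(e.form + f.form.lower(), e.left, f.right)


def saturation(w: Word) -> str:
    """Name the saturation state implied by the two edge values."""
    if w.left == 0 and w.right == 0:
        return "saturated"
    if w.left == 0:
        return "left-saturated"
    if w.right == 0:
        return "right-saturated"
    return "open"


# Made-up edge values, purely for demonstration.
donau = Word("Donau", left=0.2, right=0.9)
schiff = Word("Schiff", left=0.8, right=0.6)

print(compound(donau, schiff))  # form='Donauschiff', left=0.2, right=0.6
print(compound(schiff, donau))  # form='Schiffdonau', left=0.8, right=0.9
```

Swapping the order changes the resulting vector, which is the one property this toy model gets right; a less naive function would replace the body of `compound`.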

Thus we probably want a more detailed idea of what compounds are permitted - we might want both C_l and C_r to be vectors for different types of lexemes: verbs, proper nouns, nouns of different classes, adjectives of different kinds, etc. We might even want to go further: probabilities for specific inflected forms, probabilities for 'heavy' words vs. 'light' words, measured by their nested structure, etc.

Anyways, my next step in modelling this would be to come up with some kind of 'average' probability per word class pair, e.g. adjective-noun 75%, inanimate_noun-transitive_verb 80%. Once this is done, I'd make a weighted graph, where nodes are types of words, and each directed edge carries the probability of a word of one type compounding before a word of some other type. Self-cycles may exist.

Next, each lexical item in the conlang's lexicon would be given a run where a randomizer decides whether it'll accept a certain word-type as prefix or suffix with the probability given by that graph. The probabilities for the new word's edges would be based on some way of measuring 'saturation', which again creates a new thing we might need: a saturated word does not permit more suffixes, and this may happen even if there are non-zero probabilities going on for some level of the compound at the edges.

I am not going to present any algorithm for this now; this is basically an early rambling intended to come up with something.
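Short of a worked-out algorithm, the graph-plus-randomizer step described above could be roughly sketched as follows. Only the two percentages quoted earlier come from the text; the noun-noun self-cycle, the name `roll_word`, and the choice to treat a missing edge as probability zero are assumptions made for the sake of the example, and the saturation measure is left out entirely.

```python
import random

# Directed, weighted graph: key (a, b) is the probability that a word of
# class a compounds before a word of class b.
GRAPH = {
    ("adjective", "noun"): 0.75,                  # from the text
    ("inanimate_noun", "transitive_verb"): 0.80,  # from the text
    ("noun", "noun"): 0.90,                       # invented self-cycle
}


def roll_word(word_class, rng, graph=GRAPH):
    """One randomized run for a lexical item of the given class.

    Returns (prefix_classes, suffix_classes): the word classes this item
    will accept before it and after it. Missing edges count as 0.
    """
    classes = sorted({c for pair in graph for c in pair})
    prefixes = {a for a in classes
                if rng.random() < graph.get((a, word_class), 0.0)}
    suffixes = {b for b in classes
                if rng.random() < graph.get((word_class, b), 0.0)}
    return prefixes, suffixes


rng = random.Random(2017)  # fixed seed so a lexicon can be regenerated
for cls in ("noun", "adjective", "transitive_verb"):
    print(cls, roll_word(cls, rng))
```

Each lexical item gets one such roll when it enters the lexicon, so two nouns can end up with different compounding behaviour even though they share the same class-level probabilities.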

Wednesday, April 5, 2017

Detail #335: Possessive Suffixes and Dative Congruence

Consider a language with possessive suffixes as well as an additional, slightly similar thing. We can imagine some interesting restrictions, though, and an immediate detour into that is called for about now.

In Proto-Finnic, the subject could not be marked with possessive suffixes; only the other cases permitted it. This is basically a nominative-alignment thing. Morphologically, this has left the trace that even subjects in Finnish, when marked by a possessive suffix, morphologically are identical to objects.

Now, the kind of suffix I am thinking of is an indirect object congruence marker. Thus, 'I gave a book to him' would come out as 'I gave book.[3sg ind. obj]'. Now, possessive and indirect object suffixes are in complementary distribution - they cannot cooccur.

However, we can imagine a weird situation where the indirect object congruence is permissible on intransitive subjects as well (at least for a short while, until the possessive marker catches up), for situations like 'the book is for him' and such.

For a short while, thus, the possessive marker would follow a nominative pattern, whereas the indirect object congruence marker would follow an ergative pattern.

Monday, April 3, 2017

Detail #334: Number Congruence and Discongruence with Numerals

A fair share of languages use singular nouns after numerals - in e.g. Turkish, you say 'two man', not 'two men'. In part this is a reduction of redundancy, but on the other hand, redundancy can be a feature rather than a bug.

Now, let's consider a language that operates like Turkish on this count, but has an extra quirk: many determiners' stems also encode number, so e.g. the singular 'this' and the plural 'these' do not share a stem. However, both also use a full set of case congruence markers that encode both number and case.

For 'these.acc four.acc dog.acc', "these" would thus have a plural stem with a singular accusative suffix on it.

An obvious suggestion for a situation where the opposite could happen - singular stems with plural morphology - could be when the speaker wants to imply some kind of collective. Thus, collectives would be morphologically plural, and only marked whenever there are determiners involved.