Monday, September 24, 2018

Detail #384: Long-Range Negation Congruence and Probabilistic Grammars

Let us consider a language like Finnish (or almost English), where negation is done by an auxiliary. In such a language, the main verb also takes a special form (in Finnish the connegative, in English the 'infinitive' or the 'active participle', to the extent we would call those 'special' :/ ).

Now, the main point here is that in English and Finnish, the form you expect is different for positive and negative statements:
he sits vs. he does not sit
hän istuu vs. hän ei istu
In English, for the present progressive or whatever it's called, this breaks down:
he is singing vs. he is not singing
Let's however assume a language like Finnish, where this distinction is more clear-cut and present almost throughout the language. Now, we can of course imagine certain non-negative adverbials that weaken a statement coming to trigger the negative form, giving us things analogous to
he barely work
he seldom think
he scarcely turn up
where barely, seldom and scarcely essentially become lightly negative auxiliaries.

Now, that's just one of the milder ideas of where such pseudo-negation might turn up. Another could be embedded negation bleeding outwards:
she tell him not to buy bitcoin
she know that he wasn't at work
We could also have negation bleeding downwards:
she doesn't know that he work in finance
We could of course make a probabilistic grammar for this, and that's a topic I think could be worthwhile for conlangers to consider - modelling the rules of a grammar in terms of probabilities.

Let's use p(x) for the probability of such 'mistaken' congruence, i.e. a connegative verb form with an actually 'positive' meaning. p(x) is then a function, where x is some way of representing the relevant input - perhaps the distance between the 'outer' verb and the 'inner' verb.

We may give some simple function for this, say, the probability is at most 75%, and is multiplied by 0.75 for each unit of distance added. Thus, p(x) = 0.75^x
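As a minimal sketch of this in Python (the function names and the random draw are my own additions, not anything inherent to the grammar):

```python
import random

# Assumed sketch: p(x) = 0.75 ** x, with x the distance between the two verbs.
def p(x):
    """Probability of 'mistaken' connegative congruence at distance x."""
    return 0.75 ** x

def uses_connegative(x):
    """Randomly decide whether a positive clause nevertheless surfaces
    with the connegative form, given distance x."""
    return random.random() < p(x)

print(p(1))                 # 0.75
print(p(3))                 # 0.421875 - the effect fades with distance
print(uses_connegative(3))  # True or False, drawn at random
```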

We could then start by considering, for instance, different subject as a difference worthy of one unit. Every single constituent between the verb and the subclause (or non-finite verb phrase) could be one unit, two units if the constituent is heavy. Either of the verbs being telic adds a unit of distance, but both being telic only adds 1.5 units. The object of the outer verb being the same as the subject of the embedded verb removes 0.5 units.
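A sketch of that unit system, with invented parameter names and assuming exactly the weights listed above:

```python
def distance(different_subject, intervening, heavy, outer_telic, inner_telic,
             object_is_embedded_subject):
    """Distance x, in units, between the outer and the embedded verb.

    All parameter names are my own: `intervening` is the number of
    constituents between the verb and the subclause, `heavy` how many
    of those count as heavy.
    """
    x = 0.0
    if different_subject:                    # different subjects: one unit
        x += 1.0
    x += (intervening - heavy) + 2 * heavy   # light = 1 unit, heavy = 2 units
    if outer_telic and inner_telic:          # both telic: 1.5 units in total
        x += 1.5
    elif outer_telic or inner_telic:         # only one telic: 1 unit
        x += 1.0
    if object_is_embedded_subject:           # coreference shortens the distance
        x -= 0.5
    return x

# different subjects, two light intervening constituents, only the embedded
# verb telic, no object/subject coreference:
print(distance(True, 2, 0, False, True, False))   # 4.0
```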

Of course, we could add special cases - certain verb pairs whose congruence has become 'linked', so that whenever these two verbs appear together, the probability of mistaken congruence is unusually high, or somesuch. I am deliberately leaving the idea a bit vague here - I mainly want conlangers to think of grammatical rules in probabilistic terms, while also presenting a grammatical idea that happens to be a suitable topic for illustrating probabilistic grammar.
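For what it's worth, such lexical special cases could be layered on top of the distance formula like this - the verb pairs and probabilities here are entirely made up, just to show the shape of the idea:

```python
# Made-up 'linked' verb pairs with their own fixed probabilities.
LINKED_PAIRS = {
    ("doubt", "happen"): 0.95,
    ("forget", "do"): 0.90,
}

def p_congruence(outer_verb, inner_verb, x):
    """Check the override table first, else fall back on p(x) = 0.75 ** x."""
    return LINKED_PAIRS.get((outer_verb, inner_verb), 0.75 ** x)

print(p_congruence("doubt", "happen", 5))   # 0.95, regardless of distance
print(p_congruence("say", "work", 2))       # 0.5625, from the general formula
```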