Wednesday, November 9, 2011

A Diachronic Approach for Schwa Deletion in Indo Aryan Languages, Monojit CHOUDHURY, Anupam BASU and Sudeshna SARKAR

(a) Which research question was addressed in the paper?

The authors propose a more efficient and accurate algorithm, based on syllable minimization, for schwa deletion in Indo-Aryan languages (Hindi, Bengali, etc.).

(b) Data used (corpus):
The algorithm was implemented for Bengali and Hindi. The Hindi version was tested on the words in a pocket dictionary (Hindi-BanglaEnglish, 2001); the Bengali version was tested on 1000 randomly selected words from a corpus.

(c) Algorithms/methods used:

A linear time algorithm for syllabification (SYL) is proposed.

This uses the fact that the maximum length of an allowable consonant cluster in Indo-Aryan languages (IAL) is three. After a word is syllabified, the algorithm greedily deletes schwas as long as certain constraints are not violated. One constraint states that only a schwa that is part of an open syllable can be deleted; another states that after schwa deletion, the stranded consonant c is appended to the coda of the previous syllable.

Together, these two constraints imply that schwas in two consecutive syllables cannot both be deleted.

In addition, the following rules can be derived from the constraints on Dw (the acoustic distance function):

R1. Schwa of the first syllable cannot be deleted.

R2. Schwa cannot be deleted before a consonant cluster.

R3. The word-final schwa can always be deleted unless appending the penultimate consonant to the previous syllable results in an inadmissible cluster.

R4. For Bengali, which does not allow complex codas, schwas cannot be deleted after consonant clusters.

R5. A schwa followed by a vowel cannot be deleted.
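
To make the rules concrete, here is a minimal Python sketch of how R1 to R5 and the open-syllable constraint might be checked on an already syllabified word. It is not the authors' implementation. The representation (a list of (onset, nucleus, coda) string triples, one character per phoneme), the placeholder symbol "a" for schwa, the names may_delete and choose_deletions, the caller-supplied admissible_coda predicate, and the right-to-left scan order are all assumptions made for illustration.

```python
SCHWA = "a"  # placeholder symbol for the inherent vowel (assumption)


def may_delete(syls, i, admissible_coda, complex_codas=True):
    """Check the open-syllable constraint and R1-R5 for the schwa of syllable i.

    syls is a list of (onset, nucleus, coda) string triples (hypothetical format).
    admissible_coda is a caller-supplied predicate on a coda string (hypothetical).
    """
    onset, nucleus, coda = syls[i]
    if nucleus != SCHWA or coda != "":
        return False  # only a schwa in an open syllable is a candidate
    if i == 0:
        return False  # R1: the first syllable keeps its schwa
    if i == len(syls) - 1:
        # R3: a word-final schwa goes unless the resulting coda is inadmissible
        return admissible_coda(syls[i - 1][2] + onset)
    next_onset = syls[i + 1][0]
    if len(next_onset) > 1:
        return False  # R2: no deletion before a consonant cluster
    if next_onset == "":
        return False  # R5: no deletion when a vowel follows
    if not complex_codas and len(onset) > 1:
        return False  # R4: no deletion after a cluster (e.g. Bengali)
    return True


def choose_deletions(syls, admissible_coda, complex_codas=True):
    """Greedy right-to-left pass that never deletes in two consecutive syllables."""
    chosen = []
    for i in range(len(syls) - 1, -1, -1):
        if chosen and chosen[-1] == i + 1:
            continue  # the following syllable already lost its schwa
        if may_delete(syls, i, admissible_coda, complex_codas):
            chosen.append(i)
    return sorted(chosen)
```

For a toy word syllabified as ka.ma.la, choose_deletions([("k", "a", ""), ("m", "a", ""), ("l", "a", "")], lambda c: len(c) <= 2) returns [2]: the final schwa is deleted by R3, the middle one is skipped because its neighbour was deleted, and the first is protected by R1.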