Issues and Challenges in Finding the Structure of Words
Morphological parsing aims to reduce the variability of word forms by defining clear, higher-level linguistic units. It seeks to:
1. Remove unnecessary irregularities.
2. Limit ambiguity.
3. Establish consistent morphological rules.
However, several challenges persist in this process.
01 Irregularity
Irregularity refers to words or word forms that do not follow the regular morphological or syntactic patterns of a language. It is a challenge for algorithms that rely on fixed patterns, because each exception has to be listed or learned separately.
1. Irregular Verbs: Many verbs do not follow the standard past-tense formation (adding -ed). For example, the base form go becomes went rather than goed.
2. Exceptional Inflection: Some comparative and superlative adjectives break the standard pattern. For example, big regularly becomes bigger/biggest, but good becomes better/best rather than gooder/goodest.
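One common way to cope with irregular forms is to check an exception list before applying the regular rule. The following is a minimal sketch of that idea, not a complete inflection system: the `IRREGULAR_PAST` table and the `past_tense` function are hypothetical names introduced here for illustration, and the regular rule ignores details such as consonant doubling.

```python
# Hypothetical exception lexicon (an assumption for this sketch);
# a real system would need a much larger list.
IRREGULAR_PAST = {
    "go": "went",
    "eat": "ate",
    "run": "ran",
}


def past_tense(verb):
    """Return a past-tense form: exceptions first, then the regular rule."""
    # Irregular verbs are looked up directly, since no rule predicts them.
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Regular rule: verbs ending in "e" take "-d", all others take "-ed".
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


for verb in ["walk", "bake", "go"]:
    print(verb, "->", past_tense(verb))
```

Without the lookup step, the regular rule alone would produce goed for go, which is exactly the kind of error irregularity causes.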
02 Ambiguity
Words or sentences may have multiple interpretations, which makes it hard for NLP models to select the intended one. There are four types:
1. Word Sense Ambiguity: A single word has different meanings (e.g., "bank" as a financial institution vs. a riverbank).
2. Part-of-Speech Ambiguity: The same word can belong to different parts of speech depending on its usage.
Example: In 'I run', the word 'run' is a verb, while in 'He went for a run', it is a noun.
3. Structural Ambiguity: A sentence can be interpreted in multiple ways.
Example:
Sentence -> I saw the man with the telescope.
Possible Interpretations:
i. I used a telescope to see the man.
ii. The man I saw was holding a telescope.
4. Referential Ambiguity: Pronouns (e.g., he, she, his, her) can refer to multiple potential antecedents, making it unclear who or what is being discussed.
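Part-of-speech ambiguity can be made concrete with a lexicon that stores every candidate tag for a word. The sketch below is illustrative only: the `LEXICON` table, its tag names, and the `ambiguous_words` function are assumptions made for this example, and it only detects ambiguity without resolving it.

```python
# Hypothetical toy lexicon mapping each word to its possible parts of speech.
LEXICON = {
    "i": {"PRON"},
    "he": {"PRON"},
    "went": {"VERB"},
    "for": {"ADP"},
    "a": {"DET"},
    "run": {"NOUN", "VERB"},
    "bank": {"NOUN", "VERB"},
}


def ambiguous_words(sentence):
    """Return the words in a sentence that have more than one candidate tag."""
    result = {}
    for token in sentence.lower().split():
        # Strip simple trailing or leading punctuation before the lookup.
        word = token.strip(".,!?")
        tags = LEXICON.get(word, set())
        if len(tags) > 1:
            result[word] = sorted(tags)
    return result


print(ambiguous_words("He went for a run."))
print(ambiguous_words("I run"))
```

Both sentences report 'run' as ambiguous between NOUN and VERB. Choosing the right tag requires context, which is what a part-of-speech tagger adds on top of a lookup like this.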
03 Productivity
Productivity is the ability of a language to keep generating new words from existing or previously unknown terms. Algorithms that rely on a fixed vocabulary may not recognize these new words.
Example: The term Google (derived from the mathematical term googol) became a base for new words like Googling, Googlish, and googlyology through productivity rules.
New words, as well as proper nouns like names of people or locations, often lack standardized definitions, leading to challenges in algorithmic processing.
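A simple fallback for unseen words is to strip a known suffix and report whether the remaining stem is in the vocabulary. The sketch below assumes a hypothetical suffix list (`SUFFIXES`), a tiny known-stem set (`KNOWN_STEMS`), and a minimum stem length of three characters; it does not restore a dropped final "e", so Googling yields the stem "googl" rather than "google".

```python
# Hypothetical suffix list for this sketch, with longer suffixes tried first.
SUFFIXES = ["ing", "ish", "ed", "s"]

# Hypothetical vocabulary of stems the system already knows.
KNOWN_STEMS = {"walk", "talk"}


def analyze(word):
    """Split a word into (stem, suffix, stem_is_known)."""
    lower = word.lower()
    for suffix in SUFFIXES:
        # Require at least three characters of stem to avoid splitting
        # short words such as "bus" into "bu" + "s".
        if lower.endswith(suffix) and len(lower) > len(suffix) + 2:
            stem = lower[: -len(suffix)]
            return stem, suffix, stem in KNOWN_STEMS
    # No suffix matched: treat the whole word as the stem.
    return lower, "", lower in KNOWN_STEMS


for word in ["walked", "Googling", "Googlish"]:
    print(word, "->", analyze(word))
```

Even when the stem is unknown, the recovered suffix still gives downstream components a hint about the word's likely grammatical role.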