Paper details: “Automatic Generation of Tamil Lyrics for Melodies.”
Authors: Ananth Ramakrishnan A., Sankar Kuppan and Sobha Lalitha Devi
I was browsing through the schedule page of the 2009 workshop on “Computational Approaches to Linguistic Creativity” when I came across this title, “Automatic generation of Tamil lyrics for melodies”, and was quite fascinated by it. I am perennially suspicious of work on such sci-fi topics and of how well it actually performs. Nevertheless, I was eager to read it. Back then, the workshop had not yet taken place and the paper was not available for download. Now it is, and I finally got a chance to have a look at it (Interested? download-here)
So, as the name indicates, the paper describes a system that automatically generates lyrics for a given melody. They did this for Tamil. It gives a general overview of the process and mentions related work. There was a reference to a work on a “poetry generation system”!! I was shocked to a considerable extent. Most human poetry is itself unreadable and sometimes crappy, and here we are thinking about poetry generation! Man’s imagination indeed roams freely in thin air 🙂 There was even a reference to some work on “lyric generation strategies”, and I thought: “Oh! This is not sci-fi then, if so many people are working in this direction!” 🙂
Coming to the paper, the process of lyric generation involves 2 steps:
1. Syllabic pattern generation
2. Identifying a phrase matching this pattern, as well as satisfying other word/sentence/rhyming requirements.
For the first part, they used a notation called KNM. K stands for Kuril (short vowel), N for Nedil (long vowel) and M for Mei (consonant). Taking their own example, the word thAmarai is split into “thA-ma-rai” and labelled NKN. To generate such patterns for a given melody, they used machine learning (specifically Conditional Random Fields, or CRFs) to train a system to learn these patterns. The system was trained with sample film songs and their lyrics as input (and I have my doubts about the size of the data they trained with). The trained model is then used to label a given melody with a syllabic pattern. This pattern is passed to a sentence generation module, which generates a sentence that satisfies the following conditions:
1. Words should match the syllabic pattern
2. The sequence of words should be meaningful.
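To make the KNM idea a little more concrete, here is a small Python sketch of how a romanized word could be labelled and how words could be looked up by pattern. This is my own illustration, not the authors' code. The romanization convention (uppercase vowels and the diphthongs "ai"/"au" count as long, lowercase vowels as short, a consonant with no vowel after it counts as Mei), the function names and the four-word toy lexicon are all assumptions made for the example. It only covers the first condition (matching the pattern); checking that a word sequence is meaningful is not attempted.

```python
import re
from collections import defaultdict

# Assumed romanization (my convention, not confirmed by the paper):
#   uppercase vowels and the diphthongs "ai"/"au" are long (Nedil, N),
#   lowercase vowels are short (Kuril, K),
#   a consonant with no vowel after it is a pure consonant (Mei, M).
_CONSONANT = r"(?:th|dh|ch|sh|zh|ng|[b-df-hj-np-tv-z])"
_VOWEL = r"(?:ai|au|[AEIOUaeiou])"
# Either an optional consonant followed by a vowel, or a bare consonant.
_UNIT = re.compile(rf"{_CONSONANT}?({_VOWEL})|{_CONSONANT}")


def knm_pattern(word: str) -> str:
    """Label each unit of a romanized word with K, N or M."""
    labels = []
    for match in _UNIT.finditer(word):
        vowel = match.group(1)
        if vowel is None:
            labels.append("M")  # bare consonant
        elif vowel in ("ai", "au") or vowel.isupper():
            labels.append("N")  # long vowel or diphthong
        else:
            labels.append("K")  # short vowel
    return "".join(labels)


def index_by_pattern(lexicon):
    """Group words by their KNM pattern for quick lookup."""
    index = defaultdict(list)
    for word in lexicon:
        index[knm_pattern(word)].append(word)
    return index


def candidates(patterns, index):
    """For each requested pattern, list the lexicon words that match it."""
    return [index.get(pattern, []) for pattern in patterns]


# Hypothetical toy lexicon, written in the assumed romanization.
toy_lexicon = ["thAmarai", "malar", "nilA", "pAdal"]
toy_index = index_by_pattern(toy_lexicon)

print(knm_pattern("thAmarai"))
print(candidates(["NKN", "KN"], toy_index))
```

In a real system the patterns would come from the trained CRF model rather than being typed in by hand, and the candidate words would still have to be filtered for meaning and rhyme.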
This is baseline-level work, and they mention the ways they plan to improve their system in the future. They also plan to experiment with different strategies as well as data sets from different domains. Finally, they mention my “sci-fi” idea of poetry generation again 🙂
This is only a very brief summary of the paper. For more details, go and read it online.