Text Adaptor

I was reading about the “Text Adaptor” tool from Educational Testing Service (ETS). As the name indicates, it is an automatic tool for text adaptation – which in general means a teacher’s modification of texts to make them more readable and understandable for a given student’s learning level. It was interesting to read that the tool actually does things like automated synonym detection, antonym detection, automated text summarization, some translation, and a bit of shallow parsing to identify complex sentence structures.

I was searching for educational applications of natural language processing when I stumbled across this tool. It was interesting to see these topics being discussed in a totally different, real-life context. I wondered if people do this kind of thing for Indian languages. In the present-day scenario, when our own people don’t know their own language properly, I think tools of this kind are necessary. How cool it would be if such a tool simplified something like the text of Viswanatha Satyanarayana into “manava bhasha” (human language) and let me read it :p Ah! Imagination is such a wonderful thing…. Alas, reality isn’t :(

“Text Adaptor” is currently being used in two online teacher development programs for ELL (English Language Learner) teachers in the USA. Its goal is to help teachers identify the linguistic complexity of a text and help them make it more accessible to an ELL. Further, the report that I am reading also talked about “Text Adaptor” data possibly providing a good resource for NLP research in text quality and related areas. As I proceeded further with the report, I began wondering about measuring the relatedness between two sentences in a passage – taking a cue from the work. Since it made me think (after a long time, something made me think!), I think my reading time was well spent ;)

Hmm, I want to see how the tool works… to talk any further :)

Yeah, it appears to be a very interesting and practical application of NLP – the very place where my search began. So, my search took me down the right path – at least for once ;)

There is a brief intro here: “The Automated Text Adaptation Tool” by Jill Burstein, Jane Shore, John Sabatini, Yong-Won Lee & Matthew Ventura, Educational Testing Service. Demo at NAACL-HLT 2007. I don’t actually know if it’s freely available to view…

Published on November 10, 2009 at 10:14 am

Computing for global development…..

I was reading this report: “Computing for Global Development -Is it Computer Science Research?” a few minutes back. Thought it was an interesting article – with some questions to ponder upon.

There are a few things that captured my attention:

1. “ICTD also poses new research challenges for systems and networking researchers, given the unique constraints of applications e.g., there have been several streams of work in long-distance WiFi to connect remote rural villages to urban centers or the Internet..”
- Hmm, that left me wondering for a while – in what way will areas like computer vision, robotics or language engineering be posed new challenges specific to ICTD? I mean, as is discussed further down in the paper, as well as in my blog: what research challenges are specific to these areas and ICTD put together? Do those challenges remain challenges even outside ICTD – i.e., does solving them in ICTD imply solving a hitherto unsolved problem in those areas in general?

2. “ICTD research not only impacts global development, but can also advance “traditional” computer science. Two examples that come to mind are WildNet and HashCache. Although motivated by applications in developing regions, these works fundamentally changed our view ….”
- Hmm, pretty interesting to know about HashCache and WildNet…. Well, I did not know of them before, so I can’t comment more until I read about them :)

3. “ICTD is thoroughly interdisciplinary…. One issue with interdisciplinary work is that problems seen to be legitimate, even crucial, to the area often don’t contain enough research content in any one area to satisfy the contributing disciplines. Also, even if there is enough research content in one area, solving that part of the problem may be only a small portion of the larger problem being addressed. As is the case with some work in systems research, it’s not clear that a core technical contribution is really valuable without building the whole system. Unlike systems research, however, the whole system might include non-technical components requiring social, cultural, economic, and political efforts, as well.”
- I felt this is a very important observation. Well presented too.

4. “Third, ICTD like some other application sub fields lacks a clear definition of generic technical issues within a well circumscribed context. Just in agriculture applications, there are networking problems (e.g., connecting remote villages to urban experts), speech problems (e.g., for building a Q&A system in multiple dialects), information-retrieval problems (e.g., permitting cross-lingual, geography-relevant database queries), computer-vision problems (e.g., diagnosing diseased crops via photographs), and so on.”
- Which is where I get back to the doubts I raised at point 1. From the examples given here, the problems are general problems in those areas. However, there’s another possibility – these are not unsolved problems in the respective areas, but there are more engineering issues involved. Hmm, this particular part of the paper reminds me of my thoughts on sitting through the “e-sagu” talks and wondering what was so much of research in that project. (That was long back, and I understood it better later.)

5. “Ironically, ICTD is struggling to establish itself within a field that has itself had a history of struggling to establish itself, namely computer science.”
- Irony! :) It was a very interesting section – this section 3.1 – “Acceptance of applied science”. However, my doubt is this : Why should it be accepted as a sub area of computer science? Why not science in general? :) I mean – instead of applied computer science, why not applied science, as mentioned in the section title?

6. “It’s surprisingly difficult to find hard, technical problems that are unique to ICTD – often, the technical challenges are generic computer science research problems (e.g., better speech recognition). The portion that is relevant for ICTD is often limited to adaptation (e.g., what’s the best way to train speech recognition engines quickly in local dialects?). It’s not that challenging technical problems don’t exist in ICTD, it’s that they’re often not obvious.”

-It goes back to 1 again :) Hmm, so, perhaps, ICTD researchers always keep thinking about this issue, then :)

- I found the first part of this paper (until Section 3.1) to be very readable, with interesting points raised. The rest of it is not really my kind of stuff – it’s for an older audience, I guess ;)

It helped me get a better overview of the “ICTD as Computer Science Research” question, though.

The report can be read here.

Published on October 14, 2009 at 1:56 pm

Word games as experimental linguistics

Frankly speaking, I have no clue what “experimental linguistics” is supposed to mean. I have no idea even after reading through this document. But, despite that, I found this paper to be a very interesting read.

Well, when I began this paper, I imagined “word games” to mean something broader than what is meant here. Let me prepare you all – this paper basically talks about “Pig Latin” kind of word games, and not “Scrabble” kind of games or other such things.

So, read a bit on PigLatin? Proceed further now. :)

Actually, I am new to Pig Latin, and this paper has some nice examples to explain what it is, so I proceeded ahead. There are a few surprises that Pig Latin can throw at you – expect-the-unexpected sort of confrontations. Using such cases, the author tries to explain how word games can be used to get insights into the syllable structure and workings of a language.
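For anyone else new to the game, here is a minimal Python sketch of one common English Pig Latin convention: move the leading consonant cluster to the end and add “ay”, and add “way” to words that start with a vowel. This is my assumption of a typical rule set, not necessarily the exact variant the paper describes; the function name `pig_latin` and the choice of treating only a, e, i, o, u as vowels are purely illustrative.

```python
VOWELS = "aeiou"  # assumption: "y" is treated as a consonant here


def pig_latin(word: str) -> str:
    """Convert one word using a common (assumed) Pig Latin convention."""
    word = word.lower()
    for i, ch in enumerate(word):
        if ch in VOWELS:
            break
    else:
        # No vowel found at all: just append the suffix.
        return word + "ay"
    if i == 0:
        # Vowel-initial word: nothing to move.
        return word + "way"
    # Move the leading consonant cluster to the end, then add "ay".
    return word[i:] + word[:i] + "ay"


for w in ["pig", "latin", "string", "apple"]:
    print(w, "->", pig_latin(w))
```

Where exactly the “leading consonant cluster” ends is the interesting part: a hard-coded rule like the one above makes a single choice, whereas real speakers may split words differently, and those differences are the kind of evidence about syllable structure that the paper is after.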

There are two case studies, with word games from two other languages: Awara (yes, that’s the name!!) from Papua New Guinea, and Komo from Congo. Both case studies helped the linguists get a better understanding of how the language worked and how people worked with the language.

It was all pretty interesting to read. Only, it left me wondering about such experiments with Indian languages. I don’t even know if there are games of this kind in our languages, which would help us understand the structure of the language. Surely, there are a lot of games to “learn” the language, but are there any to “understand” its working?

As I write this, I wonder how experiments of this kind would work for “invented languages” :) The working of these languages is theoretically laid down by their makers. How about verifying them experimentally? ;) “Experimental Linguistics” – literally!!

Here’s the link to this report.

Author: Michael Cahill

Published on October 8, 2009 at 12:04 pm

On “In the land of invented languages”

Finally, I wrote about the book – No more blog posts on the book again and again – hopefully :)

My introduction to the book at pustakam.net can be seen here.

Nice book!

Published on September 29, 2009 at 4:37 pm

O-lan of the Good Earth

It’s only a few days since I read “The Good Earth”. However, a lot has happened since then, and hence there is every chance of me losing myself in thinking about a myriad things. Despite a heap of things playing in my mind, someone kept coming into my thoughts again and again. Not that this “someone” was really a “someone”. To a large extent, this “someone” is so commonly seen that you can’t pick a face and say – this is it. Hence, our “someone” is faceless. Yes, our “someone” is the most ignored and hence, people will imagine that “someone” is soulless too. Obviously, since you can’t pick one person, “someone” is nameless too. However, for better understanding’s sake, I will cut short this blabber and tell you about someone who epitomizes my “someone”: it is the wife of Wang Lung, the lead character in “The Good Earth” by Pearl S. Buck. Her name is O-lan.

The more I think about her, the more I am left wondering about her. Well, she is not a “heroine”. She is not a “miracle maker”. Then why am I thinking about her so much? Despite the fact that the “hero” of the novel is Wang Lung, O-lan is the character that interested me the most.

Maybe these are my reasons:

1. All through, her role is so underplayed. She is portrayed almost like a non-entity, though no “significant development” in that story would have happened without her.

2. Her endurance. Her ability to fight most things alone. Had the story been told from O-lan’s perspective, it would have been a wonderful thing, although I doubt it would have sold this much.

3. Everyone who has read the novel can understand that Wang Lung would have remained a normal struggling peasant, if not for O-lan. Whether she did the right thing or not is a matter for a different discussion. The outcome of that discussion does not alter the fact that O-lan is responsible for Wang Lung’s prosperity. But not once.. not once did she try to take advantage of the fact. Not once did Wang Lung acknowledge her openly and wholeheartedly. Yet, she did not complain. Great patience!!

4. At no point was there an indication that Wang Lung cared for his wife’s feelings. Yet, there is not a single instance when O-lan expressed any sorrow about that. Of course, the story is told with Wang Lung as the hero, and O-lan’s feelings may not be of significance. But then, it is closer to real life, right? How many husbands, even today, give wives their due? How many really care for the wife’s internal strife? (Ok… poor attempt at rhyming..)

5. For that matter, O-lan never got any moral or mental support from the rest of the family either – at least so far as the narration goes. Again, I was thinking about the traditional housewife. How many children actually attempt to give that support?
If our morning coffee is delayed, how many of us will go ask our mom – “Shall I make it today?” or “Shall we go out for breakfast today?” or “Are you fine?” or “You make coffee, I make breakfast?” And how many of us shout at her for delaying our morning coffee?

- Despite the fact that the theme is about the rural China of the 1920s, in certain aspects nothing has changed. Even now, I see those under-recognised wives in a lot of families here, in India. (It’s a personal opinion.)

Nothing might change in the future either. Maybe change does not change things as much as I imagined.

Published on September 17, 2009 at 10:46 am

Purpose of building a language

The more I think about this book, the more I like it. I realized that my childhood fantasy about building a new language is no longer as fantastic as it was. But knowing more about all those passionate inventors is very interesting. Perhaps there is some psychological satisfaction in seeing others’ work when you know you can’t do that and don’t want to do that now.

I don’t remember what my purpose was in having dreams about developing a new language. Perhaps it was just a childish desire. Perhaps I was frustrated with the “nature” of a natural language at that age itself! Maybe it had something to do with the language troubles with those cousins who spoke a different tongue.

Whatever it might be, reading through this book, I realised how many different reasons can exist for people to decide on developing a new language.

Take Esperanto – it’s the language of peace. It wants to be the platform that enables people of different cultures and languages to communicate. However, there are languages like Klingon (Star Trek) or the Tolkienish stuff, which were primarily created for fiction. But then, Klingon and its cult following is a different story again – as the author mentioned, it’s like “art for art’s sake”. Can you imagine, there is even a “woman’s language” which enables a woman to express her feelings in a more versatile and verbose way compared to existing languages. There are these “logical” languages, which are spoken with “logic” ;) There is Ogden’s “Basic English”, which is like an entry point into English for non-English users. There are symbolic languages like “Blissymbolics”. There are “philosophical” languages like that of Wilkins.

It was very interesting to know about groups of “ConLang” (Constructed Languages) developers and enthusiasts, their passion for what they are doing etc. I did not imagine that there is so much of analysis, study, online help etc for the developer enthusiasts. I am still in the process of exploring these things. So, perhaps, might write more on ConLangs soon.

The most interesting part is the story of the revival of the modern Hebrew language. In a way, perhaps, this is a misfit in this group since it’s not an “invented” language like the others. But then, how it was reborn as a “spoken” language is a very inspiring story – if languages had lives and had some societies among themselves, modern Hebrew might have been idol-worshipped among those societies, and it would have walked with a halo behind its head.

Will talk more about modern Hebrew later … but it was amazing to see so many perspectives on the development of a language. After every night that I slept reading this book, I woke up muttering to myself – words, language, relations, power, expression, symbols – randomly. The day began with getting confused about the purpose of language and the usefulness/uselessness of it. It began with questions left unanswered. Was I dreaming about things? I dunno.

I still am clueless. :( Is it the inability of my brain or the inability of my ability to express? Why am I not able to answer these questions that my day gives me, after the night reading? Should I keep experiencing this hangover time and again? :(

Published on September 14, 2009 at 9:31 am

On Blissymbolics

“Oh, but Stephen Hawking was an adult when he lost the ability to speak”
- It was at this sentence that I got an idea of the significance of “Blissymbolics”, a pictorial language invented in the ’40s by Charles Bliss. I was reading a piece on this language from “In the Land of Invented Languages” (enough has been said about this book in this blog by now).

Coming to the point, “Blissymbolics” is a language of symbols. According to Bliss, lots and lots of ideas and feelings can be expressed through a small set of basic symbols and their combinations. The usage of this language at the Ontario Crippled Children Center (OCCC) and the effect it had in the lives of those kids was an interesting and touching story to read.

The arrival of Bliss himself at OCCC and the events that followed were both amusing and astonishing at the same time. In the beginning of this book, when Arika Okrent was talking about the eccentricities of language inventors, I thought – maybe that’s an exaggeration. But as I read further, especially as I read the story of Bliss and his language, I understood Arika’s words on eccentricities better. In a way, it’s amusing to see people getting possessive about a language. I mean, they might have invented it – but if it is to have universal acceptance, how can they be the sole owner?

Anyway, coming back to the story of Blissymbolics, it was to some extent interesting to see how Bliss did not understand the difference between the intentions of the OCCC and his own regarding the development of a language of symbols. The idea of the “tyranny of words”, and the complaints about “nouns, verbs”, were really very interesting; I never thought about that aspect before.

Of course, I do sympathise with the lead lady of this story – Shirley. It might have been a real pain to bear Bliss and persist with the association, just for the benefit of the OCCC children. But at any rate, all the drama that happened once Bliss got involved with OCCC could easily be made into a hilarious movie.

Reading about Bliss’ past – his life, moving into a German concentration camp and then to China, where he first got insights about the universal appeal of a language of symbols – leaves me with some doubts about the roots of his eccentricity.

What purpose does the above thought serve now, anyways? !!

This book is a must-read, I say!

Published on September 12, 2009 at 9:51 am

Language Evolution: Wilkins’ language

It’s quite something to understand the evolution of a language. I always wondered about the origins of words and how characters actually combine to form a meaningful word block. Perhaps these doubts had some impact on the choice that I made about what I am doing now. (Wonder how Natural Language Understanding handles convoluted sentences like the one above! :P ) Coming to the point, I never got a chance to actually understand this “language evolution” part.

I was reading “In the land of invented languages” a few weeks back (about which I mentioned in my blog in July) and got an idea of word formations in the “Philosophical Language” of Wilkins. Yeah, it is an artificial language. But, it was interesting nevertheless, to see how a word can be formed, albeit in Wilkins’ way. In his language, words are formed by their meaning. For example: (quoting from the book),

You must simply learn that a dog is “dog” in English, “chien” in French or a “perro” in Spanish or a “hund” in German. The sounds in these words are just sounds to be arbitrarily memorized. They tell you what to call a dog, but they do not tell you what a dog is. In Wilkins’ system, the word for “Dog” does tell you what a dog is.

It’s like this – the spelling for dog is formed by using the “Wilkins tree of the universe” charts as “beast / oblong-headed / bigger kind”, and is spelled “zitα” accordingly (zi: category id for beasts; t: the sub-category “oblong-headed”; α: the sub-sub-category “bigger kind”).
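To make the idea concrete for myself, here is a tiny Python sketch of how a word could be assembled from its place in such a category tree. Only the three pieces quoted above (“zi”, “t”, “α”) come from the book; the table layout, the lookup keys and the function name `wilkins_word` are my own hypothetical illustration, not Wilkins’ actual charts.

```python
# Hypothetical lookup tables holding only the fragments mentioned above.
CATEGORY = {"beast": "zi"}
SUBCATEGORY = {("beast", "oblong-headed"): "t"}
SUB_SUBCATEGORY = {("beast", "oblong-headed", "bigger kind"): "α"}


def wilkins_word(category: str, subcategory: str, sub_subcategory: str) -> str:
    """Spell a word by concatenating the code for each level of the tree."""
    return (
        CATEGORY[category]
        + SUBCATEGORY[(category, subcategory)]
        + SUB_SUBCATEGORY[(category, subcategory, sub_subcategory)]
    )


# The path "beast / oblong-headed / bigger kind" spells the word for dog.
print(wilkins_word("beast", "oblong-headed", "bigger kind"))
```

The point of the sketch is only that the spelling is a path through the tree: read the word left to right and you walk from the broad category down to the specific kind.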

However, despite the fact that Wilkins Language is extremely interesting and very painstakingly developed, my primary doubts about how our natural languages evolved are still the same. Is it really that a “dog” should be a “dog” and nothing else, or is there something more to that?

Again, the less maverick part of my brain keeps asking : Well, how does it matter now?

Confusion about choices or choice about confusions?

There are a couple of very interesting stories about different invented languages in this book and may be I’ll write more about the book soon – here or at pustakam.net. Let us see :)

Published on September 10, 2009 at 11:01 am

Susan Dumais’ Salton Award lecture

I was just browsing through a page on Susan Dumais’ Salton Award lecture (Gerard Salton Award).

It was titled “An Interdisciplinary Perspective on Information Retrieval”.
It began with Susan Dumais talking about how she got interested in IR and eventually began working in related areas. Next, she went on to describe the kind of work she got involved in, and how things have changed over the past two decades in the field of Human Computer Interaction and in the context of Information Retrieval. After that, she moved on to future directions, which I found interesting.

1. “The first area has to do with the dynamics of information and users’ interactions with it. …. How can we extend retrieval models and systems to go beyond a single, static snapshot of information? How can we model searchers in a way that captures the evolution of their information needs within a single session, and across many search episodes?”

2. “The second area has to do with evaluation…… Evaluation methodologies need to be extended to handle the scale, diversity, and user interaction that characterize information systems today. …..Can we go even further and develop a kind of “living laboratory” in which research groups can try new ideas with searchers in situ, thus enabling controlled experiments in the wild?”

It left me with some kinda vague thoughts… too vague to even put on the blog…
The article was too short, but very interesting. If there was a “talk” and there is a video uploaded somewhere, I would love to see that..

I am not sure about access to this pdf… but here’s the link.

Update: Here it is – Susan’s presentation.
-Thanks for sharing, Praneeth!

Published on August 28, 2009 at 10:51 am

Web searching for Daily living – and lolz

Query-free search – I’ve been ROFL-ing ever since I read about this in the first two pages of a research paper. I did not go beyond that. My imagination ran so wild that I got too lost in that wilderness to proceed further. Let me tell you a few examples mentioned there, and you will know why I am saying that.

“When we are washing a coffee maker, for example, a web page is retrieved that includes tips such as ‘cleaning a coffee maker with vinegar removes stains well.’ A method designed on the basis of this concept automatically searches for a web page by using a query constructed from the use of ordinary household objects that is detected by sensors attached to the objects.”

- These lines from the abstract say enough about what this paper is about. Now that it’s possible to turn anything from a domestic appliance or a piece of furniture to a television into our internet-enabled browser, we can get search results despite the absence of any “real” querying, according to the paper.

“To search the web by using objects, we assume that a set of names of objects that are used (moved) by a user in a given period of time corresponds to user’s context. For example, when a cup, milk, and cocoa are moved in a given period of time, “cup milk cocoa” becomes the user’s context. Then, we search for a web page that matches the context.”
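Just to picture the mechanism described in that passage, here is a rough Python sketch of turning “objects moved in a given period of time” into a query string. It only illustrates the idea in the quote; the `MoveEvent` type, the `context_query` function and the time-window parameter are my own assumptions and not the paper’s implementation, and the sensor readings are made up.

```python
from dataclasses import dataclass


@dataclass
class MoveEvent:
    """A hypothetical sensor reading: which object moved, and when (seconds)."""
    obj: str
    time: float


def context_query(events: list[MoveEvent], now: float, window: float) -> str:
    """Join the names of objects moved within the last `window` seconds."""
    seen: list[str] = []
    for event in sorted(events, key=lambda e: e.time):
        in_window = now - window <= event.time <= now
        if in_window and event.obj not in seen:
            seen.append(event.obj)  # keep first-use order, drop repeats
    return " ".join(seen)


events = [
    MoveEvent("kettle", -100.0),  # moved long ago, outside the window
    MoveEvent("cup", 1.0),
    MoveEvent("milk", 2.0),
    MoveEvent("cocoa", 3.0),
    MoveEvent("cup", 4.0),        # moved again, should not repeat
]
print(context_query(events, now=5.0, window=10.0))
```

The resulting string would then be handed to an ordinary web search, which is the step where no “real” query ever gets typed.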

Reading these, I began getting doubts, as usual.
Firstly, it appears to be a dangerous intrusion into my privacy. What if someone comes and attaches a sensor to some random object of mine? How embarrassing would it be to suddenly see search results on the nearby sofa, on something that you can’t imagine seeing at that moment?
Secondly, who asked for it? I don’t want to search for milk and cocoa when I drink milk with cocoa. I may not want to think about more details on the topic at all.

My doubt is: instead of going to this extent, why can’t “voice based search” be improved upon? Instead of attaching sensors to each and every article, why can’t a sensor be attached to our own self, to perform a voice based search and then display the results?

Finally – it’s highly comical to imagine all this happening in “daily living”. It’ll be good to see a sci-fi comedy thriller with all these ingredients :) Of course, with some creativity and some sense of history, I can see a historical sci-fi comedy thriller in the making already. A “psycho sci-fi comedy historical thriller” (or with the order changed) or an “all-this-blah” + political thriller may be on the cards in future.. who knows?

The title of this paper – “Web searching for daily living” – is a misnomer. I thought it talked about making money through web search. Ideally, like a friend suggested, it should be “Web searching intruding into daily living”.

Details of the paper: “Web Searching for Daily Living” by Takuya Maekawa (NTT), Yutaka Yanagisawa, Yasushi Sakurai, Yasue Kishino, Koji Kamei and Takeshi Okadome (NTT). In Proceedings of SIGIR 2009.

Published on August 21, 2009 at 9:53 am