Text Adaptor

I was reading about the “Text Adaptor” tool from Educational Testing Service (ETS). As the name indicates, it’s an automatic tool for text adaptation – which in general means a teacher’s modification of texts to make them more readable and understandable, given the student’s learning level. It was interesting to read that the tool actually does things like automated synonym detection, antonym detection, automated text summarization, some translation and a bit of shallow parsing to identify complex sentence structures.

I was searching for educational applications of natural language processing when I stumbled across this tool. It was interesting to see these topics being discussed in a totally different, real-life context. I wondered if people do this kind of thing for Indian languages. In the present-day scenario, when our own people don’t know their own language properly, I think such tools are necessary. How cool would it be if such a tool simplified something like the text of Viswanatha Satyanarayana into “manava bhasha” (human language) and let me read it :p Ah! Imagination is such a wonderful thing…. Alas, reality isn’t :(

“Text Adaptor” is currently being used in two online teacher development programs for ELL (English Language Learner) teachers in the USA. Its goal is to help teachers identify the linguistic complexity of a text and make it more accessible to an ELL. Further, the report that I am reading also says that “Text Adaptor” data could be a good resource for NLP research in text quality and related areas. As I proceeded further with the report, I began wondering about measuring the relatedness between two sentences in a passage – taking a cue from the work. Since it made me think (after a long time, something made me think!), I think my reading time was well spent ;)
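To make the “relatedness between two sentences” thought a bit more concrete, here is a minimal sketch using plain word overlap (Jaccard similarity). This is only my own illustration, not anything the Text Adaptor report describes; the function names `tokens` and `jaccard` are made up for this example, and a real system would probably use synonyms or something richer than raw word overlap.

```python
import re


def tokens(sentence: str) -> set:
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap relatedness: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        # Two empty sentences: define relatedness as 0 to avoid dividing by zero.
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical example sentences, just for illustration.
print(jaccard("The cat sat", "The cat ran"))  # 2 shared words out of 4 distinct
```

A score near 1 means the two sentences share most of their words; a score of 0 means they share none.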

Hmm, I want to see how the tool works… to talk any further :)

Yeah, appears to be a very interesting and practical application of NLP – the very place where my search began. So, my search took me down the right path – at least for once ;)

This is where there is a brief intro: “The Automated Text Adaptation Tool” by Jill Burstein, Jane Shore, John Sabatini, Yong-Won Lee & Matthew Ventura, Educational Testing Service. Demo at NAACL-HLT 2007. I don’t actually know if it’s available to view for free…

Published on November 10, 2009 at 10:14 am

Computing for global development…..

I was reading this report: “Computing for Global Development -Is it Computer Science Research?” a few minutes back. Thought it was an interesting article – with some questions to ponder upon.

There are a few things that captured my attention:

1. “ICTD also poses new research challenges for systems and networking researchers, given the unique constraints of applications e.g., there have been several streams of work in long-distance WiFi to connect remote rural villages to urban centers or the Internet..”
- Hmm, that left me wondering for a while – in what way does ICTD pose new challenges to areas like computer vision, robotics or language engineering? I mean, as discussed further down in the paper, as well as in my blog: what research challenges are specific to these areas and ICTD put together? Do those challenges remain challenges even outside ICTD – i.e., does solving them in ICTD imply solving a hitherto unsolved problem in those areas in general?

2. “ICTD research not only impacts global development, but can also advance “traditional” computer science. Two examples that come to mind are WildNet and HashCache. Although motivated by applications in developing regions,these works fundamentally changed our view ….”
-Hmm..pretty interesting to know about HashCache and WildNet…. Well, I did not know of them before, so, can’t comment more until I read about them :)

3. “ICTD is thoroughly interdisciplinary…. One issue with interdisciplinary work is that problems seen to be legitimate, even crucial, to the area often don’t contain enough research content in any one area to satisfy the contributing disciplines. Also, even if there is enough research content in one area, solving that part of the problem may be only a small portion of the larger problem being addressed. As is the case with some work in systems research, its not clear that a core technical contribution is really valuable without building the whole system. Unlike systems research, however,the whole system might include non-technical components requiring social, cultural, economic, and political efforts, as well.”
-I felt this is a very important observation. Well presented too.

4. “Third, ICTD like some other application sub fields lacks a clear definition of generic technical issues within a well circumscribed context. Just in agriculture applications, there are networking problems (e.g., connecting remote villages to urban experts), speech problems (e.g., for building a Q&A system in multiple dialects), information-retrieval problems (e.g., permitting cross-lingual, geography-relevant database queries), computer-vision problems (e.g., diagnosing diseased crops via photographs), and so on.”
-Which is where I get back to the doubts I raised in point 1. From the examples given here, the problems are general problems in those areas. However, there’s another possibility – these are not unsolved problems in the respective areas, but there are more engineering issues involved. Hmm, this particular part of the paper reminds me of sitting through the “e-sagu” talks and wondering – what’s so much research in this project? (That was long back, and I understood it better later.)

5. “Ironically, ICTD is struggling to establish itself within a field that has itself had a history of struggling to establish itself, namely computer science.”
- Irony! :) It was a very interesting section – this section 3.1 – “Acceptance of applied science”. However, my doubt is this: why should it be accepted as a sub-area of computer science? Why not science in general? :) I mean – instead of applied computer science, why not applied science, as mentioned in the section title?

6. “It’s surprisingly difficult to find hard, technical problems that are unique to ICTD – often, the technical challenges are generic computer science research problems (e.g., better speech recognition). The portion that is relevant for ICTD is often limited to adaptation (e.g., what’s the best way to train speech recognition engines quickly in local dialects?). It’s not that challenging technical problems don’t exist in ICTD, it’s that they’re often not obvious.”

-It goes back to 1 again :) Hmm, so, perhaps, ICTD researchers always keep thinking about this issue, then :)

- I found the first part of this paper (until Section 3.1) to be very readable, with interesting points raised. The rest of it is not really my kind of stuff – it’s for an older audience, I guess ;)

It helped me get a better overview of the “ICTD as Computer Science Research” question, though.

The report can be read here.

Published on October 14, 2009 at 1:56 pm

Word games as experimental linguistics

Frankly speaking, I have no clue what “experimental linguistics” is supposed to mean. I have no idea even after reading through this document. But, despite that, I found this paper to be a very interesting read.

Well, when I began this paper, I imagined “word games” to mean something broader than what is meant here. Let me prepare you all – this paper basically talks about “Pig Latin”-style word games and not “Scrabble”-style games or other such things.

So, read a bit on Pig Latin? Proceed further now. :)

Actually, I am new to Pig Latin, and this paper has some nice examples to explain what it is, so I proceeded ahead. There are a few surprises that Pig Latin can throw at you – “expect the unexpected” sort of confrontations. Using such cases, the author tries to explain how word games can be used to get insights into the syllable structure and working of a language.
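For those who are as new to Pig Latin as I am, here is a minimal sketch of the commonly described English version of the game: move the consonants before the first vowel to the end of the word and add “ay”. This is my own toy illustration and not code from the paper. The rule for vowel-initial words (adding “way”) is just one of several conventions people use, and treating only a, e, i, o, u as vowels is a simplifying assumption.

```python
VOWELS = "aeiou"  # assumption: "y" is always treated as a consonant here


def pig_latin(word: str) -> str:
    """Move the leading consonant cluster to the end and add 'ay'."""
    word = word.lower()
    # Find the first vowel; everything before it is the consonant cluster.
    for i, ch in enumerate(word):
        if ch in VOWELS:
            break
    else:
        # No vowel found: the whole word counts as the consonant cluster.
        i = len(word)
    onset, rest = word[:i], word[i:]
    if not onset:
        # Vowel-initial word: one common convention simply appends "way".
        return word + "way"
    return rest + onset + "ay"


for w in ["pig", "latin", "string", "apple"]:
    print(w, "->", pig_latin(w))
```

The interesting bit, linguistically, is which chunk speakers decide to move. In “string”, the whole cluster “str” travels together in this version, and cases like that are the sort of thing that can hint at how speakers carve up a syllable.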

There are two case studies, with word games from two other languages: Awara (yes, that’s the name!!) from Papua New Guinea and Komo from Congo. Both case studies helped linguists get a better understanding of how the language worked and how people worked with the language.

It was all pretty interesting to read. Only, it left me wondering about such experiments with Indian languages. I don’t even know if there are games of this kind in our languages that would help us understand the structure of the language. Surely, there are a lot of games to “learn” the language, but are there any to “understand” its working?

As I write this, I wonder how this kind of experiment would work for “invented languages” :) The working of these languages is theoretically laid down by their makers. How about verifying it experimentally? ;) “Experimental Linguistics” – literally!!

Here’s the link to this report.

Author: Michael Cahill

Published on October 8, 2009 at 12:04 pm

Susan Dumais’ Salton Award lecture

I was just browsing through a page on Susan Dumais’ Salton Award lecture (Gerard Salton Award).

It was titled “An Interdisciplinary Perspective on Information Retrieval”.
It began with Susan Dumais talking about how she got interested in IR and eventually began working in related areas. Next, she went on to describe the kind of work she got involved in, and how things have changed over the past two decades in Human Computer Interaction and in the context of Information Retrieval. After that, she moved on to future directions, which I found interesting.

1. “The first area has to do with the dynamics of information and users’ interactions with it. …. How can we extend retrieval models and systems to go beyond a single, static snapshot of information? How can we model searchers in a way that captures the evolution of their information needs within a single session, and across many search episodes?”

2. “The second area has to do with evaluation…… Evaluations methodologies need to be extended to handle the scale, diversity, and user interaction that characterize information systems today. …..Can we go even further and develop a kind of “living laboratory” in which research groups can try new ideas with searchers in situ, thus enabling controlled experiments in the wild?”

Left me with some kinda vague thoughts… too vague to even put on the blog…
The article was very short, but very interesting. If there was a “talk” and there is a video uploaded somewhere, I would love to see it.

I am not sure about access to this pdf… but here’s the link.

Update: Here it is – Susan’s presentation.
-Thanks for sharing, Praneeth!

Published on August 28, 2009 at 10:51 am

Web searching for Daily living – and lolz

Query-free search – I’ve been ROFL-ing ever since I read about this in the first two pages of a research paper. I did not go beyond that. My imagination ran so wild that I got too lost in that wilderness to proceed further. Let me tell you a few examples mentioned there, and you will know why I am saying that.

“When we are washing a coffee maker, for example, a web page is retrieved that includes tips such as ‘cleaning a coffee maker with vinegar removes stains well.’ A method designed on the basis of this concept automatically searches for a web page by using a query constructed from the use of ordinary household objects that is detected by sensors attached to the objects.”

-These lines from the abstract say enough about what this paper is about. Now that it’s possible to turn anything from a domestic appliance or a piece of furniture to a television into an internet-enabled browser, we can get search results without any “real” querying, according to the paper.

“To search the web by using objects, we assume that a set of names of objects that are used (moved) by a user in a given period of time corresponds to user’s context. For example, when a cup, milk, and cocoa are moved in a given period of time, “cup milk cocoa” becomes the user’s context. Then, we search for a web page that matches the context.”
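Just to picture what the quote above describes, here is a toy sketch of turning “objects moved in a given period of time” into a query string. I did not read far enough to know how the paper actually does it, so everything here is assumed for illustration: the `(timestamp, object_name)` event format, the function name `build_query`, and the 60-second window.

```python
from typing import List, Tuple


def build_query(events: List[Tuple[float, str]], start: float, window: float) -> str:
    """Join the names of objects moved in [start, start + window) into a query."""
    seen = []
    for timestamp, name in sorted(events):
        # Keep each object once, in the order it was first moved.
        if start <= timestamp < start + window and name not in seen:
            seen.append(name)
    return " ".join(seen)


# Hypothetical sensor readings: (seconds since start, object that moved).
events = [(1.0, "cup"), (3.5, "milk"), (4.0, "cup"), (7.2, "cocoa"), (95.0, "vinegar")]
print(build_query(events, start=0.0, window=60.0))  # cup milk cocoa
```

The vinegar moved at 95 seconds falls outside the window, so it stays out of the “context”. The resulting string would then be handed to an ordinary search engine.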

Reading these, I began getting doubts, as usual.
Firstly, it appears to be a dangerous intrusion into my privacy. What if someone comes and attaches a sensor to some random object of mine? How embarrassing would it be to suddenly see search results on the nearby sofa, about something you would never expect to see at that moment?
Secondly, who asked for it? I don’t want to search for milk and cocoa when I drink milk with cocoa. I may not want to think about more details on the topic at all.

My doubt is: instead of going to this extent, why can’t “voice based search” be improved upon? Instead of attaching sensors to each and every article, why can’t a sensor be attached to our own selves, to perform a voice based search and then display the results?

Finally – it’s highly comical to imagine all this happening in “daily living”. It’ll be good to see a sci-fi comedy thriller with all these ingredients :) Of course, with some creativity and some sense of history, I can see a historical sci-fi comedy thriller in the making already. A “psycho sci-fi comedy historical thriller” (or with the order changed) or an “all-this-blah” + political thriller may be on the cards in future… who knows?

The title of this paper – “Web searching for daily living” – is a misnomer. I thought it would talk about making money through web search. Ideally, like a friend suggested, it should be “Web searching intruding into daily living”.

Details of the paper: “Web Searching for Daily Living” by Takuya Maekawa (NTT), Yutaka Yanagisawa, Yasushi Sakurai, Yasue Kishino, Koji Kamei and Takeshi Okadome (NTT). In Proceedings of SIGIR 2009.

Published on August 21, 2009 at 9:53 am

Flirtation Detection and my blah

Hardly a day after I wrote on Satire Detection, I happened to stumble upon Flirtation Detection!

Come to think of it, in a way, maybe this is good. It will help people who are vulnerable to deception. And as for the deceiving folks, it will help them escape the guilt etc., since the machine is now there to expose them :P It’s always the machine’s fault now if the deceived can’t realize the real intentions of the deceiver. A win-win situation for both parties.

A friend was commenting on developing mobile phone software with flirt detection enabled, which would save a lot of time for both parties. But I am not revealing his name here, because all the mobile companies, which literally survive on these long phone calls and long-distance relationships, will attack him for such a blasphemous idea ;)

And the perennial romantic’s view: What the hell is all this? There will be no value for emotional intelligence once such ideas become implementable. How nice is that feeling of “getting blindfolded by love”? How sweet and sour can that suffering be? How cold can that indifference be… how cool can those sweet nothings be… aha! Ignorance is bliss!!! I don’t want these tools.

A pessimist’s view : No no no no no. AI has no future. AI can’t do anything. AI just shows you multi-colour dreams. The typical Prof. Y way.

The Optimist says: Who knows? This might in fact benefit us. Why should we think only of the negative side? Lots of innocent people can be saved from the clutches of deceitful people!

Me says: Lite le lo yaar (take it easy, buddy)! Long way to go! Perhaps they will never go to that extent. Think like me, not like a scientist in a sci-fi movie.

Frankly speaking, I am a split personality. So, all the above said views are mine. But, I don’t know the names of those splits. When I am me, I am me :P

Here it is, the actual paper: It’s Not You, It’s Me: Detecting Flirting and Its Misperception in Speed-Dates

At any rate, beware, all you deceptive lovers. You might have a tough time in the future ;)

Published on August 14, 2009 at 3:31 pm

Satire Detection – Hopes,Fears and me ;)

Automatic Satire Detection: Are You Having a Laugh?

- I came across this paper today and the introduction section itself left me thinking.

Firstly, I don’t understand why satire should be detected by a machine. Or… are people hoping for this to lead to something like a machine making satirical conversation? :P Oh my God… satire detection sounds scary to me.

At this rate, I need to be extra cautious in my conversations. I shamelessly accept that I am very satirical, and many times I seek revenge on people who torment me by replying satirically to them. I enjoy the sadistic pleasure of seeing those tormentors getting tormented and left clueless, ultimately failing to reply back. Sometimes, I do this with people who can’t understand the intended satire… and have masochistic pleasure.

Two years back, I was reading an advertisement for some workshop which had “Computational Humor” as one of its topics. Back then, I felt that the thought was interesting. Then, it dawned upon me that it’s actually amusing to imagine “computing” humor. Later, I wondered – “Will computers crack jokes with us now?”

Humor, Satire… what next?

If this can go on, I want something that can convey your feelings to a person who is not able to understand them when you yourself say them, directly or indirectly. Let our feelings be converted into mathematical equations and be shown to the intended person graphically, like those colourful matlab graphs :)

There’s an advantage to the whole affair. If you feel the other person is understanding too much, or reading too much between the lines, you can always say – “I did not mean that. It was the machine’s fault”. Oh, I am already thinking futuristically about over-learning of the human heart!!! ;)

Published on August 13, 2009 at 12:40 pm

“Free” software

I was reading this : “Why “Open Source” misses the point of Free Software” by Richard Stallman (Link here).

I never gave serious thought to the difference between “open source” and “free software”. For that matter, I never realized the real meaning the GNU people intended for the term “free software”. In that sense, this article made for an interesting read, though I have my reservations on a few points.

Firstly, “This is a matter of freedom, not price, so think of “free speech,” not “free beer.”” – This in essence tells you the “ethics” of free software.

Somewhere towards the end of the article, there are some comments on “DRM” software.

“This malicious feature is known as DRM, or Digital Restrictions Management (see DefectiveByDesign.org), and it is the antithesis in spirit of the freedom that free software aims to provide. And not just in spirit: since the goal of DRM is to trample your freedom, DRM developers try to make it hard, impossible, or even illegal for you to change the software that implements the DRM.

Yet some open source supporters have proposed “open source DRM” software. Their idea is that by publishing the source code of programs designed to restrict your access to encrypted media, and allowing others to change it, they will produce more powerful and reliable software for restricting users like you. Then it will be delivered to you in devices that do not allow you to change it.

This software might be “open source,” and use the open source development model; but it won’t be free software, since it won’t respect the freedom of the users that actually run it. If the open source development model succeeds in making this software more powerful and reliable for restricting you, that will make it even worse.”
-This pretty much conveyed to me the actual meaning of “free” in “free software”.

Finally, I kept wondering all along – what’s so wrong with earning money through software development? It’s also a profession like any other. I keep telling myself that software can be open source and still not be “free” as in “free beer”. But one doubt – if software is “free” as in “free speech”, what will hackers have fun with? ;)

Published on August 11, 2009 at 11:09 am

The end of science-revisited

John Horgan is a famous American journalist who wrote a book called “The End of Science” in the late 90s. Well, I have not read the book, so I can’t comment further on it. But I read his 2004 article titled “The End of Science Revisited” just now and thought I should share my thoughts about it online.

Coming to the book, the title says it all. In the book, he argued that science may be entering an era of “diminishing returns”. The article in question is the author’s reflection on the same topic. While Horgan does not at any point deny the advances in science and technology, which are happening at a rapid pace, he claims that the days of “ground breaking” discoveries might soon be over. There’s an example in the article which shows this “diminishing returns” trend in Nobel Prizes: the Russian physicist Pyotr Kapitsa discovered superfluidity in liquid helium in 1938 and won a Nobel Prize for it 40 years later. David Lee and his colleagues then won a Nobel Prize in 1996 for showing that superfluidity also occurs in the helium isotope He-3.

Horgan systematically discusses different scientific fields and their progress, the responses of scientists in those fields to his “End of Science” statements, and his own comments. The article also talks about how fields like nanotechnology and artificial intelligence did not live up to the promises they made. Here and there, he chips in with the opinions of scientists from the relevant field – both positive and negative. He talks about the promises of neuroscience and of understanding how the brain functions, and concludes on a note of “hopeful skepticism”, saying – “I would like to see a greater recognition of science’s limitations – particularly in mind related fields, where our desire for self-knowledge can make us susceptible to pseudo scientific cults such as Marxism, social Darwinism, eugenics and psycho analysis”

It was a very interesting article. Personally, I got some gyaan on several ideas from several people in several scientific disciplines. Reading such articles always leaves me wondering about basic questions like – what is life? why research? etc :P

The article is not available for public view, I suppose… :) Can’t help it.

Details of the article:
“The End of Science Revisited” by John Horgan, IEEE Computer, January 2004 issue.

Published on August 8, 2009 at 12:50 pm

ChaCha voice search

I have always wondered about the quality of ChaCha, the search engine. They use human query interpreters, and hence I wondered whether it’s really scalable to use humans if it is used to the extent that, say, Google is. I forgot about ChaCha as time passed, and it’s been more than a year since I last saw the ChaCha homepage.

Now, I read this news: “ChaCha Voice Search Beats Google, Yahoo/Vlingo In Accuracy, Reliability” here. This study was sponsored by ChaCha and the evaluation searches were performed by a single individual from his iPhone.

The others use automated speech recognition, while ChaCha uses humans to understand voice search queries. So, it’s bound to perform better. However, I just wonder how many human guides one would need in the background if ChaCha voice search were used as frequently as a general Google search is used now.

Published on July 26, 2009 at 1:30 pm