I am attending this year’s Machine Learning Summer School, and we have just finished one week of lectures. I thought now was the moment to look back and note down my thoughts (mainly because we thankfully don’t have lectures on Sundays!). There is one more week to go, and I am already very glad to be here listening to all these amazing people, who are undoubtedly some of the best researchers in this area. There is also a very vibrant and smart student community.
Until Saturday evening, my thoughts on the summer school focused more on the content of the sessions. They were mostly about the mathematics in the sessions, my comfort and discomfort with it, its relevance, understanding its conceptual basis, and so on. I won’t claim that I understood everything. I understood some talks better than others, and some not at all. I also understood that things could have been much better for me if we had been told why we actually needed to follow all the Engineering Mathematics courses seriously during my bachelor’s ;).
However, coming to the point: as I listened to the Multilayer Nets lecture by Leon Bottou on Saturday afternoon, I found something particularly striking. It looks like two things that I always thought of as possibly interesting aspects of Machine Learning are not really of concern to the real machine learning community. (Okay, one summer school is not a whole community, but I did meet some people who have been in that field of research for years now.)
1) What exactly are you giving as input for the machine to learn? Shouldn’t we give the machine proper input for it to learn what we expect it to learn?
2) Why isn’t the interpretability of a model an issue worth researching?
Let me elaborate on these.
Coming to the first one, this is called “Feature Engineering”. The answer that I heard from one senior researcher to this question was: “We build algorithms that will enable the machine to learn from anything. Features are not our problem. The machine will figure that out.” But won’t the machine need the right ecosystem for that? If I grow up in a Telugu-speaking household and am exposed to Telugu input all the time, will I be expected to learn Telugu or Chinese? Likewise, if we want to construct a model that does a specific task, is it not our responsibility to prepare the input for that? Okay, we can build systems that figure out by themselves which features work. But won’t that let the machine learn anything from the several possible problem subspaces, instead of the specific thing we want it to learn? Yes, there are always ways to assess whether it’s learning the right thing. But that’s not the point. In a way, this connects again to the second question.
I am not knowledgeable enough in this field to come up with a well-argued response to the senior researcher’s comment above. It is also a fact that there is enough evidence that this approach does work in some scenarios. But this is a general question about the applicability of the models, issues regarding domain adaptation (if any), and so on. I found very little literature on the theoretical aspects connecting feature engineering to algorithm design, hence these basic doubts.
The second question is also something that I have been thinking about for a long time now. Are people really not bothered about how those who apply Machine Learning in their fields interpret their models, or am I bad at searching for the right things? Why is there no talk about the interpretability of models? I did find a small amount of literature on “Human comprehensible machine learning” and related research, but not much.
I am still in the process of thinking, reading and understanding more on this topic. I will perhaps write another detailed post soon (with whatever limited awareness I have of this topic). But, in the meanwhile:
* Here is a blog post by a grad student that has some valid points on the interpretability of models.
* “Machine Learning that Matters”, an ICML 2012 position paper by Kiri Wagstaff. This is something that I keep coming back to, time and again, whenever I get to thinking about these topics. Not that the paper answers my questions, but it keeps me motivated to think about them.
* An older blog post on the above paper, which had some good discussion in the comments section.
With these thoughts, we march towards the second week of awesomeness at MLSS 2013 :-).