Machine Learning at CSE, IITK — The Culture, Research and Prospects
(originally at asnaninishit.wordpress.com)
You could be a curious soul wanting to know about machine learning (ML) at IIT Kanpur. Or a current IITK student wanting to get into ML. Or a prospective undergraduate / graduate student. Or someone else with a keen interest in this subject. If you want to roam around beings that pride themselves on designing advanced AI that can see through walls and beat you in your favorite board game and sing your favorite songs to you, look no further — IITK is here. Um.. just kidding. No seeing through walls here. But anyway, read on to find out who we are and what we do and what we hope to achieve.
Who am I, and Why I Am Probably Not Lying
I am a recent graduate (graduated in June 2018) from Computer Science and Engineering (CSE), IIT Kanpur, and spent the better part of those four years working on problems in machine learning through courses, projects or internships. I have had the good fortune of taking courses with numerous faculty members in ML at IITK and working on research projects with a couple of them as well. So while my view of the ML culture at IITK may not be truly comprehensive, it is not entirely ill-informed either.
I have been on the other side of the spectrum as well, albeit occasionally. I have mentored student projects in machine learning as part of courses as well as through the freshmen semester project initiative by the Association of Computing Activities (ACA), CSE IITK. I was also an instructor for a two-week intensive summer school course in machine learning organised by ACA in May-June 2018, through which I had a first-hand glimpse of the machine learning landscape among (mostly) undergraduates across engineering institutes in north India. Again, this is not enough experience to build a truly comprehensive picture, but it is definitely enough for an indicative one.
A Bit of a Primer on IITK and Machine Learning
IIT Kanpur has a rich legacy as a pioneer of sorts in computer science, being the first institute in India to start an undergraduate program in the subject. In recent years, machine learning has emerged as one of the most promising areas in computer science thanks to the availability of huge amounts of data, as well as computing resources in the form of GPUs and powerful CPUs.
IITK has not been sitting back, and the machine learning culture in the institute has taken off as well. This piece aims to paint a picture as close to being accurate as possible of this culture and how people try to make the most of it.
Just a disclaimer at this point: ML is not confined to computer science. Its theoretical foundations are developed by statisticians and electrical engineers alike, and its applications are as widespread as one can dream of. Healthcare, law, education, security and finance are just some of the sectors that have seen quite a disruption due to the advancements in machine learning. Another disclaimer: the views in this post are purely my own and should not be construed as the views of the institute.
Now that we have the motivation of this post and a little history out of the way, let’s dive in and see what lies under the surface.
Faculty Members, Courses and Research Areas
Q. How many faculty members in CSE, IITK work on machine learning, artificial intelligence or allied fields, and who are they?
A: The CSE department at IITK boasts excellent faculty members approaching ML from both ends of the spectrum: Prof. Purushottam Kar comes from an optimization background, while Prof. Piyush Rai works in probabilistic modeling and inference. This is a unique blend, rarely achieved even by the world's best universities in computer science.
We also have Prof. Vinay Namboodiri, who works in computer vision and graphics; Prof. Nisheeth Srivastava, who works in computational cognitive science and computational social science; Prof. Sunil Simon, who works on game theory; Prof. Swaprava Nath, who is involved in mechanism design and multi-agent systems; and Prof. Harish Karnick, who has worked on diverse aspects of machine learning theory and applications over the years, and has most recently been involved in ongoing work in natural language processing.
Q. What are the venues where machine learning research from IITK is generally published, and how often?
A: Our research gets published in top-tier journals and conferences in machine learning, namely ICML (International Conference on Machine Learning), NIPS (Neural Information Processing Systems), KDD (Knowledge Discovery and Data Mining), JMLR (Journal of Machine Learning Research) and a few others. A considerable portion of the research is also presented at top-tier conferences in the applications of machine learning or allied fields, such as computer vision (CVPR, WACV), artificial intelligence (AAAI and IJCAI), statistics (AISTATS), data mining (ICDM), the web (WWW), software engineering (ICSE) etc.
Most of these venues have at least 2 to 3 publications from IITK every year, with the ones in core machine learning (ICML, NIPS etc.) usually having a few more. The volume of publications is growing with more student and industry collaborations and the increasing faculty strength.
Q. Is the research environment conducive to collaborations? How significant is the student involvement in research?
A: Our faculty members collaborate with researchers from some of the top universities of the world, including those in the USA, Canada, China, Spain, Italy and India. Industrial collaborations with research labs at IBM, Microsoft etc. have also been blossoming.
Over the last few years, student participation in machine learning research has grown manifold, especially among the undergraduates. For instance, requests for enrolling in CS771, the introductory machine learning course, have grown significantly over the last half decade, and now stand at over 300 for a single offering. Many of the students who complete the course end up conducting research in ML or using ML for a project at a research lab, at a company or on campus itself.
Q. What kind of projects do faculty and students work on?
A: Quite a few projects are currently going on in the broad domain of ML at the institute, some of which are interdisciplinary in nature as well. A recent example of one such project pertains to using machine learning to improve the C compiler that is used for the first year programming course (ESC101), which has had a real impact on the way the programming labs are conducted and the way students learn their first bits in writing code.
There are other projects which explore problems at the heart of machine learning, like learning under uncertainty, learning with millions of examples, robust learning and the like.
Most of the instructors are quite flexible and allow even undergraduate students to pick projects of their liking, if they have some alignment with the faculty member’s area of interest.
Q. What sort of work is carried out in deep learning at IITK?
A: Deep learning has seen enormous progress in its techniques and success stories over the last few years, and is certainly a key area of research at IITK. It creeps into many learning problems as a viable solution, and a lot of work is done at IITK in deep learning, deep probabilistic modeling, optimization techniques for deep learning, deep learning theory, etc. Moreover, its effects can be most strongly felt in its applications, and research at IITK is not behind either — many of our recent projects and publications employ deep learning models for getting state of the art results in daunting problems in computer vision, natural language processing, recommender systems, speech processing, robotics and finance.
Q. How many and what sort of courses are offered at IITK that are related to machine learning? How easy is it for a student to enroll in these courses?
A: The various course offerings in machine learning follow the research trajectories of the faculty members who teach them.
CS771 is the introductory machine learning course that covers a breadth of topics in supervised and unsupervised learning, and is generally offered once a year by one of the instructors listed above. It commands participation from a healthy mix of undergraduates and graduate students, and it is in this course that many intra institute work partnerships form, as students work in teams for their course projects. It aims to consolidate students’ understanding of the fundamental concepts in machine learning, and introduce them to traditional methods, new approaches, various learning paradigms, and related application areas. I have had the good fortune of taking the course once, and then being a project mentor for a few student groups the following year. I’m not sure how many institutes in India allow undergraduates to have that privilege.
The other courses commonly on offer are (based on the courses that have been offered in the last three years):
Probabilistic ML: Probabilistic machine learning, Bayesian ML, Topics in Markov Chains etc.
Optimization: Online Learning and Optimization, Optimization Techniques, Learning Theory
Applications: Visual Recognition, Topics in Computer Vision, Natural Language Processing, Multi-agent systems
HCI: Computational Cognitive Science, Human Centered Computing
A request to enroll in any of these courses is not a guarantee of being accepted into it. For the advanced machine learning courses, prior experience in terms of courses and / or projects helps in getting an enrollment request accepted. For CS771, the faculty members try to maintain a good proportion of non-CSE students in the course, but due to high demand, not everyone who requests the course gets accepted into it. It is a basket course (if you don’t know what that means, think of it as being semi-mandatory) for the CSE students, so it is generally a little easier for them to get into the course. But this should not discourage anyone with a genuine interest in ML from pursuing the field at IITK. If you have interest and something to show for it, most people at IITK are ready to extend a helping hand.
It should be noted here that various courses in robotics, neural networks, signal processing etc. are offered in the Electrical Engineering department, and might be of interest to the ML crowd as well. So would courses in regression, time series analysis, stochastic processes etc. for those who have a more statistical bent of mind (generally offered by the Mathematics / Industrial and Management Engineering departments).
Research Groups and Activities
Q. If I am a student at IITK interested in machine learning, how do I find other students like me, and who would be able to help me find my way around?
A: The Special Interest Group in Machine Learning, popularly known by its acronym SIGML, is a loose group of students and faculty members on campus who organize and take part in activities pertaining to machine learning. (https://www.cse.iitk.ac.in/users/sigml/)
This includes talks, seminars and meetups with eminent researchers from top universities and research institutes, founders, engineers and managers of companies playing big in ML, and others from academia and industry who are advancing the field with their innovations and implementations. Hackathons are held annually in collaboration with other interested parties in industry and on campus to promote a culture of using machine learning to solve common problems. SIGML members sometimes meet among themselves as well, and discuss recent research in areas of common interest or papers that have the potential to cause ripples in the community.
Q. How do students working in ML come to know about the work of others on campus in the field?
A: We also celebrate a machine learning research day (MLRD) on campus every year, where students showcase their research work carried out at IITK and in various other institutes during internships and exchange visits.
SIGML, MLRD and other such initiatives help unite the community of people working in ML on campus, helping people make connections, troubleshoot their problems with help from others and leverage the department infrastructure and mentorship to learn and build valuable technical skills.
Oh, did I mention that we like to crack lame ML jokes every now and then just because we can do no better? And sometimes some people make memes as well, one of which (made by Parth Sharma, a batchmate of mine) was cited by Yann LeCun on his official Facebook page. Beat that!
Future prospects, internships and exchange
Students pursuing machine learning at IITK find themselves quite sought after for industrial internships as well as placements. While this may or may not be a causal relationship, there is hardly any doubt that most big companies and startups alike want engineers and software developers who have a machine learning background, and are opening up new job positions for them. Data science seems to be the buzzword of the decade. Or words.
Q. What sort of companies do students in ML from IITK get placed into?
A: While the research labs at Microsoft, IBM and Adobe stand out among the recruiters from campus for their excellent research infrastructure in India, there are positions available for students in ML at Google, Tower Research, Goldman Sachs, Amazon, Flipkart, Optiver, WorldQuant, Uber, and other top financial and technological giants as well. Many startups, including ShareChat, Zomato and the like, also hire graduates from IITK proficient in machine learning.
Q. What are the prospects for higher education / research internships in machine learning after graduating from IITK?
A: Research internships and PhD positions, while appearing harder to get, are still very much in sight. Students tend to gravitate towards Microsoft Research, EPFL, NYU, Duke University, CMU, IBM Research etc. as potential destinations to strengthen their research background in ML through internships and exchange programs. Students from the recent batches have worked at length with faculty members at IIT Kanpur before embarking on a three- or six-month exchange program.
Potential MS / PhD destinations, judging by the recent past, are spread throughout the globe, with the highest concentration at places that are generally considered to be good in AI. U of Toronto, Carnegie Mellon, University of Montreal, Aalto University, UC Berkeley, Princeton, NYU, Oxford, Georgia Tech, Stanford, EPFL, Duke, etc. constitute a formidable set of universities where our recent graduates are pursuing their higher education in AI / ML and related fields.
Before Signing Off
If one is into machine learning, or wants to get a grip on the area, IIT Kanpur may turn out to be one of the finest places in India to pursue those goals. Then again, if you are into a specific subfield, a more detailed study of the faculty members, labs and the peer group at various places might be of help in order to compare across institutes. I have seen people from other IITs and IIIT Hyderabad do a great job in ML as well, so those might be other potential places to compare with, but an institute’s eagerness to push the boundaries of research and to collaborate across fields should also be taken into consideration.