07 Dec 2016
Since our last demo day, the Grand Rounds team has been busy working on many aspects of the project. Grand Rounds provided our team with a database of anonymized Medicare information, including information about doctor visits. Each visit record includes claim numbers, the total charge, the complication code (essentially the reason for the visit), and the NPI number (which identifies the physician who performed the procedure), among other fields.
03 Dec 2016
GitHub doesn’t only make version control software; it’s also tasked with the unique challenge of analyzing programming languages. With over 10 million repositories, GitHub has an incredible corpus of files written by over 3 million users… that’s a lot of files. Our team had the unique opportunity to identify which programming language each file was written in. Our dataset has 50,000+ files spanning over 600 languages, a nontrivial multi-class classification problem. We’ve improved on GitHub’s existing classifier, Linguist, and, in the process, learned what distinguishes programming languages from one another.
22 Nov 2016
Imagine being able to just tell a computer what you want it to do, rather than programming it and having to deal with annoying syntax, semicolons, and debugging. For the last few months, we, the Code Synthesis team, have been working on just that. Our approach is similar to automated theorem proving, the use of computers to construct proofs, which has recently seen significant breakthroughs through the use of artificial neural networks. Automated theorem proving uses a description of the proof to derive a way to reach the desired end product. The same idea applies to our project, where we use a description of a program to generate the code for that program.
06 Nov 2016
Introduction, Regression/Classification, Cost Functions, and Gradient Descent
Machine learning (ML) has received a lot of attention recently, and not without good reason. It has already revolutionized fields from image recognition to healthcare to transportation. Yet a typical definition of machine learning sounds like this:
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”
Not very clear, is it? This post, the first in a series of ML tutorials, aims to make machine learning accessible to anyone willing to learn. We’ve designed it to give you a solid understanding of how ML algorithms work, as well as the knowledge to apply them in your own projects.
14 Oct 2016
One of our main goals here at ML@B is to help students understand how to use machine learning in real-world situations. This semester, we’ve teamed up with GitHub, Grand Rounds, SAP, and Intuit to work on solving some of their problems through machine learning. In addition, we have members working on their own independent research projects with groups such as the International Computer Science Institute.
Just this Friday, we had our first demo day, a day where project members got to show what they have been working on. Here’s a brief summary of what they had to show: