
ANC Workshop: Jennifer Sanger and Charles Sutton, Chair: James Raymond

When Nov 08, 2016
from 11:00 AM to 12:00 PM
Where IF 4.31/4.33

Jennifer Sanger: “Applications of Algebraic Topology to Neuroscience Research”


Recent years have seen growing interest in applying algebraic topology to research questions in computational neuroscience. Here, I provide a light introduction to the core theory and give some examples of its use in the analysis of a variety of neural data. Finally, I invite us to consider EEG data as a specific example for constructing a topological paradigm of investigation.

Charles Sutton: “Parameter-Free Probabilistic API Mining across GitHub”

(This is a practice talk for the Foundations of Software Engineering (FSE) conference next week)

API mining is a problem at the intersection of machine learning, software engineering, and programming languages. Given a large corpus of source code that uses a software library, the problem is to infer the most common patterns that describe how the library is used.

These patterns can then be shown as illustrative examples to software developers who are unfamiliar with the library.

Existing API mining algorithms are based on methods from the data mining literature, in particular a set of methods called frequent sequence mining. There is a huge literature on mining sequences but, unfortunately, most of it is statistically unsound. This results in lists of patterns that are large, highly redundant, and difficult to understand. We present a new method for mining sequences based on probabilistic machine learning. Applying this method, we present PAM (Probabilistic API Miner), a near parameter-free probabilistic algorithm for API mining. We apply our method to a dataset of projects on GitHub for which developers have written example code by hand. Our dataset contains over 300,000 lines of hand-written API example code from 967 client projects. We find that the API examples mined by PAM match the hand-written examples significantly better than those mined by traditional data mining techniques.
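
To make the redundancy problem concrete, here is a minimal sketch of the frequency-based approach the abstract criticises. It is not PAM, and it is not taken from the talk: the function name `frequent_subsequences`, the `min_support` threshold, and the toy corpus of API call names are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_subsequences(call_sequences, min_support, max_len=3):
    """Naive frequent sequence mining (illustrative baseline, not PAM).

    Counts, for every subsequence of up to max_len calls, the number of
    client call sequences that contain it, and keeps those that reach
    min_support.
    """
    support = Counter()
    for seq in call_sequences:
        seen = set()  # count each pattern at most once per sequence
        for n in range(1, max_len + 1):
            # combinations() yields index tuples in increasing order,
            # so call order is preserved while gaps are allowed.
            for idx in combinations(range(len(seq)), n):
                seen.add(tuple(seq[i] for i in idx))
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical corpus: each list is the API call sequence of one client method.
corpus = [
    ["open", "read", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
]

patterns = frequent_subsequences(corpus, min_support=2)
for pattern, count in sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0])):
    print(count, " -> ".join(pattern))
```

On this toy corpus a single underlying idiom (open, read, close) is reported as seven separate frequent patterns, because every fragment of it also meets the threshold. The output also depends on the hand-picked `min_support` value, which is the kind of parameter a near parameter-free method aims to avoid.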