Data from Video Games and The Master Algorithm

In episode twenty we chat with Pedro Domingos of the University of Washington, who has just published the book The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. We get some insight into linear dynamical systems, which the Datta Lab at Harvard Medical School is doing some interesting work with. Plus, we take a listener question about using video games to generate labeled data (spoiler alert: it's an awesome idea!).

We're in the final hours of our fundraising campaign, and we need your help!

We Need Your Help

Dear Listener, 

I hope you've been enjoying our show so far. Ryan and I have been having a fantastic time making it for you. But we've reached a point where if we're going to keep going, we need your help. 

Please donate to our Kickstarter

We take pride in the fact that our podcast is professional quality, but reaching that level of quality takes a lot of time, effort, and resources. We have to pay for studio time, audio engineering, and production time.

But our greatest expense is probably the thing you enjoy most about our show: our interviews. We're able to get interviews with the top experts in academia and industry because we're willing to go where they are. It may seem like a small thing, but it really makes a big difference. Unfortunately, it's also an expensive difference: travel is not usually an expense that podcasts incur, but we've found it's essential to making ours.

We've got a few days left in our Kickstarter and we've raised a little more than half of the funds we need. We need your help now more than ever, so please lend a hand and let's continue Talking Machines! 

Best, 

Katherine Gorman

 

Strong AI and Autoencoders

In episode nineteen we chat with Hugo Larochelle about his work on unsupervised learning, the International Conference on Learning Representations (ICLR), and his teaching style. His YouTube courses are not to be missed, and his Twitter feed, @Hugo_Larochelle, is a great source for paper reviews. Ryan introduces us to autoencoders (for more, turn to the work of Richard Zemel; there's also a small sketch below), plus we tackle the question of what is standing in the way of strong AI.
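
If you'd like to get your hands dirty with autoencoders, here's a minimal sketch of the idea: a one-hidden-layer network trained to reconstruct its own input through a bottleneck, written in plain numpy with toy data (this is our illustration, not code from Hugo or Richard Zemel):

```python
# A minimal autoencoder sketch: compress 10-dimensional inputs to a
# 3-dimensional code and reconstruct them, trained by gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # toy data: 200 points, 10 dims

n_hidden = 3                              # bottleneck smaller than input
W1 = rng.normal(scale=0.1, size=(10, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, 10))
lr = 0.01

for step in range(2000):
    H = np.tanh(X @ W1)                   # encode
    X_hat = H @ W2                        # decode (linear output)
    err = X_hat - X                       # reconstruction error
    # Backpropagate the squared-error loss through both layers.
    dW2 = H.T @ err / len(X)
    dH = (err @ W2.T) * (1 - H ** 2)      # tanh derivative
    dW1 = X.T @ dH / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2

print("final reconstruction MSE:", np.mean(err ** 2))
```

The interesting object is the hidden code H: once the network reconstructs well, it has learned a compressed representation of the data without ever seeing a label.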

Talking Machines is beginning development of season two! We need your help! Donate now on Kickstarter. 

Active Learning and Machine Learning in Neuroscience

In episode eighteen we talk with Sham Kakade, of Microsoft Research New England, about his expansive work, which touches on everything from neuroscience to theoretical machine learning. Ryan introduces us to active learning (great tutorial here, and a small sketch below), and we take a question on evolutionary algorithms.
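
For a concrete taste of active learning, here's a minimal sketch of pool-based uncertainty sampling on synthetic data, using scikit-learn (the dataset, seed set, and query budget are made up for illustration):

```python
# A minimal pool-based active learner: start with a few labels, then
# repeatedly query the unlabeled point the classifier is least sure of.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
rng = np.random.default_rng(0)

labeled = [int(i) for i in rng.choice(len(X), size=10, replace=False)]
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the point whose predicted
    # probability is closest to 0.5.
    probs = clf.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(query)                 # ask the "oracle" for its label
    pool.remove(query)

print("accuracy on all data:", clf.score(X, y))
```

The point is that the model chooses which labels to ask for, spending its labeling budget where it is most confused.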

Today we're announcing that season two of Talking Machines is moving into development, but we need your help! 

In order to raise funds, we've opened the show up to sponsorship and started a Kickstarter, and we've got some great nerd-cred prizes to thank you with. But more than just getting you a totally sweet mug, your donation will fuel journalism about the reality of scientific research, something that is unfortunately hard to find. Lend a hand if you can!

 

Machine Learning in Biology and Getting into Grad School

In episode seventeen we talk with Jennifer Listgarten of Microsoft Research New England about her work using machine learning to answer questions in biology. Recently, with her collaborator Nicolo Fusi, she used machine learning to make CRISPR more efficient and to correct for latent population structure in genome-wide association studies (GWAS). We take a question from a listener about the development of computational biology, and Ryan gives us some great advice on how to get into grad school (spoiler alert: apply to the lab, not the program).

Machine Learning for Sports and Real Time Predictions

In episode sixteen we chat with Danny Tarlow of Microsoft Research Cambridge (in the UK, not MA). Danny (along with Chris Maddison and Tom Minka) won best paper at NIPS 2014 for the paper A* Sampling. We talk with him about his work applying machine learning to sports and politics. Plus, we take a listener question on making real-time predictions using machine learning, and we demystify backpropagation. You can use Torch, Theano, or Autograd to explore backprop further.
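
Since Autograd comes up, here's a minimal sketch of what it does: you write an ordinary numpy loss function, and grad hands back its gradient via reverse-mode differentiation, which is exactly backprop (the tiny model and data below are just placeholders):

```python
# Backprop without writing backprop: Autograd differentiates plain
# numpy code by reverse-mode automatic differentiation.
import autograd.numpy as np
from autograd import grad

def loss(w, x, y):
    pred = np.tanh(np.dot(x, w))          # a tiny one-layer "network"
    return np.mean((pred - y) ** 2)       # squared-error loss

grad_loss = grad(loss)                    # gradient w.r.t. first argument

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([0.5, -0.5])
w = np.zeros(2)
for _ in range(100):
    w = w - 0.1 * grad_loss(w, x, y)      # plain gradient descent
print("learned weights:", w)
```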

Really Really Big Data and Machine Learning in Business

In episode fifteen we talk with Max Welling, of the University of Amsterdam and the University of California, Irvine, about his work with extremely large data sets and about machine learning in big business. Max was program co-chair for NIPS in 2013 when Mark Zuckerberg visited the conference, an event Max wrote about very thoughtfully. We also take a listener question about the relationship between machine learning and artificial intelligence. Plus, we get an introduction to change point detection. For more on change point detection, check out the work of Paul Fearnhead of Lancaster University. Ryan also has a paper on the topic from way back when.
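
To give a flavor of change point detection, here's a deliberately simple sketch: find the single split of a Gaussian sequence that most reduces the within-segment squared error (a toy baseline of our own, not one of Fearnhead's methods):

```python
# Single change point detection in a sequence whose mean shifts:
# scan every split and pick the one with the best two-segment fit.
import numpy as np

rng = np.random.default_rng(1)
# Toy data: mean jumps from 0 to 2 at position 100.
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 1, 100)])

def segment_cost(seg):
    return np.sum((seg - seg.mean()) ** 2)   # within-segment SSE

costs = [segment_cost(x[:t]) + segment_cost(x[t:])
         for t in range(1, len(x))]
tau = 1 + int(np.argmin(costs))
print("estimated change point:", tau)        # should land near 100
```

Real methods handle multiple change points and unknown noise levels, but the core idea of comparing segmentations by fit is the same.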

Solving Intelligence and Machine Learning Fundamentals

In episode fourteen we talk with Nando de Freitas. He's a professor of Computer Science at the University of Oxford and a senior staff research scientist at Google DeepMind. Right now he's focusing on solving intelligence. (No biggie.) Ryan introduces us to anchor words and how they can help us expand our ability to explore topic models. Plus, we take a question about the fundamentals of tackling a problem with machine learning.

Working With Data and Machine Learning in Advertising

In episode thirteen we talk with Claudia Perlich, Chief Scientist at Dstillery. We talk about her work using machine learning in digital advertising and her approach to data in competitions. We take a look at information leakage in competitions after this year's ImageNet Challenge. The New York Times covered the events, and Neil Lawrence has been writing thoughtfully about it and its impact. Plus, we take a listener question about trends in data size.

The Economic Impact of Machine Learning and Using The Kernel Trick on Big Data

In episode twelve we talk with Andrew Ng, Chief Scientist at Baidu, about how speech recognition is going to explode the way we use mobile devices, and about his approach to working on the problem. We also discuss why we need to prepare for the economic impacts of machine learning. We're introduced to Random Features for Large-Scale Kernel Machines (sketched below), and talk about how this twist on the kernel trick can help you dig into big data. Plus, we take a listener question about the scale of computing power in machine learning.
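
Here's a minimal sketch of the random features idea from Rahimi and Recht's paper: sample random Fourier features so that inner products of the mapped points approximate an RBF kernel, letting a cheap linear model stand in for a kernel machine (the dimensions and kernel width below are arbitrary):

```python
# Random Fourier Features: z(x) is a random map whose inner products
# approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).
import numpy as np

rng = np.random.default_rng(0)
d, D, gamma = 5, 500, 0.5          # input dim, feature count, kernel width

W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0, 2 * np.pi, size=D)

def z(x):
    return np.sqrt(2.0 / D) * np.cos(x @ W + b)

x1, x2 = rng.normal(size=d), rng.normal(size=d)
approx = z(x1) @ z(x2)             # plain inner product of features
exact = np.exp(-gamma * np.sum((x1 - x2) ** 2))
print(f"approx {approx:.3f} vs exact {exact:.3f}")
```

Because z(x) is an explicit finite-dimensional vector, you can train a linear model on millions of points instead of building an n-by-n kernel matrix.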

How We Think About Privacy and Finding Features in Black Boxes

In episode eleven we chat with Neil Lawrence from the University of Sheffield. We talk about the problems of privacy in the age of machine learning, the responsibilities that come with using ML tools, and making data more open. We learn about the Markov decision process (and what happens when you use it in the real world and it becomes a partially observable Markov decision process; see the sketch below), and we take a listener question about finding insights into the features inside the black boxes of deep learning.
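
For the curious, here's a minimal sketch of solving a Markov decision process by value iteration on a made-up two-state, two-action problem (in the partially observable case you would instead maintain a belief distribution over states):

```python
# Value iteration on a toy MDP. P[a][s][t] is the probability of
# moving from state s to state t under action a; R[s][a] is the
# immediate reward; gamma discounts the future.
import numpy as np

P = np.array([[[0.9, 0.1], [0.2, 0.8]],      # action 0
              [[0.5, 0.5], [0.1, 0.9]]])     # action 1
R = np.array([[1.0, 0.0], [0.0, 2.0]])       # R[s, a]
gamma = 0.9

V = np.zeros(2)
for _ in range(200):
    # Bellman backup: value of each action, then take the best.
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V = Q.max(axis=1)

print("optimal state values:", V)
print("optimal policy:", Q.argmax(axis=1))
```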

Interdisciplinary Data and Helping Humans Be Creative

In episode ten we talk with David Blei of Columbia University. We talk about his work on latent Dirichlet allocation, topic models, the PhD program in data that he's helping to create at Columbia, and why exploring data is inherently multidisciplinary. We learn about Markov chain Monte Carlo (sketched below) and take a listener question about how machine learning can make humans more creative.
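
As a taste of Markov chain Monte Carlo, here's a minimal random-walk Metropolis sampler targeting a standard normal; LDA itself is usually fit with collapsed Gibbs sampling or variational inference, so this just shows the core accept/reject idea:

```python
# A minimal random-walk Metropolis sampler: propose a nearby point,
# accept with probability min(1, p(proposal) / p(current)).
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    return -0.5 * x ** 2                  # unnormalized log N(0, 1) density

samples, x = [], 0.0
for _ in range(10000):
    proposal = x + rng.normal(scale=1.0)  # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal                      # accept; otherwise keep x
    samples.append(x)

print("sample mean:", np.mean(samples), "sample std:", np.std(samples))
```

The magic is that you only ever need the target density up to a constant, which is exactly the situation in Bayesian posterior inference.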

Starting Simple and Machine Learning in Meds

In episode nine we talk with George Dahl, of the University of Toronto, about his work on the Merck molecular activity challenge on Kaggle and on speech recognition. George successfully defended his thesis at the end of March 2015. (Congrats, George!) We learn about how networks and graphs can help us understand latent properties of relationships, and we take a listener question about just how you find the right algorithm to solve a problem (spoiler: start simple).

Spinning Programming Plates and Creative Algorithms

On episode eight we talk with Charles Sutton, a professor in the School of Informatics at the University of Edinburgh, about computer programming and using machine learning to better understand how it's done well.

Ryan introduces us to collaborative filtering, a process that helps make predictions about taste. Netflix and Amazon use it to recommend movies and items, and it's the approach the Netflix Prize competition further helped to hone.
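
Here's a minimal sketch of the matrix factorization flavor of collaborative filtering that the Netflix Prize popularized: learn low-dimensional user and item factors from a handful of made-up ratings with stochastic gradient descent:

```python
# Matrix factorization for collaborative filtering: fit low-rank user
# and item factors to observed (user, item, rating) triples with SGD.
import numpy as np

rng = np.random.default_rng(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, k = 3, 3, 2

U = rng.normal(scale=0.1, size=(n_users, k))   # user factors
V = rng.normal(scale=0.1, size=(n_items, k))   # item factors
lr, reg = 0.05, 0.01

for epoch in range(500):
    for u, i, r in ratings:
        err = r - U[u] @ V[i]                  # prediction error
        u_old = U[u].copy()
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * u_old - reg * V[i])

# Predict a rating we never observed: user 2 on item 0.
print("predicted rating:", U[2] @ V[0])
```

Users who rate things similarly end up with similar factor vectors, which is what lets the model fill in the blanks of the ratings matrix.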

Plus, we take a listener question on creativity in algorithms.

The Automatic Statistician and Electrified Meat

In episode seven of Talking Machines we talk with Zoubin Ghahramani, professor of Information Engineering in the Department of Engineering at the University of Cambridge. His project, The Automatic Statistician, aims to use machine learning to take raw data and give you statistical reports and natural language summaries of the trends that data shows. We get really hungry exploring Bayesian nonparametrics through the stories of the Chinese Restaurant Process (simulated below) and the Indian Buffet Process (but remember, there's no free lunch). Plus, we take a listener question about how much we should rely on ourselves, and our ideas about what intelligence in electrified meat looks like, when we try to build machine intelligences.
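
If the restaurant stories whet your appetite, here's a minimal simulation of the Chinese Restaurant Process: each new customer joins a table with probability proportional to its size, or starts a new one with probability proportional to alpha (the parameter values here are arbitrary):

```python
# Simulate the Chinese Restaurant Process: a prior over partitions
# where popular tables attract more customers ("rich get richer").
import numpy as np

rng = np.random.default_rng(0)
alpha, n_customers = 1.0, 100
tables = []                                # current table sizes

for n in range(n_customers):
    probs = np.array(tables + [alpha], dtype=float)
    probs /= probs.sum()
    choice = rng.choice(len(probs), p=probs)
    if choice == len(tables):
        tables.append(1)                   # open a new table
    else:
        tables[choice] += 1                # join an existing table

print("table sizes:", sorted(tables, reverse=True))
```

Notice that the number of tables isn't fixed in advance; it grows with the data, which is the whole point of the nonparametric view.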

The Future of Machine Learning from the Inside Out

We hear the second part of our conversation with Geoffrey Hinton (Google and University of Toronto), Yoshua Bengio (University of Montreal), and Yann LeCun (Facebook and NYU). They talk with us about the history (and future) of research on neural nets. We explore how to use determinantal point processes. Alex Kulesza and Ben Taskar (who passed away recently) have done some really exciting work in this area; for more on DPPs, check out their paper on the topic (and the sketch below). Also, we take a listener question about whether machine learning is just function approximation (spoiler alert: it is, and then again, it isn't).
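
For a concrete feel for DPPs, here's a minimal sketch over a four-item ground set: under an L-ensemble, the probability of drawing exactly the subset S is det(L_S) / det(L + I), which rewards subsets of dissimilar items (the kernel values below are made up):

```python
# A tiny determinantal point process: subset probabilities are
# determinants of the similarity kernel restricted to the subset.
import itertools
import numpy as np

# Similarity kernel over 4 items; items 0 and 1 are nearly identical.
L = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])

Z = np.linalg.det(L + np.eye(4))           # L-ensemble normalizer
for S in itertools.combinations(range(4), 2):
    p = np.linalg.det(L[np.ix_(S, S)]) / Z
    print(f"P({S}) = {p:.3f}")
```

Run it and you'll see the near-duplicate pair (0, 1) gets much lower probability than diverse pairs, which is exactly the repulsion that makes DPPs useful for summarization and search.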

The History of Machine Learning from the Inside Out

In episode five of Talking Machines, we hear the first part of our conversation with Geoffrey Hinton (Google and University of Toronto), Yoshua Bengio (University of Montreal), and Yann LeCun (Facebook and NYU). Ryan introduces us to the ideas in tensor factorization methods for learning latent variable models (which is both a tongue twister and one of the new tools in ML). To find out more on the topic, the paper Tensor decompositions for learning latent variable models is a good place to start. You can also take a look at the work of Daniel Hsu, Animashree Anandkumar, and Sham M. Kakade. Plus, we take a listener question about just where statistics stops and machine learning begins.

 

Using Models in the Wild and Women in Machine Learning

In episode four we talk with Hanna Wallach, of Microsoft Research. She's also a professor in the Department of Computer Science at the University of Massachusetts Amherst and one of the founders of Women in Machine Learning (better known as WiML). We take a listener question about scalability and the size of data sets. And Ryan takes us through topic modeling using latent Dirichlet allocation (say that five times fast).
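
If you want to try topic modeling yourself, here's a minimal sketch using scikit-learn's LatentDirichletAllocation on a few toy documents (assuming a reasonably recent scikit-learn):

```python
# Fit a 2-topic LDA model on toy documents and print the top words
# per topic. Needs scikit-learn >= 1.0 for get_feature_names_out.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["the cat sat on the mat",
        "dogs and cats are pets",
        "stocks fell as markets closed",
        "investors sold shares in the market"]

counts = CountVectorizer().fit(docs)
X = counts.transform(docs)                 # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
words = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [words[i] for i in topic.argsort()[-4:][::-1]]
    print(f"topic {k}: {top}")
```

With a real corpus you'd use far more documents and topics, but the workflow (count words, fit, inspect the word distributions) stays the same.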

Common Sense Problems and Learning about Machine Learning

On episode three of Talking Machines we sit down with Kevin Murphy, who is currently a research scientist at Google. We talk with him about the work he's doing there on the Knowledge Vault, his textbook, Machine Learning: A Probabilistic Perspective (and its arch-nemesis, which we won't link to), and how to learn about machine learning (Metacademy is a great place to start).

We tackle a listener question about the dream of a one-step solution to strong artificial intelligence and whether deep neural networks might be it.

Plus, Ryan introduces us to a new way of thinking about questions in machine learning from Yoshua Bengio's lab at the University of Montreal, outlined in their new paper, Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. Katherine also brings up Facebook's release of open source machine learning tools, and we talk about what it might mean.

If you want to explore some open source tools for machine learning, we also recommend giving these a try: