Coordinates | Thursdays from 2 to 4 PM in room 028 of Ludwigstrasse 31. |
Lecturer | Tom Sterkenburg. Contact me at tom.sterkenburg@lmu.de; visit me in room 126 of Ludwigstrasse 31. |
Course description | Machine learning is concerned with the design of algorithms that learn from data. In recent years, work in this field has led to a wealth of applications with an ever-increasing impact on our daily lives. From a somewhat loftier perspective, however, the work in this field constitutes the most recent attempt to confront fundamental questions of epistemology: How can we induce knowledge from data? Can this procedure actually be formalized, and even be automated?
In this course we will investigate whether and how contemporary machine learning theory (including such approaches as statistical learning theory, formal learning theory, online learning, and deep learning) can shed new light on the traditional philosophical problems of induction. |
Contents and material | See the schedule below for details. (This schedule is provisional; depending on participants' interests, we may make some changes as the course progresses.)
The first half of the course, six lectures, is devoted to statistical learning theory (SLT). We will go through the first part of the introductory textbook Understanding Machine Learning (Shalev-Shwartz & Ben-David, 2014; freely available here) and the booklet Reliable Reasoning (Harman & Kulkarni, 2007). I will also announce and make available additional papers and book chapters as reading material. (Further recommended overviews of the conceptual aspects of SLT and PAC learning are von Luxburg & Schölkopf (2011) and Ortner & Leitgeb (2011), sec. 1.) The second half of the course consists of six advanced lectures, each treating the philosophical aspects of a different approach in machine learning (though we will see that the same fundamental themes keep returning). The reading material for these lectures consists of papers that will be announced and made available. |
Assessment | The course is worth 9 ECTS. Your grade will be determined by two shorter writing assignments in the first half of the course (each counting for 20% of your grade) and a term paper at the end of the course (counting for the remaining 60%). |
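To illustrate the weighting, here is a hypothetical worked example (writing $a_1, a_2$ for the two assignment grades and $t$ for the term paper grade, and assuming the final grade is simply their weighted average, which the percentages above suggest but do not spell out):

$$\text{final grade} = 0.2\,a_1 + 0.2\,a_2 + 0.6\,t$$

For instance, grades of 2.0 and 1.7 on the assignments and 1.3 on the term paper would combine to $0.2 \cdot 2.0 + 0.2 \cdot 1.7 + 0.6 \cdot 1.3 = 1.52$.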
Schedule
Date | Topic | Material | Assignments |
---|---|---|---|
Thu Apr 12 | Introduction. The problem of induction. | Ch. 1 of Shalev-Shwartz & Ben-David. Ch. 1 of Harman & Kulkarni. Ch. 1 of Lipton (2004). | |
Thu Apr 19 | Basic notions of learning theory. Some history of learning theory. | Ch. 2 of Shalev-Shwartz & Ben-David. Intro. and ch. 1 of Vapnik (2000). Wheeler (2017). | |
Thu Apr 26 | The PAC model; the agnostic PAC model. Philosophical presuppositions of SLT. | Chs. 2–3 of Shalev-Shwartz & Ben-David. Ch. 2 of Harman & Kulkarni. | |
Thu May 3 | Uniform convergence. The no-free-lunch theorem(s). The impossibility of a universal learning method. | Chs. 4–5.1 of Shalev-Shwartz & Ben-David. Wolpert (1996). Forster (1999). | |
Thu May 10 | NO CLASS: Ascension Day. | | |
Thu May 17 | VC-dimension and the fundamental theorem of PAC learning. VC-dimension and falsificationism. | Chs. 5.2–6 of Shalev-Shwartz & Ben-David. Ch. 2 of Harman & Kulkarni (again). Steel (2011). Corfield, Schölkopf, & Vapnik (2009). | Deadline assignment 1. |
Thu May 24 | Universal consistency and non-uniform learnability; structural risk minimization. SRM and Goodman's riddle of induction. | Ch. 7 of Shalev-Shwartz & Ben-David. Ch. 3 of Harman & Kulkarni. Steel (2009). | |
Thu May 31 | NO CLASS: Corpus Christi. | | |
Thu Jun 7 | The minimum description length (MDL) principle. Occam's razor. | Sec. 7.3 of Shalev-Shwartz & Ben-David. De Rooij & Grünwald (2011). Domingos (1999). | Deadline assignment 2. |
Thu Jun 14 | Bayesian learning. PAC-Bayes; Naive Bayes. Transduction; generative models. Regularization and prior probability. | Ch. 24 of Shalev-Shwartz & Ben-David. Tipping (2004). | |
Thu Jun 21 | Formal learning theory and means-ends epistemology. Occam's razor. | Schulte (1999). Kelly & Glymour (2004). Also see the relevant SEP entry. | |
Thu Jun 28 | Competitive online learning. The meta-inductive justification of induction. | Secs. 21.1–2 of Shalev-Shwartz & Ben-David. Schurz (2008). | |
Thu Jul 5 | Neural networks and deep learning. | Ch. 20 of Shalev-Shwartz & Ben-David. Secs. 4.1–3 of Harman & Kulkarni. Zhang et al. (2017). | |
Thu Jul 12 | Wrap-up. Machine learning and the philosophy of science. | Ch. 4 of Harman & Kulkarni. Korb (2004). Williamson (2004). | |
Mon Sep 17 | | | Deadline term paper. |
Material
- Corfield, Schölkopf, & Vapnik (2009). Falsificationism and statistical learning theory: Comparing the Popper and Vapnik-Chervonenkis dimensions. J. Gen. Philos. Sci. [link]
- Domingos (1999). The role of Occam's razor in knowledge discovery. Data Min. Knowl. Disc. [link]
- Forster (1999). How do simple rules 'fit to reality' in a complex world? Minds Mach. [link]
- Harman & Kulkarni (2007). Reliable Reasoning: Induction and Statistical Learning Theory. [link]
- Kelly & Glymour (2004). Why probability does not capture the logic of scientific justification. Contemporary Debates in Philosophy of Science. [link]
- Korb (2004). Machine learning as philosophy of science. Minds Mach. [link]
- Lipton (2004). Inference to the Best Explanation. [link]
- Okasha (2001). What did Hume really show about induction? Philos. Quart. [link]
- Ortner & Leitgeb (2011). Mechanizing induction. Handbook of the History of Logic: Inductive logic. [link]
- De Rooij & Grünwald (2011). Luckiness and regret in minimum description length inference. Handbook of the Philosophy of Science: Philosophy of Statistics. [link]
- Schulte (1999). Means-ends epistemology. Brit. J. Philos. Sci. [link]
- Schurz (2008). The meta-inductivist's winning strategy in the prediction game: A new approach to Hume's problem. Philos. Sci. [link]
- Shalev-Shwartz & Ben-David (2014). Understanding Machine Learning. [link]
- Steel (2009). Testability and Ockham's razor: How formal and statistical learning theory converge in the new riddle of induction. J. Philos. Logic. [link]
- Steel (2011). Testability and statistical learning theory. Handbook of the Philosophy of Science: Philosophy of Statistics. [link]
- Tipping (2004). Bayesian inference: An introduction to principles and practice in machine learning. Advanced Lectures on Machine Learning: ML Summer Schools 2003. [link]
- Vapnik (2000). The Nature of Statistical Learning Theory. Second edition. [link]
- von Luxburg & Schölkopf (2011). Statistical learning theory: Models, concepts, and results. Handbook of the History of Logic: Inductive logic. [link]
- Wheeler (2017). Machine epistemology and big data. The Routledge Companion to Philosophy of Social Science. [link]
- Williamson (2004). A dynamic interaction between machine learning and the philosophy of science. Minds Mach. [link]
- Wolpert (1996). The lack of a priori distinctions between learning algorithms. Neural Comput. [link]
- Zhang, Bengio, Hardt, Recht, & Vinyals (2017). Understanding deep learning requires rethinking generalization. ICLR 2017. [link]