Work in progress
T.F.S. and R.T. Stewart:
Peirce, pedigree, probability.
H. Lin, K. Genin, T.F.S., and F. Zaffora Blando:
Entry for Oxford Bibliographies.
Undecidability in machine learning: what does it tell us?
Ben-David et al. (2019) construct a machine learning problem such that the question of its learnability is undecidable. What is the philosophical lesson that we can draw from this result?
Peer-reviewed publications, forthcoming
On explaining the success of induction. [doi] [philsci]
Forthcoming in The British Journal for the Philosophy of Science.
Douven (2021) observes that Schurz's meta-inductive justification of induction cannot explain the success of induction, and offers an explanation based on simulations of the social and evolutionary development of our inductive methods. I argue that this account does not answer the relevant explanatory question.
T.F.S. and P.D. Grünwald:
The no-free-lunch theorems of supervised learning. [doi] [philsci]
Forthcoming in Synthese.
The no-free-lunch theorems promote a skeptical conclusion that all learning algorithms equally lack justification. But how could this leave room for learning theory, which shows that some algorithms are better than others? We solve this puzzle by drawing a distinction between data-only and model-based algorithms.
T.F.S. and R. de Heide:
On the truth-convergence of open-minded Bayesianism. [doi] [philsci] Forthcoming in The Review of Symbolic Logic.
Wenmackers and Romeijn (2016) develop an extension of Bayesian confirmation theory that can deal with newly proposed hypotheses. We demonstrate that their open-minded Bayesians do not preserve the classic guarantee of weak truth-merger, and advance a forward-looking open-minded Bayesian that does.
The meta-inductive justification of induction. [doi] [philsci] Episteme 17(4): 519-541.
I discuss and reconstruct Schurz's proposed meta-inductive justification of induction, which is grounded in results from machine learning. I point out some qualifications, including that it can at most justify sticking with object-induction for now. I also explain how meta-induction is a generalization of Bayesian prediction.
The meta-inductive justification of induction: The pool of strategies. [doi] [philsci] Philosophy of Science 86(5): 981-992.
I pose a challenge to Schurz's proposed meta-inductive justification of induction. I argue that Schurz's argument requires a dynamic notion of optimality that can deal with an expanding pool of prediction strategies.
Putnam's diagonal argument and the impossibility of a universal learning machine. [doi] [philsci] Erkenntnis 84(3): 633-656.
The diagonalization argument of Putnam (1963) denies the possibility of a universal learning machine. Yet the proposal of Solomonoff (1964), made precise by Levin (1970), promises precisely such a thing. In this paper I discuss how this proposal is designed to evade diagonalization, but still falls prey to it.
A generalized characterization of algorithmic probability. [doi] [arxiv]
Theory of Computing Systems 61(4): 1337-1352.
In this technical paper I employ a fixed-point argument to show that the definition of algorithmic probability does not essentially rely on the uniform measure. A motivation for establishing this result was to question the view that algorithmic probability incorporates principles of indifference and simplicity.
Solomonoff prediction and Occam's razor. [doi] [philsci]
Philosophy of Science 83(4): 459-479.
Many writings on the subject suggest that Solomonoff's theory of prediction can offer a formal justification of Occam's razor. In this paper I make this argument precise and show why it does not succeed.
G. Barmpalias and T.F.S. (2011):
On the number of infinite sequences with trivial initial segment complexity. [doi] [preprint]
Theoretical Computer Science 412(52): 7133-7146.
In this technical paper, based on results from my MSc thesis [pdf], we answer an open problem [pdf] in the field of algorithmic randomness. This problem concerns infinite sequences of minimal Kolmogorov complexity. Along the way we prove several results on the complexity of trees.
E.R.G. Quaeghebeur, C.C. Wesseling, E.M.A.L. Beauxis-Aussalet, T. Piovesan, and T.F.S. (2017): The CWI world cup competition: Eliciting sets of acceptable gambles. [pdf] [poster] Proceedings of Machine Learning Research 62: Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications, 10-14 July 2017, pp. 277-288. Poster presented at ISIPTA '15.
Review of Deborah G. Mayo: Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars. [doi]
Journal for General Philosophy of Science 51(3): 507-510.
What's hot in mathematical philosophy. [pdf]
The Reasoner 12(12): 97-98.
J.-W. Romeijn, T.F.S. and P.D. Grünwald (2012):
Good listeners, wise crowds, and parasitic experts. [doi] [pdf]
Analyse & Kritik 34(2), pp. 399-408.
PhD dissertation (2018, cum laude)
Universal prediction: A philosophical investigation. [cwi-repo] [handle] [philsci]
Supervisors: J.-W. Romeijn (U Groningen) and P.D. Grünwald (Centrum Wiskunde & Informatica, Amsterdam; Leiden U).
Assessment committee: H. Leitgeb (LMU Munich), A.J.M. Peijnenburg (U Groningen), and S.L. Zabell (Northwestern U).
Examining committee: the assessment committee, and R. Verbrugge (U Groningen), L. Henderson (U Groningen), and W.M. Koolen (CWI Amsterdam).
In this thesis I investigate the theoretical possibility of a universal method of prediction. A prediction method is universal if it is always able to learn what there is to learn from data: if it is always able to extrapolate given data about past observations to maximally successful predictions about future observations. The context of this investigation is the broader philosophical question of whether a formal specification of inductive or scientific reasoning is possible, a question that also touches on modern-day speculation about a fully automated data-driven science.
I investigate, in particular, a specific mathematical definition of a universal prediction method, one that goes back to the early days of artificial intelligence and that has a direct line to modern developments in machine learning. This definition essentially aims to combine all possible prediction algorithms. An alternative interpretation is that this definition formalizes the idea that learning from data is equivalent to compressing data. In this guise, the definition is often presented as an implementation and even as a justification of Occam's razor, the principle that we should look for simple explanations.
The conclusions of my investigation are negative. I show that the proposed definition cannot be interpreted as a universal prediction method: this is exposed by the very mathematical argument that the definition was intended to overcome. Moreover, I show that the suggested justification of Occam's razor does not work, and I argue that the relevant notion of simplicity as compressibility is itself problematic.