My research to date mostly focuses on "predictive features". Simply put, these are the aspects of a data stream that you'd want to retain in order to understand future data. I also have a parallel interest in the constraints that govern such inference: organisms only have so much memory. Together, these two threads form my main interest: resource-constrained predictive inference. Are biological organisms efficient predictive inference machines? If so, how do they operate, and how can we use that knowledge to engineer new devices?

## Sensors as optimal compressors of the environment

This paper relates the task of a biological sensor to rate-distortion theory, a branch of information theory. Basically, a sensor has two tasks. First, it has to accurately convey information about the environment. Second, it has to be "small": have fewer neurons, take less time to compute, and be more compatible with downstream regions. Together, these yield one grand objective function that jointly penalizes inaccuracy and memory.
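To make that objective concrete, here is a minimal sketch of the classic Blahut-Arimoto algorithm, the standard way to trace out the rate-distortion trade-off. This is a textbook illustration of the underlying theory, not code from the paper; the function name and the toy binary source are my own choices.

```python
import numpy as np

def blahut_arimoto(p_x, d, beta, n_iter=200):
    """Minimize I(X;Y) + beta * E[d(x,y)] over channels q(y|x).

    p_x  : source distribution over inputs, shape (nx,)
    d    : distortion matrix d[x, y], shape (nx, ny)
    beta : trade-off knob between accuracy and compression
    """
    nx, ny = d.shape
    q_y = np.full(ny, 1.0 / ny)                       # output marginal
    for _ in range(n_iter):
        # optimal channel given the current output marginal
        q_y_given_x = q_y * np.exp(-beta * d)
        q_y_given_x /= q_y_given_x.sum(axis=1, keepdims=True)
        # re-estimate the output marginal
        q_y = p_x @ q_y_given_x
    joint = p_x[:, None] * q_y_given_x
    rate = np.sum(joint * np.log2(q_y_given_x / q_y))  # I(X;Y) in bits
    distortion = np.sum(joint * d)                     # expected d(x,y)
    return rate, distortion

# toy example: fair binary source with Hamming distortion
p_x = np.array([0.5, 0.5])
d = 1.0 - np.eye(2)
rate, dist = blahut_arimoto(p_x, d, beta=3.0)
```

Sweeping `beta` traces out the full curve: large `beta` buys accuracy at the cost of a large (memory-hungry) representation, small `beta` the reverse, which is exactly the accuracy-versus-size tension a sensor faces.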

Building such sensors is hard, but biologically-inspired algorithms can sometimes do the trick: here and here.


## Inferring predictive features

The task I was faced with as a graduate student was this: here's some data; how should we build a model of it? At first, I was drawn to Maximum Entropy methods, but when I learned about hidden Markov models, I realized I had happened upon something more powerful. However, the number of hidden Markov model topologies grows super-exponentially with the number of states, making a brute force search untenable for real-world data. There are various ways around this, but I decided to turn the brute force search through all topologies into a brute force search through only those topologies that incorporate "expert knowledge" about the data. In a series of papers, I enumerated what kinds of topologies I expected to see for my favorite discrete-event, continuous-time data. Here's an early paper on the discrete-time case, a paper that morphs discrete into continuous-time, and a paper on the full discrete-event, continuous-time case. Stay tuned for new work that combines this stream of thought and Chris Strelioff's earlier papers into a new modeling and prediction algorithm.
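The combinatorial blow-up is easy to see by counting. As an illustration (my own back-of-the-envelope, restricted for simplicity to unifilar topologies, where each state-symbol pair determines the next state, and ignoring state-relabeling symmetries), the count of candidate transition structures is already astronomical for a handful of states:

```python
def n_unifilar_topologies(n_states: int, n_symbols: int) -> int:
    """Upper bound on unifilar HMM topologies: each (state, symbol)
    pair independently picks one of n_states successor states."""
    return n_states ** (n_states * n_symbols)

# growth with a binary alphabet
for n in range(1, 7):
    print(n, n_unifilar_topologies(n, 2))
```

Even this restricted, symmetry-ignoring count grows like n^(2n), which is why restricting the search to topologies consistent with expert knowledge matters so much.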

## Benchmarking how well agents infer predictive features

I happily realized that the lossless predictive features studied by Jim Crutchfield could be used not only to build models, but also to benchmark how well an agent has inferred predictive features. Sometimes the agent is better off building order-R Markov models; sometimes the agent is better off uncovering hidden states. And one has to be careful, because sometimes memorizing the past provides no guide to the future. Stay tuned for new work on benchmarking of recurrent neural networks, humans, and neurons!
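A toy version of this benchmarking idea (my own illustration, not a result from the papers) uses the Even Process, a standard hidden-state example from computational mechanics: estimate how uncertain the next symbol is given the last R symbols, for increasing R. Because the process has hidden structure that no finite Markov order fully captures, the estimated conditional entropy keeps dropping as R grows but never reaches the true entropy rate of 2/3 bits:

```python
import random
from collections import Counter
from math import log2

def even_process(n, seed=0):
    """Sample the Even Process: runs of 1s always have even length."""
    rng = random.Random(seed)
    state, out = "A", []
    for _ in range(n):
        if state == "A":
            if rng.random() < 0.5:
                out.append(0)              # stay in A
            else:
                out.append(1); state = "B" # start a run of 1s
        else:
            out.append(1); state = "A"     # close the run of 1s
    return out

def cond_entropy(seq, R):
    """Plug-in estimate of H(X_t | previous R symbols), in bits."""
    joint = Counter(tuple(seq[i - R:i + 1]) for i in range(R, len(seq)))
    past = Counter(tuple(seq[i - R:i]) for i in range(R, len(seq)))
    n = len(seq) - R
    return -sum((c / n) * log2(c / past[w[:-1]]) for w, c in joint.items())

seq = even_process(200_000)
rates = [cond_entropy(seq, R) for R in range(5)]
```

An agent whose predictions beat the order-R curve must have recovered something like the hidden states; an agent stuck on the curve is, in effect, just memorizing pasts.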