Many people have written about this, and I do know this literature as well as I'd like, but I did a random calculation that I thought was semi-useful based on this paper. I haven't touched the channel coding aspect of this paper, but I did try to understand the mutual information between input and neural response in the small noise limit. In this case, I'm assuming that the neurons are nearly noiseless, and that their information is coded by a firing rate. The approximation was heavily inspired by this paper.
The details are not pretty, but the story ends up being pretty simple.
Essentially, we'll assume that the firing rate of the neuron is some nonlinear function of some gain multiplied by the input signal, plus some Gaussian noise that ends up being rather irrelevant. We'll also assume that the probability density function of the input is highly peaked around some value. This is essentially ICA, but we learn a few things about the optimal nonlinearity:
Please let me know if someone has already done this calculation! It seems like an obvious move, but I am unfortunately not familiar enough with the literature to know.
Here are some calculations that I will never publish.