# Math Insight

### Neural decoding introduction

Name:
Group members:
Section:
Total points: 1

Neural decoding is the process of attempting to infer features of a stimulus or other external variable from measurements of neurons' activity. The idea is that the neuronal response is encoding information about the inputs to the brain. If one understood how the information is encoded, then one might be able to go backward and make conclusions about these inputs by analyzing patterns of neuronal activity. This backward inference is at the heart of decoding. Since decoding is an attempt to interpret what the brain's activity means, we can whimsically refer to it as mind reading.

For this project, we will analyze a highly idealized scenario so that we can frame the decoding problem in terms of simple probability. We will start by imagining that we are measuring whether or not a single neuron in a rat's brain spikes once in one short time window. From this single measurement, we'll attempt to infer the rat's location. (In a more realistic setting, we would simultaneously measure the spiking activity from many tens of neurons over a period of many seconds, obtaining thousands of spikes from which to decode the rat's location.)

The overarching questions of this project are:

1. What types of conclusions about the rat's location can we make from just observing a neuron's spikes?
2. What properties of the neuronal spiking enhance or detract from the decoding?

Imagine that a rat is exploring two adjacent rooms while the response of a single neuron is being measured. From initial testing in which you both observe the rat and measure the neuron, you discover two things. First, the rat spends twice as much time in room A as in room B. Second, when you measure whether the neuron spiked in a 10 ms window, the likelihood of measuring a spike depends on the room. When the rat is in room A, you measure a spike about 10% of the time. On the other hand, when the rat is in room B, you measure a spike about 2% of the time.

1. Step 1: map from biology to math
1. Let's start by writing what we know from our initial testing in terms of probability. If you don't observe the neuron's response (and don't observe the rat location, either, of course), what is the probability that the rat is in room A?
(If you don't have enough information, enter underdetermined.)

2. If you don't observe the neuron's response (and don't observe the rat location, either, of course), and you know (or assume) that the rat is either in room A or room B, what is the probability that the rat is in room A?

Let $A$ be the event that the rat is in room A. You have just estimated the probability of event $A$, which, to repeat, is
$P(A) =$
.

Let $B$ be the event that the rat is in room B. What is $P(B)$?
.

3. The spiking data you measured can be interpreted as conditional probabilities. You measured the probability that the neuron spiked conditioned on each of the two events $A$ and $B$.

Let $R$ denote the number of spikes recorded from the neuron in a 10 ms time window. We assume that you record either 0 or 1 spikes. Therefore, we will be working with two more events. If the neuron spiked once, we say we had the event $R=1$; if the neuron did not spike, we say we had the event $R=0$.

If we know that the rat is in room A, what is the probability that the neuron fires a spike in a 10 ms window (i.e., the probability of the event that $R=1$)?

We denote the conditional probability of the event that $R=1$ given the event $A$ as $P(R=1 \,|\, A)$. To repeat, $P(R=1\,|\,A)=$
.

Similarly, what is the conditional probability of the event that $R=1$ given the event $B$? $P(R=1 \,|\, B) =$
.

4. For completeness, we define two more conditional probabilities for the event that the neuron does not spike, i.e., $R=0$.

$P(R=0 \,|\, A) =$

$P(R=0 \,|\, B) =$

2. Step 2: analyze the model
1. Your goal is to be able to perform an experiment where you measure whether or not the neuron fired a spike and see what information you can decode (or mind read) about the location of the rat. As a first step, enumerate all the possible outcomes for the experiment and calculate their probabilities given the initial data you collected.

We assume two possible events for the spike measured (the event $R=0$ or the event $R=1$) and two possible events for the rat location (the event A or B, as we will assume from now on that the rat is either in room A or room B). Hence, there will be four possible outcomes of the experiment:

1. The rat is in room A and the neuron spikes (the event $A$ and the event $R=1$).
2. The rat is in room A and the neuron does not spike (the event $A$ and the event $R=0$).
3. The rat is in room B and the neuron spikes (the event $B$ and the event $R=1$).
4. The rat is in room B and the neuron does not spike (the event $B$ and the event $R=0$).

For starters, let's determine the probability of the first outcome, that the rat is in room A and the neuron spikes.

We already estimated the probability that the rat is in room A, $P(A)=$
, and the conditional probability of a spike given A, $P(R=1\,|\,A)=$
.

To estimate the probability that the rat is in room A and the neuron spikes, we need to multiply the probability that the rat is in room A, $P(A)$, by the probability that the neuron spikes conditioned on the rat being in room A. In equations, we write this as $$P(R=1, A) = P(A) P(R=1 \,|\, A).$$ (The comma means “and”, so for any two events $X$ and $Y$, $P(X,Y)$ means the probability that both $X$ and $Y$ occur. The order doesn't matter, so $P(R=1, A) = P(A, R=1)$.)
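As a quick illustration of this multiplication rule, here is a minimal Python sketch. The numbers are made up for illustration (they are not the values from this project), and the variable names are our own.

```python
# Hypothetical values, chosen only to illustrate the rule.
p_A = 0.5              # P(A): probability the rat is in room A
p_spike_given_A = 0.3  # P(R=1 | A): probability of a spike given room A

# Multiplication rule: P(R=1, A) = P(A) * P(R=1 | A)
p_spike_and_A = p_A * p_spike_given_A

print(f"P(R=1, A) = {p_spike_and_A:.2f}")
```

Replacing the two hypothetical inputs with your own estimates gives the joint probability asked for next.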

Calculate the probability of the first outcome. $P(R=1,A) =$

2. To calculate the probability that the rat is in room $A$ and the neuron did not spike, we can use an analogous formula: $P(R=0,A) = P(A)P(R=0\,|\, A)$. We've already calculated both probabilities on the right hand side, so all that is needed is to multiply them. Therefore $P(R=0, A) =$

3. Repeat this calculation for the third and fourth outcomes.

The probability of the third outcome, i.e., that the rat was in room B and the neuron spiked, is:
$P(R=1, B) = P(B) P(R=1\,|\,B) =$
.

The probability of the fourth outcome, i.e., that the rat was in room B and the neuron did not spike, is:
$P(R=0, B) =$
.

4. Summarize the results in a contingency table.

| | $R=0$ | $R=1$ | Total |
|---|---|---|---|
| $A$ | | | |
| $B$ | | | |
| Total | | | |

The bottom right corner is the overall total, i.e., the probability that any of the four outcomes occurred: $P(R=1, A) + P(R=0, A) + P(R=1, B) + P(R=0,B)$. The value for this corner makes sense because

5. Now, if you record the neuron for one 10 ms window (without observing the rat), we want to calculate the probability that you will record a spike, i.e., the probability of the event $R=1$. We denote this probability by $P(R=1)$.

If you record a spike, which outcome(s) from the contingency table must have occurred?

Since the probability that a neuron spiked is the sum of the probabilities of those outcomes, the probability you will record a spike is $P(R=1)=$

On the other hand, the probability you don't record a spike is $P(R=0) =$
.
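The table and its column totals can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the prior and spiking probabilities below are hypothetical (not this project's values), and the dictionary layout is our own choice.

```python
# Hypothetical values (not the ones from this project).
prior = {"A": 0.5, "B": 0.5}          # P(A), P(B)
p_spike_given = {"A": 0.3, "B": 0.1}  # P(R=1 | room)

# Joint probabilities P(R=r, room) for the four outcomes.
joint = {}
for room in prior:
    joint[(room, 1)] = prior[room] * p_spike_given[room]
    joint[(room, 0)] = prior[room] * (1 - p_spike_given[room])

# Column totals: P(R=1) and P(R=0).
p_spike = joint[("A", 1)] + joint[("B", 1)]
p_no_spike = joint[("A", 0)] + joint[("B", 0)]

# Grand total over all four outcomes (the bottom right corner).
total = sum(joint.values())

print(f"P(R=1) = {p_spike:.2f}, P(R=0) = {p_no_spike:.2f}, total = {total:.2f}")
```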

6. We can finally answer the question we set out to answer: if we record a spike from the neuron in a 10 ms window (without observing the rat's location), what's the probability that the rat is in room A? (We can then get the probability that it is in room B for free, since these probabilities must add up to one.)

Here's a trivial question. If we record the fact that the neuron spiked, what's the probability that $R=1$?
. That's because, if we recorded that spike, we know that we had to be in the second column of the contingency table, so the total probability of that column must be 1.

Given that we know that we are in the second column (i.e., that we measured that a spike occurred), what's the likelihood we were in the first row versus in the second row (i.e., that the rat was in room A rather than room B)? To answer this question, we just need to rescale the probabilities so that the total probability for the second column is 1. In other words, we need to divide the values in the second column by its total $P(R=1) =$
.

| Given that $R=1$ | |
|---|---|
| $A$ | |
| $B$ | |
| Total | |

7. What you calculated in that rescaled table were the conditional probabilities: the probability $P(A \,|\, R=1)$ of being in room $A$ conditioned on the presence of a spike and the probability $P(B \,|\, R=1)$ of being in room $B$ conditioned on the presence of a spike.

By rescaling the table column, you calculated $P(A \,|\, R=1)$ as the ratio of two probabilities: (i) the probability that the rat was in room A and the neuron spiked and (ii) the probability that the neuron spiked. In equations: \begin{align} P(A \,|\, R=1) = \frac{P(R=1,A)}{P(R=1)}. \label{eq:pAR1} \end{align} The division by $P(R=1)$ is the rescaling due to the fact we are only considering cases where the neuron spiked. You could think of estimating the probability $P(A \,|\, R=1)$ by counting all the times the rat was in room A with the neuron spiking and dividing (i.e., rescaling) by the total number of times that the neuron spiked.
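The column rescaling can be sketched in Python as follows. The two joint probabilities are hypothetical (not this project's values), and the variable names are our own.

```python
# Hypothetical joint probabilities for the R=1 column.
p_spike_and_A = 0.15  # P(R=1, A)
p_spike_and_B = 0.05  # P(R=1, B)

# Column total: P(R=1).
p_spike = p_spike_and_A + p_spike_and_B

# Rescale the column so that it sums to 1.
p_A_given_spike = p_spike_and_A / p_spike  # P(A | R=1)
p_B_given_spike = p_spike_and_B / p_spike  # P(B | R=1)

print(f"P(A | R=1) = {p_A_given_spike:.2f}")
print(f"P(B | R=1) = {p_B_given_spike:.2f}")
```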

Let's summarize the results.

• $P(A \,|\, R=1) =$

This is the likelihood (i.e., probability) that the rat was in room A given that you recorded the neuron's response in a 10 ms window and observed that it spiked.
• $P(B \,|\, R=1 ) =$

If you observe a spike, this quantity is the probability the rat was in room B.
8. Equation \eqref{eq:pAR1} gives an expression for $P(A \,|\, R=1)$. The numerator of this expression is $P(R=1, A)$. Above, we calculated this value as $P(R=1, A) = P(R=1 \,|\,A) P(A)$. Recall that we started with the values of $P(R=1 \,|\,A)$ and $P(A)$, as they are the probability of a spike given that the rat was in room A and the probability of the rat being in room A, respectively.

Rewrite equation \eqref{eq:pAR1} using the numerator $P(R=1 \,|\,A) P(A)$.
$P(A \,|\, R=1) =$

This expression for $P(A \,|\, R=1)$ in terms of $P(R=1 \,|\,A)$ is called Bayes' Theorem. If you know the probability $P(A)$ that the rat is in room A and the probability $P(R=1)$ that the neuron fires a spike, then you can use Bayes' Theorem to convert $P(R=1 \,|\,A)$ (the probability of spiking conditioned on being in room A) into $P(A \,|\, R=1)$ (the probability of being in room A conditioned on spiking).
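A small helper makes the conversion concrete. This is a minimal sketch, assuming the rat is always in exactly one of the two rooms; the function name `posterior_A` and the example numbers are hypothetical and not part of the project.

```python
def posterior_A(p_A, p_r_given_A, p_r_given_B):
    """Return P(A | R=r) from P(A) and the likelihoods P(R=r | A), P(R=r | B).

    Assumes the rat is in exactly one of the two rooms, so P(B) = 1 - P(A).
    """
    p_B = 1 - p_A
    # Total probability of the response: P(R=r) = P(R=r|A)P(A) + P(R=r|B)P(B)
    p_r = p_r_given_A * p_A + p_r_given_B * p_B
    # Bayes' theorem: P(A | R=r) = P(R=r | A) P(A) / P(R=r)
    return p_r_given_A * p_A / p_r


# Hypothetical example: P(A) = 0.5, P(R=1 | A) = 0.3, P(R=1 | B) = 0.1.
p_A_given_spike = posterior_A(0.5, 0.3, 0.1)

# The same function handles R=0 by passing P(R=0 | room) = 1 - P(R=1 | room).
p_A_given_no_spike = posterior_A(0.5, 1 - 0.3, 1 - 0.1)

print(f"P(A | R=1) = {p_A_given_spike:.4f}")
print(f"P(A | R=0) = {p_A_given_no_spike:.4f}")
```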

9. You can likewise use Bayes' theorem in the case where you record the activity of the neuron in a 10 ms window and observe that the neuron did not fire a spike. In this case, we know that an outcome from the first column of the contingency table occurred, and Bayes' theorem is equivalent to rescaling that column by its total.

Write Bayes' theorem for determining the probability that the rat is in room A given that you measured that the neuron did not spike.
$P(A \,|\, R=0) =$

Write Bayes' theorem for determining the probability that the rat is in room B given that you measured that the neuron did not spike.
$P(B \,|\, R=0) =$

However, since we assume that the rat must be in either room A or room B, we know that $P(A \,|\, R=0) + P(B \,|\, R=0) =$
. Hence, it'd be overkill to use Bayes' theorem twice for both $P(A \,|\, R=0)$ and $P(B \,|\, R=0)$, but you could if you like.

Given that the neuron did not spike in a 10 ms window, what is the probability that the rat is in room A?
$P(A \,|\, R=0) =$
.

What is the probability that the rat is in room B when you observe a lack of a spike?
$P(B\,|\, R=0) =$
.

3. Step 3: interpret the model analysis biologically
1. Bayes' theorem is useful for decoding because it allows one to convert quantities like $P(R=1 \,|\,A)$ (the probability of spiking conditioned on being in room A) into quantities like $P(A \,|\, R=1)$ (the probability of being in room A conditioned on spiking). It is easy to estimate quantities like $P(R=1 \,|\,A)$; one just observes the likelihood of a spike when the rat is in room A. A quantity such as $P(A \,|\, R=1)$, on the other hand, allows us to read from the rat's mind (by looking at the neuron's activity) and estimate the rat's location.

In the decoding, you never determine exactly which room the rat is in. Instead, you simply work with probability distributions of the rat position, giving you the likelihood that the rat is in each room. The decoding actually involves three probability distributions of the rat's position.

The first is the prior distribution that you started with before recording the neuron. We wrote this probability distribution as $P(A)$ and $P(B)$ (which is $1-P(A)$).

The second is the distribution of rat position conditioned on measuring a spike in one 10 ms window. We wrote this probability distribution as $P(A\,|\,R=1)$ and $P(B\,|\,R=1)$ (which is $1-P(A\,|\,R=1)$).

The third probability distribution is the one conditioned on measuring the absence of a spike in one 10 ms window. We wrote this probability distribution as $P(A\,|\,R=0)$ and $P(B\,|\,R=0)$ (which is $1-P(A\,|\,R=0)$).

To visualize the results of the decoding, sketch bar graphs of these probability distributions on the following figures.

(Three bar-graph applets appear here, one for each of the probability distributions described above.)

2. In this example, without recording from the neuron, you estimate the probability that the rat is in room A to be $P(A) =$
. But, if you record the neuron's activity in a 10 ms window and observe that the neuron spiked, you can conclude

If, on the other hand, you observe that the neuron didn't spike in the 10 ms window, you can conclude
So, measuring the fact that the neuron did not spike
give you much additional information about the rat's location. Is it ever possible to conclude from measuring this neuron once that the rat is highly likely to be in room B?
.

3. If you run this experiment once, what's the likelihood that you'll have a “successful” mind reading experiment, in the sense that you can determine, from measuring from the rat's neuron, which room the rat is highly likely to be in?

$=$

(In the first blank, enter the probability in symbols, i.e. P(something). In the second blank, enter the number for this example.)

That likelihood isn't so crucial, as we can always repeat the experiment multiple times. (We don't want to have to address the issue of whether or not different experiments are independent, so we won't explore this option here.) The important question is how well we can determine the rat's location when we have such a “successful” experiment. In the cases where we believe we can successfully mind read, what is the likelihood the rat will actually be in the room that the neuron seems to indicate?

$=$

(Use the same convention for the two blanks as above.) This is the probability we are truly interested in, as it indicates how well we can actually decode the rat's location when it does seem that the neuron is telling us something.

What is the likelihood that we got misinformation from the neuron, i.e., that the rat was actually in one room when the neuron seemed to indicate the rat was in the other room?

$=$

(Use the same convention for the two blanks as above.) This probability is as important as the previous one, but it's just a different perspective on the same information.

4. Step 4: explore effect of neuronal properties
1. When measuring the neuron that we've described, you were able to do a remarkable job at decoding the rat's location when you did measure a spike from the neuron. Let's make the mind reading a little more difficult by introducing some error into the measurement of the spikes. One approach to measure a neuron's spikes is to insert an electrode into the brain and try to get the electrode close to one neuron to isolate its signal. Nonetheless, the signal the electrode picks up is noisy and includes effects from other neurons' spikes. It's a challenge to isolate the neuron's spikes from that signal, and algorithms to determine the spikes might miss some of the neuron's spikes or misidentify other neurons' spikes as coming from the neuron being recorded.

Let's imagine that, due to being conservative about classifying the signal as a spike, one misses half of the spikes of the neuron. We'll assume there is no relationship between the missed spikes and the location of the rat. Therefore, as far as the measurement is concerned, the neuron spikes in only 5 percent of the 10 ms windows when the rat is in room A, and it spikes in only 1 percent of the 10 ms windows when the rat is in room B. In other words, given that the event $R=1$ corresponds to measuring a spike, the conditional probabilities have dropped to
$P(R=1 \,|\, A) =$
,
$P(R=1 \,|\, B) =$
.

With this measurement error, the overall spiking probability becomes $P(R=1)=$
.

How does this measurement error, and consequent reduced spiking probability, affect the ability to decode the rat's position from the measurement of a spike? With the error, the probability distribution of the rat's position conditioned on measuring a spike becomes
$P(A \,|\, R=1) =$
,
$P(B \,|\, R=1) =$
.

Does the effect of the error on your ability to decode surprise you?
Mathematically, any mystery can be explained by Bayes' theorem for $P(A \,|\, R=1)$. The factor $P(R=1 \,|\, A)$ in the numerator
, and the denominator $P(R=1)$
, so that the ratio
.

Surely, this conservative approach where one misses half the spikes must affect the decoding in some way. Indeed, the difference is that, due to the
in $P(R=1)$, the probability of actually measuring a spike (or having a “successful” experiment)
. If one becomes too conservative and drops $P(R=1)$ a whole lot, then it might take an unreasonably long time to successfully decode the rat's location. But, even so, at least the quality of the decoding would be
.
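One way to check your reasoning about conservative spike detection is a short numerical experiment. The sketch below uses hypothetical numbers (not this project's values) and a hypothetical helper, `decode_spike`, and assumes that missed spikes are unrelated to the rat's location, so both spiking probabilities are scaled by the same detection fraction.

```python
def decode_spike(p_A, p_spike_given_A, p_spike_given_B):
    """Return (P(A | R=1), P(R=1)), assuming P(B) = 1 - P(A)."""
    p_B = 1 - p_A
    p_spike = p_spike_given_A * p_A + p_spike_given_B * p_B
    return p_spike_given_A * p_A / p_spike, p_spike


# Hypothetical neuron and prior.
p_A = 0.5
p_sA, p_sB = 0.3, 0.1

# All spikes detected.
post_full, p_spike_full = decode_spike(p_A, p_sA, p_sB)

# Only a fraction of the spikes detected, independent of the room.
detected = 0.5
post_missed, p_spike_missed = decode_spike(p_A, detected * p_sA, detected * p_sB)

print(f"all spikes:  P(A | R=1) = {post_full:.3f}, P(R=1) = {p_spike_full:.3f}")
print(f"half missed: P(A | R=1) = {post_missed:.3f}, P(R=1) = {p_spike_missed:.3f}")
```

Changing `detected` to other fractions shows how the two printed quantities respond to the scaling.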

2. What happens if, on the other hand, one is too liberal in attempting to detect the neuron's spikes and ends up including additional spikes from other neurons? We wouldn't expect these extraneous spikes to be related to the rat's location in the same way as the target neuron's spikes, so let's imagine that these extra spikes increase the probability of measuring $R=1$ by 0.05 regardless of whether the rat is in room A or room B. The measurement error due to the liberal spike detection algorithm has increased the conditional probabilities of measuring a spike to
$P(R=1 \,|\, A) =$
,
$P(R=1 \,|\, B) =$
.

With this measurement error, the overall spiking probability becomes $P(R=1)=$
.

How does this measurement error, and consequent increased spiking probability, affect the ability to decode the rat's position from the measurement of a spike? With the error, the probability distribution of the rat's position conditioned on measuring a spike becomes
$P(A \,|\, R=1) =$
,
$P(B \,|\, R=1) =$
.

With the measurement error, the quality of the decoding was
. This result can be seen through Bayes' theorem for $P(A \,|\, R=1)$. Although both $P(R=1 \,|\, A)$ in the numerator and $P(R=1)$ in the denominator increased by the same amount
, $P(R=1 \,|\, A)$ increased by a
fraction. Thus the ratio $P(R=1 \,|\, A)/P(R=1)$
after the introduction of the measurement error due to the liberal spike detection algorithm.
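The effect of extraneous spikes can be explored numerically in the same spirit. The sketch below is an illustration under assumptions: the neuron and prior are hypothetical (not this project's values), the helper `posterior_A_given_spike` is our own, and the extra spikes are modeled as adding the same amount to both conditional spiking probabilities.

```python
def posterior_A_given_spike(p_A, p_spike_given_A, p_spike_given_B):
    """Return P(A | R=1) via Bayes' theorem, assuming P(B) = 1 - P(A)."""
    p_spike = p_spike_given_A * p_A + p_spike_given_B * (1 - p_A)
    return p_spike_given_A * p_A / p_spike


# Hypothetical neuron and prior.
p_A = 0.5
p_sA, p_sB = 0.3, 0.1

# Extra spikes from other neurons, unrelated to the rat's location.
extra = 0.05

post_clean = posterior_A_given_spike(p_A, p_sA, p_sB)
post_noisy = posterior_A_given_spike(p_A, p_sA + extra, p_sB + extra)

print(f"without extra spikes: P(A | R=1) = {post_clean:.3f}")
print(f"with extra spikes:    P(A | R=1) = {post_noisy:.3f}")
```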

3. In general, decoding the rat's position from the spike of one neuron will work the best when $P(A \,|\, R=1)$ is
, as that indicates we've determined that the rat is highly likely to be in one room or the other.

Given that we are viewing $P(A)$ and $P(B)$ as fixed (we aren't changing the rat's behavior), Bayes' theorem indicates that the decoding will work best when the ratio $P(R=1 \,|\, A)/P(R=1)$ is
.

It shouldn't be surprising that this ratio indicates good decoding when the neuron is much more likely to fire in one room than in the other, i.e., when the neuron's response is strongly selective to the rat's location. (You can try calculating the ratio for a few example neurons to convince yourself it is true. For example, compare the non-selective case where $P(R=1 \,|\, A)=0.2$ and $P(R=1 \,|\, B)=0.1$ to the selective case where $P(R=1 \,|\, A)=0.2$ and $P(R=1 \,|\, B)=0.01$.)
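The suggested comparison can be carried out with a few lines of Python. The spiking probabilities are the two example cases from the text; the prior $P(A)=0.5$ is an assumption made only for this sketch (substitute your own value), and the function name `decoding_ratio` is hypothetical.

```python
def decoding_ratio(p_A, p_spike_given_A, p_spike_given_B):
    """Return P(R=1 | A) / P(R=1), assuming P(B) = 1 - P(A)."""
    p_spike = p_spike_given_A * p_A + p_spike_given_B * (1 - p_A)
    return p_spike_given_A / p_spike


p_A = 0.5  # assumed prior, for illustration only

# First example: P(R=1 | A) = 0.2, P(R=1 | B) = 0.1
ratio_first = decoding_ratio(p_A, 0.2, 0.1)

# Second example: P(R=1 | A) = 0.2, P(R=1 | B) = 0.01
ratio_second = decoding_ratio(p_A, 0.2, 0.01)

# Multiplying the ratio by P(A) gives P(A | R=1) (Bayes' theorem).
print(f"first case:  ratio = {ratio_first:.3f}, P(A | R=1) = {ratio_first * p_A:.3f}")
print(f"second case: ratio = {ratio_second:.3f}, P(A | R=1) = {ratio_second * p_A:.3f}")
```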