# Math Insight

### Neural decoding and mind reading project

Group members:
Total points: 1

To earn credit, a project must meet the following criteria.

| Criterion | Met | Not met |
| --- | --- | --- |
| Create an accurate probabilistic model of the influence of the rat's location on the neuron's spikes. | | |
| Accurately decode the rat's location from the spikes of each neuron and interpret the results. | | |
| Accurately determine the implications of measuring from the two neurons simultaneously. | | |

##### Submitting project

Submit the following by the due date.

1. This cover sheet
2. Answers to the project questions (typed or handwritten)

#### Background

Neural decoding is the process of attempting to infer features of a stimulus or other external variable from measurements of neurons' activity. The idea is that the neuronal response is encoding information about the inputs to the brain. If one understood how the information is encoded, then one might be able to go backward and make conclusions about these inputs by analyzing patterns of neuronal activity. This backward inference is at the heart of decoding. Since decoding is an attempt to interpret what the brain's activity means, we can whimsically refer to it as mind reading.

Many sensory neurons encode features of a stimulus by their firing pattern. For example, a visual neuron might be highly likely to fire a spike only when an object is in a certain location of the visual field, or it may spike only when an object has a certain shape, or spike only when an object is moving in a particular direction. Similarly, an auditory neuron might be likely to spike only for sounds at a certain frequency or it might spike only after certain combinations of sounds. When a neuron is likely to spike only when certain stimulus features are present, we can say that the neuron's response is encoding those features. The process of decoding such a sensory neuron attempts to ask the reverse question: what stimulus features were likely present based on an observation of when the neuron spiked?

In this project, rather than attempting to decode stimulus features, we will attempt to decode the location of a rat. The principles are the same, only we rely on different types of neurons. In an area of the brain called the hippocampus are neurons that we refer to as place cells because they are more likely to fire a spike when an animal, we'll say a rat, is in a particular location. We call the region that a particular neuron “prefers” its place field. We can say that the spikes of a place cell encode the location of the rat in the sense that the spikes give evidence that the rat is in that neuron's place field.

When attempting to decode a rat's position based on the spikes of place cells, we have to take different sources of evidence into account. First of all, rats tend to have preferred locations where they are more likely to hang out. (For example, a rat would much prefer to stay near a wall of a room rather than mosey around the middle of a room.) Second, spikes of a place cell give evidence that the rat may be in its place field. Since, ideally, we'd like to measure from multiple place cells simultaneously, we'd need to integrate evidence from multiple neurons. (A third source of evidence should be our estimate of where we thought the rat was a moment ago, as it would be highly unlikely for a rat to be able to teleport across a room in an instant. But, for simplicity, we won't deal with this evidence from multiple time points.)

To combine these sources of information, we will use Bayesian inference. We will use the information about where the rat prefers to be as the prior distribution of the rat's location. Then, using Bayes' Theorem, we can update the probability distribution of the rat's location when we receive evidence from the spiking of one or two place cells.
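
As a reminder of the form this update takes, here is Bayes' theorem for a generic location event $A$ and a generic observed spiking event $R$. (The symbols $A$ and $R$ are placeholders for this note only; the project defines its own events below.)

$$P(A \,|\, R) = \frac{P(R \,|\, A)\,P(A)}{P(R)}$$

Here $P(A)$ is the prior probability of the location, $P(R \,|\, A)$ is the probability of the observed spiking given that location, $P(R)$ is the overall probability of the observed spiking, and $P(A \,|\, R)$ is the updated (posterior) probability of the location.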

For this project, we will analyze a highly idealized scenario so that we can frame the decoding problem in terms of simple probability. We will start by imagining that we are measuring whether or not a single neuron (a place cell) in a rat's brain spikes once in one short time window. From this single measurement, we'll attempt to infer the rat's location. We will then look to see if we can improve our decoding by looking at two place cells simultaneously. (In a more realistic setting, we would simultaneously measure the spiking activity from many tens of neurons over a period of many seconds, obtaining thousands of spikes from which to decode the rat's location.)

The overarching questions of this project are:

1. What types of conclusions about the rat's location can we make from just observing a neuron's spikes?
2. What properties of the neuronal spiking enhance or detract from the decoding?

A rat is wandering along a linear track. In its wanderings, the rat prefers to be at the ends of the track. In fact, it spends 50% of the time at the two ends of the track (evenly divided between the two ends) and spends the remaining 50% of the time being equally likely to be anywhere along the rest of the track.

During these wanderings, you record the spiking activity of different neurons. You find two neurons whose firing activity appears to be strongly modulated by the location of the rat along the track (i.e., two place cells). You discover that these neurons are much more likely to fire when the rat is at certain locations along the track (i.e., you determine the neurons' place fields). The goal of this project is to determine how well you can decode the rat's location by recording the neurons' spikes.

1. Step 1: map from biology to math
1. Divide the track into ten equally spaced regions, numbered from 1 to 10, where region 1 and region 10 are the two ends of the track. For $j=1,2, \ldots, 10$, let $A_j$ denote the event that the rat is in region $j$. We assume that the rat is in one of those 10 regions.

What is the probability that the rat is in each of the regions? In other words, from knowledge about where the rat prefers to be, determine the prior distribution, which is composed of all the $P(A_j)$.

2. You measure the spikes of neurons 1 and 2 in a 10 ms window. During this length of time, you measure either 0 or 1 spikes. Let $R_1$ be the number of spikes recorded from neuron 1 and $R_2$ be the number of spikes recorded from neuron 2. Then, $R_1=1$ is the event that neuron 1 fired a spike, and $R_2=1$ is the event that neuron 2 fired a spike.

At this point, even though we are measuring from two neurons, we are going to treat them separately. (We won't combine their evidence until step 4.) For now, having two neurons simply means you must repeat all calculations twice, once for each neuron.

By measuring the spiking activity while the rat is wandering the track, you have determined that, when the rat is in most regions of the track, neuron 1 has about a 5% chance of firing in any 10 ms window. However, when the rat is in regions 3-5, the spiking probability is higher. During any 10 ms window, neuron 1 has a 10% chance of firing when the rat is in region 3, a 20% chance of firing when the rat is in region 4, and a 15% chance of firing when the rat is in region 5. Based on this information, determine the conditional probabilities $P(R_1=1 \,|\, A_j)$ of the event $R_1=1$, conditioned on the events $A_j$, for $j=1,2, \ldots, 10$.

3. Similarly, you determine that neuron 2 has about a 1% chance of firing in a 10 ms window when the rat is in most regions of the track. Its firing probability is elevated in regions 5-7. During any 10 ms window, neuron 2 has a 4% chance of firing when the rat is in region 5, a 6% chance of firing when the rat is in region 6, and a 3% chance of firing when the rat is in region 7. Based on this information, determine the conditional probabilities $P(R_2=1 \,|\, A_j)$ of the event $R_2=1$, conditioned on the events $A_j$.
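
If you plan to do the later calculations in R, the quantities from Step 1 fit naturally into vectors. The following is a minimal sketch on a hypothetical 4-region track with made-up probabilities. These are not the project's values, and the variable names ending in `_toy` are illustrative assumptions. It shows one way to build a prior and a vector of conditional spiking probabilities with `rep` and index assignment.

```r
# Hypothetical toy track with 4 regions (NOT the project's 10-region track).
n_regions <- 4

# Made-up prior: 60% of the time at the two ends (split evenly),
# the remaining 40% spread evenly over the interior regions.
P_end_total <- 0.6
P_A_toy <- rep((1 - P_end_total) / (n_regions - 2), n_regions)
P_A_toy[c(1, n_regions)] <- P_end_total / 2

# Made-up conditional spiking probabilities: a 10% baseline everywhere,
# elevated to 40% and 20% in regions 2 and 3 (a toy "place field").
P_spike_given_A_toy <- rep(0.1, n_regions)
P_spike_given_A_toy[2:3] <- c(0.4, 0.2)

# A prior must sum to 1; checking this catches many setup mistakes.
stopifnot(isTRUE(all.equal(sum(P_A_toy), 1)))

names(P_A_toy) <- paste("A", 1:n_regions, sep="")
print(P_A_toy)
print(P_spike_given_A_toy)
```

Note that only the prior has to sum to 1. The conditional spiking probabilities are one number per region and have no such constraint.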

2. Step 2: analyze the model
1. As a first step toward decoding the rat's location from the spikes, determine the probability $P(R_1=1)$ that neuron 1 fires a spike in any 10 ms window. (We often refer to this as the marginal probability that $R_1=1$, as we are averaging over all locations $A_j$ for $j=1,2,\ldots, 10$.) Similarly, compute $P(R_2=1)$.

This calculation is similar to the total probability calculation for $P(R=1)$ in the neural decoding introduction, except that you have to sum over ten terms rather than just two. (You'd have a large contingency table if you were to write that all out.) You are welcome to do these computations by hand, though it would involve some tedious calculation (calculating ten products, then adding them together, and repeating the procedure for the second neuron). An R program would greatly reduce the tedium, especially if you used vectors. Imagine, for example, that you created a vector `P_A` with 10 elements to store all the $P(A_j)$ and a vector `P_R1_1_given_A` with 10 elements to store all the $P(R_1=1 \,|\, A_j)$. Then, if, for some reason, you wanted to multiply the first component of `P_A` (i.e., $P(A_1)$) with the first component of `P_R1_1_given_A` (i.e., $P(R_1=1 \,|\, A_1)$), do the same for all ten components, and then add the total up, you could do this in R with the command `sum(P_A*P_R1_1_given_A)`. That's easier than computing 10 products by hand and then adding them up.

If you calculate these quantities with an R script, you can turn in your script as a way to show your work for the calculations.

2. Now that you have computed $P(R_1=1)$ and you already know both $P(A_j)$ and $P(R_1=1 \,|\, A_j)$ for $j=1,2,\ldots, 10$, use Bayes' theorem 10 times (once for each value of $j$) to compute $P(A_j \,|\, R_1=1)$.

Again, an R program could save you some tedious calculations. If you had two 10-element vectors `v1` and `v2`, as well as a scalar (single number) `u`, then the expression `v1*v2/u` in R would result in a 10-element vector whose first element, for example, would be the first element of `v1` times the first element of `v2` divided by `u`.

3. In a similar manner, use Bayes' theorem to compute $P(A_j \,|\, R_2=1)$.
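
The two calculations in Step 2 are the total probability formula, $P(R_1=1) = \sum_{j} P(R_1=1 \,|\, A_j)\,P(A_j)$, followed by Bayes' theorem, $P(A_j \,|\, R_1=1) = P(R_1=1 \,|\, A_j)\,P(A_j)/P(R_1=1)$. The following is a minimal sketch of both in R, on a hypothetical 4-region track with made-up numbers. These are not the project's values, and the `_toy` names are illustrative assumptions.

```r
# Hypothetical toy values for a 4-region track (NOT the project's numbers).
prior_toy      <- c(0.3, 0.2, 0.2, 0.3)   # plays the role of P(A_j)
likelihood_toy <- c(0.1, 0.4, 0.2, 0.1)   # plays the role of P(R = 1 | A_j)

# Total probability: multiply element by element, then add everything up.
marginal_toy <- sum(prior_toy * likelihood_toy)

# Bayes' theorem for all regions at once: vector * vector / scalar.
posterior_toy <- prior_toy * likelihood_toy / marginal_toy

# A posterior is a probability distribution, so it must sum to 1.
stopifnot(isTRUE(all.equal(sum(posterior_toy), 1)))

print(marginal_toy)
print(round(posterior_toy, 4))
```

In this toy example, the spike shifts probability away from the end regions (0.3 each in the prior) toward region 2, where the made-up neuron fires most readily.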

3. Step 3: interpret the model analysis biologically
1. To decode the rat's location, we are primarily interested in three probability distributions over regions $A_j$: the prior distribution $P(A_j)$, the posterior distribution of $A_j$ conditioned on a neuron 1 spike $P(A_j \,|\, R_1=1)$, and the posterior distribution of $A_j$ conditioned on a neuron 2 spike $P(A_j \,|\, R_2=1)$. Sketch bar graphs of these probability distributions.

In R, you can use the `barplot` function to create these bar graphs. To make the labeling easy, you can create a vector of region names with the command `region_names = paste("A", 1:10, sep="")` (look at the variable `region_names` to see what the `paste` command did). If you created the vectors suggested above, then the following command will plot a bar graph of the prior distribution:

```r
# Assumes P_A and region_names were created as described above.
barplot(P_A, col="darkgreen", names.arg=region_names,
        ylab="Probability", main="Prior distribution", ylim=c(0,0.25))
```


Here we made the maximum value of the $y$-axis be 0.25. You can adjust that number as you see fit. Similar commands will plot the remaining probability distributions.

2. Since the rat prefers to spend time at the ends of the track, the end regions play a special role. Without measuring a spike from any neuron (i.e., using the prior distribution), observe that the rat is most likely to be in one of the two end regions (region 1 or 10). Let $E$ be the event that the rat is in one of the two end regions of the track (region 1 or 10). What is the probability that the rat is in one of those two regions? Express that probability in symbols, in terms of the event $E$.

3. If you measure a spike from neuron 1 in a 10 ms window, in which region(s) is the rat most likely to be? (If there are multiple regions with the same probability in this posterior distribution, state all the regions.) One of those identified regions should not be an endpoint; this highly likely non-endpoint region is the most interesting region. Let $B_1$ be the event that the rat is in that region or one of its neighboring regions on either side. (The event $B_1$ should include three regions.) What is $P(B_1 \,|\, R_1=1)$? Compare that probability to $P(E \,|\, R_1=1)$.

4. If you measure a spike from neuron 2 in a 10 ms window, in which region is the rat most likely to be? Let $B_2$ be the event that the rat is in that region or one of its neighboring regions on either side. What is $P(B_2 \,|\, R_2=1)$? Compare that probability to $P(E \,|\, R_2=1)$.

5. From which neuron can you better decode the rat's location? Justify your answer.

6. What properties of that neuron lead to the better decoding? (Hint: look at $P(R_1=1)$ and $P(R_2=1)$ compared to the probability of $R_1=1$ or $R_2=1$ when the rat is in the regions corresponding to $B_1$ or $B_2$.)
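
Events such as $E$, $B_1$, and $B_2$ are unions of regions, and the regions are mutually exclusive, so the probability of such an event is the sum of the probabilities of the regions it contains. The following is a minimal sketch in R using a hypothetical 6-region posterior with made-up values. It is not computed from the project's numbers, and the `_toy` names are illustrative assumptions.

```r
# Hypothetical posterior over 6 regions (made-up values that sum to 1).
posterior_toy <- c(0.10, 0.15, 0.40, 0.20, 0.05, 0.10)
n_toy <- length(posterior_toy)
stopifnot(isTRUE(all.equal(sum(posterior_toy), 1)))

# which.max returns only the FIRST maximum; with ties, inspect the vector
# yourself (e.g., with a bar graph) to find every most-likely region.
peak_toy <- which.max(posterior_toy)

# This sketch assumes the peak is an interior region, so that it has a
# neighbor on each side.
stopifnot(peak_toy > 1, peak_toy < n_toy)
neighborhood_toy <- (peak_toy - 1):(peak_toy + 1)

# Probability of "peak region or one of its neighbors" and of "either end".
P_B_toy <- sum(posterior_toy[neighborhood_toy])
P_E_toy <- sum(posterior_toy[c(1, n_toy)])

print(c(P_B = P_B_toy, P_E = P_E_toy))
```

Comparing the two sums shows how concentrated the posterior is around its peak relative to the ends of the track.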

4. Step 4: decode from two neurons simultaneously
1. If we assume that, when the rat is in any particular region, the spikes of neuron 1 and neuron 2 are independent, then it's easy to determine the probability that both neurons spiked in a 10 ms window, conditioned on the rat being in a given region. Let $S$ be the event of a simultaneous spike, i.e., the event that both $R_1=1$ and $R_2=1$. Then, for $j=1,2,\ldots, 10$, let $P(S \,|\, A_j)$ denote the probability that both $R_1=1$ and $R_2=1$ when the rat is in region $j$. If the events $R_1=1$ and $R_2=1$ are independent in each region, then these probabilities are just the products of the probabilities of the individual spiking events: $$P(S \,|\, A_j) = P(R_1=1, R_2=1 \,|\, A_j) =P(R_1=1 \,|\, A_j)P(R_2=1 \,|\, A_j).$$ Calculate the 10 probabilities $P(S \,|\, A_j)$.

Once you have calculated these ten probabilities, you can treat the event $S$ of the simultaneous spike $R_1=1$ and $R_2=1$ just like the individual events $R_1=1$ and $R_2=1$. In what follows, you'll just repeat the above calculations for these simultaneous spike events.

2. Calculate the (marginal) probability of observing simultaneous spikes, $P(S)$. (The calculation should be exactly the same as for calculating $P(R_1=1)$, except that you use the 10 numbers $P(S \,|\, A_j)$ instead of the 10 numbers $P(R_1=1 \,|\, A_j)$.)
3. Use Bayes' theorem to decode the rat's position based on the measurement of simultaneous spikes in both neurons: $P(A_j \,|\, S)$. (Again, the calculation is exactly the same as for the $R_1=1$ case. Use Bayes' theorem in the same way.)
4. Sketch a bar graph of the probability distribution $P(A_j \,|\, S)$.
5. If you measure a spike from both neuron 1 and neuron 2 in a 10 ms window, in which region is the rat most likely to be? Let $B_3$ be the event that the rat is in that region or one of its neighboring regions on either side. What is $P(B_3 \,|\, S)$? Compare that probability to $P(E \,|\, S)$.
6. From which measurement can you better decode the rat's location: measurement of a spike in neuron 1, measurement of a spike in neuron 2, or measurement of simultaneous spikes in both neurons? Justify your answer.
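
The Step 4 calculations follow the same pattern as Step 2, with one extra element-by-element product at the start. The following is a minimal sketch in R on a hypothetical 4-region track with made-up numbers. These are not the project's values, and the `_toy` names are illustrative assumptions. It relies on the same assumption made above: within each region, the two neurons spike independently.

```r
# Hypothetical toy values for a 4-region track (NOT the project's numbers).
prior_toy         <- c(0.3, 0.2, 0.2, 0.3)     # role of P(A_j)
P_R1_given_A_toy  <- c(0.1, 0.4, 0.2, 0.1)     # role of P(R_1 = 1 | A_j)
P_R2_given_A_toy  <- c(0.05, 0.1, 0.3, 0.05)   # role of P(R_2 = 1 | A_j)

# Conditional independence within each region: multiply element by element.
P_S_given_A_toy <- P_R1_given_A_toy * P_R2_given_A_toy

# Marginal probability of a simultaneous spike (total probability).
P_S_toy <- sum(prior_toy * P_S_given_A_toy)

# Posterior over regions given a simultaneous spike (Bayes' theorem).
posterior_S_toy <- prior_toy * P_S_given_A_toy / P_S_toy
stopifnot(isTRUE(all.equal(sum(posterior_S_toy), 1)))

print(P_S_toy)
print(round(posterior_S_toy, 4))
```

In this toy example, a simultaneous spike is rare, but when one is observed the posterior concentrates on the regions where both made-up place fields overlap.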