8  Accumulator models of choice and response time

So far, we have focused on models that are designed to explain how people choose between two opposing options. We built a random walk model and a diffusion model and then saw how we could fit a diffusion model to choice/RT data. For the latter purpose, we relied on the WienR package to calculate the (log-)likelihood of making a particular choice at a particular time. This is because the math for calculating those likelihoods is not trivial, requiring some bespoke numerical approximations that are beyond the scope of this course. Nonetheless, our cognitive models so far have touched on several major themes that are shared by all cognitive models.

In this chapter, we will build a new class of model that allows us to address more complex choice scenarios. We will not only write our own simulation code, as we did before; we will also write our own likelihood function, giving us a bit more insight into how those functions work. The specific models we will address are called accumulator models.

Accumulator models are very similar to the random walk and diffusion models of choice that we have seen already. Like those models, they aim to explain choices and response times in terms of an evidence accumulation process. The difference is in how accumulator models represent the decision-maker’s momentary states of evidence. Instead of representing evidence as a single value that captures the “balance” of evidence between two options, accumulator models represent evidence in terms of multiple values. In some accumulator models, each option available to a decision maker is associated with its own evidence value. But it is also possible to make accumulator models more general than that. For example, perhaps each feature of an item (like the color and shape of an object) is processed in a separate channel, each of which is associated with an evidence value. In these more general models, it is possible to formulate different decision rules by specifying how evidence values map onto different choices.

The preceding overview may seem a bit abstract, so let’s consider a concrete situation that will let us see how accumulator models can be used to model different kinds of tasks with different characteristics and decision rules. This situation is a visual search task. In a visual search task, a participant is shown a search array consisting of multiple items; one or more of those items could be a “target” for which the participant is supposed to search. For example, an array could consist of colored shapes, with the target defined as a red square.

In the first version of the visual search task we will consider, let’s say that each array shown to the participant contains exactly one target item and the job of the participant is to identify which item is the target. Let’s say that the participant expresses their choice by making a saccade (an eye movement) so that they fixate the item they think is the target. (Alternatively, they could make their selection by clicking on the item if using a mouse or tapping on the item if using a touchscreen.) The point is that, unlike in the random walk and diffusion models we’ve considered so far, the participant may have more than two options if there are more than two items in the array. A similar situation could arise when deciding which of several products to buy, for example.

8.1 Race model simulations

The first kind of accumulator model we will build is called a race model for the simple reason that it models making a decision as a race between different evidence accumulation processes. Each option is associated with a level of evidence which can fluctuate over time as evidence that either favors or disfavors that option gets accumulated. Whichever accumulator first reaches a threshold level of evidence then determines the outcome of the choice, and the response time depends on how long the “winning” accumulator needed to reach threshold (plus residual time, of course).

As we will see, each accumulator can be modeled as a diffusion or random walk process. The only difference is that instead of stopping when it hits either an upper or lower boundary, there is only one upper boundary that acts as the “finishing line”. In general, different accumulators may have different threshold levels.
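
In equation form, each accumulator \(i\) evolves according to the same small-step update rule we used for the diffusion model, except that it stops only when it crosses its single threshold \(a_i\): \[ x_i(t + \Delta t) = x_i(t) + \Delta x_i(t) \] where each evidence sample \(\Delta x_i(t)\) is drawn from a normal distribution with mean \(v_i \Delta t\) and standard deviation \(\sqrt{\Delta t}\). This is exactly the rnorm call you will see in the simulation code below.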

Thus, accumulator models instantiate the same four theoretical constructs as random walk/diffusion models: Decisions still depend on accumulating evidence regarding the available options. Accumulation continues until a threshold amount of evidence is reached, such that higher thresholds amount to greater response caution. Because thresholds can be set at different levels for different options, there is the possibility of response bias. Finally, we must acknowledge the ever-present residual time that contributes to observed response times.

8.1.1 A single accumulator

To begin, let’s simulate just a single accumulator. Although there are a number of ways to do this (and we will consider some variations later on), we will make a minor modification to the diffusion_sim function we wrote already. We will simply remove the lower boundary, so that the process will stop only when hitting its upper threshold. The revised function is shown below, along with comments that indicate the meaning of each of the function arguments.

Code
# Function arguments
# v: "drift rate", the average evidence in support of this option
# a: threshold level of evidence
# t0: residual time
# dt: how long each tiny time interval lasts
# t_max: optional "cut off" time to stop accumulating
accum_sim <- function(v = 0, a = 1, t0 = 0.2, dt = 0.01, t_max = Inf) {
    x <- 0
    t <- t0
    
    x_record <- x
    t_record <- t
    
    while (x < a & t < t_max) {
        x_sample <- rnorm(n = 1, mean = v * dt, sd = sqrt(dt))
        x <- x + x_sample
        t <- t + dt
        x_record <- c(x_record, x)
        t_record <- c(t_record, t)
    }
    
    return(data.frame(t = t_record, x = x_record))
}
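
As a quick check that the function behaves sensibly, here is a single simulated run (the parameter values are just for illustration). The final value of t is the finishing time for that trial, which is what we will care about most.

Code
# One simulated trajectory of a single accumulator
one_run <- accum_sim(v = 1, a = 2, t0 = 0.2)

tail(one_run, 3)  # the last few time steps
max(one_run$t)    # the finishing time for this trial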

Let’s see what the behavior of one of these single accumulators looks like. The chunk of code below simulates 1000 trials using a few different settings of the drift rate parameter v and the threshold parameter a. Note that because there is only one threshold and only one accumulator, there are no “choices” being made yet. We are just simulating how long it takes a single accumulator to reach its threshold.

Code
N_sims <- 1000

paramsToSim <- expand_grid(a = c(1, 2), v = c(1, 2, 3))

sim_rt_results <- c()

for (param_index in 1:nrow(paramsToSim)) {
    for (sim_index in 1:N_sims) {
        this_sim <- accum_sim(
            v = paramsToSim$v[param_index],
            a = paramsToSim$a[param_index]
        )
        
        sim_rt_results <- rbind(
            sim_rt_results,
            tibble(
                sim_index = sim_index,
                v = paramsToSim$v[param_index],
                a = paramsToSim$a[param_index],
                finishing_time = max(this_sim$t)
            )
        )
    }
}

# Plot conditional RT distributions
dens_plot <- sim_rt_results %>%
    ggplot(aes(x = finishing_time, color = v, linetype = factor(a), group = interaction(v, a))) +
    geom_density() +
    scale_color_gradient2(mid = "#bbbbbb") +
    coord_cartesian(xlim = c(0, 5)) +
    labs(x = "Finishing time", y = "Density", color = "Drift rate (v)", linetype = "Threshold (a)")

cdf_plot <- sim_rt_results %>%
    ggplot(aes(x = finishing_time, color = v, linetype = factor(a), group = interaction(v, a))) +
    stat_ecdf() +
    scale_color_gradient2(mid = "#bbbbbb") +
    coord_cartesian(xlim = c(0, 5), ylim = c(0, 1)) +
    labs(x = "Finishing time", y = "Cumulative", color = "Drift rate (v)", linetype = "Threshold (a)")

print(
    dens_plot + cdf_plot + plot_layout(nrow = 1, guides = "collect")
)

The code above plots two graphs. The left graph shows the probability density function (PDF) of the simulated finishing times for each combination of drift rate (v) and threshold (a); this is the probability of finishing at time \(t\). The right graph shows the cumulative distribution function (CDF) of the simulated finishing times; this is the probability of having finished at or before time \(t\). Both the PDF and the CDF will turn out to be important when we simulate a race between multiple accumulators.

For now, though, we can appreciate two things, which are fairly intuitive:

  • The greater the drift rate v, the faster the accumulator is to finish.
  • The greater the threshold a, the slower the accumulator is to finish.
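
These intuitions can be checked against theory: for a single accumulator like this one (with unit diffusion coefficient), the mean finishing time is \(a/v + t_0\), a standard property of the shifted Wald distribution that we will meet when we build the likelihood function. Here is a minimal sketch of the check, using assumed values \(v = 2\) and \(a = 1\):

Code
# Mean simulated finishing time should be close to a/v + t0 = 1/2 + 0.2 = 0.7
# (expect a little upward bias from the discrete time steps)
finishing_times <- replicate(1000, max(accum_sim(v = 2, a = 1, t0 = 0.2)$t))
mean(finishing_times)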

8.1.2 A race between independent accumulators

Now let’s introduce a second accumulator to the mix. To return to our visual search example, this would correspond to a search array with two items. For now, we will assume that each accumulator operates independently of the other, meaning that the evidence accumulated by each accumulator is not affected by the evidence level of the other. Shortly, we will relax this assumption in our simulations. Unfortunately, we will have to keep the independence assumption when we later write a function that computes the likelihood of a particular accumulator winning at a particular time. When accumulators can interact with one another, computing the likelihood is a lot harder and requires some bespoke numerical methods that are beyond the scope of this course.

For now, we need to modify our simulation code so that it treats x not as a single number, but as a vector in which each element holds the evidence level of one accumulator. Accordingly, we will also need to treat the v and a arguments as vectors, since they may differ between accumulators. That said, the first three lines of code in the function will repeat the values of v and a if necessary, so we can be lazy and provide just a single value if we want it to be equal across accumulators.
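
To see what that recycling idiom does in isolation, here is a standalone sketch (the values are assumed for illustration):

Code
v <- c(1, 0.4, 0.2, 0)
a <- 2
n_accum <- max(length(v), length(a))  # 4 accumulators

rep(a, n_accum)[1:n_accum]  # the single threshold becomes c(2, 2, 2, 2)
rep(v, n_accum)[1:n_accum]  # v is already full-length, so it is unchanged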

Code
# Function arguments
# v: "drift rate", the average evidence in support of each option
# a: threshold level of evidence for each accumulator
# t0: residual time
# dt: how long each tiny time interval lasts
# t_max: optional "cut off" time to stop accumulating
race_sim <- function(v = 0, a = 1, t0 = 0.2, dt = 0.01, t_max = Inf) {
    n_accum <- max(length(v), length(a))
    
    v <- rep(v, n_accum)[1:n_accum]
    a <- rep(a, n_accum)[1:n_accum]
    
    x <- rep(0, n_accum)
    t <- t0
    
    x_record <- x
    t_record <- t
    
    while (all(x < a) & t < t_max) {
        x_sample <- rnorm(n = n_accum, mean = v * dt, sd = sqrt(dt))
        x <- x + x_sample
        t <- t + dt
        x_record <- rbind(x_record, x)
        t_record <- c(t_record, t)
    }
    
    to_return <- cbind(t_record, x_record)
    
    colnames(to_return) <- c("t", paste0("x", 1:n_accum))
    
    return(as_tibble(to_return))
}

Note that each time we run the simulation, we get the trajectory of accumulated evidence for both accumulators, labeled x1 and x2.

Code
race_sim(v = c(1, 0), a = 1, t0 = 0.2)
# A tibble: 35 × 3
       t     x1       x2
   <dbl>  <dbl>    <dbl>
 1  0.2  0       0      
 2  0.21 0.0595 -0.114  
 3  0.22 0.141  -0.0747 
 4  0.23 0.282   0.00144
 5  0.24 0.392  -0.125  
 6  0.25 0.504  -0.0206 
 7  0.26 0.383  -0.0453 
 8  0.27 0.353  -0.0878 
 9  0.28 0.464  -0.221  
10  0.29 0.555  -0.162  
# ℹ 25 more rows

This makes for some lovely plots:

Code
race_sim(v = c(1, 0), a = 1, t0 = 0.2) %>%
    pivot_longer(matches("x(\\d+)"), names_to = "accumulator", values_to = "x") %>%
    ggplot(aes(x = t, y = x, color = accumulator)) +
    geom_line() +
    geom_hline(yintercept = 1, linetype = "dashed") +
    labs(x = "Time", y = "Accumulated evidence", color = "Accumulator")

We can also easily simulate situations with more than two options. For example, maybe there are four items in the search array. Say that item 1 is the search target, a red square. Item 2 is a red circle—it matches the target in color but not shape. Item 3 is a blue square—it matches the target in shape but not color. Finally, item 4 is a blue circle, which matches in neither color nor shape. If “evidence” reflects the degree of match between an item and the search target, then it would make sense for the drift rate for item 1 to be the greatest, the drift rate for item 4 to be the smallest, and the drift rates for items 2 and 3 to be intermediate (whether the drift rate for item 2 is greater than for item 3 may depend on how much attention is devoted to each feature). I picked some sensible-seeming values for those drift rates in the simulation below, also assuming the same threshold of a = 2 across all accumulators.

Code
race_sim(v = c(1, 0.4, 0.2, 0), a = 2, t0 = 0.2) %>%
    pivot_longer(matches("x(\\d+)"), names_to = "accumulator", values_to = "x") %>%
    ggplot(aes(x = t, y = x, color = accumulator)) +
    geom_line() +
    geom_hline(yintercept = 2, linetype = "dashed") +
    labs(x = "Time", y = "Accumulated evidence", color = "Accumulator")

Although the graph above is just one simulated trial, it gives a visual sense of how evidence accumulation works in a race model. It also shows how the option with the highest drift rate (accumulator x1) need not always win the race!

Of course, to get a sense of the full distribution of behavior this model predicts, we can return to our old friend, the Quantile-Probability plot. The code below simulates 1000 trials using the same parameter values as those used in the plot above.

Code
n_sims <- 1000

sim_results <- c()

for (i in 1:n_sims) {
    current_result <- race_sim(c(1, 0.4, 0.2, 0), a = 2, t0 = 0.2)
    
    # Extract just the choice and RT
    rt <- current_result$t[nrow(current_result)]
    choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
    
    # "Bind" the current simulation to the ongoing record of simulation results
    sim_results <- rbind(
        sim_results,
        tibble(
            sim_index = i,
            rt = rt,
            choice = choice
        )
    )
}

# Plot conditional RT distributions
sim_results %>%
    ggplot(aes(x = rt, color = factor(choice))) +
    geom_density() +
    labs(x = "Response time", y = "Frequency", color = "Choice", title = "Conditional RT distributions")

Code
# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(choice) %>%
    count() %>%
    ungroup() %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(choice)`

There are a couple of things worth noting in the simulations above. First, looking at the QP plot, it is clear that accumulators with higher drift rates tend to “win” the race more often—the target item (item 1) gets chosen more often than the other items. However, you may also have noticed that the RT distributions look pretty similar regardless of which item was actually chosen in the end. This is somewhat reminiscent of how error and correct RT distributions were the same in the diffusion/random walk models until we introduced trial-by-trial variability in drift rates.

The next set of simulations illustrates how we can model response bias by allowing the threshold to vary between accumulators. In the simulations below, I use four accumulators representing the factorial combination of two values of drift rate (v is either 1 or 0) and two values of threshold (a is either 1 or 2).

Code
n_sims <- 1000

sim_results <- c()

for (i in 1:n_sims) {
    current_result <- race_sim(c(1, 1, 0, 0), a = c(2, 1, 2, 1), t0 = 0.2)
    
    # Extract just the choice and RT
    rt <- current_result$t[nrow(current_result)]
    choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
    
    # "Bind" the current simulation to the ongoing record of simulation results
    sim_results <- rbind(
        sim_results,
        tibble(
            sim_index = i,
            rt = rt,
            choice = choice
        )
    )
}

# Plot conditional RT distributions
sim_results %>%
    ggplot(aes(x = rt, color = factor(choice))) +
    geom_density() +
    labs(x = "Response time", y = "Frequency", color = "Choice", title = "Conditional RT distributions")

Code
# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(choice) %>%
    count() %>%
    ungroup() %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(choice)`

The results show that, as one would expect, accumulators with lower thresholds (2 and 4) tend to win more quickly than those with higher thresholds (1 and 3). The simulations also illustrate a tendency that we are about to explore: accumulators with lower drift rates (3 and 4) tend to be associated with faster responses.

To get a better sense of what’s going on, let’s run another set of simulations. In the following, we assume only two items, a target and a distractor. The target will always have a drift rate of 1 but we will vary the drift rate associated with the distractor. This might happen, for example, if we make the distractor progressively more similar to the target. We will keep a threshold of 2 on both accumulators and residual time of 0.2.

Code
n_sims <- 1000

sim_results <- c()

for (distractor_drift in c(0, 0.5, 0.9)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, distractor_drift), a = 2, t0 = 0.2)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                distractor_drift = distractor_drift,
                sim_index = i,
                rt = rt,
                choice = choice
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    count() %>%
    group_by(distractor_drift) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("distractor_drift", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(distractor_drift, choice)`

As the drift rate for the distractor (item 2) increases, two things happen: The probability of choosing the distractor increases and response times decrease. Meanwhile, if you choose the distractor, you tend to do so a bit faster than when you pick the target.

Speeding up when the competition gets more heated—that seems a bit counterintuitive, no? However, it is a consequence of statistical facilitation (Raab, 1962), which is a general phenomenon exhibited by race models. It happens because, for an accumulator to “win” the race, it must have been faster than its competition. Therefore, when the distractor has a high drift rate, the target must be even faster in order to win.
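
We can see statistical facilitation in its purest form by comparing a lone accumulator to the winner of a race between two independent copies of that accumulator. Because the accumulators are independent, the winning time is simply the minimum of the individual finishing times (a minimal sketch, with assumed parameter values):

Code
# Independent finishing times for two identical accumulators
t1 <- replicate(1000, max(accum_sim(v = 1, a = 2, t0 = 0.2)$t))
t2 <- replicate(1000, max(accum_sim(v = 1, a = 2, t0 = 0.2)$t))

mean(t1)            # mean finishing time for a lone accumulator
mean(pmin(t1, t2))  # mean winning time in a two-accumulator race: faster!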

For the same reason, if we introduce more distractors—and therefore more “runners” in the race—the race model also produces faster responses. This is shown in the simulations below, which vary the number of distractors in the array, assuming each distractor has a drift rate of 0.1.

Code
n_sims <- 1000

sim_results <- c()

for (num_distractors in c(1, 3, 7)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, rep(0.1, num_distractors)), a = 2, t0 = 0.2)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                num_distractors = num_distractors,
                sim_index = i,
                rt = rt,
                # Note that I don't bother to keep track of which distractor was selected
                choice = factor(ifelse(choice == 1, "Target", "Distractor"), levels = c("Target", "Distractor"))
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(num_distractors, choice) %>%
    count() %>%
    group_by(num_distractors) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(num_distractors, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("num_distractors", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(num_distractors, choice)`

First, it is clear that introducing more distractors means the probability of selecting the target decreases, which makes intuitive sense. But paradoxically, making the decision harder and more error-prone by introducing more distractors has sped up the model. This is statistical facilitation once again—for any single accumulator to “win” the race it has to be faster than all of its competitors.

Finally, it is worth remarking once again that errors (i.e., selecting a distractor) take about the same amount of time as correct responses, regardless of the number of distractors.

8.1.3 Introducing competition

To enable the race model to be a bit more flexible—and potentially more psychologically realistic—we will introduce two kinds of competition between the accumulators. As noted above, some forms of competition do not permit easy computation of likelihoods, but they are still important from a theoretical standpoint.

8.1.3.1 Feedforward competition

The first kind of competition we introduce is feedforward competition. This mechanism treats the drift rate for an accumulator as something that can be inhibited by the drift rates for other accumulators.

To get mathy about it, let \(v_i\) stand for the drift rate for accumulator \(i\), assuming \(N\) total accumulators. Then the inhibited drift rate \(v_i'\) is

\[ v_i' = v_i - \alpha \sum_{j \neq i}^N v_j \]

where \(\sum_{j \neq i}^N v_j\) is the sum of the drift rates for all accumulators other than \(i\), and \(\alpha\) is a free parameter representing the strength of feedforward competition.

The code below introduces such a parameter, but calls it feedforward_comp instead of \(\alpha\). This parameter is used to define a new vector of drift rates v_comp which represents the drift rates v after being subject to feedforward competition.

Code
# Function arguments
# v: "drift rate", the average evidence in support of each option
# a: threshold level of evidence for each accumulator
# t0: residual time
# dt: how long each tiny time interval lasts
# t_max: optional "cut off" time to stop accumulating
# feedforward_comp: strength of feed-forward competition between accumulators
race_sim <- function(v = 0, a = 1, t0 = 0.2, dt = 0.01, t_max = Inf, feedforward_comp = 0) {
    n_accum <- max(length(v), length(a))
    
    v <- rep(v, n_accum)[1:n_accum]
    a <- rep(a, n_accum)[1:n_accum]
    
    v_comp <- rep(0, n_accum)
    
    for (i in 1:n_accum) {
        v_comp[i] <- v[i] - feedforward_comp * sum(v[-i])
    }
    
    x <- rep(0, n_accum)
    t <- t0
    
    x_record <- x
    t_record <- t
    
    while (all(x < a) & t < t_max) {
        x_sample <- rnorm(n = n_accum, mean = v_comp * dt, sd = sqrt(dt))
        x <- x + x_sample
        t <- t + dt
        x_record <- rbind(x_record, x)
        t_record <- c(t_record, t)
    }
    
    to_return <- cbind(t_record, x_record)
    
    colnames(to_return) <- c("t", paste0("x", 1:n_accum))
    
    return(as_tibble(to_return))
}
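
Before running full simulations, we can check by hand what the competition does to the drift rates. Here is a standalone sketch of the loop inside the function, using \(\alpha = 0.5\) and the four drift rates from our earlier search example:

Code
v <- c(1, 0.4, 0.2, 0)
feedforward_comp <- 0.5

v_comp <- rep(0, length(v))
for (i in seq_along(v)) {
    v_comp[i] <- v[i] - feedforward_comp * sum(v[-i])
}

v_comp  # c(0.7, -0.2, -0.5, -0.8)

Note that competition can push a drift rate below zero, which simply means that evidence for that option tends to drift away from its threshold.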

Let’s take the new model out for a spin! First, let’s repeat the simulations varying distractor strength, but now set feedforward_comp = 0.5.

Code
n_sims <- 1000

sim_results <- c()

for (distractor_drift in c(0, 0.5, 0.9)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, distractor_drift), a = 2, t0 = 0.2, feedforward_comp = 0.5)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                distractor_drift = distractor_drift,
                sim_index = i,
                rt = rt,
                choice = choice
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    count() %>%
    group_by(distractor_drift) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("distractor_drift", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(distractor_drift, choice)`

Adding feedforward competition now causes the model to slow down when the distractor is more similar to the target. This matches an intuition that applies to many tasks: a harder choice should take longer, on average. Feedforward competition produces this behavior because strong competitors with high drift rates reduce the drift rates for all accumulators.

What about increasing the number of options? The simulations below replicate our earlier simulations in which we varied the number of distractors. By introducing feedforward competition, each additional distractor now acts to suppress the drift rate for the target accumulator.

Code
n_sims <- 1000

sim_results <- c()

for (num_distractors in c(1, 3, 7)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, rep(0.1, num_distractors)), a = 2, t0 = 0.2, feedforward_comp = 0.5)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                num_distractors = num_distractors,
                sim_index = i,
                rt = rt,
                # Note that I don't bother to keep track of which distractor was selected
                choice = factor(ifelse(choice == 1, "Target", "Distractor"), levels = c("Target", "Distractor"))
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(num_distractors, choice) %>%
    count() %>%
    group_by(num_distractors) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(num_distractors, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("num_distractors", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(num_distractors, choice)`

As shown above, with feedforward competition between accumulators, adding more distractors slows responding, in keeping with intuition. That said, it is still the case that the race model predicts errors (selecting a distractor) to be faster than correct responses. Again, statistical facilitation is at work—the only way for a race model to make an error is for the “wrong” accumulator (i.e., one with a comparatively low drift rate) to be even faster than the “right” accumulator (i.e., one with a comparatively high drift rate).

8.1.3.2 Lateral competition

A different form of competition enables race models to predict slower errors than correct responses: lateral competition. Lateral competition occurs between accumulators during the accumulation process, unlike feedforward competition which only affects the “inputs” to the accumulators. With lateral competition, the level of evidence in one accumulator acts to suppress the level of evidence in other accumulators.

Mathematically, lateral competition enters into the equation that describes how evidence evolves from one moment in time to the next: \[ \overbrace{x_i(t + \Delta t)}^{\text{Updated evidence}} = \underbrace{x_i(t)}_{\text{Current accumulated evidence}} + \overbrace{\Delta x_i(t)}^{\text{Current sample of evidence}} - \underbrace{\beta \Delta t \sum_{j \neq i}^N x_j(t)}_{\text{Lateral competition}} \] As described in the equation above, during each interval of time, the evidence in accumulator \(i\) not only receives a new sample of evidence, it loses some evidence in proportion to how much evidence has accumulated in the other accumulators (accumulators \(j\), where \(j \neq i\)). Conceptually, lateral competition embodies the idea that having accumulated a lot of evidence for one option counts as evidence against picking the other options, encouraging a sort of “winner-take-all” competition. Metaphorically, we can think of lateral competition as sort of like the famous chariot race scene in Ben Hur, where instead of the competitors running in their own separate lanes, they are able to “jostle” with one another during the event.
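
To make the update rule concrete, here is the deterministic inhibition term for a single time step, leaving out the new evidence sample (the values of x, beta, and dt are assumed for illustration):

Code
x <- c(1.5, 0.3, -0.2)  # current evidence in three accumulators
beta <- 1
dt <- 0.01

# Total competition acting on each accumulator: the sum of the others' evidence
lat <- sapply(seq_along(x), function(i) sum(x[-i]))

x - beta * dt * lat  # evidence after inhibition (before the new sample)

Notice that the negative evidence in the third accumulator slightly reduces the competition experienced by the other two, foreshadowing behavior we will explore below.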

The chunk of code below amends our race model simulation to include lateral competition. It also includes a new function argument, min_x, which sets the minimum allowable value for accumulated evidence. By default, min_x = -Inf, such that evidence is allowed to be negative. However, we will see that allowing negative evidence results in some interesting (not necessarily incorrect!) behavior from the model.

Code
# Function arguments
# v: "drift rate", the average evidence in support of each option
# a: threshold level of evidence for each accumulator
# t0: residual time
# dt: how long each tiny time interval lasts
# t_max: optional "cut off" time to stop accumulating
# feedforward_comp: strength of feed-forward competition between accumulators
# lateral_comp: strength of lateral competition between accumulators
# min_x: accumulators are constrained to have at minimum *this much* evidence
race_sim <- function(v = 0, a = 1, t0 = 0.2, dt = 0.01, t_max = Inf, feedforward_comp = 0, lateral_comp = 0, min_x = -Inf) {
    n_accum <- max(length(v), length(a))
    
    v <- rep(v, n_accum)[1:n_accum]
    a <- rep(a, n_accum)[1:n_accum]
    
    v_comp <- rep(0, n_accum)
    
    for (i in 1:n_accum) {
        v_comp[i] <- v[i] - feedforward_comp * sum(v[-i])
    }
    
    x <- rep(0, n_accum)
    t <- t0
    
    x_record <- x
    t_record <- t
    
    while (all(x < a) & t < t_max) {
        # Compute the total amount of lateral competition for each accumulator
        lat <- rep(0, n_accum)
        for (i in 1:n_accum) {
            lat[i] <- sum(x[-i])
        }
        
        x_sample <- rnorm(n = n_accum, mean = v_comp * dt, sd = sqrt(dt))
        
        # Updated values account for lateral competition and use "pmax" to keep them above "min_x"
        x <- pmax(min_x, x + x_sample - dt * lateral_comp * lat)
        t <- t + dt
        x_record <- rbind(x_record, x)
        t_record <- c(t_record, t)
    }
    
    to_return <- cbind(t_record, x_record)
    
    colnames(to_return) <- c("t", paste0("x", 1:n_accum))
    
    return(as_tibble(to_return))
}

To see the effect of lateral competition, let’s take a look at some simulated trials. For comparison purposes, I have set R’s “random seed” to the same value prior to each simulation, which has the effect of making the sequence of random evidence samples the same for each simulated trial. This allows us to focus on the effects of lateral competition and of enforcing a minimum evidence level.

Code
set.seed(2)
baseline_sim <- race_sim(v = c(1, 0), a = 2, t0 = 0.2, lateral_comp = 0, min_x = -Inf)

set.seed(2)
lateral_sim <- race_sim(v = c(1, 0), a = 2, t0 = 0.2, lateral_comp = 1, min_x = -Inf)

set.seed(2)
lateral_min_sim <- race_sim(v = c(1, 0), a = 2, t0 = 0.2, lateral_comp = 1, min_x = 0)

all_sims <- rbind(
    baseline_sim %>% mutate(label = "No lateral competition,\nno minimum value"),
    lateral_sim %>% mutate(label = "Lateral competition,\nno minimum value"),
    lateral_min_sim %>% mutate(label = "Lateral competition,\nnonnegative evidence")
) %>%
    mutate(label = factor(label, levels = c("No lateral competition,\nno minimum value", "Lateral competition,\nno minimum value", "Lateral competition,\nnonnegative evidence")))

all_sims %>%
    pivot_longer(matches("x(\\d+)"), names_to = "accumulator", values_to = "x") %>%
    ggplot(aes(x = t, y = x, color = accumulator)) +
    geom_line() +
    geom_hline(yintercept = 2, linetype = "dashed") +
    geom_hline(yintercept = 0, linetype = "dotted") +
    facet_wrap("label", nrow = 1) +
    labs(x = "Time", y = "Accumulated evidence", color = "Accumulator")

Comparing the first two plots (with and without lateral competition), we can see that the effect of lateral competition is twofold: the “stronger” accumulator (x1) suppresses the “weaker” accumulator (x2), causing its evidence level to diminish over time, and the weakened competitor in turn inhibits x1 less, letting x1 reach threshold sooner. Comparing the second two plots, we can see that an accumulator with negative evidence can actually accelerate accumulation for its competitors—after all, subtracting a negative is a positive! When evidence is constrained to be non-negative (as in the third plot), that kind of acceleration is no longer possible.

It is worth noting that many models with lateral competition are inspired by the ways that individual neurons interact, making an analogy between the level of accumulated evidence and the amount of activity in a single neuron or a group of neurons (Purcell et al., 2010; Purcell et al., 2012; Teodorescu & Usher, 2013; Usher & McClelland, 2001). Because neural activity cannot be negative, most models that include some form of lateral competition also assume that the level of accumulated evidence cannot be negative, although this assumption is not strictly required.

Now let’s see the effect of lateral competition on the model’s predictions regarding distractor similarity and number of distractors. The simulations below set the lateral competition parameter to 1 and, in keeping with the majority of models with lateral competition, do not allow for negative evidence.

Code
n_sims <- 1000

sim_results <- c()

for (distractor_drift in c(0, 0.5, 0.9)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, distractor_drift), a = 2, t0 = 0.2, feedforward_comp = 0, lateral_comp = 1, min_x = 0)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                distractor_drift = distractor_drift,
                sim_index = i,
                rt = rt,
                choice = choice
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    count() %>%
    group_by(distractor_drift) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(distractor_drift, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("distractor_drift", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(distractor_drift, choice)`

With lateral inhibition, errors (choosing 2) are generally slower than correct responses (choosing 1). This is because errors tend to occur when the correct response has more competition from the error accumulators. Notice, though, that making the distractor stronger does not slow responding overall—it actually speeds responses. Statistical facilitation still occurs even in the presence of lateral competition.

But while increasing the strength of competition doesn’t necessarily slow responding, the simulations below show that increasing the number of competitors does slow responding if there is lateral competition. The more competitors in the race, the more lateral competition there is, slowing down all the accumulators.

Code
n_sims <- 1000

sim_results <- c()

for (num_distractors in c(1, 3, 7)) {
    for (i in 1:n_sims) {
        current_result <- race_sim(c(1, rep(0.1, num_distractors)), a = 2, t0 = 0.2, lateral_comp = 1, min_x = 0)
        
        # Extract just the choice and RT
        rt <- current_result$t[nrow(current_result)]
        choice <- which.max(current_result[nrow(current_result), 2:ncol(current_result)])
        
        # "Bind" the current simulation to the ongoing record of simulation results
        sim_results <- rbind(
            sim_results,
            tibble(
                num_distractors = num_distractors,
                sim_index = i,
                rt = rt,
                # Note that I don't bother to keep track of which distractor was selected
                choice = factor(ifelse(choice == 1, "Target", "Distractor"), levels = c("Target", "Distractor"))
            )
        )
    }
}

# Quantile-probability plot
sim_choice_p <- sim_results %>%
    group_by(num_distractors, choice) %>%
    count() %>%
    group_by(num_distractors) %>%
    mutate(p_resp = n / sum(n))

sim_rt_q <- sim_results %>%
    group_by(num_distractors, choice) %>%
    reframe(rt_q = quantile(rt, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))

full_join(sim_choice_p, sim_rt_q) %>%
    ggplot(aes(x = p_resp, y = rt_q, color = factor(choice))) +
    geom_point() +
    expand_limits(x = c(0, 1)) +
    facet_wrap("num_distractors", labeller = label_both) +
    labs(x = "Response proportion", y = "RT Quantile", color = "Choice", title = "Quantile-Probability Plot")
Joining with `by = join_by(num_distractors, choice)`

8.1.3.3 Summary

Feedforward competition diminishes the drift rates for each accumulator, with the result that stronger competition tends to slow responding overall. However, feedforward competition still generally predicts fast errors because errors occur when the accumulators for incorrect responses happen to be fast enough to “beat” the accumulator associated with the correct response.

Lateral competition does not always predict that stronger competition will slow responding. Lateral competition can produce slow errors because they arise in trials where the incorrect accumulators happened to be strong enough to impede the correct accumulator.

Note that the race model theory for slow errors is different from how slow errors were explained with a diffusion model. With a diffusion model, slow errors arose from trial-by-trial variability in the drift rate. One of your exercises will be to compare and contrast these two theories of slow errors.

8.2 Race model likelihood function

In the previous section, we built code to simulate a race model, where each accumulator was associated with a particular response. We will now write a function to calculate the negative log-likelihood (NLL) of a response at a particular time. This will enable us to fit a race model to data. The process of building the function will also illustrate the essential ingredients in any NLL function, namely

  1. Translating a mathematical expression of the likelihood into code.
  2. Deciding how to represent the data to be fit, along with any experimental factors relevant to the model.
  3. Extracting the values for the model’s parameters from a vector (this last step is necessary for using most parameter search algorithms).

8.2.1 Math to Code

In the previous section, we began by simulating just a single accumulator with a single boundary. There were three parameters describing the accumulator: the drift rate, the threshold, and the residual time. There is, in fact, a mathematical expression for the distribution of finishing times for this process: the shifted Wald distribution (also called the shifted inverse Gaussian distribution).
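
For reference, two standard properties of this distribution (with a unit diffusion coefficient, as in our simulations) are its mean and variance: \[ \mathbb{E}[T] = \frac{a}{v} + t_0, \qquad \mathrm{Var}[T] = \frac{a}{v^3} \] These match the simulation patterns we saw earlier: increasing \(v\) makes finishing times faster (and less variable), while increasing \(a\) makes them slower.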

As you can see in the Wikipedia entry, it is possible to write a mathematical expression for both the probability density function (PDF) (the probability of finishing at a particular time) and the cumulative distribution function (CDF) (the probability of finishing by a particular time). Since the Wikipedia entry doesn’t explicitly include residual time in its formulae, let’s write them out ourselves for our own reference.

8.2.1.1 Probability density function for a single accumulator

The PDF, describing the probability that accumulator \(i\) reaches its threshold at time \(t\), can be written \[ f_i \left( t \mid a_i, v_i, t_{0i} \right) = \frac{a_i}{\sqrt{2 \pi \left(t - t_{0i} \right)^3}} \exp \left\lbrace -\frac{\left[a_i - v_i \left(t - t_{0i} \right) \right]^2}{2 \left(t - t_{0i} \right)} \right\rbrace \] where \(a_i\) is the threshold, \(v_i\) is the drift rate, and \(t_{0i}\) is the residual time. While the expression above may look a bit cumbersome, it only involves elementary mathematical operations which are already implemented in R (and almost all programming languages). Therefore, we can write an R function that will compute the PDF for a single accumulator. That’s what I’ve done below. Note that the comments in the code explain the purpose of each line.

Code
accum_pdf <- function(t, a, v, t0) {
    # Because the term "t - t0" shows up multiple times in the formula, it's easier
    # to compute it once and then refer back to it whenever we need it.  It also
    # gives us a way to "clip" the values so they cannot be negative, since that
    # would otherwise give a weird result.
    t_minus_t0 <- pmax(0, t - t0)
    
    # The next line is a direct implementation of the formula above.
    pdf <- a * exp(-(a - v * t_minus_t0)^2 / (2 * t_minus_t0)) / sqrt(2 * pi * t_minus_t0^3)
    
    # This does some final error-checking, ensuring that there is zero probability
    # of finishing in zero (or negative) time.
    pdf[t_minus_t0 <= 0] <- 0
    
    # Finally, we use an explicit "return" statement to say exactly what the
    # function gives back.
    return(pdf)
}

Cool. Let’s take this function out for a spin!

Code
accum_pdf(
    t = 1,
    a = 2,
    v = 1,
    t0 = 0.2
)
[1] 0.4533567

Nice! But perhaps even more impressive is that the function will also accept vectors for its arguments, not just single numbers. This is very convenient for us because we will often want to compute the likelihood for many trials all at once. While we could use a for loop for that purpose, R code runs much faster if we can use vectors instead.

For example, here’s what we get if we want to compute the likelihood of a single accumulator finishing at a range of times:

Code
accum_pdf(
    t = c(0, 0.5, 1, 1.5, 2, 2.5, 3),
    a = 2,
    v = 1,
    t0 = 0.2
)
[1] 0.00000000 0.03930126 0.45335671 0.44583815 0.32674264 0.22431135 0.15190405

Notice that we get a vector back from the function, which is the likelihood for each finishing time in the vector we supplied for the t argument.

We can also supply vectors for the other arguments to the function if we want. This is illustrated in the example below.

Code
accum_pdf(
    t = c(0.5, 0.5, 0.5, 1, 1, 1, 2, 2, 2),
    a = c(2, 2, 1, 2, 2, 1, 2, 2, 1),
    v = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
    t0 = 0.2
)
[1] 0.0393012555 0.1851666937 2.3877559853 0.4533567093 1.0089639117
[6] 0.1637813117 0.3267426369 0.1622555916 0.0007628903

Just be aware that you can supply vectors with unequal lengths and R will “recycle” them until they are the length of the longest vector you supplied. This is generally a bad idea, though, because it can lead to all sorts of “silent” errors.

Code
accum_pdf(
    t = c(0.5, 1, 1.5, 2),
    a = c(2, 1),
    v = 1,
    t0 = 0.2
)
[1] 0.03930126 0.54377310 0.44583815 0.13829084

At least R will give you a warning if you supply vectors that can’t be recycled evenly.

Code
accum_pdf(
    t = c(0.5, 1, 1.5, 2),
    a = c(2, 1, 3),
    v = 1,
    t0 = 0.2
)
Warning in a - v * t_minus_t0: longer object length is not a multiple of
shorter object length
Warning in a * exp(-(a - v * t_minus_t0)^2/(2 * t_minus_t0)): longer object
length is not a multiple of shorter object length
[1] 0.03930126 0.54377310 0.26569371 0.32674264

But in general, you should adhere to the following rule: For each argument to the function, provide either a single number or a vector that is the same length as any other vectors you might supply.

One nice side-effect of being able to provide vectors to our accum_pdf function is that we can use it to make some nice plots of the distributions of finishing times for different combinations of parameter values:

Code
expand_grid(a = c(1, 2), v = c(1, 2, 3), t0 = 0.2, t = seq(0, 5, length.out = 201)) %>%
    mutate(pdf = accum_pdf(t = t, a = a, v = v, t0 = t0)) %>%
    ggplot(aes(x = t, y = pdf, color = v, linetype = factor(a), group = interaction(v, a))) +
    geom_line() +
    labs(x = "Time (s)", y = "PDF", color = "Drift rate (v)", linetype = "Threshold (a)")

8.2.1.2 Cumulative distribution function for a single accumulator

Just like there’s a mathematical expression for the PDF, we have one for the CDF too. As a reminder, the CDF is the probability that accumulator \(i\) reached threshold by time \(t\). The formula is given below, using the same parameters as the formula for the PDF above: \[ F_i \left( t \mid a_i, v_i, t_{0i} \right) = \Phi \left[ \frac{v_i \left( t - t_{0i} \right) - a_i}{\sqrt{t - t_{0i}}} \right] + \exp \left( 2 a_i v_i \right) \Phi \left[ -\frac{v_i \left( t - t_{0i} \right) + a_i}{\sqrt{t - t_{0i}}} \right] \] The only new thing in the CDF formula is \(\Phi \left( \cdot \right)\), which is a shorthand for the CDF of the standard normal distribution. Recall that the standard normal distribution has a mean of zero and a standard deviation of one. Fortunately, this is built into R in the form of the pnorm function. Although you can supply a mean and sd to pnorm, if you don’t, it will use default settings of mean = 0 and sd = 1, amounting to a standard normal distribution.

We can thus write an R function to calculate the expression in the formula above for the CDF. This function is shown below.

Code
accum_cdf <- function(t, a, v, t0) {
    # As with the pdf function, I compute "t - t0" once at the very beginning,
    # which also offers an opportunity to "clip" any negative values that result.
    t_minus_t0 <- pmax(0, t - t0)
    
    # This expression is a direct translation of the formula above
    cdf <- pnorm((v * t_minus_t0 - a) / sqrt(t_minus_t0)) + exp(2 * a * v) * pnorm(-(v * t_minus_t0 + a) / sqrt(t_minus_t0))
    
    # This formula doesn't give weird results when t_minus_t0 == 0, so we can just
    # return the result as-is.
    return(cdf)
}
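
Before moving on, it is worth a quick numerical sanity check (the parameter values are assumed): the CDF should approach 1 as \(t\) grows, and its numerical derivative should agree with accum_pdf, since the PDF is the derivative of the CDF.

Code
# Far out in the tail, the CDF should be essentially 1
accum_cdf(t = 100, a = 2, v = 1, t0 = 0.2)

# A central-difference derivative of the CDF should match the PDF
h <- 1e-4
(accum_cdf(t = 1 + h, a = 2, v = 1, t0 = 0.2) - accum_cdf(t = 1 - h, a = 2, v = 1, t0 = 0.2)) / (2 * h)
accum_pdf(t = 1, a = 2, v = 1, t0 = 0.2)  # should be close to the value above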

Like accum_pdf, accum_cdf can accept vectors for its arguments too (with the same caveat that you should be sure to supply only single values or vectors of the same length to avoid unpredictable “recycling” issues). We can thus make some very nice plots, just like we did for the PDF:

Code
expand_grid(a = c(1, 2), v = c(1, 2, 3), t0 = 0.2, t = seq(0, 5, length.out = 201)) %>%
    mutate(cdf = accum_cdf(t = t, a = a, v = v, t0 = t0)) %>%
    ggplot(aes(x = t, y = cdf, color = v, linetype = factor(a), group = interaction(v, a))) +
    geom_line() +
    expand_limits(y = c(0, 1)) +
    labs(x = "Time (s)", y = "CDF", color = "Drift rate (v)", linetype = "Threshold (a)")

8.2.1.3 Likelihood expression

Armed with functions for calculating the PDF and CDF for a single accumulator, we are now in a position to write a function that computes the likelihood of a particular accumulator winning at a particular time \(t\). We will first write the likelihood mathematically and then translate it into an R function, like we did for the PDF and CDF.

To understand the math, it may help to think conceptually first: if accumulator \(i\) is the first of \(N\) accumulators to reach its threshold at time \(t\), two things must be true. First, and most obvious, accumulator \(i\) must have hit its threshold at time \(t\). Second, and perhaps less obvious, none of the other accumulators can have hit their thresholds by time \(t\). The likelihood must therefore express the probability that both of these things are true.

There are two more conceptual ingredients we need to express the likelihood function: First, we need to know how to express the probability that a non-winning accumulator hasn’t hit threshold by time \(t\). Since the CDF is the probability that an accumulator has hit threshold by time \(t\), the probability that it hasn’t is just one minus the CDF, since “has already hit threshold” and “hasn’t already hit threshold” are mutually exclusive and exhaustive possibilities. Second, we have to remember that the joint probability of multiple independent events being true is the product of the individual probabilities.

Finally, we can express the likelihood that accumulator \(i\) is the first out of \(N\) accumulators to reach its threshold at time \(t\): \[ \overbrace{L \left(i, t \right)}^{\text{Likelihood that } i \text{ wins at } t} = \underbrace{f_i \left( t \mid a_i, v_i, t_{0i} \right)}_{i \text{ finishes at } t} \overbrace{\prod_{j \neq i}^N \left[1 - F_j \left(t \mid a_j, v_j, t_{0j} \right) \right]}^{\text{Nothing else has finished by } t} \]

It is worth emphasizing that the formula above only works when the accumulators are (conditionally) independent of one another. As a result, this likelihood cannot accommodate lateral competition or any other interaction between accumulators that would make them mutually dependent. It can, however, accommodate feedforward competition, since that affects the drift rates but not how the accumulators evolve from moment to moment. We will return to feedforward competition later in this section.

To translate the likelihood formula above into R code, we will first see how to compute it for a single trial. We will then generalize that approach so we can compute the likelihood for many trials, exploiting the fact that our accum_pdf and accum_cdf functions can accept vectors as arguments.

But first, let’s imagine a concrete situation: Say we present an array of three items in a visual search task. The target is a red square and the job of the participant is to select which of the three items is the target. The item in position 1 is a blue circle and so doesn’t resemble the target at all; let’s say it has a drift rate of 0. The item in position 2 is the target—let’s say it has a drift rate of 2. Finally, the item in position 3 is a red circle, matching the target’s color but not its shape—let’s say it has a drift rate of 1. Let’s assume that all accumulators have the same threshold value (2) and the same residual time (0.2). We can now write the parameters for all three accumulators as vectors, although we will be pedantic and write out the same value 3 times for the threshold and residual time:

Code
v <- c(0, 2, 1)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)

What is the likelihood of selecting the target (item 2) in 1 second? Based on the formula above, we can compute that likelihood in R using the accum_pdf and accum_cdf functions from earlier:

Code
v <- c(0, 2, 1)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)

accum_pdf(t = 1, a = a[2], v = v[2], t0 = t0[2]) *
    (1 - accum_cdf(t = 1, a = a[1], v = v[1], t0 = t0[1])) *
    (1 - accum_cdf(t = 1, a = a[3], v = v[3], t0 = t0[3]))
[1] 0.8481769

We are going to make the expression above a bit more general and more compact. First, rather than refer explicitly to a response time of 1, we will refer to a variable rt (which we can set to one):

Code
v <- c(0, 2, 1)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)

rt <- 1

accum_pdf(t = rt, a = a[2], v = v[2], t0 = t0[2]) *
    (1 - accum_cdf(t = rt, a = a[1], v = v[1], t0 = t0[1])) *
    (1 - accum_cdf(t = rt, a = a[3], v = v[3], t0 = t0[3]))
[1] 0.8481769

Second, we can use the prod function to multiply together the two terms at the end, making use of the fact that accum_cdf can accept vectors as arguments:

Code
v <- c(0, 2, 1)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)

rt <- 1

accum_pdf(t = rt, a = a[2], v = v[2], t0 = t0[2]) *
    prod(1 - accum_cdf(t = rt, a = a[c(1, 3)], v = v[c(1, 3)], t0 = t0[c(1, 3)]))
[1] 0.8481769

Third, instead of explicitly referring to individual accumulators, we can define a variable called response (and set it to 2), and use the “negative indexing trick” for the second term:

Code
v <- c(0, 2, 1)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)

rt <- 1
response <- 2

accum_pdf(t = rt, a = a[response], v = v[response], t0 = t0[response]) *
    prod(1 - accum_cdf(t = rt, a = a[-response], v = v[-response], t0 = t0[-response]))
[1] 0.8481769

Nice! The only problem now is that our code only applies to a single trial at a time. In the next section, we confront the issue of how to compute the likelihood for multiple trials.

8.2.2 Likelihood for multiple trials

In the previous section, we effectively translated a mathematical expression of the model into R code that performs the relevant computations to find the likelihood of a given response at a given time. To apply this code to an entire dataset, which contains many trials, we need to think about how to specify the details of each individual trial.

8.2.2.1 Characterizing each trial

Again, it will help to think about a concrete task like visual search. Any given trial involves a number of items, each of which can occupy a given position in the visual search array. For example, one trial might have 3 items with the target occupying the second position while another trial might have 6 items with the target occupying the fifth position. The distractors occupying the non-target positions could be of varying types depending on whether they are more or less similar to the target. Therefore, to characterize any individual trial in this experiment, we need to know, for each possible location that could contain an item, whether or not it does contain an item and, if so, what kind of item it is.

There are a few ways we could represent this information in an R data structure, but the simplest is in the form of a matrix. Each row of the matrix corresponds to a different trial in the experiment and each column corresponds to a different location that may or may not contain an item on that trial. We can give each type of item a numerical index: say, 1 for targets, 2 for similar distractors, and 3 for dissimilar distractors. To indicate that a location does not contain an item, we can give it the value NA. For example, the following matrix could describe four trials in a search experiment with six possible display locations:

Code
matrix(c(
    1, NA, NA,  2, NA, NA,
    2,  1,  2,  3,  3,  3,
    NA, NA, 3,  1,  3, NA,
    NA,  3, NA,  2, NA, 1
), nrow = 4, ncol = 6, byrow = TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1   NA   NA    2   NA   NA
[2,]    2    1    2    3    3    3
[3,]   NA   NA    3    1    3   NA
[4,]   NA    3   NA    2   NA    1

Alternatively, we could imagine a decision making task in which each trial presents three options selected from a set of six possible products:

Code
matrix(c(
    1, 2, 3,
    1, 4, 5,
    6, 4, 2,
    5, 2, 6
), nrow = 4, ncol = 3, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    1    4    5
[3,]    6    4    2
[4,]    5    2    6

These concrete examples suggest the general structure of the tasks to which a race model typically applies: Each trial presents a number of options, each drawn from a set of possible option types. Moreover, sometimes the ordering/location of the options may matter (as in visual search) and sometimes it may not (as in the decision task).

8.2.2.2 Defining model parameters

These examples also suggest what kind of parameters our model will need to have. Specifically, we will need to have a drift rate parameter for each type of item in the experiment. We could have a single threshold parameter that applies to all accumulators. Alternatively, we could allow different locations to have different thresholds, modeling the idea that there may be a bias in favor of choosing options in particular locations. Similarly, we could have a single residual time parameter for all accumulators or we could allow residual time to vary by location, allowing us to model situations in which processing begins earlier for some locations than others.
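
For instance, the two threshold schemes might look like this (an illustrative sketch for a six-location display; these particular vectors are not used later):

Code
# One threshold shared by all six locations...
a_shared <- rep(2, 6)

# ...versus location-specific thresholds, e.g., a lower threshold for
# location 1, modeling a bias toward choosing items at that location
a_biased <- c(1.5, 2, 2, 2, 2, 2)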

Let’s again make this concrete and return to the visual search experiment. We will keep things simple and assume just one threshold parameter and one residual time parameter apply to all accumulators, regardless of location. We will have three possible values of drift rate for three categories of item:

  1. Targets
  2. Similar distractors
  3. Dissimilar distractors

To compute the likelihood for each trial, we will need to refer to the corresponding row in the matrix and use the drift rates that correspond to the items shown on that trial.

Code
# Specify item types/locations for each trial
items <- matrix(c(
    1, NA, NA,  2, NA, NA,
    2,  1,  2,  3,  3,  3,
    NA, NA, 3,  1,  3, NA,
    NA,  3, NA,  2, NA, 1
), nrow = 4, ncol = 6, byrow = TRUE)

# Responses and response times for each trial
# Notice that responses refer to the *position* of the chosen item!
response <- c(1, 2, 4, 4)
rt <- c(0.6, 1.5, 0.7, 1.2)

n_trials <- nrow(items)
n_locations <- ncol(items)

# Model parameters
# One drift rate for each type of item
v <- c(2, 1, 0)
# One threshold for each possible location
a <- rep(2, n_locations)
# One residual time for each possible location
t0 <- rep(0.2, n_locations)

# Allocate vector that will eventually contain the negative log-likelihood
# for each trial
nll <- rep(0, n_trials)

# Calculate the log-likelihood for each trial
for (i in 1:n_trials) {
    # Note that the whole thing is wrapped in "-log()" so we get the negative log-likelihood
    # Note also the use of "na.rm = TRUE" in the "prod()" function to ignore any locations
    # that don't contain an item
    nll[i] <- -log(
        accum_pdf(t = rt[i], a = a[response[i]], v = v[items[i, response[i]]], t0 = t0[response[i]]) *
            prod(1 - accum_cdf(t = rt[i], a = a[-response[i]], v = v[items[i, -response[i]]], t0 = t0[-response[i]]), na.rm = TRUE)
    )
}

print(nll)
[1] 0.6611439 1.9408143 0.1954480 1.6747733

You’ll note that the NLL for each trial is calculated using a for loop at the end. There are ways to make this more compact; one small sketch follows, and true vectorization across trials is left as an exercise for the reader!
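
Here is that sketch: a more compact (though not necessarily faster) formulation using vapply, assuming the same objects (items, response, rt, v, a, t0) defined above:

Code
# Compute the per-trial NLLs with vapply instead of filling a
# preallocated vector in a loop
nll <- vapply(1:n_trials, function(i) {
    -log(
        accum_pdf(t = rt[i], a = a[response[i]], v = v[items[i, response[i]]], t0 = t0[response[i]]) *
            prod(1 - accum_cdf(t = rt[i], a = a[-response[i]], v = v[items[i, -response[i]]], t0 = t0[-response[i]]), na.rm = TRUE)
    )
}, numeric(1))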

8.2.3 Extracting parameters

We are almost there! We have code that will take a set of model parameters and a representation of the options on each trial and compute the negative log-likelihood of each response made on each trial. All that is left is to “wrap” this code into a function. However, we will need to define the arguments of this function in a particular way in order to get the function to play nicely with the parameter search functions we will use. Specifically, the function will need to take the following arguments:

  • The matrix that specifies the options on each trial; we called this items in the code above.
  • The vector of responses made on each trial; we called this response in the code above.
  • The vector of response times on each trial; we called this rt in the code above.
  • Finally, and this is the tricky part, we need a single vector that contains all the parameters of our model. As we will see, the “trick” is to extract the parameter values from this vector based on the names of the elements in the vector.

Let’s lay out the outline of our function before we get to the tricky part:

Code
race_nll <- function(par, response, rt, items) {
    n_trials <- nrow(items)
    n_locations <- ncol(items)
    
    # TRICKY PART HERE
    v <- ...
    a <- ...
    t0 <- ...
    
    nll <- rep(0, n_trials)
    
    for (i in 1:n_trials) {
        nll[i] <- -log(
            accum_pdf(t = rt[i], a = a[response[i]], v = v[items[i, response[i]]], t0 = t0[response[i]]) *
                prod(1 - accum_cdf(t = rt[i], a = a[-response[i]], v = v[items[i, -response[i]]], t0 = t0[-response[i]]), na.rm = TRUE)
        )
    }
    
    return(sum(nll))
}

To get a grasp on the problem we have to solve, it will help to have in mind a picture of what par looks like. It will be a named vector, meaning it will look something like this:

Code
par <- c(
    "v[1]" = 2,
    "v[2]" = 1,
    "v[3]" = 0,
    "a" = 2,
    "t0" = 0.2
)

We can use the names of the elements to extract them, like so:

Code
par["a"]
a 
2 

We can also get a bit clever by using the paste0 function:

Code
par[paste0("v[", 1:3, "]")]
v[1] v[2] v[3] 
   2    1    0 

Finally, we can check whether the vector contains an element with a particular name by using is.na: indexing a named vector with a name it does not contain returns NA, so is.na is TRUE exactly when the element does not exist:

Code
is.na(par["t0"])
   t0 
FALSE 
Code
is.na(par["t0[1]"])
<NA> 
TRUE 

The latter trick will allow us to make our race_nll function more general, because we can check to see whether the par vector contains, for example, several threshold parameters or only one. The whole shebang is illustrated in the complete function below:

Code
race_nll <- function(par, response, rt, items, n_items = max(items, na.rm = TRUE)) {
    n_trials <- nrow(items)
    n_locations <- ncol(items)
    
    if (is.na(par["v"])) {
        v <- par[paste0("v[", 1:n_items, "]")]
    } else {
        v <- rep(par["v"], n_items)
    }
    
    if (is.na(par["a"])) {
        a <- par[paste0("a[", 1:n_locations, "]")]
    } else {
        a <- rep(par["a"], n_locations)
    }
    
    if (is.na(par["t0"])) {
        t0 <- par[paste0("t0[", 1:n_locations, "]")]
    } else {
        t0 <- rep(par["t0"], n_locations)
    }
    
    nll <- rep(0, n_trials)
    
    for (i in 1:n_trials) {
        nll[i] <- -log(
            accum_pdf(t = rt[i], a = a[response[i]], v = v[items[i, response[i]]], t0 = t0[response[i]]) *
                prod(1 - accum_cdf(t = rt[i], a = a[-response[i]], v = v[items[i, -response[i]]], t0 = t0[-response[i]]), na.rm = TRUE)
        )
    }
    
    return(sum(nll))
}

For each parameter, we check whether the par vector contains an element with the bare parameter name (no index after it). If so, the corresponding value is “recycled” for each item/location. Otherwise, we assume that the par vector contains separate indexed values for each item/location and we extract those for later use. Notice that I added a function argument called n_items so we can specify how many different item types there could be; the default value (the largest number appearing in the items matrix) should work, but it could fail, for example, if an item type in the design never appears in this particular set of trials.
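
As a quick check, calling the function with a named par vector matching the parameter values we used earlier should reproduce the sum of the four per-trial NLL values printed above (about 4.472):

Code
par <- c(
    "v[1]" = 2,
    "v[2]" = 1,
    "v[3]" = 0,
    "a" = 2,
    "t0" = 0.2
)

race_nll(par = par, response = response, rt = rt, items = items)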

Before closing this section, you may have noticed something missing: a feedforward competition parameter. Including one is left as an exercise for the reader (see Exercise 3 below).

8.3 Fitting the race model

Finally, we have our race_nll function which we can use to fit the race model to data, that is, to find the parameters of the model that assign the highest likelihood to some set of data.

8.3.1 Simulating some data

To do that, we’ll first need some data! For example purposes, we can simulate some plausible data. Like we did with the diffusion model, the advantage of fitting simulated data is that we know the “ground truth” and can therefore see how well we can recover the parameters we used to generate the data.

The chunk of code below simulates data from a visual search experiment in which two factors are manipulated: set size and target-distractor similarity. Since we are not using different parameters for different locations, the target’s position makes no difference to the model, so I keep things simple and have the target always appear in location 1.

Code
v <- c(2, 1, 0)
a <- 2
t0 <- 0.2

n_trials_per_cond <- 50

set_size_vals <- c(2, 4, 8)

items <- c()
response <- c()
rt <- c()

for (set_size in set_size_vals) {
    for (td_sim in c("low", "high")) {
        distractor_type <- ifelse(td_sim == "low", 3, 2)
        
        for (i in 1:n_trials_per_cond) {
            items <- rbind(
                items,
                c(1, rep(distractor_type, set_size - 1), rep(NA, max(set_size_vals) - set_size))
            )
            
            current_result <- race_sim(v = c(v[1], rep(v[distractor_type], set_size - 1)), a = a, t0 = t0, dt = 0.001)
            
            # Extract just the choice and RT
            rt <- c(rt, current_result$t[nrow(current_result)])
            response <- c(response, which.max(current_result[nrow(current_result), 2:ncol(current_result)]))
        }
    }
}

In the end, we have our matrix specifying what is in each display:

Code
items
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
  [1,]    1    3   NA   NA   NA   NA   NA   NA
 [51,]    1    2   NA   NA   NA   NA   NA   NA
[101,]    1    3    3    3   NA   NA   NA   NA
[151,]    1    2    2    2   NA   NA   NA   NA
[201,]    1    3    3    3    3    3    3    3
[251,]    1    2    2    2    2    2    2    2
(output truncated: 300 rows in total, one per trial; each condition's row pattern repeats for 50 consecutive trials)

as well as our vectors for the response on each trial

Code
response
x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x2 x1 x1 
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  2  1  1 
x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x2 x1 x1 x1 x1 x1 x1 
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  2  1  1  1  1  1  1 
x1 x1 x1 x1 x1 x1 x1 x1 x2 x1 x1 x2 x2 x2 x1 x1 x1 x1 x2 x1 x1 x1 x1 x1 x1 x1 
 1  1  1  1  1  1  1  1  2  1  1  2  2  2  1  1  1  1  2  1  1  1  1  1  1  1 
x1 x1 x1 x1 x1 x1 x2 x2 x1 x2 x2 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 
 1  1  1  1  1  1  2  2  1  2  2  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
x1 x1 x2 x1 x1 x1 x3 x2 x1 x1 x2 x1 x1 x3 x1 x1 x2 x1 x1 x1 x2 x1 x1 x4 x1 x1 
 1  1  2  1  1  1  3  2  1  1  2  1  1  3  1  1  2  1  1  1  2  1  1  4  1  1 
x1 x1 x1 x1 x4 x1 x1 x1 x1 x1 x1 x1 x1 x1 x1 x4 x1 x1 x1 x1 x3 x4 x1 x3 x3 x3 
 1  1  1  1  4  1  1  1  1  1  1  1  1  1  1  4  1  1  1  1  3  4  1  3  3  3 
x3 x1 x4 x1 x2 x1 x2 x1 x2 x1 x1 x4 x1 x1 x4 x1 x3 x1 x2 x1 x2 x3 x2 x1 x1 x1 
 3  1  4  1  2  1  2  1  2  1  1  4  1  1  4  1  3  1  2  1  2  3  2  1  1  1 
x1 x2 x1 x1 x1 x3 x1 x2 x1 x1 x2 x2 x3 x1 x4 x1 x1 x1 x5 x1 x1 x1 x8 x1 x1 x1 
 1  2  1  1  1  3  1  2  1  1  2  2  3  1  4  1  1  1  5  1  1  1  8  1  1  1 
x1 x1 x1 x1 x4 x1 x1 x1 x1 x1 x1 x6 x3 x1 x1 x1 x1 x6 x1 x1 x1 x6 x8 x1 x1 x1 
 1  1  1  1  4  1  1  1  1  1  1  6  3  1  1  1  1  6  1  1  1  6  8  1  1  1 
x1 x1 x1 x1 x1 x1 x5 x5 x1 x8 x1 x1 x1 x1 x1 x2 x7 x7 x1 x2 x4 x1 x1 x1 x2 x1 
 1  1  1  1  1  1  5  5  1  8  1  1  1  1  1  2  7  7  1  2  4  1  1  1  2  1 
x8 x4 x1 x1 x7 x1 x1 x3 x1 x1 x1 x2 x1 x1 x7 x1 x4 x2 x1 x8 x8 x1 x4 x6 x7 x7 
 8  4  1  1  7  1  1  3  1  1  1  2  1  1  7  1  4  2  1  8  8  1  4  6  7  7 
x1 x2 x1 x3 x8 x1 x4 x1 x4 x5 x6 x2 x2 x1 
 1  2  1  3  8  1  4  1  4  5  6  2  2  1 

and the response time on each trial

Code
rt
  [1] 1.136 1.101 1.486 0.996 2.043 1.382 0.707 0.847 2.243 0.693 0.730 0.806
 [13] 0.802 1.124 1.344 1.546 2.010 1.841 0.872 0.842 1.963 0.713 1.208 1.815
 [25] 0.835 1.085 1.266 1.825 1.375 1.861 1.673 1.858 1.324 1.014 1.641 0.946
 [37] 1.185 1.218 1.597 1.051 0.693 1.790 0.637 0.715 0.562 2.067 2.230 1.848
 [49] 1.058 1.741 0.622 0.820 0.513 0.949 1.324 0.818 1.033 0.604 0.741 0.890
 [61] 0.635 0.671 0.612 1.055 0.878 1.054 1.021 1.444 0.844 1.069 0.843 1.347
 [73] 1.013 1.190 0.664 1.068 0.780 0.887 0.678 1.048 0.939 0.522 0.610 0.759
 [85] 0.928 1.490 1.495 0.810 1.180 1.248 0.948 1.064 0.696 1.127 0.703 1.098
 [97] 0.925 1.023 0.802 0.857 1.059 0.643 0.906 1.362 1.182 1.383 1.069 0.721
[109] 0.966 1.766 1.186 1.764 1.200 0.802 0.957 0.870 0.762 1.328 0.717 0.875
[121] 1.167 0.762 0.881 2.739 2.425 1.378 1.231 1.073 0.900 1.812 0.860 0.782
[133] 1.676 0.699 2.438 1.013 1.322 0.765 0.735 1.546 1.098 1.013 1.553 0.842
[145] 1.384 1.729 0.851 0.900 1.505 1.736 0.607 1.279 0.704 0.537 0.933 1.349
[157] 1.739 0.805 0.991 0.864 1.159 0.799 0.855 1.569 0.513 0.802 1.416 1.123
[169] 1.201 0.764 0.658 1.417 1.054 0.757 0.795 0.799 1.025 0.986 0.792 0.898
[181] 0.661 1.025 0.791 1.323 0.847 1.377 1.034 0.571 0.939 1.207 1.283 2.040
[193] 1.200 0.986 0.736 1.062 0.549 0.677 0.863 0.847 1.065 0.929 1.000 0.792
[205] 0.774 0.823 0.997 0.831 0.938 0.968 1.329 1.726 1.359 0.555 0.770 1.358
[217] 0.492 1.361 0.806 1.713 1.617 1.124 0.846 1.028 0.494 1.828 0.637 1.457
[229] 1.964 0.636 1.435 0.970 0.652 0.755 1.287 0.801 1.091 0.698 1.575 0.613
[241] 0.777 0.717 0.640 1.480 0.759 0.567 1.047 0.693 1.204 0.756 0.644 0.944
[253] 0.662 1.048 0.969 0.692 1.024 0.990 0.561 0.752 0.941 0.657 0.672 1.156
[265] 0.647 1.046 0.683 0.583 0.699 0.977 0.823 0.698 0.536 0.888 0.840 0.882
[277] 0.694 0.848 0.589 0.586 0.880 0.616 1.350 0.755 0.728 1.024 0.574 1.259
[289] 1.166 0.738 0.841 0.621 0.761 0.611 0.908 1.407 1.049 1.232 0.754 1.079

These will be the data to which we will fit our race model.

8.3.2 Fitting parameters

Finally, we are in a position to fit the race model to our (simulated) data. We will use R’s built-in nlminb function for this purpose, which operates much like the optim function. The reason we are using nlminb here is that our model parameters should be bounded, i.e., restricted to fall within particular ranges, and nlminb is a bit less clunky than optim in this situation.

Generally speaking, calling nlminb will look like this:

Code
nlminb(
    # A vector of initial values that are the "starting point" for the search
    start = ___,
    # The function that we want to optimize (i.e., get to be as small as possible)
    objective = ___,
    # Lower and upper bounds on model parameters (these can be infinite)
    lower = ___, upper = ___,
    # Other options to set
    control = list(___),
    # Other arguments which will be conveyed to the `objective` to be optimized
    ...
)

It is clear that we first need an initial set of parameter values. Our initial guess does not need to be very good, but it should at least be sensible. We will fit the model assuming different drift rates for each item type and a single threshold and residual time:

Code
start <- c(
    "v[1]" = 0,
    "v[2]" = 0,
    "v[3]" = 0,
    "a" = 1,
    "t0" = 0
)

The values above were chosen because they are typical of what we might use as a “default” when fitting real data, where we don’t know what the true values might be. We also need to specify the lower and upper bounds on each parameter; these bounds are also given as named vectors:

Code
lower <- c(
    "v[1]" = -Inf, # Drift rates allowed to be negative
    "v[2]" = -Inf,
    "v[3]" = -Inf,
    "a" = 0,       # Thresholds must be nonnegative, since accumulators start at zero
    "t0" = 0       # Minimum residual time is zero
)

upper <- c(
    "v[1]" = Inf, # Drift rates and threshold can be as high as they need
    "v[2]" = Inf,
    "v[3]" = Inf,
    "a" = Inf,
    "t0" = min(rt) # Residual time cannot be larger than the smallest RT
)

Finally, we have everything we need to fit the model:

Code
fit <- nlminb(
    start = start,
    objective = race_nll,
    lower = lower, upper = upper,
    # Other options to set ("eval.max" is the most common one to play with)
    control = list(eval.max = 1000),
    rt = rt,
    response = response,
    items = items
)

Now let’s open our present:

Code
fit
$par
      v[1]       v[2]       v[3]          a         t0 
2.02086803 0.95210575 0.02240357 1.98372419 0.20802590 

$objective
[1] 350.1683

$convergence
[1] 0

$iterations
[1] 44

$evaluations
function gradient 
      54      242 

$message
[1] "relative convergence (4)"

As we can see, the fit we got back from the nlminb function is actually a list. fit$par is a named vector giving us the best-fitting parameters, and fit$objective is the best (smallest) value of the NLL that the search found. We also get some additional information, the most pertinent of which is fit$convergence: if this value is 0, we are probably okay. Be sure to check out the help page for nlminb for more information and examples!

For now, it is worth noting that the estimated values of the model parameters are pretty close to those we actually used to simulate the data.
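
Since we simulated the data ourselves, we can line the estimates up directly against the generating values (v = 2, 1, 0; a = 2; t0 = 0.2 from the simulation code above):

Code
rbind(
    estimated = fit$par,
    generating = c(2, 1, 0, 2, 0.2)
)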

8.3.3 Model comparison

Recall that our simulated data came from a visual search task in which distractors were allowed to vary in how similar they were to the target. Specifically, there were high-similarity distractors (with drift rates of 1) and low-similarity distractors (with drift rates of 0). In a real research context, we might be interested in testing the hypothesis that the two distractor types have different drift rates. We would address that question by fitting the model we used above, which allows different drift rates for the two types, and also fitting a model that allows only a single drift rate parameter for all distractors. We would then use AIC and BIC to decide whether the additional drift rate parameter improved the model fit enough to justify including it.

Let’s see how that would go. How can we implement a model that assigns a single drift rate to both types of distractors? We will create a modified version of the items matrix that replaces all the 3s with 2s. We can then fit the model using this modified version. The effect of this will be to apply the same drift rate (which will be v[2] in our simpler model) to both types of foils. The code below accomplishes this.

Code
# Create simpler version of the items matrix
items_simple <- items
# Replace all 3's with 2's
items_simple[items_simple == 3] <- 2

start_simple <- c(
    "v[1]" = 0,
    "v[2]" = 0,
    "a" = 1,
    "t0" = 0
)

lower_simple <- c(
    "v[1]" = -Inf, # Drift rates allowed to be negative
    "v[2]" = -Inf,
    "a" = 0,       # Thresholds must be nonnegative, since accumulators start at zero
    "t0" = 0       # Minimum residual time is zero
)

upper_simple <- c(
    "v[1]" = Inf, # Drift rates and threshold can be as high as they need
    "v[2]" = Inf,
    "a" = Inf,
    "t0" = min(rt) # Residual time cannot be larger than the smallest RT
)

fit_simple <- nlminb(
    start = start_simple,
    objective = race_nll,
    lower = lower_simple, upper = upper_simple,
    control = list(eval.max = 1000),
    rt = rt,
    response = response,
    items = items_simple
)

And here’s the result:

Code
fit_simple
$par
     v[1]      v[2]         a        t0 
1.7622345 0.1271764 1.6013550 0.2934703 

$objective
[1] 374.5381

$convergence
[1] 0

$iterations
[1] 26

$evaluations
function gradient 
      38      121 

$message
[1] "relative convergence (4)"

As before, we can compute AIC for each model:

Code
2 * fit$objective + 2 * length(fit$par)
[1] 710.3365
Code
2 * fit_simple$objective + 2 * length(fit_simple$par)
[1] 757.0762

as well as BIC

Code
2 * fit$objective + log(length(rt)) * length(fit$par)
[1] 728.8554
Code
2 * fit_simple$objective + log(length(rt)) * length(fit_simple$par)
[1] 771.8914

On both counts, the original model (fit) is preferred, consistent with the fact that we simulated the two distractor types with different drift rates.
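
Since we will often compare models this way, it can be handy to wrap the two formulas in a small helper function (a sketch; ic is a new name, fit is any nlminb result, and n_obs is the number of trials):

Code
ic <- function(fit, n_obs) {
    k <- length(fit$par)
    c(AIC = 2 * fit$objective + 2 * k,
      BIC = 2 * fit$objective + log(n_obs) * k)
}

ic(fit, length(rt))        # full model
ic(fit_simple, length(rt)) # single distractor drift rate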

8.4 Decision rules

One final thing to note is that the race models we have explored so far assume that responses are based on whichever accumulator wins the race. This represents a decision rule that is self-terminating or minimum time. Alternatively, one could imagine tasks in which decisions could only be made once all accumulators had finished, amounting to an exhaustive or maximum time decision rule.
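
For independent accumulators, the exhaustive rule is easy to express with the functions we already have: the decision time is the maximum of the accumulators’ finishing times, so its CDF is the product of the individual CDFs. A minimal sketch for three accumulators, reusing accum_cdf from earlier:

Code
# Probability that *all three* accumulators have finished by t = 1
# (CDF of the exhaustive/maximum-time rule, assuming independence)
t <- 1
v <- c(2, 1, 0)
a <- c(2, 2, 2)
t0 <- c(0.2, 0.2, 0.2)
prod(accum_cdf(t = t, a = a, v = v, t0 = t0))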

For example, if the visual search task required participants not to pick which item was the target, but to decide whether or not a target was present at all, this would imply different decision rules for different responses. Responding that a target was present could rely on a self-terminating rule: if any accumulator reaches threshold, that could cause a participant to decide that there was a target in the display. On the other hand, saying that there were no targets in the display would require exhaustively processing each item to verify that it was not a target. This latter task constraint could imply that each accumulator has both upper and lower bounds, where the lower bound corresponds to a decision that the item is not a target. Alternatively, each item might be associated with two accumulators, one that accumulates evidence that the item is a target and another that accumulates evidence that it is not. Finally, one could retain the same racing diffusion model we’ve been developing but include an additional “no target” accumulator that races against the accumulators for each item; if the “no target” accumulator wins, the participant decides that none of the items were targets.

The point in laying out these possibilities is that, once we consider models with many possibly overlapping and interacting processes, we must also think carefully about how the results of those processes translate into observed behavior. There are many different ways of processing and combining information from multiple sources, each of which could lead to a different model of the task. The value of modeling is that this cornucopia of possibilities need not be overwhelming—we can build models that instantiate these different possibilities and see which best account for our data. There are even methods, known as Systems Factorial Technology (Harding et al., 2016; Houpt et al., 2014; Townsend & Nozawa, 1995), that are specifically designed to obtain data to distinguish between different models in these situations.

8.5 Excursus: Accumulators with negative drift rates

You may have noticed above that, when we fit our race model, we allowed the drift rates for accumulators to be negative. When we were working with diffusion models, this was no big deal—a negative drift rate just means that samples of evidence are more likely to favor the choice associated with the lower bound than the upper bound. The accumulators in our race model, though, do not have a lower bound and so an accumulator with a negative drift rate could actually keep going forever without reaching threshold.

Mathematically, we can actually compute how often that would happen. This involves taking a look at the formula for the CDF. Since the CDF is the probability that the accumulator has reached threshold by time \(t\), we can find the probability that the accumulator ever reaches threshold by taking the limit of the CDF as \(t\) approaches infinity. As illustrated below, when the drift rate \(v_i \geq 0\), this limit equals 1, meaning with nonnegative drift rates the accumulator will eventually reach threshold (though it may take a long time). When the drift rate \(v_i < 0\), though, the limit is \(\exp \left( 2 a_i v_i \right)\).

\[\begin{align} \lim_{t \rightarrow \infty} F_i \left( t \mid a_i, v_i, t_{0i} \right) & = \begin{cases} \Phi \left( \infty \right) + \exp \left( 2 a_i v_i \right) \Phi \left( -\infty \right) & \text{if } v_i > 0 \\ \Phi \left( 0 \right) + \exp \left( 0 \right) \Phi \left( 0 \right) & \text{if } v_i = 0 \\ \Phi \left( -\infty \right) + \exp \left( 2 a_i v_i \right) \Phi \left( \infty \right) & \text{if } v_i < 0 \end{cases} \\ & = \begin{cases} 1 + \exp \left( 2 a_i v_i \right) \times 0 & \text{if } v_i > 0 \\ \frac{1}{2} + 1 \times \frac{1}{2} & \text{if } v_i = 0 \\ 0 + \exp \left( 2 a_i v_i \right) \times 1 & \text{if } v_i < 0 \end{cases} \\ & = \begin{cases} 1 & \text{if } v_i \geq 0 \\ \exp \left( 2 a_i v_i \right) & \text{if } v_i < 0 \end{cases} \end{align}\]
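
We can check this limit numerically (a quick sketch, assuming accum_cdf implements the CDF above): with \(a_i = 2\) and \(v_i = -0.5\), the asymptote is \(\exp \left( 2 \times 2 \times -0.5 \right) = \exp(-2) \approx 0.135\), and evaluating the CDF at a very large t should come out close to that value:

Code
a <- 2
v <- -0.5
exp(2 * a * v)                            # theoretical asymptote, about 0.135
accum_cdf(t = 1e4, a = a, v = v, t0 = 0)  # should be close to the asymptote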

This behavior is illustrated in the graphs below. In particular, note the asymptotic level of the CDF functions in the right set of plots as the drift rate grows more negative.

Code
toPlot <- expand_grid(t = seq(0, 3, length.out = 101), a = c(1, 2), v = seq(-3, 3, length.out = 7)) %>%
    mutate(threshold_factor = factor(a, levels = sort(unique(a)), labels = paste("Threshold =", sort(unique(a))))) %>%
    mutate(d = accum_pdf(t = t, a = a, v = v, t0 = 0)) %>%
    mutate(p = accum_cdf(t = t, a = a, v = v, t0 = 0))

pdf_plot <- toPlot %>%
    ggplot(aes(x = t, y = d, color = v, group = factor(v))) +
    geom_line() +
    scale_color_gradient2(mid = "#bbbbbb") +
    facet_wrap("threshold_factor", ncol = 1) +
    labs(x = "Time (s)", y = "Accumulator PDF", color = "Drift rate")

cdf_plot <- toPlot %>%
    ggplot(aes(x = t, y = p, color = v, group = factor(v))) +
    geom_line() +
    scale_color_gradient2(mid = "#bbbbbb") +
    expand_limits(y = c(0, 1)) +
    facet_wrap("threshold_factor", ncol = 1) +
    labs(x = "Time (s)", y = "Accumulator CDF", color = "Drift rate")

print(
    pdf_plot + cdf_plot + plot_layout(guides = "collect", nrow = 1)
)

If there were only one accumulator involved, we would generally not want to allow for negative drift rates, since this would predict situations in which someone literally never responded. (To my knowledge, no one has reported such a result, but maybe we just haven’t waited long enough.) In models with multiple accumulators, though, all that matters is that at least one accumulator has a nonnegative drift rate—that way, there will always be a winner that will initiate a response. Indeed, in models with feedforward competition, it may be reasonable to allow that competition to result in negative drift rates, since this would effectively suppress responses associated with those accumulators.
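
To put a number on this: by the limit derived above, a single accumulator with negative drift never finishes with probability \(1 - \exp \left( 2 a_i v_i \right)\), so for independent accumulators the probability that no response ever occurs is the product of those values. A quick sketch for two accumulators that both have negative drifts:

Code
a <- c(2, 2)
v <- c(-0.5, -1)
# Probability that neither accumulator ever reaches threshold
prod(1 - exp(2 * a * v))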

The larger point is that negative drift rates need not be excluded outright. There may be theoretical reasons to allow for negative drift rates. Moreover, in models that allow for negative drift rates, the optimization algorithm will tend to avoid combinations of parameter values that result in too many negative drift rates since these would be associated with a low likelihood of making a response.

8.6 Exercises

  1. Consider potential psychological interpretations of feedforward inhibition and lateral inhibition. In particular, see if you can relate these two forms of competition to the constructs of attention and capacity.
  2. As noted above, diffusion models explain slow errors in terms of trial-by-trial variability in drift rates, while accumulator models explain slow errors in terms of lateral competition between options. Compare and contrast these two explanations. Are there circumstances in which one explanation may be more plausible than the other? Would it be possible to ascribe trial-by-trial variability to lateral competition?
  3. As noted in the chapter, it is possible to calculate the negative log-likelihood for the race model including feedforward competition. Implement this mechanism in the race_nll function. To get you started, the chunk of code below contains comments that suggest how this might be done.
Code
race_nll <- function(par, response, rt, items, n_items = max(items, na.rm = TRUE)) {
    n_trials <- nrow(items)
    n_locations <- ncol(items)
    
    if (is.na(par["v"])) {
        v <- par[paste0("v[", 1:n_items, "]")]
    } else {
        v <- rep(par["v"], n_items)
    }
    
    if (is.na(par["a"])) {
        a <- par[paste0("a[", 1:n_locations, "]")]
    } else {
        a <- rep(par["a"], n_locations)
    }
    
    if (is.na(par["t0"])) {
        t0 <- par[paste0("t0[", 1:n_locations, "]")]
    } else {
        t0 <- rep(par["t0"], n_locations)
    }
    
    # Similar to the above, you can check whether a feedforward competition
    # parameter is or is not included; if not, you'll want to set the value
    # of feedforward to zero.
    if (is.na(par["feedforward"])) {
        feedforward <- ...
    } else {
        feedforward <- ...
    }
    
    nll <- rep(0, n_trials)
    
    for (i in 1:n_trials) {
        # Since drift rates depend on the items presented on each trial, we'll need to calculate them
        trial_v <- v[items[i, ]]
        
        for (j in 1:length(trial_v)) {
            trial_v[j] <- ...
        }
        
        # Note that the drift rates now refer to "trial_v"
        nll[i] <- -log(
            accum_pdf(t = rt[i], a = a[response[i]], v = trial_v[response[i]], t0 = t0[response[i]]) *
                prod(1 - accum_cdf(t = rt[i], a = a[-response[i]], v = trial_v[-response[i]], t0 = t0[-response[i]]), na.rm = TRUE)
        )
    }
    
    return(sum(nll))
}
  4. The Leaky Competing Accumulator (LCA) model was proposed by Usher & McClelland (2001). In addition to lateral competition, the LCA (as the name implies) assumes that evidence “leaks” out of accumulators at a constant rate. Take a look at their paper (specifically equation 4 on page 559) and modify our race model simulation code to include a leakage parameter. Describe how you did this (and share your code too!).
  5. An extension to the racing diffusion model explored in this chapter was proposed by Tillman et al. (2020). Their model allows for trial-by-trial variability in the thresholds of the accumulators. Their Equation 5 and Appendix A provide formulae for the PDF and CDF of the resulting accumulators. Translate these formulae into R code that implements their extended racing diffusion model.
  6. The accumulator models we explored in this chapter are all stochastic accumulators, since the value of the evidence being accumulated fluctuates randomly around a mean. Brown & Heathcote (2008) proposed a Linear Ballistic Accumulator (LBA) model which only has between-trial variability but is otherwise deterministic. Despite this unrealistic assumption, the model fits data well and often leads to conclusions similar to those drawn from stochastic models of evidence accumulation (Donkin et al., 2011). Referring to the equations provided by Brown & Heathcote (2008) for the PDF and CDF of their Linear Ballistic Accumulators, implement their model.