Given a model (a list of segment formulas), mcp infers the posterior distributions of the parameters of each segment as well as the change points between segments. See more details and worked examples on the mcp website. All segments must regress on the same x-variable. Change points are forced to be ordered using truncation of the priors. You can run fit = mcp(model, sample=FALSE) to avoid sampling and the need for data if you just want to get the priors (fit$prior), the JAGS code fit$jags_code, or the R function to simulate data (fit$simulate).

mcp(
  model,
  data = NULL,
  prior = list(),
  family = gaussian(),
  par_x = NULL,
  sample = "post",
  cores = 1,
  chains = 3,
  iter = 3000,
  adapt = 1500,
  inits = NULL,
  jags_code = NULL
)

Arguments

model

A list of formulas - one for each segment. The first formula has the format response ~ predictors while the following formulas have the format response ~ changepoint ~ predictors. The response and change points can be omitted (changepoint ~ predictors assumes same response. ~ predictors assumes an intercept-only change point). The following can be modeled:

  • Regular formulas: e.g., ~ 1 + x). Read more.

  • Extended formulas:, e.g., ~ I(x^2) + exp(x) + sin(x). Read more.

  • Variance: e.g., ~sigma(1) for a simple variance change or ~sigma(rel(1) + I(x^2))) for more advanced variance structures. Read more

  • Autoregression: e.g., ~ar(1) for a simple onset/change in AR(1) or ar(2, 0 + x) for an AR(2) increasing by x. Read more

data

Data.frame or tibble in long format.

prior

Named list. Names are parameter names (cp_i, int_i, xvar_i, `sigma``) and the values are either

  • A JAGS distribution (e.g., int_1 = "dnorm(0, 1) T(0,)") indicating a conventional prior distribution. Uninformative priors based on data properties are used where priors are not specified. This ensures good parameter estimations, but it is a questionable for hypothesis testing. mcp uses SD (not precision) for dnorm, dt, dlogis, etc. See details. Change points are forced to be ordered through the priors using truncation, except for uniform priors where the lower bound should be greater than the previous change point, dunif(cp_1, MAXX).

  • A numerical value (e.g., int_1 = -2.1) indicating a fixed value.

  • A model parameter name (e.g., int_2 = "int_1"), indicating that this parameter is shared - typically between segments. If two varying effects are shared this way, they will need to have the same grouping variable.

  • A scaled Dirichlet prior is supported for change points if they are all set to cp_i = "dirichlet(N) where N is the alpha for this change point and N = 1 is most often used. This prior is less informative about the location of the change points than the default uniform prior, but it samples less efficiently, so you will often need to set iter higher. It is recommended for hypothesis testing and for the estimation of more than 5 change points. Read more.

family

One of gaussian(), binomial(), bernoulli(), or poission(). Only default link functions are currently supported.

par_x

String (default: NULL). Only relevant if no segments contains slope (no hint at what x is). Set this, e.g., par_x = "time".

sample

One of

  • "post" (default): Sample the posterior.

  • "prior": Sample only the prior. Plots, summaries, etc. will use the prior. This is useful for prior predictive checks.

  • "both": Sample both prior and posterior. Plots, summaries, etc. will default to using the posterior. The prior only has effect when doing Savage-Dickey density ratios in hypothesis.

  • "none" or FALSE: Do not sample. Returns an mcpfit object without sample. This is useful if you only want to check prior strings (fit$prior), the JAGS model (fit$jags_code), etc.

cores

Positive integer or "all". Number of cores.

  • 1: serial sampling (Default). options(mc.cores = 3) will dominate cores = 1 but not larger values of cores.

  • >1: parallel sampling on this number of cores. Ideally set chains to the same value. Note: cores > 1 takes a few extra seconds the first time it's called but subsequent calls will start sampling immediately.

  • "all": use all cores but one and sets chains to the same value. This is a convenient way to maximally use your computer's power.

chains

Positive integer. Number of chains to run.

iter

Positive integer. Number of post-warmup samples to draw. The number of iterations per chain is iter/chains.

adapt

Positive integer. Also sometimes called "burnin", this is the number of samples used to reach convergence. Set lower for greater speed. Set higher if the chains haven't converged yet or look at tips, tricks, and debugging.

inits

A list if initial values for the parameters. This can be useful if a model fails to converge. Read more in jags.model. Defaults to NULL, i.e., no inits.

jags_code

String. Pass JAGS code to mcp to use directly. This is useful if you want to tweak the code in fit$jags_code and run it within the mcp framework.

Value

An mcpfit object.

Details

Notes on priors:

  • Order restriction is automatically applied to cp_\* parameters using truncation (e.g., T(cp_1, )) so that they are in the correct order on the x-axis UNLESS you do it yourself. The one exception is for dunif distributions where you have to do it as above.

  • In addition to the model parameters, MINX (minimum x-value), MAXX (maximum x-value), SDX (etc...), MINY, MAXY, and SDY are also available when you set priors. They are used to set uninformative default priors.

  • Use SD when you specify priors for dt, dlogis, etc. JAGS uses precision but mcp converts to precision under the hood via the sd_to_prec() function. So you will see SDs in fit$prior but precision ($1/SD^2) in fit$jags_code

See also

Examples

# \donttest{
# Define the segments using formulas. A change point is estimated between each formula.
model = list(
  response ~ 1,  # Plateau in the first segment (int_1)
  ~ 0 + time,    # Joined slope (time_2) at cp_1
  ~ 1 + time     # Disjoined slope (int_3, time_3) at cp_2
)

# Fit it. The `ex_demo` dataset is included in mcp. Sample the prior too.
# options(mc.cores = 3)  # Uncomment to speed up sampling
ex_fit = mcp(model, data = ex_demo, sample = "both")

# See parameter estimates
summary(ex_fit)

# Visual inspection of the results
plot(ex_fit)  # Visualization of model fit/predictions
plot_pars(ex_fit)  # Parameter distributions
pp_check(ex_fit)  # Prior/Posterior predictive checks

# Test a hypothesis
hypothesis(ex_fit, "cp_1 > 10")

# Make predictions
fitted(ex_fit)
predict(ex_fit)
predict(ex_fit, newdata = data.frame(time = c(55.545, 80, 132)))

# Compare to a one-intercept-only model (no change points) with default prior
model_null = list(response ~ 1)
fit_null = mcp(model_null, data = ex_demo, par_x = "time")  # fit another model here
ex_fit$loo = loo(ex_fit)
fit_null$loo = loo(fit_null)
loo::loo_compare(ex_fit$loo, fit_null$loo)

# Inspect the prior. Useful for prior predictive checks.
summary(ex_fit, prior = TRUE)
plot(ex_fit, prior = TRUE)

# Show all priors. Default priors are added where you don't provide any
print(ex_fit$prior)

# Set priors and re-run
prior = list(
  int_1 = 15,
  time_2 = "dt(0, 2, 1) T(0, )",  # t-dist slope. Truncated to positive.
  cp_2 = "dunif(cp_1, 80)",    # change point to segment 2 > cp_1 and < 80.
  int_3 = "int_1"           # Shared intercept between segment 1 and 3
)

fit3 = mcp(model, data = ex_demo, prior = prior)

# Show the JAGS model
cat(ex_fit$jags_code)
# }