Ordinal data regression using the Ordered Stereotype Model (OSM).
osm.Rd
Fit a regression model to an ordered factor response. The model is not a logistic or probit model: its link function is log-based, but it is not the logit.
Arguments
- formula
a formula expression as for regression models, of the form response ~ predictors. The response should be a factor (preferably an ordered factor), which will be interpreted as an ordinal response, with levels ordered as in the factor. The model must have an intercept: attempts to remove one will lead to a warning and be ignored. An offset may be used. See the documentation of formula for other details.
- data
a data frame, list or environment in which to interpret the variables occurring in formula.
- weights
optional case weights in fitting. Defaults to 1.
- start
initial values for the parameters. See the Details section for information about this argument.
- ...
additional arguments to be passed to optim, most often a control argument.
- subset
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default.
- na.action
a function to filter missing data.
- Hess
logical for whether the Hessian (the observed information matrix) should be returned.
- model
logical for whether the model matrix should be returned.
Value
An object of class "osm". This has components
beta
the coefficients of the covariates, with NO intercept.
mu
the intercepts for the categories.
phi
the score parameters for the categories (restricted to be
ordered).
deviance
the residual deviance.
fitted.values
a matrix of fitted values, with a column for each
level of the response.
lev
the names of the response levels.
terms
the terms structure describing the model.
df.residual
the number of residual degrees of freedom, calculated
using the weights.
edf
the (effective) number of degrees of freedom used by the model
n, nobs
the (effective) number of observations, calculated using the
weights.
call
the matched call.
convergence
the convergence code returned by optim.
niter
the number of function and gradient evaluations used by optim.
eta
Hessian
(if Hess is true). Note that this is a numerical
approximation derived from the optimization process.
model
(if model is true), the model used in the fitting.
na.action
the NA function used
xlevels
factor levels from any categorical predictors
Details
This function should be used in a very similar way to MASS::polr, and
some of the arguments are the same as in polr, but the ordinal model used
here is less restrictive in its assumptions than the proportional odds model.
However, it is still parsimonious, i.e. it uses only a small number of
additional parameters compared with the proportional odds model.
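As a point of comparison, the sketch below fits the proportional odds model with MASS::polr on the housing data that ships with MASS. The commented-out osm call is an assumption based only on the shared arguments listed above (formula, data, weights, Hess); it is not run here because it requires the package that provides osm.

```r
library(MASS)  # recommended package distributed with R; provides polr and housing

# Proportional odds fit, the model osm is compared with in this section
fit_po <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
               data = housing, Hess = TRUE)
coef(fit_po)   # covariate coefficients
fit_po$zeta    # polr cut-points (intercepts)

# Assumed analogous call, using the arguments documented on this page:
# fit_osm <- osm(Sat ~ Infl + Type + Cont, weights = Freq,
#                data = housing, Hess = TRUE)
```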
This model is the ordered stereotype model (Anderson 1984, Agresti 2010).
It is more flexible than the proportional odds model but only adds a handful of additional parameters. It is not a cumulative model; instead it is defined in terms of the relationship between each of the higher categories and the lowest category, which is treated as the reference category.
Each of the higher categories has its own intercept term, mu_k, which is
similar to the zeta parameters in polr, but in the OSM each higher
category also has its own scaling parameter, phi_k, which adjusts the effect
of the covariates on the response. This allows the effect of the covariates
on the response to be slightly different for each category of the response,
thus making the model more flexible than the proportional odds model.
The final set of parameters are coefficients for each of the covariates, and
these are equivalent to the coefs in polr. Higher or more positive
values of the coefficients increase the probability of the response being in
the higher categories, and lower or more negative values of the coefficients
increase the probability of the response being in the lower categories.
The overall model takes the following form:
log(P(Y = k | X)/P(Y = 1 | X)) = mu_k + phi_k*beta_vec^T x_vec
for k = 2, ..., q, where x_vec is the vector of covariates for the observation Y.
mu_1 is fixed at 0 for identifiability of the model, and the phi_k parameters are constrained to be ordered (giving the model its name) in the following way:
0 = phi_1 <= phi_2 <= ... <= phi_k <= ... <= phi_q = 1.
(The unordered stereotype model restricts phi_1 and phi_q but allows the remaining phi_k to be in any order, and this is suitable for fitting the model for nominal data. However, this package does not provide that option, as it is already available in other packages which can fit the stereotype model.)
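To make the model equation concrete, the following base R sketch computes the category probabilities implied by a set of parameters. The function name osm_probs and all parameter values are hypothetical and chosen only for illustration; this is not the package's internal code.

```r
# Category probabilities implied by
#   log(P(Y = k | X) / P(Y = 1 | X)) = mu_k + phi_k * beta^T x
# osm_probs is a hypothetical helper, not part of the package.
osm_probs <- function(x, beta, mu, phi) {
  eta <- mu + phi * sum(beta * x)  # log-odds versus category 1 (eta_1 = 0)
  p <- exp(eta - max(eta))         # subtract the max for numerical stability
  p / sum(p)                       # normalise so the probabilities sum to 1
}

mu   <- c(0, 0.5, 0.2, -0.3)  # mu_1 fixed at 0 (made-up values)
phi  <- c(0, 0.3, 0.7, 1)     # 0 = phi_1 <= ... <= phi_q = 1 (made-up values)
beta <- c(1.2, -0.5)          # made-up covariate coefficients
x    <- c(0.8, 1.5)           # made-up covariate vector

p <- osm_probs(x, beta, mu, phi)
p
log(p[-1] / p[1])  # recovers mu_k + phi_k * beta^T x for k = 2, ..., q
```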
After fitting the model, the estimated values of the intermediate phi_k
values indicate a suitable numerical spacing of the ordinal response
categories that is based on the data. The spacings indicate how much distinct
information each of the corresponding levels provides. For example, if you
have five response categories and the fitted phi values are
(0, 0.04, 0.6, 0.62, 1), then this indicates that levels 1 and 2 provide very
similar information about the effect of the covariates on the response, and
levels 3 and 4 provide very similar information to each other. This means
that you could simplify the response by combining levels 1 and 2 and
combining levels 3 and 4 (i.e. reduce the levels to 1, 3 and 5) and you would
still be able to estimate the beta coefficients with similar accuracy.
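The spacing argument can be checked numerically by looking at the gaps between adjacent phi values. The sketch below uses the illustrative phi values from this section; the 0.1 cut-off for calling two levels "close" is an arbitrary assumption, not a rule from the package.

```r
phi <- c(0, 0.04, 0.6, 0.62, 1)  # illustrative fitted values from the text

# Gaps between adjacent levels: a small gap suggests the two levels carry
# similar information about the covariate effects
gaps <- diff(phi)
names(gaps) <- paste(1:4, 2:5, sep = "-")
gaps

# Adjacent pairs closer than an arbitrary threshold of 0.1
close_pairs <- which(gaps < 0.1)
names(close_pairs)
```

With these values the pairs flagged are levels 1-2 and levels 3-4, matching the description above.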
Another use for the phi_k values is that if you want to carry out further analysis of the response, treating it as a numerical variable, then the phi values are a better choice of numerical values for the response categories than the default values 1 to q.
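A minimal sketch of that recoding is shown below, using the illustrative phi values from this section. The level labels and the response vector are made up; in practice the phi values would be taken from the phi component of the fitted object.

```r
phi <- c(0, 0.04, 0.6, 0.62, 1)  # illustrative values; in practice, fit$phi

# Hypothetical five-level ordinal response
lev <- c("SD", "D", "N", "A", "SA")
y <- factor(c("D", "SA", "N", "A", "SD", "A"), levels = lev, ordered = TRUE)

default_score <- as.integer(y)       # default equally spaced scores 1 to q
phi_score     <- phi[as.integer(y)]  # data-based scores from the phi values

data.frame(y, default_score, phi_score)
```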
start argument values: start is a vector of start values for estimating the
model parameters.
The first part of the start vector is starting values for the
coefficients of the covariates, the second part is starting values for the mu
values (per-category intercepts), and the third part is starting values for
the raw parameters used to construct the phi values.
The length of the vector is [number of covariate terms] + [number of categories in response variable - 1] + [number of categories in response variable - 2]. Every one of the values can take any real value.
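The layout and length of the vector can be sketched as follows. The counts p and q are hypothetical, and the zeros are placeholders that only show the structure, not recommended starting values.

```r
p <- 3  # number of covariate terms (hypothetical)
q <- 5  # number of response categories (hypothetical)

start <- c(rep(0, p),      # part 1: coefficients of the covariates
           rep(0, q - 1),  # part 2: mu_2, ..., mu_q (mu_1 is fixed at 0)
           rep(0, q - 2))  # part 3: raw parameters u_2, ..., u_(q-1) for phi

length(start)  # p + (q - 1) + (q - 2)
```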
The second part is the starting values for the mu_k per-category intercept parameters, and since mu_1 is fixed at 0 for identifiability, the number of non-fixed mu_k parameters is one fewer than the number of categories.
The third part of the starting vector is a re-parametrization used to construct the phi parameters so that they observe the ordering restriction of the ordered stereotype model. The raw parameters themselves are unrestricted, which makes it easier to optimise over them. phi_1 is always 0 and phi_q is always 1 (where q is the number of response categories). If the raw parameters are u_2 up to u_(q-1), then phi_2 is constructed as expit(u_2), phi_3 is expit(u_2 + exp(u_3)), phi_4 is expit(u_2 + exp(u_3) + exp(u_4)) etc. Because each step adds a positive increment exp(u_k) and expit is increasing, the phi_k values are non-decreasing.
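The re-parametrization can be sketched in base R as below, assuming each raw parameter after the first adds a positive increment exp(u_k) to a running sum before expit is applied. The helper name raw_to_phi and the u values are hypothetical, and the package's internal code may differ in detail.

```r
# Map unrestricted raw parameters u_2, ..., u_(q-1) to ordered phi values.
# raw_to_phi is a hypothetical helper written for illustration only.
raw_to_phi <- function(u) {
  steps <- c(u[1], exp(u[-1]))     # u_2, then positive increments exp(u_k)
  c(0, plogis(cumsum(steps)), 1)   # plogis is expit; phi_1 = 0, phi_q = 1
}

u <- c(-1, 0.2, -0.5)  # made-up raw values for q = 5 categories
phi <- raw_to_phi(u)
phi
diff(phi)              # all gaps are non-negative, so phi is ordered
```

Any real-valued u gives a valid ordered phi vector, which is why the optimisation can run over u without constraints.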
This code was adapted from file MASS/R/polr.R, copyright (C) 1994-2013 W. N. Venables and B. D. Ripley. Use of transformed intercepts contributed by David Firth. The osm and osm.fit functions were written by Louise McMillan, 2020.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 or 3 of the License (at your option).
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is available at http://www.r-project.org/Licenses/