clustord: Clustering Using Proportional Odds Model, Ordered Stereotype Model or Binary Model.
clustord-package.Rd
Biclustering, row clustering and column clustering using the proportional odds model (POM), ordered stereotype model (OSM) or binary model for ordinal categorical data.
Details
The clustord package provides six functions: clustord()
, rerun()
,
mat2df()
, calc.SE.rowcluster()
, calc.SE.bicluster()
, and
calc.cluster.comparisons()
.
Clustering function
The main function is clustord()
, which
fits a clustering model to the data. The model is fitted using
likelihood-based clustering via the EM algorithm. The package assumes that
you started with a data matrix of responses, though you will need to
convert that data matrix into a long-form data frame before running
clustord
. Every element in the original data matrix becomes one
row in the data frame, and the row and column indices from the data matrix
become the columns ROW and COL in the data frame. You can perform
clustering on rows or columns of the data matrix, or biclustering on both
rows and columns simultaneously. You can include any number of covariates
for rows and covariates for columns. Ordinal models used in the package are
Ordered Stereotype Model (OSM), Proportional Odds Model (POM) and a
dedicated Binary Model for binary data.
The rerun()
function is useful for continuing clustering runs that did
not converge on the first attempt, and for running new clustering runs using
the estimated parameters of a previous run as a starting point. The main
input for this function is a clustord
object output by clustord
,
and internally the rerun
function runs clustord
, after setting
up all the input parameters based on the original model fitting run.#'
Utility function
mat2df()
is a utility function provided to convert a data matrix of
responses into the long-form data frame format required by
clustord()
, and can also attach any covariates to that long-form
data frame if needed.
SE calculation functions
calc.SE.rowcluster()
and calc.SE.bicluster()
are functions to
run after running clustord()
, to calculate the standard errors on
the parameters fitted using clustord()
.
Clustering comparisons
calc.cluster.comparisons()
can be used to compare the assigned cluster
memberships of the rows or columns of the data matrix from two different
clustering fits, in a way that avoids the label-switching problem.
Author
Maintainer: Louise McMillan louise.mcmillan@vuw.ac.nz (ORCID) [copyright holder]
Authors:
Daniel Fernández Martínez daniel.fernandez.martinez@upc.edu (ORCID)
Ying Cui ying.cui@sms.vuw.ac.nz
Eleni Matechou e.matechou@kent.ac.uk (ORCID)
Other contributors:
W. N. Venables (clustord osm regression functions and S3 methods derived by Louise McMillan from MASS package polr function by Venables and Ripley) [contributor, copyright holder]
B. D. Ripley (clustord osm regression functions and S3 methods derived by Louise McMillan from MASS package polr function by Venables and Ripley) [contributor, copyright holder]