library(segclust2d)
Both segmentation() and segclust() return objects of class segmentation, for which several functions are available (see below).
There are two types of functions: (1) some are general and show the likelihood for all the different segmentations; (2) others are specific to a given segmentation and require selecting a number of segments and, if applicable, a number of clusters.
For the functions specific to a given segmentation, if you do not provide the number of segments and of clusters as arguments, the functions automatically select the best values based on a penalized log-likelihood, as follows:
For outputs of segmentation(), the optimal number of segments is selected with Lavielle's criterion. Another number of segments may be provided with argument nseg.
For outputs of segclust(), the optimal numbers of clusters and segments are selected with a BIC-based penalized criterion. Other values may be provided with arguments nseg and ncluster. It is recommended to choose the number of clusters manually, based on biological knowledge or careful exploration of the BIC-based penalized likelihood. Once the number of clusters has been chosen (either manually or automatically), it is recommended to select the number of segments using the automatic BIC-based penalized likelihood criterion.
All plot methods use the ggplot2 package and return ggplot objects that can be further modified and customized with standard ggplot2 functions (see the ggplot2 function reference).
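Because the plot methods return ggplot objects, extra ggplot2 layers can be added with `+`. The following is a minimal sketch, assuming the ggplot2 package is installed. The theme and title are arbitrary illustrative choices, and the segmentation() call reuses the simulshift settings shown later in this document.

```r
library(segclust2d)
library(ggplot2)

# Fit a segmentation on the simulshift example data shipped with segclust2d
data("simulshift")
shift_seg <- segmentation(simulshift,
                          seg.var = c("x", "y"),
                          lmin = 240, Kmax = 25,
                          subsample_by = 60)

# plot_likelihood() returns a ggplot object, so it can be extended with `+`
p <- plot_likelihood(shift_seg)
p_custom <- p +
  theme_minimal() +                                     # arbitrary theme choice
  labs(title = "Log-likelihood by number of segments")  # illustrative title

print(p_custom)
```

The same pattern should apply to the other plot methods described below, since they are documented as returning ggplot objects as well.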
order
If you provide argument order = TRUE to a function specific to a segmentation, the segments or clusters are numbered in the order given by the variable provided as order.var in the segmentation() or segclust() call.
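As a hedged illustration of the ordering mechanism, the sketch below sets order.var in the segclust() call and then requests ordered numbering when plotting. It assumes that plot() accepts order = TRUE as described above. The choice of "abs_spatial_angle" as the ordering variable is only an example.

```r
library(segclust2d)

# Prepare the simulmode example data as done elsewhere in this document
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]

# order.var names the variable used to rank states; here an illustrative
# choice that differs from the default (the first element of seg.var)
mode_segclust <- segclust(simulmode,
                          Kmax = 20, lmin = 10, ncluster = c(2, 3),
                          seg.var = c("speed", "abs_spatial_angle"),
                          scale.variable = TRUE,
                          order.var = "abs_spatial_angle")

# order = TRUE asks for clusters to be numbered according to order.var
# (assumption: plot() supports this argument, as the text above states)
p_ordered <- plot(mode_segclust, ncluster = 3, order = TRUE)
print(p_ordered)
```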
For a specific segmentation:
plot.segmentation shows the segmented time series, and clusters if applicable.
segmap shows the results of the segmentation as a labelled path (if applicable).
stateplot plots summary statistics for all segments or clusters.
Summary for all segmentations:
plot_likelihood, for segmentation(), shows the log-likelihood of the segmentation for all numbers of segments.
plot_BIC, for segclust(), shows the BIC-based penalized log-likelihood of the segmentation/clustering for all numbers of segments and clusters.
For a specific segmentation:
augment returns a data.frame with the original data as well as the segment or cluster associated with each data point.
segment returns a data.frame with the beginning and end of each segment.
states, for segclust(), provides a data.frame with summary statistics for all clusters.
Summary for all segmentations:
logLik, for segmentation(), returns a data.frame with the log-likelihood for all numbers of segments.
BIC, for segclust(), returns a data.frame with the BIC-based penalized log-likelihood for all numbers of clusters and segments.
As the functions for segmentation and segmentation/clustering are very similar, the examples below mostly use segmentation/clustering outputs. To obtain the corresponding outputs for a segmentation, simply omit argument ncluster.
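For comparison, here is a minimal sketch of the same kind of calls on a segmentation() output, where ncluster is simply left out. The value nseg = 3 is an arbitrary illustrative choice, not a recommended setting. If nseg is omitted, the number of segments is selected automatically as described above.

```r
library(segclust2d)

# Segmentation only (no clustering) on the simulshift example data
data("simulshift")
shift_seg <- segmentation(simulshift,
                          seg.var = c("x", "y"),
                          lmin = 240, Kmax = 25,
                          subsample_by = 60)

# No ncluster argument here; nseg = 3 is an arbitrary illustrative value
seg_table <- segment(shift_seg, nseg = 3)   # one row per segment
aug_table <- augment(shift_seg, nseg = 3)   # original data plus segment labels

print(seg_table)
head(aug_table)
```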
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]
mode_segclust <- segclust(simulmode,
                          Kmax = 20, lmin = 10, ncluster = c(2, 3),
                          seg.var = c("speed", "abs_spatial_angle"),
                          scale.variable = TRUE)
plot.segmentation() plots the segmented time series.
plot(mode_segclust, ncluster = 3)
segmap() plots the results of the segmentation as a labelled path. This is possible only if the data have a geographic meaning. Coordinate names are "x" and "y" by default, but other names can be provided through argument coord.names.
segmap(mode_segclust, ncluster = 3)
stateplot() shows statistics for each state or segment.
stateplot(mode_segclust, ncluster = 3)
augment.segmentation() is a method for broom::augment. It returns an augmented data.frame with the outputs of the model, here the assignment of each data point to a segment or cluster.
augment(mode_segclust, ncluster = 3)
segment() retrieves information on the different segments of a given segmentation. Each segment is associated with the mean and standard deviation of each variable, the state (equivalent to the segment number for segmentation()) and the state ordered according to a variable, by default the first variable given in seg.var. The variable used for ordering states can be specified through the order.var argument of segmentation() and segclust().
segment(mode_segclust, ncluster = 3)
states() returns information on the different states of the segmentation. For segmentation() outputs, it is quite similar to segment(). For segclust() outputs, however, it gives the different clusters found and their associated statistics.
states(mode_segclust, ncluster = 3)
logLik.segmentation() returns information on the log-likelihood of the different possible segmentations. It returns a data.frame with the number of segments and the log-likelihood.
data("simulshift")
shift_seg <- segmentation(simulshift,
                          seg.var = c("x", "y"),
                          lmin = 240, Kmax = 25,
                          subsample_by = 60)
logLik(shift_seg)
plot_likelihood() plots the log-likelihood of the segmentation for all the tested numbers of segments.
plot_likelihood(shift_seg)
BIC.segmentation() returns information on the BIC-based penalized log-likelihood of the different possible segmentations. It is available for segclust() outputs only and returns a data.frame with the number of segments, the number of clusters and the BIC-based penalized log-likelihood. Note that this is not a true BIC: here the highest values are favored, whereas a standard BIC is minimized.
BIC(mode_segclust)
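Since higher values are favored, the best-scoring combination is the row with the maximum criterion value. The sketch below inspects the returned table and, only if a column named "BIC" is present (an assumption about the column naming, not confirmed by this document), extracts that row. Taking the global maximum is shown for illustration only; as noted earlier, choosing the number of clusters manually is recommended.

```r
library(segclust2d)

# Rebuild the segmentation/clustering object used in this document
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]
mode_segclust <- segclust(simulmode,
                          Kmax = 20, lmin = 10, ncluster = c(2, 3),
                          seg.var = c("speed", "abs_spatial_angle"),
                          scale.variable = TRUE)

# Inspect the structure of the table before relying on any column name
bic_table <- BIC(mode_segclust)
str(bic_table)

# Assumption: the criterion is stored in a column named "BIC".
# Higher is better here, so which.max() is used (not which.min()).
if ("BIC" %in% names(bic_table)) {
  best_row <- bic_table[which.max(bic_table$BIC), ]
  print(best_row)
}
```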
plot_BIC() plots the BIC-based penalized log-likelihood of the segmentation for all the tested numbers of segments and clusters.
plot_BIC(mode_segclust)