ergm
Terms
This document seeks to be the most up-to-date API documentation for ergm
terms. Note that it is not intended to be a tutorial as much as a description of what inputs and outputs different parts of the system expect.
The storage API defines two types of storage: private storage, which is attached to the ModelTerm
structure and is specific to each ergm
term, and public storage, which is attached to the Model
and can be accessed by all terms.
A statistic is a familiar ergm
term like “edges
” or “nodefactor
”: it adds at least one sufficient statistic to the model. Every statistic can have private storage, and it can read from public storage, but it cannot write to public storage.
An auxiliary is an ergm
term but not an ERGM term in the mathematical sense: it adds no statistics to the model and exists only to initialize and maintain public storage to be used by statistics. It may not be specified on an ergm
formula by the end-user, but only requested by a statistic.
An auxiliary can rely on another auxiliary’s public storage. Note that circular dependencies are not checked.
For the purposes of this overview, the following information is relevant, and is elaborated formally later:
Each term has a pointer void *mtp->storage to private storage.
Each term has a pointer to an array of pointers to public storage, void **mtp->aux_storage. (The pointer is to the same location for all terms of a model.)
A term may define initializers (i_), updaters (u_), change stats (c_), difference stats (d_), finalizers (f_), writers (w_), and “eXtended” functions (x_).
An InitErgmTerm. function’s output list can have an additional element, auxiliaries, a one-sided formula.
The user calls a function that takes a model formula (ergm, summary, etc.); this formula may only have statistics.
ergm_model() is called.
It calls the InitErgmTerm.<NAME>() functions in turn (or InitWtErgmTerm.<NAME>() for valued ERGMs), adding their output to the model term list. Some terms include auxiliaries formulas in their list.
call.ErgmTerm(), when it finds that a term has requested auxiliaries, attaches an attribute attr(., "aux.slots") containing an integer vector for the model’s own and/or requested auxiliaries’ positions on the aux_storage vector.
ergm.auxstorage() is called with the complete model.
It checks each term’s auxiliaries element for a one-sided formula listing their requested auxiliaries.
It then checks the auxiliaries elements of auxiliary terms and initialises those, etc.
The auxiliaries are added to model$terms, and the index (in the unique list) of the auxiliary requested by each statistic is recorded in the aux.slots of the requesting statistic.
For an auxiliary, the first element of aux.slots is set to the position of the auxiliary itself.
An ergm_state is constructed from an edgelist (state$el), an empty network (state$nw0), a model (state$model), and (optionally) a proposal (state$proposal) and a statistics vector (state$stats).
update.ergm_state() is called.
It calls term$ext.encode() (if defined) to construct a vector state$ext.state. state$ext.flag is set to reconciled.
The ergm_state is passed to the C code.
Redgelist2Network() initializes the network.
ModelInitialize() is called:
ModelInitialize() initializes all terms (statistics and auxiliaries), also counting up the number of auxiliaries (distinguished by having no c_, d_, or s_ functions). A term can export both a c_ function and a d_ function. In that case, it is responsible for deleting one of them when its i_ function is called.
It allocates an array of void *s, one for each auxiliary. The mtp->aux_storage pointer for each term is set to point to that (one) array.
It copies attr(model$terms[[i]], "aux.slots") to mtp->aux_slots.
It saves the ergm_model SEXP to model->R, in case it’s needed.
It saves the ergm_model$terms[[i]] SEXP to mtp->R, in case it’s needed.
InitStats() is called, calling the initializer (i_ function) of each term or, if none is found, the updater (u_ function) with invalid input (i.e., toggle \((0,0)\)), in case the term developer prefers a one-function implementation.
ChangeStats() is called.
It calls the d_ functions, for those terms for which they are initialized.
It calls the c_ functions.
It calls the u_ functions with the toggle (if more to come).
For each toggle, the u_ function is called and the network is updated.
ModelDestroy() is called:
DestroyStats() is called, iterating through the terms.
The finalizer (f_) function is called, if defined.
If mtp->storage is not NULL, it is freed.
It frees mtp->aux_storage if not NULL.
ErgmStateRSave() is called:
Network2Redgelist() is called, returning a SEXP with the state.
For each term, the writer (w_ function) is called if defined, returning a SEXP with the extended state.
The results are stored in state$ext.state. Element state$ext.flag is set to signal that a change was made on the C side.
NetworkDestroy() is called.
update.ergm_state() is called.
It calls term$ext.decode() (if defined) to update state$nw0 or other aspects of the network. state$ext.flag is set to reconciled.
void *mtp->storage
: A pointer to private storage.
void **mtp->aux_storage
: An array of pointers to public storage, referring to the same memory location for all terms in the model.
R side
Unchanged.
C side
c_, d_, and s_ functions
d_ functions are the original difference statistics. c_ functions are new, while s_ functions have been around for a long time, but never formally documented.
Change statistic (binary): void c_<NAME>(Vertex tail, Vertex head, Model *mtp, Network *nwp, Rboolean edgestate)
Change statistic (valued): void c_<NAME>(Vertex tail, Vertex head, double weight, WtModel *mtp, WtNetwork *nwp, double edgestate)
Difference statistic (binary): void d_<NAME>(Vertex *tails, Vertex *heads, Model *mtp, Network *nwp)
Difference statistic (valued): void d_<NAME>(Vertex *tails, Vertex *heads, double *weights, WtModel *mtp, WtNetwork *nwp)
Summary statistic (binary): void s_<NAME>(Model *mtp, Network *nwp)
Summary statistic (valued): void s_<NAME>(WtModel *mtp, WtNetwork *nwp)
Edge ntoggles
: Number of edges to be toggled or updated.
Vertex tail
: Tail of (1) dyad to be toggled or updated.
Vertex *tails
: An array of tails of the dyads to be toggled or updated.
Vertex head
: Head of (1) dyad to be toggled or updated.
Vertex *heads
: An array of heads of the dyads to be toggled or updated.
double weight
: New weight for (1) dyad.
double *weights
: An array of new weights for the dyads to be toggled or updated.
Model *mtp
: A pointer to a Model
of interest.
Network *nwp
: A pointer to the Network
of interest before any toggles are applied.
Rboolean edgestate
: An indicator of whether edge (tail,head)
is in the network nwp
pre-toggle.
double edgestate
: The weight of dyad (tail,head)
in the network nwp
pre-update.
All functions except for s_
expect any storage they need to be initialized and up to date (consistent with nwp
). In particular, if their statistic requested \(k\) auxiliary terms, the \(k\) (mtp->n_aux
) elements of its mtp->aux_slots
vector will be the indexes of mtp->aux_storage
where they can find the respective objects.
It is worth noting that macros defined for d_
functions that refer to a specific toggle, such as TAIL
, HEAD
, etc. might not be usable in a c_
function, but it’s made up for by c_
function’s reduced need for bookkeeping: tail
, head
, etc. can be used directly.
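As a minimal sketch of the c_ calling convention described above, the following computes the change in an edge count for a single toggle, using tail, head, and edgestate directly. MockTerm and MockNetwork are hypothetical, pared-down stand-ins for the structures that ergm passes in, and c_mock_edges is an illustrative name rather than a function in the package.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Vertex;

/* Hypothetical stand-ins for the term and network structures (assumption). */
typedef struct { double *dstats; } MockTerm;
typedef struct { Vertex nnodes; } MockNetwork;

/* Change in the number of edges if dyad (tail, head) is toggled:
   an existing edge is removed (-1), an absent one is added (+1). */
void c_mock_edges(Vertex tail, Vertex head, MockTerm *mtp,
                  MockNetwork *nwp, bool edgestate) {
  (void) tail; (void) head; (void) nwp; /* not needed for a plain edge count */
  assert(mtp != NULL && mtp->dstats != NULL);
  mtp->dstats[0] = edgestate ? -1.0 : 1.0;
}
```

The function only reads the pre-toggle edge state and writes one element of the change vector, so it needs no loop over toggles and no bookkeeping macros.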
These functions overwrite mtp->dstats
(often aliased as CHANGE_STAT
) with the following:
c_ and d_ functions: change of the value of the statistic they implement relative to nwp due to the toggles.
s_ functions: the value of the statistic it implements.
Every ergm term has private storage, found at void *mtp->storage, which allows it to store arbitrary information about the state of the network, as well as precalculated values of variables, preallocated memory it needs for its calculations, or any other use. It does so by specifying an updating function (and, optionally, an initialization and a finalization function). This updating function is called every time the network is about to change. The API for these functions is defined below.
Public storage is found at void **mtp->aux_storage
. Each auxiliary term gets assigned a slot (i.e., void *nwp->mtp->aux_storage[i]
) to manage; its slot number is the first element of its input vector, and terms requesting it are told which slot to look in in a similar fashion. An auxiliary term that requests other auxiliaries will have its own slot as the first input and the slots of auxiliaries it requests as subsequent inputs.
R side
A statistic that only references its private storage or is an auxiliary itself does not need to do anything special on the R side.
To request an auxiliary, a term’s InitErgmTerm
call’s output list must include an auxiliaries
element containing a one-sided ergm
-style formula listing the auxiliary terms it wishes to use separated by the +
operator.
C side: Modifying Storage
i_ functions: Initializer/Constructor
This function is optional for using storage: if it’s not provided, the model code will call the u_ function with an invalid toggle first, signaling for it to initialize.
Binary: void i_<NAME>(Model *mtp, Network *nwp)
Valued: void i_<NAME>(WtModel *mtp, WtNetwork *nwp)
In general, an i_ function expects to be called after ModelInitialize() and NetworkInitialize(), and before any c_ or d_ functions. That is, the network must be populated with the ties of its initial state and the mtp->aux_storage vector must be allocated.
Network populated with initial ties and initialized model.
The first element of mtp->aux_slots
is the index of the element of mtp->aux_storage
to be managed by this auxiliary. That is, mtp->aux_storage[mtp->aux_slots[0]]
is a void *
that is to point to the data to be made public.
The other data passed from the InitErgmTerm.
are shifted over to make room for it.
Allocates memory for the information to be stored and overwrites mtp->storage
with a pointer to it, then updates the stored information to be consistent with *nwp
.
Public storage
Allocates memory for the information to be stored and overwrites mtp->aux_storage[mtp->aux_slots[0]]
with a pointer to it, then updates the stored information to be consistent with *nwp
.
An auxiliary can also use its private storage as needed.
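A sketch of an initializer for private storage, under stated assumptions: MockTerm and MockNetwork are hypothetical stand-ins, the network is represented as a dense row-major adjacency matrix purely for illustration, and i_mock_outdeg is an invented name. It allocates a vector of out-degrees, fills it to be consistent with the network, and stores the pointer in mtp->storage.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Vertex;

/* Hypothetical stand-ins (assumption): adj is a row-major nnodes x nnodes
   0/1 matrix; vertices are numbered from 1, as in the text. */
typedef struct { Vertex nnodes; const unsigned char *adj; } MockNetwork;
typedef struct { void *storage; } MockTerm;

/* Allocate private storage and make it consistent with *nwp. */
void i_mock_outdeg(MockTerm *mtp, MockNetwork *nwp) {
  /* One extra element so that outdeg[v] works for v = 1..nnodes. */
  unsigned int *outdeg = calloc((size_t) nwp->nnodes + 1, sizeof *outdeg);
  assert(outdeg != NULL);
  for (Vertex t = 1; t <= nwp->nnodes; t++)
    for (Vertex h = 1; h <= nwp->nnodes; h++)
      if (nwp->adj[(size_t) (t - 1) * nwp->nnodes + (h - 1)]) outdeg[t]++;
  mtp->storage = outdeg;
}
```

An auxiliary would do the same, but write the pointer to mtp->aux_storage[mtp->aux_slots[0]] instead of mtp->storage.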
u_ functions: Updater
Binary: void u_<NAME>(Vertex tail, Vertex head, Model *mtp, Network *nwp, Rboolean edgestate)
Valued: void u_<NAME>(Vertex tail, Vertex head, double weight, WtModel *mtp, WtNetwork *nwp, double edgestate)
Initialized network. If no i_
function was provided, to be called with a \((0,0)\) toggle as a signal to initialize; otherwise, initialized storage. Any statistic or auxiliary can rely on its auxiliaries having been initialized before it.
If called with a toggle (0,0)
and uninitialized storage, initialize. This will never be done if an i_
function is defined for the term.
Update the state of its storage (mtp->storage
and/or mtp->aux_storage[mtp->aux_slots[0]]
) to match what the state of *nwp
would be after the given dyad had been toggled.
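The one-function style described above can be sketched as follows. The types are hypothetical stand-ins, the dense adjacency matrix is an illustrative assumption, and u_mock_outdeg is an invented name. A \((0,0)\) toggle with uninitialized storage triggers initialization; any other toggle updates the stored out-degrees to match the network as it would be after the toggle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef unsigned int Vertex;

/* Hypothetical stand-ins (assumption): adj is a row-major 0/1 matrix. */
typedef struct { Vertex nnodes; const unsigned char *adj; } MockNetwork;
typedef struct { void *storage; } MockTerm;

void u_mock_outdeg(Vertex tail, Vertex head, MockTerm *mtp,
                   MockNetwork *nwp, bool edgestate) {
  if (tail == 0 && head == 0) {        /* invalid toggle: initialization signal */
    if (mtp->storage == NULL) {
      unsigned int *fresh = calloc((size_t) nwp->nnodes + 1, sizeof *fresh);
      assert(fresh != NULL);
      for (Vertex t = 1; t <= nwp->nnodes; t++)
        for (Vertex h = 1; h <= nwp->nnodes; h++)
          if (nwp->adj[(size_t) (t - 1) * nwp->nnodes + (h - 1)]) fresh[t]++;
      mtp->storage = fresh;
    }
    return;
  }
  unsigned int *outdeg = mtp->storage;
  assert(outdeg != NULL);
  /* The network has not changed yet: edgestate is the pre-toggle state. */
  if (edgestate) outdeg[tail]--; else outdeg[tail]++;
}
```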
f_ functions: Finalizer/Destructor
This function is optional for using storage: if it’s not provided, the model code will free any pointers to mtp->aux_storage
and mtp->storage
that are not NULL
.
Binary: void f_<NAME>(Model *mtp, Network *nwp)
Valued: void f_<NAME>(WtModel *mtp, WtNetwork *nwp)
Network and a model.
Deallocates its storage (mtp->storage
and/or mtp->aux_storage[mtp->aux_slots[0]]
) and sets its pointers to NULL
.
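A sketch of a finalizer for an auxiliary that uses both private storage and its own public slot. MockTerm and MockNetwork are hypothetical stand-ins and f_mock_aux is an invented name; the sketch assumes both pointers were obtained from malloc or calloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the term and network structures (assumption). */
typedef struct {
  void *storage;            /* private storage */
  void **aux_storage;       /* shared array of public storage slots */
  unsigned int *aux_slots;  /* aux_slots[0] is this auxiliary's own slot */
} MockTerm;
typedef struct { unsigned int nnodes; } MockNetwork;

/* Release private and public storage and reset both pointers to NULL,
   so that generic cleanup code does not free them a second time. */
void f_mock_aux(MockTerm *mtp, MockNetwork *nwp) {
  (void) nwp;
  assert(mtp != NULL);
  free(mtp->storage);                 /* free(NULL) is a no-op */
  mtp->storage = NULL;
  unsigned int slot = mtp->aux_slots[0];
  free(mtp->aux_storage[slot]);
  mtp->aux_storage[slot] = NULL;
}
```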
C side: Accessing Storage
c_, d_, and s_ functions can read from, but not write to, their private storage. c_ and d_ functions can rely on initialization having been called before.
Auxiliaries must not implement c_, d_, or s_ functions.
Terms requesting one or more auxiliaries will be passed the indices of the corresponding elements of mtp->aux_storage by inserting them at the start of mtp->aux_slots. That is, mtp->aux_storage[mtp->aux_slots[0]] is a void * pointing to the data made public by the first auxiliary term on the auxiliaries formula, mtp->aux_storage[mtp->aux_slots[1]] points to that of the second, etc.
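The lookup through aux_slots can be sketched as below. The sketch assumes a statistic that requested one auxiliary maintaining a vector of out-degrees; MockTerm is a hypothetical stand-in and c_mock_zero_outdeg is an invented name. The statistic counts vertices with out-degree zero and only reads the public storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Vertex;

/* Hypothetical stand-in for the term structure (assumption). */
typedef struct {
  void **aux_storage;       /* shared array of public storage slots */
  unsigned int n_aux;       /* number of auxiliaries requested */
  unsigned int *aux_slots;  /* slot indices of the requested auxiliaries */
  double *dstats;
} MockTerm;

/* Change in the number of vertices with out-degree 0 if (tail, head) is toggled. */
void c_mock_zero_outdeg(Vertex tail, Vertex head, MockTerm *mtp, bool edgestate) {
  (void) head;
  assert(mtp->n_aux >= 1);
  /* Read-only view of the first requested auxiliary's public data. */
  const unsigned int *outdeg = mtp->aux_storage[mtp->aux_slots[0]];
  if (!edgestate && outdeg[tail] == 0)     mtp->dstats[0] = -1.0; /* gains first out-tie */
  else if (edgestate && outdeg[tail] == 1) mtp->dstats[0] = 1.0;  /* loses last out-tie */
  else                                     mtp->dstats[0] = 0.0;
}
```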
x_ functions: eXtensions
This interface is intended to be used by packages extending ergm
to send arbitrary signals to statistics and auxiliaries. For example, for temporal ERGMs, it may be used to signal to the statistic that the clock is about to advance. It is the responsibility of the extension writer to ensure that everything behaves sensibly.
Binary: void x_<NAME>(unsigned int type, void *data, Model *mtp, Network *nwp)
Valued: void x_<NAME>(unsigned int type, void *data, WtModel *mtp, WtNetwork *nwp)
type
: a magic constant identifying the type of signal being sent. Based on it, the function can ignore the signal, or determine how to interpret data
.
data
: arbitrary data to be sent to the function; it is up to the extension writer to determine how it is formatted and interpreted.
There are no restrictions on side-effects. It is up to the extension writer to ensure that everything works.
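A sketch of how an x_ function might dispatch on the signal type, in the spirit of the clock example above. Everything here is hypothetical: MOCK_SIGNAL_TICK is an invented constant, the types are stand-ins, and the convention that data points to an unsigned step size (or is NULL for a step of 1) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the term and network structures (assumption). */
typedef struct { void *storage; } MockTerm;
typedef struct { unsigned int nnodes; } MockNetwork;

/* Invented signal identifier; a real extension would define its own. */
#define MOCK_SIGNAL_TICK 1u

/* Advance a clock kept in private storage when a tick signal arrives;
   silently ignore every other signal type. */
void x_mock_clock(unsigned int type, void *data, MockTerm *mtp, MockNetwork *nwp) {
  (void) nwp;
  if (type != MOCK_SIGNAL_TICK) return;
  unsigned int *clock = mtp->storage;
  assert(clock != NULL);
  *clock += (data != NULL) ? *(const unsigned int *) data : 1u;
}
```

Ignoring unrecognized types keeps a term safe when several extensions send signals through the same model.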
The following helper macros have been defined to date. They are defined in ergm_storage.h
and exported.
ALLOC_STORAGE(nmemb, stored_type, store_into)
: Allocate a vector of nmemb
elements of type stored_type
, save its pointer to private storage and also to a stored_type *store_into
which is also declared. Should be used by the i_
function, but may also be used by the u_
function.
GET_STORAGE(stored_type, store_into)
: Declare stored_type *store_into
and assign the pointer to private storage to it. Can be used by all functions.
ALLOC_AUX_STORAGE(nmemb, stored_type, store_into)
: Allocate a vector of nmemb
elements of type stored_type
, and save it to the auxiliary storage slot belonging to the calling auxiliary and into a stored_type *store_into
which is also declared. Can be used by the i_
function, but may also be used by the u_
function.
GET_AUX_STORAGE(stored_type, store_into)
: Declare stored_type *store_into
and assign the pointer to the auxiliary storage (either for a statistic or for the auxiliary). Can be used by all functions.
GET_AUX_STORAGE_NUM(stored_type, store_into, ind)
: Declare stored_type *store_into
and assign to it the pointer to the storage of the ind-th auxiliary. Can be used by clients of auxiliaries.
ALLOC_AUX_SOCIOMATRIX(stored_type, store_into)
: Allocate an array of appropriate dimension with elements of type stored_type
, save it to auxiliary storage, and into stored_type **store_into
, so that store_into[i][j]
returns the value associated with dyad \((i,j)\), with vertices indexed from 1. For bipartite and undirected networks, as little space as possible (resp. a rectangle or a triangle) is allocated.
Note that this macro assumes that the private and the public storage of the calling term are not used in any other way.
FREE_AUX_SOCIOMATRIX
: Frees the sociomatrix allocated by ALLOC_AUX_SOCIOMATRIX
.
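To illustrate the "declare and assign" pattern that the allocation and access macros describe, here is a sketch with two look-alike macros. These are not the definitions from the package header: MOCK_ALLOC_STORAGE and MOCK_GET_STORAGE are invented names, MockTerm is a stand-in, and the macros assume that a variable named mtp is in scope at the point of use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the term structure (assumption). */
typedef struct { void *storage; } MockTerm;

/* Allocate zeroed private storage, record it in mtp->storage, and declare
   a typed pointer to it. Requires a variable named mtp in scope. */
#define MOCK_ALLOC_STORAGE(nmemb, stored_type, store_into) \
  stored_type *store_into = \
    (stored_type *) (mtp->storage = calloc((nmemb), sizeof(stored_type)))

/* Declare a typed pointer to the existing private storage. */
#define MOCK_GET_STORAGE(stored_type, store_into) \
  stored_type *store_into = (stored_type *) mtp->storage

void i_mock_sums(MockTerm *mtp) {
  MOCK_ALLOC_STORAGE(2, double, sums);
  assert(sums != NULL);
  sums[0] = 1.5;
  sums[1] = 2.5;
}

double mock_total(MockTerm *mtp) {
  MOCK_GET_STORAGE(double, sums);
  return sums[0] + sums[1];
}
```

Because each macro both declares and assigns the variable, it can appear only once per scope for a given name.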
MHproposals may also request auxiliary terms. An InitErgmProposal.<NAME>()
or InitWtErgmProposal.<NAME>()
with an auxiliaries
formula will similarly receive the positions of its auxiliaries’ slots in the network. However, this appears to have a slight cost in speed and a potentially significant cost in memory, since the auxiliary may need to duplicate the information in the MH_
function.
Functions prefixed with Mi_
, Mu_
, and Mf_
serve, respectively, as the initializers, the updaters, and the finalizers of the MHproposal storage, though the old-style call with MHp->ntoggles==0
is also supported. Macros in the MHstorage.h
header file can be used to access storage the same way as for the statistics.
The function called to generate the proposal can have a prefix of either MH_
(for backwards compatibility) or Mp_
for consistency.
One important difference is that an Mp_
function is permitted to write to its private storage. This may be useful if, say, a systematic sample is desired.
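A sketch of a proposal that writes to its private storage to draw a systematic sample: it visits the directed dyads in a fixed order and remembers its position between calls. MockProposal and MockNetwork are hypothetical stand-ins, Mp_mock_systematic is an invented name, and the sketch assumes the storage was initialized to point to a counter starting at 0.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Vertex;

/* Hypothetical stand-ins for the proposal and network structures (assumption). */
typedef struct { Vertex nnodes; } MockNetwork;
typedef struct {
  void *storage;        /* private storage: index of the next dyad to propose */
  Vertex *toggletail;
  Vertex *togglehead;
  unsigned int ntoggles;
} MockProposal;

/* Propose directed dyads (tail != head) in row-major order, wrapping around. */
void Mp_mock_systematic(MockProposal *MHp, MockNetwork *nwp) {
  unsigned int *next = MHp->storage;
  assert(next != NULL && nwp->nnodes >= 2);
  Vertex n = nwp->nnodes;
  Vertex tail = *next / (n - 1) + 1;
  Vertex head = *next % (n - 1) + 1;
  if (head >= tail) head++;            /* skip the diagonal */
  MHp->toggletail[0] = tail;
  MHp->togglehead[0] = head;
  MHp->ntoggles = 1;
  *next = (*next + 1) % (n * (n - 1)); /* write back to private storage */
}
```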