netjack
This vignettes outlines how to develop custom network manipulation functions and network statistic functions for the netjack
package. netjack
was designed to be an easily extendible framework for working with large sets of networks, and the two entry points for custom development are:
We will go through how to develop custom functions now.
A custom network manipulation function takes as its argument a Net
object and any additional arguments for the procedure, and returns a named list of Net
objects. For example, this is the function to perform the node jackknife procedure. This removes each node from the network in turn, and produces a list of Net
objects, each named for the node that has been removed.
node_jackknife = function(Net){
toReturn = list()
for(i in 1:dim(Net@net)[[1]]){
toReturn[[i]] = methods::new("Net", net = Net@net[-i,-i],
net.name = Net@net.name,
node.variables = lapply(Net@node.variables, function(x, i){
return(x[-i])
}, i = i))
}
names(toReturn) = as.character(1:dim(Net@net)[[1]])
return(toReturn)
}
The node_jackknife
is a fairly simple network manipulation function, as it does not require any additional information. Here, it simply constructs a new adjacency matrix by omitting each row and column in turn, removes the node level variables for that node, and constructs a new Net
object.
A more complex network manipulation method might take an additional set of arguments. Take for example the absolute_threshold
function below:
absolute_threshold = function(Net, thresholds){
toReturn = list()
toReturn = lapply(thresholds, function(x, network){
print(x)
temp = (Net@net > x)*1
return(methods::new("Net", net = temp, net.name = Net@net.name, node.variables = Net@node.variables))
}, network = Net@net)
names(toReturn) = as.character(thresholds)
return(toReturn)
}
Again, this function takes as its arguments a single Net object, followed by a vector of thresholds. For each threshold, a thresholded adjacency matrix is constructed, and a new Net
object is produced. The list of these new Net
objects are then named for each threshold.
Finally, network manipulation functions can reference node level variables. Take the network_jackknife
function below:
network_jackknife = function(Net, network.variable){
toReturn = list()
networks = unlist(Net@node.variables[network.variable])
uniNets = unique(names(table(networks)))
for(i in 1:length(uniNets)){
toReturn[[i]] = methods::new("Net", net = Net@net[-which(networks == uniNets[i]),-which(networks == uniNets[i])],
net.name = Net@net.name,
node.variables = lapply(Net@node.variables, function(x, i){
return(x[-i])
}, i = which(networks == uniNets[i])))
}
names(toReturn) = as.character(uniNets)
return(toReturn)
}
Here, the network.variable
argument is a character argument that refers to a specific node level variable. This function removes all nodes associated with a specific community or sub network, and returns a list of new Nets
where each of the sub networks have been removed in turn.
A general framework for a network manipulation function is as follows:
network_manipulation <- function(Net, external.variables, node.variables){
toReturn = list()
for( all combinations of interest of both external variables and node variables ){
temporaryNetwork = some manipulation of the original Net@net
toReturn[[index]] = new("Net", the new net parameters)
}
names(toReturn) = some set of names
return(toReturn)
}
Any number of external packages or functions can be used to perform a network manipulation and as long as the wrapper function takes a single Net
object, and returns a list of named Net
objects, the netjack
framework will be able to utilize it.
This will let a set of network manipulation functions take advantage of further framework development, which includes plans to implement parallelization, and implementation in C++.
Network statistic functions are designed to be almost identical in form to the network manipulation functions. For example, take the modularity
function:
modularity <- function(Net, community.variable){
net = igraph::graph.adjacency(Net@net, mode = "undirected", weighted = T, diag = F)
mod = igraph::modularity(net, as.numeric(as.factor(Net@node.variables[[community.variable]])), weights = igraph::E(net)$weight)
return(mod)
}
netjack
’s modularity function is a wrapper for igraph
’s implementation of modularity, and this highlights the flexiblity of the netjack
’s framework for computing network statistics. A network statistic function takes a single Net
object, any external variables of interest, and character references to any node variables, and returns a single numeric value.
A general framework for creating a new network statistic function is as follows:
network_statistic <- function(Net, external.variables, node.variables){
statistic = doSomething(Net, external.variables, node.variables)
return(statistic)
}
Similarly to the network manipulation function, as long as a new network statistic function takes a Net
object and returns a single numerical value, netjack
can utilize it.
As netjack
is built to be a general framework for working with sets of networks, I (Teague Henry) aim at keeping the interface between user built functions and the framework that works with them as stable as possible. As such, there are a couple of aspects that are not likely to change version to version. Specifically:
The structure of a Net
object will only ever be added to. @net
and @node.variables
will always be components of the Net
object. Some safe accessors might be written later, but this would not invalidate any functions that directly access these slots.
The general form of both network manipulation functions and network statistic functions will not change. Both will take a Net
object, and return a named list of Net
objects or a numeric value respectively.
If any changes are made that I think might invalidate user’s custom code, this will be prominently listed in the change log.
netjack
If you have taken the time to read this far, you are likely writing some custom code to do interesting things! If you would like to make this code public, consider contributing directly to netjack
. Get in touch with me (Teague Henry, package author) via email or via GitHub, and I would be happy to list you as a contributor to the package, write documentation for your function crediting you, optimize the function (with permission of course), and put it into the next release.