Developing New Network Functions and Network Statistics for netjack

Teague Henry

2019-07-07

This vignette outlines how to develop custom network manipulation functions and network statistic functions for the netjack package. netjack was designed to be an easily extensible framework for working with large sets of networks, and the two entry points for custom development are network manipulation functions and network statistic functions.

The sections below walk through how to develop each kind of function.

Network Manipulation Functions

A custom network manipulation function takes as its argument a Net object and any additional arguments for the procedure, and returns a named list of Net objects. For example, this is the function to perform the node jackknife procedure. This removes each node from the network in turn, and produces a list of Net objects, each named for the node that has been removed.


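node_jackknife = function(Net){

  toReturn = list()
  nNodes = dim(Net@net)[[1]]

  for(i in seq_len(nNodes)){
    # Drop row and column i (drop = FALSE keeps the result a matrix),
    # and drop the i-th entry of every node level variable
    toReturn[[i]] = methods::new("Net", net = Net@net[-i, -i, drop = FALSE],
                        net.name = Net@net.name,
                        node.variables = lapply(Net@node.variables, function(x, i){
                          return(x[-i])
                        }, i = i))
  }
  # Name each new Net for the node that was removed
  names(toReturn) = as.character(seq_len(nNodes))
  return(toReturn)
}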

The node_jackknife function is a fairly simple network manipulation function, as it does not require any additional information. For each node in turn, it constructs a new adjacency matrix by omitting that node's row and column, removes the node level variables for that node, and constructs a new Net object.
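To make the input and output of node_jackknife concrete, here is a minimal sketch on a toy network. It does not load netjack: the Net class defined below is a stand-in whose slot layout (net, net.name, node.variables) is assumed from the slots used in this vignette, node_jackknife is reproduced in condensed form, and the toy data are made up for illustration.

```r
library(methods)

# Stand-in for netjack's Net class (assumed slot layout, illustration only)
setClass("Net", representation(net = "matrix",
                               net.name = "character",
                               node.variables = "list"))

# Condensed version of node_jackknife from above
node_jackknife = function(Net){
  nNodes = dim(Net@net)[[1]]
  toReturn = lapply(seq_len(nNodes), function(i){
    methods::new("Net", net = Net@net[-i, -i, drop = FALSE],
                 net.name = Net@net.name,
                 node.variables = lapply(Net@node.variables, function(x) x[-i]))
  })
  names(toReturn) = as.character(seq_len(nNodes))
  return(toReturn)
}

# A toy 4-node network with one node level variable
adj = matrix(c(0, 1, 1, 0,
               1, 0, 1, 1,
               1, 1, 0, 1,
               0, 1, 1, 0), nrow = 4, byrow = TRUE)
toy = methods::new("Net", net = adj, net.name = "toy",
                   node.variables = list(community = c("a", "a", "b", "b")))

jackknifed = node_jackknife(toy)
names(jackknifed)                            # "1" "2" "3" "4"
dim(jackknifed[["2"]]@net)                   # 3 3
jackknifed[["2"]]@node.variables$community   # "a" "b" "b"
```

Each element of the returned list is a complete Net object, so it can be passed directly to any network statistic function.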

A more complex network manipulation method might take an additional set of arguments. Take for example the absolute_threshold function below:

absolute_threshold = function(Net, thresholds){

  # For each threshold, binarize the adjacency matrix:
  # entries strictly greater than the threshold become 1, all others 0
  toReturn = lapply(thresholds, function(x, network){
    temp = (network > x)*1
    return(methods::new("Net", net = temp, net.name = Net@net.name, node.variables = Net@node.variables))
  }, network = Net@net)
  # Name each new Net for the threshold that produced it
  names(toReturn) = as.character(thresholds)
  return(toReturn)
}

Again, this function takes as its arguments a single Net object, followed by a vector of thresholds. For each threshold, a thresholded (binary) adjacency matrix is constructed, and a new Net object is produced. The list of these new Net objects is then named by threshold.
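As a quick illustration of how the thresholds map onto the returned list, the sketch below applies absolute_threshold to a small weighted network. It is self-contained rather than using netjack itself: the Net class is a stand-in with an assumed slot layout, the function is a condensed copy of the one above, and the weights are invented for the example.

```r
library(methods)

# Stand-in for netjack's Net class (assumed slot layout, illustration only)
setClass("Net", representation(net = "matrix",
                               net.name = "character",
                               node.variables = "list"))

# Condensed version of absolute_threshold from above
absolute_threshold = function(Net, thresholds){
  toReturn = lapply(thresholds, function(x){
    methods::new("Net", net = (Net@net > x) * 1,
                 net.name = Net@net.name,
                 node.variables = Net@node.variables)
  })
  names(toReturn) = as.character(thresholds)
  return(toReturn)
}

# A toy 3-node weighted, symmetric network
w = matrix(c(0.0, 0.3, 0.6,
             0.3, 0.0, 0.1,
             0.6, 0.1, 0.0), nrow = 3, byrow = TRUE)
toy = methods::new("Net", net = w, net.name = "toy", node.variables = list())

thresholded = absolute_threshold(toy, thresholds = c(0.2, 0.5))
names(thresholded)              # "0.2" "0.5"
sum(thresholded[["0.2"]]@net)   # 4: edges 1-2 and 1-3 survive (each counted twice)
sum(thresholded[["0.5"]]@net)   # 2: only edge 1-3 survives
```

Because the list names are the thresholds converted to character, individual results can be looked up by threshold, as in thresholded[["0.5"]].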

Finally, network manipulation functions can reference node level variables. Take the network_jackknife function below:

network_jackknife = function(Net, network.variable){

  toReturn = list()

  # Sub network label of every node, and the set of distinct labels
  networks = unlist(Net@node.variables[network.variable])
  uniNets = names(table(networks))
  for(i in seq_along(uniNets)){
    # Indices of all nodes belonging to the i-th sub network
    toRemove = which(networks == uniNets[i])
    toReturn[[i]] = methods::new("Net", net = Net@net[-toRemove, -toRemove, drop = FALSE],
                        net.name = Net@net.name,
                        node.variables = lapply(Net@node.variables, function(x, i){
                          return(x[-i])
                        }, i = toRemove))
  }
  # Name each new Net for the sub network that was removed
  names(toReturn) = as.character(uniNets)
  return(toReturn)
}

Here, the network.variable argument is a character string naming a specific node level variable. This function removes all nodes associated with a specific community or sub network, and returns a list of new Nets in which each of the sub networks has been removed in turn.

A general framework for a network manipulation function is as follows:

network_manipulation <- function(Net, external.variables, node.variables){

  toReturn = list()
  for( all combinations of interest of both external variables and node variables ){
    
    temporaryNetwork = some manipulation of the original Net@net
    toReturn[[index]] = new("Net", the new net parameters)
  }
  names(toReturn) = some set of names
  return(toReturn)
}

Any number of external packages or functions can be used to perform a network manipulation. As long as the wrapper function takes a single Net object (plus any additional arguments) and returns a named list of Net objects, the netjack framework will be able to use it.
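As one possible instance of the general framework above, the sketch below defines within_community, a hypothetical manipulation function that is not part of netjack. It returns one Net per community, each containing only that community's nodes. The Net class here is a stand-in with an assumed slot layout so that the example runs without netjack installed.

```r
library(methods)

# Stand-in for netjack's Net class (assumed slot layout, illustration only)
setClass("Net", representation(net = "matrix",
                               net.name = "character",
                               node.variables = "list"))

# Hypothetical manipulation function: keep only the nodes of each community in turn
within_community = function(Net, community.variable){
  labels = as.character(Net@node.variables[[community.variable]])
  communities = sort(unique(labels))
  toReturn = lapply(communities, function(comm){
    keep = which(labels == comm)
    methods::new("Net", net = Net@net[keep, keep, drop = FALSE],
                 net.name = Net@net.name,
                 node.variables = lapply(Net@node.variables, function(x) x[keep]))
  })
  # Name each new Net for the community it contains
  names(toReturn) = communities
  return(toReturn)
}

adj = matrix(c(0, 1, 1, 0,
               1, 0, 1, 1,
               1, 1, 0, 1,
               0, 1, 1, 0), nrow = 4, byrow = TRUE)
toy = methods::new("Net", net = adj, net.name = "toy",
                   node.variables = list(community = c("a", "a", "b", "b")))

subnets = within_community(toy, "community")
names(subnets)                           # "a" "b"
subnets[["a"]]@net                       # 2 x 2 matrix for nodes 1 and 2
subnets[["b"]]@node.variables$community  # "b" "b"
```

The function follows the same contract as the built-in examples: a single Net object in, a named list of Net objects out, with node level variables subset to match the new adjacency matrix.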

Following this interface lets custom network manipulation functions take advantage of further framework development, which includes plans for parallelization and for an implementation in C++.

Network Statistic Functions

Network statistic functions are designed to be almost identical in form to the network manipulation functions. For example, take the modularity function:


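modularity <- function(Net, community.variable){

  # Build a weighted, undirected igraph object from the adjacency matrix, ignoring the diagonal
  net = igraph::graph_from_adjacency_matrix(Net@net, mode = "undirected", weighted = TRUE, diag = FALSE)
  # Convert the community labels to numeric membership codes
  membership = as.numeric(as.factor(Net@node.variables[[community.variable]]))
  mod = igraph::modularity(net, membership, weights = igraph::E(net)$weight)
  return(mod)
}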

netjack’s modularity function is a wrapper for igraph’s implementation of modularity, which highlights the flexibility of netjack’s framework for computing network statistics. A network statistic function takes a single Net object, any external variables of interest, and character references to any node variables, and returns a single numeric value.

A general framework for creating a new network statistic function is as follows:

network_statistic <- function(Net, external.variables, node.variables){

  statistic = doSomething(Net, external.variables, node.variables)
  return(statistic)
  
}

As with network manipulation functions, as long as a new network statistic function takes a Net object and returns a single numeric value, netjack can use it.
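A minimal sketch of a custom statistic may help here. The function mean_strength below is hypothetical and not part of netjack. It returns the average weighted degree of a network as a single numeric value. The Net class is again a stand-in with an assumed slot layout, and the final vapply call is plain base R used only to show the statistic applied across a named list of Nets, not a description of how netjack applies statistics internally.

```r
library(methods)

# Stand-in for netjack's Net class (assumed slot layout, illustration only)
setClass("Net", representation(net = "matrix",
                               net.name = "character",
                               node.variables = "list"))

# Hypothetical network statistic: mean node strength (average row sum, ignoring the diagonal)
mean_strength = function(Net){
  adj = Net@net
  diag(adj) = 0
  return(mean(rowSums(adj)))
}

adj = matrix(c(0, 1, 1, 0,
               1, 0, 1, 1,
               1, 1, 0, 1,
               0, 1, 1, 0), nrow = 4, byrow = TRUE)
full = methods::new("Net", net = adj, net.name = "toy", node.variables = list())
# The same network with node 1 removed
reduced = methods::new("Net", net = adj[-1, -1, drop = FALSE],
                       net.name = "toy", node.variables = list())

mean_strength(full)      # 2.5
mean_strength(reduced)   # 2

# Applying the statistic across a named list of Nets
nets = list(full = full, reduced = reduced)
stats = vapply(nets, mean_strength, numeric(1))
stats                    # full = 2.5, reduced = 2
```

Using vapply with numeric(1) enforces the single-numeric-value contract: it raises an error if the statistic ever returns something other than one number.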

Developer Guarantees

As netjack is built to be a general framework for working with sets of networks, I (Teague Henry) aim to keep the interface between user-built functions and the framework that works with them as stable as possible. As such, there are a couple of aspects that are not likely to change from version to version. Specifically:

If any changes are made that I think might invalidate users’ custom code, they will be prominently listed in the change log.

Contribute to netjack

If you have taken the time to read this far, you are likely writing some custom code to do interesting things! If you would like to make this code public, consider contributing directly to netjack. Get in touch with me (Teague Henry, package author) via email or via GitHub, and I would be happy to list you as a contributor to the package, write documentation for your function crediting you, optimize the function (with permission of course), and put it into the next release.