Developing New Network Functions and Network Statistics for netjack

This vignettes outlines how to develop custom network manipulation functions and network statistic functions for the netjack package. netjack was designed to be an easily extendible framework for working with large sets of networks, and the two entry points for custom development are:

Network Manipulation Functions

A custom network manipulation function takes as its argument a Net object and any additional arguments for the procedure, and returns a named list of Net objects. For example, this is the function to perform the node jackknife procedure. This removes each node from the network in turn, and produces a list of Net objects, each named for the node that has been removed.


node_jackknife = function(Net){
  
  toReturn = list()

  for(i in 1:dim(Net@net)[[1]]){
    toReturn[[i]] = methods::new("Net", net = Net@net[-i,-i],
                        net.name = Net@net.name,
                        node.variables = lapply(Net@node.variables, function(x, i){
                          return(x[-i])
                        }, i = i))
  }
  names(toReturn) = as.character(1:dim(Net@net)[[1]])
  return(toReturn)
}

The node_jackknife is a fairly simple network manipulation function, as it does not require any additional information. Here, it simply constructs a new adjacency matrix by omitting each row and column in turn, removes the node level variables for that node, and constructs a new Net object.

A more complex network manipulation method might take an additional set of arguments. Take for example the absolute_threshold function below:

absolute_threshold = function(Net, thresholds){

  toReturn = list()
  toReturn = lapply(thresholds, function(x, network){
    print(x)
    temp = (Net@net > x)*1
    return(methods::new("Net", net = temp, net.name = Net@net.name, node.variables = Net@node.variables))
  }, network = Net@net)
  names(toReturn) = as.character(thresholds)
  return(toReturn)
}

Again, this function takes as its arguments a single Net object, followed by a vector of thresholds. For each threshold, a thresholded adjacency matrix is constructed, and a new Net object is produced. The list of these new Net objects are then named for each threshold.

Finally, network manipulation functions can reference node level variables. Take the network_jackknife function below:

network_jackknife = function(Net, network.variable){

  toReturn = list()

  networks = unlist(Net@node.variables[network.variable])
  uniNets = unique(names(table(networks)))
  for(i in 1:length(uniNets)){
    toReturn[[i]] = methods::new("Net", net = Net@net[-which(networks == uniNets[i]),-which(networks == uniNets[i])],
                        net.name = Net@net.name,
                        node.variables = lapply(Net@node.variables, function(x, i){
                          return(x[-i])
                        }, i = which(networks == uniNets[i])))
  }
  names(toReturn) = as.character(uniNets)
  return(toReturn)
}

Here, the network.variable argument is a character argument that refers to a specific node level variable. This function removes all nodes associated with a specific community or sub network, and returns a list of new Nets where each of the sub networks have been removed in turn.

A general framework for a network manipulation function is as follows:

network_manipulation <- function(Net, external.variables, node.variables){

  toReturn = list()
  for( all combinations of interest of both external variables and node variables ){
    
    temporaryNetwork = some manipulation of the original Net@net
    toReturn[[index]] = new("Net", the new net parameters)
  }
  names(toReturn) = some set of names
  return(toReturn)
}

Any number of external packages or functions can be used to perform a network manipulation and as long as the wrapper function takes a single Net object, and returns a list of named Net objects, the netjack framework will be able to utilize it.

This will let a set of network manipulation functions take advantage of further framework development, which includes plans to implement parallelization, and implementation in C++.

Network Statistic Functions

Network statistic functions are designed to be almost identical in form to the network manipulation functions. For example, take the modularity function:


modularity <- function(Net, community.variable){

  net = igraph::graph.adjacency(Net@net, mode = "undirected", weighted = T, diag = F)
  mod = igraph::modularity(net, as.numeric(as.factor(Net@node.variables[[community.variable]])), weights = igraph::E(net)$weight)
  return(mod)
}

netjack’s modularity function is a wrapper for igraph’s implementation of modularity, and this highlights the flexiblity of the netjack’s framework for computing network statistics. A network statistic function takes a single Net object, any external variables of interest, and character references to any node variables, and returns a single numeric value.

A general framework for creating a new network statistic function is as follows:

network_statistic <- function(Net, external.variables, node.variables){

  statistic = doSomething(Net, external.variables, node.variables)
  return(statistic)
  
}

Similarly to the network manipulation function, as long as a new network statistic function takes a Net object and returns a single numerical value, netjack can utilize it.

Developer Guarantees

As netjack is built to be a general framework for working with sets of networks, I (Teague Henry) aim at keeping the interface between user built functions and the framework that works with them as stable as possible. As such, there are a couple of aspects that are not likely to change version to version. Specifically:

The structure of a Net object will only ever be added to. @net and @node.variables will always be components of the Net object. Some safe accessors might be written later, but this would not invalidate any functions that directly access these slots.
The general form of both network manipulation functions and network statistic functions will not change. Both will take a Net object, and return a named list of Net objects or a numeric value respectively.

If any changes are made that I think might invalidate user’s custom code, this will be prominently listed in the change log.

Developing New Network Functions and Network Statistics for `netjack`

Teague Henry

2019-07-07

Network Manipulation Functions

Network Statistic Functions

Developer Guarantees

Contribute to `netjack`

Developing New Network Functions and Network Statistics for netjack

Teague Henry

2019-07-07

Network Manipulation Functions

Network Statistic Functions

Developer Guarantees

Contribute to netjack

Developing New Network Functions and Network Statistics for `netjack`

Contribute to `netjack`