This package performs a specific form of segmented linear regression over two independent variables. The visualization of the result resembles a quarter segment of a cowbell, which gives the package its name. The package has been constructed specifically for the case where the minimal and maximal values of the two independent variables and of the dependent variable are known a priori. This is usually the case if those values are obtained via Likert-scale questionnaires. The definition domain of the resulting regression function is sketched in the following picture.
The model of the regression function contains 4 degrees of freedom.
When one of the two independent variables is at its minimal value, a predefined minimal value (1st degree of freedom) for the dependent variable is returned. When both independent variables are at or above their breakpoint values (2nd and 3rd degrees of freedom), a defined maximum value (4th degree of freedom) is returned. In the two indicated regions of the definition domain the value increases linearly, depending on whichever of the two independent variables is smaller in relation to its breakpoint.
The model takes some time to compute because a gradient optimizer is used to solve the nonlinear optimization problem.
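The shape described above can be written down as a small function. The following is a minimal sketch, not the package's implementation: the name `cowbell_sketch`, its arguments, and the parameter values in the example call are hypothetical and chosen only for illustration. It assumes each breakpoint lies strictly above the corresponding minimum.

```r
# Hypothetical helper, not part of the package: evaluates the model shape
# described in the text for a given set of parameters.
cowbell_sketch <- function(x1, x2, min1, min2, bp1, bp2, y_min, y_max) {
  # Relative position of each variable between its minimum (0) and its breakpoint (1)
  r1 <- (x1 - min1) / (bp1 - min1)
  r2 <- (x2 - min2) / (bp2 - min2)
  # The relatively smaller variable drives the value; the cap at 1 yields
  # the constant maximum once both variables have passed their breakpoints
  r <- pmin(r1, r2, 1)
  y_min + (y_max - y_min) * r
}

# Illustrative parameters only (made up, not fitted to any data)
cowbell_sketch(3, 6, min1 = 1, min2 = 1, bp1 = 4.5, bp2 = 5.5, y_min = 2, y_max = 8)
```

Expressing both variables as a fraction of the distance from minimum to breakpoint makes "the smaller of the two in relation to the breakpoint" a plain `pmin`, which also keeps the sketch vectorized.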
This regression model may be worth trying under the following conditions:
As a first step, a concept is defined. The concept contains the formula of the regression model to be calculated and the minimal and maximal values of the variables included. The formula contains exactly one dependent and two independent variables. The minimal and maximal values of those variables follow, in the same sequence as in the formula. If one of those values is omitted, the corresponding minimum or maximum found in the data set will be used.
concept<-generateCowbellConcept(Fun ~ Fluency * Absorption, 1, 9, 1, 7, 1, 7)
Here Fun is measured in the range 1 to 9, and Fluency and Absorption in the range 1 to 7. In a second step, the concept is applied to a data set. Assuming we apply it to the data set allFun:
test<-generateCowbell(concept, allFun)
As additional parameters, the number of iteration steps and the learning rate can be specified. Feel free to experiment with them to see whether they give you a better result. Increasing the number of iteration steps results in longer computation times.
The next command of interest for the result is summary:
summary(test)
which results in an output like the following:
## Working on full model.
## Working on model without breakpoint.
## Base concept:
## ===================================================================
## Formula: Fun ~ Fluency * Absorption
## Fluency : 1 ... 7
## Absorption : 1 ... 7
## Fun : 1 ... 9
##
## Base definition segmented two dimensional linear regression:
## ===================================================================
## Minimal Value dependend variable Fun : 1.970386
## Maximal Value dependend variable Fun : 7.638431
## Breaking point of independend variable Fluency : 4.386884
## Breaking point of independend variable Absorption : 5.565985
##
## Computation of Fun :
## ===================================================================
## Fun<-function(Fluency, Absorption)
## {
## if ((Fluency >= 4.38688353208366) && (Absorption >= 5.56598516657318))
## return (7.63843073483172)
## if (Fluency * (-4.56598516657318) + Absorption * 3.38688353208366 > -1.17910163448952)
## return (0.296857302471897 + Fluency * 1.67352822993063)
## else
## return (0.729022497777713 + Absorption * 1.24136303462482)
## }
##
##
## Statistics:
## ===================================================================
## R Squared: 0.3843271
## F-Statistic (comparison constant function): 167.7977 on 3 and 809 DF p-value: 1.436796e-84
##
## ===================================================================
## No breaking point - reduced model
## ===================================================================
## Minimal Value dependend variable Fun : 2.517464
## Maximal Value dependend variable Fun : 9
## Computation of Fun :
## ===================================================================
## Fun<-function(Fluency, Absorption)
## {
## if (Fluency * (-6) + Absorption * 6 > 0)
## return (1.43704127615332 + Fluency * 1.08042267483524)
## else
## return (1.43704127615332 + Absorption * 1.08042267483524)
## }
##
##
## Statistics:
## ===================================================================
## R Squared: 0.3695049
## F-Statistic (comparison full model against reduced): 9.738247 on 2 and 809 DF p-value: 6.619721e-05
First, the output contains the information that was provided in the concept (the formula and the minimal and maximal values). Then follow the fitted values of the four degrees of freedom. Afterwards, the source code for an explicit R function that computes the regression values is given. Finally, the R squared and the F-statistic of the model compared to a constant function (the average) are reported.
To estimate the significance of the breaking point, an additional model with two fewer degrees of freedom is computed. In that model, the breaking point is artificially set to the predefined maximal values of the independent variables. The same values as for the full model are reported here as well. The difference lies in the F-statistic: in this case the full model is compared to the reduced model with two fewer degrees of freedom to test for the significance of the breaking point.
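The comparison of the full and the reduced model can be retraced by hand from the printed summary. The sketch below uses the textbook F-test for nested models expressed through R squared. It is an assumption that the package computes its statistic this way; the variable names are illustrative, and the numbers are copied from the output above.

```r
# Values copied from the summary output shown earlier
r2_full    <- 0.3843271   # R squared of the full model
r2_reduced <- 0.3695049   # R squared of the reduced model (no breaking point)
df_diff    <- 2           # degrees of freedom dropped in the reduced model
df_resid   <- 809         # residual degrees of freedom of the full model

# Textbook F statistic for nested models (assumes both R squared values
# refer to the same total sum of squares)
f_stat  <- ((r2_full - r2_reduced) / df_diff) / ((1 - r2_full) / df_resid)
p_value <- pf(f_stat, df_diff, df_resid, lower.tail = FALSE)

f_stat
p_value
```

With the rounded R squared values from the summary, this comes out close to the reported F-statistic of 9.738247 and p-value of 6.619721e-05, which suggests the printed test is of this nested-model form.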
A visualization of the resulting function can be obtained by
plot(test)
which results in a visualization like the following:
Also the following generic functions that are often used in R for prediction functionality are implemented: