Sparse price information can become a problem when running Price Sensitivity Meter Analyses, particularly in small samples. In order to deal with this issue, the pricesensitivitymeter package has an option interpolate
to approximate values on the price curves for all values where the actual information from respondents is missing.
This vignette illustrates the underlying problem and explains how the interpolate
argument provides a solution.
Consider an example with the following price information.
set.seed(1976)
# create random variables
<- sort(round(rnorm(n = 50, mean = 8, sd = 1.5), digits = 2))
toocheap <- sort(round(rnorm(n = 50, mean = 20, sd = 1), digits = 2))
tooexpensive <- sort(round(rnorm(n = 50, mean = 15, sd = 3), digits = 2))
cheap <- sort(round(rnorm(n = 50, mean = 16, sd = 3), digits = 2))
expensive
# setting up alternative "cheap"/"expensive" variables with a price step
<- which(cheap > 14.25 & cheap < 17.5)
pricestep_cheap <- cheap
cheap_alt <- round(mean(cheap_alt[pricestep_cheap]), digits = 2)
cheap_alt[pricestep_cheap]
<- which(expensive > 14 & expensive < 16)
pricestep_expensive <- expensive
expensive_alt <- round(mean(expensive_alt[pricestep_expensive]), digits = 2) expensive_alt[pricestep_expensive]
When we run a Price Sensitivity Meter Analysis on the data and visualize it, we get the following result. (The plot is slightly simplified: It does not show the “too cheap” and “too expensive” curves, as they are not relevant for the example. Moreover, it is slightly zoomed in.)
library(pricesensitivitymeter)
# Running the analysis with price steps and without interpolation
<- psm_analysis(toocheap = toocheap,
psm_steps cheap = cheap_alt,
expensive = expensive_alt,
tooexpensive = tooexpensive,
validate = TRUE,
interpolate = FALSE)
library(ggplot2)
# Plot with price steps
<- ggplot(data = psm_steps$data_vanwestendorp, aes(x = price)) +
psmplot_steps geom_line(aes(y = ecdf_not_cheap, # line: not cheap
colour = "not cheap"),
size = 1) +
geom_line(aes(y = ecdf_not_expensive, # line: not expensive
colour = "not expensive"),
size = 1) +
geom_rug() +
annotate(geom = "point", # Indifference Price Point (intersection of "not cheap" and "not expensive")
x = psm_steps$idp,
y = psm_steps$data_vanwestendorp$ecdf_not_expensive[psm_steps$data_vanwestendorp$price == psm_steps$idp],
size = 5,
shape = 18,
colour = "#D55E00")
+
psmplot_steps labs(x = "Price",
y = "Share of Respondents (0-1)",
title = "Price Sensitivity Meter Plot (without interpolation)") +
scale_colour_manual(name = "Legend",
values = c("not cheap" = "#009E73",
"not expensive" = "#D55E00")) +
annotate(geom = "text", # Label of Indifference Price Point
x = psm_steps$idp + 1,
y = psm_steps$data_vanwestendorp$ecdf_not_expensive[psm_steps$data_vanwestendorp$price == psm_steps$idp] - 0.05,
label = paste("IDP: ", psm_steps$idp)) +
coord_cartesian(xlim = c(10, 20)) +
theme_minimal()
The Indifference Price Point (IDP) in this plot is not at the actual intersection of the “not cheap” and the “not expensive” curve. This is due to the data structure: We have sparse information for prices between 14 and 15. This can be seen in the rug plot close to the x-axis: Each vertical line indicates a price for which we have information in our data - however, there is a considerable gap around 14-15 (caused by the initial data manipulation that added price steps). The combination of price steps (as can be seen in the steep slope of the curves in this area) and sparse data points creates this problem. If the interpolate
argument is set to FALSE
, the psm_analysis()
function uses the first price for which the “not expensive” curve is higher than the “not cheap” curve, which is the case for a price of 15.13 in our example.
A larger sample size is usually sufficient to overcome this issue. However, this might not always be possible. Therefore, the psm_analysis()
function has an optional argument which helps to interpolate the pricing curve in cases of sparse information.
interpolate
ArgumentThe psm_analysis
function includes an optional argument interpolate
. If this argument is set to TRUE
, the function uses linear interpolation to estimate the price curves between the price points that are provided by the respondents. Consider an easy example: If the “not cheap” function is at 20% for a price of 5.99 and at 21% for a price of 6.01, the function would estimate that it is at 20.5% for a price of 6.00. The function uses price steps as defined in the interpolation_steps
argument, which has a default value of 0.01.1
Below are the results for the same data when the interpolate
function is set to TRUE
.
# Running the analysis with interpolation
<- psm_analysis(toocheap = toocheap,
psm_interpolated cheap = cheap_alt,
expensive = expensive_alt,
tooexpensive = tooexpensive,
validate = TRUE,
interpolate = TRUE)
# Plot with interpolated price steps
<- ggplot(data = psm_interpolated$data_vanwestendorp, aes(x = price)) +
psmplot_interpolated geom_line(aes(y = ecdf_not_cheap, # line: not cheap
colour = "not cheap"),
size = 1) +
geom_line(aes(y = ecdf_not_expensive, # line: not expensive
colour = "not expensive"),
size = 1) +
geom_rug() +
annotate(geom = "point", # Indifference Price Point (intersection of "cheap" and "expensive")
x = psm_interpolated$idp,
y = psm_interpolated$data_vanwestendorp$ecdf_not_expensive[psm_interpolated$data_vanwestendorp$price == psm_interpolated$idp],
size = 5,
shape = 18,
colour = "#D55E00")
+
psmplot_interpolated labs(x = "Price",
y = "Share of Respondents (0-1)",
title = "Price Sensitivity Meter Plot (with interpolation)") +
scale_colour_manual(name = "Legend",
values = c("not cheap" = "#009E73",
"not expensive" = "#D55E00")) +
annotate(geom = "text", # Label of Indifference Price Point
x = psm_interpolated$idp + 1,
y = psm_interpolated$data_vanwestendorp$ecdf_not_expensive[psm_interpolated$data_vanwestendorp$price == psm_interpolated$idp],
label = paste("IDP: ", psm_interpolated$idp)) +
coord_cartesian(xlim = c(10, 20)) +
theme_minimal()
The rug plot demonstrates now that the data includes information for more pricing steps due to the interpolation.
The following plot demonstrates the close relationship between the original data and the interpolated data: The thicker semi-transparent line shows the interpolated values of the “not cheap” and “not expensive” curves. The thinner solid lines are the original data.
It may make sense to set a higher value when this subdivision is not practical in the currency - e.g. for Japanese Yen or Chilean Pesos - or when the product for which the price is assessed is quite expensive. In both cases, a reduced granularity will not affect the results, but increase the computational efficiency.↩︎