Packages used in this vignette.
This vignette demonstrates the use of two functions from the docxtools package:
- format_engr() for formatting numbers in engineering notation
- align_pander() for aligning table columns using a simple pander table style

The primary goal of format_engr() is to present numeric variables in a data frame in engineering format, that is, scientific notation with exponents that are multiples of 3. Compare:
syntax | expression |
---|---|
computer | \(1.011E+5\) |
mathematical | \(1.011\times10^{5}\) |
engineering | \(101.1\times10^{3}\) |
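The relationship between the mathematical and engineering rows can be checked with a few lines of base R. This is a minimal sketch; `engr_parts()` is a hypothetical helper written for illustration and is not part of docxtools. It assumes a single nonzero number.

```r
# Split a nonzero number into an engineering-notation mantissa and exponent.
# engr_parts() is a hypothetical helper for illustration, not a docxtools function.
engr_parts <- function(x) {
  # Power of ten, rounded down to the nearest multiple of 3
  exponent <- 3 * floor(log10(abs(x)) / 3)
  mantissa <- x / 10^exponent
  list(mantissa = mantissa, exponent = exponent)
}

p <- engr_parts(1.011e5)
p$exponent  # 3
p$mantissa  # 101.1
```

The mantissa always lies between 1 and 1000 in magnitude, which is why the engineering form of \(1.011\times10^{5}\) is \(101.1\times10^{3}\).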
This example uses a small data set, density, included with docxtools, with temperature in K, pressure in Pa, the gas constant in J kg\(^{-1}\) K\(^{-1}\), and density in kg m\(^{-3}\).
density
#> date trial T_K p_Pa R density
#> 1 2018-06-12 a 294.05 101100 287 1.197976
#> 2 2018-06-13 b 294.15 101000 287 1.196384
#> 3 2018-06-14 c 294.65 101100 287 1.195536
#> 4 2018-06-15 d 293.35 101000 287 1.199647
#> 5 2018-06-16 e 293.85 101100 287 1.198791
Four of the variables are numeric. The date variable is of type “double” but class “Date”, so it is not reformatted.
purrr::map_chr(density, class)
purrr::map_chr(density, typeof)
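The distinction between type and class can also be seen in base R without any extra packages. The sketch below uses a small stand-in data frame (not the density data set itself) and `sapply()` in place of `map_chr()`.

```r
# A Date is stored as a double (days since 1970-01-01) but carries class "Date".
d <- as.Date("2018-06-12")
typeof(d)      # "double"
class(d)       # "Date"
is.numeric(d)  # FALSE, so a Date is not treated as a numeric variable

# Stand-in data frame with one Date column and one numeric column
df <- data.frame(date = d, T_K = 294.05)
col_class <- sapply(df, class)
col_type  <- sapply(df, typeof)
col_class
col_type
```

Both columns have type "double", but only T_K has class "numeric".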
Usage is format_engr(x, sigdig = NULL, ambig_0_adj = FALSE). The function returns a data frame with all numeric values reformatted as character strings in engineering format with math delimiters $...$.
density_engr <- format_engr(density)
density_engr
#> date trial T_K p_Pa R density
#> 1 2018-06-12 a $294.0$ ${101.1}\\times 10^{3}$ $287.0$ $1.198$
#> 2 2018-06-13 b $294.2$ ${101.0}\\times 10^{3}$ $287.0$ $1.196$
#> 3 2018-06-14 c $294.6$ ${101.1}\\times 10^{3}$ $287.0$ $1.196$
#> 4 2018-06-15 d $293.4$ ${101.0}\\times 10^{3}$ $287.0$ $1.200$
#> 5 2018-06-16 e $293.8$ ${101.1}\\times 10^{3}$ $287.0$ $1.199$
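To make the string format in this output concrete, here is a rough base R sketch that builds one such string from a single nonzero number. `engr_string()` is a hypothetical helper for illustration only; it is not how docxtools is implemented, and it ignores edge cases such as zero or vector input.

```r
# engr_string() is a hypothetical helper, not part of docxtools.
# It handles one nonzero number at a time.
engr_string <- function(x, sigdig = 4) {
  exponent <- 3 * floor(log10(abs(x)) / 3)     # power of ten, multiple of 3
  mantissa <- signif(x / 10^exponent, sigdig)  # round to sigdig significant digits
  n_int    <- floor(log10(abs(mantissa))) + 1  # digits left of the decimal point
  decimals <- max(sigdig - n_int, 0)           # decimals needed to show sigdig digits
  m_chr    <- formatC(mantissa, format = "f", digits = decimals)
  if (exponent == 0) {
    paste0("$", m_chr, "$")
  } else {
    paste0("${", m_chr, "}\\times 10^{", exponent, "}$")
  }
}

engr_string(287)       # "$287.0$"
engr_string(1.197976)  # "$1.198$"

# print() shows the backslash escaped as \\; cat() shows the actual string
cat(engr_string(101100), "\n")  # ${101.1}\times 10^{3}$
```

The doubled backslash in the printed data frame is only R's escaped display of a single backslash in the underlying string.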
The formerly numeric variables are now characters. Non-numeric variables are returned unaltered.
purrr::map_chr(density_engr, class)
The math formatting is applied when the data frame is printed in the output document. For example, we can use knitr::kable() to print the formatted data.
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
2018-06-13 | b | \(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-14 | c | \(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-15 | d | \(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
2018-06-16 | e | \(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
The function is compatible with the pipe operator.
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
2018-06-13 | b | \(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-14 | c | \(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
2018-06-15 | d | \(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
2018-06-16 | e | \(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
Comments:
format_engr() has three arguments:
- x, a data frame with at least one numerical variable.
- sigdig, an optional vector of significant digits. Default is 4, used when sigdig is left as NULL.
- ambig_0_adj, an optional logical to adjust the notation in the event of ambiguous trailing zeros. Default is FALSE.

The sigdig argument can be a single value, applied to all numeric columns.
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-13 | b | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-14 | c | \(295\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-15 | d | \(293\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
2018-06-16 | e | \(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
Alternatively, sigdig can be a vector that assigns a number of significant digits to each numeric column individually. A zero returns the variable in its original form.
date | trial | T_K | p_Pa | R | density |
---|---|---|---|---|---|
2018-06-12 | a | \(294.05\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1980\) |
2018-06-13 | b | \(294.15\) | \({101.0}\times 10^{3}\) | \(287\) | \(1.1964\) |
2018-06-14 | c | \(294.65\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1955\) |
2018-06-15 | d | \(293.35\) | \({101.0}\times 10^{3}\) | \(287\) | \(1.1996\) |
2018-06-16 | e | \(293.85\) | \({101.1}\times 10^{3}\) | \(287\) | \(1.1988\) |
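The idea of one significant-digit setting per column, with zero meaning "leave as is", can be imitated in base R with `signif()`. This is only an illustrative sketch on a stand-in data frame; the `sigdig` values here are assumptions chosen for the example, and the code does not reproduce the string formatting that format_engr() performs.

```r
# Stand-in data: two numeric columns taken from the first rows of the table above
df <- data.frame(T_K = c(294.05, 294.15), density = c(1.197976, 1.196384))

# One entry per numeric column; 0 means "return the column unchanged"
sigdig <- c(0, 3)

# Apply the matching setting to each column
rounded <- Map(function(col, n) if (n == 0) col else signif(col, n), df, sigdig)
rounded <- as.data.frame(rounded)
rounded
```

Here T_K keeps its original values, while both density values round to 1.2 at three significant digits.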
Subset the data to look at just the numerical variables.
Print the data with incrementally decreasing significant digits.
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.198\) |
\(294.2\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.196\) |
\(294.6\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.196\) |
\(293.4\) | \({101.0}\times 10^{3}\) | \(287.0\) | \(1.200\) |
\(293.8\) | \({101.1}\times 10^{3}\) | \(287.0\) | \(1.199\) |
With three digits there is no ambiguity: every digit shown is significant.
T_K | p_Pa | R | density |
---|---|---|---|
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(295\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294\) | \({101}\times 10^{3}\) | \(287\) | \(1.20\) |
With 2 digits, we have three columns with ambiguous trailing zeros: a reader cannot tell whether the final zero in 290 or the zeros in 100 are significant.
T_K | p_Pa | R | density |
---|---|---|---|
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
\(290\) | \({100}\times 10^{3}\) | \(290\) | \(1.2\) |
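The source of the ambiguity is easy to reproduce with base R's `signif()`. The values below are taken from the first row of the density data; the sketch only shows the rounding, not the formatting.

```r
# First row of the numeric columns
x <- c(T_K = 294.05, p_Pa = 101100, R = 287, density = 1.197976)

# Round each value to 2 significant digits
x2 <- signif(x, 2)
x2
# T_K and R both become 290 and p_Pa becomes 100000: the written number has
# more digits than are significant, so the trailing zeros are only placeholders.
# density becomes 1.2, which shows exactly two digits and is unambiguous.
```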
Setting the ambig_0_adj argument to TRUE removes the ambiguity by moving to the next higher power of ten that is a multiple of 3, so that the mantissa shows only the significant digits.
T_K | p_Pa | R | density |
---|---|---|---|
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
\({0.29}\times 10^{3}\) | \({0.10}\times 10^{6}\) | \({0.29}\times 10^{3}\) | \(1.2\) |
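The adjustment seen in this table can be sketched as a simple rule: if the mantissa has more digits to the left of the decimal point than there are significant digits, divide it by 1000 and raise the exponent by 3. The helper `adjust_ambiguous()` below is hypothetical and the rule is an assumption inferred from the tables, not a description of the docxtools internals.

```r
# adjust_ambiguous() is a hypothetical helper, not part of docxtools.
# mantissa and exponent describe a number in engineering notation.
adjust_ambiguous <- function(mantissa, exponent, sigdig) {
  n_int <- floor(log10(abs(mantissa))) + 1  # digits left of the decimal point
  if (n_int > sigdig) {
    # Trailing zeros would be ambiguous: shift to the next multiple of 3
    mantissa <- mantissa / 1000
    exponent <- exponent + 3
  }
  list(mantissa = mantissa, exponent = exponent)
}

a <- adjust_ambiguous(290, 0, sigdig = 2)  # 0.29 with exponent 3
b <- adjust_ambiguous(100, 3, sigdig = 2)  # 0.10 with exponent 6
d <- adjust_ambiguous(1.2, 0, sigdig = 2)  # unchanged: 1.2 with exponent 0
```

Under this rule a value such as 1.2 is left alone, which matches the density column in the table above.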
The ambiguous trailing zero adjustment is applied only to those variables for which the condition exists. For example, consider these data without the adjustment,
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294.2\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(294.6\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293.4\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
\(293.8\) | \({100}\times 10^{3}\) | \(287\) | \(1.20\) |
The same numbers with ambig_0_adj = TRUE,
T_K | p_Pa | R | density |
---|---|---|---|
\(294.0\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(294.2\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(294.6\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(293.4\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
\(293.8\) | \({0.10}\times 10^{6}\) | \(287\) | \(1.20\) |
and only the pressure variable has a reformatted power of ten because it is the only variable that had ambiguous trailing zeros.