mplot.lm() for models where broom::tidy() doesn’t record residuals.
mosaicCore, makeFun() now accepts one-sided formulas such as makeFun(~ x^2).
p = 0 or p = 1 to the q-function throws an error. [See #779]
compare(), and design_plot().
mplot(model, which = 1) now uses raw residuals rather than standardized/studentized. This matches behavior in plot().
na.rm argument to prop.test().
qdata() so that it is always a named vector.
cdata() so that it is always a data frame. Also changed names to “lo” and “hi”.
xpchisq() caused by introducing explicit arguments and failing to retain ... (Issue #737)
xpt() caused by introducing explicit arguments and failing to handle missing ncp correctly. (Issue #736)
googleMap() has been deprecated due to a change in policy at Google. Try leaflet_map() as an alternative.
do().
xpt(), xqt(), etc. now have more explicit arguments. This provides additional help and prompts for the user.
percs() and counts() are re-exported from mosaicCore
confint(), attempting to set the confidence level using conf.level instead of level throws an error and provides a reminder to use level for that purpose.
confint() methods for binom.test() have been modified a bit. See documentation for how names map to methods.
ggformula is used for plotting in more places (replacing older lattice code).
CIdata() now handles negative numbers correctly.
mplot.lm() now removes points with leverage 1 to avoid errors and warnings; a warning message notifies which points have been removed.
TukeyHSD() now correctly follows system = "gg"
mplot.lm() now uses ggrepel to place labels and offers additional controls for the smooth curve that is overlaid. [gg version of plots only]
orrr(), oddsRatio(), and relrisk() now accept a 2x2 data frame to match claims in documentation.
cor(~y, ~x)
prop.test() so it handles success argument properly for 2-way tables.
ggformula.
which argument added to mplot.TukeyHSD().
ggformula.
mosaic compatible with ggplot2 version 3.0.
ggplot2 rather than lattice by default.
cnorm(), ct(), xcnorm(), and xct() added to find central portions of distributions.
mosaicCore
mosaicCore.
ggplot2 rather than lattice.
mplot() on linear models when system = "gg".
formals().
xpnorm() and friends now use ggplot2 and can return the plot object, if requested.
t.test() has been completely reimplemented. It no longer supports “bare variable mode”, but it is more similar to stats::t.test() in some cases.
gwm() has been removed since it no longer works with the current version of dplyr.
mosaicModel package.
props() and counts() have been added. They are a bit like tally() but designed to play well with df_stats(). Currently the formula versions drop missing data, but that will likely be determined by a user-supplied option in the future.
mosaicCalc.
mosaic depends on ggformula, so users will have lattice, ggplot2, and ggformula available after loading mosaic.
mplot() on a data frame supports ggformula now.
ggformula has been added.
lattice and ggformula has been added.
mosaic to mosaicCore. This should not affect users of mosaic.
tally() now provide names to dimnames in cases where they were previously missing. This was needed for the refactoring of bargraph().
bargraph() to use tally() for tabulation. This means the behavior of bargraph() should match expectations of users of tally() better than it did before. In particular, proportions now sum to 1 in each panel of a multi-panel plot.
tally() so the proportions computed when format = "proportion" are easier to predict.
prop(x ~ y) was reporting overall proportions rather than marginal proportions.
value(), a generic with several methods for extracting a “value” from a more complicated object. Useful for extracting values from output of uniroot(), nlm(), integrate(), cubature::adaptIntegrate() without needing to know just how those values are stored in the object.
prop(a ~ b) to compute joint rather than conditional proportions.
favstats(), mean(), sd(), etc.) now require that the first argument be a formula. This was always the preferred method, but some functions allowed bare variable names to be used instead. As a specific example, the following code now generates an error (unless there is another object named age in your environment).
favstats(age, data = HELPrct)
## Error in typeof(x) : object 'age' not found
Replace this with
favstats( ~ age, data = HELPrct)
##  min Q1 median Q3 max     mean       sd   n missing
##   19 30     35 40  60 35.65342 7.710266 453       0
ggplot2.
mplot.data.frame() allow it to work with an expression that evaluates to a data frame. ASH plots are now a choice for 1-variable plots.
deltaMethod() has been moved to a separate package (called deltaMethod) to reduce package dependencies
cull_for_do.lm() now returns a data frame instead of a vector. This makes it easier for do() to bind things together by column name.
makeMap() updated to work with new version of ggplot2.
cdata(), ddata(), pdata(), qdata() and rdata() have been reordered so that the formula comes first.
rflip() has been improved.
dfapply(), also default value for select changed to TRUE.
inspect(), which is primarily intended to give an overview of the variables in a data frame, but handles some additional objects as well.
data argument is not an environment or data frame.
mm() has been deprecated and replaced with gwm() which does groupwise models where the response may be either categorical or quantitative.
plotModel(). This is likely still not the final version, but we are getting closer.
do().
dotPlot() are now the same size in all panels of multi-panel plots.
cdist() has been rewritten.
mplot() on a data frame now (a) prompts the user for the type of plot to create and (b) has an added option to make line plots for time series and the like.
resample() can now do residual resampling from a linear model.
do() to create common bootstrap confidence intervals. In particular, confint() can now calculate three kinds of intervals in many common situations.
fetchData(), fetchGoogle(), and fetchGapminder() have been moved to a separate package, called fetch().
plotModel() can be used to show data and model fits for a variety of models created with lm() or glm().
mosaicData a dependency of mosaic. This avoids the problem of users forgetting to separately load the mosaicData package.
fetchGoogle() (and perhaps read.file()) from future versions of the package. More and more packages are providing utilities for bringing data into R and it doesn’t make sense for us to duplicate those efforts in this package. For google sheets, you might take a look at the googlesheets package which is available via github now and will be on CRAN soon.
binom.test(), prop.test(), and t.test(), which have also undergone some internal restructuring. The objects returned now do a better job of reporting about the test conducted. In particular, binom.test() and prop.test() will report the value of success used. (#450, #455)
binom.test() can now compute several different kinds of confidence intervals including the Wald, Plus-4 and Agresti-Coull intervals. (#449)
derivedFactor() now handles NAs without throwing a warning. (#451)
pdist(), pdist() and related functions now do a better (i.e., useful) job with discrete distributions (#417)
t.test() and all the “aggregating” functions like mean() and favstats(). In particular, it is now possible to reference variables both in the data argument and in the calling environment. (#435)
CIAdata() now provides a message indicating the source URL for the data retrieved (#444)
CIAdata() that seem to be related to a change in file format at the CIA World Factbook website. The “inflation” data set is still broken (on the CIA website). (#441)
read.file() now uses functions from readr in some cases. A message is produced indicating which reader is being used. There are also some API changes. In particular, character data will be returned as character rather than factor. See factorize() for an easy way to convert things with few unique values into factors. (#442)
mutate() is used in place of transform() in the examples. (#452)
tally() now produces counts by default for all formula shapes. Proportions or percentages must be requested explicitly. This is to avoid common errors, especially when feeding the results into chisq.test().
msummary(). Usually this is identical to summary(), but for a few kinds of objects it provides modified output that is less verbose.
do * lm( ) will now keep track of the F statistic, too.
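As a rough picture of what keeping track of the F statistic across resamples means, here is a minimal base R sketch. It does not use do(); replicate(), the built-in cars data, and the helper name boot_f are stand-ins chosen purely for illustration, and the columns that do() actually records may differ.

```r
# Hypothetical helper: refit the model on a bootstrap resample of `data`
# and return the coefficients together with the F statistic.
boot_f <- function(data) {
  resampled <- data[sample(nrow(data), replace = TRUE), , drop = FALSE]
  fit <- lm(dist ~ speed, data = resampled)
  c(coef(fit), F = unname(summary(fit)$fstatistic["value"]))
}

set.seed(1)
# One row per resample: intercept, slope, and F statistic.
results <- as.data.frame(t(replicate(100, boot_f(cars))))
head(results)
```

Each row corresponds to one resample, so the F column can be summarized or plotted like any other recorded statistic.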
confint() applied to an object produced using do() now does more appropriate things.
binom.test() and prop.test() now set success = 1 by default on 0-1 data to treat 0 like failure and 1 like success. Similarly, prop() and count() set level = 1 by default.
CIsim() can now produce plots and does so by default when samples <= 200.
add=TRUE improved for plotDist().
swap() which is useful for creating randomization distributions for paired designs. The current implementation is a bit slow.
MAD(), SAD(), and quantile().
docFile() introduced to simplify accessing files included with package documentation. read.file() enhanced to take a package as an argument and look among package documentation files.
factorize() introduced as a way to convert vectors with few unique values into factors. Can be applied to an entire data frame.
NHANES contains the NHANES data set and mosaicData contains the other data sets.
MAD() and SAD() were added to compute mean and sum of all pairs of absolute differences.
rspin() has been added to simulate spinning a spinner.
mosaic package to simplify R for beginners.
mosaic package.
plotFun() has been improved so that it does a better job of selecting points where the function is evaluated and no longer warns about NaNs encountered while exploring the domain of the function.
oddsRatio() has been redesigned and relrisk() has been added. Use their summary() methods or verbose=TRUE to see more information (including confidence intervals).
Birthdays data set.
mplot() and several instances have been added to make a number of plots easy to generate. There are methods for objects of classes "data.frame", "lm", "summary.lm", "glm", "summary.glm", "TukeyHSD", and "hclust". For several of these there are also fortify methods that return the data frame created to facilitate plotting.
read.file() now handles (some?) https URLs and accepts an optional argument filetype that can be used to declare the type of data file when it is not identified by extension.
useNA in the tally() function has changed to "ifany".
mosaic now depends on dplyr both to use some of its functionality and to avoid naming collisions with functions like tally() and do(), allowing mosaic and dplyr to coexist more happily.
dotPlot(). In particular, the size of the dots is determined differently and works better more of the time. Dots were also shifted down by .5 units so that they
do() that caused it to scope incorrectly in some edge cases when a variable had the same name as a function.
ntiles() has been reimplemented and now has more formatting options.
derivedFactor() for creating factors from logical “cases”.
HELP data set has been removed from the package.
HELPrct instead.
plotDist() now accepts add=TRUE and under=TRUE, making it easy to add plots of distributions over (or under) plots of data (e.g., histograms, densityplots, etc.) or other distributions.
add=TRUE have been reimplemented using layer from latticeExtra. See documentation of these functions for details.
ladd() has been completely reimplemented using layer() from latticeExtra. See documentation of ladd() for details, including some behavior changes.
mean(), sd(), var(), et al) now use getOption("na.rm") to determine the default value of na.rm. Use options(na.rm=TRUE) to change the default behavior to remove NAs and options(na.rm=NULL) to restore defaults.
do() has been largely rewritten with an eye toward improved efficiency. In particular, do() will take advantage of multiple cores if the parallel package is available. At this point, sluggishness in applications of do() is most likely due to the sluggishness of what is being done, not to do() itself.
deltaMethod() from the car package to make it easier to propagate uncertainty in some situations that commonly arise in the physical sciences and engineering.
cdist() to compute critical values for the central portion of a distribution.
qdata(). For interactive use, this should not cause any problems, but old programmatic uses of qdata() should be checked as the object returned is now different.
sum(), mean(), sd(), etc.) to produce counter-intuitive results (but with a warning). The results are now what one would expect (and the warning is removed).
rsquared() for extracting r-squared from models and model-like objects (r.squared() has been deprecated).
do() now handles ANOVA-like objects better
maggregate() is now built on some improved behind the scenes functions. Among other features, the groups argument is now incorporated as an alternative method of specifying the groups to aggregate over and the method argument can be set to "ddply" to use ddply() from the plyr package for aggregation. This results in a different output format that may be desired in some applications.
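For readers unfamiliar with groupwise aggregation, base R's aggregate() gives a rough feel for aggregating a response over groups named in a formula. This is only an analogy using the built-in ToothGrowth data: it is not maggregate(), and it does not demonstrate the groups or method arguments described above.

```r
# Mean tooth length for each supplement type.
aggregate(len ~ supp, data = ToothGrowth, FUN = mean)

# The same idea with two grouping variables: one row per supp/dose combination.
by_supp_dose <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
by_supp_dose
```

The result is a data frame with one column per grouping variable followed by the aggregated response.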
The cdata(), pdata() and qdata() functions have been largely rewritten. In addition, cdata_f(), pdata_f() and qdata_f() are provided which produce similar results but have a formula in the first argument slot.
doc/ and so are available from within the package as well as via links to external files.
fetchGapminder() for fetching data sets originally from Gapminder.
cdata() for finding end points of a central portion of a variable.
prop() to avoid internal : which makes downstream processing messier.
manipulate() (RStudio)
plotFun() can be used without manipulate(). This makes it possible to put surface plots into RMarkdown or Rnw files or to generate them outside of RStudio.
do() * rflip() now records proportion heads as well as counts of heads and tails.
mosaicLatticeOptions() and restoreLatticeOptions() to switch back and forth between lattice defaults and mosaic defaults.
dotPlot() uses a different algorithm to determine dot sizes. (Still not perfect, but cex can be used to further scale the dots.)
histogram() so that nint matches the number of bins used more accurately.
i2: max number of drinks is at least as large as i1: the average number of drinks.
D() and antiD().
mPlot() provides an interactive environment for creating lattice and ggplot2 plots.
sp2df() for converting SpatialPolygonDataFrames to regular data frames (which is useful for plotting with ggplot2, for example). Also the Countries data frame facilitates mapping country names among different sources of map data.
do() are now marked as such so that confint() can behave differently for such data frames and for “regular” data frames.
t.test() can now do 1-sample t-test described using a formula.
mean(), var(), etc. using a formula interface) have been completely reimplemented and additional aggregating functions are provided.
ntiles() function has been added to facilitate creating factors based on quantile ranges.
RailTrail dataset.
xhistogram() is now deprecated. Use histogram() instead.
mean(), max(), median(), var(), etc.) now use getOption('na.rm') to determine default behavior.
var() allow it to work in a wider range of situations.
TukeyHSD() so that explicit use of aov() is no longer required
panel.lmbands() for plotting confidence and prediction bands in linear regression
Animals from MASS has been removed by renaming the data set GestationLongevity.
freqpolygon() for making frequency polygons.
r.squared() for extracting r-squared from models and model-like objects.
do() so that hyphens (‘-’) are turned into dots (‘.’)
fetchData().

We are still in beta, but we hope things are beginning to stabilize as we settle on syntax and coding idioms for the package. Here are some of the key updates since 0.4:
lm() and its cousins.
makeFun() now has methods for glm and nls objects
D() improved to use symbolic differentiation in more cases and allow pass through to stats::D() when that makes sense. This allows functions like deltaMethod() from the car package to work properly even when the mosaic package is loaded.
antiD() has been modified somewhat. This may go through another revision if/when we add in symbolic differentiation, but we think we are now close to the end state.
fitSpline() and fitModel() have been added as wrappers around linear models using ns(), bs(), and nls(). Each of these returns the model fit as a function.
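The idea of returning a model fit as a function can be sketched in base R with the splines package. This is an illustration under assumptions, not the implementation of fitSpline(): the built-in cars data, the choice of df = 3, and the name fit_fun are all hypothetical.

```r
library(splines)

# Fit a natural cubic spline with lm(), then wrap predict() in a function
# so the fitted model can be called like any other R function.
mod <- lm(dist ~ ns(speed, df = 3), data = cars)
fit_fun <- function(speed) {
  unname(predict(mod, newdata = data.frame(speed = speed)))
}

fit_fun(c(10, 20))   # fitted stopping distances at speeds 10 and 20
```

Because the result is an ordinary function of speed, it can be passed to tools that expect a function, such as curve() or integrate().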