Time series analysis in the tidyverse
Install the development version with the latest features:
remotes::install_github("business-science/timetk")
Or install the CRAN-approved version:
install.packages("timetk")
There are many R packages for working with time series data. Here’s how `timetk` compares to the “tidy” time series R packages for data visualization, wrangling, and feature engineering (those that leverage data frames or tibbles).
Task | timetk | tsibble | feasts | tibbletime |
---|---|---|---|---|
Structure | | | | |
Data Structure | tibble (tbl) | tsibble (tbl_ts) | tsibble (tbl_ts) | tibbletime (tbl_time) |
Visualization | | | | |
Interactive Plots (plotly) | ✅ | ❌ | ❌ | ❌ |
Static Plots (ggplot) | ✅ | ❌ | ✅ | ❌ |
Time Series | ✅ | ❌ | ✅ | ❌ |
Correlation, Seasonality | ✅ | ❌ | ✅ | ❌ |
Data Wrangling | | | | |
Time-Based Summarization | ✅ | ❌ | ❌ | ✅ |
Time-Based Filtering | ✅ | ❌ | ❌ | ✅ |
Padding Gaps | ✅ | ✅ | ❌ | ❌ |
Low to High Frequency | ✅ | ❌ | ❌ | ❌ |
Imputation | ✅ | ✅ | ❌ | ❌ |
Sliding / Rolling | ✅ | ✅ | ❌ | ✅ |
Machine Learning | | | | |
Time Series Machine Learning | ✅ | ❌ | ❌ | ❌ |
Anomaly Detection | ✅ | ❌ | ❌ | ❌ |
Clustering | ✅ | ❌ | ❌ | ❌ |
Feature Engineering (recipes) | | | | |
Date Feature Engineering | ✅ | ❌ | ❌ | ❌ |
Holiday Feature Engineering | ✅ | ❌ | ❌ | ❌ |
Fourier Series | ✅ | ❌ | ❌ | ❌ |
Smoothing & Rolling | ✅ | ❌ | ❌ | ❌ |
Padding | ✅ | ❌ | ❌ | ❌ |
Imputation | ✅ | ❌ | ❌ | ❌ |
Cross Validation (rsample) | | | | |
Time Series Cross Validation | ✅ | ❌ | ❌ | ❌ |
Time Series CV Plan Visualization | ✅ | ❌ | ❌ | ❌ |
More Awesomeness | | | | |
Making Time Series (Intelligently) | ✅ | ✅ | ❌ | ✅ |
Handling Holidays & Weekends | ✅ | ❌ | ❌ | ❌ |
Class Conversion | ✅ | ✅ | ❌ | ❌ |
Automatic Frequency & Trend | ✅ | ❌ | ❌ | ❌ |
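
As a rough illustration of the “Time-Based Summarization” and “Time-Based Filtering” rows above, here is a minimal sketch on a plain tibble. The data frame `daily` and its columns `date` and `value` are hypothetical toy data, not from the package. The sketch assumes `dplyr` and `timetk` are installed.

```r
library(dplyr)
library(timetk)

# Hypothetical toy data: 91 daily observations, 2012-01-01 through 2012-03-31
daily <- tibble(
  date  = seq(as.Date("2012-01-01"), by = "day", length.out = 91),
  value = seq_len(91)
)

# Time-based summarization: collapse the daily rows into one total per month
monthly <- daily %>%
  summarise_by_time(date, .by = "month", total = sum(value))

# Time-based filtering: the phrase "2012-02" is used for both endpoints,
# so only the rows that fall in February 2012 are kept
feb <- daily %>%
  filter_by_time(date, .start_date = "2012-02", .end_date = "2012-02")
```

Both calls work directly on a regular tibble, which is the “Data Structure” row of the table: no special time series class is needed.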
- Full Time Series Machine Learning and Feature Engineering Tutorial
- API Documentation: articles and a complete list of function references
`timetk` is an amazing package that is part of the `modeltime` ecosystem for time series analysis and forecasting. The forecasting system is extensive, and it can take a long time to learn.
You’re probably thinking, how am I ever going to learn time series forecasting? Here’s the solution that will save you years of struggling.
Become the forecasting expert for your organization
High-Performance Time Series Course
Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.
High-Performance Forecasting Systems will save companies by improving accuracy and scalability. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. You will learn:
- Modeltime: 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- GluonTS (Competition Winners)

Become the Time Series Expert for your organization.
Take the High-Performance Time Series Forecasting Course
The `timetk` package wouldn’t be possible without other amazing time series packages.
- `stats`: Any `timetk` function that uses a period (frequency) argument owes it to `ts()`.
  - `plot_acf_diagnostics()`: Leverages `stats::acf()`, `stats::pacf()` & `stats::ccf()`
  - `plot_stl_diagnostics()`: Leverages `stats::stl()`
- `lubridate`: `timetk` makes heavy use of `floor_date()`, `ceiling_date()`, and `duration()` for “time-based phrases”.
  - `%+time%` & `%-time%`: `"2012-01-01" %+time% "1 month 4 days"` uses `lubridate` to intelligently offset the day.
- `forecast`: It’s based on `ts`, and its successor is the `tidyverts` (`fable`, `tsibble`, `feasts`, and `fabletools`).
  - The `ts_impute_vec()` function for low-level vectorized imputation using STL + Linear Interpolation uses `na.interp()` under the hood.
  - The `ts_clean_vec()` function for low-level vectorized outlier cleaning and imputation using STL + Linear Interpolation uses `tsclean()` under the hood.
  - `auto_lambda()` uses `BoxCox.Lambda()`.
- `tibbletime`: Although `timetk` does not import `tibbletime`, it uses much of its innovative functionality to interpret time-based phrases:
  - `tk_make_timeseries()`: Extends `seq.Date()` and `seq.POSIXt()` using a simple phrase like “2012-02” to populate the entire time series from start to finish in February 2012.
  - `filter_by_time()`, `between_time()`: Use innovative endpoint detection from phrases like “2012”.
  - `slidify()` is basically `rollify()` using `slider` (see below).
- `slider`: Provides a `purrr`-syntax for complex rolling (sliding) calculations.
  - `slidify()` uses `slider::pslide()` under the hood.
  - `slidify_vec()` uses `slider::slide_vec()` for simple vectorized rolls (slides).
- `padr`: The `pad_by_time()` function is a wrapper for `padr::pad()`.
  - See `step_ts_pad()` to apply padding as a preprocessing recipe!
- `TSstudio`: Built on the `ts` system, which is the same system the `forecast` R package uses. A ton of inspiration for visuals came from using `TSstudio`.
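
Several of the relationships listed above can be tried out in a few lines. The following is a minimal sketch, assuming `timetk` is installed. The vector `x` is hypothetical toy data, and the window size and alignment passed to `slidify_vec()` are illustrative choices rather than recommended settings.

```r
library(timetk)

# Time-based phrase arithmetic (the expression quoted in the list above)
shifted <- "2012-01-01" %+time% "1 month 4 days"

# A short phrase expands to every day in that month
feb_2012 <- tk_make_timeseries("2012-02")

# Hypothetical toy series with two gaps
x <- c(1, 2, NA, 4, 5, 6, NA, 8, 9, 10)

# With period = 1 there is no seasonal component to model,
# so the gaps are filled by linear interpolation
x_filled <- ts_impute_vec(x, period = 1)

# Centered 3-period rolling mean; with .partial = FALSE the
# incomplete windows at both ends are returned as NA
x_roll <- slidify_vec(x_filled, .f = mean, .period = 3,
                      .align = "center", .partial = FALSE)
```

The vectorized helpers (`ts_impute_vec()`, `slidify_vec()`) take and return plain vectors, so they can also be used inside `dplyr::mutate()` on a tibble.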