install.packages("shinyr")
library(shinyr)
shinyr::shineMe()
valid_sets() lists all the available data sets that are stored as data frames.
library(shinyr)
dsets <- shinyr::valid_sets()
knitr::kable(dsets)
| | Package | LibPath | Item | Title |
---|---|---|---|---|
5 | datasets | C:/Program Files/R/R-3.6.2/library | CO2 | Carbon Dioxide Uptake in Grass Plants |
6 | datasets | C:/Program Files/R/R-3.6.2/library | ChickWeight | Weight versus age of chicks on different diets |
7 | datasets | C:/Program Files/R/R-3.6.2/library | DNase | Elisa assay of DNase |
13 | datasets | C:/Program Files/R/R-3.6.2/library | Indometh | Pharmacokinetics of Indomethacin |
17 | datasets | C:/Program Files/R/R-3.6.2/library | LifeCycleSavings | Intercountry Life-Cycle Savings Data |
18 | datasets | C:/Program Files/R/R-3.6.2/library | Loblolly | Growth of Loblolly pine trees |
20 | datasets | C:/Program Files/R/R-3.6.2/library | Orange | Growth of Orange Trees |
21 | datasets | C:/Program Files/R/R-3.6.2/library | OrchardSprays | Potency of Orchard Sprays |
23 | datasets | C:/Program Files/R/R-3.6.2/library | Puromycin | Reaction Velocity of an Enzymatic Reaction |
25 | datasets | C:/Program Files/R/R-3.6.2/library | Theoph | Pharmacokinetics of Theophylline |
27 | datasets | C:/Program Files/R/R-3.6.2/library | ToothGrowth | The Effect of Vitamin C on Tooth Growth in Guinea Pigs |
32 | datasets | C:/Program Files/R/R-3.6.2/library | USArrests | Violent Crime Rates by US State |
33 | datasets | C:/Program Files/R/R-3.6.2/library | USJudgeRatings | Lawyers' Ratings of State Judges in the US Superior Court |
41 | datasets | C:/Program Files/R/R-3.6.2/library | airquality | New York Air Quality Measurements |
42 | datasets | C:/Program Files/R/R-3.6.2/library | anscombe | Anscombe's Quartet of 'Identical' Simple Linear Regressions |
43 | datasets | C:/Program Files/R/R-3.6.2/library | attenu | The Joyner-Boore Attenuation Data |
44 | datasets | C:/Program Files/R/R-3.6.2/library | attitude | The Chatterjee-Price Attitude Data |
53 | datasets | C:/Program Files/R/R-3.6.2/library | esoph | Smoking, Alcohol and (O)esophageal Cancer |
59 | datasets | C:/Program Files/R/R-3.6.2/library | freeny | Freeny's Revenue Data |
62 | datasets | C:/Program Files/R/R-3.6.2/library | infert | Infertility after Spontaneous and Induced Abortion |
63 | datasets | C:/Program Files/R/R-3.6.2/library | iris | Edgar Anderson's Iris Data |
68 | datasets | C:/Program Files/R/R-3.6.2/library | longley | Longley's Economic Regression Data |
71 | datasets | C:/Program Files/R/R-3.6.2/library | morley | Michelson Speed of Light Data |
72 | datasets | C:/Program Files/R/R-3.6.2/library | mtcars | Motor Trend Car Road Tests |
75 | datasets | C:/Program Files/R/R-3.6.2/library | npk | Classical N, P, K Factorial Experiment |
80 | datasets | C:/Program Files/R/R-3.6.2/library | quakes | Locations of Earthquakes off Fiji |
81 | datasets | C:/Program Files/R/R-3.6.2/library | randu | Random Numbers from Congruential Generator RANDU |
83 | datasets | C:/Program Files/R/R-3.6.2/library | rock | Measurements on Petroleum Rock Samples |
84 | datasets | C:/Program Files/R/R-3.6.2/library | sleep | Student's Sleep Data |
87 | datasets | C:/Program Files/R/R-3.6.2/library | stackloss | Brownlee's Stack Loss Plant Data |
98 | datasets | C:/Program Files/R/R-3.6.2/library | swiss | Swiss Fertility and Socioeconomic Indicators (1888) Data |
100 | datasets | C:/Program Files/R/R-3.6.2/library | trees | Diameter, Height and Volume for Black Cherry Trees |
103 | datasets | C:/Program Files/R/R-3.6.2/library | warpbreaks | The Number of Breaks in Yarn during Weaving |
To load any of the data sets returned by valid_sets(), you can use base::get(). This lets you choose which data set to load dynamically in any program.
dsets$Item <- as.character(dsets$Item)
mtcars <- get(dsets$Item[dsets$Item == "mtcars"])
knitr::kable(head(mtcars))
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
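As a minimal base R sketch of the same idea, the helper below looks up a data set from a name held in a variable. The helper name load_dataset() and its checks are assumptions made for illustration; it is not part of shinyr.

```r
# Hypothetical helper (not part of shinyr): look up a data set by name at run time.
load_dataset <- function(name) {
  stopifnot(is.character(name), length(name) == 1)
  if (!exists(name)) {
    stop("No object named '", name, "' was found")
  }
  obj <- get(name)
  if (!is.data.frame(obj)) {
    stop("'", name, "' is not a data frame")
  }
  obj
}

# The name could come from user input, a config file, or dsets$Item.
chosen <- "iris"
df <- load_dataset(chosen)
dim(df)
```

Checking with exists() and is.data.frame() before returning gives a clear error when the name is misspelled or refers to something other than a data frame.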
To find out which columns of a data frame are numeric, use getnumericCols(). It returns the names of the numeric columns.
getnumericCols(mtcars)
## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
## [11] "carb"
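For comparison, the same kind of result can be obtained in base R. This is only a sketch of one possible approach, not shinyr's implementation; numeric_cols() is a hypothetical name.

```r
# Hypothetical helper (not shinyr's code): names of the numeric columns of a data frame.
numeric_cols <- function(df) {
  names(df)[vapply(df, is.numeric, logical(1))]
}

# Species is a factor in iris, so it is left out of the result.
numeric_cols(iris)
```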
To split a paragraph or sentence into individual words, use splitAndGet(). It returns a list of the individual words in the given input, which can later be passed to getFeqTable().
## [[1]]
## [1] "**shinyr**" "is" "developed" "to" "build"
## [6] "dynamic" "shiny" "based" "dashboards" "to"
## [11] "analyze" "the" "data" "of" "your"
## [16] "choice." "" "It" "provides" "simple"
## [21] "yet" "genius" "dashboard" "design" "to"
## [26] "subset" "the" "data," "perform" "exploratory"
## [31] "analysis" "and" "predictive" "analysis" "by"
## [36] "means" "of"
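A list of this shape is what base R's strsplit() produces, so a rough equivalent can be sketched as follows. Whether splitAndGet() uses strsplit() internally is an assumption; the sample text is made up for illustration.

```r
# Split a sentence on single spaces; strsplit() returns a list with one
# character vector per input string.
text <- "shiny based dashboards to analyze the data"
words <- strsplit(text, split = " ", fixed = TRUE)
words[[1]]
length(words[[1]])
```

Splitting on a single space yields an empty string wherever two spaces occur in a row, which would explain an entry such as "" in a result.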
getFeqTable() takes the output of splitAndGet() and returns the frequency of each word. This frequency table is then used by getWordCloud().
| | word | freq |
---|---|---|
analysis | analysis | 2 |
data | data | 2 |
analyze | analyze | 1 |
based | based | 1 |
build | build | 1 |
choice | choice | 1 |
dashboard | dashboard | 1 |
dashboards | dashboards | 1 |
design | design | 1 |
developed | developed | 1 |
dynamic | dynamic | 1 |
exploratory | exploratory | 1 |
genius | genius | 1 |
means | means | 1 |
perform | perform | 1 |
predictive | predictive | 1 |
provides | provides | 1 |
shiny | shiny | 1 |
shinyr | shinyr | 1 |
simple | simple | 1 |
subset | subset | 1 |
yet | yet | 1 |
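A word frequency table of this kind can be sketched in base R with table(). This is an illustrative approximation under stated assumptions: it strips punctuation and lowercases the words, but unlike the output above it does not drop common stop words.

```r
# Made-up input for illustration, including punctuation and an empty string.
words <- c("data,", "analysis", "Data", "shiny", "analysis", "")

# Normalise: remove punctuation, lowercase, and drop empty entries.
clean <- tolower(gsub("[[:punct:]]", "", words))
clean <- clean[nzchar(clean)]

# Count occurrences and sort by descending frequency, then alphabetically.
counts <- table(clean)
freq <- data.frame(word = names(counts),
                   freq = as.integer(counts),
                   stringsAsFactors = FALSE)
freq <- freq[order(-freq$freq, freq$word), ]
freq
```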
Use getWordCloud() to plot a word cloud.
getWordCloud(x)
getDataInsight() takes a data frame as input and returns basic insights for each column: class, number of missing values, maximum, minimum, mean, median, standard deviation, variance, and unique items.
Column | Class | Missing | Max | Min | Mean | Median | SD | Variance | Unique_items |
---|---|---|---|---|---|---|---|---|---|
mpg | numeric | 0 | 33.9 | 10.4 | 20.09 | 19.2 | 6.03 | 36.32 | 21,22.8,21.4,18.7,18.1,14.3,24.4,19.2,17.8,16.4,17.3,15.2,10.4,14.7,32.4,30.4,33.9,21.5,15.5,13.3,27.3,26,15.8,19.7,15 |
cyl | numeric | 0 | 8 | 4 | 6.19 | 6 | 1.79 | 3.19 | 6,4,8 |
disp | numeric | 0 | 472 | 71.1 | 230.72 | 196.3 | 123.94 | 15360.8 | 160,108,258,360,225,146.7,140.8,167.6,275.8,472,460,440,78.7,75.7,71.1,120.1,318,304,350,400,79,120.3,95.1,351,145,301,121 |
hp | numeric | 0 | 335 | 52 | 146.69 | 123 | 68.56 | 4700.87 | 110,93,175,105,245,62,95,123,180,205,215,230,66,52,65,97,150,91,113,264,335,109 |
drat | numeric | 0 | 4.93 | 2.76 | 3.6 | 3.7 | 0.53 | 0.29 | 3.9,3.85,3.08,3.15,2.76,3.21,3.69,3.92,3.07,2.93,3,3.23,4.08,4.93,4.22,3.7,3.73,4.43,3.77,3.62,3.54,4.11 |
wt | numeric | 0 | 5.424 | 1.513 | 3.22 | 3.33 | 0.98 | 0.96 | 2.62,2.875,2.32,3.215,3.44,3.46,3.57,3.19,3.15,4.07,3.73,3.78,5.25,5.424,5.345,2.2,1.615,1.835,2.465,3.52,3.435,3.84,3.845,1.935,2.14,1.513,3.17,2.77,2.78 |
qsec | numeric | 0 | 22.9 | 14.5 | 17.85 | 17.71 | 1.79 | 3.19 | 16.46,17.02,18.61,19.44,20.22,15.84,20,22.9,18.3,18.9,17.4,17.6,18,17.98,17.82,17.42,19.47,18.52,19.9,20.01,16.87,17.3,15.41,17.05,16.7,16.9,14.5,15.5,14.6,18.6 |
vs | numeric | 0 | 1 | 0 | 0.44 | 0 | 0.5 | 0.25 | 0,1 |
am | numeric | 0 | 1 | 0 | 0.41 | 0 | 0.5 | 0.25 | 1,0 |
gear | numeric | 0 | 5 | 3 | 3.69 | 4 | 0.74 | 0.54 | 4,3,5 |
carb | numeric | 0 | 8 | 1 | 2.81 | 2 | 1.62 | 2.61 | 4,1,2,3,6,8 |
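To make the contents of such a summary concrete, here is a small base R sketch that computes a few of the same statistics for numeric columns. column_insights() is a hypothetical helper written for this example and is not shinyr's implementation.

```r
# Hypothetical helper (not shinyr's code): one summary row per numeric column.
column_insights <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  data.frame(
    Column  = names(num),
    Missing = vapply(num, function(x) sum(is.na(x)), integer(1)),
    Max     = vapply(num, max, numeric(1), na.rm = TRUE),
    Min     = vapply(num, min, numeric(1), na.rm = TRUE),
    Mean    = vapply(num, mean, numeric(1), na.rm = TRUE),
    Median  = vapply(num, median, numeric(1), na.rm = TRUE),
    SD      = vapply(num, sd, numeric(1), na.rm = TRUE),
    row.names = NULL
  )
}

insights <- column_insights(mtcars[, c("mpg", "cyl", "hp")])
insights
```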
getDataInsight() also calculates the correlation matrix for the given data frame; it is stored in the cor_matrix element of the result.
knitr::kable(res$cor_matrix)
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
mpg | 1.0000000 | -0.8521620 | -0.8475514 | -0.7761684 | 0.6811719 | -0.8676594 | 0.4186840 | 0.6640389 | 0.5998324 | 0.4802848 | -0.5509251 |
cyl | -0.8521620 | 1.0000000 | 0.9020329 | 0.8324475 | -0.6999381 | 0.7824958 | -0.5912421 | -0.8108118 | -0.5226070 | -0.4926866 | 0.5269883 |
disp | -0.8475514 | 0.9020329 | 1.0000000 | 0.7909486 | -0.7102139 | 0.8879799 | -0.4336979 | -0.7104159 | -0.5912270 | -0.5555692 | 0.3949769 |
hp | -0.7761684 | 0.8324475 | 0.7909486 | 1.0000000 | -0.4487591 | 0.6587479 | -0.7082234 | -0.7230967 | -0.2432043 | -0.1257043 | 0.7498125 |
drat | 0.6811719 | -0.6999381 | -0.7102139 | -0.4487591 | 1.0000000 | -0.7124406 | 0.0912048 | 0.4402785 | 0.7127111 | 0.6996101 | -0.0907898 |
wt | -0.8676594 | 0.7824958 | 0.8879799 | 0.6587479 | -0.7124406 | 1.0000000 | -0.1747159 | -0.5549157 | -0.6924953 | -0.5832870 | 0.4276059 |
qsec | 0.4186840 | -0.5912421 | -0.4336979 | -0.7082234 | 0.0912048 | -0.1747159 | 1.0000000 | 0.7445354 | -0.2298609 | -0.2126822 | -0.6562492 |
vs | 0.6640389 | -0.8108118 | -0.7104159 | -0.7230967 | 0.4402785 | -0.5549157 | 0.7445354 | 1.0000000 | 0.1683451 | 0.2060233 | -0.5696071 |
am | 0.5998324 | -0.5226070 | -0.5912270 | -0.2432043 | 0.7127111 | -0.6924953 | -0.2298609 | 0.1683451 | 1.0000000 | 0.7940588 | 0.0575344 |
gear | 0.4802848 | -0.4926866 | -0.5555692 | -0.1257043 | 0.6996101 | -0.5832870 | -0.2126822 | 0.2060233 | 0.7940588 | 1.0000000 | 0.2740728 |
carb | -0.5509251 | 0.5269883 | 0.3949769 | 0.7498125 | -0.0907898 | 0.4276059 | -0.6562492 | -0.5696071 | 0.0575344 | 0.2740728 | 1.0000000 |
You can use corrplot::corrplot() on the correlation table to get the correlation plot.
corrplot::corrplot(as.matrix(res$cor_matrix),method = "number")
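A correlation matrix like the one above can also be computed directly with stats::cor(). The sketch below is a base R illustration, not shinyr's code, and additionally lists the strongly correlated pairs; the 0.85 threshold is an arbitrary choice for the example.

```r
# Keep numeric columns only, then compute Pearson correlations (the default).
num <- mtcars[vapply(mtcars, is.numeric, logical(1))]
cor_matrix <- cor(num)
round(cor_matrix["mpg", "wt"], 2)

# List each pair once (upper triangle) where |r| exceeds the threshold.
strong <- which(abs(cor_matrix) > 0.85 & upper.tri(cor_matrix), arr.ind = TRUE)
data.frame(var1 = rownames(cor_matrix)[strong[, "row"]],
           var2 = colnames(cor_matrix)[strong[, "col"]])
```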
excludeThese() removes the specified items from a set of values.
excludeThese(mtcars$mpg, c(21.0))
## [1] 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7
## [16] 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
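The same filtering can be written in base R with %in%. This is a sketch for comparison only, using made-up values.

```r
values <- c(21.0, 21.0, 22.8, 21.4, 18.7)

# Drop every element that matches one of the excluded items.
kept <- values[!values %in% c(21.0)]
kept
```

Unlike setdiff(), this keeps duplicates among the remaining values and preserves their order.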
You can find the most repeated value in a given set of values with getMostRepeatedValue().
getMostRepeatedValue(c(1,1,1,2,2,3,4,5))
## [1] 1
## Levels: 1 2 3 4 5
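One way to compute the most frequent value in base R is shown below. It is only a sketch and may differ from shinyr's implementation, for example in how ties are handled or in the type of the result.

```r
values <- c(1, 1, 1, 2, 2, 3, 4, 5)

# table() counts each distinct value; which.max() picks the first maximum.
counts <- table(values)
most_repeated <- names(counts)[which.max(counts)]

# The result is a character string; convert with as.numeric() if needed.
most_repeated
```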
missing_count() calculates the total number of NA, NULL, "", "NULL", and "NA" entries in a given set of values. Let's introduce some missing values into mtcars:
x <- head(mtcars)
x$mpg[1:2] <- NA
missing_count(x$mpg)
## [1] 2
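The counting rule described above can be approximated in base R as follows. count_missing() is a hypothetical helper for illustration, and treating the strings "", "NULL", and "NA" as missing is taken from the description rather than from shinyr's source.

```r
# Hypothetical helper (not shinyr's code): count NA values and placeholder strings.
count_missing <- function(x) {
  sum(is.na(x) | as.character(x) %in% c("", "NULL", "NA"))
}

count_missing(c("a", NA, "", "NULL", "NA", "b"))
count_missing(c(21, NA, 22.8))
```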
You can replace the missing values in any column of a data frame with the mean, median, max, min, sum, or mode of that column by using imputeMyData(). For example, you can impute the missing values in the mpg column with the mean of all the values in the column, as shown below.
imputeMyData(df = x, col = "mpg", FUN = "mean")
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 20.25 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 20.25 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.80 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.40 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.70 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.10 6 225 105 2.76 3.460 20.22 1 0 3 1
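For reference, mean imputation itself takes only a few lines of base R. The sketch below uses a hypothetical helper, impute_mean(), and is not shinyr's implementation.

```r
# Hypothetical helper (not shinyr's code): fill NAs in one column with its mean.
impute_mean <- function(df, col) {
  fill <- mean(df[[col]], na.rm = TRUE)
  df[[col]][is.na(df[[col]])] <- fill
  df
}

x <- head(mtcars)
x$mpg[1:2] <- NA
imputed <- impute_mean(x, "mpg")
imputed$mpg
```

The mean is taken over the non-missing values only (na.rm = TRUE), so the two NA entries are replaced by the average of the remaining four.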
You can summarize the values of one column by grouping on the values of another column using groupByandSumarize(). For example, you can calculate the mean of hp by am.
knitr::kable(groupByandSumarize(mtcars, grp_col = c("am"), summarise_col = "hp", FUN = "mean"))
am | mean_of_hp_by_am |
---|---|
1 | 126.8462 |
0 | 160.2632 |
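The same grouped mean can be computed with stats::aggregate(), shown here as a base R sketch for comparison. Renaming the column to match the output above is done only for illustration.

```r
# Mean of hp within each level of am, using the formula interface.
by_am <- aggregate(hp ~ am, data = mtcars, FUN = mean)
names(by_am)[2] <- "mean_of_hp_by_am"
by_am
```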
You can split a data set into a training set and a test set by using dataPartition(). The second argument specifies the size of the training set as a percentage. For example, you can split mtcars into 85 percent for training and 15 percent for testing as shown below.
partition <- dataPartition(mtcars, 85)
partition is a list of length 2 containing the Train and Test sets.
knitr::kable(head(partition$Train))
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Cadillac Fleetwood | 10.4 | 8 | 472.0 | 205 | 2.93 | 5.25 | 17.98 | 0 | 0 | 3 | 4 |
Merc 450SE | 16.4 | 8 | 275.8 | 180 | 3.07 | 4.07 | 17.40 | 0 | 0 | 3 | 3 |
Volvo 142E | 21.4 | 4 | 121.0 | 109 | 4.11 | 2.78 | 18.60 | 1 | 1 | 4 | 2 |
Datsun 710 | 22.8 | 4 | 108.0 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
Hornet Sportabout | 18.7 | 8 | 360.0 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 |
Merc 450SL | 17.3 | 8 | 275.8 | 180 | 3.07 | 3.73 | 17.60 | 0 | 0 | 3 | 3 |
knitr::kable(head(partition$Test))
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Merc 240D | 24.4 | 4 | 146.7 | 62 | 3.69 | 3.190 | 20.00 | 1 | 0 | 4 | 2 |
Merc 230 | 22.8 | 4 | 140.8 | 95 | 3.92 | 3.150 | 22.90 | 1 | 0 | 4 | 2 |
Honda Civic | 30.4 | 4 | 75.7 | 52 | 4.93 | 1.615 | 18.52 | 1 | 1 | 4 | 2 |
Pontiac Firebird | 19.2 | 8 | 400.0 | 175 | 3.08 | 3.845 | 17.05 | 0 | 0 | 3 | 2 |
Ferrari Dino | 19.7 | 6 | 145.0 | 175 | 3.62 | 2.770 | 15.50 | 0 | 1 | 5 | 6 |
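A random train/test split of this kind can be sketched in base R with sample(). The helper split_data(), its seed argument, and the use of floor() for the training size are assumptions for this example; shinyr may sample or round differently.

```r
# Hypothetical helper (not shinyr's code): random split by percentage.
split_data <- function(df, train_pct, seed = 42) {
  set.seed(seed)  # fixed seed so the split is reproducible
  n_train <- floor(nrow(df) * train_pct / 100)
  idx <- sample(seq_len(nrow(df)), size = n_train)
  list(Train = df[idx, , drop = FALSE],
       Test  = df[-idx, , drop = FALSE])
}

parts <- split_data(mtcars, 85)
c(train = nrow(parts$Train), test = nrow(parts$Test))
```

With 32 rows and an 85 percent training share, this puts 27 rows in the training set and the remaining 5 in the test set, and no row appears in both.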
mod <- lm(formula = wt ~ ., data = mtcars)
mod
##
## Call:
## lm(formula = wt ~ ., data = mtcars)
##
## Coefficients:
## (Intercept) mpg cyl disp hp drat
## -0.230634 -0.041666 -0.057254 0.006685 -0.003230 -0.090083
## qsec vs am gear carb
## 0.199541 -0.066368 0.018445 -0.093508 0.248688
predictions <- predict(mod, mtcars[,-6])
Get the metrics of the regression model by using regressionModelMetrics().
actuals <- mtcars[,6]
x <- regressionModelMetrics(actuals = actuals, predictions = predictions, model = mod)
y <- as.data.frame(x)
row.names(y) <- NULL
knitr::kable(y)
AIC | BIC | MAE | MSE | RMSE | MAPE | Corelation | r.squared | adj.r.squared |
---|---|---|---|---|---|---|---|---|
20.01 | 37.6 | 0.18 | 0.05 | 0.23 | 0.06 | 0.97 | 0.94 | 0.92 |
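Several of these metrics are simple functions of the prediction errors, so they can be checked by hand. The sketch below uses the usual textbook definitions (MAE as the mean absolute error, MSE as the mean squared error, RMSE as its square root, MAPE as the mean absolute error relative to the actual value); that regressionModelMetrics() uses exactly these definitions is an assumption.

```r
mod <- lm(wt ~ ., data = mtcars)
actuals <- mtcars$wt
predictions <- predict(mod, mtcars)
errors <- actuals - predictions

metrics <- c(
  MAE       = mean(abs(errors)),            # mean absolute error
  MSE       = mean(errors^2),               # mean squared error
  RMSE      = sqrt(mean(errors^2)),         # root mean squared error
  MAPE      = mean(abs(errors / actuals)),  # mean absolute percentage error (as a fraction)
  r.squared = summary(mod)$r.squared
)
round(metrics, 2)
```

Because these values are computed on the same data used to fit the model, they describe the fit rather than out-of-sample accuracy.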