The features of strex that were deemed the most interesting have been
given their own vignettes. However, the package was intended as a
miscellany of useful functions, so the functions demonstrated here
encapsulate the spirit of this package, i.e. functions that save R
string manipulators time.
library(strex)
Sometimes you don’t want to know whether something is numeric, just
whether or not it could be. Now you can find out with str_can_be_numeric().
str_can_be_numeric(c("1a", "abc", "5", "2e7", "seven"))
#> [1] FALSE FALSE TRUE TRUE FALSE
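To make the idea of "could be numeric" concrete, here is a minimal base R sketch. The helper name can_be_numeric_sketch() is hypothetical and this is not necessarily how strex implements str_can_be_numeric(); it simply assumes that a string can be numeric when as.numeric() converts it without producing NA.

```r
# Hypothetical helper for illustration only; not strex's implementation.
# Assumption: a string "can be numeric" if as.numeric() does not yield NA.
can_be_numeric_sketch <- function(x) {
  # suppressWarnings() hides the "NAs introduced by coercion" warning
  !is.na(suppressWarnings(as.numeric(x)))
}

can_be_numeric_sketch(c("1a", "abc", "5", "2e7", "seven"))
#> [1] FALSE FALSE  TRUE  TRUE FALSE
```

Edge cases such as surrounding whitespace or strings like "Inf" may be treated differently by the real function.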
To get currencies and amounts mentioned in strings, there are
str_extract_currencies(), str_nth_currency(), str_first_currency() and
str_last_currency(). str_first_currency() just returns the first
currency amount. str_last_currency() returns the last.
str_nth_currency() allows you to get the second, third and so on.
str_extract_currencies() returns all currency amounts mentioned in a
string.
string <- c("Alan paid £5", "Joe paid $7")
str_first_currency(string)
#> string_num string curr_sym amount
#> 1 1 Alan paid £5 £ 5
#> 2 2 Joe paid $7 $ 7
string <- c("€1 is $1.17", "£1 is $1.29")
str_nth_currency(string, n = c(1, 2))
#> string_num string curr_sym amount
#> 1 1 €1 is $1.17 € 1.00
#> 2 2 £1 is $1.29 $ 1.29
str_last_currency(string) # only gets the last mentioned
#> string_num string curr_sym amount
#> 1 1 €1 is $1.17 $ 1.17
#> 2 2 £1 is $1.29 $ 1.29
str_extract_currencies(string)
#> string_num string curr_sym amount
#> 1 1 €1 is $1.17 € 1.00
#> 2 1 €1 is $1.17 $ 1.17
#> 3 2 £1 is $1.29 £ 1.00
#> 4 2 £1 is $1.29 $ 1.29
str_elem() extracts a single element (character) of a string. It is a simple wrapper around stringr::str_sub().
string <- "abcdefg"
str_sub(string, 3, 3)
#> [1] "c"
str_elem(string, 3) # simpler and more expressive
#> [1] "c"
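The wrapper idea can be sketched in base R: taking one element amounts to asking for a substring whose start and end positions coincide. The name elem_sketch() is hypothetical, and the sketch uses base substr() rather than stringr::str_sub(), so unlike str_sub() it does not support negative indices.

```r
# Hypothetical illustration of the wrapper pattern; not strex's source code.
# A single element is a substring that starts and ends at the same index.
elem_sketch <- function(string, index) {
  substr(string, index, index)
}

elem_sketch("abcdefg", 3)
#> [1] "c"
```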
string <- c("aa1bbb2ccc3", "xyz7ayc8jzk99elephant")
str_extract_numbers(string)
#> [[1]]
#> [1] 1 2 3
#>
#> [[2]]
#> [1] 7 8 99
str_extract_non_numerics(string)
#> [[1]]
#> [1] "aa" "bbb" "ccc"
#>
#> [[2]]
#> [1] "xyz" "ayc" "jzk" "elephant"
string <- c("aa1bbb2ccc3", "xyz7ayc8jzk99elephant")
str_split_by_numbers(string)
#> [[1]]
#> [1] "aa" "1" "bbb" "2" "ccc" "3"
#>
#> [[2]]
#> [1] "xyz" "7" "ayc" "8" "jzk" "99" "elephant"
We can give files a given extension, leaving them alone if they already have it.
string <- c("spreadsheet1.csv", "spreadsheet2")
str_give_ext(string, "csv")
#> [1] "spreadsheet1.csv" "spreadsheet2.csv"
If the file already has an extension, we can append one or replace it.
str_give_ext(string, "xls") # append
#> [1] "spreadsheet1.csv.xls" "spreadsheet2.xls"
str_give_ext(string, "csv", replace = TRUE) # replace
#> [1] "spreadsheet1.csv" "spreadsheet2.csv"
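The default "leave alone if already present, otherwise append" behaviour can be sketched in base R as follows. The helper give_ext_sketch() is hypothetical and only mirrors the append behaviour shown above; it does not attempt the replace = TRUE option and is not strex's implementation.

```r
# Hypothetical helper for illustration; covers only the default (append) case.
give_ext_sketch <- function(x, ext) {
  dot_ext <- paste0(".", ext)
  # Leave names that already end in the extension untouched
  has_ext <- endsWith(x, dot_ext)
  ifelse(has_ext, x, paste0(x, dot_ext))
}

files <- c("spreadsheet1.csv", "spreadsheet2")
give_ext_sketch(files, "csv")
#> [1] "spreadsheet1.csv" "spreadsheet2.csv"
give_ext_sketch(files, "xls")
#> [1] "spreadsheet1.csv.xls" "spreadsheet2.xls"
```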
string <- c("spreadsheet1.csv", "spreadsheet2")
str_before_last_dot(string)
#> [1] "spreadsheet1" "spreadsheet2"
string <- "I hate having these \"quotes\" in the middle of my strings."
cat(string)
#> I hate having these "quotes" in the middle of my strings.
str_remove_quoted(string)
#> [1] "I hate having these in the middle of my strings."
I’m not mad on CamelCase; I often want to deconstruct it.
string <- c("CamelVar1", "CamelVar2")
str_split_camel_case(string)
#> [[1]]
#> [1] "Camel" "Var1"
#>
#> [[2]]
#> [1] "Camel" "Var2"
Converting a string to a vector of its characters is something I did a lot to avoid using regular expressions. Don’t do it for that purpose. Learn regex. https://regexone.com/ is a very good start.
string <- "R is good."
str_to_vec(string)
#> [1] "R" " " "i" "s" " " "g" "o" "o" "d" "."
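For comparison, base R can produce the same character vector with strsplit() and an empty split pattern. This is only an equivalent shown for reference, not a claim about how str_to_vec() works internally.

```r
# Base R equivalent for a single string: split on the empty pattern.
chars <- strsplit("R is good.", "")[[1]]
chars
#> [1] "R" " " "i" "s" " " "g" "o" "o" "d" "."
```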
What if something is needlessly surrounded by parentheses and we want to get rid of them?
string <- "(((Why all the parentheses?)))"
string %>%
  str_trim_anything(coll("("), side = "left") %>%
  str_trim_anything(coll(")"), side = "r")
#> [1] "Why all the parentheses?"
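In keeping with the advice above to learn regex, the same trimming can be sketched with a base R regular expression. This is an alternative technique, not what str_trim_anything() does internally: the pattern anchors one or more opening parentheses to the start of the string and one or more closing parentheses to the end.

```r
# Regex alternative: strip runs of "(" at the start and ")" at the end.
wrapped <- "(((Why all the parentheses?)))"
trimmed <- gsub("^\\(+|\\)+$", "", wrapped)
trimmed
#> [1] "Why all the parentheses?"
```

Parentheses in the middle of the string are left alone because both alternatives are anchored.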
string <- c("I often write the word *my* twice in a row in my my sentences.")
str_singleize(string, " my")
#> [1] "I often write the word *my* twice in a row in my sentences."