The package allows you to query time-series data and statistics from Axibase Time-Series Database (ATSD) and to save time-series data in ATSD. List of package functions:
Execute library(atsd) to start working with the atsd package. The connection parameters are loaded from the package configuration file, atsd/connection.config, which is located in the atsd package folder. The command
shows you where the atsd package folder is. Open a text editor and modify the configuration file. It should look as follows:
# the url of ATSD including port number
url=http://host_name:port_number
# the user name
user=atsd_user_name
# the user's password
password=atsd_user_password
# validate ATSD SSL certificate: yes, no
verify=no
# cryptographic protocol used by ATSD https server:
# default, ssl2, ssl3, tls1
encryption=ssl3
Reload the modified connection parameters from the configuration file by calling set_connection() without arguments.
Check that the parameters are correct by calling show_connection().
Refer to Chapter 9 for more options on managing ATSD connection parameters.
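To make the file format above concrete, here is a minimal base R sketch that parses a key=value file of this shape. The helper read_connection_config() is hypothetical and is not part of the atsd package, which reads its configuration internally. The sample values are the placeholders from the listing above.

```r
# Hypothetical helper (not part of the atsd package): parse a key=value
# configuration file, skipping blank lines and lines starting with '#'.
read_connection_config <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  eq <- as.integer(regexpr("=", lines, fixed = TRUE))  # first '=' in each line
  keep <- eq > 0
  keys <- trimws(substr(lines[keep], 1, eq[keep] - 1))
  values <- trimws(substring(lines[keep], eq[keep] + 1))
  stats::setNames(as.list(values), keys)
}

# Write the sample configuration to a temporary file and read it back.
config_path <- tempfile(fileext = ".config")
writeLines(c(
  "# the url of ATSD including port number",
  "url=http://host_name:port_number",
  "# the user name",
  "user=atsd_user_name",
  "# the user's password",
  "password=atsd_user_password",
  "# validate ATSD SSL certificate: yes, no",
  "verify=no",
  "# cryptographic protocol used by ATSD https server:",
  "# default, ssl2, ssl3, tls1",
  "encryption=ssl3"
), config_path)
config <- read_connection_config(config_path)
str(config)

# Base R can report where an installed package lives; the result is an
# empty string when the package is not installed.
atsd_dir <- system.file(package = "atsd")
```

Splitting on the first "=" only keeps values that themselves contain "=" intact.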
Function name: query()
Description: The function retrieves historical time-series data or forecasts from ATSD.
Returns object: data frame
Arguments:
metric (required, string)
The name of the metric you want to get data for, for example, “disk_used_percent”.
To obtain a list of metrics collected by ATSD use the get_metrics() function.
selection_interval (required, string)
The time interval for which the data will be selected. Specify it as "n-unit", where unit is Second, Minute, Hour, Day, Week, Month, Quarter, or Year, and n is the number of units, for example, "3-Week" or "12-Hour".
entity (optional, string)
The name of the entity you want to get data for. If not provided, then data for all entities will be fetched for the specified metric. Obtain the list of entities with the get_entities() function.
entity_group (optional, string)
The name of the entity group, for example, "HP Servers". Data is extracted for all entities belonging to this group.
tags (optional, string vector)
List of user-defined series tags to filter the fetched time-series data, for example, c("disk_name=sda1", "mount_point=/").
end_time (optional, string)
The end time of the selection interval, for example, end_time = "date('2014-12-27')". If not provided, the current time will be used. Specify the date and time, or use one of the supported expressions (see the end time syntax). For example, 'current_day' sets the end of the selection interval to 00:00:00 of the current day.
aggregate_interval (optional, string)
The length of the aggregation interval. The period of produced time-series will be equal to the aggregate_interval. The value for each period is computed by the aggregate_statistics function applied to all samples of the original time-series within the period. The format of the aggregate_interval is the same as for the selection_interval argument, for example, “1-Minute”.
aggregate_statistics (optional, string vector)
The statistic functions used for aggregation. Multiple values are supported, for example, c("Min", "Avg", "StDev"). The default value is "Avg".
interpolation (optional, string)
If aggregation is enabled, then the values for the periods without data will be computed by one of the following interpolation functions: “None”, “Linear”, “Step”. The default value is “None”.
export_type (optional, string)
Supported options: “History” or “Forecast”. The default value is “History”.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed. By default, verbose = TRUE.
Examples:
# get historic data for the given entity, metric, and selection_interval
dfr <- query(entity = "nurswgvml007", metric = "cpu_busy", selection_interval = "1-Hour")
# end_time usage example
query(entity = "host-383", metric = "cpu_usage", selection_interval = "1-Day",
end_time = "date('2015-02-10 10:15:03')")
# get forecasts
query(metric = "cpu_busy", selection_interval = "30-Minute",
export_type = "Forecast", verbose = FALSE)
# use aggregation
query(metric = "disk_used_percent", entity_group = "Linux",
tags = c("mount_point=/boot", "file_system=/dev/sda1"),
selection_interval = "1-Week", aggregate_interval = "1-Minute",
aggregate_statistics = c("Avg", "Min", "Max"),
interpolation = "Linear", export_type = "Forecast")
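The "n-unit" format used by selection_interval and aggregate_interval can be checked locally before a query is sent. The sketch below is illustrative only: parse_interval() is a hypothetical helper, not a function of the atsd package, and it assumes the unit names are spelled exactly as listed in the argument description.

```r
# Hypothetical helper (not part of the atsd package): split an "n-unit"
# interval string into its count and unit, and reject malformed input.
parse_interval <- function(interval) {
  units <- c("Second", "Minute", "Hour", "Day", "Week", "Month", "Quarter", "Year")
  parts <- strsplit(interval, "-", fixed = TRUE)[[1]]
  if (length(parts) != 2 ||
      !grepl("^[0-9]+$", parts[1]) ||
      !(parts[2] %in% units)) {
    stop("expected 'n-unit', for example '3-Week'; got: ", interval)
  }
  list(count = as.integer(parts[1]), unit = parts[2])
}

str(parse_interval("3-Week"))
str(parse_interval("12-Hour"))
```

Whether ATSD itself accepts other spellings or letter cases is not stated here, so the check is deliberately strict.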
Description: The function builds a zoo object from the given data frame. The timestamp argument provides a column of the data frame which is used as the index for the zoo object. The value argument indicates the series which will be saved in a zoo object. If several columns are listed in the value argument, they will all be saved in a multivariate zoo object. Information from other columns is ignored. To use this function the ‘zoo’ package should be installed.
Returns object: zoo object
Arguments:
dfr (required, data frame)
The data frame.
timestamp (optional, character or numeric)
Name or number of the column with timestamps. By default, timestamp = "Timestamp".
value (optional, character vector or numeric vector)
Names or numbers of columns with series values. By default, value = "Value".
Examples:
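The conversion described above can be approximated directly with zoo::zoo(). The sketch below does not call the package function. It only shows what using one column as the index and one or more columns as values amounts to. The column names "Timestamp" and "Value" follow the defaults named above; the "Avg" and "Entity" columns and all data values are made up.

```r
# Made-up data frame using the default column names "Timestamp" and "Value".
dfr <- data.frame(
  Timestamp = as.POSIXct("2015-02-10 10:00:00", tz = "GMT") + 60 * (0:4),
  Value = c(11.2, 12.5, 9.8, 10.1, 13.4),
  Avg = c(11.0, 11.9, 11.2, 10.9, 11.4),
  Entity = "nurswgvml007"
)

# The zoo package is optional here; the conversion is skipped without it.
if (requireNamespace("zoo", quietly = TRUE)) {
  # Univariate series: one value column indexed by the timestamp column.
  z1 <- zoo::zoo(dfr[["Value"]], order.by = dfr[["Timestamp"]])
  # Multivariate series: several value columns; the Entity column is dropped.
  z2 <- zoo::zoo(as.matrix(dfr[, c("Value", "Avg")]), order.by = dfr[["Timestamp"]])
  print(z2)
}
```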
Function name: get_metrics()
Description: This function fetches a list of metrics and their tags from ATSD, and converts it to a data frame.
Returns object: data frame
Each row of the data frame corresponds to a metric and its tags:
name
Metric name (unique)
counter
Counters are metrics with continuously incrementing value
lastInsertTime
Last time a value was received by ATSD for this metric
tags
User-defined tags (as requested by the “tags” argument)
Arguments:
expression (optional, string)
Select metrics matching a particular name pattern and/or user-defined metric tags. For examples, refer to the "Expression syntax" chapter.
active (optional, one of strings: “true” or “false”)
Filter metrics by lastInsertTime attribute. If active = “true”, only metrics with positive lastInsertTime are included in the response.
tags (optional, string vector)
User-defined metric tags to be included in the response. By default, all the tags will be included.
limit (optional, integer)
If limit > 0, the response shows the top-N metrics ordered by name.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Examples:
# get all metrics and include all their tags in the data frame
metrics <- get_metrics()
# get the first 100 active metrics which have the tag "table",
# include this tag in the response and exclude other user-defined metric tags
metrics <- get_metrics(expression = "tags.table != ''", active = "true",
tags = "table", limit = 100)
Function name: get_entities()
Description: This function fetches a list of entities and their tags from ATSD, and converts it to a data frame.
Returns object: data frame
Each row of the data frame corresponds to an entity and its tags:
name
Entity name (unique)
enabled
Enabled status, incoming data is discarded for disabled entities
lastInsertTime
Last time a value was received by ATSD for this entity
tags
User-defined tags (as requested by the “tags” argument)
Arguments:
expression (optional, string)
Select entities matching a particular name pattern and/or user-defined entity tags. For examples, refer to the "Expression syntax" chapter.
active (optional, one of strings: “true” or “false”)
Filter entities by lastInsertTime attribute. If active = “true”, only entities with positive lastInsertTime are included in the response.
tags (optional, string vector)
User-defined entity tags to be included in the response. By default, all the tags will be included.
limit (optional, integer)
If limit > 0, the response shows the top-N entities ordered by name.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Examples:
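A call to get_entities() can be sketched by analogy with the get_metrics() examples, using only the arguments documented above. The name pattern 'nurs*' is a made-up filter, and the call assumes an installed atsd package and a reachable ATSD server, so it is skipped when the package is not available.

```r
# Made-up name filter; see the "Expression syntax" chapter for the syntax.
entity_filter <- "name like 'nurs*'"

# Requires the atsd package and a configured connection to an ATSD server.
if (requireNamespace("atsd", quietly = TRUE)) {
  library(atsd)
  # get the first 10 active entities whose names match the filter
  entities <- get_entities(expression = entity_filter, active = "true", limit = 10)
  head(entities)
}
```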
Function name: save_series()
Description: Save time-series from the data frame into ATSD. The data frame should have a column with timestamps and at least one numeric column with values of a metric.
Returns object: NULL
Arguments:
dfr (required, data frame)
The data frame should have a column with timestamps and at least one numeric column with values of a metric.
time_col (optional, numeric or character)
Number or name of the column with the timestamps. The default value is 1. For example, time_col = 1, or time_col = "Timestamp". Read the "Timestamp format" section below for supported timestamp classes and formats.
time_format (optional, string)
Format of the timestamps. Use this argument when the timestamp format is not clear from the class of the timestamps. The value can be one of the following: "ms" (for epoch milliseconds), "sec" (for epoch seconds), or a format string, for example "%Y-%m-%d %H:%M:%S". The format string is used to convert the provided timestamps to epoch milliseconds before they are stored in ATSD. Read the "Timestamp format" section for details.
tz (optional, string)
By default, tz = "GMT". Specify the time zone when timestamps are strings formatted as described in the time_format argument. For example, tz = "Australia/Darwin". View the "TZ" column of the time zones table for a list of possible values.
metric_col (required, numeric or character vector)
Specifies numbers or names of the columns where metric values are stored. For example, metric_col = c(2, 3, 4), or metric_col = c("Value", "Avg"). If the metric_name argument is not given, the names of the columns, in lower case, are used as metric names when saving them in ATSD.
metric_name (optional, character vector)
Specifies metric names. The series indicated by the metric_col argument are saved in ATSD under the metric names provided in metric_name, so the number and order of names in metric_name should match the columns in metric_col. If the metric_name argument is not provided, the names of the columns, in lower case, are used as metric names when saving them in ATSD.
entity_col (optional, numeric or character)
Should be provided if the entity argument is not given. Number or name of a column with entities. Several entities in the column are allowed. For example, entity_col = 4, or entity_col = "server001".
entity (optional, character)
Should be provided if the entity_col argument is not given. Name of the entity.
tags_col (optional, numeric or character vector)
Lists numbers or names of the columns containing tag values. So the name of a column is a tag name, and values in the column are the tag values.
tags (optional, character vector)
Lists tags and their values in “tag=value” format. Each indicated tag will be saved with each series.
verbose (optional, logical)
If verbose = FALSE, then all console output will be suppressed.
Timestamp format.
The allowed timestamp types are:
Numeric, in epoch milliseconds or epoch seconds. In this case, time_format = "ms" or time_format = "sec" should be used, and the time zone argument tz is ignored.
Object of one of the types Date, POSIXct, POSIXlt, chron from the chron package, or timeDate from the timeDate package. In this case, the arguments time_format and tz are ignored.
String, for example, "2015-01-03 10:07:15". In this case, the time_format argument should specify which format string is used for the timestamps. For example, time_format = "%Y-%m-%d %H:%M:%S". Type ?strptime to see the list of format symbols. This format string is used to convert the provided timestamps to epoch milliseconds before they are stored in ATSD. The time zone given in the tz argument and the standard origin "1970-01-01 00:00:00" are used for the conversion. The conversion is done with the command: as.POSIXct(time_stamp, format = time_format, origin="1970-01-01", tz = tz).
Note that timestamps are stored in epoch milliseconds. If you put some data into ATSD and then retrieve it back, the timestamps will refer to the same time but in the GMT time zone. For example, if you save the timestamp "2015-02-15 10:00:00" with tz = "Australia/Darwin" in ATSD and then retrieve it back, you will get the timestamp "2015-02-15 00:30:00", because the Australia/Darwin time zone has a +09:30 shift relative to GMT.
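The round trip in the note above can be reproduced locally with base R. This sketch only mirrors the conversion described in the text (string to epoch milliseconds, then back to a GMT string). It does not contact ATSD, and the variable names are illustrative.

```r
time_stamp <- "2015-02-15 10:00:00"
time_format <- "%Y-%m-%d %H:%M:%S"

# Parse the string in the Australia/Darwin time zone (GMT+09:30).
local_time <- as.POSIXct(time_stamp, format = time_format, tz = "Australia/Darwin")

# Epoch milliseconds, the representation the text says is stored in ATSD.
epoch_ms <- as.numeric(local_time) * 1000

# Read the same instant back as a GMT timestamp.
gmt_string <- format(
  as.POSIXct(epoch_ms / 1000, origin = "1970-01-01", tz = "GMT"),
  "%Y-%m-%d %H:%M:%S"
)
print(gmt_string)  # "2015-02-15 00:30:00"
```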
Entity specification
You can provide the entity name in either the entity or the entity_col argument. In the first case, all series will have the same entity. In the second case, the entities specified in the entity_col column will be saved along with the corresponding series.
Tags specification
The tags_col argument indicates which columns of the data frame keep the time-series tags. The name of each column specified by the tags_col argument is a tag name, and the values in the column are tag values.
Before the series are stored in ATSD, the data frame is split into several data frames, each with a unique entity and a unique list of tag values. The entity and tags are stored in ATSD along with the time-series from the data frame. NAs and missing values in time-series are ignored.
In the tags argument you can specify tags which are the same for all rows (records) of the data frame. Each series value saved in ATSD will then have the tags provided in the tags argument.
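The grouping described above can be pictured with base R's split(). This is a local illustration under assumed column names (entity, time, disk_used_percent, mount_point) and made-up values. It is not the package's internal code.

```r
# Made-up data: one entity column, one timestamp column, one metric column,
# and one tag column (the column name is the tag name).
dfr <- data.frame(
  entity = c("server001", "server001", "server002", "server002"),
  time = c("2015/02/15 10:00:00", "2015/02/15 10:01:00",
           "2015/02/15 10:00:00", "2015/02/15 10:01:00"),
  disk_used_percent = c(12.5, NA, 40.1, 38.7),
  mount_point = c("/", "/", "/boot", "/boot"),
  stringsAsFactors = FALSE
)

# One group per unique combination of entity and tag values;
# drop = TRUE discards combinations that do not occur in the data.
groups <- split(dfr, list(dfr$entity, dfr$mount_point), drop = TRUE)

# Rows with an NA metric value are dropped, as the text says they are ignored.
groups <- lapply(groups, function(g) g[!is.na(g$disk_used_percent), , drop = FALSE])

print(names(groups))
```

Each element of groups corresponds to one series: a single entity with a single set of tag values.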
Examples:
# Save time-series from columns 3, 4, 5 of data frame dfr.
# Timestamps are saved as strings in 2nd column
# and their format string and time zone are provided.
# Entities and tags are in columns 1, 6, 7.
# All saved series will have tag "os_type" with value "linux".
save_series(dfr, time_col = 2, time_format = "%Y/%m/%d %H:%M:%S", tz = "Australia/Darwin",
metric_col = c(3, 4, 5), entity_col = 1, tags_col = c(6, 7),
tags = "os_type = linux")
In this section, we explain the syntax of the expression argument of the functions get_metrics() and get_entities(). The expression filters the result: only the metrics or entities for which the expression evaluates to TRUE are returned.
The variable name is used to select metrics/entities by name:
# get metric with name 'cpu_busy'
metrics <- get_metrics(expression = "name = 'cpu_busy'", verbose = FALSE)
Metrics and entities have user-defined tags. Each of these tags is a pair ("tag_name" : "tag_value"). The variable tags.tag_name in an expression refers to the tag_value for the given metric/entity. If a metric/entity does not have this tag, the tag_value will be an empty string.
# get metrics that have the 'source' tag, and include all tags of fetched metrics in output
get_metrics(expression = "tags.source != ''", tags = "*")
To get metrics with a user-defined tag ‘table’ equal to ‘System’:
# get metrics whose tag 'table' is equal to 'System'
metrics <- get_metrics(expression = "tags.table = 'System'", tags = "*")
To build more complex expressions, use brackets ( and ), and the logical operators and, or, not, as well as &&, ||, !.
To test if a string is in a collection, use the in operator.
Use the like operator to match values with expressions containing wildcards: expression = "name like 'disk*'". The wildcard * means zero or more characters. The wildcard . means any one character.
# get metrics with names consisting of 3 letters
metrics <- get_metrics(expression = "name like '...'")
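The wildcard rules above can be tried out locally by translating a like pattern into a regular expression. This is only a sketch of the semantics as described in this section. The helpers like_to_regex() and like_match() are hypothetical, and the actual matching is done by ATSD on the server, which may differ in details such as case sensitivity.

```r
# Hypothetical helper: translate a like-pattern into an anchored regex.
# '*' becomes '.*' (zero or more characters), '.' stays '.' (any one
# character), and every other non-alphanumeric character is escaped.
like_to_regex <- function(pattern) {
  chars <- strsplit(pattern, "")[[1]]
  parts <- vapply(chars, function(ch) {
    if (ch == "*") {
      ".*"
    } else if (ch == ".") {
      "."
    } else if (grepl("^[[:alnum:]_]$", ch)) {
      ch
    } else {
      paste0("\\", ch)
    }
  }, character(1), USE.NAMES = FALSE)
  paste0("^", paste(parts, collapse = ""), "$")
}

# Hypothetical helper: TRUE where x matches the like-pattern.
like_match <- function(x, pattern) {
  grepl(like_to_regex(pattern), x, perl = TRUE)
}

metric_names <- c("cpu", "cpu_busy", "disk_used", "disk_used_percent")
metric_names[like_match(metric_names, "disk*")]  # names starting with "disk"
metric_names[like_match(metric_names, "...")]    # names of exactly 3 characters
```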
There are additional functions you can use in an expression:
list(string, delimiter)
Splits the string by the delimiter. The default delimiter is a comma.
upper(string)
Converts the string argument to upper case.
lower(string)
Converts the string argument to lower case.
collection(name)
Refers to a named collection of strings created in ATSD.
likeAll(string, collection of patterns)
Returns true if every element in the collection of patterns matches the given string.
likeAny(string, collection of patterns)
Returns true if at least one element in the collection of patterns matches the given string.
The atsd package uses connection parameters to connect with ATSD. These parameters are:
url - the url of ATSD including port number
user - the user name
password - the user’s password
verify - whether the ATSD SSL certificate should be validated
encryption - cryptographic protocol used by ATSD https server
The configuration parameters are loaded from the package configuration file when you load the atsd package into R. (See Section 2.)
The functions show_connection(), set_connection(), and save_connection() show the configuration parameters, change them, and store them in the configuration file.
Function name: show_connection()
Returns object: NULL
Description: The function prints current values of the connection parameters. (They may be different from the values in the configuration file.)
Arguments: none
Examples:
Function name: set_connection()
Returns object: NULL
Description: The function overrides the connection parameters for the duration of the current R session without changing the configuration file. If called without arguments, the function sets the connection parameters from the package configuration file. If the file argument is provided, the function reads the parameters from that file. In both cases, the current values of the parameters become the same as in the file. If the file argument is not provided but some of the other arguments are specified, only the specified parameters are changed.
Arguments:
url (optional, string)
The url of ATSD including port number.
user (optional, string)
The user name.
password (optional, string)
The user’s password.
verify (optional, string)
String, "yes" or "no". verify = "yes" ensures validation of the ATSD SSL certificate, and verify = "no" suppresses the validation (applicable in the case of the 'https' protocol).
encryption (optional, string)
Cryptographic protocol used by ATSD https server. Possible values are: “default”, “ssl2”, “ssl3”, and “tls1” (In most cases, use “ssl3” or “tls1”.)
file (optional, string)
The absolute path to the file from which the connection parameters could be read. The file should be formatted as the package configuration file, see Section 2.
Examples:
# Modify the user
set_connection(user = "user001")
# Modify the cryptographic protocol
set_connection(encryption = "tls1")
# Set the parameters of the https connection: url, user name, password,
# whether the certificate of the server should be verified,
# and which cryptographic protocol is used for communication
set_connection(url = "https://my.company.com:8443",
user = "user001",
password = "123456",
verify = "no",
encryption = "ssl3")
# Set up the connection parameters from the file:
set_connection(file = "/home/user001/atsd_https_connection.txt")
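The rule that only the specified parameters are changed can be modeled with a plain list. The sketch below is a local stand-in, not the package's implementation: the list current plays the role of the session's connection parameters, and the hypothetical function override() mimics a call such as set_connection(user = ..., encryption = ...).

```r
# Local model only: a plain list stands in for the session's parameters.
current <- list(
  url = "http://host_name:port_number",
  user = "atsd_user_name",
  password = "atsd_user_password",
  verify = "no",
  encryption = "ssl3"
)

# Hypothetical stand-in for set_connection() called with named arguments:
# values that are supplied replace the old ones, all others are kept.
override <- function(params, ...) {
  utils::modifyList(params, list(...))
}

updated <- override(current, user = "user001", encryption = "tls1")
str(updated)
```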
Function name: save_connection()
Returns object: NULL
Description: The function writes the connection parameters into the configuration file. If called without arguments, the function uses the current values of the connection parameters (including NAs). Otherwise, only the provided arguments are written to the configuration file. If the configuration file is absent, it is created in the atsd package folder.
Arguments:
url (optional, string)
The url of ATSD including port number.
user (optional, string)
The user name.
password (optional, string)
The user’s password.
verify (optional, string)
String, "yes" or "no". verify = "yes" ensures validation of the ATSD SSL certificate, and verify = "no" suppresses the validation (applicable in the case of the 'https' protocol).
encryption (optional, string)
Cryptographic protocol used by ATSD https server. Possible values are: “default”, “ssl2”, “ssl3”, and “tls1” (In most cases, use “ssl3” or “tls1”.)
Examples:
# Write the current values of the connection parameters to the configuration file.
save_connection()
# Write the user name and password in the configuration file.
save_connection(user = "user00", password = "123456")
# Write all parameters needed for the https connection to the configuration file.
save_connection(url = "https://my.company.com:8443",
user = "user001",
password = "123456",
verify = "no",
encryption = "ssl3")