How to Use the atsd Package to Communicate with Axibase Time-Series Database

2018-01-29

Contents

  1. Package Overview
  2. Connecting to ATSD
  3. Querying ATSD
  4. Transforming Data Frame to a zoo Object
  5. Getting Metrics
  6. Getting Entities
  7. Getting Time Series Tags
  8. Saving Time-series in ATSD
  9. Expression Syntax
  10. Advanced Connection Options

1. Package Overview

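The package allows you to query time-series data and statistics from Axibase Time-Series Database (ATSD) and to save time-series data in ATSD. The package functions covered in this document:

  query(): retrieves historical time-series data or forecasts from ATSD
  to_zoo(): builds a zoo object from a data frame
  get_metrics(): fetches a list of metrics and their tags
  get_entities(): fetches a list of entities and their tags
  get_series_tags(): lists the time series and their tags for a given metric
  save_series(): saves time-series from a data frame into ATSD
  show_connection(), set_connection(), save_connection(): show, change, and store the connection parameters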


2. Connecting to ATSD

Execute library(atsd) to start working with the atsd package. The connection parameters are loaded from the package configuration file, atsd/connection.config, which is located in the atsd package folder. The command

installed.packages()["atsd", "LibPath"]

shows you where the atsd package folder is. Open a text editor and modify the configuration file. It should look as follows:

 # the url of ATSD including port number
 url=http://host_name:port_number   
 
 # the user name
 user=atsd_user_name
 
 # the user's password
 password=atsd_user_password   
 
 # validate ATSD SSL certificate: yes, no
 verify=no  
 
 # cryptographic protocol used by ATSD https server:
 # default, ssl2, ssl3, tls1
 encryption=ssl3   
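
If you want to check an edited file before reloading it, the key=value layout shown above can be read with a few lines of base R. The following is a minimal sketch, not the parser used by the atsd package: read_connection_config() is a hypothetical helper, and the example writes the sample settings to a temporary file instead of touching the real configuration file.

```r
# Hypothetical helper (not part of atsd): read key=value lines,
# skipping blank lines and lines that start with "#".
read_connection_config <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Split on the first "=" only, so values may themselves contain "=".
  keys <- trimws(sub("=.*$", "", lines))
  values <- trimws(sub("^[^=]*=", "", lines))
  as.list(setNames(values, keys))
}

# Sample content with the placeholder values from the listing above.
sample_config <- c(
  "# the url of ATSD including port number",
  "url=http://host_name:port_number",
  "",
  "user=atsd_user_name",
  "password=atsd_user_password",
  "verify=no",
  "encryption=ssl3"
)
config_path <- tempfile(fileext = ".config")
writeLines(sample_config, config_path)

params <- read_connection_config(config_path)
str(params)
```

A missing or misspelled key shows up immediately in the printed list, which is easier to spot than a failed connection later.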

Reload the modified connection parameters from the configuration file:

set_connection()

Check that parameters are correct:

show_connection()

Refer to Section 10 for more options on managing ATSD connection parameters.


3. Querying ATSD

Function name: query()

Description: The function retrieves historical time-series data or forecasts from ATSD.

Returns object: data frame

Arguments:

Examples:

# get historical data for the given entity, metric, and selection_interval
dfr <- query(entity = "nurswgvml007", metric = "cpu_busy", selection_interval = "1-Hour")

# end_time usage example
query(entity = "host-383", metric = "cpu_usage", selection_interval = "1-Day", 
      end_time = "date('2015-02-10 10:15:03')")

# get forecasts
query(metric = "cpu_busy", selection_interval = "30-Minute", 
    export_type = "Forecast", verbose = FALSE)

# use aggregation
query(metric = "disk_used_percent", entity_group = "Linux", 
      tags = c("mount_point=/boot", "file_system=/dev/sda1"), 
      selection_interval = "1-Week", aggregate_interval = "1-Minute",
      aggregate_statistics = c("Avg", "Min", "Max"), 
      interpolation = "Linear", export_type = "Forecast")


4. Transforming Data Frame to a zoo Object

Function name: to_zoo()

Description: The function builds a zoo object from the given data frame. The timestamp argument names the column of the data frame that is used as the index of the zoo object. The value argument names the series that will be saved in the zoo object. If several columns are listed in the value argument, they are all saved in one multivariate zoo object. Information from other columns is ignored. To use this function, the zoo package must be installed.

Returns object: zoo object

Arguments:

Examples:

# query ATSD for data and transform it to zoo object
dfr <- query(entity = "nurswgvml007", metric = "cpu_busy", selection_interval = "1-Hour")
z <- to_zoo(dfr)
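
To see what the conversion amounts to without an ATSD connection, the sketch below builds a multivariate zoo object directly with zoo::zoo() from a small hand-made data frame. It is an illustration only, not the implementation of to_zoo(): the column names Timestamp, Avg, and Max and all values are assumptions made for this example, and the zoo call is skipped if the zoo package is not installed.

```r
# Hand-made data frame standing in for a query() result.
# Column names and values are assumptions for this illustration.
dfr <- data.frame(
  Timestamp = as.POSIXct("2015-02-10 10:00:00", tz = "GMT") + 60 * 0:2,
  Avg = c(10.5, 11.0, 9.8),
  Max = c(14.0, 15.2, 12.1)
)

z <- NULL
if (requireNamespace("zoo", quietly = TRUE)) {
  # The timestamp column becomes the index; the two value columns
  # become the columns of a multivariate zoo object.
  z <- zoo::zoo(as.matrix(dfr[, c("Avg", "Max")]), order.by = dfr$Timestamp)
  print(z)
}
```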


5. Getting Metrics

Function name: get_metrics()

Description: This function fetches a list of metrics and their tags from ATSD, and converts it to a data frame.

Returns object: data frame

Each row of the data frame corresponds to a metric and its tags:

Arguments:

Examples:

# get all metrics and include all their tags in the data frame
metrics <- get_metrics()

# get the first 100 active metrics which have the tag "table",
# include this tag in the response and exclude other user-defined metric tags
metrics <- get_metrics(expression = "tags.table != ''", active = "true", 
                       tags = "table", limit = 100)


6. Getting Entities

Function name: get_entities()

Description: This function fetches a list of entities and their tags from ATSD, and converts it to a data frame.

Returns object: data frame

Each row of the data frame corresponds to an entity and its tags:

Arguments:

Examples:

# get all entities
entities <- get_entities()

# select entities by name and user-defined tag "app" 
entities <- get_entities(expression = 
                         "name like 'nur*' and lower(tags.app) like '*hbase*'")


7. Getting Time Series Tags

Function name: get_series_tags()

Description: The function determines which time series ATSD has collected for a given metric. For each time series it lists the tags associated with the series and the last time the series was updated. The list of fetched time series is based on data stored on disk for the last 24 hours.

Returns object: data frame

Each row of the data frame corresponds to a time series and its tags:

Arguments:

Examples:

# get all time series and their tags collected by ATSD for the "disk_used_percent" metric
tags <- get_series_tags(metric = "disk_used_percent")

# get all time series and their tags for the "disk_used_percent" metric
# and the "nurswgvml007" entity
get_series_tags(metric = "disk_used_percent", entity = "nurswgvml007")


8. Saving Time-series in ATSD

Function name: save_series()

Description: Save time-series from the data frame into ATSD. The data frame should have a column with timestamps and at least one numeric column with values of a metric.

Returns object: NULL

Arguments:

Timestamp format.

The list of allowed timestamp types.

Note that timestamps are stored in epoch milliseconds. If you put some data into ATSD and then retrieve it, the timestamps refer to the same instant but are expressed in the GMT time zone. For example, if you save the timestamp "2015-02-15 10:00:00" with tz = "Australia/Darwin" in ATSD and then retrieve it, you get the timestamp "2015-02-15 00:30:00", because the Australia/Darwin time zone has a +09:30 shift relative to GMT.
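
The round trip in this example can be reproduced locally with base R date-time functions, without contacting ATSD. The sketch assumes that the time zone database of your R installation includes Australia/Darwin; the variable names are chosen for this illustration.

```r
# Local time as it would be passed in with tz = "Australia/Darwin".
local_time <- as.POSIXct("2015-02-15 10:00:00", tz = "Australia/Darwin")

# Epoch milliseconds: the time zone is no longer part of the value.
epoch_ms <- as.numeric(local_time) * 1000

# Reading the value back and printing it in GMT.
restored <- as.POSIXct(epoch_ms / 1000, origin = "1970-01-01", tz = "GMT")
gmt_string <- format(restored, "%Y-%m-%d %H:%M:%S")
print(gmt_string)  # "2015-02-15 00:30:00"
```

If you need the original wall-clock time again, format the restored value with tz = "Australia/Darwin".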

Entity specification

You can provide the entity name in either the entity or the entity_col argument. In the first case all series have the same entity. In the second case, the entities listed in the entity_col column are saved along with the corresponding series.

Tags specification

The tags_col argument indicates which columns of the data frame keep the time-series tags. The name of each column specified by the tags_col argument is a tag name, and the values in the column are tag values.

Before the series are stored in ATSD, the data frame is split into several data frames, each of which has a unique entity and a unique list of tag values. The entity and tags are stored in ATSD along with the time-series from that data frame. NAs and missing values in the time-series are ignored.
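
The grouping described above can be pictured with base R's split(). This is a sketch of the idea only, not the code that save_series() runs: the data frame, its column names, and its values are made up for this illustration.

```r
# Made-up data: one metric column, one entity column, one tag column.
dfr <- data.frame(
  time = as.POSIXct("2015-02-15 10:00:00", tz = "GMT") + 60 * 0:5,
  entity = c("host-a", "host-a", "host-b", "host-b", "host-a", "host-b"),
  mount_point = c("/", "/boot", "/", "/", "/", "/boot"),
  disk_used = c(41.5, 12.0, NA, 63.2, 41.7, 9.8),
  stringsAsFactors = FALSE
)

# Rows with missing values are dropped, mirroring "NAs ... are ignored".
dfr <- dfr[!is.na(dfr$disk_used), ]

# One data frame per unique (entity, tag value) combination.
groups <- split(dfr, list(dfr$entity, dfr$mount_point), drop = TRUE)
for (key in names(groups)) {
  cat(key, ":", nrow(groups[[key]]), "row(s)\n")
}
```

Each element of groups corresponds to one series: a single entity with a single combination of tag values.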

In the tags argument you can specify tags that are the same for all rows (records) of the data frame. Each series value saved in ATSD will then have the tags provided in the tags argument.

Examples:

# Save time-series from columns 3, 4, 5 of data frame dfr.
# Timestamps are saved as strings in 2nd column 
# and their format string and time zone are provided.
# Entities and tags are in columns 1, 6, 7.
# All saved series will have tag "os_type" with value "linux".
save_series(dfr, time_col = 2, time_format = "%Y/%m/%d %H:%M:%S", tz = "Australia/Darwin", 
            metric_col = c(3, 4, 5), entity_col = 1, tags_col = c(6, 7), 
            tags = "os_type = linux")
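
Before calling save_series() it can help to confirm that the data frame matches the layout the call expects. The sketch below builds a small data frame with the column order assumed in the example above (entity, timestamp string, three metric columns, two tag columns) and checks it with base R only. All column names and values are made up for this illustration, and nothing is sent to ATSD.

```r
# Made-up data frame in the column order used by the example above.
dfr <- data.frame(
  entity = c("host-a", "host-b"),
  time = c("2015/02/15 10:00:00", "2015/02/15 10:01:00"),
  cpu_busy = c(12.5, 40.1),
  memory_used = c(61.0, 72.3),
  disk_used = c(41.5, 63.2),
  mount_point = c("/", "/"),
  file_system = c("/dev/sda1", "/dev/sdb1"),
  stringsAsFactors = FALSE
)

# The timestamp strings in column 2 must parse with the format string
# and time zone that will be passed as time_format and tz.
parsed <- as.POSIXct(strptime(dfr[[2]], format = "%Y/%m/%d %H:%M:%S",
                              tz = "Australia/Darwin"))
all_parsed <- !anyNA(parsed)

# The metric columns 3, 4, 5 must be numeric.
numeric_ok <- vapply(dfr[, 3:5], is.numeric, logical(1))

print(all_parsed)
print(numeric_ok)
```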


9. Expression Syntax

In this section, we explain the syntax of the expression argument of the functions get_metrics() and get_entities(). The expression is used to filter the results: only the metrics or entities for which the expression evaluates to TRUE are returned.

The variable name is used to select metrics/entities by names:

# get metric with name 'cpu_busy'
metrics <- get_metrics(expression = "name = 'cpu_busy'", verbose = FALSE)

Metrics and entities have user-defined tags. Each of these tags is a pair (“tag_name” : “tag_value”). The variable tags.tag_name in an expression refers to the tag_value for the given metric/entity. If a metric/entity does not have this tag, the tag_value will be an empty string.

# get metrics that have a 'source' tag, and include all tags of fetched metrics in output
get_metrics(expression = "tags.source != ''", tags = "*")

To get metrics with a user-defined tag ‘table’ equal to ‘System’:

# get metrics whose tag 'table' is equal to 'System'
metrics <- get_metrics(expression = "tags.table = 'System'", tags = "*")

To build more complex expressions, use the brackets ( and ), and the logical operators and, or, not, or their equivalents &&, ||, !.

entities <- get_entities(expression = "tags.app != '' and (tags.os != '' or tags.ip != '')")
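
To check your reading of such an expression, you can evaluate the same condition on a small local data frame in which a missing tag is represented by an empty string, as described above. This is an illustration in plain R, not how ATSD evaluates expressions, and the tag values are made up.

```r
# Made-up entities; an empty string stands for a tag that is not set.
entities <- data.frame(
  name = c("nurswgvml006", "nurswgvml007", "derby-test", "atom.axibase.com"),
  app = c("hbase", "", "derby", "web"),
  os = c("linux", "linux", "", ""),
  ip = c("", "10.0.0.7", "", "10.0.0.9"),
  stringsAsFactors = FALSE
)

# R equivalent of: tags.app != '' and (tags.os != '' or tags.ip != '')
keep <- entities$app != "" & (entities$os != "" | entities$ip != "")
selected <- entities$name[keep]
print(selected)  # "nurswgvml006" "atom.axibase.com"
```

The second entity is dropped because it has no app tag, and the third because it has neither an os tag nor an ip tag.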

To test if a string is in a collection, use the in operator:

get_entities(expression = "name in ('derby-test', 'atom.axibase.com')")

Use the like operator to match values against patterns containing wildcards: expression = "name like 'disk*'". The wildcard * means zero or more characters. The wildcard . means any one character.

metrics <- get_metrics(expression = "name like '*cpu*' and tags.table = 'System'")
# get metrics with names consisting of 3 letters
metrics <- get_metrics(expression = "name like '...'")
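
The two wildcard rules can be tried out locally by translating a pattern into a regular expression. The helper like() below is hypothetical and exists only for this illustration. It implements just the two rules stated above and may differ from ATSD's matching in other respects, such as case sensitivity.

```r
# Hypothetical helper: "*" matches zero or more characters,
# "." matches exactly one character, everything else is literal.
like <- function(x, pattern) {
  chars <- strsplit(pattern, "", fixed = TRUE)[[1]]
  parts <- vapply(chars, function(ch) {
    if (ch == "*") {
      ".*"
    } else if (ch == ".") {
      "."
    } else if (grepl("[[:alnum:]_]", ch)) {
      ch
    } else {
      paste0("\\", ch)  # escape other punctuation so it stays literal
    }
  }, character(1))
  regex <- paste0("^", paste(parts, collapse = ""), "$")
  grepl(regex, x, perl = TRUE)
}

disk_match <- like(c("disk_used", "cpu_busy", "disk"), "disk*")
three_chars <- like(c("cpu", "disk", "io"), "...")
print(disk_match)   # TRUE FALSE TRUE
print(three_chars)  # TRUE FALSE FALSE
```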

There are additional functions you can use in an expression:

get_metrics(expression = "likeAll(lower(name), list('cpu*,*use*'))")
get_metrics(expression = "likeAny(lower(name), list('cpu*,*use*'))")
get_metrics(expression = "name in collection('fs_ignore')")


10. Advanced Connection Options

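The atsd package uses the following connection parameters to connect to ATSD (the same keys as in the configuration file shown in Section 2):

  url: the URL of ATSD, including the port number
  user: the user name
  password: the user's password
  verify: whether to validate the ATSD SSL certificate (yes or no)
  encryption: the cryptographic protocol used by the ATSD https server (default, ssl2, ssl3, or tls1)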

The configuration parameters are loaded from the package configuration file when you load the atsd package into R. (See Section 2.)

The functions show_connection(), set_connection(), and save_connection() show the configuration parameters, change them, and store them in the configuration file.


Function name: show_connection()

Returns object: NULL

Description: The function prints current values of the connection parameters. (They may be different from the values in the configuration file.)

Arguments: none

Examples:

show_connection()


Function name: set_connection()

Returns object: NULL

Description: The function overrides the connection parameters for the duration of the current R session without changing the configuration file. If called without arguments, the function sets the connection parameters from the configuration file. If the file argument is provided, the function reads the parameters from that file instead. In both cases the current values of the parameters become the same as in the file. If the file argument is not provided but some of the other arguments are specified, only the specified parameters are changed.

Arguments:

Examples:

# Modify the user 
set_connection(user = "user001")

# Modify the cryptographic protocol 
set_connection(encryption = "tls1")

# Set the parameters of the https connection: url, user name, password,
# whether the server certificate should be verified,
# and which cryptographic protocol is used for communication
set_connection(url = "https://my.company.com:8443", 
               user = "user001", 
               password = "123456", 
               verify = "no", 
               encryption = "ssl3")

# Set up the connection parameters from the file:
set_connection(file = "/home/user001/atsd_https_connection.txt")
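
The partial-override behaviour described for set_connection() (only the arguments you pass are changed, the rest keep their current values) can be pictured with a named list and utils::modifyList(). This sketch is not the package's implementation: override_params() is a hypothetical helper and the starting values are the placeholders from Section 2.

```r
# Placeholder values standing in for the current session parameters.
current <- list(
  url = "http://host_name:port_number",
  user = "atsd_user_name",
  password = "atsd_user_password",
  verify = "no",
  encryption = "ssl3"
)

# Hypothetical helper: replace only the parameters that are supplied.
override_params <- function(params, ...) {
  utils::modifyList(params, list(...))
}

updated <- override_params(current, user = "user001", encryption = "tls1")
str(updated)
```

After the call, user and encryption carry the new values while url, password, and verify are unchanged.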


Function name: save_connection()

Returns object: NULL

Description: The function writes the connection parameters to the configuration file. If called without arguments, the function uses the current values of the connection parameters (including NAs). Otherwise only the provided arguments are written to the configuration file. If the configuration file is absent, it is created in the atsd package folder.

Arguments:

Examples:

# Write the current values of the connection parameters to the configuration file.
save_connection()
 
# Write the user name and password in the configuration file.
save_connection(user = "user00", password = "123456")
 
# Write all parameters needed for the https connection to the configuration file.
save_connection(url = "https://my.company.com:8443", 
                user = "user001", 
                password = "123456", 
                verify = "no", 
                encryption = "ssl3")
