The sodium R package provides bindings to libsodium: a modern, easy-to-use software library for encryption, decryption, signatures, password hashing and more.
The goal of Sodium is to provide the core operations needed to build higher-level cryptographic tools. It is not intended for implementing standardized protocols such as TLS, SSH or GPG. Sodium only supports a limited set of state-of-the-art elliptic curve methods, resulting in a simple but very powerful tool-kit for building secure applications.
| | Authenticated Encryption | Encryption only | Authentication only |
|---|---|---|---|
| Symmetric (secret key): | data_encrypt / data_decrypt | | data_tag |
| Asymmetric (public+private key): | auth_encrypt / auth_decrypt | simple_encrypt / simple_decrypt | sig_sign / sig_verify |
All Sodium functions operate on binary data, called ‘raw’ vectors in R. Use charToRaw and rawToChar to convert between strings and raw vectors. Alternatively, hex2bin and bin2hex convert between binary data and strings in hex notation:
test <- hash(charToRaw("test 123"))
str <- bin2hex(test)
print(str)
[1] "e8b785b02e702c0b7edc9683130db36c91e0241ba0c489ff1e20cbb4fa3920f9"
hex2bin(str)
[1] e8 b7 85 b0 2e 70 2c 0b 7e dc 96 83 13 0d b3 6c 91 e0 24 1b a0 c4 89 ff 1e
[26] 20 cb b4 fa 39 20 f9
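The conversion helpers can be chained into a full round trip. The following is a minimal sketch (the variable names are illustrative) that takes a string to raw bytes, to hex notation, and back again:

```r
library(sodium)

# Text -> raw bytes -> hex string -> raw bytes -> text
txt   <- "test 123"
bytes <- charToRaw(txt)
hex   <- bin2hex(bytes)
back  <- rawToChar(hex2bin(hex))

# The round trip should return the original string
stopifnot(identical(back, txt))
```

Note that rawToChar is only meaningful when the raw vector actually holds text; the output of hash() or random() is better displayed with bin2hex.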
The random() function generates n bytes of unpredictable data, suitable for creating secret keys.
secret <- random(8)
print(secret)
[1] 6c d7 1a 09 68 9c 96 53
The implementation is platform specific; see the docs for details.
Sodium has several hash functions including hash(), shorthash(), sha256(), sha512() and scrypt(). The generic hash() is usually recommended. It uses blake2b with a configurable size between 16 bytes (128 bit) and 64 bytes (512 bit).
# Generate keys from passphrase
passphrase <- charToRaw("This is super secret")
hash(passphrase)
[1] 98 5c 9b b6 f6 92 d5 26 10 80 99 25 3e a5 a6 66 67 13 fd 88 10 b6 12 74 86
[26] c8 e9 5c 44 07 45 f5
hash(passphrase, size = 16)
[1] eb 6c df 04 18 40 16 28 c1 b0 2e 76 f3 e6 bd 89
hash(passphrase, size = 64)
[1] d0 89 68 30 26 1d 1b 85 76 dc ad 20 c9 58 0a fb b1 d0 62 ba 10 d6 80 f6 cb
[26] c6 ae 2d 42 57 ee a0 65 fd b0 e8 90 02 ae b3 e0 4f 88 df ba ea 26 bb 47 3f
[51] 29 5a a4 06 cd b8 05 78 83 31 66 dc 7b 24
The shorthash() function is a special 8 byte (64 bit) hash based on SipHash-2-4. Because the output is so short, it is useful in e.g. hash tables, but it should not be considered collision-resistant.
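As a rough illustration of the hash table use case, the sketch below maps a few strings to short hashes. It assumes that shorthash() takes a 16 byte key as its second argument, as described in the package documentation; the key here is derived from an arbitrary example passphrase.

```r
library(sodium)

# shorthash() needs a key (assumed to be 16 bytes); derive one with hash()
key <- hash(charToRaw("hash table key"), size = 16)

# Map some strings to 8 byte hashes, shown in hex notation
words <- c("apple", "banana", "cherry")
hashes <- vapply(words, function(w) bin2hex(shorthash(charToRaw(w), key)),
                 character(1))

# 8 bytes correspond to 16 hex characters
stopifnot(all(nchar(hashes) == 16))

# The same input and key always give the same hash
stopifnot(identical(shorthash(charToRaw("apple"), key),
                    shorthash(charToRaw("apple"), key)))
```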
Symmetric encryption uses the same secret key for both encryption and decryption. It is mainly useful for encrypting local data, or as a building block for more complex methods.
Most encryption methods require a nonce: a piece of non-secret unique data that is used to randomize the cipher. This allows for safely using the same key for encrypting multiple messages. The nonce should be stored or shared along with the ciphertext.
key <- hash(charToRaw("This is a secret passphrase"))
msg <- serialize(iris, NULL)
# Encrypt with a random nonce
nonce <- random(24)
cipher <- data_encrypt(msg, key, nonce)
# Decrypt with same key and nonce
orig <- data_decrypt(cipher, key, nonce)
identical(iris, unserialize(orig))
[1] TRUE
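One simple way to keep the nonce together with the ciphertext is to prepend it to the encrypted bytes. This is only a sketch of that convention, not something the package prescribes; the names bundle, nonce_in and cipher_in are made up for illustration.

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))
msg <- charToRaw("hello world")

# Sender: use a fresh nonce per message and prepend it to the ciphertext
nonce <- random(24)
cipher <- data_encrypt(msg, key, nonce)
bundle <- c(nonce, cipher)

# Receiver: split off the first 24 bytes again and decrypt the rest
nonce_in <- bundle[1:24]
cipher_in <- bundle[-(1:24)]
out <- data_decrypt(cipher_in, key, nonce_in)
stopifnot(identical(out, msg))
```

Because the nonce is not secret, storing it in the clear next to the ciphertext does not weaken the encryption; what matters is that it is not reused with the same key.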
Because the secret has to be known by all parties, symmetric encryption by itself is often impractical for communication with third parties. For this we need asymmetric (public key) methods.
Secret key authentication is called tagging in Sodium. A tag is basically a hash of the data together with a secret key.
key <- hash(charToRaw("This is a secret passphrase"))
msg <- serialize(iris, NULL)
mytag <- data_tag(msg, key)
To verify the integrity of the data at a later point in time, simply re-calculate the tag with the same key:
stopifnot(identical(mytag, data_tag(msg, key)))
The secret key protects against forgery of the data+tag by an intermediate party, as would be possible with a regular checksum.
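To make this concrete, the sketch below checks that the tag changes when either the data or the key changes. The message text and the wrong_key passphrase are arbitrary examples.

```r
library(sodium)

key <- hash(charToRaw("This is a secret passphrase"))
msg <- charToRaw("transfer 100 to alice")
mytag <- data_tag(msg, key)

# Flip a single bit in the first byte of the message
tampered <- msg
tampered[1] <- as.raw(bitwXor(as.integer(tampered[1]), 1L))

# The tag of the modified message no longer matches
stopifnot(!identical(mytag, data_tag(tampered, key)))

# Without the right key, the original tag cannot be reproduced either
wrong_key <- hash(charToRaw("guess"))
stopifnot(!identical(mytag, data_tag(msg, wrong_key)))
```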
Where symmetric methods use the same secret key for encryption and decryption, asymmetric methods use a key-pair consisting of a public key and private key. The private key is secret and only known by its owner. The public key on the other hand can be shared with anyone. Public keys are often published on the user’s website or posted in public directories or keyservers.
key <- keygen()
pub <- pubkey(key)
In public key encryption, data encrypted with a public key can only be decrypted using the corresponding private key. This allows anyone to send somebody a secure message by encrypting it with the receiver's public key. The encrypted message will only be readable by the owner of the corresponding private key.
# Encrypt message with pubkey
msg <- serialize(iris, NULL)
ciphertext <- simple_encrypt(msg, pub)
# Decrypt message with private key
out <- simple_decrypt(ciphertext, key)
stopifnot(identical(out, msg))
Public key authentication works the other way around. First, the owner of the private key creates a ‘signature’ (an authenticated checksum) for a message. Anyone who knows the owner's public key can then use the signature to verify the integrity of the message and the identity of the sender.
Currently sodium requires a different type of key-pair for signatures (ed25519) than for encryption (curve25519).
# Generate signature keypair
key <- sig_keygen()
pubkey <- sig_pubkey(key)
# Create signature with private key
msg <- serialize(iris, NULL)
sig <- sig_sign(msg, key)
print(sig)
[1] ac 8e db bd 7f 2f f7 22 c4 12 e8 37 d1 69 64 11 d1 69 d6 e7 49 77 5b fd bd
[26] a9 ef fe 3f 0d 3a ce a5 70 33 60 4f 6a 59 f5 e5 a6 09 28 ae e9 93 bf 0a d8
[51] 0d 42 7b 57 4e bd b2 2a 34 4c 40 5c 0c 02
# Verify a signature from public key
sig_verify(msg, sig, pubkey)
[1] TRUE
Signatures are useful when the message itself is not confidential but integrity is important. A common use is in software repositories, which include an index file with checksums for all packages, signed by the repository maintainer. This allows client package managers to verify that the binaries were not manipulated by intermediate parties during the distribution process.
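The repository scenario can be sketched roughly as follows. The package names and contents are hypothetical stand-ins, and the index format (one "name checksum" line per package) is an assumption made for this example. The verify() helper wraps sig_verify() in tryCatch so that a failed verification is treated as FALSE whether it is reported as an error or as a return value.

```r
library(sodium)

# Hypothetical package files, represented here as raw vectors
packages <- list(
  "foo_1.0.tar.gz" = charToRaw("contents of foo"),
  "bar_2.1.tar.gz" = charToRaw("contents of bar")
)

# Maintainer: build an index of checksums and sign it
index <- vapply(packages, function(x) bin2hex(sha256(x)), character(1))
index_raw <- charToRaw(paste(names(index), index, collapse = "\n"))
key <- sig_keygen()
pubkey <- sig_pubkey(key)
sig <- sig_sign(index_raw, key)

# Client: check the index signature against the maintainer's public key
verify <- function(data) {
  tryCatch(isTRUE(sig_verify(data, sig, pubkey)), error = function(e) FALSE)
}
stopifnot(verify(index_raw))

# Client: check a downloaded file against the verified index
download <- packages[["foo_1.0.tar.gz"]]
stopifnot(identical(bin2hex(sha256(download)), index[["foo_1.0.tar.gz"]]))

# A modified index no longer matches the signature
stopifnot(!verify(c(index_raw, charToRaw(" "))))
```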
Authenticated encryption implements best practices for secure messaging. It requires that both sender and receiver have a keypair and know each other’s public key. Each message gets authenticated with the key of the sender and encrypted with the key of the receiver.
# Bob's keypair:
bob_key <- keygen()
bob_pubkey <- pubkey(bob_key)
# Alice's keypair:
alice_key <- keygen()
alice_pubkey <- pubkey(alice_key)
# Bob sends encrypted message for Alice:
msg <- charToRaw("TTIP is evil")
ciphertext <- auth_encrypt(msg, bob_key, alice_pubkey)
# Alice verifies and decrypts with her key
out <- auth_decrypt(ciphertext, alice_key, bob_pubkey)
stopifnot(identical(out, msg))
# Alice sends encrypted message for Bob
msg <- charToRaw("Let's protest")
ciphertext <- auth_encrypt(msg, alice_key, bob_pubkey)
# Bob verifies and decrypts with his key
out <- auth_decrypt(ciphertext, bob_key, alice_pubkey)
stopifnot(identical(out, msg))
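The authentication part can be illustrated with a forged message. In this sketch a hypothetical third party, Mallory, encrypts a message for Alice while pretending to be Bob. It assumes that auth_decrypt() raises an error when authentication fails, which is caught here with tryCatch.

```r
library(sodium)

bob_key <- keygen()
bob_pubkey <- pubkey(bob_key)
alice_key <- keygen()
alice_pubkey <- pubkey(alice_key)
mallory_key <- keygen()  # hypothetical attacker, not part of the example above

# Mallory encrypts for Alice, pretending to be Bob
forged <- auth_encrypt(charToRaw("Send me the password"), mallory_key, alice_pubkey)

# Alice checks the message against Bob's public key, so decryption should fail
result <- tryCatch(auth_decrypt(forged, alice_key, bob_pubkey),
                   error = function(e) NULL)
stopifnot(is.null(result))
```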
Note that even though public keys are not confidential, you should not exchange them over the same insecure channel you are trying to protect. If the connection is being tampered with, the attacker could simply replace the key with another one to hijack the interaction.