1. Introduction

Working with raw bytes is tedious, and sometimes the language abstractions for working with bytes is not very pleasant. octet library offers a simple api for clojure (jvm) and clojurescript (js) that makes working with bytebuffer painless.

This is a short list of project goals:

  • Not to be intrusive (no bytebuffer wrapping).

  • Provide host independent abstraction (in most possible way).

  • Composability.

1.1. Project Maturity

Since octet is a young project there can be some API breakage.

1.2. Install

The simplest way to use octet in a clojure project, is by including it in the dependency vector on your project.clj file:

[funcool/octet "1.1.1"]

And the library works with the following platforms: jdk7, jdk8, node-lts.

2. Getting started

The main goal of octet is provide, multiplatform abstraction for work with byte buffer’s. Offering a ligweight api for define message types in a declarative way and use them for read or write to bytebuffers.

As previously said, octet works with both most used clojure implementations: clojure & clojurescript. Each platform has its own byte buffer abstraction:

2.1. Define a spec

A spec in octet glossary represents a type definition or a composition of types. Two most common composition types are: associative and indexed.

The difference of indexed and associative compositions is the input and output. In associative composition the expected input and output is a map. And in indexed composition, the expected input and input is a vector. Internally them represents the same value in bytes.

Let start defining one:

(require '[octet.core :as buf])

;; Indexed composition
(def my-spec1 (buf/spec buf/int32 buf/bool))

;; The same spec but using associative composition
(def my-spec2 (buf/spec :field1 buf/int32
                        :field2 buf/bool))

You can check that the spec size (in bytes) and number of types internally is the same for both:

Example using size and count functions on specs
(buf/size my-spec1)
;; => 5

(count my-spec1)
;; => 2

(buf/size my-spec2)
;; => 5

(count my-spec2)
;; => 2

2.2. Creating buffer

The next piece in the puzzle is a way to create (or allocate) new byte buffers. This operation is almost platform independent if the library defaults satisfies you.

Example allocating a 24 bytes size byte buffer with default implementation
;; Allocate bytebuffer with 24 bytes of size
(def buffer (buf/allocate 24))

The buffer allocation is parametrizable so you can specify the concrete implementation to use and the type of buffer:

Example allocation 24 bytes size buffer with nio buffer implementation
;; This is a default if you are using clojure
(def buffer (buf/allocate 24 {:impl :nio :type :heap}))

It there are two types of buffers: :heap and :direct. The :heap type of buffer uses the jdk/node heap for store the data and the :direct buffer type stores the data out of the virtual machine heap. The main advantage of use :direct buffers is that they are not affect the GC and may enable have less GC pauses.

The :direct type of buffers are only available on JDK.

Example allocating a 24 bytes size byte buffer using es6typed arrays implementation

;; This is a default if you are using clojurescript
(def buffer (buf/allocate 24 {:impl :es6 :type :heap}))

You can see all supported options here

Note

The return value of allocate depens on implementation used. Is a plain instance without additional wrapping. If you want access to its internals, you can do it with native host platform api.

2.3. Read and write data

It’s time to see how we can write data to buffers and read data from them using specs. Specs are simple schema on how the data should be read or write to the buffer.

Example writing data into buffer using indexed composed schema
;; The indexed composed spec exptects a vector as input
(buf/write! buffer [22 true] my-spec1)
;; => 5

The write! function returns a number of bytes are written into buffer.

As, previously mentioned, indexed and associative specs with same fields (in same order) represents the identical layout. Knowing that, we also can do the same operation but using the associative spec defined previously:

Example writing data into buffer using a map as input
(buf/write! buffer {:field1 22 :field2 true} my-spec2)
;; => 5
Note

Some buffer implementations (nio is an example) has the concept of read or write position. octet doesn’t touch that.

Secondly, the read operation is mostly similar to write one. It reads from buffer following the spec and return corresponding data structure:

Example reading from buffer using indexed spec
(buf/read buffer my-spec1)
;; => [22 true]

Also, you can perform the same operation, but using a associative spec:

Example reading from buffer using associative spec
(buf/read buffer my-spec2)
;; => {:field1 22 :field2 true}
Note

This works idependently of implementation used for allocate the buffer. Some implementations has little limitations, es6 (cljs) as example, des not support int64 typespec due to platform limitations.

Composed type specs and plain value type specs implements the same abstraction and both can be used directly in read and write operations:

Example using plain specs for read data from buffers
(buf/read buffer (buf/int16))
;; => 22

3. Advanced usage

3.1. Read & Write with offset.

If you know that the data what you want read is located in a specific position in a buffer, you can specify it in a read or write operation:

Example writing data in specific offset
(buf/write buffer [0 false] my-spec1 {:offset 20})
;; => [0 false]
Example read data from specific offset.
(buf/read buffer my-spec1 {:offset 20})
;; => [0 false]

3.2. Show readed bytes.

The default read function returns readed data but not returns a amount of readed bytes. For it, octet exposes a convenience function read* that instead of return only readed data, returns a vector with amount of bytes readed and the readed data:

Example using read* function
(buf/read* buffer my-spec2)
;; => [5 {:field1 22 :field2 true}]

3.3. Read & Write repeatedly

Sometimes you will want read some spec repeatedly, for that purpose octet comes with repeat composition function:

Example for read and write using repeat composed spec
(def spec (buf/repeat 5 buf/int32))
(buf/write buffer [1 2 3 4 5] spec)
;; => 20

(buf/read buffer spec)
;; => [1 2 3 4 5]

3.4. Using arbitrary size type specs

Until now, we have seen examples alway using fixed size compositions. Fixed size compositions are easy understand, the size of the spec can be know in any time. But in some circumstances we want store arbitrary length. Strings are one great example:

Example writing arbitrary length string into buffer
(buf/write! buffer "hello world" buf/string*)
;; => 15
Example reading arbitrary length string from buffer
(buf/read buffer (buf/string*))
;; => "hello world"

But, how it works? Type specs like that, is a composition of two typespecs: int32 and fixed length string. On write phase, it calculates the size of string, writes firstly the size as int32 following of fixed size string. The read phase is like write but in backward direction.

Also, the size of that type spec depends on data and can not be known outsize of read/write phase:

Example how obtain a size of specific type spec
(buf/size buf/int16)
;; => 2

(buf/size buf/string*)
;; => IllegalArgumentException No implementation of method: :size of protocol: #'octet.spec/ISpecSize found for class: octet.spec.string$string_STAR_$reify__1804  clojure.core/-cache-protocol-fn (core_deftype.clj:555)

3.5. Put data into new buffer

This is a some kind of helper, that allows easy create a buffer with exactly size for concrete spec and concrete data. It works perfectly with static size specs and arbitrary size specs.

Example using octet.core/into function (semantically similar to clojure’s into)
(def myspec (buf/spec buf/string* buf/string*))
(def buffer (buf/into myspec ["hello" "world!"]))

(buf/get-capacity buffer)
;; => 19

(buf/read buffer myspec)
;; => ["hello", "world!"]

3.6. Vectors

This is a very similar abstraction to the previously explained repeating pattern. The main difference with it is that this one represents an arbitrary size repetition of one spec and allows store an array like datastructures.

Example storing two arrays in a buffer
(def spec (buf/spec
            (buf/vector* buf/int32)
            (buf/vector* buf/int32)))
(def buffer (buf/into spec [[1 2 3] [4 5 6 7 8]])

(buf/get-capacity buffer)
;; => 40

(buf/read buffer spec)
[[1 2 3] [4 5 6 7 8]]

Behind the scenes, an vector is represented with as int32 + type*N, that means that it has always an overhead of 4 bytes for store the length of the vector.

3.7. Read and write spec to multiple byte buffers

In some circumstances (specially when we working with streams) the buffers are splitted. The simplest but not very efficient approach will be copy all data in one unique byte buffer and read a spec from it. Octet comes with facilities for read a spec from a vector of buffers that prevents unnecesary copy action.

Example reading and writing data to a vector of buffers
(def myspec (buf/spec buf/short buf/int32))

(def buffers [(buf/allocate 2)
              (buf/allocate 4)])

(buf/write! buffers [20 30] myspec)
;; => 6

(buf/read buffers spec)
;; => [20 30]

(buf/read (nth buffers 0) buf/short)
;; => 20

(buf/read (nth buffers 1) buf/int32)
;; => 30

3.8. Define own type spec

In some circumstances, you probably need define own typespec for solve concrete situations. octet is build around abstractions and define new type spec is not very complicated job.

An typespec consists mainly in ISpec protocol that has two methods: read and write. Let see an example defining a typespec for point of coordenades:

Example definition of type spec that represents a point
(require '[octet.spec :as spec])

;; Imagine you have a type Point defined like this:
(defrecord Point [x y])

;; Type spec definition for read/write Point instances.
(def point-spec
  (reify
    spec/ISpecSize
    (size [_]
      ;; we kwno that is datatype has fixed size in bytes
      ;; that represents two int32.
      8)

    spec/ISpec
    (read [_ buff pos]
      (let [[readed xvalue] (spec/read (buf/int32) buff pos)
            [readed' yvalue] (spec/read (buf/int32) buff (+ pos readed))]
        [(+ readed readed')
         (Point. xvalue yvalue)]))

    (write [_ buff pos point]
      (let [written (spec/write (buf/int32) buff pos (:x point))
            written' (spec/write (buf/int32) buff (+ pos written) (:y point))]
        (+ written written')))))
Example using the previously defined typespec
(def mypoint (Point. 1 2))
(buf/write! buffer mypoint point-spec)
;; => 8

(buf/read* buffer point-spec)
;; => [8 #user.Point{:x 1, :y 2}]

Moreover, knowing how it can be done in low level way, you can simplify this concrete step using compose function. The compose function is a type spec constructor that helps map an indexed type spec to specific user defined type.

Let see how the previous code can be simplified in much less boilerplate code:

Example using compose function.
(defrecord Point [x y])
(def mypoint (Point. 1 2))

(def point-spec (buf/compose ->Point [buf/int32 buf/int32]))

(buf/write! buffer mypoint point-spec)
;; => 8

(buf/read* buffer point-spec)
;; => [8 #user.Point{:x 1, :y 2}]

4. Additional info

4.1. Supported byte buffers

This is a complete table of supported byte buffer implementations and type of byte buffers:

Platform Name Params

Clojure

Heap NIO ByteBuffer

{:type :heap :impl :nio}

Clojure

Direct NIO ByteBuffer

{:type :direct :impl :nio}

Clojure

Heap Netty ByteBuf

{:type :heap :impl :netty}

Clojure

Direct Netty ByteBuf

{:type :direct :impl :netty}

ClojureScript

Heap ES6 ArrayBuffer/DataView

{:type :heap :impl :es6}

4.2. Supported typespecs

This is a complete list of supported plain value type spec:

Name Function Size (in bytes) Notes

Short

buf/int16

2

Integer

buf/int32

4

Long

buf/int64

8

Only on jvm

Float

buf/float

4

Double

buf/double

8

Boolean

buf/bool

1

Byte

buf/byte

1

String

buf/string

N

Fixed length string

String

buf/string*

4+N

Arbitrary length string

Independently if a spec is a value spec or a composition of value specs, all them implements the same abstraction and can be used in read or write operations.

4.3. Byte order

All the builtin implementations uses the :big-endian as default byte order. That value can be canched at any time using the provided byteorder dynamic var on the octet.buffer namespace.

Let see a little example:

(require '[octet.buffer])

(def myspec (buf/spec buf/string* buf/string*))

(def buffer
  (buf/with-byte-order :little-endian
    (buf/into myspec ["hello" "world!"])))

(buf/get-capacity buffer)
;; => 19

(buf/read buffer myspec)
;; => BufferUnderflowException (because of incorect byte order)

(buf/with-byte-order :little-endian
  (buf/read buffer myspec))
;; => ["hello", "world!"]

5. FAQ

What is the difference with clojurewerkz/buffy?

Buffy is a excelent library, and I have used it in some circumstances, but is has some things that I personally don’t like:

  • It works only with netty bytebuf and I need an abstraction for work with different implementations, including in clojurescript.

  • It has slightly strange and not uniform api when dynamic frames (arbitrary length size types) are used. octet offers unified api for both type of specs.

  • It wraps bytebuf in a self defined type. octet is a lightweight abstraction that works over host implementations, without wrapping them.

  • It not has support for ClojureScript

What is the difference with ztellman/gloss?

Gloss is also similiar project, and has similar purposes, but it has several differeces:

  • It has a limited set of types. Octet has an extensible abstraction for build own arbitrary type specs.

  • It only works with nio as buffer implementations. Octet exposes an extensible abstraction and support few differents out of the box.

  • In my opinion it has slightly ugly and unclear api.

  • Seems not very maintained (has issues from 2013).

  • It not has support for ClojureScript.

6. How to Contribute?

6.1. Philosophy

Five most important rules:

  • Beautiful is better than ugly.

  • Explicit is better than implicit.

  • Simple is better than complex.

  • Complex is better than complicated.

  • Readability counts.

All contributions to octet should keep these important rules in mind.

6.2. Contributing

Unlike Clojure and other Clojure contributed libraries octet does not have many restrictions for contributions. Just open an issue or pull request.

6.3. Source Code

octet is open source and can be found on github.

You can clone the public repository with this command:

git clone https://github.com/funcool/octet

6.4. Run tests

For running tests just execute this (for clojure):

lein test

And this for for clojurescript:

./scripts/build
node ./out/tests.js

6.5. License

octet is licensed under BSD (2-Clause) license:

Copyright (c) 2015-2016 Andrey Antukh <niwi@niwi.nz>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.