1. Introduction
Working with raw bytes is tedious, and the language abstractions for working with bytes are sometimes not very pleasant. The octet library offers a simple API for Clojure (JVM) and ClojureScript (JS) that makes working with byte buffers painless.
This is a short list of project goals:
- Not to be intrusive (no bytebuffer wrapping).
- Provide a host-independent abstraction (as far as possible).
- Composability.
1.1. Project Maturity
Since octet is a young project there can be some API breakage.
1.2. Install
The simplest way to use octet in a Clojure project is to include it in the dependency vector of your project.clj file:
[funcool/octet "1.1.1"]
The library works with the following platforms: jdk7, jdk8, node-lts.
2. Getting started
The main goal of octet is to provide a multiplatform abstraction for working with byte buffers. It offers a lightweight API for defining message types in a declarative way and using them to read from or write to byte buffers.
As previously said, octet works with the two most used Clojure implementations: Clojure and ClojureScript. Each platform has its own byte buffer abstraction: NIO ByteBuffer and Netty ByteBuf on Clojure, and ES6 ArrayBuffer/DataView on ClojureScript (see 4.1. Supported byte buffers).
2.1. Define a spec
A spec in octet's glossary represents a type definition or a composition of types. The two most common composition types are associative and indexed.
The difference between indexed and associative compositions is the input and output. In an associative composition, the expected input and output is a map. In an indexed composition, the expected input and output is a vector. Internally, both represent the same value in bytes.
Let's start by defining one:
(require '[octet.core :as buf])
;; Indexed composition
(def my-spec1 (buf/spec buf/int32 buf/bool))
;; The same spec but using associative composition
(def my-spec2 (buf/spec :field1 buf/int32
                        :field2 buf/bool))
You can check that the spec size (in bytes) and the number of types are the same for both, using the size and count functions on specs:
(buf/size my-spec1)
;; => 5
(count my-spec1)
;; => 2
(buf/size my-spec2)
;; => 5
(count my-spec2)
;; => 2
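The size of 5 follows from the layout: an int32 takes 4 bytes and a bool takes 1 (see the table in 4.2). As a rough illustration of that arithmetic, here is a plain Clojure sketch. The `type-sizes` table and the `layout-size` helper are hypothetical names made up for this example and are not part of octet:

```clojure
;; Hypothetical lookup table for this sketch only; octet keeps
;; the size information inside the specs themselves.
(def type-sizes {:int32 4 :bool 1})

(defn layout-size
  "Sums the byte sizes of a sequence of type keywords."
  [types]
  (reduce + (map type-sizes types)))

(layout-size [:int32 :bool])
;; => 5
```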
2.2. Creating buffer
The next piece in the puzzle is a way to create (or allocate) new byte buffers. This operation is almost platform independent if the library defaults satisfy you.
;; Allocate bytebuffer with 24 bytes of size
(def buffer (buf/allocate 24))
The buffer allocation is parametrizable so you can specify the concrete implementation to use and the type of buffer:
;; This is a default if you are using clojure
(def buffer (buf/allocate 24 {:impl :nio :type :heap}))
There are two types of buffers: :heap and :direct. The :heap type of buffer uses the JDK/Node heap to store the data, and the :direct type stores the data outside the virtual machine heap. The main advantage of :direct buffers is that they do not affect the GC and may result in fewer GC pauses. The :direct type of buffer is only available on the JDK.
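For readers unfamiliar with the distinction, this is what heap and direct allocation look like in plain java.nio, without octet. It is only a sketch of the underlying JDK concept, assuming the :nio implementation maps :heap and :direct onto these two allocation methods; the var names are made up for the example:

```clojure
(import 'java.nio.ByteBuffer)

;; Heap buffer: backed by a byte array that lives inside the JVM heap.
(def ^ByteBuffer heap-bb (ByteBuffer/allocate 24))

;; Direct buffer: memory allocated outside the garbage-collected heap.
(def ^ByteBuffer direct-bb (ByteBuffer/allocateDirect 24))

(.isDirect heap-bb)
;; => false
(.isDirect direct-bb)
;; => true
(.capacity direct-bb)
;; => 24
```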
Example allocating a 24-byte buffer using the ES6 typed arrays implementation:
;; This is a default if you are using clojurescript
(def buffer (buf/allocate 24 {:impl :es6 :type :heap}))
You can see all supported options in section 4.1, Supported byte buffers.
Note: The return value of
2.3. Read and write data
It’s time to see how we can write data to buffers and read data from them using specs. Specs are simple schemas describing how the data should be read from or written to the buffer.
;; The indexed composed spec expects a vector as input
(buf/write! buffer [22 true] my-spec1)
;; => 5
The write! function returns the number of bytes written into the buffer.
As previously mentioned, indexed and associative specs with the same fields (in the same order) represent the identical layout. Knowing that, we can also do the same operation using the associative spec defined previously:
(buf/write! buffer {:field1 22 :field2 true} my-spec2)
;; => 5
Note: Some buffer implementations (nio is an example) have the concept of a read or write position. octet doesn’t touch that.
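To make the idea of a position concrete, the following plain java.nio sketch (no octet involved) contrasts an absolute write, which takes an explicit index and leaves the position alone, with a relative write, which advances it. That octet relies on index-based access is an assumption here, though it is consistent with the note above:

```clojure
(import 'java.nio.ByteBuffer)

(def ^ByteBuffer pos-bb (ByteBuffer/allocate 8))

;; Absolute put: writes at index 0 and leaves the position untouched.
(.putInt pos-bb 0 22)
(.position pos-bb)
;; => 0

;; Relative put: writes at the current position and advances it by 4.
(.putInt pos-bb 22)
(.position pos-bb)
;; => 4
```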
Secondly, the read operation is very similar to the write one. It reads from the buffer following the spec and returns the corresponding data structure:
(buf/read buffer my-spec1)
;; => [22 true]
You can also perform the same operation using an associative spec:
(buf/read buffer my-spec2)
;; => {:field1 22 :field2 true}
Note: This works independently of the implementation used to allocate the buffer. Some implementations have small limitations; es6 (cljs), for example, does not support
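The relationship between the indexed and the associative view of the same data can be pictured in plain Clojure: given a fixed field order, a vector and a map carry the same values. This is only a conceptual sketch; `field-names` is a name chosen for the example and nothing here calls octet:

```clojure
;; Field order shared by both views of the data.
(def field-names [:field1 :field2])

;; Indexed -> associative
(zipmap field-names [22 true])
;; => {:field1 22, :field2 true}

;; Associative -> indexed
(mapv {:field1 22 :field2 true} field-names)
;; => [22 true]
```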
Composed type specs and plain value type specs implement the same abstraction, and both can be used directly in read and write operations:
(buf/read buffer buf/int32)
;; => 22
3. Advanced usage
3.1. Read & Write with offset.
If you know that the data you want to read is located at a specific position in a buffer, you can specify it in a read or write operation:
;; Offset 10 leaves room for the 5 bytes of my-spec1 in the 24-byte buffer
(buf/write! buffer [0 false] my-spec1 {:offset 10})
;; => 5
(buf/read buffer my-spec1 {:offset 10})
;; => [0 false]
3.2. Show bytes read.
The default read function returns the data read but not the number of bytes read. For that, octet exposes a convenience function, read*, that instead of returning only the data, returns a vector with the number of bytes read and the data:
(buf/read* buffer my-spec2)
;; => [5 {:field1 22 :field2 true}]
3.3. Read & Write repeatedly
Sometimes you will want to read some spec repeatedly. For that purpose octet comes with the repeat composition function:
(def spec (buf/repeat 5 buf/int32))
(buf/write! buffer [1 2 3 4 5] spec)
;; => 20
(buf/read buffer spec)
;; => [1 2 3 4 5]
3.4. Using arbitrary size type specs
Until now, we have seen examples always using fixed-size compositions. Fixed-size compositions are easy to understand: the size of the spec can be known at any time. But in some circumstances we want to store data of arbitrary length. Strings are one great example:
(buf/write! buffer "hello world" buf/string*)
;; => 15
(buf/read buffer buf/string*)
;; => "hello world"
But how does it work? A type spec like that is a composition of two type specs: an int32 and a fixed-length string. In the write phase, it calculates the size of the string and writes the size as an int32, followed by the fixed-size string. The read phase is like the write phase but in the reverse direction.
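The length-prefix scheme described above can be sketched directly on a java.nio ByteBuffer. This is not octet's implementation: the two helper functions are hypothetical, and the UTF-8 encoding is an assumption made for the example.

```clojure
(import 'java.nio.ByteBuffer
        'java.nio.charset.StandardCharsets)

(defn write-prefixed-string!
  "Writes an int32 length followed by the string bytes, starting at pos.
  Returns the number of bytes written."
  [^ByteBuffer bb pos ^String s]
  (let [^bytes data (.getBytes s StandardCharsets/UTF_8)
        n (alength data)]
    (.putInt bb (int pos) (int n))
    (dotimes [i n]
      (.put bb (int (+ pos 4 i)) (aget data i)))
    (+ 4 n)))

(defn read-prefixed-string
  "Reads the int32 length at pos, then that many bytes as a string.
  Returns a vector of [bytes-read string]."
  [^ByteBuffer bb pos]
  (let [n (.getInt bb (int pos))
        ^bytes data (byte-array n)]
    (dotimes [i n]
      (aset-byte data i (.get bb (int (+ pos 4 i)))))
    [(+ 4 n) (String. data StandardCharsets/UTF_8)]))

(def string-bb (ByteBuffer/allocate 24))

(write-prefixed-string! string-bb 0 "hello world")
;; => 15
(read-prefixed-string string-bb 0)
;; => [15 "hello world"]
```

The 15 matches the result of the write! call above: 4 bytes for the length plus 11 bytes for the text.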
Also, the size of that type spec depends on the data and cannot be known outside of the read/write phase:
(buf/size buf/int16)
;; => 2
(buf/size buf/string*)
;; => IllegalArgumentException No implementation of method: :size of protocol: #'octet.spec/ISpecSize found for class: octet.spec.string$string_STAR_$reify__1804 clojure.core/-cache-protocol-fn (core_deftype.clj:555)
3.5. Put data into new buffer
This is a helper that makes it easy to create a buffer with exactly the size needed for a concrete spec and concrete data. It works with both static-size specs and arbitrary-size specs.
The octet.core/into function is semantically similar to Clojure’s into:
(def myspec (buf/spec buf/string* buf/string*))
(def buffer (buf/into myspec ["hello" "world!"]))
(buf/get-capacity buffer)
;; => 19
(buf/read buffer myspec)
;; => ["hello", "world!"]
3.6. Vectors
This is a very similar abstraction to the previously explained repeating pattern. The main difference is that this one represents an arbitrary-size repetition of one spec and allows storing array-like data structures.
(def spec (buf/spec
           (buf/vector* buf/int32)
           (buf/vector* buf/int32)))
(def buffer (buf/into spec [[1 2 3] [4 5 6 7 8]]))
(buf/get-capacity buffer)
;; => 40
(buf/read buffer spec)
;; => [[1 2 3] [4 5 6 7 8]]
Behind the scenes, a vector is represented as int32 + type*N, which means that it always has an overhead of 4 bytes to store the length of the vector.
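The capacity of 40 in the example can be checked by hand with the int32 + type*N rule. A small plain Clojure sketch of that arithmetic, using a hypothetical helper name that is not part of octet:

```clojure
(defn int32-vector-size
  "Bytes needed for a vector of int32 values: a 4-byte length prefix
  plus 4 bytes per element."
  [v]
  (+ 4 (* 4 (count v))))

(int32-vector-size [1 2 3])
;; => 16
(int32-vector-size [4 5 6 7 8])
;; => 24
(reduce + (map int32-vector-size [[1 2 3] [4 5 6 7 8]]))
;; => 40
```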
3.7. Read and write spec to multiple byte buffers
In some circumstances (especially when working with streams) the buffers are split. The simplest but not very efficient approach is to copy all the data into a single byte buffer and read a spec from it. octet comes with facilities for reading a spec from a vector of buffers, which avoids the unnecessary copy.
(def myspec (buf/spec buf/short buf/int32))
(def buffers [(buf/allocate 2)
              (buf/allocate 4)])
(buf/write! buffers [20 30] myspec)
;; => 6
(buf/read buffers myspec)
;; => [20 30]
(buf/read (nth buffers 0) buf/short)
;; => 20
(buf/read (nth buffers 1) buf/int32)
;; => 30
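One way to picture reading across several buffers without copying is to translate a global byte offset into a buffer index plus a local offset inside that buffer. The `locate` helper below is hypothetical and only illustrates this bookkeeping; it is not taken from octet's source and octet may handle the case differently:

```clojure
(defn locate
  "Given the capacities of consecutive buffers and a global byte offset,
  returns [buffer-index local-offset]. Throws if the offset is past the end."
  [capacities pos]
  (loop [idx 0
         pos pos]
    (let [cap (nth capacities idx)]
      (if (< pos cap)
        [idx pos]
        (recur (inc idx) (- pos cap))))))

;; Capacities of the two buffers allocated above: 2 and 4 bytes.
(locate [2 4] 0)
;; => [0 0]
(locate [2 4] 2)
;; => [1 0]
(locate [2 4] 5)
;; => [1 3]
```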
3.8. Define own type spec
In some circumstances, you will probably need to define your own type spec to solve concrete situations. octet is built around abstractions, and defining a new type spec is not a very complicated job.
A type spec consists mainly of the ISpec protocol, which has two methods: read and write. Let's see an example defining a type spec for a point with coordinates:
(require '[octet.spec :as spec])
;; Imagine you have a type Point defined like this:
(defrecord Point [x y])
;; Type spec definition for read/write Point instances.
(def point-spec
  (reify
    spec/ISpecSize
    (size [_]
      ;; We know that this datatype has a fixed size in bytes:
      ;; two int32 values.
      8)
    spec/ISpec
    (read [_ buff pos]
      (let [[readed xvalue] (spec/read buf/int32 buff pos)
            [readed' yvalue] (spec/read buf/int32 buff (+ pos readed))]
        [(+ readed readed')
         (Point. xvalue yvalue)]))
    (write [_ buff pos point]
      (let [written (spec/write buf/int32 buff pos (:x point))
            written' (spec/write buf/int32 buff (+ pos written) (:y point))]
        (+ written written')))))
(def mypoint (Point. 1 2))
(buf/write! buffer mypoint point-spec)
;; => 8
(buf/read* buffer point-spec)
;; => [8 #user.Point{:x 1, :y 2}]
Moreover, knowing how it can be done in the low-level way, you can simplify this step using the compose function. The compose function is a type spec constructor that helps map an indexed type spec to a specific user-defined type.
Let's see how the previous code can be simplified with much less boilerplate:
(defrecord Point [x y])
(def mypoint (Point. 1 2))
(def point-spec (buf/compose ->Point [buf/int32 buf/int32]))
(buf/write! buffer mypoint point-spec)
;; => 8
(buf/read* buffer point-spec)
;; => [8 #user.Point{:x 1, :y 2}]
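The idea of mapping an indexed value to a user-defined type can be pictured without octet at all: a constructor turns the vector of field values into a record, and the reverse step extracts the fields in order. This is a plain Clojure sketch; `vector->point` and `point->vector` are hypothetical helper names, and how compose does this internally is not shown here:

```clojure
(defrecord Point [x y])

(defn vector->point
  "Builds a Point from an indexed value [x y]."
  [[x y]]
  (->Point x y))

(defn point->vector
  "Extracts the fields of a Point in declaration order."
  [point]
  [(:x point) (:y point)])

(= (vector->point [1 2]) (->Point 1 2))
;; => true
(point->vector (->Point 1 2))
;; => [1 2]
```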
4. Additional info
4.1. Supported byte buffers
This is a complete table of supported byte buffer implementations and type of byte buffers:
Platform | Name | Params |
---|---|---|
Clojure | Heap NIO ByteBuffer | |
Clojure | Direct NIO ByteBuffer | |
Clojure | Heap Netty ByteBuf | |
Clojure | Direct Netty ByteBuf | |
ClojureScript | Heap ES6 ArrayBuffer/DataView | |
4.2. Supported typespecs
This is a complete list of supported plain value type specs:
Name | Function | Size (in bytes) | Notes |
---|---|---|---|
Short | | 2 | |
Integer | | 4 | |
Long | | 8 | Only on jvm |
Float | | 4 | |
Double | | 8 | |
Boolean | | 1 | |
Byte | | 1 | |
String | | N | Fixed length string |
String | | 4+N | Arbitrary length string |
Regardless of whether a spec is a value spec or a composition of value specs, all of them implement the same abstraction and can be used in read or write operations.
4.3. Byte order
All the built-in implementations use :big-endian as the default byte order. That value can be changed at any time using the provided byteorder dynamic var in the octet.buffer namespace.
Let's see a little example:
(require '[octet.buffer])
(def myspec (buf/spec buf/string* buf/string*))
(def buffer
  (buf/with-byte-order :little-endian
    (buf/into myspec ["hello" "world!"])))
(buf/get-capacity buffer)
;; => 19
(buf/read buffer myspec)
;; => BufferUnderflowException (because of incorrect byte order)
(buf/with-byte-order :little-endian
  (buf/read buffer myspec))
;; => ["hello", "world!"]
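Why the wrong byte order leads to an underflow can be seen with plain java.nio. The sketch below (no octet involved, var name made up for the example) writes the length 5 in little-endian order and reads the same four bytes back as big-endian. Presumably something similar happens to the string length prefix in the example above, so the reader asks for far more bytes than the buffer holds:

```clojure
(import 'java.nio.ByteBuffer
        'java.nio.ByteOrder)

(def ^ByteBuffer order-bb (ByteBuffer/allocate 4))

;; Write the int32 value 5 as little-endian: bytes 05 00 00 00.
(.order order-bb ByteOrder/LITTLE_ENDIAN)
(.putInt order-bb 0 5)

;; Read the same bytes back as big-endian: 0x05000000.
(.order order-bb ByteOrder/BIG_ENDIAN)
(.getInt order-bb 0)
;; => 83886080
```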
5. FAQ
What is the difference with clojurewerkz/buffy?
Buffy is an excellent library, and I have used it in some circumstances, but it has some things that I personally don’t like:
- It works only with netty bytebuf, and I need an abstraction that works with different implementations, including in ClojureScript.
- It has a slightly strange and non-uniform API when dynamic frames (arbitrary-length types) are used. octet offers a unified API for both types of specs.
- It wraps bytebuf in a self-defined type. octet is a lightweight abstraction that works over host implementations, without wrapping them.
- It has no support for ClojureScript.
What is the difference with ztellman/gloss?
Gloss is also a similar project with similar purposes, but it has several differences:
- It has a limited set of types. octet has an extensible abstraction for building your own arbitrary type specs.
- It only works with nio as the buffer implementation. octet exposes an extensible abstraction and supports a few different ones out of the box.
- In my opinion, it has a slightly ugly and unclear API.
- It does not seem to be very actively maintained (it has issues from 2013).
- It has no support for ClojureScript.
6. How to Contribute?
6.1. Philosophy
Five most important rules:
- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Complex is better than complicated.
- Readability counts.
All contributions to octet should keep these important rules in mind.
6.2. Contributing
Unlike Clojure and other Clojure contributed libraries, octet does not have many restrictions for contributions. Just open an issue or pull request.
6.3. Source Code
octet is open source and can be found on github.
You can clone the public repository with this command:
git clone https://github.com/funcool/octet
6.4. Run tests
To run the tests, just execute this (for Clojure):
lein test
And this for ClojureScript:
./scripts/build
node ./out/tests.js
6.5. License
octet is licensed under BSD (2-Clause) license:
Copyright (c) 2015-2016 Andrey Antukh <niwi@niwi.nz>
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.