Introduction

A filesystem toolset and storage implementation for Clojure. In one side you have a collection of functions for:

  • work with paths (creating and manipulating).

  • work with filesystem (files and directories crud and many predicates).

  • IO (clojure.java.io implementation for paths).

And on the other side it provides a Storage abstraction, that facilitates abstract filesystem access under uniform api for web-applications.

Install

Add the following dependency to your project.clj file:

[funcool/datoteka "1.1.0"]

User Guide

Path & Filesystem

The path and filesystem helper functions are all exposed under the datoteka.core namespace, so let’s import it:

(require '[datoteka.core :as fs])

This library uses JVM NIO, so under the hood, the java.nio.file.Path is used instead of classical java.io.File. And then, you have many ways to create a path instance. The basic one is just using the path function:

(fs/path "/tmp")
;; => #path "/tmp"

As you can observe, the path properly prints with a data reader, so once you have imported the library, you can use the #path "/tmp" syntax to create paths.

The paths also can be created from a varios kind of objects (such as URI, URL, String and seq’s):

(fs/path (java.net.URI. "file:///tmp"))
;; => #path "/tmp"

(fs/path (java.net.URL. "file:///tmp"))
;; => #path "/tmp"

(fs/path ["/tmp" "foo"])
;; => #path "/tmp/foo"

The path function is also variadic, so you can pass multiple arguments to it:

(fs/path "/tmp" "foo")
;; => #path "/tmp/foo"

(fs/path (java.net.URI. "file:///tmp") "foo")
;; => #path "/tmp/foo"

And for convenience, in can use the clojure.java.io api with paths in the same way as you have done it with java.io.File:

(require '[clojure.java.io :as io])

(io/reader (fs/path "/etc/inputrc"))
;; => #object[java.io.BufferedReader 0x203dc326 "java.io.BufferedReader@203dc326"]

(subs (io/slurp (fs/path "/etc/inputrc")) 0 20)
;; => "# do not bell on tab"

Look on the reference section for the complete documentation of all available functions and predicates.

Storages

On the contrary to the raw filesystem manipulation functions, the storages abstraction offers a thin and carefully designed api for implement mostly generic file storage abstraction. It is designed mainly for web applications in mind but is not strictly limited to it.

But, why I need this? you may be asking yourself…​ the storage api allows you work with files (such that file uploads as a good example) starting with storing them on local filesystem and then easily switch to a cloud based storage (such that AWS S3 or any other) just providing an implementation for the very restricted and small storage api.

The datoteka comes by default with a local filesystem storage as base storage and many other that stacks on top of other storages and provides additional functionality.

The main api for work with storages is exposed under datoteka.storages namespace and it consists mainly on this functions: save, lookup, exists?, delete and public-url (it has some other functions but they are not relevant for now).

Local FileSystem Storage

The local storage is a default implementation of stroage abstraction that stores the files in a local file system. Let create an instance of it:

(require '[datoteka.storages :as st])
(require '[datoteka.storages.local :refer [localfs]])

(def storage
  (localfs {:basedir "/tmp/"
            :baseuri "http://localhost/media/"}))

Let save some data using the storage api:

@(st/save storage "test-file.txt" "some-content")
;; => #path "test-file.txt"

The save function returns a promise (an instance of CompletableFuture) with the final path relative to the provided :basedir to the storage.

If you application works in a synchronous way, just use the @ or deref function. It blocks the current thread until the underlying operation is finished. And if you application has asynchronous workflow, you can just use funcool/promesa for work with promises in more comfortable way.

This approach allows easily use the storage api in both possible workflows.

Now, let see what is happens if you try to store the same file again:

user=> @(st/save storage "test-file.txt" "some-content")
;; => #path "test-file1.txt"

As you can observe, the storage handles duplicates for you. You don’t need to worry about the filenames, if duplicate name is used, an other name will be automatically choiced.

Note
at this moment the incremental integer is used for choice a new name but this can no longer be valid on next version. You should always rely on the return value of save function instead of having the supposition that the strategy of choicing new name will be the same.

The 3rd argument to the save function represents the content of the file. It accepts anything that can be coerced to InputStream with clojure.java.io/input-stream with the small exception for strings, that instead interpret them as paths it interpret them as content.

You can check if some file exists with the st/exists? predicate:

@(st/exists? storage "test-file.txt")
;; => true

@(st/exists? storage "other.txt")
;; => false

Also, you can lookup the file in the storage by the relative path:

@(st/lookup storage "test-file.txt")
;; => #path "/tmp/test-file.txt"

The return value of lookup is a promise resolved with the absolute path on the local filesystem. And if you are working in a web application, the public-url is your friend:

(st/public-url storage "test-file.txt")
;; => object[java.net.URI 0xa215deb "http://localhost/media/test-file.txt"]

This function just return a concatenation of the relative path of the file with the provided URI on :baseuri option on storage constructor.

Note
In summary, this allows persist files in local file system and store relative paths on the database, which in turn allows future filesystem relocation without worrying about the need of any change on your code or the database.

Additionally to the explained functions, there are the delete function that given relative path or file name tries removing it:

@(st/delete storage "test-file.txt")
;; => true

@(st/delete storage "test-file.txt")
;; => false

Scoped Storage

The scoped storage is considered stacked storage because it works on top of an other storage implmentation, just adding an additional feature to it. In this case just scoping the main storage base directory by a prefix.

Let see an example:

(require '[datoteka.storages :as st])
(require '[datoteka.storages.local :refer [localfs]])
(require '[datoteka.storages.misc :refer [scoped]])

(def storage
  (localfs {:basedir "/tmp/"
            :baseuri "http://localhost/media/"}))

(def scoped-storage
  (scoped storage "subdir"))

And then, a simple example interacting with the scoped storage:

;; Works in the same way as the plain `storage` for store files
@(st/save scoped-storage "test-file.txt" "some-content")
;; => #path "test-file.txt"

;; With the exception that when `lookup` it used:

@(st/lookup scoped-storage "test-file.txt")
;; => #path "/tmp/subdir/test-file.txt"

@(st/lookup storage "subdir/test-file.txt")
;; => #path "/tmp/subdir/test-file.txt"

Hashed Storage

In the same way as scoped, the hashed storage is a stacked storage and requires of an other storage to work.

The hashed storage provides a very nice improvement over a plain storage making the resulting path not guessable and not deterministic. Very useful when you want to expose some downloads under the public address but avoid easy files lookup through brute-force.

Let see an example:

(require '[datoteka.storages :as st])
(require '[datoteka.storages.local :refer [localfs]])
(require '[datoteka.storages.misc :refer [hashed]])

(def storage
  (localfs {:basedir "/tmp/"
            :baseuri "http://localhost/media/"}))

(def hashed-storage (hashed storage))

And then, a simple example interacting with the scoped storage:

@(st/save hashed-storage "test-file.txt" "some-content")
;; => #path "iP2/qIG/PBM/msJ/sJW/zU7/NM4Y49T394nc1jPsqQvvCAsn/test-file.txt"

@(st/lookup hashed-storage "iP2/qIG/PBM/msJ/sJW/zU7/NM4Y49T394nc1jPsqQvvCAsn/test-file.txt"
;; => #path "/tmp/iP2/qIG/PBM/msJ/sJW/zU7/NM4Y49T394nc1jPsqQvvCAsn/test-file.txt"

The hash is splitted in multiple directories for avoid create a huge number of files under the same directory (that depending on the underlying filesystem used for the hard drive may imply some performance penalities for file access when the directory contains a huge number of files).

Also, the hash is generated taking 64 bytes of true random data (using SecureRandom) and hashing it with sha256 and encoding the result using url safe variant of base64.

Reference

Functions by Use

Predicates

A complete set of predicates for check filesystem stuff:

Path manipulation

A complete set of functions for create and work with paths:

Filesystem manipulation

A complete set of functions for work with filesystems:

All functions

path?

Checks if the provided value is a Path instance.

(fs/path? "/tmp")
;; => false

(fs/path? #path "/tmp")
;; => true
file?

Checks if the provided value is a File instance.

(fs/file? "/tmp")
;; => false

(fs/file? (fs/file "/tmp"))
;; => true
absolute?

Checks if the provided path is absolute.

(fs/absolute? "/tmp")
;; => true

(fs/absolute "tmp")
;; => false
relative?

Checks if the provided path is relative.

(fs/relative? "/tmp")
;; => false

(fs/relative "tmp")
;; => true
executable?

Checks if the provided path is executable file.

(fs/executable? "/tmp")
;; => false

(fs/executable? "/bin/sh")
;; => true
exists?

Checks if the provided path exists.

(fs/exists? "/tmp")
;; => true

(fs/exists? "/tmp/foobar")
;; => false
directory?

Checks if the provided path is a directory.

(fs/directory? "/tmp")
;; => true

(fs/directory? "/bin/sh")
;; => false
regular-file?

Checks if the provided path is a regular file.

(fs/regular-file? "/tmp")
;; => false

(fs/regular-file? "/bin/sh")
;; => true

Checks if the provided path is symbolic link.

(fs/link? "/tmp")
;; => false

(fs/link? "/sbin")
;; => true
hidden?

Checks if the provided path is hidden file or directory?

(fs/hidden? "/home/user/.bashrc")
;; => true

(fs/hidden? "/tmp")
;; => false
readable?

Checks if the provided path is readable.

(fs/readable? "/proc/cpuinfo")
;; => true

(fs/readable? "/var/log/auth.log") ;; due to permissions
;; => false
writable?

Checks if the provided path is readable.

(fs/writable? "/proc/cpuinfo")
;; => false

(fs/writable? "/tmp")
;; => true
path

Coerce to provided value to Path instance.

(fs/path "foo.txt")
;; => #path "foo.txt"

(fs/path (java.net.URI. "file:///tmp"))
;; => #path "/tmp"

(fs/path (java.net.URL. "file:///tmp"))
;; => #path "/tmp"

(fs/path ["/tmp" "foo"])
;; => #path "/tmp/foo"

(fs/path "/tmp" "foo")
;; => #path "/tmp/foo"
file

Coerce provided value to File instance.

(fs/file "foo.txt")
;; => #file "foo.txt"
Note
This function accepts the same types as fs/path, because the underlying implementation coerces everything to path and then calls .toFile method.
name

Get the file name part of the provided path.

(fs/name "/tmp/foo.txt")
;; => "foo.txt"
parent

Get the parent directory of the provided path.

(fs/parent "/tmp/foo.txt")
;; => #path "/tmp"
join

Concatenate two or more paths in one using the default current system path separator and normalizing the output.

(fs/join "/tmp" "foo.txt")
;; => #path "/tmp/foo.txt"
ext

Retrieve the extension of the file name part of the path. If file does not have extension, nil will be returned.

(fs/ext "foo.txt")
;; => "txt"

(fs/ext "foo")
;; => nil
split-ext

A helper for split the base path and the extension.

(fs/split-ext "/tmp/foo.txt")
;; => ["/tmp/foo" ".txt"]

(fs/split-ext "/tmp/foo")
;; => ["/tmp/foo" nil]
normalize

A helper for normalize the path and remove redundant segments.

(fs/normalize "~")
;; => #path "/home/myuser"

(fs/normalize "~/.zshrc")
;; => #path "/home/myuser/.zshrc"
normalize

Constructs a relative path of the provided path in terms of an other path.

(fs/relativize "/tmp/foo/bar.txt" "/tmp")
;; => #path "foo/bar.txt"

(fs/relativize "/tmp/foo/bar.txt" "/tmp/test")
;; => #path "../foo/bar.txt"
list-dir

A helper for recursivelly list all contens of the directory. The return value of this function is a lazy-seq.

Warning
You need to realize all the sequence in order to properly close all acquired resources for this operation, independently if you need or not all resources. If you don’t want to worry about resource management, just coerce the value to a vector using vec function.
(fs/list-dir "/tmp/subdir")
;; => (#path "/tmp/subdir/test-file.txt" #path "/tmp/subdir/test-file.html")

Optionally, you can pass a glob expression as second parameter for filter result.

(fs/list-dir "/tmp/subdir", "*.txt")
;; => (#path "/tmp/subdir/test-file.txt")
create-dir

A helper for create a new directory or directories. In fact this works in the same way as mkdir -p, or in other words, it creates all not existing parent directories for you.

(fs/create-dir "/tmp/subdir/testdir")
;; => #path "/tmp/subdir/testdir"

If the directory is already exists, no action is performed and the path to the directory is returned.

move

Move or rename a file to a target file. If the target file exists, by default it will be replaced.

(fs/move "/tmp/foo.txt" "/tmp/bar.txt")
;; => #path "/tmp/bar.txt"

Also, by default an atomic move will be performed and an exception will be raised if your system does not support that. You can pass your own flags as the third optional argument to the move function.

This is a list of available flags: :atomic, :replace and :copy-attributes. The default flags are #{:atomic :replace}.

Example using not atomic move.
(fs/move "/tmp/bar.txt" "/tmp/foo.txt" #{:replace})
;; => #path "/tmp/foo.txt"
create-tempdir

Creates a temporal directory.

(fs/create-tempdir)
;; => #path "/tmp/547228100008922024"

(fs/create-tempdir "myprefix")
;; => #path "/tmp/myprefix5085100141359807070"

The user of this function is responsible of removing it when it is not longer needed.

create-tempfile

Create a temporal file.

(fs/create-tempfile)
;; => #path "/tmp/4790584308117167851.tmp"

(fs/create-tempfile :suffix ".txt" :prefix "test")
;; => #path "/tmp/test7617683814929744528.txt"

The user of this function is responsible of removing it when it is not longer needed.

slurp-bytes

Is a similar function to the slurp but instead of reading the contents of the file as string, it reads it as byte array.

(fs/slurp-bytes "/proc/cpuinfo")
;; => #object["[B" 0x19b8169d "[B@19b8169d"]

Developers Guide

Philosophy

Five most important rules:

  • Beautiful is better than ugly.

  • Explicit is better than implicit.

  • Simple is better than complex.

  • Complex is better than complicated.

  • Readability counts.

All contributions to datoteka should keep these important rules in mind.

Contributing

Unlike Clojure and other Clojure contributed libraries datoteka does not have many restrictions for contributions. Just open an issue or pull request.

Source Code

datoteka is open source and can be found on github.

You can clone the public repository with this command:

git clone https://github.com/funcool/datoteka

Run tests

For running tests just execute this:

lein test

License

datoteka is licensed under BSD (2-Clause) license:

Copyright (c) 2015-2017 Andrey Antukh <niwi@niwi.nz>

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.