AWS S3

You can use AWS’s Simple Storage Service (S3) as Crux’s 'document store'.

Project Dependency

In order to use S3 within Crux, you must first add S3 as a project dependency:

  • deps.edn

pro.juxt.crux/crux-s3 {:mvn/version "1.18.1"}

  • pom.xml

<dependency>
    <groupId>pro.juxt.crux</groupId>
    <artifactId>crux-s3</artifactId>
    <version>1.18.1</version>
</dependency>

Using S3

Replace the implementation of the document store with crux.s3/->document-store:

  • JSON

{
  "crux/document-store": {
    "crux/module": "crux.s3/->document-store",
    "bucket": "your-bucket",
    ...
  }
}

  • Clojure

{:crux/document-store {:crux/module 'crux.s3/->document-store
                       :bucket "your-bucket"
                       ...}}

  • EDN

{:crux/document-store {:crux/module crux.s3/->document-store
                       :bucket "your-bucket"
                       ...}}

Parameters

  • configurator (S3Configurator)

  • bucket (string, required)

  • prefix (string): S3 key prefix

  • cache-size (int): size of in-memory document cache
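
As a rough illustration of how the optional parameters above might sit alongside the required bucket, here is a minimal Clojure sketch. The bucket name, prefix, and cache size are placeholder values chosen for the example, not recommendations.

```clojure
;; Illustrative values only: bucket, prefix and cache size are placeholders.
{:crux/document-store {:crux/module 'crux.s3/->document-store
                       :bucket "your-bucket"
                       :prefix "crux-docs/"   ; hypothetical S3 key prefix
                       :cache-size 10000}}    ; hypothetical in-memory cache size
```

Giving each node deployment its own prefix is one way to let several of them share a bucket without their keys colliding.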

Checkpoint store

S3 can be used as a query index checkpoint store.

Crux does not garbage-collect checkpoints, so we recommend setting a lifecycle policy on your bucket to remove older checkpoints.

;; under :crux/index-store -> :kv-store -> :checkpointer
;; see the Checkpointing guide for other parameters
{:checkpointer {...
                :store {:crux/module 'crux.s3.checkpoint-store/->checkpoint-store
                        :configurator ...
                        :bucket "..."
                        :prefix "..."}}}

Parameters

  • configurator (S3Configurator)

  • bucket (string, required)

  • prefix (string): S3 key prefix
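
To show where the checkpoint store might sit in a complete node configuration, the sketch below nests it under :crux/index-store -> :kv-store -> :checkpointer. The RocksDB KV store, the crux.checkpoint/->checkpointer module, and the :approx-frequency option are assumptions not stated on this page (see the Checkpointing guide for the authoritative parameters). The bucket and prefix values are placeholders.

```clojure
(import 'java.time.Duration)

;; A sketch only. The KV store module, the checkpointer module and
;; :approx-frequency are assumptions; bucket and prefix are placeholders.
{:crux/index-store
 {:kv-store {:crux/module 'crux.rocksdb/->kv-store
             ;; other KV store options omitted
             :checkpointer {:crux/module 'crux.checkpoint/->checkpointer
                            :approx-frequency (Duration/ofHours 6)
                            :store {:crux/module 'crux.s3.checkpoint-store/->checkpoint-store
                                    :bucket "your-checkpoint-bucket"
                                    :prefix "checkpoints/"}}}}}
```

Using a dedicated prefix for checkpoints makes it easier to scope a bucket lifecycle rule to checkpoint objects only.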

Configuring S3 requests

This is currently only accessible from Clojure; we plan to expose it outside of Clojure soon.

While the above is sufficient to get crux-s3 working out of the box, S3 has many configuration options: how to get credentials, object properties, serialisation of the documents, and so on. We expose these via the crux.s3.S3Configurator interface, and you can supply an instance through the configurator parameter in your node configuration.

Through this interface, you can supply an S3AsyncClient for crux-s3 to use, adapt the PutObjectRequest/GetObjectRequest as required, and choose the serialisation format. By default, we get credentials through the usual AWS credentials provider, and store documents using Nippy.

  • Clojure

{:crux/document-store {:crux/module 'crux.s3/->document-store
                       :configurator (fn [_]
                                       (reify S3Configurator
                                         ...))
                       ...}}
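
As one possible way to fill in the configurator, the sketch below supplies a custom S3AsyncClient pinned to a region. It assumes S3Configurator declares a no-argument makeClient method returning the client, and that its other methods have default implementations. Check the interface in your crux-s3 version before relying on this. The region and bucket are placeholders.

```clojure
(import '(crux.s3 S3Configurator)
        '(software.amazon.awssdk.regions Region)
        '(software.amazon.awssdk.services.s3 S3AsyncClient))

;; Assumption: S3Configurator has a no-argument makeClient method that returns
;; the S3AsyncClient to use, and default implementations for everything else.
{:crux/document-store
 {:crux/module 'crux.s3/->document-store
  :bucket "your-bucket"
  :configurator (fn [_]
                  (reify S3Configurator
                    (makeClient [_]
                      (-> (S3AsyncClient/builder)
                          (.region Region/EU_WEST_1) ; placeholder region
                          (.build)))))}}
```

Building the client yourself is also where you would plug in a non-default credentials provider or endpoint override, using the standard AWS SDK builder methods.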