⚠️ Warning: This documentation refers to the proof-of-concept implementation of Wildland client which is no longer maintained. We are currently working on a new Wildland client written in Rust. To learn more about Wildland and the current status of its development, please visit the Wildland.io webpage.
# Caching slow storages
To improve the access times of slower, remote containers like those comprising the Pandora forest, Wildland provides the notion of cache storages. Cache storages are local storages which are automatically managed and used to mirror containers' contents for faster access. The contents of a cache storage are automatically kept in sync with the original container's storage, so that any changes made to the local cache are mirrored to the remote end, and vice versa.
# Single container example
Let's create a Dropbox-based container. For that, we need a storage template (here named `mydropbox`):
$ wl template create dropbox mydropbox --app-key <Dropbox app key>
Now let's create the container (`cache-test`):
$ wl container create --path /mydropbox --template mydropbox cache-test
Created: /home/user/.config/wildland/containers/cache-test.container.yaml
Created base path: /6dd6bc9a-c32a-4d63-a3e7-c12497b75011
Adding storage 873eb097-ec52-4ede-83e6-7381b095b2e7 to container.
Saved container /home/user/.config/wildland/containers/cache-test.container.yaml
$ wl container info cache-test
Sensitive fields are hidden.
/home/user/.config/wildland/containers/cache-test.container.yaml
object: container
owner: '0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6'
paths:
- /.uuid/6dd6bc9a-c32a-4d63-a3e7-c12497b75011
- /mydropbox
backends:
storage:
- type: dropbox
backend-id: 873eb097-ec52-4ede-83e6-7381b095b2e7
title: null
categories: []
version: '1'
As we can see, this container has only one `dropbox`-type storage, as expected.
Let's mount the container and add some files to it:
$ wl start --skip-forest-mount
Starting Wildland at: /home/user/wildland
$ wl c mount cache-test
Loading containers (from 'cache-test'): 1
Checking container references (from 'cache-test'): 1
Preparing mount of container references (from 'cache-test'): 1
Mounting one storage
$ echo "test 1" > ~/wildland/mydropbox/test1.txt
$ dd if=/dev/urandom of=/tmp/rnd bs=4K count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.15051 s, 27.9 MB/s
$ time cp /tmp/rnd ~/wildland/mydropbox/test2.rnd
real 0m4.305s
user 0m0.001s
sys 0m0.011s
$ tree ~/wildland/mydropbox/
/home/user/wildland/mydropbox/
├── test1.txt
└── test2.rnd
0 directories, 2 files
$ time find ~/wildland/mydropbox/test* -type f -print0 | xargs -0 sha256sum
3cd203ac11340842055a6de561c9d69ca4493e912bd4c3c440c80711e16d5aee /home/user/wildland/mydropbox/test1.txt
e6811bb1913eec21d89ecbb0c2fa72df7671995408e6a8b6291a2595ba48c157 /home/user/wildland/mydropbox/test2.rnd
real 0m3.017s
user 0m0.079s
sys 0m0.004s
$ wl c unmount cache-test
Loading containers (from 'cache-test'): 0
Unmounting 1 containers
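The checksum listing above can also be captured in a file, so that later runs can be compared mechanically rather than by eye. The following is a minimal sketch, assuming GNU `find`, `xargs` and `sha256sum`; the function names and the snapshot file are illustrative and not part of Wildland.

```shell
# Sketch: record SHA-256 checksums of every file under a directory,
# then verify them later. Helper names are hypothetical.

# usage: snapshot_checksums <dir> <snapshot-file>
snapshot_checksums() {
    # Paths are stored relative to <dir>; sort -z keeps the order stable,
    # and xargs -r skips sha256sum entirely when there are no files.
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# usage: verify_checksums <dir> <snapshot-file>
# Returns non-zero if any recorded file is missing or has changed.
verify_checksums() {
    ( cd "$1" && sha256sum --quiet -c ) < "$2"
}
```

For example, `snapshot_checksums ~/wildland/mydropbox /tmp/mydropbox.sha256` could be run while the container is mounted, and `verify_checksums ~/wildland/mydropbox /tmp/mydropbox.sha256` after remounting it. Files created after the snapshot was taken are not reported, since only the recorded entries are checked.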
In practice the container may hold a lot of data, and we may not want to hit the network every time we access a file in it. So let's add a cache to this container. First, we need to create a storage template that will be used by default for container caches. This should usually be a fast local storage:
$ mkdir -p ~/storage/cache
$ wl template create local --location ~/storage/cache cache-tpl
Storage template [cache-tpl] created in /home/user/.config/wildland/templates/cache-tpl.template.jinja
The `~/storage/cache` directory will be the root of all subsequently created local cache storages. Our cache template is named `cache-tpl`. Now we set the default cache template for Wildland:
$ wl set-default-cache cache-tpl
Set template cache-tpl as default for container cache storages
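The one-time setup shown above can be wrapped in a small helper. This is only a sketch: it uses exactly the `wl` subcommands from this page, the function name `setup_default_cache` is made up for illustration, and whether `wl template create` tolerates being run again for an existing template is not covered here.

```shell
# Sketch: one-time cache setup. The function name is hypothetical;
# the wl invocations are the ones shown in this tutorial.

# usage: setup_default_cache <cache-root> <template-name>
setup_default_cache() {
    local cache_root="$1"
    local template="$2"
    # Create the cache root (and any missing parents), register a local
    # storage template pointing at it, then make it the default.
    mkdir -p "$cache_root" &&
    wl template create local --location "$cache_root" "$template" &&
    wl set-default-cache "$template"
}
```

Calling `setup_default_cache ~/storage/cache cache-tpl` would repeat the steps from this section; each step runs only if the previous one succeeded.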
The steps above only need to be performed once; after that we are ready to create and use caches. To mark that our `cache-test` container should use a cache, we simply add the `-c` or `--with-cache` option to the mount command:
$ wl c mount -c cache-test
Loading containers (from 'cache-test'): 1
Checking container references (from 'cache-test'): 1
Preparing mount of container references (from 'cache-test'): 1
Mounting one storage
$ wl status
Mounted containers:
/.users/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6:/.backends/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/097fc435-1f34-4b9c-b058-487c18765472
storage: local
paths:
/mydropbox
/.users/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6:/.backends/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/873eb097-ec52-4ede-83e6-7381b095b2e7
storage: dropbox
Sync jobs:
:/mydropbox: SYNCED 'dropbox'(backend_id=873eb097-ec52-4ede-83e6-7381b095b2e7) <-> 'local'(backend_id=097fc435-1f34-4b9c-b058-487c18765472)
Now we can see something interesting. The mounted container shows two storages: the first is a local one, which is our new cache storage, and the Dropbox storage is now the second. We also see that a job was automatically created to keep these two storages in sync. The status of this sync job may be displayed as `ONE_SHOT` at first, indicating that an initial sync from Dropbox to the cache is in progress. The `wl container info` command now also shows that the container has a cache storage associated with it:
$ wl c info cache-test
Sensitive fields are hidden.
/home/user/.config/wildland/containers/cache-test.container.yaml
object: container
owner: '0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6'
paths:
- /.uuid/6dd6bc9a-c32a-4d63-a3e7-c12497b75011
- /mydropbox
backends:
storage:
- type: dropbox
backend-id: 873eb097-ec52-4ede-83e6-7381b095b2e7
title: null
categories: []
version: '1'
cache:
type: local
backend_id: 097fc435-1f34-4b9c-b058-487c18765472
location: /home/user/storage/cache/6dd6bc9a-c32a-4d63-a3e7-c12497b75011
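A script can test for the `cache:` section in this output to find out whether a container currently has a cache. The sketch below assumes the `wl c info` output keeps the layout shown on this page; `has_cache` is a hypothetical helper name.

```shell
# Sketch: detect a cache section in `wl c info` output read from stdin.
# Assumes the section starts with a line containing only "cache:".

# usage: wl c info <container> | has_cache
has_cache() {
    grep -q '^[[:space:]]*cache:[[:space:]]*$'
}
```

For instance, `wl c info cache-test | has_cache && echo "cache enabled"` would print the message only for a container with an associated cache storage.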
Let's make sure our files are intact and compare the performance:
$ tree ~/wildland/mydropbox/
/home/user/wildland/mydropbox/
├── test1.txt
└── test2.rnd
0 directories, 2 files
$ time find ~/wildland/mydropbox/test* -type f -print0 | xargs -0 sha256sum
3cd203ac11340842055a6de561c9d69ca4493e912bd4c3c440c80711e16d5aee /home/user/wildland/mydropbox/test1.txt
e6811bb1913eec21d89ecbb0c2fa72df7671995408e6a8b6291a2595ba48c157 /home/user/wildland/mydropbox/test2.rnd
real 0m0.108s
user 0m0.033s
sys 0m0.010s
The checksums match and the operation is much faster since we're now operating on local files. Let's make some modifications to our files now:
$ echo "modified" > ~/wildland/mydropbox/test1.txt
$ time cp /tmp/rnd ~/wildland/mydropbox/test3.rnd
real 0m0.030s
user 0m0.003s
sys 0m0.003s
$ cat ~/storage/cache/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/test1.txt
modified
The last command demonstrates that the physical location of our cache storage is `~/storage/cache/6dd6bc9a-c32a-4d63-a3e7-c12497b75011`. Here `~/storage/cache` is our cache root, as set up at the beginning of this tutorial, and the last path component is the container UUID.
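Since the cache directory is the cache root followed by the container UUID, its location can be derived from the `/.uuid/...` path in the `wl c info` output. The helpers below are an illustrative sketch with made-up names, and they assume the output format shown earlier on this page.

```shell
# Sketch: derive a container's cache directory. Helper names are hypothetical.

# usage: wl c info <container> | container_uuid
# Prints the UUID from the first "- /.uuid/<uuid>" path entry.
container_uuid() {
    sed -n 's|^[[:space:]]*- /\.uuid/\([0-9a-fA-F-]*\)[[:space:]]*$|\1|p' | head -n 1
}

# usage: cache_dir_for <cache-root> <container-uuid>
cache_dir_for() {
    # ${1%/} drops a single trailing slash from the cache root, if present.
    printf '%s/%s\n' "${1%/}" "$2"
}
```

With the container from this tutorial, `cache_dir_for ~/storage/cache "$(wl c info cache-test | container_uuid)"` would be expected to print the directory inspected with `cat` above.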
After creating the cache with the `wl c mount -c` command, we no longer need to add the `-c` option: the cache will be used by default whenever the container is mounted:
$ wl c unmount cache-test
Loading containers (from 'cache-test'): 0
Unmounting 1 containers
$ wl c mount cache-test
Loading containers (from 'cache-test'): 1
Checking container references (from 'cache-test'): 1
Preparing mount of container references (from 'cache-test'): 1
Mounting one storage
$ wl status
Mounted containers:
/.users/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6:/.backends/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/097fc435-1f34-4b9c-b058-487c18765472
storage: local
paths:
/mydropbox
/.users/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6:/.backends/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/873eb097-ec52-4ede-83e6-7381b095b2e7
storage: dropbox
Sync jobs:
:/mydropbox: SYNCED 'dropbox'(backend_id=873eb097-ec52-4ede-83e6-7381b095b2e7) <-> 'local'(backend_id=097fc435-1f34-4b9c-b058-487c18765472)
Now let's demonstrate how to disable the cache and double-check that the container contents are preserved. Operations like adding or removing a cache should be done on an unmounted container.
$ wl c unmount cache-test
Loading containers (from 'cache-test'): 0
Unmounting 1 containers
$ wl c delete-cache cache-test
Deleting cache: /home/user/.config/wildland/cache/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6.6dd6bc9a-c32a-4d63-a3e7-c12497b75011.storage.yaml
$ wl c info cache-test
Sensitive fields are hidden.
/home/user/.config/wildland/containers/cache-test.container.yaml
object: container
owner: '0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6'
paths:
- /.uuid/6dd6bc9a-c32a-4d63-a3e7-c12497b75011
- /mydropbox
backends:
storage:
- type: dropbox
backend-id: 873eb097-ec52-4ede-83e6-7381b095b2e7
title: null
categories: []
version: '1'
Cache manifests are kept in the `cache` subdirectory of the Wildland config directory (`~/.config/wildland` by default). Their names are generated from the container owner ID and the container UUID, joined by a dot and followed by `.storage.yaml`. After removing the cache we see that our container is back to having only one storage, the Dropbox one.
Note: the `delete-cache` command only removes the cache manifest; to prevent accidental loss of data, it does not delete any files from the cache directory.
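The manifest naming scheme can be reproduced in a few lines of shell. This sketch follows the file name printed by `wl c delete-cache` above; the function name and its optional third argument are assumptions made for illustration.

```shell
# Sketch: build the expected path of a cache manifest.
# The function name and argument order are hypothetical.

# usage: cache_manifest_path <owner-id> <container-uuid> [config-dir]
cache_manifest_path() {
    # Fall back to the default Wildland config directory when none is given.
    local config_dir="${3:-$HOME/.config/wildland}"
    printf '%s/cache/%s.%s.storage.yaml\n' "$config_dir" "$1" "$2"
}
```

A check such as `[ -e "$(cache_manifest_path "$owner" "$uuid")" ]` could then tell whether a cache manifest is present for a given container, where `$owner` and `$uuid` hold the values shown by `wl c info`.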
$ wl c mount cache-test
Loading containers (from 'cache-test'): 1
Checking container references (from 'cache-test'): 1
Preparing mount of container references (from 'cache-test'): 1
Mounting one storage
$ wl status
Mounted containers:
/.users/0xfa24b153948426f9451f9ebd0103b508b484f7c9042864c0b8f333d974b261b6:/.backends/6dd6bc9a-c32a-4d63-a3e7-c12497b75011/873eb097-ec52-4ede-83e6-7381b095b2e7
storage: dropbox
paths:
/mydropbox
No sync jobs running
We can see that the cache is not being used now, and only the Dropbox storage is mounted. Let's verify that everything is intact:
$ tree ~/wildland/mydropbox/
/home/user/wildland/mydropbox/
├── test1.txt
├── test2.rnd
└── test3.rnd
0 directories, 3 files
$ cat ~/wildland/mydropbox/test1.txt
modified
$ time find ~/wildland/mydropbox/test* -type f -print0 | xargs -0 sha256sum
4487e24377581c1a43c957c7700c8b49920de7b8500c05590cee74996ef73f42 /home/user/wildland/mydropbox/test1.txt
e6811bb1913eec21d89ecbb0c2fa72df7671995408e6a8b6291a2595ba48c157 /home/user/wildland/mydropbox/test2.rnd
e6811bb1913eec21d89ecbb0c2fa72df7671995408e6a8b6291a2595ba48c157 /home/user/wildland/mydropbox/test3.rnd
real 0m6.173s
user 0m0.098s
sys 0m0.030s
Everything is fine and we're back to slow file operations 😉
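To put a number on the difference, the `real` values reported by `time` can be converted to seconds and divided. The helpers below are a small sketch with made-up names; they assume the `NmS.SSSs` format shown in the transcripts, and single runs like these are an indication rather than a benchmark.

```shell
# Sketch: compare two `time` readings. Helper names are hypothetical.

# usage: to_seconds 1m52.631s   -> prints 112.631
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, parts, "m")      # parts[1] = minutes, parts[2] = seconds plus "s"
        sub(/s$/, "", parts[2])   # drop the trailing unit
        printf "%.3f\n", parts[1] * 60 + parts[2]
    }'
}

# usage: speedup <slow-time> <fast-time>
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1fx\n", a / b }'
}
```

For the `sha256sum` runs recorded in this section, `speedup 0m3.017s 0m0.108s` prints `27.9x`.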
# Forest example
Now let's see how we can improve the performance of accessing a large Wildland forest, e.g. the Pandora forest. We're assuming here that the Pandora forest was correctly imported as described in the Public Forests howto. Here is the import step as a reminder:
$ wl user import --path /mydirs/ariadne https://ariadne.wildland.io
We mount all containers in the Pandora forest to record baseline performance:
$ time wl c mount ':/mydirs/ariadne:/forests/pandora:*:'
Warning: cannot load bridge to [/forests/codepoets]
Warning: *: cannot load subcontainer .uuid/fe42cec7-3044-45c8-81a9-2298cd31b393.yaml: Cannot decrypt manifest.
Warning: cannot load bridge to [/forests/besidethepark]
Loading containers (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Checking container references (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Preparing mount of container references (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Mounting storages for containers: 124
real 1m52.631s
user 0m8.521s
sys 0m0.362s
$ time tree -F -L 2 ~/wildland/mydirs/ariadne:/forests/pandora:/
/home/user/wildland/mydirs/ariadne:/forests/pandora:/
├── README/
│ └── Copyright Notice and Terms of Use/
├── agent/
│ ├── @persons/
│ ├── @timeline/
│ └── agent-client-protocol/
├── arch/
│ ├── @clients/
...
├── Pandora Docs/
├── UX thoughts/
└── Users onboarding - marketplace/
269 directories, 10 files
real 0m5.409s
user 0m0.015s
sys 0m0.016s
$ wl c unmount ':/mydirs/ariadne:/forests/pandora:*:'
Warning: cannot load bridge to [/forests/codepoets]
Warning: *: cannot load subcontainer .uuid/fe42cec7-3044-45c8-81a9-2298cd31b393.yaml: Cannot decrypt manifest.
Warning: cannot load bridge to [/forests/besidethepark]
Loading containers (from ':/mydirs/ariadne:/forests/pandora:*:'): 123
Unmounting 125 containers
Now let's see how adding a cache to Pandora changes things:
$ time wl c mount -c ':/mydirs/ariadne:/forests/pandora:*:'
Warning: cannot load bridge to [/forests/codepoets]
Warning: *: cannot load subcontainer .uuid/fe42cec7-3044-45c8-81a9-2298cd31b393.yaml: Cannot decrypt manifest.
Warning: cannot load bridge to [/forests/besidethepark]
Loading containers (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Checking container references (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Preparing mount of container references (from ':/mydirs/ariadne:/forests/pandora:*:'): 124
Mounting storages for containers: 124
real 1m5.364s
user 0m10.324s
sys 0m0.441s
$ time tree -F -L 2 ~/wildland/mydirs/ariadne:/forests/pandora:/
/home/user/wildland/mydirs/ariadne:/forests/pandora:/
├── README/
│ └── Copyright Notice and Terms of Use/
├── agent/
│ ├── @persons/
│ ├── @timeline/
│ └── agent-client-protocol/
├── arch/
│ ├── @clients/
...
├── Pandora Docs/
├── UX thoughts/
└── Users onboarding - marketplace/
269 directories, 10 files
real 0m0.128s
user 0m0.012s
sys 0m0.006s
$ wl c info ':/mydirs/ariadne:/forests/pandora:/home/omeg:'
Sensitive fields are hidden.
object: container
owner: '0x1ea3909882be658d0ab69a822f7c923d12454ec024f4d8dd8f7113465167fcbe'
paths:
- /.uuid/24b3b45c-57e1-44ec-b66c-33d952c99c6a
- /home/omeg
backends:
storage:
- type: delegate
backend-id: 3f65e772-ec1e-419a-9d62-4d1b8c311589
title: null
categories: []
version: '1'
access:
- user: '*'
cache:
type: local
backend_id: 57b6de5e-4341-4399-ae8a-04d4b1d2e907
location: /home/user/storage/cache/24b3b45c-57e1-44ec-b66c-33d952c99c6a
The initial mount time can be slightly longer when creating caches for the first time. We see that accessing Pandora files is pretty much instantaneous now thanks to them being mirrored locally. We can of course modify them if we have write access and the changes will be synced to remote storages.
Note: when mounting such a forest with a cache for the first time, the initial sync to the cache may take a while. We can always check the sync status with the `wl status` command:
$ wl status
Mounted containers:
...large number of containers...
Sync jobs:
:/mydirs/ariadne:/forests/pandora:/.uuid/014f1f66-c5a3-41e3-9c39-8e000e42f062: SYNCED 'delegate'(backend_id=66ba2c9c-abd5-4056-b609-0a3f8e2a8985) <-> 'local'(backend_id=733a3d3a-f278-4e42-ba22-85a2d4ad06e8)
:/mydirs/ariadne:/forests/pandora:/home/maja.kostacinska: SYNCED 'delegate'(backend_id=a2556f45-ed5c-48a5-ac57-f21c4c21856a) <-> 'local'(backend_id=79200c1f-ab9c-48b1-aa8f-70ab7dd1b37d)
...etc...
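With this many sync jobs, a summary is easier to read than the full listing. The following sketch extracts the state word from each line of the `Sync jobs:` section; it assumes every job line has the `<path>: <STATE> '<backend>'...` shape shown above (including for states such as `ONE_SHOT`), and the function names are illustrative.

```shell
# Sketch: summarize sync job states from `wl status` output on stdin.
# Helper names are hypothetical.

# usage: wl status | sync_states
# Prints one state (e.g. SYNCED) per sync job.
sync_states() {
    sed -n "/^Sync jobs:/,\$ s/.*: \([A-Z_][A-Z_]*\) '.*/\1/p"
}

# usage: wl status | sync_summary
# Prints a count for each distinct state.
sync_summary() {
    sync_states | sort | uniq -c
}

# usage: wl status | all_synced
# Succeeds when no job is in a state other than SYNCED.
all_synced() {
    local pending
    pending=$(sync_states | grep -cv '^SYNCED$' || true)
    [ "$pending" -eq 0 ]
}
```

A loop such as `until wl status | all_synced; do sleep 5; done` could be used to wait for the initial sync to finish. Note that `all_synced` also succeeds when no sync jobs are listed at all.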
As with a single container, after using the initial mount command to create caches, we don't need to specify `--with-cache` or `-c` to use caches when mounting, as they are used by default from then on.
To stop using a cache for Pandora we can use the `wl c delete-cache` command:
$ wl c delete-cache ':/mydirs/ariadne:/forests/pandora:*:'
Deleting cache: /home/user/.config/wildland/cache/0x1ea3909882be658d0ab69a822f7c923d12454ec024f4d8dd8f7113465167fcbe.014f1f66-c5a3-41e3-9c39-8e000e42f062.storage.yaml
Deleting cache: /home/user/.config/wildland/cache/0x1ea3909882be658d0ab69a822f7c923d12454ec024f4d8dd8f7113465167fcbe.0389cbcd-53d2-47a6-9a7d-e9b03185711c.storage.yaml
...
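Because `delete-cache` leaves the cached files in place, it can be useful to list the directories that remain under the cache root. The sketch below derives them from the `Deleting cache:` lines, assuming manifest names of the form `<owner-id>.<container-uuid>.storage.yaml` and cache directories named after the container UUID, as seen earlier on this page. The function name is hypothetical, and it only prints paths without touching any files.

```shell
# Sketch: map `wl c delete-cache` output to the cache directories left behind.
# The function name is hypothetical; nothing is deleted.

# usage: wl c delete-cache <container> | leftover_cache_dirs <cache-root>
leftover_cache_dirs() {
    local root="${1%/}"
    # Pull the container UUID out of each manifest file name.
    sed -n 's|^Deleting cache: .*/[^/.]*\.\([^/.]*\)\.storage\.yaml$|\1|p' |
    while IFS= read -r uuid; do
        printf '%s/%s\n' "$root" "$uuid"
    done
}
```

The printed list can be reviewed by hand before deciding whether any of the directories should be removed.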