Thin out your ZFS snapshot collection with timegaps

I recently released timegaps, a command line tool for, among other things, implementing backup retention policies. In this article I demonstrate how timegaps can be used to filter ZFS snapshots, i.e. to identify those snapshots that can be deleted according to a given backup retention policy.

Start by listing the names of the snapshots (in my case of the usbbackup/synctargets dataset):

$ zfs list -r -H -t snapshot -o name usbbackup/synctargets
usbbackup/synctargets@20140112-191516
usbbackup/synctargets@20140112-193913
usbbackup/synctargets@20140112-195055
usbbackup/synctargets@20140112-195320
usbbackup/synctargets@20140113-170032
[...]
usbbackup/synctargets@20140327-174655
usbbackup/synctargets@20140328-141257
usbbackup/synctargets@20140330-183001
usbbackup/synctargets@20140331-172543
usbbackup/synctargets@20140401-175059
usbbackup/synctargets@20140402-180042

As you can see, I have encoded the snapshot creation time in the snapshot name. This is a prerequisite for the method presented here.
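
For illustration, names of this shape could be generated at snapshot time roughly as follows. This is a minimal sketch, not necessarily how the snapshots above were created; the helper name snapshot_name is hypothetical.

```shell
# Hypothetical helper: build a snapshot name that encodes the creation time.
snapshot_name() {
    # $1: dataset name; prints <dataset>@YYYYmmdd-HHMMSS (local time)
    printf '%s@%s\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Print an example name for the current moment.
snapshot_name usbbackup/synctargets

# Creating the snapshot itself would then look like this (not executed here):
#   zfs snapshot "$(snapshot_name usbbackup/synctargets)"
```

The timestamp layout only has to match the format string that is later handed to timegaps.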

In the following command line, we feed this list of snapshot names to timegaps via stdin. We instruct timegaps to keep the following snapshots:

  • one recent snapshot (i.e. younger than 1 hour)
  • one snapshot for each of the last 10 hours
  • one snapshot for each of the last 30 days
  • one snapshot for each of the last 12 weeks
  • one snapshot for each of the last 14 months
  • one snapshot for each of the last 3 years

… and to print the other ones (the rejected ones) to stdout. This is the command line:

$ zfs list -r -H -t snapshot -o name usbbackup/synctargets | timegaps \
      --stdin --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' \
      recent1,hours10,days30,weeks12,months14,years3

As you can see, the rules are provided to timegaps via the argument string recent1,hours10,days30,weeks12,months14,years3. The switch --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' tells timegaps how to parse the snapshot creation time from a snapshot name. The --stdin switch tells timegaps to read items from stdin (instead of from the command line, which would be the default).
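
To make the format string more concrete, the following sketch pulls the same six fields (%Y, %m, %d, %H, %M, %S) out of a snapshot name with sed. It only illustrates what the format expresses and is not how timegaps parses names internally; the helper name snapshot_time is hypothetical, and unlike the format string above it accepts any dataset prefix before the @.

```shell
# Hypothetical helper: show which part of a snapshot name carries the time.
snapshot_time() {
    # $1: snapshot name such as usbbackup/synctargets@20140112-191516
    # Prints "YYYY-mm-dd HH:MM:SS", or nothing if the name does not match.
    printf '%s\n' "$1" | sed -n -E \
        's/^.*@([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})$/\1-\2-\3 \4:\5:\6/p'
}

snapshot_time usbbackup/synctargets@20140112-191516
# prints: 2014-01-12 19:15:16
```

A name that does not follow the pattern produces no output here, which is a cheap way to spot snapshots that the method cannot handle.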

See it in action:

$ zfs list -r -H -t snapshot -o name usbbackup/synctargets | timegaps \
      --stdin --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' \
      recent1,hours10,days30,weeks12,months14,years3
usbbackup/synctargets@20140227-180824
usbbackup/synctargets@20140228-201639
usbbackup/synctargets@20140301-180728
[...]
usbbackup/synctargets@20140313-235809

You don’t really see the difference here because I cropped the output. The following shows that, for my data and according to the rules, timegaps decided to reject 41 of 73 snapshots:

$ zfs list -r -H -t snapshot -o name usbbackup/synctargets | wc -l
73
$ zfs list -r -H -t snapshot -o name usbbackup/synctargets | timegaps \
    --stdin --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' \
    recent1,hours10,days30,weeks12,months14,years3 | wc -l
41
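
With 73 snapshots in total and 41 rejected, 32 are kept. To see which ones, the full list and the rejected list can be compared with grep. This is a sketch assuming both lists were saved to files first; the function name kept_snapshots and the file names are made up for the example.

```shell
# Hypothetical helper: print the snapshots that are NOT in the rejected list.
kept_snapshots() {
    # $1: file with all snapshot names, one per line
    # $2: file with the rejected snapshot names, one per line
    # -F: fixed strings, -x: match whole lines, -v: invert, -f: patterns from file
    grep -v -x -F -f "$2" "$1"
}

# Usage (not executed here, needs zfs and timegaps):
#   zfs list -r -H -t snapshot -o name usbbackup/synctargets > all.txt
#   timegaps --stdin --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' \
#       recent1,hours10,days30,weeks12,months14,years3 < all.txt > rejected.txt
#   kept_snapshots all.txt rejected.txt
```

Looking at the kept list is a useful sanity check before anything gets deleted.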

That command line can easily be extended to create a little script that actually deletes these snapshots. sed is useful here for prepending the string 'zfs destroy ' to each output line (each line corresponds to one rejected snapshot):

$ zfs list -r -H -t snapshot -o name usbbackup/synctargets | timegaps \
    --stdin --time-from-string 'usbbackup/synctargets@%Y%m%d-%H%M%S' \
    recent1,hours10,days30,weeks12,months14,years3 | \
    sed 's/^/zfs destroy /' > destroy_snapshots.sh
$ cat destroy_snapshots.sh
zfs destroy usbbackup/synctargets@20140227-180824
zfs destroy usbbackup/synctargets@20140228-201639
[...]
zfs destroy usbbackup/synctargets@20140325-215800
zfs destroy usbbackup/synctargets@20140313-235809

Timegaps is well tested via unit tests, and I use it in production. However, at the time of writing, I have not gotten any feedback from others. Therefore, please review destroy_snapshots.sh and check that it makes sense. Only then execute it.
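
As an alternative to a generated script, a more defensive variant could read the rejected names in a loop, skip anything that does not look like a snapshot name, and default to a dry run. This is only a sketch: the function name destroy_rejected and the DRY_RUN variable are assumptions made for this example.

```shell
# Hypothetical helper: destroy the snapshots named on stdin, one per line.
# Defaults to a dry run; set DRY_RUN=0 to really call zfs destroy.
destroy_rejected() {
    while IFS= read -r snap; do
        case "$snap" in
            *@*) ;;  # looks like dataset@snapshot, continue below
            *)
                # A name without '@' would refer to a dataset, not a snapshot.
                echo "skipping suspicious line: $snap" >&2
                continue
                ;;
        esac
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: zfs destroy $snap"
        else
            zfs destroy "$snap"
        fi
    done
}

# Usage (not executed here):
#   destroy_rejected < rejected.txt              # only prints what would happen
#   DRY_RUN=0 destroy_rejected < rejected.txt    # actually deletes
```

The check for '@' matters because zfs destroy applied to a plain dataset name would remove the dataset itself. zfs destroy also has its own dry-run mode (-n, usually combined with -v), which can serve as a second check.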

I expect this post to raise some questions regarding data safety in general and possibly regarding the synchronization between snapshot creation and deletion. I would very much appreciate questions and feedback in the comments section below, thanks.

  1. Brian Duff

    My other post got marked as spam, asked for it to be corrected.

    I am able to get my datasets to work with your script, but it’s messy: I convert the epoch to %Y%m%d-%H%M%S and then run your script:

    (dataset="homePool/offsiteHome/dataset"; for LINE in $(zfs list -t snapshot -H -o name -r $dataset|sed 's#homePool.*@##g'); do echo "$dataset@$(date -d @$LINE "+%Y%m%d-%H%M%S")"; done)|timegaps --stdin --time-from-string "$dataset@%Y%m%d-%H%M%S" recent1,hours24,days7,weeks4,months1

    Looking through your source, it looks like you use epoch at times, so maybe it wouldn’t be too hard to have the string be able to natively handle %s snapshots? I didn’t understand all of what your scripts do, my scripting is basically limited to simple bash.