User Manual

License

parsley is written by Josef Hahn under the terms of the AGPLv3.

Please read the LICENSE file from the package and the Dependencies section for included third-party stuff.

About

Parsley keeps a configured set of places in file systems in sync.

Features:

  • Keeps configured file system places in sync (local and ssh)

  • Robust infrastructure with working retry and error handling

  • Customizable behavior with the ability to add additional program logic for various situations

  • Optional ‘move to sink mode’: always moves all files from the source to a sink and so keeps the source empty

  • Has a mechanism for metadata synchronization (tags, rating, …)

  • Can be used stand-alone or embedded in other tools with a flexible and extensible api

  • Rich graphical interface for configuration and for executing synchronization

  • Graphical interface for manually resolving conflicts that occurred in a synchronization run

  • Designed for being driven by a scheduled task (a.k.a. cronjob), which executes a background command (e.g. each minute)

  • In background mode: Own handling of synchronization intervals (independent of the interval for the scheduled task)

Up-to-date?

Are you currently reading from another source than the homepage? Are you in doubt if that place is up-to-date? If yes, you should visit https://pseudopolis.eu/wiki/pino/projs/parsley and check that. You are currently reading the manual for version 3.3.2800.

Maturity

parsley is in production-stable state.

Dependencies

There are external parts that are used by parsley. Many thanks to the projects and all participants.

Python 3.7, required

Typical GNU/Linux Desktop, recommended

sshfs, optional

python3-pyxattr, optional

Gtk 3 and PyGObject, recommended : for user interfaces.

font ‘Symbola’, included : for logo symbol; free for use; copied from here.

banner image, included : _meta/background.png; license CC BY-SA 3.0; copied from here.

all files in /_meta, included : if not mentioned otherwise, Copyright 2015 Josef Hahn under the CC BY-SA 3.0 license.

Introduction

Please read how to make Parsley ready for the first steps in Appendix: Installation.

First Steps

Parsley can run in graphical mode as well as in background mode. The latter one is designed for automated file synchronization, e.g. by periodical runs. It is described in a later section.

For the first steps, start Parsley in graphical mode. Either find the appropriate entry in your start menu or call parsley_gui.

_images/maingui.png

It automatically creates an empty configuration file in ~/.parsley/parsley.xml. All changes you make in the Parsley window will eventually be stored there.

It is easy to add a new synchronization now and let it run. This is great for configuration and for testing. For productive usage, it is recommended to run Parsley in windowless mode. See Using Parsley In Windowless Mode for more details.

You can set up your configuration entirely from the Parsley window after start. Read more about the user interface in Graphical Configuration and also about the Configuration Model.

Synchronization Model

This section describes how Parsley proceeds in order to keep two filesystem locations synchronized, i.e. storing the same files. It can only explain how a usual configuration behaves. This behavior can be marginally or completely different once Custom Aspects or other advanced features are used.

In order to keep two filesystem places in sync, Parsley traverses those filesystem trees and operates per file:

  • If a file exists identically on both sides, it proceeds to the next one.

  • If a file exists in only one place, it decides whether to delete that file or to copy it to the other side (by evaluating whether the file existed before).

  • If a file exists on both sides with different content, it checks which one is newer (by evaluating the files’ modification times) and updates the other. If both files have changed since the last run, a filesystem conflict has occurred.

It synchronizes the file content and some metadata.
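The decision rules above can be sketched in a few lines of Python. This is only an illustration of the described default behavior, not Parsley's actual implementation. The names FileState and decide, and the way "existed before" and "changed since the last run" are represented, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """Hypothetical snapshot of one file on one side."""
    content: bytes   # stands in for the file content
    mtime: float     # modification time (seconds since epoch)


def decide(left: Optional[FileState], right: Optional[FileState],
           existed_before: bool, last_run: float) -> str:
    """Return the action for one file path (illustrative only)."""
    if left is None and right is None:
        return "skip"
    if left is None or right is None:
        present, missing = ("left", "right") if right is None else ("right", "left")
        # Known from an earlier run: it was deleted on the other side.
        if existed_before:
            return f"delete on {present}"
        # Never seen before: it is new, so clone it to the other side.
        return f"copy {present} to {missing}"
    if left.content == right.content:
        return "skip"
    # Different content: it is a conflict if both sides changed since the last run.
    if left.mtime > last_run and right.mtime > last_run:
        return "conflict"
    newer, older = ("left", "right") if left.mtime > right.mtime else ("right", "left")
    return f"copy {newer} to {older}"
```

For example, a file that exists only on the left side and was not known from an earlier run yields "copy left to right", while the same situation with existed_before=True yields "delete on left".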

Using Parsley In Windowless Mode

The Parsley core component does not have any graphical interface but is designed to run completely in background without any user intervention.

The Parsley command line tool parsley runs the synchronization processes this way. It reads your Parsley configuration file (or another one) and executes the synchronizations defined there.

The windowless mode is the recommended day-to-day mode. Parsley is to be called in background in regular intervals (which should be short since Parsley has its own interval logic). Add a line like the following to your crontab (or a similar ‘scheduled task’ in Windows or use whatever your OS provides for executing a command every few minutes):

*/3 * * * * /usr/lib/parsley/parsley.py --sync ALL

Please adapt the Parsley path to your system.

If you want to use another location for the configuration, create it first by calling parsley --createconfig --configfile /some/other/dir/parsley.xml and then add --configfile /some/other/dir/parsley.xml to the crontab command. You can of course open and edit the resulting /some/other/dir/parsley.xml in the graphical configuration tool as well. Read Appendix: Command Line for more command line parameters.
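As a small illustration, the crontab entry can be assembled programmatically. The helper below is a hypothetical convenience function, not part of Parsley; it only uses the --sync and --configfile options described in this manual and does not handle special crontab characters such as %.

```python
import shlex
from typing import Optional


def crontab_line(parsley_path: str, every_minutes: int = 3,
                 sync: str = "ALL", configfile: Optional[str] = None) -> str:
    """Build a crontab entry that runs Parsley in windowless mode."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    args = [parsley_path, "--sync", sync]
    if configfile is not None:
        args += ["--configfile", configfile]
    # Quote each argument so that paths with spaces survive the shell.
    command = " ".join(shlex.quote(a) for a in args)
    return f"*/{every_minutes} * * * * {command}"


print(crontab_line("/usr/lib/parsley/parsley.py",
                   configfile="/some/other/dir/parsley.xml"))
```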

You should also add an interval value to each of your synchronizations and set it to a time interval like 20m. Otherwise, all of them will run every time Parsley is called.
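The interplay between the short cron interval and the per-synchronization interval can be pictured as follows. This is a sketch under assumptions, not Parsley's real interval logic: the manual only shows minute values such as 20m, so the other unit suffixes and the function names are hypothetical.

```python
import re

# Assumed unit suffixes; the manual itself only shows minute values like "20m".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text: str) -> int:
    """Convert a string such as "20m" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_due(last_run: float, now: float, interval: str) -> bool:
    """True if at least one full interval has passed since the last run."""
    return now - last_run >= parse_interval(interval)
```

With a cron job firing every 3 minutes and an interval of 20m, most invocations would find is_due to be false and skip the synchronization.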

For details about return values of the Parsley command line tool, see parsley.runtime.returnvalue.ReturnValue.

Graphical Configuration

The Parsley main window gives an overview of all synchronizations and other items you have configured in the currently opened configuration file. You can open different ones at any time. Call parsley --createconfig --configfile /some/other/place/foo.xml on the command line to create a fresh Parsley configuration file in some place.

A fresh configuration is mostly empty and has just a Logger configured, so you get output information when syncing. You should not remove it, unless you really want to get rid of its output. The user interface offers the ‘Add item’ action, which adds new parts to your configuration. A ‘Sync Task’ is what you typically would add in the beginning, while the other items are for more advanced cases and beyond the scope of this manual.

Once you have created a new sync task (sometimes also called: ‘sync configuration’, just ‘sync’, ‘synchronization’, …) in Parsley, you will see it in the main window. You can ‘Configure’ it, and clicking on its ‘Properties’ button offers all configuration details in a direct way (that should be used with care).

_images/mainguichange.png

Each action modifies your configuration in some way. The following explains the configuration of a sync task in detail.

Configure

This opens a menu of possible configuration changes.

_images/changeguided.png

Those guided changes do not need deeper knowledge about Parsley and often no reading of documentation.

Remove sync

Removes an entire synchronization task.

General, Preparation, Filesystem or Aspect level: Add parameter

Adds a name/value pair to the section.

Only add new parameters that are actually allowed for the underlying data structure. The Configuration Model section will explain more details.

General or Filesystem level: Add aspect

Adds a new aspect to the filesystem or to the general section.

Only add new aspects that actually exist (either in parsley.aspect or custom ones). The Configuration Model section will explain more details.

Filesystem level: Add preparation

Adds a new preparation to the general section.

This is only used for exotic cases.

Parameter, Preparation or Aspect level: Remove

Removes a section entirely.

Removing something potentially changes the behavior of this synchronization, so you should know what you are doing! Do not remove values that are required for the underlying data structure. The Configuration Model section will explain more details.

Parameter level: Change value

Changes a value.

You should know about the allowed input values before you change something.

Preparation, Aspect or Filesystem level: Change type

Changes the type of a preparation, aspect or filesystem.

This casts a filesystem or aspect to another type. Only choose a new type that actually exists (parsley.filesystem, parsley.aspect or custom ones). After you changed the type, it might be required to add and remove some values according to what the new data structure expects to get. The Configuration Model section will explain more details.

Configuration Model

Beyond some configuration wizards, large parts of the graphical configuration directly reflect structures from the configuration file. So, for advanced usage, knowledge about the configuration file format is often required.

The configuration of Parsley is (at least if you do not use it embedded in your own Python program) written in xml files. By default, the file ~/.parsley/parsley.xml is used. You may use a different one with the --configfile command line parameter.

A Parsley configuration file contains configuration objects listed as sub nodes in the root node parsleyconfig. There are different kinds of objects (e.g. loggers, synchronization tasks, …), which can be seen in the different tag names in the xml.

Hint

For most values in parsley configuration files, notations may contain references like $FOO, which get replaced by that particular operating system environment variable.
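The substitution style described in this hint is the same one that Python's standard library offers via os.path.expandvars. The snippet below only demonstrates that style; whether Parsley uses this function internally is not stated here, and the variable value is an example.

```python
import os

os.environ["FOO"] = "/mnt/backup"   # example variable for this demonstration

# os.path.expandvars replaces $NAME and ${NAME} with the environment value.
expanded = os.path.expandvars("$FOO/photos")
print(expanded)   # prints /mnt/backup/photos
```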

Synchronization Tasks

Synchronization task configurations are the most interesting ones in most situations. They specify a pair of filesystem locations and a lot of optional additional stuff. The Parsley engine will run the specified synchronization tasks as they are configured here.

A synchronization task configuration - using a sync tag in xml - is by far the most complex kind. An example can be seen in _meta/parsley.xml.example. The following shows and explains the formal structure:

<?xml version="1.0" ?>
<parsleyconfig>
    <sync name="example" interval="5m" ...>
        <fs type="..." name="foo" ...>
            <aspect type="..." .../>
            <aspect type="..." .../>
        </fs>
        <fs type="..." name="bar" ...>
            <aspect type="..." .../>
            <aspect type="..." .../>
        </fs>
        <aspect type="..." .../>
        <aspect type="..." .../>
        <preparation type="..." .../>
        <preparation type="..." .../>
    </sync>
    ...

</parsleyconfig>

A sync tag contains a name (e.g. used in log messages) and a synchronization interval. For more options, see parsley.syncengine.sync.Sync.

It contains two fs tags. Each has a name and specifies a filesystem location (local, ssh, or whatever is supported) for synchronization. See parsley.filesystem for existing implementations.

Each fs tag may contain aspect tags. They control the synchronization behavior, since an aspect is a bunch of small program pieces that react to different events in the synchronization workflow. The complete synchronization functionality, even the builtin one, is part of aspects. See parsley.aspect for existing implementations.

The sync tag may also contain aspect tags directly. Those aspects apply to all filesystems. It is the same as copying those tags into each fs tag.

It may also contain preparation tags. They specify some actions that must take place before the synchronization can take place. Mounting external filesystems is a very common example for this kind of actions. See parsley.preparation for existing implementations.
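To make the structure more tangible, the following sketch reads such a file with Python's xml.etree.ElementTree and computes which aspects apply to each filesystem, treating sync-level aspects as if they were copied into every fs tag. It is not how Parsley itself loads its configuration. All type names in the sample are placeholders rather than real Parsley types, and the order in which the two aspect lists are combined is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; the type names are placeholders, not real Parsley types.
CONFIG = """<?xml version="1.0" ?>
<parsleyconfig>
    <sync name="example" interval="5m">
        <fs type="FsTypeA" name="foo">
            <aspect type="AspectOnlyFoo"/>
        </fs>
        <fs type="FsTypeB" name="bar"/>
        <aspect type="AspectForAll"/>
        <preparation type="SomePreparation"/>
    </sync>
</parsleyconfig>
"""


def effective_aspects(sync: ET.Element) -> dict:
    """Map each fs name to the aspect types that apply to it."""
    # Aspects that are direct children of <sync> apply to all filesystems.
    shared = [a.get("type") for a in sync.findall("aspect")]
    result = {}
    for fs in sync.findall("fs"):
        own = [a.get("type") for a in fs.findall("aspect")]
        result[fs.get("name")] = own + shared
    return result


root = ET.fromstring(CONFIG)
for sync in root.findall("sync"):
    print(sync.get("name"), sync.get("interval"), effective_aspects(sync))
```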

Loggers

Loggers can output parsley log messages in some way to some target. The configuration of one logger follows this structure:

<?xml version="1.0" ?>
<parsleyconfig>
    <logger minseverity="debug" maxseverity="debug" ...>
        <out type="..." ... />
        <formatter type="..." .../>
    </logger>
    ...
</parsleyconfig>

It specifies a minimum and maximum severity that shall be logged (see parsley.logger.logger.Severity). It also contains a formatter configuration for an instance of parsley.logger.formatter.abstractlogformat.Logformat (formats the log message) and an out configuration for a parsley.logger.loggerout.abstractloggerout.Loggerout (actually does the output).

An example can be seen in _meta/parsley.xml.example.

See parsley.logger for all available functionality.

Includes

A configuration file can include other ones. Those files have the same structure as primary configuration files and must be complete, including the parsleyconfig xml root node.

A configuration file can be included with include, this way:

<?xml version="1.0" ?>
<parsleyconfig>
    <include path="./some_other_file.xml"/>
    ...
</parsleyconfig>
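The include mechanism can be pictured as a recursive load. The following sketch collects the configuration objects of a file and of everything it includes. It is an illustration only: the function name is hypothetical, and both the resolution of relative paths against the including file and the silent skipping of cyclic includes are assumptions, not documented Parsley behavior.

```python
import os
import xml.etree.ElementTree as ET


def load_objects(path, _seen=None):
    """Return all configuration objects from path and the files it includes."""
    seen = set() if _seen is None else _seen
    path = os.path.abspath(path)
    if path in seen:            # guard against include cycles
        return []
    seen.add(path)
    root = ET.parse(path).getroot()
    if root.tag != "parsleyconfig":
        raise ValueError(f"{path}: root node must be parsleyconfig")
    objects = []
    for node in root:
        if node.tag == "include":
            # Assumption: relative paths are resolved against the including file.
            target = os.path.join(os.path.dirname(path), node.get("path"))
            objects.extend(load_objects(target, seen))
        else:
            objects.append(node)
    return objects
```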

Custom Aspects

A configuration file can bring the implementation for a custom aspect, which can then be used in some sync task configurations. Those implementations are provided in a customaspect:

<?xml version="1.0" ?>
<parsleyconfig>
    <sync ...>
        ...
        <aspect type="DoSomething" />
    </sync>
    <customaspect name="DoSomething">
from parsley.aspect import *
class DoSomething(Aspect):
    def __init__(self):
        Aspect.__init__(self)
    @hook("", "", "", event=SyncEvent.UpdateDir_Prepare)
    def sleepwhilebeginupdatedir(self, ea, fs, ctrl):
        do_something()
    </customaspect>
</parsleyconfig>

A custom aspect can be used for executing some custom code in some situations, e.g. for keeping EXIF tags of JPEG files clean or doing something with metadata tags of other media files.

Read the Customizing Parsley section for details about implementing a custom aspect.

Python Imports

Python Imports are used for customization. They allow importing arbitrary Python classes or functions from any available module, so you can refer to them in other places. The configuration of one pythonimport follows this structure:

<?xml version="1.0" ?>
<parsleyconfig>
    <pythonimport importfrom="my.mo.du.le.MyFilesystem" to="MyFilesystem" />
    ...
    <sync ...>
        <fs type="MyFilesystem" ...>
            ...
        </fs>
        ...
    </sync>
</parsleyconfig>

While importfrom must be a full name pointing to a Python object that is importable, to is just a bare name without dots!
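The difference between the two attributes can be illustrated with Python's importlib. The sketch below splits a full dotted name into module and object, imports it, and returns it under a bare name. The function resolve is hypothetical and not Parsley's implementation; a standard-library class stands in for a custom filesystem class.

```python
import importlib


def resolve(importfrom: str, to: str):
    """Import the object named by importfrom and return it under the name to."""
    if "." in to:
        raise ValueError("'to' must be a bare name without dots")
    module_name, _, attribute = importfrom.rpartition(".")
    if not module_name:
        raise ValueError("'importfrom' must be a full dotted name")
    module = importlib.import_module(module_name)
    return to, getattr(module, attribute)


# Demonstrated with a standard-library class instead of a custom filesystem.
name, obj = resolve("collections.OrderedDict", "MyMapping")
print(name, obj)
```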

Filesystem Conflicts

Parsley might encounter conflicts in synchronization runs. Those are situations with incompatible changes to one item on both sides. The user must resolve those situations manually.

Since the Parsley synchronization engine runs decoupled from user intervention, it just stores information about this conflict, so the user can decide later how to resolve it.

There is a graphical user interface available for manually resolving filesystem conflicts. Find it in your start menu, in the Parsley overview, or execute parsley_infssync_manageconflicts_gui.

There is also a command line tool available as parsley_infssync_manageconflicts. It is designed for scripted usage. Just start it for getting further details.

After a conflict occurs, those tools can be used to manually resolve each issue. They store conflict resolution information, which is applied the next time the synchronization task runs. They do not execute any synchronization action directly.

_images/conflict.png

Reporting

Parsley collects some process and telemetry information while it executes your synchronization tasks. This includes the execution logs for each run and performance data.

There is a convenient user interface for inspecting those data. Find it in your start menu, in the Parsley overview, or execute parsley_report_gui.

_images/reporting.png

Customizing Parsley

General Workflow Overview

The following describes how the inner parts of parsley work together. This knowledge is very helpful for planning and implementing a customization.

For each <sync> in your configuration, the parsley engine will create and configure one instance of parsley.syncengine.sync.Sync. If it is not skipped (e.g. because it was already executed less time ago than the interval defines), the engine tries to prepare the execution.

Preparing a synchronization means activating all <preparation> specified for this synchronization task. This can mount filesystems or whatever is needed for bringing an environment in place, which is required for the actual synchronization to run. Each preparation is one instance of a subclass of parsley.preparation.abstractpreparation.Preparation, which provides the implementation for activating a preparation before synchronization, for deactivating it afterwards and for status checks.

If the synchronization task is successfully prepared, the actual sync operation begins. The sync operation iterates over whatever it can find in your filesystems and just triggers certain events. Without anything more, it would not do anything (and, technically, it would not even iterate that much - but let’s forget that for now). The complete synchronization behavior comes with a bunch of small pieces of program code that react to those events. Even the builtin parsley synchronization behavior is implemented as aspects (which also means that you can completely get rid of it by not listing those aspects in your sync configuration).

Those event handlers are added to the pipe by means of some <aspect>. Each specified aspect (either within one <fs> or directly within the <sync>) brings an instance of a subclass of parsley.aspect.abstractaspect.Aspect, which registers one or more event handlers with the synchronization pipe. One aspect typically implements a certain piece of behavior (which often needs to react to more than one event).

Intermediate summary: A synchronization task itself is not an interesting thing. It will just fly over your files doing nothing. It can be enriched with some preceding or subsequent actions by means of a preparation. But all the interesting synchronization behavior comes with event handlers. Those event handlers are bundled in some aspect.

The following description gives a more detailed overview of how parsley would fly over your filesystem and which events are triggered on that flight (i.e. which junction points exist, where aspects can hook in for own logic). For most events, additional information is available in the developer documentation.

Whenever an event occurs, all registered event handlers are executed. Each event handler execution takes place on top of (or: is associated with) one of your specified filesystems. Each event handler typically does its work in that particular filesystem. If an aspect that provides a certain event handler is specified in a filesystem, it will be executed exactly for that filesystem. If it is directly specified in the sync, it will be executed once for each filesystem.

There is a mechanism for ordering the execution of the event handlers within one event. Please read parsley.syncengine.sync.Sync.executeevent() for more details about the ordering and more about the internals.
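The relationship between a synchronization task, its aspects and the event handlers can be pictured with a toy event pipe. The class below is deliberately simplified and is not the real parsley.syncengine.sync.Sync: the class, method and event names are hypothetical, and the ordering mechanism mentioned above is left out (handlers simply run in registration order).

```python
from collections import defaultdict


class ToySync:
    """Toy event pipe; not the real parsley.syncengine.sync.Sync."""

    def __init__(self, filesystems):
        self.filesystems = list(filesystems)
        self._handlers = defaultdict(list)   # event -> [(fs, handler)]

    def add_aspect(self, handlers, fs=None):
        """Register an aspect's handlers for one fs, or for all if fs is None."""
        targets = [fs] if fs is not None else self.filesystems
        for event, handler in handlers.items():
            for target in targets:
                self._handlers[event].append((target, handler))

    def execute_event(self, event, path):
        """Run every handler registered for event, each on its filesystem."""
        return [handler(fs, path) for fs, handler in self._handlers[event]]


def log_update(fs, path):
    return f"{fs}: updating {path}"


sync = ToySync(["foo", "bar"])
sync.add_aspect({"update_file": log_update})              # sync-level aspect
sync.add_aspect({"update_file": log_update}, fs="foo")    # fs-level aspect
for line in sync.execute_event("update_file", "a.txt"):
    print(line)
```

The sync-level aspect runs once for each filesystem, the fs-level aspect only for foo. Without any registered aspect, execute_event returns an empty list, which mirrors the statement that a synchronization task on its own does nothing.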

Customizable Parts

If you want to override or enhance some parts of the default behavior, read the following parts:

High Level Customization

The parsley.aspect.highlevelcustomization.HighLevelCustomization aspect allows you to include pieces of Python code from somewhere in the synchronization directory tree at defined places in the synchronization behavior.

This is very convenient for including some automation tasks, e.g. automatically converting some kinds of files whenever they appear or reacting in any other custom way to filesystem updates.

Include this aspect in your configuration in order to use this feature. The graphical interface has a guide for it as well.

Python Imports

It is possible to implement own stuff in external Python modules and use those classes and functions from within the configuration (e.g. as a different filesystem type). Specify a Python Import for such a class or function.

Appendix: Command Line

The parsley command-line tool understands this syntax:

parsley [options]*

Options can be some of the following:

  • --sync [syncname] : Runs the synchronization of syncname. Use ALL for syncname in order to run all synchronizations.

  • --listsyncs : Lists all available synchronization configurations.

  • --datadir [dirpath] : Uses dirpath as control data storage directory instead of the default one (~/.parsley).

  • --configfile [configfile] : Uses configuration file configfile instead of the default one (~/.parsley/parsley.xml).

  • --createconfig : Creates a fresh configuration file (can be combined with --datadir).

  • --forcesync [syncname] : Marks the synchronization syncname for forceful synchronization, even if the time interval is not elapsed yet. Can be used more than once.

  • --lock [pid] : Just acquires the lock, so no other synchronization run will actually do anything until you unlock. Used for backup. Without a pid, the lock must be unlocked and refreshed every 10 minutes!

  • --unlock : Releases a lock acquired with --lock.

Appendix: Installation

Install Parsley via the installation package for your environment, if a suitable one exists for download. This also takes care of installing dependencies and doing preparation (unless mentioned otherwise in the installation procedure). After the installation, you can skip the rest of this section.

Source Code Archive

Use the source code archive as fallback. Extract it to a location that is convenient to you (Windows users need an external archive program; for example the great ‘7-Zip’ tool). Also take a look at the Dependencies for external stuff you need to install as well.

It is highly recommended to also establish a command line link or alias for parsley/parsley.py so you just have to type parsley (ln -s ...parsley/parsley.py /usr/local/bin/parsley on Unix or any other operating system specific way). Do the same for parsley/parsley_gui.py, parsley/parsley_infssync_manageconflicts.py and parsley/parsley_infssync_manageconflicts_gui.py. This matches what the installation packages do and is required for executing the exact same commands as used in this manual (otherwise you must substitute the full name for the short command names in this manual).