User Manual¶
License¶
parsley is written by Josef Hahn under the terms of the AGPLv3.
Please read the LICENSE file from the package and the Dependencies section for included third-party stuff.
About¶
Parsley keeps a configured set of places in file systems in sync.
Features:
Keeps configured file system places in sync (local and ssh)
Robust infrastructure with working retry and error handling
Customizable behavior with the ability to add additional program logic for various situations
Optional ‘move to sink mode’: always moves all files from the source to a sink, keeping the source empty
Has a mechanism for metadata synchronization (tags, rating, …)
Can be used stand-alone or embedded in other tools with a flexible and extensible api
Rich graphical interface for configuration and for executing synchronization
Graphical interface for manually resolving conflicts that occurred in a synchronization run
Designed for being driven by a scheduled task (a.k.a. cronjob), which executes a background command (e.g. each minute)
In background mode: Own handling of synchronization intervals (independent of the interval for the scheduled task)
Up-to-date?¶
Are you currently reading from another source than the homepage? Are you in doubt if that place is up-to-date? If yes, you should visit https://pseudopolis.eu/wiki/pino/projs/parsley and check that. You are currently reading the manual for version 3.3.2800.
Maturity¶
parsley is in production-stable state.
Dependencies¶
There are external parts that are used by parsley. Many thanks to the projects and all participants.
Python 3.7, required
Typical GNU/Linux Desktop, recommended
sshfs, optional
python3-pyxattr, optional
Gtk 3 and PyGObject, recommended : for user interfaces.
font ‘Symbola’, included : for logo symbol; free for use; copied from here.
banner image, included : _meta/background.png; license CC BY-SA 3.0; copied from here.
all files in /_meta, included : if not mentioned otherwise, Copyright 2015 Josef Hahn under the CC BY-SA 3.0 license.
Introduction¶
Please read how to make Parsley ready for the first steps in Appendix: Installation.
First Steps¶
Parsley can run in graphical mode as well as in background mode. The latter one is designed for automated file synchronization, e.g. by periodical runs. It is described in a later section.
For the first steps, start Parsley in graphical mode. Either find the appropriate entry in your start menu or call parsley_gui.
It automatically creates an empty configuration file in ~/.parsley/parsley.xml. All changes you make in the Parsley window will eventually be stored there.
It is easy to add a new synchronization now and let it run. This is great for configuration and for testing. For productive usage, it is recommended to run parsley in windowless mode. See Using Parsley In Windowless Mode for more details.
You can set up your configuration entirely from the Parsley window after start. Read more about the user interface in Graphical Configuration and also about the Configuration Model.
Synchronization Model¶
This section describes how Parsley proceeds in order to keep two filesystem locations synchronized, i.e. storing the same files. It can only explain how a usual configuration behaves. This behavior can be marginally or completely different once Custom Aspects or other advanced features are used.
In order to keep two filesystem places in sync, Parsley traverses those filesystem trees and operates per file:
If a file exists identically on both sides, it proceeds to the next one.
If a file only exists in one place, it decides whether to delete it there or to clone it to the other place (by evaluating if the file existed before).
If a file exists on both sides with different content, it checks which one is the fresher one (by evaluating the files’ modification times) and updates the other. If both files changed since the last run, a filesystem conflict occurred.
It synchronizes the file content and a few metadata.
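The per-file decision described above can be sketched in a few lines of Python. This is only an illustration and not Parsley's implementation: FileState, decide and the returned action strings are hypothetical names, and the reading of "existed before" (a file known from an earlier run that is now missing on one side was deleted there) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """State of one file on one side; all fields are illustrative."""
    content_id: str                      # stand-in for the content, e.g. a hash
    mtime: float                         # modification time in seconds
    changed_since_last_run: bool = False


def decide(a: Optional[FileState], b: Optional[FileState], existed_before: bool) -> str:
    """Return the action for one file; None means the file is missing on that side."""
    if a is None and b is None:
        return "nothing"
    if a is None or b is None:
        # Present on one side only: a previously known file was deleted on the
        # other side (assumption), an unknown file is new and gets cloned.
        if existed_before:
            return "delete remaining copy"
        return "copy to b" if b is None else "copy to a"
    if a.content_id == b.content_id:
        return "nothing"
    if a.changed_since_last_run and b.changed_since_last_run:
        return "conflict"
    # Different content, at most one side changed: the fresher one wins.
    return "update b from a" if a.mtime > b.mtime else "update a from b"
```

A real run additionally synchronizes some metadata and may behave differently once custom aspects are configured.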
Using Parsley In Windowless Mode¶
The Parsley core component does not have any graphical interface but is designed to run completely in background without any user intervention.
The Parsley command line tool parsley runs the synchronization processes this way. It reads your Parsley configuration file (or another one) and executes the synchronizations defined there.
The windowless mode is the recommended day-to-day mode. Parsley is to be called in background in regular intervals (which should be short since parsley has its own interval logic). Add such a line to your crontab (or a similar ‘scheduled task’ in Windows or use whatever your OS provides for executing a command every few minutes):
*/3 * * * * /usr/lib/parsley/parsley.py --sync ALL
Please adapt the Parsley path to your system.
If you want to use another location for the configuration, create it first by calling parsley --createconfig --configfile /some/other/dir/parsley.xml and add --configfile /some/other/dir/parsley.xml to the crontab command. You can of course open and edit the resulting /some/other/dir/parsley.xml in the graphical configuration tool as well. Read Appendix: Command Line for more command line parameters.
You should also add a value interval on each of your synchronizations and set it to a time interval like 20m. Otherwise they will all actually run every time Parsley is called.
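The effect of such an interval can be pictured with a small Python sketch: the scheduled task fires every few minutes, but a synchronization only runs when its own interval has elapsed. This is not Parsley's code. The functions parse_interval and is_due are hypothetical, and the unit letters other than m are an assumption, since the manual only shows values like 20m and 5m.

```python
import re
import time
from typing import Optional

# Hypothetical unit table; only the "m" form (e.g. "20m") appears in the manual.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text: str) -> int:
    """Convert a string such as '20m' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def is_due(last_run: Optional[float], interval: str, now: Optional[float] = None) -> bool:
    """True if the task never ran or its interval has elapsed since last_run."""
    if last_run is None:
        return True
    now = time.time() if now is None else now
    return now - last_run >= parse_interval(interval)
```

With a crontab entry firing every 3 minutes and an interval of 20m, most invocations would find nothing due and return quickly.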
For details about return values of the Parsley command line tool, see parsley.runtime.returnvalue.ReturnValue.
Graphical Configuration¶
The Parsley main window gives an overview of all synchronizations and other stuff you have configured in the currently opened configuration file. You can open different ones at any time. Call parsley --createconfig --configfile /some/other/place/foo.xml on the command line for creating a fresh Parsley configuration file in some place.
A fresh configuration is mostly empty and has just a Logger configured, so you get output information when syncing. You should not remove it, unless you really want to get rid of its output. The user interface offers the ‘Add item’ action, which adds new parts to your configuration. A ‘Sync Task’ is what you typically would add in the beginning, while the other stuff is for more advanced cases and beyond this manual.
Once you have created a new sync task (sometimes also called: ‘sync configuration’, just ‘sync’, ‘synchronization’, …) in Parsley, you will see it in the main window. You can ‘Configure’ it, and clicking on its ‘Properties’ button offers all configuration details in a direct way (that should be used with care).
Each action modifies your configuration in some way. The following explains the configuration of a sync task in detail.
Configure
This opens a menu of possible configuration changes.
Those guided changes do not need deeper knowledge about Parsley and often no reading of documentation.
Remove sync
Removes an entire synchronization task.
General, Preparation, Filesystem or Aspect level: Add parameter
Adds a name/value pair to the section.
Only add new parameters that are actually allowed for the underlying data structure. The Configuration Model section will explain more details.
General or Filesystem level: Add aspect
Adds a new aspect to the filesystem or to the general section.
Only add new aspects that actually exist (either in parsley.aspect or custom ones). The Configuration Model section will explain more details.
Filesystem level: Add preparation
Adds a new preparation to the general section.
This is only used for exotic cases.
Parameter, Preparation or Aspect level: Remove
Removes a section entirely.
Removing something potentially changes the behavior of this synchronization, so you should know what you are doing! Do not remove values that are required for the underlying data structure. The Configuration Model section will explain more details.
Parameter level: Change value
Changes a value.
You should know about the allowed input values before you change something.
Preparation, Aspect or Filesystem level: Change type
Changes the type of a preparation, aspect or filesystem.
This casts a filesystem or aspect to another type. Only choose a new type that actually exists (parsley.filesystem, parsley.aspect or custom ones). After you changed the type, it might be required to add and remove some values according to what the new data structure expects to get. The Configuration Model section will explain more details.
Configuration Model¶
Beyond some configuration wizards, large parts of the graphical configuration directly reflect structures from the configuration file. So, for advanced usage, knowledge about the configuration file format is often required.
The configuration of Parsley is (at least if you do not use it embedded in your own Python program) written in xml files. By default, the file ~/.parsley/parsley.xml will be used. You may use a different one with the --configfile command line parameter.
A Parsley configuration file contains configuration objects listed as sub nodes in the root node parsleyconfig. There are different kinds of objects (e.g. loggers, synchronization tasks, …), which can be seen in the different tag names in the xml.
Hint
For most values in parsley configuration files, notations may contain references like $FOO, which get replaced by that particular operating system environment variable.
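The kind of replacement meant here can be demonstrated with Python's standard os.path.expandvars. Whether Parsley uses this function internally is not stated in this manual, so take the snippet as an analogy only. The variable FOO and the path are made up for the demonstration.

```python
import os

# Set only for this demonstration; normally the variable comes from the environment.
os.environ["FOO"] = "/srv/data"

value = "$FOO/photos"                  # a value as it might appear in a configuration
expanded = os.path.expandvars(value)   # replaces $FOO by the environment value
print(expanded)
```

Note that os.path.expandvars leaves references to undefined variables unchanged. How Parsley treats undefined variables is not covered here.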
Synchronization Tasks¶
Synchronization task configurations are the most interesting ones in most situations. They specify a pair of filesystem locations and a lot of optional additional stuff. The Parsley engine will run the specified synchronization tasks as they are configured here.
A synchronization task configuration - using a sync tag in xml - is by far the most complex kind. An example can be seen in _meta/parsley.xml.example. The following shows and explains the formal structure:
<?xml version="1.0" ?>
<parsleyconfig>
    <sync name="example" interval="5m" ...>
        <fs type="..." name="foo" ...>
            <aspect type="..." .../>
            <aspect type="..." .../>
        </fs>
        <fs type="..." name="bar" ...>
            <aspect type="..." .../>
            <aspect type="..." .../>
        </fs>
        <aspect type="..." .../>
        <aspect type="..." .../>
        <preparation type="..." .../>
        <preparation type="..." .../>
    </sync>
    ...
</parsleyconfig>
A sync tag contains a name (e.g. used in log messages) and a synchronization interval. For more options, see parsley.syncengine.sync.Sync.
It contains two fs tags. Each has a name and specifies a filesystem location (local, ssh, or whatever is supported) for synchronization. See parsley.filesystem for existing implementations.
Each fs tag may contain aspect tags. They control the synchronization behavior, since an aspect is a bunch of small program pieces that react to different events in the synchronization workflow. The complete synchronization functionality, even the builtin one, is part of aspects. See parsley.aspect for existing implementations.
The sync tag may also contain aspect tags directly. Those aspects apply to all filesystems. It is the same as copying those tags into each fs tag.
It may also contain preparation tags. They specify actions that must take place before the synchronization can start. Mounting external filesystems is a very common example of this kind of action. See parsley.preparation for existing implementations.
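To make the structure more tangible, the following Python sketch reads such a file with the standard xml.etree.ElementTree module and computes which aspects apply to each filesystem. It does not use Parsley's own configuration loader. All type names in the sample (SomeFilesystem, OnlyOnFoo, OnBoth, SomePreparation) are hypothetical placeholders, and the order in which filesystem-level and sync-level aspects are combined is an assumption.

```python
import xml.etree.ElementTree as ET

# Illustrative configuration; every type name is a made-up placeholder.
CONFIG = """<?xml version="1.0" ?>
<parsleyconfig>
  <sync name="example" interval="5m">
    <fs type="SomeFilesystem" name="foo">
      <aspect type="OnlyOnFoo"/>
    </fs>
    <fs type="SomeFilesystem" name="bar"/>
    <aspect type="OnBoth"/>
    <preparation type="SomePreparation"/>
  </sync>
</parsleyconfig>
"""


def effective_aspects(sync):
    """Map each fs name to the aspect types that apply to it."""
    # Aspects placed directly in <sync> apply to all filesystems.
    shared = [a.get("type") for a in sync.findall("aspect")]
    return {
        fs.get("name"): [a.get("type") for a in fs.findall("aspect")] + shared
        for fs in sync.findall("fs")
    }


root = ET.fromstring(CONFIG)
for sync in root.findall("sync"):
    print(sync.get("name"), sync.get("interval"), effective_aspects(sync))
```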
Loggers¶
Loggers can output parsley log messages in some way to some target. The configuration of one logger follows this structure:
<?xml version="1.0" ?>
<parsleyconfig>
    <logger minseverity="debug" maxseverity="debug" ...>
        <out type="..." ... />
        <formatter type="..." .../>
    </logger>
    ...
</parsleyconfig>
It specifies a minimum and maximum severity that shall be logged (see parsley.logger.logger.Severity). It also contains a formatter configuration for an instance of parsley.logger.formatter.abstractlogformat.Logformat (formats the log message) and an out configuration for a parsley.logger.loggerout.abstractloggerout.Loggerout (actually does the output).
An example can be seen in _meta/parsley.xml.example.
See parsley.logger for all available functionality.
Includes¶
A configuration file can include other ones. Those files have the same structure as primary configuration files and must be complete, including the parsleyconfig xml root node.
A configuration file can be included with include, this way:
<?xml version="1.0" ?>
<parsleyconfig>
    <include path="./some_other_file.xml"/>
    ...
</parsleyconfig>
Custom Aspects¶
A configuration file can bring the implementation for a custom aspect, which can then be used in some sync task configurations. Those implementations are provided in a customaspect:
<?xml version="1.0" ?>
<parsleyconfig>
    <sync ...>
        ...
        <aspect type="DoSomething" />
    </sync>
    <customaspect name="DoSomething">
from parsley.aspect import *
class DoSomething(Aspect):
    def __init__(self):
        Aspect.__init__(self)
    @hook("", "", "", event=SyncEvent.UpdateDir_Prepare)
    def sleepwhilebeginupdatedir(self, ea, fs, ctrl):
        do_something()
    </customaspect>
</parsleyconfig>
A custom aspect can be used for executing some custom code in some situations, e.g. for keeping EXIF tags of JPEG files clean or doing something with metadata tags of other media files.
Read the Customizing Parsley section for details about implementing a custom aspect.
Python Imports¶
Python Imports are used for customization. They allow importing arbitrary Python classes or functions from any available module, so you can refer to them in other places. The configuration of one pythonimport follows this structure:
<?xml version="1.0" ?>
<parsleyconfig>
    <pythonimport importfrom="my.mo.du.le.MyFilesystem" to="MyFilesystem" />
    ...
    <sync ...>
        <fs type="MyFilesystem" ...>
        ...
    </sync>
</parsleyconfig>
While importfrom must be a full name pointing to a Python object that is importable, to is just a bare name without dots!
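What a full dotted name versus a bare name means can be shown with the standard importlib module. The helper names resolve and check_target are hypothetical and this is not how Parsley necessarily implements pythonimport. The example imports os.path.join merely because it is available everywhere.

```python
import importlib


def resolve(importfrom: str):
    """Import the object named by a full dotted path such as 'my.mo.du.le.MyFilesystem'."""
    module_name, _, attr = importfrom.rpartition(".")
    if not module_name:
        raise ValueError("importfrom must be a full dotted name")
    return getattr(importlib.import_module(module_name), attr)


def check_target(to: str) -> str:
    """The target must be a bare identifier, so names with dots are rejected."""
    if not to.isidentifier():
        raise ValueError(f"invalid target name: {to!r}")
    return to


# Register the imported object under its bare name, as a type lookup might do.
registry = {check_target("PathJoin"): resolve("os.path.join")}
```

A name such as My.Filesystem would be rejected by check_target, which mirrors the rule that to must not contain dots.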
Filesystem Conflicts¶
Parsley might encounter conflicts in synchronization runs. Those are situations with incompatible changes of one item on both sides. Those situations must be resolved manually by the user.
Since the Parsley synchronization engine runs decoupled from user intervention, it just stores information about this conflict, so the user can decide later how to resolve it.
There is a graphical user interface available for manually resolving filesystem conflicts. Find it in your start menu, in the Parsley overview, or execute parsley_infssync_manageconflicts_gui.
There is also a command line tool available as parsley_infssync_manageconflicts. It is designed for scripted usage. Just start it for getting further details.
After a conflict occurs, those tools can be used to manually resolve each issue. These tools store conflict resolution information, which is applied when the synchronization task runs the next time. They do not execute any synchronization action directly.
Reporting¶
Parsley collects some process and telemetry information while it executes your synchronization tasks. This includes the execution logs for each run and performance data.
There is a convenient user interface for inspecting those data. Find it in your start menu, in the Parsley overview, or execute parsley_report_gui.
Customizing Parsley¶
General Workflow Overview¶
The following describes how the inner parts of parsley work together. This knowledge is very helpful for planning and implementing a customization.
For each <sync> in your configuration, the parsley engine will create and configure one instance of parsley.syncengine.sync.Sync. If it is not skipped (e.g. because it was already executed less time ago than the interval defines), the engine tries to prepare the execution.
Preparing a synchronization means activating all <preparation> specified for this synchronization task. This can mount filesystems or whatever is needed for bringing an environment in place, which is required for the actual synchronization to run. Each preparation is one instance of a subclass of parsley.preparation.abstractpreparation.Preparation, which provides the implementation for activating a preparation before synchronization, for deactivating it afterwards and for status checks.
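The activate, synchronize, deactivate lifecycle can be sketched as follows. DemoPreparation is a stand-in and deliberately not a subclass of Parsley's real Preparation class. The method names enable, disable and is_enabled are hypothetical, so consult the developer documentation for the actual interface.

```python
class DemoPreparation:
    """Stand-in for a preparation; NOT Parsley's real base class."""

    def __init__(self, name):
        self.name = name
        self.active = False

    def enable(self):
        # A real preparation might mount a remote filesystem here.
        self.active = True

    def disable(self):
        self.active = False

    def is_enabled(self):
        return self.active


def run_with_preparations(preparations, sync_action):
    """Enable all preparations, run the sync, then disable them in reverse order."""
    enabled = []
    try:
        for prep in preparations:
            prep.enable()
            enabled.append(prep)
        return sync_action()
    finally:
        # Deactivate even if enabling or syncing failed midway.
        for prep in reversed(enabled):
            prep.disable()
```

Disabling in reverse order inside a finally block is a design choice of this sketch. It ensures that whatever was brought in place is torn down again, even when the synchronization raises an error.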
If the synchronization task is successfully prepared, the actual sync operation begins. The sync operation iterates over whatever it can find in your filesystems and just triggers certain events. Without anything more, it would not do anything (and, technically, it would not even iterate that much - but let’s forget that for now). The complete synchronization behavior comes with a bunch of small pieces of program code that react on those events. Even the builtin parsley synchronization behavior is implemented as aspects (which also means that you can completely get rid of it by not listing those aspects in your sync configuration).
Those event handlers are added to the pipe by means of some <aspect>. Each specified aspect (either within one <fs> or directly within the <sync>) brings an instance of a subclass of parsley.aspect.abstractaspect.Aspect, which registers one or more event handlers to the synchronization pipe. One aspect typically implements a certain piece of behavior (which often needs to react to more than only one event).
Intermediate summary: A synchronization task itself is not an interesting thing. It will just fly over your files doing nothing. It can be enriched with some preceding or subsequent actions by means of a preparation. But all the interesting synchronization behavior comes with event handlers. Those event handlers are bundled in some aspect.
The following description gives a more detailed overview of how parsley would fly over your filesystem and which events are triggered on that flight (i.e. which junction points exist, where aspects can hook in for own logic). For most events, additional information is available in the developer documentation.
At the very beginning of the synchronization run (directly after it is prepared), parsley.syncengine.common.SyncEvent.BeginSync is triggered.
Afterwards it starts the synchronization of the root directory. Synchronization of a directory (the root one and all the other ones) executes these steps:
    parsley.syncengine.common.SyncEvent.UpdateDir_Prepare is triggered for preparing the directory synchronization.
    parsley.syncengine.common.SyncEvent.UpdateDir_ListDir is triggered for collecting a list of all direct child entries (files, subdirectories, …) in that directory.
    Afterwards, parsley iterates over all the listed child entries and synchronizes each of them. Each entry synchronization executes these steps:
        parsley.syncengine.common.SyncEvent.UpdateItem_BeforeElectMaster is triggered for preparing the next step.
        parsley.syncengine.common.SyncEvent.UpdateItem_ElectMaster is triggered for electing the master filesystem. This is the filesystem that contains the ‘right’ version of that item. All the other filesystems are meant to get updated according to this version. The result of the election could be to skip all the other steps for this child entirely. It can also select a non-existing location (which typically leads to deletion later on).
        parsley.syncengine.common.SyncEvent.UpdateItem_CheckConflicts is triggered for checking if a conflict exists between the master filesystem and any other one.
        parsley.syncengine.common.SyncEvent.UpdateItem_ResolveConflicts is triggered for resolving those conflicts, if conflicts appeared.
        parsley.syncengine.common.SyncEvent.UpdateItem_SkippedDueConflicts is triggered if conflicts appeared and could not be resolved. In that situation, after triggering this event, some of the next steps are skipped and processing resumes at UpdateItem_AfterUpdate (see below).
        parsley.syncengine.common.SyncEvent.UpdateItem_Update_Prepare is triggered for preparing the actual updating.
        parsley.syncengine.common.SyncEvent.UpdateItem_Update_ExistsInMaster is triggered for actually updating an entry, if it exists in the master filesystem (typically in order to copy it to the other filesystems). If the entry is a directory, this also starts synchronizing this directory by means of the workflow described here (beginning above). This is realized by parsley.aspect.baseinfrastructure.BaseInfrastructure, which is always implicitly included in each synchronization configuration.
        parsley.syncengine.common.SyncEvent.UpdateItem_Update_NotExistsInMaster is triggered for actually updating an entry, if it does not exist in the master filesystem (typically in order to remove it from the other filesystems).
        parsley.syncengine.common.SyncEvent.UpdateItem_AfterUpdate is triggered after the entry was synchronized.
    parsley.syncengine.common.SyncEvent.UpdateDir_AfterUpdate is triggered after the directory was synchronized.
At the end of the synchronization, parsley.syncengine.common.SyncEvent.EndSync is triggered.
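The order of events in the workflow above can be reproduced with a toy traversal. The sketch below is heavily simplified and does not use Parsley: the tree is a nested dict, no conflicts occur (so the conflict resolution events are omitted), every entry is assumed to exist in the master filesystem, and the functions walk and run_sync are hypothetical. Only the event names are taken from this manual.

```python
def walk(tree, path, emit):
    """Emit the per-directory and per-entry events for a nested dict 'tree'.

    Dict values are treated as subdirectories, anything else as a file.
    """
    emit("UpdateDir_Prepare", path)
    emit("UpdateDir_ListDir", path)
    for name, child in tree.items():
        entry = f"{path}{name}"
        emit("UpdateItem_BeforeElectMaster", entry)
        emit("UpdateItem_ElectMaster", entry)
        emit("UpdateItem_CheckConflicts", entry)
        emit("UpdateItem_Update_Prepare", entry)
        emit("UpdateItem_Update_ExistsInMaster", entry)
        if isinstance(child, dict):
            # A directory entry recurses into the same workflow.
            walk(child, entry + "/", emit)
        emit("UpdateItem_AfterUpdate", entry)
    emit("UpdateDir_AfterUpdate", path)


def run_sync(tree):
    """Return the list of (event, path) pairs for one simulated run."""
    log = []

    def emit(event, path):
        log.append((event, path))

    emit("BeginSync", "/")
    walk(tree, "/", emit)
    emit("EndSync", "/")
    return log


for event, path in run_sync({"a.txt": None, "sub": {"b.txt": None}}):
    print(event, path)
```

In Parsley itself, aspects would be registered as handlers for these events, and the recursion into subdirectories is performed by an aspect rather than by the traversal loop.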
Whenever an event occurs, all registered event handlers are executed. Each event handler execution takes place on top of (or: is associated with) one of your specified filesystems. Each event handler typically does its work in that particular filesystem. If an aspect that provides a certain event handler is specified in an fs, it will be executed exactly for that filesystem. If it is directly specified in the sync, it will be executed once for each filesystem.
There is a mechanism for ordering the execution of the event handlers within one event. Please read parsley.syncengine.sync.Sync.executeevent() for more details about the ordering and more about the internals.
Customizable Parts¶
If you want to override or enhance some parts of the default behavior, read the following parts:
parsley.aspect.abstractaspect.Aspect is the base class of your implementation if you want to develop a part of logical behavior, like ‘copy a file to somewhere in some situations’. Inspect the sources in parsley.aspect for lots of practical examples. Read also about how to put your Custom Aspects into a configuration.
parsley.filesystem.abstractfilesystem.Filesystem is the base class of your own filesystem implementation, which allows usage of a filesystem that is neither the local one nor another supported one. Inspect the sources in parsley.filesystem for examples.
parsley.preparation.abstractpreparation.Preparation is the base class for sync preparations, which will be enabled before the synchronization runs and disabled afterwards by the Parsley engine. One typical use case is mounting a remote filesystem. Inspect the sources in parsley.preparation for examples.
High Level Customization¶
The parsley.aspect.highlevelcustomization.HighLevelCustomization aspect allows including Python code pieces from somewhere in the synchronization directory tree at defined places in the synchronization behavior.
This is very convenient for including some automation tasks, e.g. automatically converting some kinds of files whenever they appear or reacting in any other custom way to filesystem updates.
Include this aspect in your configuration in order to use this feature. The graphical interface has a guide for it as well.
Python Imports¶
It is possible to implement own stuff in external Python modules and use those classes and functions from within the configuration (e.g. as a different filesystem type). Specify a Python Import for such a class or function.
Appendix: Command Line¶
The parsley command-line tool understands this syntax:
parsley [options]*
Options can be some of the following:
--sync [syncname] : Runs the synchronization of syncname. Use ALL for syncname in order to run all synchronizations.
--listsyncs : Lists all available synchronization configurations.
--datadir [dirpath] : Uses dirpath as control data storage directory instead of the default one (~/.parsley).
--configfile [configfile] : Uses configuration file configfile instead of the default one (~/.parsley/parsley.xml).
--createconfig : Creates a fresh configuration file (can be combined with datadir).
--forcesync [syncname] : Marks the synchronization syncname for forceful synchronization, even if the time interval has not elapsed yet. Can be used more than once.
--lock [pid] : Just acquires the lock, so no other synchronization run will actually do anything until you unlock. Used for backup. Without a pid, the lock must be unlocked and refreshed every 10 minutes!
--unlock : Releases a lock acquired with --lock.
Appendix: Installation¶
Install Parsley via the installation package for your environment, if a suitable one exists for download. This also takes care of installing dependencies and doing preparation (unless mentioned otherwise in the installation procedure). After the installation, you can skip the rest of this section.
Source Code Archive¶
Use the source code archive as fallback. Extract it to a location that is convenient to you (Windows users need an external archive program; for example the great ‘7-Zip’ tool). Also take a look at the Dependencies for external stuff you need to install as well.
It is highly recommended to also establish a command line link or alias for parsley/parsley.py so you just have to type parsley (ln -s ...parsley/parsley.py /usr/local/bin/parsley on Unix or any other operating system specific way). Do the same for parsley/parsley_gui.py, parsley/parsley_infssync_manageconflicts.py and parsley/parsley_infssync_manageconflicts_gui.py. This is according to what the installation packages do and required for executing the exact same commands as used in this manual (otherwise you must substitute the full name for the short command names in this manual).