RePol

From NI4OS wiki
Revision as of 12:19, 5 January 2022 by Branko Marovic (talk | contribs)
Jump to navigation Jump to search

RePol Registry Policy Generator - Description and Documentation

repol.ni4os.eu

Introduction

A trustworthy repository should have a transparent policy, informing users about the roles, responsibilities, rights and procedures aimed at ensuring that their deposited data are preserved and disseminated in line with the FAIR and Open principles. Having a repository policy is required for the onboarding of scientific repositories and data into NI4OS-Europe, OpenAIRE and EOSC. As this is directly linked to the policies governing repositories, the University of Belgrade Computer Centre (RCUB) has developed repository policy generator RePol as a tool to facilitate the making of key choices and drafting of policy documents.

Specification and exchange of metadata in repositories have been largely defined in the context of the related OpenAIRE, EOSC Enhance, EOSpilot and FAIRsFAIR efforts. At the same time, it was found that along with the lack of awareness that a repository should have a clearly defined policy, deciding on the rules of repositories’ operation and expressing these decisions through comprehensive repository policies is a major problem for their owners and administrators.

RePol – Repository Policy Generator is an open-source web application that guides users through the process of defining policies for repositories and web-based services. It helps in defining and maintaining comprehensive and clear repository and privacy policies. Generated privacy policies are suitable for any kind of service. RePol uses a step-by-step wizard and self-explanatory forms to guide users through the process. By choosing options in a form, users shape a policy document with predefined policy clauses formulated in line with the current best practice. The resulting policy document may be downloaded as an XML file, additionally customized, and integrated into the service or repository. The collected data and the key elements of the generated policy are provided in a machine-readable format. This allows for an automated interpretation of created policies and extraction of repository-level metadata for inclusion in registries, catalogues and various operational and data discovery tools. The resulting policy document may be downloaded, additionally edited, and integrated into a repository.

The main purpose of this extensible web application is to generate policies for repositories, but it can be configured to generate any other type of document, due to the versatile nature of its configurable forms and FreeMarker templates. In addition to the creation of repository policies, RePol helps to author generic privacy policies suitable for any kind of service.

The developed tool will hopefully help repository owners in the creation of policies that they must publish and in the alignment of provided services with requirements for participation and onboarding in the above-mentioned infrastructures.

Usage

The main purpose of this web application is to generate policies for repositories, but it can be used to generate any other type of document, due to the versatile nature of its configurable forms and FreeMarker templates.

A policy is generated in four simple steps:

  • Upload – If the user wants to reuse or edit an existing settings or policy file generated by RePol, this step offers to upload the previously generated document or standalone XML.
  • Select – A document to be created is selected here. Current choices include 'Repository Policy' and 'Privacy Policy'.
  • Enter Data – The options and values used in the produced document are entered here.
  • Get Results – The user can download the created document or the content of all edited forms, clear all data, and start over.


By using triggers and conditions, changes of values in form input elements can make panels (groups of input elements) or input elements themselves appear, disappear or change values, making a form act more like a wizard, assisting the user in selecting appropriate values.

One instance of RePol can have multiple forms and corresponding templates. Data is shared among forms to avoid having to enter it more than once.

Access, data entry and document creation

The user can access RePol through a web browser and without any authentication. Once a session is established, it is used to store data from form input elements. After starting the wizard and passing over optional loading of previously saved data, the user selects a policy to be generated. Each form shows how many mandatory and optional fields are filled. The document is generated after the relevant fields are filled. Some fields may be shared across several forms.

Editing of generated documents

The generated HTML document is a draft that must be read and edited by the user before it is considered final. The sections that must be revised are clearly marked in the produced document. If the user wishes to re-upload and re-use the document in RePol, they should refrain from deleting or altering the portion between tags because that is where the machine-readable data is stored. Keeping the original drafts produced by RePol simplifies comparison between drafts, and drafts and manual edits.

Saving all forms

At any time, the user can download an XML (standalone) file containing data for all policies – the latest values of the input elements, from all the forms at once. Each generated document also contains its data in a machine-readable format.

Later changes of data and documents

Both standalone XML exports and generated documents can be re-uploaded to the RePol and their data parsed to fill the appropriate input elements in all forms that have them, allowing to update the produced draft policy, in line with new choices and improvements of RePol templates. All user edits of the earlier created text will have to be repeated in the new document.

Location and support

The baseline implementation of RePol is fully operational and is available at https://repol.ni4os.eu. It is available for evaluation, use and comments by users, which can be sent to repol@rcub.bg.ac.rs.

At this point, the RePol team is not able to directly assist individual users in tailoring repository policies but may tailor the tool to address the needs of a larger number of repositories. The repository owners may obtain the support for their ca(u)se by working along these lines:

  • Using ReRol to create a draft policy; if in doubt, several options can be tried and the produced outputs compared.
  • The chosen draft is saved as a comparison baseline.
  • The user adapts it to accommodate their needs.
  • The baseline draft and customised text are sent to repol@rcub.bg.ac.rs with a brief explanation of the changes that could be added to RePol or its policy template.
  • The RePol team reviews the proposal, updates the tool or at least comments on the proposal and provided customisation.
  • The user recreates the RePol draft and uses it as a new baseline, or merges the changes in it into their customised text; these texts can be compared against the baseline or mutually by using the Compare feature of their word processor, or a diff tool such as https://text-compare.com/ or https://www.diffchecker.com/.

Addressed requirements

The requirements addressed by RePol include the following:

  • The platform is flexible and can customise the output. The tool asks for a few mandatory customisation inputs, several choices, and a few options related to primary choices. Based on these inputs, it creates a corresponding draft repository policy document by tailoring a carefully crafted and redacted template. Questions, inputs, offered choices and values and provided explanations are customizable and are configured through the management of application configuration files and without any additional programming. The used data model reflects the evolving user-facing content, options and choices. The repository policy documents are generated after the user provides a few mandatory inputs (such as repository and owner name) and makes several explained and selectable choices and options. Some options are nested, i.e., some choices are shown only if choices or options they depend on are selected.
  • Choices and inputs offered to users are clear and well-explained, with reasonable defaults, well-handled mutual dependencies and, where possible, validity checks. Some of the offered values are complemented with an open-ended entry field, e.g., when specifying the thematic areas, types and languages of the repository content.
  • The policy template text is concise, clear and aligned with the current best practice, but also adaptable to specific user needs.
  • The collected data and the key elements of the generated policy are provided in a machine-readable format. This allows for an automated interpretation of created policies and extraction of repository-level metadata for inclusion in registries, catalogues and various operational and data discovery tools. Repository policies will be updated as needed to keep them in line with changes in internal and external rules, requirements, norms and conventions. Machine readability permits to easily save user choices and update or modify a previously created registry policy.
  • Users may additionally edit, customise or reformat the generated policy document using their regular document editing tools, as policy documents are produced in an accessible and editable text format. It is assumed that such edits are minor and that they do not affect the validity of associated machine-readable data.

Key features and design decisions

RePol uses a lightweight online form to guide the user through the process of defining a repository policy. By choosing options in the form, the user chooses sets of predefined policy clauses formulated in line with the current best practice. The resulting policy document may be downloaded, additionally customized, and integrated into the repository.

While the policy template must be able to accommodate occasional extensions and new options, the used software platform has to be flexible enough to handle both current requirements and future adaptations of policies. This is achieved by a modular design, which makes it easy to extend the data entry forms and document templates with new sections, options, values and rules.

The development of the underlying policy template was started by collecting a set of representative repository policies, comparing them against the frequently used OpenDOAR “Minimum” and “Optimum” templates, and combining their most essential elements into an integral form with options and alternatives. The resulting composite document was then updated to accommodate the terminology and conventions adopted in the context of recent infrastructure development in Europe. The entire alternative sub-policies needed to be developed to accommodate Creative Commons licenses or policy-related requirements of the Core Trust Seal (CTS) certification framework. The template has been tailored to produce a lightly formatted and easy to edit and publish HTML format.

The data that needs to be provided to generate policy documents, which are supposed to be publicly available anyway, is not particularly sensitive. However, the RePol tool at this point applies a radical approach to sensitive data protection and privacy. The tool does not persist any user-provided data beyond the users’ web browser session and all data is returned to the user within the generated draft policy document, without any local data saving or logging. Although this slightly reduces the capabilities related to the analysis and profiling of RePol, we hope that the resulting simplicity of access to the tool will overweight the drawbacks, particularly during the initial popularisation of the tool. The user-provided data is handed back to the user by being embedded as hidden metadata within the HTML mark-up of the produced policy document or exported in a standalone XML document, both of which can be imported back into the tool. This allows the user to upload a previously saved policy and update or change it.

Plans and potential uses

At this point, it is not foreseen that RePol will be localised to any of NI4OS-Europe languages, but the NI4OS-Europe team will evaluate the feasibility of this endeavour, especially as part of the National Open Science Cloud priorities and the activities performed by the networks of EOSC Promoters and NI4OS-Europe Translation Officers.

Updates may be also performed to the Repository Policy Generator depending on feedback and evaluation during current use. Special-purpose policy templates could be developed for some specific needs. Immediately useful would be the extension of the tool to the generation of other policy and repository configuration-related and potentially machine-readable artifacts, including those associated with various registries and validators. RePol will be aligned as needed with requirements and emergent demands to create artifacts for EOSC services in development, such as EOSC OS Monitor and OS Policy Registry. The policy template will also need to be readjusted to evolving norms in expectations for repository and service policies and the service certification context. The data used to create repository policies could be extended and exchanged with the Agora-SP service portfolio management tool used by NI4OS-Europe and EOSC platform. Similar policy templates and data entry forms, based on the same approach, rationale and modular structure, could be developed for generic services. Furthermore, the developed software platform could be applied to other similar applications, such as the creation of Privacy Notice, Terms of Use or Intellectual Property Policy documents, but special care should be taken to avoid duplication.

The technical implementation of RePol is general and quite independent from its current application related to repository policies, it may be adopted by other groups in need of similar specific or generic functionality, which would facilitate its further development. Although RePol has already been designed to be highly configurable, its potential application for other purposes would require a further separation of core code and functionality from the application-specific configuration, templates and branding.

Implementation details

RePol is a Java web application (using EE Web API 7.0) built upon Java Server Faces 2.2 framework, with PrimeFaces 7 components. It generates documents using FreeMarker 2.3 library.

When the user fills in a form and requests the corresponding document, all related user-provided data is validated and conditions evaluated, the compiled data model passed on to the FreeMarker library along with the template, which generates the customised document and passes it for submission to the user’s web browser.

Architecture

RePol architecture
  • Dashboard – Singleton bean containing the chain of initiation of all vital components in the correct order. Should any of them fail, Dashboard will switch the entire application into the malfunction state and prevent users from accessing action pages (with forms or lists of forms, where user interaction occurs). The administrator can access the dashboard page and reload the failed component after fixing its faulty configuration. The same mechanism can be used to reload the configuration without having to restart the entire server or otherwise re-deploy the application.
  • FormFactory – A class responsible for the creation of forms based on template-forms.xml configuration file. It’s one of the vitals components initialized and monitored by Dashboard.
  • FMHandler – Wrapper class connecting the FreeMarker library to the Dashboard and instances of the Form class to generate documents. It needs to be reloaded through Dashboard, after modifying template files.
  • DataShare – In earlier versions, this bean was session-scoped, but after the last mayor refactoring, the entire session was reorganized. Now this bean has been incorporated into a session-scoped SessionController bean that also caches all forms opened by the user. Still, the purpose of DataShare remains the same and that is sharing values of form input elements between elements with the same id, belonging to different forms, allowing the data to be re-used.

Configuration

Configuration files and their parameters are:

settings.cfg (properties file)
This file contains the following parameters:
  • templatePath – path to the directory containing .ftlh template files
  • authenticationPin – pin used to access the hidden panel for reloading configuration (/faces/dashboard.xhtml)
  • version – version of the RePol instance, accessible from the FreeMarker template (${repol_version})
selection-lists.xml
XML file containing all lists of pre-defined values used by some input elements. Each list must have a unique identifier by which it is referenced by the list-id attribute of an input element. Each list element has a human-readable label and an actual value that is being used in the template. Actual values should be selected so that they can be used in sentences.
template-forms.xml
This XML file contains forms, their identifiers, labels and descriptions. Each form is comprised of panels. The panel is used to group input elements. Input elements are used to receive user-provided data of the appropriate types. Each form input element must have an id unique within the form. Data is shared between forms by having input elements with the same id. If an input element has a list of pre-defined values, it must also have a list-id attribute.

To set up a working instance of RePol, it is necessary to make an arbitrarily placed directory on the server, place FreeMarker templates in it (*.ftlh) and make them readable for the operating system account used by the application. Before deploying the application, it is also necessary to configure repol.TemplatePath parameter in settings.cfg to point to the template directory and to configure in template-forms.xml the associations between forms and corresponding templates. By accessing a hidden panel, an administrator can reload templates and forms without having to restart the server or re-deploy the entire application.

Forms are interactive. Selecting or entering values can trigger changes in other input elements, make them appear, disappear or change values when pre-configured conditions are met. These conditions are formulated as logical expressions of input element values being equal to, containing a given value or being empty or any of the things above negated. They are configured as trees of XML elements, each with a unique condition identifier assigned by the person who set up the form. Each input element can contain triggers that alter values in other input elements when a specified condition is met. If an input element has a condition specified, it is only visible when that condition is met. Conditions are also accessible from FreeMarker templates as Boolean constants that are evaluated at the time of document generation.

Given the complex nature of the form configuration, the application validates it when it is initialised by its container. If it fails to start, the administrator should look at the log file of the servlet container server to figure out what went wrong.

Detailed Architecture

RePol architecture details

More

More more

Dependencies

Name Maven dependancy (groupId, artifactId, version) License
PrimeFaces Bootstrap Theme org.primefaces.themes bootstrap 1.0.10 MIT
PrimeFaces 7.0 org.primefaces primefaces 7.0 MIT
Oracle's Implementation of the JSF 2.2 Specification com.sun.faces jsf-api 2.2.0 CDDL+GPL
Oracle's Implementation of the JSF 2.2 Specification com.sun.faces jsf-impl 2.2.0 CDDL+GPL
JavaEE Web API javax javaee-web-api 7.0 CDDL+GPL/GPL 2.0
JavaServer Pages Standard Tag Library (JSTL) javax.servlet jstl 1.2 CDDL+GPL 2.0
FreeMarker org.freemarker freemarker 2.3.30 Apache License 2.0

Version history

  • xxx – Privacy Policy added
  • xxx – Usage flow changed to the step-by-step wizard, added tracking of filled/touched mandatory and optional fields, UI improvements
  • xxx – Minor UI improvements
  • xxx – Initial release

Use of Valuete

RePolu uses Valuate to TODO!!

Source code and license

RePol source code is available at https://github.com/RestlessDevil/RePol/. If you want to contribute, please get in touch with Vasilije AKA RestlessDevil via repol@rcub.bg.ac.rs or by creating an issue or pull request.

RePol is licensed under the EUPL (European Union Public License, Version 1.2 or later, EUPL-1.2-or-later).

All GPL-licensed (sometimes stated as GPL 2.0) components RePol is depending upon are also dual-licensed under CDDL (sometimes stated as CDDL 1.1), which makes them upstream compatible with EUPL, as stated in the Matrix of EUPL compatible open source licences

FreeMarker is licensed under Apache License 2.0, while PrimeFaces and PrimeFaces Bootstrap Theme are licensed under the MIT license. Both are permissive licenses that do not force derivative works to use the same licence.

RePol does not modify, merge or re-distribute any of the components it uses.

Team

Vasilije Rajović, Milica Ševkušić, Branko Marović