RePol

RePol Repository Policy Generator - Description and Documentation

repol.ni4os.eu

Introduction

A trustworthy repository should have a transparent policy, informing users about the roles, responsibilities, rights and procedures aimed at ensuring that their deposited data are preserved and disseminated in line with the FAIR and Open principles. A repository policy is required to onboard scientific repositories into NI4OS-Europe and EOSC service catalogues.

Specifications and exchange of repository metadata have largely been defined within the OpenAIRE, EOSC Enhance, EOSCpilot and FAIRsFAIR efforts. At the same time, it was gradually established that every repository should have a clear policy, describing its operation and what users may expect when they decide to use it. Expressing these decisions and writing a comprehensive repository policy, if taken seriously, proved to be a major problem for repository owners and managers. Service policies, and privacy and repository policies in particular, are required for services in the EOSC Marketplace, are increasingly expected by end-users, and are mandated by regulators and assessors and required for service certification. They also describe a set of rules and practices that the organisational management, repository administrators and technical staff need to establish and enforce. As elements of policies directly depend on decisions on how the repositories are to be governed and the available options are limited, the drafting of a policy can be automated to a large extent. That is why the University of Belgrade Computer Centre (RCUB) decided to provide a tool that would facilitate the making of key choices and drafting of policy documents.

RePol – Repository Policy Generator is an open-source web application that helps the user create and maintain comprehensive and clear repository and privacy policies. Generated privacy policies are suitable for any kind of online service. A step-by-step wizard and self-explanatory forms guide the user through the policy-defining process. By choosing among available options, the user shapes a policy document with clauses formulated in line with the current best practice. With the resulting policies, the resource owner can more easily align the service with GDPR requirements as well as those for onboarding and participation in open-science infrastructures.

The produced document may be downloaded as an XML file and additionally customized or edited before it is published with the service or repository. Individual policy elements are provided in a machine-readable format, allowing for an automated interpretation of created policies and metadata extraction by registries, catalogues and various operational, data discovery and workflow tools.

RePol specifically aims to help service owners in meeting the requirements of participating and onboarding to EOSC. This extensible web application can be configured to generate any type of policy document, due to the versatile nature of its configurable forms and templates.

Usage

The main purpose of this web application is to generate policies for repositories, but it can be used to generate any other type of document, due to the versatile nature of its configurable forms and FreeMarker templates.

A policy is generated in four simple steps:

  • Upload – If the user wants to reuse or edit the existing settings or policy file generated by RePol, this step offers to upload the previously generated standalone XML or policy document in HTML.
  • Select – A policy type for the document to be created is selected here. Current choices include 'Repository Policy' and 'Privacy Policy'.
  • Enter Data – The options and values used in the produced document are entered here.
  • Get Results – The user can download the created document or the content of all edited forms, clear all data, and start over.

By using triggers and conditions, changes of values in form input elements can make panels (groups of input elements) or input elements themselves appear, disappear or change values, making a form act more like a wizard, assisting the user in selecting appropriate values.

One instance of RePol can have multiple forms and corresponding templates. Data is shared among forms to avoid having to enter it more than once.

RePol Usage Demo

Access, data entry and document creation

The user can access RePol through a web browser and without any authentication. Once a session is established, it is used to store data from form input elements. After starting the wizard and passing over the optional loading of previously saved data, the user selects a policy to be generated. Each form shows how many mandatory and optional fields are filled. The document is generated after the relevant fields are filled. Some fields may be shared across several forms.

Editing of generated documents

The generated HTML document is a draft that must be read and edited by the user before it is considered final. The sections that must be revised are clearly marked in the produced document. If the user wishes to re-upload and re-use the document in RePol, they should refrain from deleting or altering the portion between tags because that is where the machine-readable data is stored. Keeping the original drafts produced by RePol simplifies the comparison between versions, drafts and manual edits.

Saving all forms

At any time, the user can download an XML (standalone) file containing data for all policies – the latest values of the input elements, from all the forms at once. Each generated document also contains its data in a machine-readable format.

Later changes of data and documents

Both standalone XML exports and generated documents can be re-uploaded to RePol. Their data is then parsed to fill the matching input elements in all forms that have them, allowing the user to update the produced draft policy in line with new choices and improvements of RePol templates. All user edits of the earlier created text will have to be repeated in the new document.
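
The save and re-upload cycle described above can be pictured as a round trip between form values and an XML document. The following is a minimal sketch using only the JDK's XML APIs; the element names (form-data, field), the id attribute and the sample ids are assumptions made for illustration and are not RePol's actual export format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Round-trips input-element values through a standalone XML string (hypothetical format). */
class FormDataXml {

    /** Serialises element ids and values; special characters are escaped by the serialiser. */
    static String export(Map<String, String> values) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("form-data");      // assumed root element name
            doc.appendChild(root);
            for (Map.Entry<String, String> entry : values.entrySet()) {
                Element field = doc.createElement("field");     // assumed element name
                field.setAttribute("id", entry.getKey());
                field.setTextContent(entry.getValue());
                root.appendChild(field);
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write form data", e);
        }
    }

    /** Parses a previously exported string back into id/value pairs, keeping document order. */
    static Map<String, String> load(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList fields = doc.getElementsByTagName("field");
            Map<String, String> values = new LinkedHashMap<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                values.put(field.getAttribute("id"), field.getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read form data", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("repository_name", "Example Repository");   // hypothetical ids and values
        values.put("owner_name", "Example University");
        String xml = export(values);
        System.out.println(xml);
        System.out.println("Round trip preserved data: " + load(xml).equals(values));
    }
}
```

Because the values travel with the user rather than staying on the server, any such format has to survive a full export and re-import without loss, which is what the final check in main exercises.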

Location and support

The baseline implementation of RePol is fully operational and is available at https://repol.ni4os.eu. It is available for evaluation, use and comments by users, which can be sent to repol@rcub.bg.ac.rs.

At this point, the RePol team is not able to directly assist individual users in tailoring repository policies but may tailor the tool to address the needs of a larger number of repositories. Repository owners may obtain support for their needs or customisations by working along these lines:

  • Using RePol to create a draft policy; if in doubt, several options can be tried and the produced outputs compared.
  • The chosen draft is saved as a comparison baseline.
  • The user adapts it to accommodate their needs.
  • The baseline draft and customised text are sent to repol@rcub.bg.ac.rs with a brief explanation of the changes that could be added to RePol or its policy template.
  • The RePol team reviews the proposal, updates the tool or at least comments on the proposal and provided customisation.
  • The user recreates the RePol policy draft and uses it as a new baseline, or merges the changes in it into their customised text; these texts can be compared against the baseline or mutually by using the Compare feature of their word processor, or a diff tool such as https://text-compare.com/ or https://www.diffchecker.com/.

Addressed requirements

The requirements addressed by RePol include the following:

  • The platform is flexible and can customise the output. The tool asks for a few mandatory customisation inputs, several choices, and a few options related to primary choices. Based on these inputs, it creates a corresponding draft repository policy document by tailoring a carefully crafted and edited template. Questions, inputs, offered choices and values and provided explanations are customizable and are configured through the management of application configuration files, without any additional programming. The data model reflects the evolving user-facing content, options and choices. The repository policy documents are generated after the user provides a few mandatory inputs (such as repository and owner name) and makes several explained and selectable choices and options. Some options are nested, i.e., some choices are shown only if the choices or options they depend on are selected.
  • Choices and inputs offered to users are clear and well-explained, with reasonable defaults, well-handled mutual dependencies and, where possible, validity checks. Some of the offered values are complemented with an open-ended entry field, e.g., when specifying the thematic areas, types and languages of the repository content.
  • The policy template text is concise, clear and aligned with the current best practice, but also adaptable to specific user needs.
  • The collected data and the key elements of the generated policy are provided in a machine-readable format. This allows for an automated interpretation of created policies and extraction of repository-level metadata for inclusion in registries, catalogues and various operational and data discovery tools. Repository policies will be updated as needed to keep them in line with changes in internal and external rules, requirements, norms and conventions. Machine readability permits easily saving user choices and updating or modifying a previously created repository policy.
  • Users may additionally edit, customise or reformat the generated policy document using their regular document editing tools, as policy documents are produced in an accessible and editable text format. It is assumed that such edits are minor and that they do not affect the validity of associated machine-readable data.

Key features and design decisions

RePol uses a lightweight online form to guide the user through the process of defining a repository policy. By choosing options in the form, the user chooses sets of predefined policy clauses formulated in line with the current best practice. The resulting policy document may be downloaded, additionally customized, and integrated into the repository.

While the policy template must be able to accommodate occasional extensions and new options, the used software platform has to be flexible enough to handle both current requirements and future adaptations of policies. This is achieved by a modular design, which makes it easy to extend the data entry forms and document templates with new sections, options, values and rules.

The development of the underlying policy template was started by collecting a set of representative repository policies, comparing them against the frequently used OpenDOAR “Minimum” and “Optimum” templates, and combining their most essential elements into an integral form with options and alternatives. The resulting composite document was then updated to accommodate the terminology and conventions adopted in the context of recent infrastructure development in Europe. Entire alternative sub-policies needed to be developed to accommodate Creative Commons licenses or policy-related requirements of the Core Trust Seal (CTS) certification framework. The template has been tailored to produce a lightly formatted HTML document that is easy to edit and publish.

The data that needs to be provided to generate policy documents, which are supposed to be publicly available anyway, is not particularly sensitive. However, the RePol tool at this point applies a radical approach to sensitive data protection and privacy. The tool does not persist any user-provided data beyond the user's web browser session and all data is returned to the user within the generated draft policy document, without any local data saving or logging. Although this slightly reduces the capabilities related to the analysis and profiling of RePol, we hope that the resulting simplicity of access to the tool will outweigh the drawbacks, particularly during the initial popularisation of the tool. The user-provided data is handed back to the user by being embedded as hidden metadata within the HTML mark-up of the produced policy document or exported in a standalone XML document, both of which can be imported back into the tool. This allows the user to upload a previously saved policy and update or change it.

Plans and potential uses

At this point, it is not foreseen that RePol will be localised to any of the NI4OS-Europe languages, but the NI4OS-Europe team will evaluate the feasibility of this endeavour, especially as part of the National Open Science Cloud priorities and the activities performed by the networks of EOSC Promoters and NI4OS-Europe Translation Officers.

The Repository Policy Generator may also be updated depending on feedback and evaluation during current use. Special-purpose policy templates could be developed for some specific needs. An immediately useful step would be to extend the tool to generate other policy and repository configuration-related and potentially machine-readable artifacts, including those associated with various registries and validators. RePol will be aligned as needed with requirements and emergent demands to create artifacts for EOSC services in development, such as EOSC OS Monitor and OS Policy Registry. The policy template will also need to be readjusted to evolving norms and expectations for repository and service policies and the service certification context. The data used to create repository policies could be extended and exchanged with the Agora-SP service portfolio management tool used by NI4OS-Europe and the EOSC platform. Similar policy templates and data entry forms, based on the same approach, rationale and modular structure, could be developed for generic services. Furthermore, the developed software platform could be applied to other similar applications, such as the creation of Privacy Notice, Terms of Use or Intellectual Property Policy documents, but special care should be taken to avoid duplication.

The technical implementation of RePol is general and quite independent of its current application to repository policies. It may be adopted by other groups in need of similar specific or generic functionality, which would facilitate its further development. Although RePol has already been designed to be highly configurable, its potential application for other purposes would require a further separation of core code and functionality from the application-specific configuration, templates and branding.

Implementation details

RePol is a Java web application (using EE Web API 7.0) built upon the JavaServer Faces 2.2 framework, with PrimeFaces 7 components. It generates documents using the FreeMarker 2.3 library.

When the user fills in a form and requests the corresponding document, all related user-provided data is validated and the conditions are evaluated. The compiled data model is then passed to the FreeMarker library along with the template; FreeMarker generates the customised document, which is sent to the user’s web browser.
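
The sequence above (validate, evaluate conditions, compile a data model, render) can be sketched in plain Java. To keep the example free of external libraries, a naive ${name} substitution stands in for FreeMarker here; it is not FreeMarker and supports none of its directives. The class, method and field names are illustrative assumptions; only repol_version is taken from the configuration description on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of the generation pipeline with a toy renderer in place of FreeMarker. */
class GenerationSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Step 1: reject the request if a mandatory field is missing or blank. */
    static void validate(Map<String, String> input, List<String> mandatoryIds) {
        for (String id : mandatoryIds) {
            String value = input.get(id);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing mandatory field: " + id);
            }
        }
    }

    /** Steps 2 and 3: evaluate named conditions to Booleans and merge them with the values. */
    static Map<String, Object> buildModel(Map<String, String> input,
            Map<String, Predicate<Map<String, String>>> conditions, String version) {
        Map<String, Object> model = new LinkedHashMap<>(input);
        conditions.forEach((id, condition) -> model.put(id, condition.test(input)));
        model.put("repol_version", version);
        return model;
    }

    /** Step 4 (stand-in): replace each ${name}; unknown names render as "null". */
    static String render(String template, Map<String, Object> model) {
        return PLACEHOLDER.matcher(template).replaceAll(
                m -> Matcher.quoteReplacement(String.valueOf(model.get(m.group(1)))));
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("repository_name", "Example Repository");   // hypothetical ids and values
        input.put("embargo", "yes");

        Map<String, Predicate<Map<String, String>>> conditions = new LinkedHashMap<>();
        conditions.put("has_embargo", values -> "yes".equals(values.get("embargo")));

        validate(input, List.of("repository_name"));
        Map<String, Object> model = buildModel(input, conditions, "4.2");
        System.out.println(render(
                "${repository_name}: embargo supported = ${has_embargo} (RePol ${repol_version})",
                model));
    }
}
```

Keeping validation and model building separate from rendering means the template engine only ever sees a complete, already checked data model.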

Architecture

RePol architecture
  • Dashboard – Singleton bean containing the chain of initiation of all vital components in the correct order. Should any of them fail, Dashboard will switch the entire application into the malfunction state and prevent users from accessing action pages (with forms or lists of forms, where user interaction occurs). The administrator can access the dashboard page and reload the failed component after fixing its faulty configuration. The same mechanism can be used to reload the configuration without having to restart the entire server or otherwise re-deploy the application.
  • FormFactory – A class responsible for the creation of forms based on the template-forms.xml configuration file. It is one of the vital components initialized and monitored by Dashboard.
  • FMHandler – Wrapper class connecting the FreeMarker library to the Dashboard and instances of the Form class to generate documents. It needs to be reloaded through Dashboard, after modifying template files.
  • DataShare – In earlier versions, this bean was session-scoped, but after the last major refactoring, the entire session was reorganized. Now, this bean has been incorporated into a session-scoped SessionController bean that also caches all forms opened by the user. Still, the purpose of DataShare remains the same and that is sharing values of form input elements between elements with the same id, belonging to different forms, allowing the data to be re-used.
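
The Dashboard behaviour described in the list above (ordered initialisation, a malfunction state on failure, reload after a fix) might look roughly like the sketch below. The methods of Monitorable and all names in DashboardSketch are assumptions; the page only states that monitored components implement a Monitorable interface. For simplicity, a reload here re-runs the whole chain rather than a single component.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of an initialisation chain that flags a malfunction when a component fails. */
class DashboardSketch {

    /** Hypothetical shape of a monitored component. */
    interface Monitorable {
        String name();
        void init() throws Exception;
    }

    private final List<Monitorable> chain;
    private Monitorable failed;   // first component that failed, or null when all is well

    DashboardSketch(List<Monitorable> componentsInOrder) {
        this.chain = new ArrayList<>(componentsInOrder);
    }

    /** Initialises components in order and stops at the first failure. Also used for reloads. */
    void start() {
        failed = null;
        for (Monitorable component : chain) {
            try {
                component.init();
            } catch (Exception e) {
                failed = component;
                return;
            }
        }
    }

    /** Action pages would be blocked while this returns true. */
    boolean isMalfunctioning() {
        return failed != null;
    }

    String failedComponent() {
        return failed == null ? null : failed.name();
    }

    public static void main(String[] args) {
        boolean[] configFixed = {false};
        Monitorable lists = new Monitorable() {
            public String name() { return "ListFactory"; }
            public void init() { /* nothing to fail in this sketch */ }
        };
        Monitorable forms = new Monitorable() {
            public String name() { return "FormFactory"; }
            public void init() throws Exception {
                if (!configFixed[0]) throw new Exception("faulty template-forms.xml");
            }
        };

        DashboardSketch dashboard = new DashboardSketch(List.of(lists, forms));
        dashboard.start();
        System.out.println("Malfunction: " + dashboard.isMalfunctioning()
                + ", failed: " + dashboard.failedComponent());

        configFixed[0] = true;    // the administrator fixes the configuration
        dashboard.start();        // and reloads without restarting the server
        System.out.println("Malfunction after reload: " + dashboard.isMalfunctioning());
    }
}
```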

Configuration

Configuration files and their parameters are:

settings.cfg (properties file)
This file contains the following parameters:
  • templatePath – path to the directory containing .ftlh template files
  • authenticationPin – pin used to access the hidden panel for reloading configuration (/faces/dashboard.xhtml)
  • version – version of the RePol instance, accessible from the FreeMarker template (${repol_version})
selection-lists.xml
XML file containing all lists of pre-defined values used by some input elements. Each list must have a unique identifier by which it is referenced by the list-id attribute of an input element. Each list element has a human-readable label and an actual value that is being used in the template. Actual values should be selected so that they can be used in sentences.
template-forms.xml
This XML file contains forms, their identifiers, labels and descriptions. Each form is comprised of panels. The panel is used to group input elements. Input elements are used to receive user-provided data of the appropriate types. Each form input element must have an id unique within the form. Data is shared between forms by having input elements with the same id. If an input element has a list of pre-defined values, it must also have a list-id attribute.
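
Since settings.cfg is a properties file, it can be read with the standard java.util.Properties class. The sketch below checks that the three parameters listed above are present; the sample values are placeholders, and whether RePol itself loads the file this way is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: load settings.cfg and fail early when a documented parameter is missing. */
class SettingsSketch {
    private static final String[] REQUIRED = {"templatePath", "authenticationPin", "version"};

    static Properties parse(Reader source) {
        Properties settings = new Properties();
        try {
            settings.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED) {
            if (settings.getProperty(key) == null) {
                throw new IllegalStateException("Missing parameter in settings.cfg: " + key);
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        // Placeholder content; a real deployment would read the file from disk.
        String cfg = "templatePath=/opt/repol/templates\n"
                + "authenticationPin=change-me\n"
                + "version=4.2\n";
        Properties settings = parse(new StringReader(cfg));
        System.out.println("Templates are read from " + settings.getProperty("templatePath"));
    }
}
```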

To set up a working instance of RePol, create a directory anywhere on the server, place the FreeMarker templates (*.ftlh) in it and make them readable for the operating system account used by the application. Before deploying the application, it is also necessary to configure the repol.TemplatePath parameter in settings.cfg to point to the template directory and to configure in template-forms.xml the associations between forms and corresponding templates. By accessing a hidden panel, an administrator can reload templates and forms without having to restart the server or redeploy the entire application.
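
Before deployment, it can be useful to confirm that the template directory is usable by the account running the application. The following sketch lists the readable .ftlh files in a directory with java.nio.file; it is a hypothetical pre-deployment check, not part of RePol, and the default path in main is a placeholder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: verify that a template directory exists and contains readable .ftlh files. */
class TemplateDirCheck {

    /** Returns the sorted names of readable .ftlh files, or throws if the directory is unusable. */
    static List<String> readableTemplates(Path dir) {
        if (!Files.isDirectory(dir) || !Files.isReadable(dir)) {
            throw new IllegalStateException("Template directory is missing or unreadable: " + dir);
        }
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.ftlh")) {
            for (Path template : stream) {
                if (Files.isReadable(template)) {
                    names.add(template.getFileName().toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Run as the same operating system account as the application server.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/opt/repol/templates");
        System.out.println("Readable templates: " + readableTemplates(dir));
    }
}
```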

Forms are interactive. Selecting or entering values can trigger changes in other input elements, making them appear, disappear or change values when pre-configured conditions are met. These conditions are formulated as logical expressions that test whether an input element's value equals a given value, contains a given value or is empty, or the negation of any of these. They are configured as trees of XML elements, each with a unique condition identifier assigned by the person who set up the form. Each input element can contain triggers that alter values in other input elements when a specified condition is met. If an input element has a condition specified, it is only visible when that condition is met. Conditions are also accessible from FreeMarker templates as Boolean constants that are evaluated at the time of document generation.
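
A condition tree of this kind can be modelled as small composable predicates over the current input values. The sketch below covers the tests named above (equal to, contains, empty, negation); the allOf and anyOf combinators, the substring reading of "contains" and all identifiers are assumptions, since the XML element names used by RePol are not given here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of condition trees evaluated against input values keyed by element id. */
class ConditionSketch {

    interface Condition {
        boolean test(Map<String, String> values);
    }

    static Condition equalTo(String id, String expected) {
        return values -> expected.equals(values.get(id));
    }

    /** Assumption: "contains" is treated as a substring test on the stored value. */
    static Condition contains(String id, String part) {
        return values -> values.get(id) != null && values.get(id).contains(part);
    }

    static Condition empty(String id) {
        return values -> values.get(id) == null || values.get(id).isEmpty();
    }

    static Condition not(Condition inner) {
        return values -> !inner.test(values);
    }

    static Condition allOf(Condition... parts) {
        return values -> Arrays.stream(parts).allMatch(part -> part.test(values));
    }

    static Condition anyOf(Condition... parts) {
        return values -> Arrays.stream(parts).anyMatch(part -> part.test(values));
    }

    public static void main(String[] args) {
        // Hypothetical rule: show the licence panel only for open access with a licence chosen.
        Condition showLicencePanel = allOf(equalTo("access", "open"), not(empty("licence")));

        Map<String, String> values = new LinkedHashMap<>();
        values.put("access", "open");
        System.out.println("Visible before licence is chosen: " + showLicencePanel.test(values));
        values.put("licence", "CC BY 4.0");
        System.out.println("Visible after licence is chosen: " + showLicencePanel.test(values));
    }
}
```

The same evaluated results could be placed in the template data model as Booleans, which matches the statement that conditions are visible to templates at generation time.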

Given the complex nature of the form configuration, the application validates it when it is initialised by its container. If it fails to start, the administrator should look at the log file of the servlet container server to figure out what went wrong.
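
Two of the configuration rules stated earlier lend themselves to an automated start-up check: element ids must be unique within a form, and every list-id must refer to an existing list. The sketch below illustrates such a check on a simplified in-memory model; the Input class and the message wording are assumptions and do not reflect RePol's actual validation code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: validate element ids and list references of one configured form. */
class FormConfigCheck {

    /** Minimal stand-in for a configured input element; listId is null when no list is used. */
    static final class Input {
        final String id;
        final String listId;

        Input(String id, String listId) {
            this.id = id;
            this.listId = listId;
        }
    }

    /** Returns human-readable problems; an empty list means both checks passed. */
    static List<String> validate(String formId, List<Input> inputs, Set<String> knownListIds) {
        List<String> problems = new ArrayList<>();
        Set<String> seenIds = new HashSet<>();
        for (Input input : inputs) {
            if (!seenIds.add(input.id)) {
                problems.add(formId + ": duplicate element id '" + input.id + "'");
            }
            if (input.listId != null && !knownListIds.contains(input.listId)) {
                problems.add(formId + ": element '" + input.id
                        + "' references unknown list '" + input.listId + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Input> inputs = Arrays.asList(
                new Input("repository_name", null),
                new Input("languages", "language-list"),
                new Input("repository_name", null),        // duplicate id
                new Input("subjects", "subject-list"));    // list not defined
        Set<String> lists = new HashSet<>(Arrays.asList("language-list"));
        validate("repository", inputs, lists).forEach(System.out::println);
    }
}
```

Collecting all problems instead of stopping at the first one gives the administrator a complete picture from a single look at the log.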

Detailed Architecture

RePol architecture details

SessionController is a session scoped bean that is created whenever a user starts a new session. It contains a DataShare bean that maps FormElement identifiers to the last modified elements in all opened forms. When a form is requested by the user, SessionController uses FormFactory to create a new form instance for the given document_id (GET parameter), making it available to the user but also caching it for future use, and it is going to be available as long as the session lasts, or until the user deletes all of the data.
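
The interplay between form caching and data sharing can be sketched as follows. This is a simplified model under stated assumptions: forms are reduced to maps of element ids to string values, the shared map stores values rather than references to the last modified elements, and the factory function stands in for FormFactory. None of the names below are taken from RePol's source code.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch of a per-session form cache with values shared between elements of the same id. */
class SessionSketch {

    /** Stand-in for a form: element ids mapped to their current values. */
    static final class Form {
        final String documentId;
        final Map<String, String> values = new LinkedHashMap<>();

        Form(String documentId, String... elementIds) {
            this.documentId = documentId;
            for (String id : elementIds) {
                values.put(id, "");
            }
        }
    }

    private final Map<String, Form> openedForms = new HashMap<>();  // cache for this session
    private final Map<String, String> shared = new HashMap<>();     // DataShare role
    private final Function<String, Form> factory;                   // FormFactory role

    SessionSketch(Function<String, Form> factory) {
        this.factory = factory;
    }

    /** Returns the cached form, creating and pre-filling it on first request. */
    Form form(String documentId) {
        return openedForms.computeIfAbsent(documentId, id -> {
            Form created = factory.apply(id);
            created.values.replaceAll((elementId, old) -> shared.getOrDefault(elementId, old));
            return created;
        });
    }

    /** Stores a value and propagates it to the same element id in every opened form. */
    void setValue(String documentId, String elementId, String value) {
        if (!form(documentId).values.containsKey(elementId)) {
            throw new IllegalArgumentException("Unknown element: " + elementId);
        }
        shared.put(elementId, value);
        for (Form opened : openedForms.values()) {
            if (opened.values.containsKey(elementId)) {
                opened.values.put(elementId, value);
            }
        }
    }

    /** Mirrors the user deleting all of the data. */
    void clear() {
        openedForms.clear();
        shared.clear();
    }

    public static void main(String[] args) {
        SessionSketch session = new SessionSketch(id -> "repository".equals(id)
                ? new Form(id, "owner_name", "repository_name")
                : new Form(id, "owner_name", "contact_email"));
        session.setValue("repository", "owner_name", "Example University");
        System.out.println(session.form("privacy").values);
    }
}
```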

FormFactory is a singleton responsible for generating forms when needed. It uses the template-forms.xml configuration file: it parses its own part of that file and passes the rest to the more specialized PanelFactory, which in turn calls FormElementFactory to generate specific FormElement subclasses. FormFactory implements the Monitorable interface and is part of Dashboard's initialization chain, so its template-forms.xml configuration file can be reloaded at any moment from the Dashboard page.

ListFactory (also Monitorable) is a factory for generating lists of pre-defined values for list-type FormElements (e.g., poolpicker, selectone, selectmany). It uses the selection-lists.xml configuration file.

FormElement is an abstract class that handles all of the UI interactions with the Form and is responsible for providing the value to the Map object that is then passed to the FreeMarker library to generate the document. Each specific input element extends FormElement, and this abstract class must be fully understood before introducing a new input element type, which is unlikely to be necessary.
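
To give a feel for the role of such a base class, the sketch below shows an abstract element with two concrete types that each contribute a value to the map handed to the template engine. It is a heavily reduced illustration: the method names, the two subclasses and the mandatory-field counter are assumptions, and the UI interaction that the real FormElement handles is left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of an abstract input element and two concrete element types. */
class FormElementSketch {

    abstract static class FormElement {
        final String id;
        final boolean mandatory;

        FormElement(String id, boolean mandatory) {
            this.id = id;
            this.mandatory = mandatory;
        }

        /** Value contributed to the template data model under this element's id. */
        abstract Object modelValue();

        /** Whether the user has supplied anything for this element. */
        abstract boolean isFilled();
    }

    static final class TextElement extends FormElement {
        String text = "";

        TextElement(String id, boolean mandatory) {
            super(id, mandatory);
        }

        @Override
        Object modelValue() {
            return text;
        }

        @Override
        boolean isFilled() {
            return !text.trim().isEmpty();
        }
    }

    static final class SelectManyElement extends FormElement {
        final List<String> selected = new ArrayList<>();

        SelectManyElement(String id, boolean mandatory) {
            super(id, mandatory);
        }

        @Override
        Object modelValue() {
            return new ArrayList<>(selected);   // copy, so the model is a snapshot
        }

        @Override
        boolean isFilled() {
            return !selected.isEmpty();
        }
    }

    /** Builds the map that would be passed to the template engine. */
    static Map<String, Object> toModel(List<FormElement> elements) {
        Map<String, Object> model = new LinkedHashMap<>();
        for (FormElement element : elements) {
            model.put(element.id, element.modelValue());
        }
        return model;
    }

    /** Counts filled mandatory elements, e.g. for a progress indicator on the form. */
    static long filledMandatory(List<FormElement> elements) {
        return elements.stream().filter(e -> e.mandatory && e.isFilled()).count();
    }

    public static void main(String[] args) {
        TextElement name = new TextElement("repository_name", true);
        SelectManyElement languages = new SelectManyElement("languages", false);
        name.text = "Example Repository";
        languages.selected.add("English");
        List<FormElement> elements = List.of(name, languages);
        System.out.println(toModel(elements));
        System.out.println("Mandatory fields filled: " + filledMandatory(elements));
    }
}
```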

Dependencies

Name | Maven groupId | Maven artifactId | Version | License
PrimeFaces Bootstrap Theme | org.primefaces.themes | bootstrap | 1.0.10 | MIT
PrimeFaces 7.0 | org.primefaces | primefaces | 7.0 | MIT
Oracle's Implementation of the JSF 2.2 Specification | com.sun.faces | jsf-api | 2.2.20 | CDDL+GPL
Oracle's Implementation of the JSF 2.2 Specification | com.sun.faces | jsf-impl | 2.2.20 | CDDL+GPL
JavaEE Web API | javax | javaee-web-api | 8.0.1 | CDDL+GPL/GPL 2.0
JavaServer Pages Standard Tag Library (JSTL) | javax.servlet | jstl | 1.2 | CDDL+GPL 2.0
FreeMarker | org.freemarker | freemarker | 2.3.31 | Apache License 2.0

Templates and parameters

Parameter and form field definitions are described here:


Version history

  • 4.2, January 2023 – Corrections in the formatting of user TODOs, language and some phrases; added runtime execution monitoring alerts (runtime, not software change) for better reliability.
  • 4.1, April 2022 – Recent maintenance changes with several minor improvements: adaptive use of English articles before service and institution names in wizards and templates; sharing of names and values of common parameters in forms, templates and XML; more detailed logging of user actions.
  • 4, January 2022 – Upgraded dependency versions and reinstalled the entire production server to use Java 11 and Tomcat 9 Servlet Container. The Privacy Policy is now official and fully supported. UI and usability enhancements and changes in templates and input forms such as introductory and preparatory explanations, elaboration of reasons for data collection, children’s policy and further elaboration of cookies usage. Detailed RePol-based Privacy Policy and separate and much more detailed ToU, linked to the detailed survey "Feedback on NI4OS-Europe tools". Documentation: described detailed architecture and classes.
  • 3, June 2021 – Session is fully reorganized to cache requested forms, faster UI, and the entire workflow is reorganized to be more like a wizard with 4 steps. Introduced experimental Privacy Policy template. Documentation: described RePol purpose in the Introduction and four steps in Usage.
  • 2, February 2021 – Forms definitions rendered as human-readable specifications, also allowing template and form configuration files to be downloaded for inspection. Applied EUPL1.2-or-later. Documented dependencies and their licenses, and elaborated support and handling of suggestions and template improvement proposals.
  • 1, December 2020 – Initial release containing only Repository Policy.

Source code and license

RePol source code is available at https://github.com/RCUB-Official/RePol. If you want to contribute, please get in touch with Jovana via repol@rcub.bg.ac.rs.

RePol is licensed under the EUPL (European Union Public License, Version 1.2 or later, EUPL-1.2-or-later).

All GPL-licensed (sometimes stated as GPL 2.0) components that RePol depends upon are also dual-licensed under CDDL (sometimes stated as CDDL 1.1), which makes them upstream compatible with EUPL, as stated in the Matrix of EUPL compatible open source licences.

FreeMarker is licensed under Apache License 2.0, while PrimeFaces and PrimeFaces Bootstrap Theme are licensed under the MIT license. Both are permissive licenses that do not force derivative works to use the same licence.

RePol does not modify, merge or re-distribute any of the components it uses.

Team

Vasilije Rajović, Milica Ševkušić, Jovana Vuleta-Radoičić, Branko Marović