Hands-on workshop on FOSS and tools

From NI4OS wiki
Revision as of 19:36, 30 January 2023 by Branko Marovic

Basic public information

The workshop was organised by the UoB, RBI and SRCE on November 30th and December 1st 2022. It was held online for remote participants, while the attendees from Belgrade and Zagreb gathered in person at the UoB and SRCE. The areas addressed during the workshop included:

  • Use of open-source software in science
  • Developing and onboarding tools and services for open science
  • Developing and improving FOSS research software or a service and increasing its outreach
  • FOSS licence selection: assessment, improvements and recommendations
  • Quality and improvements of data about services and resources
  • Improving research and researcher data

The event and enrolment details are at https://ni4os.eu/hands-on-workshop-touching-on-data-and-open-source-software-for-open-science/, with registration at https://events.ni4os.eu/event/85/. The tentative work period on both dates was from 10:00 to 15:00.

General description

Besides bringing direct benefits to the participants, this event helped cast new light on existing concerns and problems related to the advancement of open science and the use of free and open-source software (FOSS) in science, by capturing perspectives from within the NI4OS-Europe communities and generating ideas related to research software, tools and practical approaches. It is a step that should help the participants and stakeholders in solving the subject issues; it also serves as an instrument to promote the tackled topics and improve the general knowledge about them.

Participants

The goal is to familiarise the participants with the covered concepts, available NI4OS-Europe services and the advantages of FOSS. Participants may be newcomers to the topics; the goal is to ideate and increase the awareness of working across various needs, concerns and perspectives. It will hopefully provide all participants with an opportunity to learn something new about OS data and research governance, by developing ideas and initial prototypes on the offered themes. Primary participant profiles include:

  • Researchers with their own research software, tools and services;

  • Other researchers and students working on case studies proposed by mentors or provided by the first group.

The workshop is targeting researchers, but also software, services and tools developers who have created or are about to produce software, tools and services to support their research process. Most researchers do not share such products, either because they do not know how or because they believe that not sharing is a way to prevent uncredited and inappropriate use. The idea is to raise awareness of the benefits of sharing in line with the FAIR principles and the EOSC Rules of Participation. The final goal is to motivate them to prepare their products for onboarding to the EOSC Portal and Marketplace, but also to contribute to the development and use of FOSS research software within their communities.

Participants are supposed to creatively work on solving the problems in OS related to the quality of data on research, OS services, and FOSS software and its use. Groups of typically 3 to 5 members will work on a specific problem of their choice and within one of three selected themes, simultaneously or alternating at several locations. Using their laptops and whiteboards they discuss and design a suitable PoC of a solution for the chosen problem. The goal is not to come up with an MVP or directly applicable solution for a real-life problem but with a relevant concept that could improve data about research and OS services and the adoption of FOSS in OS. Groups work in parallel but may consult with other groups working on a similar problem as they go.

Participants are expected to bring their existing ideas related to the topics described below. They may even choose to work on that topic alone but should be ready to discuss it with other event participants, adjunct mentors, and stakeholders. Those with suitable proposals will get an opportunity to briefly present them by giving short idea pitches with the leader’s name and affiliation, a problem statement, the solution, and the skills/help needed.

Key information and agenda: Hands-on workshop: Touching on Data and Open-Source Software for Open Science

Registration: https://zoom.us/meeting/register/tJ0scuGhpj8tHtI8soGjKOCVuBKAIV76hrjt

Pre-workshop survey: https://events.ni4os.eu/event/85/surveys/11?token=fc531e82-a0aa-492c-94fd-3039ef27dae3

Locations

  • Rektorat Univerziteta u Beogradu, Studentski 1, Beograd, Serbia
  • SRCE, Josipa Marohnića 5, Zagreb, Croatia
  • Online (Zoom)

Wi-Fi (@UoB, @SRCE)

SSID: xxx

PWD: xxx

Detailed agenda

Day 1 (30 November 2022)

10:00 – General intro about the workshop – goals, organisation, overall process, agenda – Branko Marović

10:10 – Open Science, FAIR, EOSC and NI4OS-Europe – Biljana Kosanović

10:20 – Software development for open science – Antica Čulina

10:40 – Collaborative software development; Git and how it differs from scientific repositories – Dubravko Penezić

11:00 – Open-source software licences and their use in open science – Branko Marović

11:20 – Coffee break

11:40 – Data on open science and how to link and cross-check it – Vladimir Otašević

12:00 – From a FOSS project to a service (service documentation and management, policies, service catalogues and registration requirements) – Milica Ševkušić

12:20 – Case study: onboarding of a scientific visualisation service – Davor Davidović

12:30 – Licences for scientific outputs – Panagiota Koltsida

12:50 – Expectations from projects and their topics

13:20 – Lunch break

13:50 – Introduction of mentors and establishment of groups

14:00 – [Rooms and tables] Brainstorming and initial work on projects

15:00 – Brief presentation of project concepts (up to 5 minutes each)

15:30 – Close of Day 1

17:00-21:00 [Rooms] (Optional)

Day 2 (1 December 2022)

Initially planned

10:00 – Reconvening and comments about Day 1

10:15 – [Rooms and tables] Work on projects

11:20 – Coffee break

11:40 – [Rooms and tables] Work on projects

12:50 – Lunch break

13:30 – Presentation of project results

14:30 – Discussion and wrap-up

15:00 – Close of Day 2

Adapted to participants' interests: Day 2 practical work, linear, without break-out rooms

  1. Git basics (Dubravko Penezić@Srce)

    1. Creation of the repository, commit, push, pull
  2. Example of R project (Marija Purgar@IRB)

    1. Basics
    2. Application of Git skills acquired in the previous part (Dubravko)
  3. Integrating the project into a Jupyter Notebook computational document
  4. Choosing a licence, etc. (Branko Marović@UoB)
  5. Publishing a version of software on Zenodo (push from GitHub) (Alen Vodopijevec@IRB)
  6. Technical and operational aspects of a web service (Davor Davidović/Alen@IRB)
  7. Establishing an EOSC service (Milica Ševkušić@UoB)

    1. Key service and policy decisions
    2. Terms of use, privacy policy using RePol
    3. NI4OS-Europe AGORA form showcase

Points for participants

Learning outcomes

  • Improved awareness of the importance of research outputs beyond publications
  • The ability to wrap up an internally used tool as a product that can be used by a wider community
  • The basic understanding of the EOSC ecosystem and the potential place of research software/tools in it
  • Knowing how to use NI4OS tools
  • Knowing how to publish software under an open licence
  • Understanding software licensing

General benefits

  • Greater openness and transparency of the research process
  • Collaboration
  • More reusable free and open-source software/tools are available
  • Open science and EOSC promotion

Benefits for researchers

  • Better visibility of research software/tools due to metadata, PIDs and the ability to cite
  • Improved sustainability: a tool/software that is only internally used is difficult to maintain and keep up-to-date and compatible with current operating systems; if the code is open, users may continue improving and updating the software
  • Added value to the original software/tool
  • Citations and acknowledgement

Benefits for students

  • Understanding the EOSC ecosystem: even if they are not interested in academic software, they will learn that both free and commercial tools intended for researchers should onboard to the EOSC Portal (and comply with its requirements)
  • Specific skills
  • Information about other career opportunities beyond the corporate world
  • Information about the possibilities of FOSS (e.g., it can be commercialised)

Prepared topics

Descriptions of primary topics should be prepared in advance. Instructions for new proposals:

To stimulate the initial thinking about additional ideas, a set of pre-prepared topic descriptions will be shared before the event, so that participants can join in with more concrete proposals or outline complementary ones. Proposers will be cut off after one minute to respect the audience’s time. Proposers should design their pitches not as recruiting but as advertising how interesting the work is going to be.

Quality and improvements of data about services and resources

EOSC Marketplace, NI4OS-Europe Agora and other open science catalogues describe resources, services and organisations with many attributes described in the EOSC Profile specification at https://eosc-portal.eu/providers-documentation/eosc-resource-profile, https://eosc-portal.eu/providers-documentation/eosc-provider-portal-provider-profile, and jointly at https://zenodo.org/record/5726890. All provider and resource descriptions collected within the NI4OS-Europe Agora catalogue (https://agora.ni4os.eu/) are exposed as JSON documents via dedicated APIs: https://agora.ni4os.eu/api/v2/public/resources/ and https://agora.ni4os.eu/api/v2/public/providers/.
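As a starting point for working with such JSON documents, the following minimal Python sketch reports which attributes are missing or empty in each record. The attribute names (`name`, `description`, `webpage`, `helpdesk_email`) and the sample data are hypothetical; the real names would have to be taken from the API responses and the EOSC Profile specification.

```python
import json

# Hypothetical attribute names; the real ones must be taken from the
# catalogue's JSON output and the EOSC Profile specification.
REQUIRED_FIELDS = ("name", "description", "webpage", "helpdesk_email")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required attributes that are absent or empty in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def completeness_report(records, required=REQUIRED_FIELDS):
    """Map each incomplete record's name (or position) to its missing attributes."""
    report = {}
    for position, record in enumerate(records):
        gaps = missing_fields(record, required)
        if gaps:
            report[record.get("name") or f"record #{position}"] = gaps
    return report


# Invented sample standing in for a downloaded JSON document.
SAMPLE = json.loads("""
[
  {"name": "Service A", "description": "Demo",
   "webpage": "https://a.example.org", "helpdesk_email": "help@a.example.org"},
  {"name": "Service B", "description": "  ", "webpage": "https://b.example.org"}
]
""")

print(completeness_report(SAMPLE))
```

In this invented sample only the second record is reported, because its description is blank and it has no helpdesk address.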

Furthermore, information available via the Agora API could be combined with data stored in the NI4OS-Europe GOCDB database. The GOCDB API is available at https://gocdb.ni4os.eu/gocdbpi/public/. By merging data from these two sources, one can associate GOCDB endpoints with Agora's resources and providers. Similarly, one can use the Argo monitoring system API (https://api.devel.argo.grnet.gr/api/v3) to check the latest monitoring status of a particular resource and, by combining all available information, assess the quality of data collected during the onboarding process.
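One possible way to associate endpoints with catalogue resources is to compare the host names in their URLs. The sketch below assumes that both sources have already been downloaded and parsed into lists of dictionaries; the keys `name`, `webpage` and `url` are hypothetical, and matching on host names is only a heuristic chosen for illustration.

```python
from urllib.parse import urlparse


def hostname_of(url):
    """Return the lower-cased host of a URL, or None if it has no host."""
    host = urlparse(url.strip()).hostname
    return host.lower() if host else None


def link_endpoints_to_resources(resources, endpoints):
    """Associate endpoints with resources that share the same host name.

    The keys "name", "webpage" and "url" are hypothetical placeholders for
    whatever the two sources actually expose.
    """
    by_host = {}
    for resource in resources:
        host = hostname_of(resource.get("webpage", ""))
        if host:
            by_host.setdefault(host, []).append(resource["name"])

    linked, unmatched = {}, []
    for endpoint in endpoints:
        names = by_host.get(hostname_of(endpoint.get("url", "")), [])
        if names:
            for name in names:
                linked.setdefault(name, []).append(endpoint["url"])
        else:
            unmatched.append(endpoint["url"])
    return linked, unmatched


# Invented sample data.
resources = [
    {"name": "Repo", "webpage": "https://repo.example.org/"},
    {"name": "Viz", "webpage": "https://viz.example.org/app"},
]
endpoints = [
    {"url": "https://REPO.example.org/oai"},
    {"url": "https://other.example.org/"},
]
linked, unmatched = link_endpoints_to_resources(resources, endpoints)
print(linked, unmatched)
```

Endpoints left in `unmatched` are candidates for manual review, since they may point to resources that are missing from the catalogue or described with a different address.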

Data validations and checks should align with the descriptions and criteria from the EOSC Profiles specification, but could also go beyond its basic requirements. In practice, they could be implemented by describing and designing controls applied during data entry, on-request creation of reports, or continual or periodic monitoring and consistency checks. For example, data in NI4OS-Europe Agora could be managed and improved by checking the EOSC Resource Profile compliance, but also by extending these checks to be more precise, address the needs, expectations or conventions of related communities, or by merely suggesting and describing anything that is missing or suitable for a better description of resources, services or organisations and that would facilitate their EOSC onboarding.

These checks may be performed by scanning, parsing or comparing data or doing anything that could be conducted by following and accessing the provided links or references, metadata or other data that services publish on their sites (e.g., documentation, service and privacy policies) or elsewhere. The applied approaches include machine validation of data, links and references, assisted data entry and human-based validation of attributes. These checks may include regular expressions, heuristics, hints, warnings, validation of web resources, and reports. The contributions may also provide suggestions for more detailed descriptions, expected or offered values, or new or changed classifications of the expected data or the content it refers to.
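The combination of regular expressions, heuristics, hints and warnings described above could look like the following Python sketch. It works offline on a single record; the attribute names in `CHECKS` are hypothetical, and the email pattern is deliberately loose, so it catches obvious mistakes but does not prove that an address is valid.

```python
import re
from urllib.parse import urlparse

# Deliberately loose pattern: it catches obvious mistakes only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_url(value):
    """Return a list of issues found in a URL attribute (empty if none)."""
    parts = urlparse(value)
    issues = []
    if parts.scheme not in ("http", "https"):
        issues.append("error: not an http(s) URL")
    elif parts.scheme == "http":
        issues.append("warning: plain http, consider https")
    if not parts.hostname:
        issues.append("error: no host name")
    return issues


def check_email(value):
    """Return a list of issues found in an email attribute (empty if none)."""
    if EMAIL_RE.match(value):
        return []
    return ["error: does not look like an email address"]


# Hypothetical mapping of attribute names to the checks applied to them.
CHECKS = {
    "webpage": check_url,
    "privacy_policy": check_url,
    "helpdesk_email": check_email,
}


def validate(record):
    """Return {attribute: [issues]} for the attributes that have issues."""
    findings = {}
    for field, check in CHECKS.items():
        value = record.get(field)
        if not value:
            findings[field] = ["hint: attribute is empty"]
            continue
        issues = check(value)
        if issues:
            findings[field] = issues
    return findings


# Invented sample record.
record = {
    "webpage": "http://svc.example.org",
    "privacy_policy": "svc.example.org/privacy",
    "helpdesk_email": "help@example",
}
print(validate(record))
```

A fuller prototype could add a separate step that follows the links and checks that the referenced pages exist, which is left out here to keep the sketch independent of network access.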

Available information could also be used to design or prototype infrastructure and service monitoring that would not only check data about described services but also confirm that they are available and perform satisfactorily, or measure user satisfaction.

Developing and improving FOSS research software or service and increasing its outreach

This topic is for researchers and final-year students who are working with data of wider relevance and who are planning or developing related research software. It is applicable not only to already established developments but also to proof-of-concept or early-stage projects. The participants may discuss how to process or collect similar data, the necessary features, how to spread the solution, what FOSS licence to apply, or how to promote it or turn it into a service or shared community development. They can also brainstorm within the group on how to improve usability, presentation, interface or visualisations. Also relevant in this endeavour are FAIRification and preparation for onboarding of research software, tools and services, that is, the application of FAIR principles to the used or generated data and the subject software, since the principles have also been adapted for research software.

Also related to this is the preparation of a software tool or service for onboarding into a wider open science ecosystem. In terms of technology readiness levels (TRLs, https://en.wikipedia.org/wiki/Technology_readiness_level), the transition from TRL6 to TRL7 is particularly difficult and sensitive.

Preparation of a tool for onboarding – elements:

  • Summary description
  • Service request types and their brief differentiation and explanation
  • Define new technical solutions or improve/re-think existing
  • Any backend work by supporting staff
  • Use of open or popular data formats
  • What happens with the outputs/results and how to make them FAIR
  • Possible integration of outputs with other elements of the OS infrastructure
  • Prepare a service for NI4OS-Europe onboarding
  • Policies – key elements (RePol for Privacy, elements analogous to those of repository policies?)

Recommendations:

  • Make source code public and use the publicly accessible and versioned repository from the very beginning
  • Adopt a licence (preferably FOSS) and comply with the licensing requirements of all dependencies and contributors
  • Provide basic metadata by registering software in a relevant (community) registry to make it easy to discover
  • Establish clear and transparent contribution, communication and governance workflows
  • Enable citation of software (some archiving services meet this requirement)
  • Meet the standards of the domain community – stick to community expectations in terms of conventions, formats used to read and write data, functionality, terminology and other domain practices.
  • Use a software quality checklist to assess used components and to evaluate and enhance your software. Also, assess and consult the target community. Some factors to consider:

    • Community support and adoption (popularity, reputation, size, communication channels, and involvement)
    • Documentation
    • Costs (licence, training, support, etc.)
    • Licensing conditions
    • Operational characteristics (independence from other software, development language, portability, compliance and testability…)
    • Maturity
    • Quality (reliability, performance, modularity, maintainability, code quality, architecture…)
    • Perceived risks related to confidentiality, integrity, availability…
    • Trustworthiness (of components, architecture and platform, provider reputation, available 3rd-party assessments…)

This topic is for researchers who have already developed and are using their custom solutions in their research. The typical process phases are:

  • Prepare the online service for onboarding

    • Using NI4OS tools described in the introduction
    • Learn how to meet NI4OS-Europe/EOSC requirements
    • Publish the tools/service in the NI4OS catalogue
  • Creating an online service out of an existing offline software tool

    • Exploring the ways for making the tool web accessible
    • Publishing the code and making it available for other researchers to contribute

      • Choosing a licence
      • Ways of citing the software and getting a proper attribution

Work modalities for both researchers and final-year students are:

  • Collaborate with the previous group in their work and help find the optimal solutions

OR

  • Work independently on the real-world examples provided by the previous group and compare solutions afterwards.

Stimulating the use of open-source software in science

This theme covers several subjects related to the use of FOSS research software, especially software developed by researchers, and its establishment as an integral element of research that is essential for research reproducibility and progress. It also extends to FOSS-based services that support research, and to services that support the use or development of FOSS in open science.

It includes the following matters related to the use of open-source software in and around research, all of which certainly cannot be covered by a single team:

  • Opening up and FOSSification of research software
  • Spreading of knowledge about FOSS and tools for communities
  • Addressing obstacles for researchers and developers creating and using FOSS research software, FOSS-based services and tools that support research
  • Fostering collaboration among researchers and developers to improve research or open science FOSS; do we need specialised FOSS-related tools or collaborative environments?
  • Fostering collaboration among researchers by using FOSS
  • Planning where and how to deposit research software
  • Producing records about software in repositories
  • Citing and referencing research software
  • Helping scientific and open science services to more resolutely adopt FOSS or to better interact with FOSS tools and plug-ins
  • Policies and incentives for the use of FOSS in science

FOSS licence selection: assessment, improvements and recommendations

A self-assessment focused on software research/project inputs and outputs can be conducted by using the License Clearance Tool (LCT, https://lct.ni4os.eu/). It can be used to identify suitably licensed sources or to select compatible licences for research products. As the LCT aims to cover licensing of various types of research objects, it can also be used to support the selection and management of FOSS (free and open-source software) licensed research software. Other exemplary tools include https://ufal.github.io/public-license-selector/ and https://choosealicense.com/. With all existing tools, the main issue in licence selection is the presence of legal ambiguities, in particular those related to licence compatibility, which impedes fully deterministic all-inclusive licence selection.

Based on the functionality of this tool and possibly other utilities and available guidelines, participants could explore what should be improved in the LCT or what similar but more specialised solutions could be produced to facilitate the development and application of FOSS in research. Potential areas of work include the user interface, supporting explanations, new features, documentation, general guidance, improved situational and analytical reports to be generated by the tools, or completely new designs for target scenarios.
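To make the compatibility problem more tangible, the following Python sketch intersects the sets of outbound licences that are commonly considered acceptable for each inbound licence. The table is illustrative and incomplete: it uses SPDX identifiers, reflects commonly cited positions for a handful of licences only, and is a simplification for discussion, not legal advice and not a description of how the LCT works.

```python
# Illustrative and incomplete table: for each inbound licence, the outbound
# licences under which a combined work is commonly considered distributable.
# A simplification for discussion, not legal advice.
COMPATIBLE_OUTBOUND = {
    "MIT": {"MIT", "Apache-2.0", "GPL-2.0-only", "GPL-3.0-only"},
    "Apache-2.0": {"Apache-2.0", "GPL-3.0-only"},
    "GPL-2.0-only": {"GPL-2.0-only"},
    "GPL-3.0-only": {"GPL-3.0-only"},
}


def candidate_licences(inbound):
    """Return the outbound licences compatible with all inbound licences.

    A licence missing from the table raises KeyError, so that unknown cases
    are examined by a person instead of being guessed.
    """
    candidates = set(COMPATIBLE_OUTBOUND)  # start from every known licence
    for licence in inbound:
        candidates &= COMPATIBLE_OUTBOUND[licence]
    return candidates


print(sorted(candidate_licences(["MIT", "Apache-2.0"])))
print(sorted(candidate_licences(["Apache-2.0", "GPL-2.0-only"])))
```

An empty result signals a conflict among the inputs that no outbound licence in the table can resolve, which is exactly the kind of situation where a tool can only warn and supporting explanations are needed.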

Improving research and researcher data

Consolidation, improvement and linkage of data and metadata about researchers, institutions, publications, projects and research data in repositories and catalogues. Examples include comparison, matching, cross-checking, consolidation, and aggregation of information on authorship, affiliation, citations, references, etc. Participants will be challenged to analyse records with descriptive metadata. Their task will be to determine whether some of the metadata could be enriched with external data by using metadata to create linkable data and how reliable and useful such enrichments would be. The discussion will be based on considerations of further integration of research objects with existing infrastructure and the difficulties of conducting a complex analysis of basic metadata by relying on data that does not follow any standards or recommendations. Participants will brainstorm on mechanisms for assessment, evaluation, normalisation, cross-validation, monitoring, etc. This work assumes familiarity with terms such as DOI, ORCID, ROR, and citation DB.

Presenting differences between applying solutions founded on a custom-made workaround or adopting standardised recommendations. Showing the importance of adequately following standardised recommendations (e.g., FAIR principles).

Metadata describes particular objects (e.g., research outputs, projects, researchers, institutions, etc.). Metadata can be enriched with additional integrated information that makes it linkable. Linkable metadata are useful for integrating external services, e.g.:

  • DOI, handle or other PIDs for identifying research output
  • ORCID for identifying researchers
  • ROR for identifying institutions
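Before identifiers such as these are used for linking, their syntax can be checked automatically. The Python sketch below tests whether a string looks like a DOI and verifies the check character of an ORCID iD, which is computed with the ISO 7064 MOD 11-2 algorithm. The DOI pattern covers the common modern form only, so some older DOIs will not match it; the DOI used in the example is made up.

```python
import re

# Common modern DOI form: "10.", a registrant code, "/", and a suffix.
# Some older DOIs do not match this pattern.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def is_plausible_doi(value):
    """Check only the syntax of a bare DOI (no resolver prefix)."""
    return bool(DOI_RE.match(value.strip()))


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value):
    """Check the format and the final check character of an ORCID iD."""
    value = value.strip().upper()
    if not ORCID_RE.match(value):
        return False
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))   # sample iD used in ORCID documentation
print(is_plausible_doi("10.1234/example.5678"))  # made-up DOI
```

A syntactically valid identifier may still point to the wrong person or object, so such checks complement, rather than replace, cross-checking against the identifier registries.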

Identifying objects (research outputs, researchers, institutions) is just the first step. Properly identified objects can be further enriched with additional descriptive and linkable metadata and then used for conducting complex processes and analyses that go beyond identification. A well-designed metadata schema that follows the related standards and recommendations can support establishing more complex procedures. For researchers and institutions, those procedures are used for managing different processes to fulfil requirements imposed by others (e.g., institutions, government, project funders, etc.). Some of those procedures are:

  • Assessing researcher output
  • Evaluating project successfulness
  • Monitoring the productivity of institutions and researchers

If research outputs, researchers and institutions are identified properly then it is possible to determine relations between them and perform further analyses. Those relations can be used for comparison, matching, and consolidation by cross-checking.

In the context of repositories and the existing scholarly infrastructure, standardised protocols (such as OAI-PMH) are needed for exchanging data and avoiding the loss of information kept in enriched metadata. Information about researchers (e.g., ORCID) and research outputs (e.g., PIDs) can be useful to others, especially if the research includes researchers from different institutions.
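As an illustration of how such a protocol exposes identifiers, the following Python sketch builds an OAI-PMH `ListRecords` request and extracts `dc:identifier` values from a response in the `oai_dc` format. The base URL and the sample response are invented; a real harvester would also download the responses, handle errors and follow resumption tokens until the list is complete.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def extract_identifiers(response_xml):
    """Return {OAI identifier: [dc:identifier values]} from a ListRecords response."""
    root = ET.fromstring(response_xml)
    result = {}
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", namespaces=NS)
        result[oai_id] = [e.text for e in record.iterfind(".//dc:identifier", NS)]
    return result


# Invented, heavily shortened response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>https://doi.org/10.1234/example.1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repo.example.org/oai"))
print(extract_identifiers(SAMPLE_RESPONSE))
```

The plain `oai_dc` format has no dedicated fields for ORCID or ROR identifiers, which is one reason why information kept in richer metadata can be lost in exchange.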

Internal backing details

The event is cooperative, not competitive. It should have a strengthening effect on the community in terms of understanding the covered topics and creating new possibilities for collaboration between participants. Some of the developed ideas may even be applicable in practice.

The event is not called a ‘hackathon’ in order not to repel participants who might link the term ‘hackathon’ with programming or the actual development of a software solution. It should provide an opportunity for participants to improve in the areas they are interested in or are at least in line with their primary interests. The participants should be stimulated to acquire new skills or perspectives, and not pressured to deliver unrealistic goals.

Open issues

  • Partners are to recruit researchers and students. How could WP7 assist in this?
  • How many participants? – Depending on the organisers’ capacity. Prospective participants will need to register in advance, so the adjustment of capacities and additional recruitment could be performed as needed.

For groups developing software, any collaborative/versioning platform could be used, but also a repository for archiving production releases and providing persistent identifiers. GitHub+Zenodo is a preferred and convenient solution. See https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content.

Key organisation and logistics elements

  • The event will be held near the end of November 2022.
  • There were difficulties with the open call – similar problems are expected here.
  • Partners’ engagement in the event will be covered through the NI4OS-Europe call for additional effort via WP4 and WP7.
  • Cost coverage by NI4OS-Europe (venue, drinks, snacks) through organising partners.
  • The event sub-site is set up within the NI4OS-Europe site.
  • The first public call should be distributed one month before the event.
  • Use the EOSC symposium that precedes the event to promote it.
  • Most participants register seven days before the event.

Registration/enrolment and preliminary grouping are based on participant interests and self-description. Make sure that group members come from different institutions and that the teams are of about equal size.
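One simple way to obtain groups of about equal size with mixed institutions is to order participants by institution and deal them out one by one. The Python sketch below shows only this part; it ignores participant interests, and the names and the choice of three groups are invented for illustration.

```python
from collections import defaultdict
from itertools import chain


def form_groups(participants, group_count):
    """Deal (name, institution) pairs into groups of near-equal size.

    Participants are first ordered by institution, so dealing them out one
    by one spreads colleagues from the same institution over different groups.
    """
    by_institution = defaultdict(list)
    for name, institution in participants:
        by_institution[institution].append(name)
    # Largest institutions first, so that they are spread most evenly.
    ordered = chain.from_iterable(
        sorted(by_institution.values(), key=len, reverse=True)
    )
    groups = [[] for _ in range(group_count)]
    for position, name in enumerate(ordered):
        groups[position % group_count].append(name)
    return groups


# Invented registrations.
participants = [
    ("Ana", "UoB"), ("Boris", "UoB"), ("Ceca", "UoB"),
    ("Dino", "SRCE"), ("Eva", "SRCE"), ("Filip", "RBI"),
]
print(form_groups(participants, 3))
```

Group sizes differ by at most one. Colleagues from the same institution end up in different groups as long as the institution has no more members than there are groups.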

The event will be held at several locations with the physical presence of participants, but with online introductory presentations. At every location, organisers should provide at least three supporters or mentors for the local groups.

Partners with fewer than 10 participants do not have to organise a local gathering, as their participants may join remotely. Partners with 10 or more should form the groups and provide them with suitable physical space for their group work and joint Zoom communication with other locations.

Optionally, the groups may split into separate rooms. The plenary space should have a projector. Group tables are large banquet tables or joint desks where people can sit on all sides. During the workshop part, people should be able to sit on one side and look at the projection canvas. They should not be interrupted except by the mediators’ facilitation of the group work and brief clarifications from stakeholders, but without making joint calls to them.

There should be enough power outlets (one for each group), reliable Wi-Fi, juices, water and coffee. One handout for participants per table with the Wi-Fi info and schedule. Snacks/cakes?

Participants can use whatever tools they want during their work.

Project leaders should be recruited among the known participants a few days before the first meeting, to discuss how to make progress on their projects, and work on tasks for newcomers who may need help.

Three days before the first session, send a logistics email to registered participants. It should remind them about the event, location and schedule, ask them to bring a laptop and charger, point to preparatory reading (if any), and mention the possibility of bringing project ideas to work on during the event.

Preparation of topics

Intentionally introduce several topics so that no one is an ‘expert’ in all of them. For each topic, the big picture should be given first so that anyone feels invited to the activity and can contribute. Also, provide some direct hooks to available data, tools, or mechanisms so that those who are technically inclined or focused on details could do something specific. However, these details should be offered in a way that makes them directly available regardless of the tools the participants have at their disposal so that they feel they can meaningfully contribute and get something from the effort regardless of their skills and skill levels.

The ready-for-work topics should be defined and prepared in advance, with sample data and tools that could be used and some coarse ideas or examples for the methods and results. Projects revolve around problems that the subject matter experts bring to the table. The topic descriptions (initial elaboration) part of the event will also serve as a workshop/training component.

The presentation of the themes should bridge the gap between the domain of the problem and the ways it could be addressed. The workshop presentations should introduce participants to the subjects and induce short discussions, but each should not take more than 45 minutes (with 15 minutes allocated for deliberation about issues in the field); additional details, if any, should be a part of the prepared projects. There may be 15-minute breaks in between and a lunch break before the group work.

Mentors who are to prepare these topics should be recruited early enough. The topic should be aligned with the mentors’ expertise. Ideally, each topic has a mentor at each location. If several groups share one topic, the responsible mentor is assisted by additional supporters.

Datasets or tools to be used or reviewed are prepared and made available in advance. If programming is offered, the contribution should be based on simple and clear interfaces/APIs where additions or plug-ins could be injected. The build environment should be easy to set up and the codebase prepared in advance.

Takeaways, benefits, incentives

Participants will get participation certificates, symbolic gifts, exposure to subjects and means they could use in their later academic work, and, if possible, some free CPU time on our computing resources. The knowledge takeaways include information about pan-European R&E infrastructure, how the services are being incorporated and what is required from them and researchers, the use of open-source in general and in OS, and how the participants could apply it in their projects...

Prepare T-shirts in different sizes, mugs or other paraphernalia.

An extra incentive: selected teams will get an opportunity to work with the teams and developers of EOSC-related services.

Participant recruitment and registration

Cast a wide net when promoting the event and inviting potential participants. Make sure people are diverse enough in terms of background and gender. Try to ensure that the groups are diverse enough.

All participants should register in advance. Registration should be limited according to the maximum capacity, which depends on the room(s), tables and the number of mentors and prepared projects. Expect that only 2/3 of those who register will show up, so invite widely and limit registration at 150% of the maximum comfortable capacity.
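The two figures above fit together: if registration is capped at 150% of the comfortable capacity and two thirds of the registrants show up, the expected attendance equals the capacity (1.5 × 2/3 = 1). A short Python sketch of this arithmetic follows; the capacity of 40 is a made-up example.

```python
import math
from fractions import Fraction

OVERBOOKING = Fraction(3, 2)   # register up to 150% of comfortable capacity
SHOW_UP_RATE = Fraction(2, 3)  # expected share of registrants who attend


def registration_limit(capacity, overbooking=OVERBOOKING):
    """Registration cap for a venue with the given comfortable capacity."""
    return math.floor(capacity * overbooking)


def expected_attendance(registrations, show_up_rate=SHOW_UP_RATE):
    """Expected number of attendees, rounded to the nearest whole person."""
    return round(registrations * show_up_rate)


capacity = 40  # made-up example
limit = registration_limit(capacity)
print(limit, expected_attendance(limit))
```

If the actual show-up rate turns out to be higher than two thirds, the same functions can be used with a different `show_up_rate` to see how far the capacity would be exceeded.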

Use the registration form to gather information about participants:

  • Name (and possibly other information as required by venue security)
  • Email address
  • Job title
  • Primary interests in the event: research software, OS tools and services, Software in general, FOSS licences, Infrastructure, Governance and policy
  • Primary role in research or at work: Researcher, Data scientist, Software developer, Service designer or supporter, Service or resource provider, Government staff, Communicator or promoter, Manager
  • What are they interested in working on during the workshop? This is a free-form question, where participants may express several interests. Participants can indicate if they have a project idea for the workshop and provide its short description as part of the registration form
  • How they heard about the event
  • Special needs/requests

Interaction flow

  • All participants register at Zoom to obtain a link to join the event.
  • Those who have not filled in the questionnaire are reminded to do it.
  • At the start of Day 1, everyone joins the Zoom session. The plenary part of the meeting is recorded.
  • During the plenary work, they enlist for a project in a shared document/whiteboard.
  • Organisers distribute participants into groups of similar sizes and establish breakout rooms for the groups.
  • During the after-lunch session on Day 1, participants join groups’ breakout rooms.
  • Groups edit their Jamboard whiteboards or (shared) documents; their members share screens as needed.
  • Participants reconvene to present their concepts.
  • At the end of the day, breakout rooms may be reopened during a defined period so that groups may continue to work together (if they want to).
  • Day 2 starts with a plenary warm-up and comments. The plenary part of the meeting is recorded.
  • Participants join their breakout rooms from Day 1 and groups continue their work.
  • During the after-lunch session, participants reconvene to present the completed projects.

Preparatory notes for Day 1

Seminar on OS and FOSS (hybrid)

  • Introduction to the workshop, FAIR, EOSC and NI4OS-Europe
  • Software development in open science
  • Work on open-source projects
  • Open-source software licences and their use in open science
  • Data on open science; how to link and cross-check them
  • Onboarding of open science services, related requirements and tools

This is a joint multi-site workshop that should also function as a webinar. Ideally, the onsite sessions will be held in Belgrade and Zagreb, where the co-organisers will provide mentors, supporters and a sufficient number of onsite participants. Other participants may join remotely.

Initial hands-on group activities (in groups at tables or in virtual rooms)

  • Expectations from participants and their collaboration (plenary, hybrid)
  • Introduction of mentors and establishment of groups
  • Definition of projects
  • Brainstorming and initial work on projects
  • Brief presentation of project concepts

At two physical locations and via virtual rooms.

About 25% of workshop participants are likely not to show up.

Groups gather and discuss what they would do for one hour, after which each group presents their idea and implementation approach in no more than five minutes.

Groups prepare tasks and refine steps and expected results. An example of potential tasks: make a plan on how to FAIRify their software/tools; use NI4OS tools to analyse software or define policies; deposit their software in an appropriate repository; start onboarding or explain what is missing for onboarding.

Researchers will work with their software.

Students can work on examples provided by the trainers, on software/tools brought by researchers as examples (which may include NI4OS-Europe products), or on the elaboration of their own ideas, which need to be confirmed by the organisers.

Before the event, the participants will be encouraged to read some materials or complete selected courses from the NI4OS-Europe Training Platform (not mandatory).

If any of these steps is impossible to complete, they should be able to explain what is missing.

In case of a multi-day break between the first and second day, the hands-on session could be continued online with people working in separate virtual rooms and collaborative channels. Mentoring and support would also be set up. The participants would communicate digitally and stay in touch after the first session and could consult with mentors, using a chat room such as Slack and shared document space – Google Docs or Dropbox Paper. Mentors would respond to the questions related to topics, the use of the provided resources, the scope of their projects, and technical issues.

Preparatory notes for Day 2

Continued hands-on group activities (in groups at tables or in virtual rooms)

  • Continuation of work on projects
  • Presentation of results
  • Comments on projects

Results and discussion (hybrid)

  • Joint discussion
  • Wrap-up

A wrap-up session gives each group a chance to report what they accomplished or what they learned. This could be done by any volunteer, not necessarily the leader; they may wish to connect to the projector but should be able to show what they want to present before the wrap-up session. Each group should not take more than five minutes overall, including setting up the presentation and comments. Everybody can comment within the limited discussion time – either after each presentation or after all presentations.

  • Finalisation and presentation of projects (up to one hour)
  • Presenting results and discussion – All participants should present their results. If they work in groups, each group should appoint a rapporteur. Rapporteurs explain their ideas, progress, problems they ran into, what should be done if the project is continued and possible plans.
  • The results will be compared for the groups covering the same or similar topics
  • Short discussions about the content of the presented projects and typical problems the teams encountered during the hackathon
  • The coordinator makes the final remarks, gives a summary of the event, suggests participating in sharing of the materials, knowledge, results, related products and other outcomes, and thanks all participants and co-organisers.
  • Feedback is asked for
  • Participants get the booty – participation certificates and paraphernalia (if any)
  • Group pictures are taken

Groups and group work

The initial groups should be proposed in advance but could be additionally adjusted on the spot. Each group could form around a well-known anchor or leader who is already familiar with some of the offered topics. Perhaps it is best to propose the anchors and then let groups come together; if they fail, set up a default distribution and then let each group agree on its topic. Anchors should be known to be coming from related projects or have some prior knowledge. They should be able to guide other participants and will be introduced before the hacking segment of the event. The participants from one institution should be distributed across groups. The newcomers should be evenly distributed among the groups. Not all offered topics have to be covered by the groups.
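A default distribution of this kind could be prepared with a simple round-robin deal. The sketch below is one possible heuristic, not the procedure used by the organisers: the `Participant` record, the `distribute` function and the sample entries are all hypothetical. Newcomers are dealt first so that their numbers per group differ by at most one, and sorting by institution places colleagues in consecutive positions so that they tend to land in different groups.

```python
from collections import namedtuple

# Hypothetical record; a real registration export would have more fields.
Participant = namedtuple("Participant", "name institution newcomer")

def distribute(participants, group_count):
    """Deal participants round-robin into group_count groups."""
    groups = [[] for _ in range(group_count)]
    # Newcomers first (False sorts before True), then by institution and name.
    ordered = sorted(
        participants,
        key=lambda p: (not p.newcomer, p.institution, p.name),
    )
    for index, person in enumerate(ordered):
        groups[index % group_count].append(person)
    return groups

# Placeholder data for illustration only.
people = [
    Participant("A", "Inst1", True),
    Participant("B", "Inst1", False),
    Participant("C", "Inst1", False),
    Participant("D", "Inst2", True),
    Participant("E", "Inst2", False),
    Participant("F", "Inst3", True),
    Participant("G", "Inst3", False),
]

for number, group in enumerate(distribute(people, 3), start=1):
    print(number, [p.name for p in group])
```

The result is only a starting point; as noted above, the groups can still be adjusted on the spot.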

It is better to group participants with different backgrounds, but if the bulk of participants is from the same milieu it is also acceptable to have them all working from the same corner. In the first case, the supporter may need to assist the group in harmonising the viewpoints and coordinating the work; in the second, the supporter should help them divide the work and challenge the approach of the group to help them to consider the perspectives of various stakeholders.

Participants have at their disposal tables where they can sit together, talk, take notes, draw on the shared flipchart or whiteboard and work on their own when needed. During the hacking session, participants dive into problems. Groups of 2-5 individuals form around a project, such as producing a mock-up solution, collaboratively investigating a problem, concurrently writing a shared document, proposing refinement of an existing solution, or prototyping a PoC solution to verify and refine the viability of an approach, at least for a significant portion of relevant situations, population or data.

Participants take out their laptops, connect to power and Wi-Fi, and start to work – discuss, plan, design, analyse data, program, test, present to each other or do whatever is appropriate to reach the result and prepare the final presentation. Projects should clearly articulate or refine the question or problem they are trying to solve and establish a reasonably specific proposed solution. The mentor should help them size the goal to the attainable level, as the participants need to feel accomplished in the end, not interrupted. Periodically, the mentor checks and makes sure:

  • The team knows what they are working on.
  • The goal is attainable within the remaining time; if that is not possible, it should be reformulated and reduced as soon as possible.
  • Everybody is working on something that is within their reach; the newcomers are working within the team and feeling comfortable (some tasks may need to be defined in advance).
  • There is a reality check and feedback by a subject matter expert or relevant stakeholder – a group member or the mentor, who should be able to keep the group and conduct work bound to the needs and constraints of the real world.

Help should be made readily available by local mentors and supporters, those who have described the thematic area, and the event coordinators and organisers. Mentors should actively seek and look after those who may not know how to get started and what to do. They should help the participants realise how they can contribute by letting them explain what they think about the offered subjects and then proposing to them related potential actions or, even better, inducing them to formulate what they think could work. This support should be provided at both the group and individual levels and may (if necessary) include adjustment of the initially established groups. Team leaders must ensure each team member has something to work on and be able to help newcomers.

The mentor may decide to take a step back if the group works well without external guidance or if someone from the group is taking the lead in a good direction. Still, the mentor should be present, alert and ready to intervene when needed.

Circulating supporters can stand in for mentors who are absent. A mentor also acts as a proxy between the group and an external stakeholder for the project. The mentor does not have to generate ideas but should be able to comment on them and go beyond the boundaries of current procedures and tools.

Post-event wrap-up and closure

  • Blog posts about the event.
  • Conducting a follow-up survey.
  • Discussing and capturing what should be done better the next time.

Concluding remarks

Registration and participation analysis

As many as 141 persons registered for the workshop at https://events.ni4os.eu Indico. Two registrations were retracted and one was duplicated; 4 of the registered were or ended up as presenters. There were 9 additional presenters and facilitators, out of which 8 registered with Zoom. Since information about the event was broadly distributed within the community, additional Zoom registration had to be set up. This led to 93 distinct people registered with Zoom (Zoom counted 95: three registered twice with different emails, but it did not count the event host). This adds up to at least 147 people with a valid initial registration or known physical presence (there is a chance that a few more were physically or remotely present, e.g., by sitting with a registered remote participant, but without registering in any way). Out of the 147 counted, 94 registered with Zoom or were physically present.
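The totals above can be reconciled as follows. This is a reconstruction of the arithmetic from the figures quoted in the paragraph, not a calculation published by the organisers; the variable names are illustrative.

```python
# Indico registrations, as reported in the text.
indico_registrations = 141
retracted = 2
duplicated = 1
additional_presenters = 9  # presenters and facilitators not registered in Indico

valid_indico = indico_registrations - retracted - duplicated
total_known = valid_indico + additional_presenters

# Zoom registrations: Zoom reported 95 entries, three people registered
# twice, and the event host was not included in Zoom's count.
zoom_reported = 95
double_registrations = 3
uncounted_host = 1
zoom_distinct = zoom_reported - double_registrations + uncounted_host

print(valid_indico, total_known, zoom_distinct)
```

Under this reading, 138 valid Indico registrations plus 9 additional presenters and facilitators give the 147 people counted, and the Zoom figures give the 93 distinct Zoom registrants.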

On the first day, there were up to 70 people during the presentations, while at the end of the discussions on the second day, there were 20. The attendance-registration ratio is therefore between one-half and two-thirds. Since the invitations for the event were distributed to partner organisations and student portals and did not rely on a chain of command, prizes or certificates, the recruitment was based purely on interest and goodwill. This makes the achieved attendance and engagement more than satisfactory.

Pre-workshop survey

The pre-workshop survey was answered by 63 individuals, out of whom 4 did not register with Zoom. Of the 13 people who directly contributed to the workshop, 3 participated in the survey: a case study presenter and mentor, and two researchers who were recruited during the first day to demonstrate their work. The insiders' responses therefore did not skew the results. The survey results are available here.

Regarding the relationship between open science and open-source software, the participants most strongly agreed with this statement (the average agreement score is 4.73):

  • Researchers would benefit from more open-source software in science.

These three statements follow closely and are almost equally ranked (with scores of 4.65, 4.65, and 4.60 respectively):

  • Research funders should provide better support to open-source software and its adoption.
  • Researchers would benefit from more training on open-source software.
  • Software developed during the research should be open-source.

When asked about reasons for using open-source software, the participants primarily selected these three reasons, which combine practical and aspirational attitudes:

  • I can use it without buying it or paying for licences. (20.87%)
  • Open-source software assists and enhances open science. (20.00%)
  • I agree with its principles and development model. (15.65%)

Other reasons (avoiding bureaucratic obstacles; personal habit; use by the community; maintainability) were selected by about 11% of participants.

Lessons learned

  • Random or loosely associated enrollees who declare they would come are more likely not to show up if they can switch to remote participation without any consequences.
  • Have no more than one physical location during one hybrid session. Otherwise, tracking the event and participation in it becomes too confusing. (We had two physical locations, but they alternated between the days.)
  • Plan heavily in order to be able to adapt. Prepare more than what will be delivered and select alternatives that stick best. The closely related topics are more likely to work than those that stand apart. Probe lightly and early to test interests and the will to engage and modify the plan accordingly. (In the practical part, we had to opt for the development, licence management and operationalisation, forgoing data opening or improvement.)
  • It is much easier to pressure the people who are physically present to engage. Having a hybrid event further decreases the engagement of remote participants.
  • Having a hybrid event may be at the expense of remote participants' comfort. Those who are in the minority during discussions at hybrid events (in our case - remote participants, despite their larger attendance number) are likely to be frustrated as they may not be able to see the interlocutors or those who comment from the background. For this reason, strict discipline over the use of on-site microphones must be enforced.
  • Physically present participants can be encouraged to take part in the delivery or discussions for the benefit of remote ones, so having them is a good trade-off if they truly engage. Only a few will be truly active, but care should be taken so that no one dominates the stage. Still, some may be encouraged to present what they did and others to elaborate on issues they faced. Embrace and stimulate them to speak up and emphasize the significance of their experiences. Build on top of their stories, extrapolate, generalise and give advice.
  • Participants were notably unwilling to choose the offered thematic physical breakout tables and virtual rooms and even more unwilling to propose additional topics. This was possibly due to the open nature of the proposed topics and the daunting prospect of physically or remotely interacting with unfamiliar individuals. Therefore, the group work must be either fully arranged in the physical world or fully remote, in which case the membership must be preassigned. (Upon the early detection that the participants were unwilling to join the groups, we rearranged the practical part into a sequential demonstration and discussion of the entire development and operationalisation workflow. We let the developers, researchers and service administrators share their achievements and experience and then they discussed approaches and potential solutions with other presenters. Still, that required finding those who agree to engage in this way.)
  • The flow and covered topics are likely to be more straightforward and compact than originally envisioned, as the participants likely will not be able to absorb the varied content and apply it during the practical part. Instead, the practical work has to be strongly prepared and choreographed or firmly guided. Do not expect a disparate group of participants to self-organize, set goals, and apply or assess new information or tools. Instead, they should be orchestrated to connect the dots and work within their comfort zones.
  • Even the passive participants may enjoy the topics and the interactions. Key qualities: flow, linking direct experiences and needs with formal content. Try to redescribe the participants' examples as deeper and more meaningful than originally presented and relate them to the content that was delivered by others - both speakers and listeners will take pleasure in this reframing.

Appendix: Communication artifacts

Pre-workshop survey

Assessment of interests and attitudes toward the event topics. You received the invitation for this survey because you registered for the hands-on workshop.

Participant information

Registration email:*

Provide the email address that you already used for registration so we could tailor the work of remote and on-site groups. It will be used only in direct relation to this event.

Do you have any special accessibility or dietary needs (if you are about to physically participate) or any other remarks related to this workshop?

Interests and work

What is your primary field of work or interest?*

  • Support and governance (infrastructure, access, support, training, data processing, development...)
  • Natural sciences (Earth, biological, physical, computer & information, chemical, mathematics...)
  • Engineering & technology
  • Humanities
  • Medical & health sciences
  • Social sciences
  • Agricultural sciences

Categories and Scientific Domains details are at https://marketplace.eosc-portal.eu/services/

Your interests in this event include:*

  • Research (scientific) software and research data
  • Open science tools and services and related data
  • Software in general or software licences
  • Research and open science infrastructure
  • Governance and policies in services, research or organisations

If there is a specific topic or project idea you would like to work on during the workshop, what would it be?

Concerning research software, scientific and data analysis tools, select all that apply to your work, research or academic training:

  • I use or rely on paid software.
  • I use or rely on free and open-source software.
  • I develop paid or closed-source software or a related service.
  • I develop or operate open-source software or a related service.

Opinion on open science and open-source software

Enter 1 to 5: 1 - strongly disagree, 5 - fully agree

  • Open science and open-source software are mutually unrelated.*
  • Researchers would benefit from more open-source software in science.*
  • Software developed during the research should be open-source.*
  • Researchers would benefit from more training on open-source software.*
  • Research funders should provide better support to open-source software and its adoption.*
  • Research organisations should provide better support to open-source software development, quality, usability and sustainability.*

Motivation

When using open-source software, you do it because:

  • I can use it without buying it or paying for licences.
  • I avoid bureaucratic obstacles or depending on approvals or favours.
  • I already used it during my academic training or while working on my degree.
  • People I work with or my wider community are using it.
  • Anyone can adapt or fix it.
  • I agree with its principles and development model.
  • Open-source software assists and enhances open science.

Registration

Personal Data

First Name *

Last Name *

Email Address *

The registration will be associated with your Indico account.

Affiliation *

Position *

Participation type *

  • Experienced researcher
  • Researchers with their own research software, tools and services
  • Other researchers, developers, students

Participation details

On-site participation or online *

  • RBI/SRCE
  • University of Belgrade
  • online participation (Zoom)
  • other

Please choose which location you would prefer

Call for the NI4OS-Europe partners

To: [all NI4OS-Europe partners]

Subject: Hands-on workshop: Touching on Data and Open-Source Software for Open Science

Dear NI4OS-Europe partners,

Below is the information on "Hands-on workshop: Touching on Data and Open-Source Software for Open Science". You are being asked to take part in the promotion of this workshop and recruitment of participants, encourage them to join and, if needed, help us in forming the groups.

Please forward the information below to researchers and developers of scientific software tools, developers and managers of open science services, graduates and final-year students who may be interested in participating in this event.

As NI4OS-Europe partners, you are also invited to provide remote mentors for this hackathon from your or associated organisations, who may also contribute to the development and delivery of materials on the covered topics.

Also, please let us know if you have any comments or questions.

Best regards,

Judit, Alen and Branko

First invitation for the registered

Dear Participant,

You have registered for “Hands-on workshop: Touching on Data and Open-Source Software for Open Science” at https://events.ni4os.eu/event/85/.

Please fill in the pre-workshop survey at the following link in the next few days, as we want to learn about your interests and attitudes toward the topics that will be covered. Here you can also propose specific topics or project ideas you would like to work on during the workshop:

https://events.ni4os.eu/event/85/surveys/11?token=fc531e82-a0aa-492c-94fd-3039ef27dae3

Groups will be established on the first day; they will discuss and work on details of a specific topic or a concept solution. Their members can collaboratively investigate a problem, propose a refinement of an existing solution, make a prototype or mockup or verify and refine the viability of an approach. A group’s result should be a joint document with a brief analysis, presentation or solution description that can be presented at the end of the workshop. No programming is expected or required, but the groups are welcome to produce programs, scripts, UI mockups or data models if they find them relevant or useful.

Refreshments and light lunches will be served at both locations. They are:

- Rektorat Univerziteta u Beogradu, Studentski 1, Beograd, Serbia

- SRCE, Josipa Marohnića 5, Zagreb, Croatia

All participants will receive a Zoom link before the workshop.

The overall tentative agenda is below.

Best regards,

Branko Marović, RCUB

---

[Agenda]

Second invitation for the registered

Dear Participant,

You are receiving this as you registered for the “Hands-on workshop: Touching on Data and Open-Source Software for Open Science”.

We would like to thank those who have filled out the pre-workshop survey. Some of you could not do it due to an invalid link - the correct one is

https://events.ni4os.eu/event/85/surveys/11?token=fc531e82-a0aa-492c-94fd-3039ef27dae3

Due to the diversity and the number of enrolled, you will have to register with Zoom for a personalised link for the meeting (Zoom accounts are not required). We apologise for this complication. You should do this one day before the event even if you are going to physically join us so that you could share your screens and whiteboards. Please register with Zoom at

https://zoom.us/meeting/register/tJ0scuGhpj8tHtI8soGjKOCVuBKAIV76hrjt

You will then receive a confirmation email containing information about joining the meeting. The sessions start at 10:00 AM CET (you may join from 9:30) on 30 November 2022 and 1 December 2022.

Please download and install the Zoom application. If you already have it, install the latest update.

If you want to physically join us only on the first or second day, or you want to change your initial choice, please let us know in a reply to this email or by writing to the organisers from the RBI and SRCE.

The detailed agenda is below.

On behalf of the organisers,

Branko Marović, RCUB

---

[Agenda]

Project topics selection

Participants who bring their own ideas to the event will have an opportunity to briefly (up to one minute) explain what they are working on at the very start of the event so that other participants can join that work. People should not be allowed to pitch ideas that they will not be working on at the event, go into detail about the topics, or complain about projects or approaches to demonstrate their expertise.

The following shared document was offered to all participants:

From solution to service

Description: Adaptation, onboarding metadata, service catalogue, how and where to promote or deposit the resource

Mentors: Davor Davidović, Alen Vodopijevec

Participants willing to join:

  • <your name in Zoom>

Service documentation and policies

Description: Discussing the importance of user documentation, policies and agreements in the context of certification, onboarding and sustainability; using available examples, templates and tools (such as RePol)

Mentors: Milica Ševkušić

Participants willing to join:

  • <your name in Zoom>

Collaborative software development

Description: Establishment of online collaboration, GitHub/GitLab project establishment and licence declaration, project’s online presence

Mentors: Dubravko Penezić, Draženko Celjak, Kristina Posavec

Participants willing to join:

  • <your name in Zoom>

Structuring, linking and improving data about research

Description: Matching and enriching general research data with identifiers; analysing, checking and validating linkable data

Mentors: Vladimir Otašević

Participants willing to join:

  • <your name in Zoom>

Software or project LCT self-assessment

Description: For a specific research software/project with its inputs and outputs, with an exploration of related concepts: extending sharing and reuse, privacy, etc.

Mentors: Panagiota Koltsida

Participants willing to join:

  • <your name in Zoom>

LCT improvement

Description: Ideas on usability, enhancements for components description/data reuse on software and licences, licence selection, integration with other platforms, or something else that they come up with.

Mentors: Panagiota Koltsida

Participants willing to join:

  • <your name in Zoom>

Data preparation, governance and curation plan or research (tool) workflow

Description: <details of your topic>

Mentors: <proposing participant - your name in Zoom>, <one of supporters>

Participants willing to join:

  • <your name in Zoom>
  • ..

Assess scientific tool usability

Description: <details of your topic>

Mentors: <proposing participant - your name in Zoom>, <one of supporters>

Participants willing to join:

  • <your name in Zoom>
  • ..

<Other topic suggested by participant>

Description: <details of your topic>

Mentors: <proposing participant - your name in Zoom>, <one of supporters>

Participants willing to join:

  • <your name in Zoom>
  • ..

<Other topic suggested by participant>

Description: <details of your topic>

Mentors: <proposing participant - your name in Zoom>, <one of supporters>

Participants willing to join:

  • <your name in Zoom>