
Open-Access Data and Computational Resources on Coronavirus disease 2019 (COVID-19)

COVID-19, Open Science and Open Data

The NI4OS-Europe vs COVID-19 initiative (main page) provides the Fast Access Channel for available services, computational and other resources. We also want to share information about COVID-19 and SARS-CoV-2 related projects and open data sets in the region, as well as joint resources for Open Science such as public and community nomenclatures and guidelines. Some of these are already available on this page, which is about to be opened for editing with academic account credentials.

The scientific community is relying on Open Science to support and accelerate the ongoing research on COVID-19 (and the novel coronavirus), but also to overcome some established habits and barriers. Relevant research papers and the underlying data are being made available immediately in open access, or at least made freely available for the duration of the epidemic. Research and clinical protocols and standards are being updated and improved in the same manner on a daily basis.

Today, it is not enough to ensure that the available results are of the highest standard if they remain scattered, unused or behind paywalls. They must be quickly and effectively shared with the world. Works in progress should be treated as such and not confused with established facts. The papers, data and other results should be traceable, easily accessible to researchers, scrutinized and adequately presented to the public. New guidelines need to be established and existing ones updated to help in adapting to specific situations by striking an effective balance between timely information and highly accurate but late data. New modalities of the scientific method, such as the application of FAIR principles, may help in protecting and fostering scientific work and in avoiding the problems of the past. In some cases, the traditional approach required decades of combat with powerful stakeholders before the related conclusions and policies could become mainstream. Today, we want to avoid unfiltered scientific overproduction and the creation of modern versions of past controversies, such as those on vaccines or climate change. We want easier access to scientific resources, and we want the produced outputs to be more accessible, practical and actionable. The scientific community needs fast-track sharing and screening, and effective and meaningful interpretation and use of results.

We are witnessing a massive ongoing production of thousands of scientific papers on COVID-19. The scientific opinions in these papers evolve rapidly because they are based on studies and on emerging data and evidence, as a part of the standard scientific process. Because of the scale of the effort, the high interest of the public and the impact on actions and policies that affect everyone, this production may also contribute to ambiguity and confusion in the wider public. These findings and conclusions may even be used in public debate, which is usual for other fields such as climate science but is new for epidemiology, virology and medicine in general. The attractiveness and popularity of the subject also make it more prone to opportunistic production of papers, while the availability of preprints that did not go through the standard peer-review process adds a layer of uncertainty and increases the chances for questionable work to be used, popularised and relied upon. On the other hand, these preprints allow quick sharing of information and accelerate research, as studies become available months before they would normally be published; this also facilitates external validation and pruning by other researchers outside of the formal peer-review process.

Unlike natural disasters that are easy to witness, the less visible nature of the epidemic makes it easier to dispute its effects and related facts. Conflicting data and messages may lead to clustering around conflicting views and conclusions and result in inadequate decisions. While such disputes are normally resolved gradually through the scientific process, they may have a great immediate impact on behaviours, social trust, healthcare and public health interventions. Inevitable difficulties related to the accuracy and timeliness of measurements and data complicate their interpretation, something that is obvious in the discussion of the spread of the infection, mortality rates, reinfection or the percentage of asymptomatic carriers. On the other hand, sharing credible data and papers increases the ability to understand and act rationally while reducing anxiety and uncertainty.

Open science and open data may help in addressing these problems and concerns, as well as the crisis of trust. This crisis affects the facts, the validation process, experts and authorities, regardless of whether they are scientific, medical or societal. While science operates with hypotheses, uncertainties and changing conclusions, others who are present in the public space may offer absolutes, overconfidence, inadequate simplifications or opportunistic scepticism. Open science can help to develop new ways to communicate the facts and to distinguish valuable information, expertise and truth from everything else while providing a relevant, multi-disciplinary and integrative perspective. COVID-19 emphasises the need for a process that will ease dealing with uncertainties and accelerate the identification of important and valid messages, and at the same time minimise the effects of uncertainties and changes in knowledge that diminish trust and increase vulnerability to falsehoods and disinformation.

This NI4OS-Europe initiative on combatting COVID-19 is an effort in this direction in which we all learn by doing.

General data resources

Overall resource lists and national resources

ECDC data

Daily countries aggregate data

  • Serbia
    • Daily aggregates - Data from daily epidemiological reports of the Institute of Public Health of Serbia:
      • Daily number of hospitalized persons
      • Daily number of positive persons
      • Total number of positive persons
      • Daily number of persons tested
      • Total number of persons tested
      • Daily number of deaths
      • Daily number of deceased males
      • Daily number of deceased females
      • Total number of deaths
      • Daily average age of deceased
      • Daily percentage of infected persons among the tested
      • Percentage of infected among the total of tested
      • Percentage of hospitalized among the total of infected
      • Total number of recovered
      • Percentage of recovered among the total of infected
      • Percentage of persons on among the total of hospitalized
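The percentage indicators in the list above follow directly from the raw counts. A minimal Python sketch of that arithmetic is given below; the field names and the numbers are hypothetical and purely illustrative, and they do not reflect the schema or the values published by the Institute of Public Health of Serbia.

```python
from dataclasses import dataclass


@dataclass
class DailyReport:
    """One day of aggregate counts (field names are assumptions)."""
    tested_today: int
    positive_today: int
    tested_total: int
    positive_total: int
    hospitalized: int
    recovered_total: int


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 if the denominator is zero."""
    return 100.0 * part / whole if whole else 0.0


def derived_indicators(r: DailyReport) -> dict:
    """Compute the percentage indicators from the raw counts."""
    return {
        "daily_positive_pct": percentage(r.positive_today, r.tested_today),
        "total_positive_pct": percentage(r.positive_total, r.tested_total),
        "hospitalized_pct": percentage(r.hospitalized, r.positive_total),
        "recovered_pct": percentage(r.recovered_total, r.positive_total),
    }


# Made-up numbers, used only to show the calculation.
report = DailyReport(
    tested_today=2000,
    positive_today=100,
    tested_total=50000,
    positive_total=4000,
    hospitalized=1000,
    recovered_total=400,
)
print(derived_indicators(report))
```

Guarding against a zero denominator matters for the first days of a series, when no tests or no confirmed cases may have been reported yet.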

Country regional or city aggregate data

  • Serbia
    • Self-isolation - Daily report of the Serbian Ministry of Interior on mandatory self-isolation, with the number of people in compulsory self-isolation on the territory of 174 local self-government units
    • Detailed spatial distribution - Aggregate from 6 March to 14 April 2020. The dataset contains sex, age, municipality, place of residence.

Per case country data

  • Serbia
    • Infected - Daily report of the Institute of Public Health of Serbia on infected persons in Serbia. The dataset contains: date, sex, age, municipality of residence
  • Other...
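Per-case records with date, sex, age and municipality of residence can be rolled up into daily or demographic aggregates. The sketch below shows one way to do this in Python, assuming a CSV export with the column names `date`, `sex`, `age` and `municipality`; these names, the placeholder municipalities and the sample rows are assumptions for illustration, not the published data.

```python
import csv
import io
from collections import Counter

# Illustrative rows only; the column names are assumptions, not the real schema.
SAMPLE = """date,sex,age,municipality
2020-03-06,M,43,Municipality A
2020-03-06,F,67,Municipality B
2020-03-07,F,35,Municipality A
2020-03-07,M,71,Municipality A
2020-03-07,M,48,Municipality B
"""


def count_by(rows, column):
    """Count records per distinct value of the given column."""
    return Counter(row[column] for row in rows)


def count_by_age_band(rows, width=10):
    """Count records per age band, e.g. 40-49 for width=10."""
    bands = Counter()
    for row in rows:
        low = (int(row["age"]) // width) * width
        bands[f"{low}-{low + width - 1}"] += 1
    return bands


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(count_by(rows, "date"))
print(count_by(rows, "municipality"))
print(count_by_age_band(rows))
```

Reading from a file instead of the embedded sample only requires replacing `io.StringIO(SAMPLE)` with an opened file object.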


Multitype registries

Publications registries

Computational resources

Offered to researchers from NI4OS-Europe countries, and potentially also to other researchers, who are working in COVID-19 related fields.

The initiation procedure is as follows:

  • Contact NI4OS-Europe fast access channel at and express your need by briefly describing:
    • Area of research,
    • Estimated overall computational load and usage pattern in the near future,
    • Execution environment (programming language, libraries),
    • Parallelisation requirements, if any,
    • Data exchange, etc.
  • The needs will be matched against the available resources and you will receive a response within one week.
  • An online meeting might be arranged so that the needs of the project are discussed. If you are not sure about the mentioned technical details of the computation environment, load and data exchange, we could discuss them at this meeting.
  • Subsequently, you will be provided details on how to access the resources.
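The items to describe when contacting the fast access channel can be prepared in advance as a short structured note. The Python sketch below is only one possible way to organise such a request; the class, its field names and the example values are hypothetical and are not an official NI4OS-Europe form.

```python
from dataclasses import dataclass


@dataclass
class ResourceRequest:
    """Hypothetical container for the items listed in the procedure above."""
    research_area: str
    computational_load: str
    execution_environment: str
    parallelisation: str = "none"
    data_exchange: str = "not specified"

    def to_message(self) -> str:
        """Render the request as plain text suitable for an e-mail body."""
        lines = [
            f"Area of research: {self.research_area}",
            f"Estimated load and usage pattern: {self.computational_load}",
            f"Execution environment: {self.execution_environment}",
            f"Parallelisation requirements: {self.parallelisation}",
            f"Data exchange: {self.data_exchange}",
        ]
        return "\n".join(lines)


# Example values are invented for illustration.
request = ResourceRequest(
    research_area="Virtual screening of candidate compounds",
    computational_load="about 10,000 core hours over two months",
    execution_environment="Python 3 with NumPy",
    parallelisation="independent jobs, no inter-process communication",
)
print(request.to_message())
```

Items that are still unknown can be left at their defaults and clarified at the online meeting.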

Endorsed and supported applications

  • ChemBioServer: - Used for filtering Virtual Screening results. Features: (i) browse and visualize compounds along with their properties, (ii) filter chemical compounds for a variety of properties such as steric clashes and toxicity, (iii) apply perfect match substructure search, (iv) cluster compounds according to their physicochemical properties providing representative compounds for each cluster, (v) build custom compound mining pipelines and (vi) quantify through property graphs the top-ranking compounds in drug discovery procedures. ChemBioServer allows for pre-processing of compounds prior to an in silico screen, as well as for post-processing of top-ranked molecules resulting from a docking exercise with the aim to increase the efficiency and the quality of compound selection that will pass to the experimental test phase.
  • FEPrepare: - FEPrepare is a tool that automates the set-up procedure for performing NAMD/FEP simulations. Automating free energy perturbation calculations is a step towards delivering high-throughput calculations for accurate predictions of relative binding affinities before a compound is synthesized, consequently saving considerable time and cost.
  • AFMM - Tool for force field parametrization to run MD simulations of drug-like molecules. The method used fits the molecular mechanics potential function to both vibrational frequencies and eigenvector projections derived from quantum chemical calculations. The program optimizes an initial parameter set (either pre-existing or using chemically-reasonable estimation) by iteratively changing them until the optimal fit with the reference set is obtained. By implementing a Monte Carlo-like algorithm to vary the parameters, the tedious task of manual parametrization is replaced by an efficient automated procedure. The program is best suited for optimization of small rigid molecules in a well-defined energy minimum, for which the harmonic approximation to the energy surface is appropriate for describing the intra-molecular degrees of freedom.
  • Nanocrystal: - Creates nanoparticle models as drug carriers from any crystal structure guided by their preferred equilibrium shape under standard conditions according to the Wulff morphology (crystal habit). Users can upload a cif file, define the Miller indices and their corresponding minimum surface energies according to the Wulff construction of a particular crystal, and specify the size of the nanocrystal. As a result, the nanoparticle is constructed and visualized, and the coordinates of the atoms are output to the user.
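The iterative, Monte Carlo-like parameter variation mentioned for AFMM can be illustrated with a toy problem. The Python sketch below is not AFMM's implementation: it assumes independent harmonic oscillators with frequency sqrt(k/m), fits only frequencies (no eigenvector projections), uses unit-free made-up numbers, and accepts a random change only when it reduces the misfit. All function names are hypothetical.

```python
import math
import random


def frequencies(force_constants, masses):
    """Harmonic frequency sqrt(k/m) of each independent oscillator."""
    return [math.sqrt(k / m) for k, m in zip(force_constants, masses)]


def merit(force_constants, masses, reference):
    """Sum of squared differences between model and reference frequencies."""
    model = frequencies(force_constants, masses)
    return sum((w - w_ref) ** 2 for w, w_ref in zip(model, reference))


def monte_carlo_fit(initial, masses, reference,
                    steps=20000, step_size=0.05, seed=0):
    """Randomly perturb one parameter at a time; keep only improvements."""
    rng = random.Random(seed)
    best = list(initial)
    best_err = merit(best, masses, reference)
    for _ in range(steps):
        trial = list(best)
        i = rng.randrange(len(trial))
        # Keep the force constant positive so the square root stays defined.
        trial[i] = max(1e-6, trial[i] + rng.uniform(-step_size, step_size))
        err = merit(trial, masses, reference)
        if err < best_err:
            best, best_err = trial, err
    return best, best_err


masses = [1.0, 2.0]
target = [4.0, 9.0]                      # parameters used to build the reference
reference = frequencies(target, masses)  # stands in for quantum chemical data
initial = [1.0, 1.0]                     # rough starting guess

fitted, error = monte_carlo_fit(initial, masses, reference)
print(fitted, error)
```

In this toy setting the reference frequencies are generated from known parameters, so the quality of the fit can be checked by comparing `fitted` with `target`.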

Other supporting resources

Data content (nomenclature, forms, guidelines)

Disease-related data

The collected data should be aligned with the goals of the specific research, but also with the data required by the applicable guidelines and protocols.

  • WHO Case Record Form (CRF) describes key clinical information that can be useful in general research, with the content that should be recorded: on admission; on admission to the ICU and daily while the patient is in the ICU; and on discharge or death.
  • WHO Case Report Form lists the content that may be useful in the design of epidemiological research data capture.

Data coding

Guidelines and protocols