covid-19

From NI4OS wiki

Open-Access Data and Computational Resources on Coronavirus disease 2019 (COVID-19)

COVID-19, Open Science and Open Data

The NI4OS-Europe vs COVID-19 initiative (main page, ni4os-europe-covid19@ni4os-europe.eu) provides a Fast Access Channel to available services and to computational and other resources. We also want to share information about COVID-19 and SARS-CoV-2 related projects and open data sets in the region, as well as joint resources for Open Science such as public and community nomenclatures and guidelines. Some of these are already available on this page, which is about to be opened for editing with academic account credentials.

The scientific community is relying on Open Science to support and accelerate the ongoing research on COVID-19 (and the novel coronavirus), but also to overcome some of the established habits and barriers. Relevant research papers and underlying data are immediately made available in open access, or at least made freely available for the duration of the epidemic. Research and clinical protocols and standards are being updated and improved in the same manner on a daily basis.

Today, it is not enough to ensure that the available results are at the highest standard if they remain scattered, unused or behind paywalls. They must be quickly and effectively shared with the world. Works in progress should be treated as such and not confused with established facts. The papers, data and other results should be traceable, easily accessible to researchers, scrutinized and adequately presented to the public. New guidelines need to be established and existing ones updated in order to help in adapting to specific situations by providing an effective balance between timely information and highly accurate but late data. New modalities of the scientific method, such as the application of FAIR principles, may help in protecting and fostering scientific work, and in avoiding the problems of the past. In some cases, the traditional approach required decades of combat with powerful stakeholders before the related conclusions and policies could become mainstream. Today, we want to avoid unfiltered scientific overproduction and the creation of modern versions of past controversies, such as those on vaccines or climate change. We want access to scientific resources to be easier and the produced outputs to be more accessible, practical and actionable. The scientific community needs fast-track sharing and screening, and effective and meaningful interpretation and use of results.

We are witnessing a massive ongoing production of thousands of scientific papers on COVID-19. The scientific opinions in these papers evolve rapidly, as they are based on studies and on emerging data and evidence, which is a part of the standard scientific process. Because of the scale of the effort, the high interest of the public and the impact on actions and policies that affect everyone, this production may also contribute to ambiguity and confusion in the wider public. These findings and conclusions may even be used in public debate, which is usual for other fields such as climate science but is new for epidemiology, virology and medicine in general. The attractiveness and popularity of the subject also make it more prone to opportunistic production of papers, while the availability of preprints that did not go through the standard peer-review process adds a layer of uncertainty and increases the chances for questionable work to be used, popularised and relied upon. On the other hand, these preprints allow quick sharing of information and accelerate research, as studies become available months before they are normally published; this also facilitates external validation and pruning by other researchers outside of the formal peer-review process.

Unlike natural disasters that are easy to witness, the less visible nature of the epidemic makes it easier to dispute its effects and related facts. Conflicting data and messages may lead to clustering around conflicting views and conclusions and result in inadequate decisions. While such disputes are normally gradually resolved through the scientific process, they may have a great immediate impact on behaviours, social trust, healthcare and public health interventions. Inevitable difficulties related to the accuracy and timeliness of measurements and data complicate their interpretation, something that is obvious in the discussion of the spread of the infection, mortality rates, reinfection or the percentage of asymptomatic carriers. On the other hand, sharing of credible data and papers increases the ability to understand and act rationally while reducing anxiety and uncertainty.

Open science and open data may help in addressing these problems and concerns, as well as the crisis of trust. This crisis affects the facts, the validation process, experts and authorities, regardless of whether they are scientific, medical or societal. While science operates with hypotheses, uncertainties and changing conclusions, others who are present in the public space may offer absolutes, overconfidence, inadequate simplifications or opportunistic scepticism. Open science can help to develop new ways to communicate the facts and distinguish valuable information, expertise and truth from everything else, while providing a relevant, multi-disciplinary and integrative perspective. COVID-19 emphasises the need for a process that will ease dealing with uncertainties and accelerate the identification of important and valid messages, and at the same time minimise the effects of uncertainties and changes in knowledge that diminish trust and increase vulnerability to falsehoods and disinformation.

This NI4OS-Europe initiative on combatting COVID-19 is an effort in this direction in which we all learn by doing.

General data resources

Overall resource lists and national resources

ECDC data

Daily countries aggregate data

  • Serbia
    • Daily aggregates - Data from daily epidemiological reports of the Institute of Public Health of Serbia:
      • Daily number of hospitalized persons
      • Daily number of positive persons
      • Total number of positive persons
      • Daily number of persons tested
      • Total number of persons tested
      • Daily number of deaths
      • Daily number of deceased males
      • Daily number of deceased females
      • Total number of deaths
      • Daily average age of deceased
      • Daily percentage of infected persons among the tested
      • Percentage of infected among the total of tested
      • Percentage of hospitalized among the total of infected
      • Total number of recovered
      • Percentage of recovered among the total of infected
      • Percentage of persons on among the total of hospitalized
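The percentage indicators in the list above are ratios of the raw counts. A minimal sketch of how they could be recomputed is given here, assuming that "hospitalized" means the number of persons currently hospitalized on the given day; the field names and the numbers are illustrative and do not come from the dataset itself.

```python
from dataclasses import dataclass


@dataclass
class DailyReport:
    """One day's raw counts; field names are illustrative, not the dataset's columns."""
    tested_today: int
    positive_today: int
    tested_total: int
    positive_total: int
    hospitalized: int
    recovered_total: int


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when the denominator is zero."""
    return 100.0 * part / whole if whole else 0.0


def derived_indicators(r: DailyReport) -> dict[str, float]:
    """Recompute the percentage indicators from the raw counts."""
    return {
        "daily_positive_pct_of_tested": percentage(r.positive_today, r.tested_today),
        "total_positive_pct_of_tested": percentage(r.positive_total, r.tested_total),
        "hospitalized_pct_of_infected": percentage(r.hospitalized, r.positive_total),
        "recovered_pct_of_infected": percentage(r.recovered_total, r.positive_total),
    }


# Made-up numbers, for illustration only.
example = DailyReport(tested_today=2000, positive_today=100, tested_total=50000,
                      positive_total=4000, hospitalized=1000, recovered_total=800)
indicators = derived_indicators(example)
```

Guarding against a zero denominator matters for the first days of a report series, when no tests or no confirmed cases may have been recorded yet.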

Country regional or city aggregate data

  • Serbia
    • Self-isolation - Daily report of the Serbian Ministry of Interior on mandatory self-isolation with the number of people in compulsory self-isolation on the territory of 174 local self-government units
    • Detailed spatial distribution - Aggregate from 6 March to 14 April 2020. The dataset contains sex, age, municipality, place of residence.

Per case country data

  • Serbia
    • Infected - Daily report of the Institute of Public Health of Serbia on infected persons in Serbia. The dataset contains: date, sex, age, municipality of residence
  • Bulgaria
    • Daily report - The dataset contains: date, age, municipality of residence and medical persons
  • Romania
    • Daily reports - Portal for statistics and real-time information of the spread of Covid-19 on regions and cities
    • Official News - News and statistics from the Strategic Communication Group

Dashboards

  • ECDC
  • Cyprus: dashboard, page 2 is in English
  • Serbia: daily charts, daily and cumulative infected (map, chart, table, data), cumulative self-isolation (map, chart, table, data)
  • Slovenia
    • COVID-19 Tracker Slovenia
      • The "Covid-19 Tracker Slovenia" project collects, analyses and publishes data on the spread of the SARS-CoV-2 coronavirus, the cause of COVID-19, in Slovenia. Data is collected from various publicly available sources, and the tracker has a direct connection with healthcare institutions and the National Institute of Public Health (NIJZ), which share structured data. This data is then validated and shaped into a format suitable for visualization, to be presented to the public as well as used for further work in model development and forecasting. All data is collected and available in the form of GSheets, CSV or via REST API. The data is freely available and used by other portals and projects. The Tracker includes data on:
        • number of tests performed and number of confirmed infections
        • number of vaccinations
        • number of confirmed infections by category: by age, gender, region, and municipality
        • hospital records for patients with COVID-19: hospitalized, in the intensive care unit (ICU), in critical condition, discharged from hospital care, recovered
        • monitoring of individual cases, particularly those in critical activities: working in healthcare, senior citizens’ homes, civil protection
        • healthcare system capacity: number of beds, intensive care units, respirators for ventilation...
    • Alpaka: Analysis and spread of COVID-19 in Slovenia - Visualisation of analyzed data with case growth forecast
    • COVID-19 forecasts for Slovenia - Forecasts using semi-mechanistic Bayesian models. The analysis is based on public data gathered at COVID-19 Tracker. All computations and data management run on the ELIXIR-SI research infrastructure.
    • Corona Virus Media Watch - The International Research Centre on Artificial Intelligence (IRCAI) in Slovenia, a category 2 centre under the auspices of UNESCO, has launched a Corona Virus Media Watch that provides global and national news updates based on a selection of media entities with open online news. The tool may be useful for policymakers, media and the public to observe the emerging trends related to COVID-19 in their country and the world.
    • CoronaLive - A Slovenian informative website that shows regularly updated statistics on the worldwide COVID-19 situation.
    • COVID-19 spread simulation application
  • North Macedonia
  • Montenegro
  • Romania
    • COVID-19 spread - Map of current situation
    • Daily reports - Portal for statistics and real-time information of the spread of Covid-19 on regions and cities
    • Official News - News and statistics from the Strategic Communication Group
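Several of the dashboards above publish their underlying data; the COVID-19 Tracker Slovenia entry, for example, mentions CSV exports. The following is a minimal sketch of how such an export could be read and smoothed with a trailing moving average. The column names and the sample values are invented for illustration, and the layout of any real export should be checked before reuse.

```python
import csv
import io

# Made-up sample in a hypothetical layout; a real export's column names may differ.
SAMPLE_CSV = """date,tests_performed,confirmed
2020-04-01,1000,40
2020-04-02,1200,36
2020-04-03,900,45
2020-04-04,1100,33
"""


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def moving_average(values: list[int], window: int) -> list[float]:
    """Trailing moving average; early entries average over the days available so far."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


rows = load_rows(SAMPLE_CSV)
confirmed = [int(row["confirmed"]) for row in rows]
smoothed = moving_average(confirmed, window=3)
```

A trailing window is used so that each value depends only on data already published, which suits daily reporting where the most recent days are the ones of interest.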

Multitype registries and repositories

Publications registries

  • WHO database of publications on coronavirus disease (COVID-19) - Latest scientific findings and knowledge (primarily journal articles)
  • LitCovid - A curated literature hub for tracking up-to-date scientific information about the 2019 novel Coronavirus. It is the most comprehensive resource on the subject, providing central access to relevant articles in PubMed. The articles are updated daily and are further categorized by different research topics and geographic locations for improved access.
  • openscience.hu - The University of Debrecen University and National Library collects all Hungarian publications on COVID-19 research
  • instantscience.hu - The University of Debrecen University and National Library collects all Hungarian publications on COVID-19 research
  • mta.hu - Publications related to coronavirus disease.

Computational resources

Computational resources are offered to researchers from NI4OS-Europe countries, and potentially also to other researchers, who are working in COVID-19 related fields.

The initiation procedure is as follows:

  • Contact NI4OS-Europe fast access channel at ni4os-europe-covid19@ni4os-europe.eu and express your need by briefly describing:
    • Area of research,
    • Estimated overall computational load and usage pattern in the near future,
    • Execution environment (programming language, libraries),
    • Parallelisation requirements, if any,
    • Data exchange, etc.
  • The needs will be matched against the available resources and you will receive a response within one week.
  • An online meeting might be arranged to discuss the needs of the project. If you are not sure about the mentioned technical details of the computation environment, load and data exchange, we can discuss them at this meeting.
  • Subsequently, you will be provided with details on how to access the resources.
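The items requested in the first step can be collected in a small structure before the message is written. The sketch below is only an illustration: the `ResourceRequest` class, its field names and the example values are hypothetical and are not a form prescribed by NI4OS-Europe; only the e-mail address is taken from this page.

```python
from dataclasses import dataclass

FAST_ACCESS_EMAIL = "ni4os-europe-covid19@ni4os-europe.eu"  # address given on this page


@dataclass
class ResourceRequest:
    """Hypothetical container for the items the procedure asks applicants to describe."""
    research_area: str
    computational_load: str
    execution_environment: str
    parallelisation: str = "none"
    data_exchange: str = "not specified"

    def to_email_body(self) -> str:
        """Render the request as a plain-text message body."""
        lines = [
            f"Area of research: {self.research_area}",
            f"Estimated computational load and usage pattern: {self.computational_load}",
            f"Execution environment: {self.execution_environment}",
            f"Parallelisation requirements: {self.parallelisation}",
            f"Data exchange: {self.data_exchange}",
        ]
        return "\n".join(lines)


# Illustrative values only.
request = ResourceRequest(
    research_area="Virtual screening of candidate compounds",
    computational_load="About 10,000 core hours over two months",
    execution_environment="Python 3 with NumPy",
    parallelisation="MPI, up to 64 cores",
)
body = request.to_email_body()
```

The resulting text can be pasted into a message to the fast access channel; items that are still unclear can be left at their defaults and discussed at the online meeting.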

Endorsed and supported applications

  • ChemBioServer: https://chembioserver.vi-seem.eu - Used for filtering Virtual Screening results. Features: (i) browse and visualize compounds along with their properties, (ii) filter chemical compounds for a variety of properties such as steric clashes and toxicity, (iii) apply perfect match substructure search, (iv) cluster compounds according to their physicochemical properties providing representative compounds for each cluster, (v) build custom compound mining pipelines and (vi) quantify through property graphs the top-ranking compounds in drug discovery procedures. ChemBioServer allows for pre-processing of compounds prior to an in silico screen, as well as for post-processing of top-ranked molecules resulting from a docking exercise with the aim to increase the efficiency and the quality of compound selection that will pass to the experimental test phase.
  • FEPrepare: https://feprepare.vi-seem.eu - FEPrepare is a tool that automates the set-up procedure for performing NAMD/FEP simulations. Automating free energy perturbation calculations is a step towards delivering high-throughput calculations for accurate predictions of relative binding affinities before a compound is synthesized, which consequently saves enormous time and cost.
  • AFMM: https://afmm.vi-seem.eu - Tool for force field parametrization to run MD simulations of drug-like molecules. The method fits the molecular mechanics potential function to both vibrational frequencies and eigenvector projections derived from quantum chemical calculations. The program optimizes an initial parameter set (either pre-existing or based on chemically reasonable estimates) by iteratively changing the parameters until the optimal fit with the reference set is obtained. By implementing a Monte Carlo-like algorithm to vary the parameters, the tedious task of manual parametrization is replaced by an efficient automated procedure. The program is best suited for optimization of small rigid molecules in a well-defined energy minimum, for which the harmonic approximation to the energy surface is appropriate for describing the intra-molecular degrees of freedom.
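The Monte Carlo-like parameter search mentioned for AFMM can be illustrated with a toy example. This is not AFMM's code or its merit function: the sketch assumes independent harmonic oscillators, fits only frequencies (AFMM also uses eigenvector projections), and all function names and numbers are invented for illustration.

```python
import math
import random


def model_frequencies(force_constants, masses):
    """Toy model: independent harmonic oscillators, omega = sqrt(k / m), arbitrary units."""
    return [math.sqrt(k / m) for k, m in zip(force_constants, masses)]


def merit(force_constants, masses, reference):
    """Sum of squared deviations between model and reference frequencies."""
    model = model_frequencies(force_constants, masses)
    return sum((a - b) ** 2 for a, b in zip(model, reference))


def monte_carlo_fit(initial, masses, reference, steps=5000, step_size=0.05, seed=0):
    """Randomly perturb one parameter at a time and keep only changes that improve the fit."""
    rng = random.Random(seed)
    best = list(initial)
    best_score = merit(best, masses, reference)
    for _ in range(steps):
        trial = list(best)
        i = rng.randrange(len(trial))
        # Multiplicative perturbation keeps force constants positive.
        trial[i] *= 1.0 + rng.uniform(-step_size, step_size)
        score = merit(trial, masses, reference)
        if score < best_score:
            best, best_score = trial, score
    return best, best_score


masses = [1.0, 2.0]
# Synthetic "reference" frequencies generated from known force constants 4.0 and 18.0.
reference = model_frequencies([4.0, 18.0], masses)
fitted, final_score = monte_carlo_fit([1.0, 1.0], masses, reference)
```

Because only improving moves are accepted, the merit value never increases; a real parametrization would start from a chemically reasonable parameter set and compare against quantum chemical reference data rather than synthetic values.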

Other supporting resources

Data content (nomenclature, forms, guidelines)

Disease-related data

The collected data should be aligned with the goals of the specific research, but also with the data required by applied Guidelines and protocols.

  • WHO Case Record Form (CRF) describes key clinical information that can be useful in general research, with the content that should be recorded: on admission; on admission to ICU and daily while the patient is in the ICU; and on discharge or death.
  • WHO Case Report Form lists the content that may be useful in the design of epidemiological research data capture.

Data coding

Guidelines and protocols

Treatment and medicine support

Application

Education and society support