scrappers

COVID-19 data scraper

Install

virtualenv -ppython3 env
source env/bin/activate
pip install -r requirements.txt

Running

This creates a directory under ./data containing the latest dumps in JSON format. The script is designed to be run at most once a day.

./run.sh
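
Once a run has finished, the dumps can be read back with a few lines of Python. The following is only a minimal sketch, not part of the repository: the function name load_latest_dumps is hypothetical, and it assumes that each run writes its files into its own subdirectory of ./data and that the dump files end in .json. Adjust both assumptions to the layout that run.sh actually produces.

```python
import json
from pathlib import Path


def load_latest_dumps(data_dir="data"):
    """Return {file stem: parsed JSON} for the newest dump directory.

    Assumption (not confirmed by the README): every run creates one
    subdirectory of data_dir and stores its dumps there as *.json files.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return {}
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    if not run_dirs:
        return {}
    # Pick the most recently modified subdirectory as the latest run.
    latest = max(run_dirs, key=lambda p: p.stat().st_mtime)
    dumps = {}
    for path in sorted(latest.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            dumps[path.stem] = json.load(handle)
    return dumps


if __name__ == "__main__":
    for name, content in load_latest_dumps().items():
        print(name, type(content).__name__)
```

Selecting the directory by modification time avoids guessing how the dump directories are named; if they turn out to be named by date, sorting by name would be a sturdier choice.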

Sources

  • litcovid: NIH-curated list of COVID-19 articles (https://www.ncbi.nlm.nih.gov/research/coronavirus/).
    Labels (8): General Information, Mechanism, Transmission, Diagnosis, Treatment, Prevention, Case Report, Epidemic Forecasting.
    Note that topic labels are semi-automatically assigned; see https://www.ncbi.nlm.nih.gov/research/coronavirus/faq for details.

  • bibliovid: paper categories and fine-grained analysis by experts (https://bibliovid.org/). Labels are in French.
    Category labels (7): Autres, Diagnostique, Thérapeutique, Épidémiologique, Pronostique, Recommandations, Modélisation.
    Specialty labels (19): Hépato-gastro-entérologie, Neurologie, Cardiologie et maladies métaboliques, Hématologie, Gériatrie, Infectiologie, Gynécologie Obstétrique, Dermatologie, Pédiatrie, Pneumologie, Transversale, Psychiatrie, Virologie, Anesthésie-Réanimation, Radiologie, Hygiène, Néphrologie, Confinement/Déconfinement, Immunité.

  • CORD-19 metadata: a large set of paper metadata, selected with broad queries covering general coronavirus research.