Extracting data from the World Health Organization (WHO) and converting it into a format suitable for analysis
Problem definition
Dr. Waldemar W. Koczkodaj needed the data provided by WHO in a format suitable for analysis, so he asked me to find a way to extract the data available at https://covid19.who.int/table. Because this data is continually updated, he also asked me to find a way to prove that WHO hosted the data we extracted.
What we did
Because this page uses React for binding (i.e. it modifies the DOM dynamically), the table could not simply be copied and pasted, nor scraped with a static web crawler. The usual solution is a browser automation package such as Selenium (an automated software testing tool), but even that requires some workarounds. I therefore took a different approach: I tracked the page's HTTP requests to find the data source.
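One way to track a page's HTTP requests is to record them in the browser's developer tools, export the recording as a HAR file, and list the requests that returned JSON. The sketch below assumes that workflow, which the write-up does not confirm; the function name and the example.org URLs are hypothetical.

```python
def json_endpoints(har: dict) -> list[str]:
    """Return the URLs of all recorded requests whose response was JSON."""
    urls = []
    for entry in har.get("log", {}).get("entries", []):
        # HAR stores the response media type under response.content.mimeType.
        mime = entry.get("response", {}).get("content", {}).get("mimeType") or ""
        if "json" in mime.lower():
            urls.append(entry["request"]["url"])
    return urls


# Hand-made HAR fragment for illustration only (not real WHO traffic).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.org/index.html"},
                "response": {"content": {"mimeType": "text/html"}},
            },
            {
                "request": {"url": "https://example.org/data.json"},
                "response": {"content": {"mimeType": "application/json"}},
            },
        ]
    }
}

print(json_endpoints(sample_har))
```

With a real export, the dictionary would come from `json.load` on the saved `.har` file instead of the hand-made sample.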
Interestingly, I found that WHO makes the whole dataset available in JSON format at LINK . We can therefore fetch the latest version of this dataset whenever we want, and doing so takes only a single line of Python code.
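A minimal sketch of downloading such a JSON dataset with the Python standard library follows. The address is given above only as LINK, so `DATA_URL` is a placeholder that must be replaced before use; `load_json` and its `opener` parameter are hypothetical names chosen for illustration, not the original code.

```python
import json
from urllib.request import urlopen

# Placeholder: the write-up gives the address only as "LINK".
DATA_URL = "LINK"


def load_json(url, opener=urlopen):
    """Download the document at `url` and decode it as JSON.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without network access.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


# Once DATA_URL holds the real address, the download is one line:
# data = load_json(DATA_URL)
```

The structure of the returned object depends on the dataset, so any further conversion (for example to a table) has to be written against the actual JSON.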
To prove that WHO hosts this data, I looked for a suitable archiving service and found the Wayback Machine, a website that lets one capture a timestamped snapshot of an arbitrary web page and use it as a trusted citation. I therefore suggested using it for the second part of the problem.
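Archived copies can also be looked up programmatically. The sketch below assumes the Wayback Machine's public availability endpoint (`https://archive.org/wayback/available`) and its documented response shape; the helper names and the sample payload are hypothetical, and the write-up does not say whether the lookup was automated.

```python
from urllib.parse import urlencode


def availability_query(page_url, timestamp=None):
    """Build the availability-API URL for `page_url`.

    `timestamp` (YYYYMMDD or longer) is optional and asks for the
    snapshot closest to that date.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hand-made response for illustration only (not a real archive record).
sample_payload = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20200101000000/https://example.org/",
            "timestamp": "20200101000000",
            "status": "200",
        }
    }
}

print(availability_query("https://covid19.who.int/table"))
print(closest_snapshot(sample_payload))
```

The archived URL returned this way embeds the capture timestamp, which is what makes it usable as a dated citation.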
Application Language(s): N/A
Programming Languages and Technologies: Python
Member(s): Waldemar W. Koczkodaj, Edward Kozłowski, Witold Pedrycz, James Peters, Artur Przelaskowski, E. Rogalska, Taha Rostami, Ryszard Smarzewski, S. Xue, P.F. Zabrodskii