Priyanka P. Pattnaik

Getting the corona data using Web Scraping in Python


Web scraping played a major role in my project, a data analysis of the 2020 COVID-19 pandemic. The project needs to store data that gives live updates on coronavirus patients (active, recovered, deceased, and tested) so that we can analyze it. In this story, I discuss the challenges I faced and the solutions I found while working on the project.


Web scraping in Python can be done with a few libraries, a URL, and some knowledge of HTML tags. First, you need to find the URL of the page that holds the data you want. Scraping can get you that data in a few minutes, whereas collecting it by hand takes a lot of hard work, so you will naturally choose scraping. In Python, BeautifulSoup makes this really handy.
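To make the idea concrete, here is a minimal sketch of the fetch-and-parse flow. It is not the code from my project: the table markup, the counts in it, and the helper names `fetch_html` and `parse_counts` are all made up for illustration, and a real page will need different tag lookups.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a real page; the numbers are invented.
SAMPLE_HTML = """
<table id="cases">
  <tr><th>Status</th><th>Count</th></tr>
  <tr><td>Active</td><td>1200</td></tr>
  <tr><td>Recovered</td><td>800</td></tr>
  <tr><td>Deaths</td><td>45</td></tr>
  <tr><td>Tested</td><td>50000</td></tr>
</table>
"""


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text (not called in this sketch)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_counts(html):
    """Turn each two-cell table row into a {label: number} entry."""
    soup = BeautifulSoup(html, "html.parser")
    counts = {}
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) == 2:  # the header row has <th> cells, so it is skipped
            label = cells[0].get_text(strip=True)
            counts[label] = int(cells[1].get_text(strip=True))
    return counts


print(parse_counts(SAMPLE_HTML))
```

For a live page you would pass the result of `fetch_html(url)` to `parse_counts` instead of the sample string.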

Installing BeautifulSoup:

pip install beautifulsoup4
or
sudo pip install beautifulsoup4

Library requirements:

  1. pandas

  2. requests

  3. BeautifulSoup

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

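Beautiful Soup needs a parser to read the markup. Python's built-in html.parser works out of the box; if you want to use the faster lxml parser instead, you need to install the lxml library.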

pip install lxml 
or
sudo pip install lxml
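The parser is chosen by the second argument to `BeautifulSoup`. The sketch below tries lxml first and falls back to the built-in parser when lxml is not installed; the helper name `make_soup` and the sample markup are assumptions for illustration.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Parse with lxml when it is installed, else with the built-in parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # Raised by Beautiful Soup when the requested parser is unavailable.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p class='note'>Hello <b>world</b></p>")
print(soup.find("p", class_="note").get_text())
```

Either parser gives the same result for simple, well-formed markup like this; they can differ in how they repair broken HTML.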

In my project, I collect the data from covid19india.org. Because the project needs the data in Odia, I convert every number to Odia numerals with a function I defined.
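The conversion itself can be done with a simple character mapping. This is a small sketch of one way to write such a function, not necessarily the one used in my project; the name `to_odia_digits` is hypothetical.

```python
# Map each ASCII digit to the matching Odia digit (U+0B66 to U+0B6F).
ODIA_DIGITS = str.maketrans("0123456789", "୦୧୨୩୪୫୬୭୮୯")


def to_odia_digits(value):
    """Return value as text with every ASCII digit replaced by its Odia digit."""
    # Characters that are not ASCII digits (commas, dots, letters) pass through.
    return str(value).translate(ODIA_DIGITS)


print(to_odia_digits(2020))
```

Because the function accepts anything that `str()` can handle, it works on integers and on already formatted strings such as "1,234".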


Scraping is really helpful when you need data from a website. I recommend using BeautifulSoup for this kind of work.


Thank you for reading.

  • Priyanka P. Pattnaik
