Coronavirus Data Extraction and Visualization
Sunday, March 15, 2020
Since Korea confirmed its first case of coronavirus on January 20th, 2020, the total number of infections had reached 7,869 as of March 12th. Although the outbreak shows signs of being contained in the country, it’s still uncertain how long it will take before we completely beat the coronavirus.
At this point, we think it’s necessary to present the outbreak with a more interactive visual map. The goal of this article is to show you how to use a web scraping tool and a visualization tool to carry out a few fundamental steps of data analysis.
The article consists of two parts: we first use Octoparse to extract web data, then we use FineReport to visualize the data. If you are new to web scraping tools, feel free to watch this video to learn how to extract data from scratch. If you don’t know anything about coding, don’t worry: the article is written to be easy to follow.
Part One: Data Extraction with Octoparse
Data extraction consists of 3 steps:
- Step 1: Build a scraper task by entering the URL
- Step 2: Click to extract the web data
- Step 3: Execute the scraper task
Sounds easy, right? Well, it is very easy!
First, click “Advanced Mode” and paste the web page URL into the box. Then click “Save URL” to proceed. Octoparse will load the web page in its built-in browser for you to click and extract.
Next, click on any table cell and follow the guide that appears on the “Action tips” panel. Choose “Select all sub-elements”, then click “Select all”. Congratulations! We have just created a scraper. Now we should confirm the step by clicking “Extract data in the loop”.
Last but not least, click to execute the scraper.
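For readers curious about what the point-and-click selection does conceptually, here is a minimal Python sketch of pulling every cell out of an HTML table, row by row. It is not how Octoparse works internally; the class name `TableExtractor`, the function `extract_table`, and the sample HTML with placeholder values are all made up for illustration.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Placeholder page content, not real figures.
SAMPLE_HTML = """
<table>
  <tr><th>Region</th><th>Confirmed</th></tr>
  <tr><td>Region A</td><td>12</td></tr>
  <tr><td>Region B</td><td>3</td></tr>
</table>
"""

rows = extract_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

Each inner list corresponds to one table row, which is roughly what “Select all sub-elements” followed by “Select all” gives you in the tool.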
Once we finish fetching the data, we can export it to Excel format and use it to create a map visualization. Data goes out of date quickly, especially time-sensitive data like this. To handle that, you can use Octoparse’s scheduler to put your task on autopilot.
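If you want to sanity-check the export outside of any tool, a short script can help. The sketch below assumes the data was saved as CSV rather than Excel, and that it has columns named `region` and `confirmed`; both column names and the sample rows are hypothetical.

```python
import csv
import io


def parse_count(text):
    """Turn a scraped count such as '1,234' into an int; blank cells become 0."""
    cleaned = text.replace(",", "").strip()
    return int(cleaned) if cleaned else 0


def load_cases(csv_file):
    """Read rows with 'region' and 'confirmed' columns into a dict."""
    reader = csv.DictReader(csv_file)
    return {row["region"].strip(): parse_count(row["confirmed"]) for row in reader}


# Placeholder data standing in for the exported file; not real figures.
sample = io.StringIO('region,confirmed\nRegion A,"1,234"\nRegion B,56\nRegion C,\n')
cases = load_cases(sample)
print(cases)
```

To read a real export, you would pass an opened file instead of the in-memory sample, for example `open("export.csv", newline="", encoding="utf-8")`, where the file name is again just an example.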
Part Two: Data Visualization with FineReport
First, click the “plus” button in the menu bar to select and import the file we just collected. You can check its accuracy in the “preview” window. This is a necessary step that many people tend to skip. We’re working with geolocations and their corresponding data. If FineReport can’t read the dimension as geographical information, it will fail to create the corresponding map. Our data looks fine. Let’s get our map now.
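The same kind of check can be done in code before importing. The sketch below compares scraped region names against the 17 first-level administrative divisions of South Korea, written with common romanized short names. The exact spellings FineReport expects are not stated in this article, so treat the list and the helper `unrecognized_regions` as assumptions to adapt.

```python
# Common romanized short names; the spelling your map tool expects may differ.
KOREA_REGIONS = {
    "Seoul", "Busan", "Daegu", "Incheon", "Gwangju", "Daejeon", "Ulsan",
    "Sejong", "Gyeonggi", "Gangwon", "Chungbuk", "Chungnam", "Jeonbuk",
    "Jeonnam", "Gyeongbuk", "Gyeongnam", "Jeju",
}


def unrecognized_regions(names, known=KOREA_REGIONS):
    """Return the names that match no known region, ignoring case and outer spaces."""
    known_lower = {k.lower() for k in known}
    return [n for n in names if n.strip().lower() not in known_lower]


# Hypothetical scraped values, including a typo and a summary row.
scraped = ["Seoul", " daegu ", "Gyeongbuk", "Soul", "Total"]
problems = unrecognized_regions(scraped)
print(problems)
```

Anything the function returns, such as a misspelled name or a “Total” row, is a candidate for cleanup before the map is built.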
To add a map layer, click “edit” and select “Korea”. Boom! Your map appears! And it looks great! Now we need to put points on the map to show the level of severity at each geolocation we collected. To do this, click “data” to connect case numbers with each geolocation.
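One way to think about “level of severity” is as a set of bins over the case counts, which a map can then render as colors or point sizes. The following sketch is only an illustration: the thresholds and labels are hypothetical and are not taken from FineReport or from the dashboard in this article.

```python
from bisect import bisect_right

THRESHOLDS = [10, 100, 1000]                    # hypothetical cut points
LABELS = ["low", "moderate", "high", "severe"]  # one more label than thresholds


def severity(count):
    """Map a case count to a label: below 10 is low, 10 to 99 moderate, and so on."""
    return LABELS[bisect_right(THRESHOLDS, count)]


# Placeholder counts, not real figures.
for region, count in {"Region A": 5, "Region B": 250, "Region C": 4000}.items():
    print(region, count, severity(count))
```

Using `bisect_right` means a count equal to a threshold falls into the higher bin, so exactly 10 cases is labeled moderate rather than low.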
We still need to make some final tweaks to make it pretty. Change the color and edit the format. Then refresh the screen.
Now we have successfully created a map visualization that displays the outbreak. Besides that, I also created data tables, bubble charts, line charts and so on, and assembled them into one dashboard.
The best part is that we can make this dashboard live by importing the data via an API, and this is achievable with Octoparse.
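To give a rough idea of what a live refresh involves, the sketch below merges a freshly fetched batch into the current snapshot and reports which regions changed. The JSON shape used here is hypothetical and is not the actual response format of the Octoparse API; the HTTP request itself is left out, so `payload` stands in for whatever text the API call returns.

```python
import json


def parse_payload(payload):
    """Convert a JSON array of {"region": ..., "confirmed": ...} objects to a dict."""
    return {item["region"]: int(item["confirmed"]) for item in json.loads(payload)}


def refresh(current, payload):
    """Return the updated snapshot and the sorted list of regions whose counts changed."""
    latest = parse_payload(payload)
    changed = sorted(r for r, n in latest.items() if current.get(r) != n)
    return {**current, **latest}, changed


# Placeholder values, not real figures.
snapshot = {"Region A": 10, "Region B": 5}
payload = (
    '[{"region": "Region A", "confirmed": 12},'
    ' {"region": "Region B", "confirmed": 5},'
    ' {"region": "Region C", "confirmed": 1}]'
)
snapshot, changed = refresh(snapshot, payload)
print(snapshot, changed)
```

Running `refresh` on a schedule and redrawing only the regions in `changed` is one simple way to keep a dashboard current.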
Ashley is a data enthusiast and passionate blogger with hands-on experience in web scraping. She focuses on capturing web data and analyzing it in a way that empowers companies and businesses with actionable insights. Read her blog here to discover practical tips and applications on web data extraction.
Spanish version of this article: Extracción y Visualización de Datos de Coronavirus
Originally published at http://www.dataextraction.io on March 13, 2020.