
Cloud Extraction Works 24/7 with Speed 3-10 Times Faster than Local Extraction

Wednesday, February 19, 2020

A few years ago, we wrote a web crawler to parse and extract data from websites. The most painful part of that work was that extraction tasks could be interrupted. For example, a computer might shut down unexpectedly, or the target website might block our IP address because of frequent access.

In order to resolve this problem, we’ve developed Cloud Extraction.



#1 Cloud Extraction

Cloud Extraction means that your data extraction tasks run in the cloud. You configure a rule and upload it to our cloud platform. Central control commands then assign your task to one or several cloud servers, which extract data simultaneously. For example, suppose you have configured a rule to extract data across 99 pages. The task is automatically divided into three sections and evenly assigned to three cloud servers, so each server extracts 33 pages at the same time. In this way, extracting the 99 pages takes roughly one third of the original time.
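The splitting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Octoparse's actual scheduler: `split_pages`, `extract_page`, and the three-worker thread pool are hypothetical stand-ins for the cloud servers and the real extraction rule.

```python
from concurrent.futures import ThreadPoolExecutor


def split_pages(total_pages, workers):
    """Divide page numbers 1..total_pages into `workers` contiguous, nearly equal chunks."""
    base, extra = divmod(total_pages, workers)
    chunks, start = [], 1
    for i in range(workers):
        # The first `extra` chunks take one additional page when the split is uneven.
        size = base + (1 if i < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def extract_page(page):
    """Hypothetical placeholder for extracting one page with a configured rule."""
    return f"data from page {page}"


def extract_chunk(chunk):
    """Work done by one (simulated) cloud server: extract every page in its chunk."""
    return [extract_page(page) for page in chunk]


def run_in_parallel(total_pages, workers=3):
    """Extract all pages using `workers` concurrent workers and merge the results in page order."""
    chunks = split_pages(total_pages, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_chunk = list(pool.map(extract_chunk, chunks))
    return [row for rows in per_chunk for row in rows]


chunks = split_pages(99, 3)
rows = run_in_parallel(99, workers=3)
print([len(chunk) for chunk in chunks], len(rows))
```

With 99 pages and three workers, each chunk holds 33 pages. When the page count does not divide evenly, the first chunks receive one extra page each.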

[Figure: Octoparse Cloud Extraction]



#2 Avoid Having Your IP Blacklisted

Moreover, Cloud Extraction handles various errors for you, so you don’t have to worry about occasional network interruptions anymore. When one occurs, the cloud servers resume their work as soon as the network connection is available again. You also no longer need to worry about your IP being blacklisted. Cloud Extraction provides a large number of IP addresses in the Professional Edition, and it addresses this issue by assigning your tasks to several cloud servers, which also speeds up extraction.
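To make the "resume after an interruption" behavior concrete, here is a minimal Python sketch. It assumes a hypothetical `fetch` function that raises `ConnectionError` when the network drops. The names `extract_with_resume` and `flaky_fetch` are invented for this example and do not describe how the cloud servers are actually implemented.

```python
import time


def extract_with_resume(pages, fetch, max_retries=3, wait_seconds=0.0):
    """Extract each page, retrying the failed page after a connection error
    instead of restarting the whole run."""
    results = {}
    for page in pages:
        for attempt in range(1, max_retries + 1):
            try:
                results[page] = fetch(page)
                break  # this page is done, move on to the next one
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up only after the last allowed attempt
                time.sleep(wait_seconds * attempt)  # wait a little longer each time
    return results


# Simulated flaky network (assumption for the demo): page 2 fails once, then succeeds.
failures = {2: 1}


def flaky_fetch(page):
    if failures.get(page, 0) > 0:
        failures[page] -= 1
        raise ConnectionError(f"network down while fetching page {page}")
    return f"data from page {page}"


results = extract_with_resume([1, 2, 3], flaky_fetch)
print(results)
```

Pages that were already extracted stay in `results`, so an interruption costs only the retry of the page that failed.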


#3 API

If you need to extract data at a specified time or update your data once an hour, you can set up a scheduled task for Cloud Extraction.
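As a small illustration of an hourly schedule, the following Python sketch computes upcoming run times. The function `next_runs` and the start time are assumptions made for this example. In Octoparse itself, the schedule is configured in the product rather than in code.

```python
from datetime import datetime, timedelta


def next_runs(start, interval, count):
    """Return `count` run times spaced `interval` apart, beginning at `start`."""
    return [start + interval * i for i in range(count)]


# Example: three hourly runs starting at 09:00 on the day this post was published.
runs = next_runs(datetime(2020, 2, 19, 9, 0), timedelta(hours=1), 3)
for run in runs:
    print(run.isoformat())
```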

If you find that some data has not been extracted, you can launch Octoparse to extract the missing data again.

The cloud service also provides an API that links your system closely with Octoparse, so you can export the extracted data directly into your database. For those who want to update their system data in real time, Octoparse is your best choice. Just set up a schedule to obtain the latest data, and your system is then updated automatically.
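The export-to-database step might look roughly like the Python sketch below. It does not call the real Octoparse API: `fetch_latest_rows` is a hypothetical stand-in that returns hard-coded rows, and the `items` table with its columns is an assumption made for the example. The point is the upsert pattern, which lets a scheduled job run repeatedly without creating duplicate rows.

```python
import sqlite3


def fetch_latest_rows():
    """Hypothetical stand-in for an API call that returns the latest extracted rows."""
    return [
        {"url": "https://example.com/item/1", "title": "Item 1", "price": "9.99"},
        {"url": "https://example.com/item/2", "title": "Item 2", "price": "19.99"},
    ]


def save_rows(conn, rows):
    """Insert new rows and overwrite rows whose URL is already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (url TEXT PRIMARY KEY, title TEXT, price TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO items (url, title, price) VALUES (:url, :title, :price)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, used only for the demo
save_rows(conn, fetch_latest_rows())
save_rows(conn, fetch_latest_rows())  # a second scheduled run does not duplicate rows
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)
```

Keying the table on `url` is a design choice for this sketch. Any stable identifier in your extracted data can serve the same purpose.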


Octoparse API documents:


We are pleased to announce that we have released a new version of Octoparse, and we are very excited about its unique features. Octoparse is a free web scraper for collecting data from the web. Building on its popularity in the Chinese market, where Octoparse already has more than 180,000 users, we decided to enter the international market.

We are glad to help, and we want to make our product even better for you. If you find that a feature is missing, please feel free to contact us.




Author: The Octoparse Team 


More Resources


Web Scraping Templates Take Away

Locate Element with XPath

Octoparse Regular Expression Tool (RegEx)

Deal with AJAX

Cloud Extraction: Scrape at Large Scale

Connect Octoparse API Step by Step





