After listening to your feedback, we learned that scraping websites is not an easy task. We started off by creating an extension that lets you do this in a simpler, more visual way, but quickly learned that we could take this a step further!
All of this fits into the vision of our main company, Roadwork, where we want to make the web work for you, autonomously!
We often saw that, when selecting elements through our visual tool, a single misclick is enough to produce an unexpected result. After researching this problem, we identified the following pain points:
- Authenticated pages are hard: how do you know the scraper can access them? How do you transfer all your cookies easily?
- Multiple URLs require you to scrape each page manually, which makes it a time-consuming process when you want to scrape more than 10 pages with the same layout and structure
- Sometimes you just can't get the correct element selected, so you have to try over and over again, or you conclude that the tool is simply not working
To solve the pain points above, we started by giving you a more detailed overview of what is happening, through the following features:
- Logs (what is happening, where are things failing, how long does it take?)
- Screenshots (how does the scraper see your page?)
- Manual Scrape (scrape again when something fails)
Now we are ready for the next phase! We are happy to introduce you to the world of "Recipes"!
What are Recipes?
A recipe is a pre-created set of properties that will be scraped from the website you provide. Examples include:
- Gathering the latest URLs from different Instagram profiles
- Gathering the latest tweets from Twitter for a certain hashtag
- Gathering the latest updates on the stock market
- Gathering the latest news articles from your favorite news website
As a first example, we posted an introduction on how you can get started with Instagram! We're very excited about this feature and can't wait to hear which other sites you would love to see!
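To make the idea of a recipe as "a pre-created set of properties" a bit more concrete, here is a minimal sketch in Python. It is purely illustrative and not how Recipes are implemented: the `Recipe` class, the `apply_recipe` function, the `(tag, class)` rule format, and the sample page are all assumptions made up for this example. It only uses the Python standard library.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Recipe:
    """Hypothetical recipe: a name plus rules mapping property -> (tag, css_class)."""
    name: str
    properties: dict[str, tuple[str, str]]


class _FirstMatch(HTMLParser):
    """Collects the text of the first element with the given tag and class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0       # > 0 while we are inside the matched element
        self.done = False    # True once the matched element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # Track nested elements with the same tag so we close at the right one.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def apply_recipe(recipe, html):
    """Return {property: text or None} for one page, following the recipe's rules."""
    result = {}
    for prop, (tag, css_class) in recipe.properties.items():
        parser = _FirstMatch(tag, css_class)
        parser.feed(html)
        result[prop] = "".join(parser.chunks).strip() or None
    return result


# Made-up page and recipe, for illustration only.
SAMPLE_PAGE = (
    '<article><h2 class="title">Hello <b>world</b></h2>'
    '<span class="author">Ada</span></article>'
)
news_recipe = Recipe(
    name="news-article",
    properties={
        "title": ("h2", "title"),
        "author": ("span", "author"),
        "date": ("time", "published"),  # not on the sample page
    },
)

scraped = apply_recipe(news_recipe, SAMPLE_PAGE)
print(scraped)
```

Because the rules live in the recipe rather than in a one-off visual selection, the same recipe could in principle be applied to every page that shares the same layout, and a property that cannot be found shows up as `None` instead of silently going missing.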