Real estate sites primary target for web scrapers

Real estate websites are the primary target of web scraping, according to a new report by Distil Networks.

The bot detection and mitigation company has released a new study, entitled “The 2016 Economics of Web Scraping”, which explains what web scraping is, who is using it, against whom, and for what purpose.

The report says real estate sites are the number one victim, followed by digital publishing, travel, online directories, e-commerce, and marketplace and classifieds.

So what is this web scraping in the first place? Distil Networks describes it as a “computer software technique for extracting information from websites, and often includes transforming unstructured website data into a database for analysis or repurposing content into the web scraper’s own website and business operations.”
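As a rough illustration of that definition, the sketch below turns a scrap of page markup into structured rows using only Python's standard library. It is not taken from the Distil Networks report: the sample HTML, the class names `listing`, `address` and `price`, and the helper `scrape_listings` are all invented for this example, and a real scraper would also have to fetch pages over the network and cope with far messier markup.

```python
from html.parser import HTMLParser

# Hypothetical listing markup, invented for illustration only.
SAMPLE_HTML = """
<ul>
  <li class="listing"><span class="address">12 Oak Lane</span><span class="price">$450,000</span></li>
  <li class="listing"><span class="address">7 Elm Street</span><span class="price">$325,500</span></li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects address/price pairs from the sample markup above."""

    def __init__(self):
        super().__init__()
        self.rows = []      # structured output, one dict per listing
        self._field = None  # name of the span currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "listing":
            self.rows.append({})  # start a new record
        elif tag == "span" and css_class in ("address", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a span we care about.
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_listings(html):
    """Return a list of {'address': str, 'price': int} dicts (hypothetical helper)."""
    parser = ListingParser()
    parser.feed(html)
    # Convert price text such as "$450,000" into an integer.
    return [
        {"address": r["address"], "price": int(r["price"].lstrip("$").replace(",", ""))}
        for r in parser.rows
    ]


rows = scrape_listings(SAMPLE_HTML)
```

Once the data is in this shape it can be loaded into a database or republished elsewhere, which is the "repurposing" the definition refers to.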

So, data gathering, corporate espionage, that kind of stuff. The scraping is usually done by bots, which can do the work faster and for longer periods of time than people can. Bots are also much cheaper than humans: web scraping services apparently cost $3.33 per hour (£2.54).

Annually, the average scraper can earn as much as $58,000 (£44,239). If they’re working for a large company, earnings can go as high as $128,000 (£97,000). “If your content can be viewed on the web, it can be scraped,” said Rami Essaid, CEO and co-founder of Distil Networks. 

“Not only does web scraping pose a critical challenge to a website’s brand, it can threaten sales and conversions, lower SEO rankings, or undermine the integrity of content that took considerable time and resources to produce. Understanding the pervasive nature of today’s web scraping economy not only raises awareness about this growing challenge, it also allows website owners to take action in the protection of their proprietary information.” 

The report also says that approximately two per cent of online revenue is lost due to web scraping.

Image source: Shutterstock/Eugenio Marongiu