How to Scrape Product Data from eBay

KAVUNKA
Personal Search Engine / Powerful Crawlers / Fast WEB Scraping
Try for Free
eBay is an American e-commerce company that runs one of the world's largest online marketplaces, with millions of products from sellers located around the world. In this guide, I will show you how to collect data on the graphics cards sold there. We will fetch the pages from eBay.com without Selenium, so no browser has to be launched and data retrieval stays fast. For each product we will extract the following fields:
  • NAME - product name
  • BRAND - product brand
  • COND - condition
  • PRICE - price
  • LOC - item location
  • SOLD - number of units sold
  • AVAL - number of units available
  • IMG - image
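As a rough illustration of fetching a page without Selenium, here is a minimal Python sketch that uses only the standard library. It is not the Kavunka crawler; the names `USER_AGENT`, `build_request`, and `fetch_html` are hypothetical, and the User-Agent value is an assumption (some sites may reject requests that do not look like a browser).

```python
import urllib.request

# Hypothetical header value, chosen for illustration only.
USER_AGENT = "Mozilla/5.0 (compatible; example-scraper/0.1)"


def build_request(url):
    """Prepare a plain HTTP GET request with a User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url, timeout=10.0):
    """Download a page and decode it as text. No JavaScript is executed."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html` with a product page URL returns the raw HTML as a string, which is the input the regular expressions in this guide operate on.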
We will test our regular expressions using this example.
Let's create a regular expression for the product name. This is fairly easy to do since the title is enclosed between the <h1> and </h1> tags. Let's use the "Regular expressions designer" and fill in the appropriate fields.

NAME
Step 1:
(?<=<h1).*?(?=</h1>)
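To see what this expression returns, here is a small sketch that applies it with Python's `re` module instead of the designer. The HTML fragment is a hypothetical example, not copied from a real eBay page, and `extract_name` is an illustrative helper. Because the lookbehind ends at `<h1`, the raw match still contains the rest of the opening tag, so the sketch trims it. Note also that `.` does not match line breaks in Python unless `re.DOTALL` is set.

```python
import re

# Pattern from the guide, unchanged.
NAME_RE = re.compile(r'(?<=<h1).*?(?=</h1>)')

# Hypothetical fragment for illustration; not copied from a real eBay page.
SAMPLE_HTML = '<h1 class="title" itemprop="name">NVIDIA GeForce GTX 1080 8GB</h1>'


def extract_name(html):
    """Return the text of the first <h1>, or None when nothing matches."""
    match = NAME_RE.search(html)
    if match is None:
        return None
    # The match starts right after "<h1", so it still begins with the tag's
    # attributes and the closing ">"; keep only what follows that ">".
    return match.group(0).split(">", 1)[-1].strip()


print(extract_name(SAMPLE_HTML))  # prints: NVIDIA GeForce GTX 1080 8GB
```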
Next, we get the BRAND value. Typically, the brand name is listed under "Item specifics".
Select "Brand: NVIDIA" on the page and look at the HTML code of the selection. As you can see, eBay uses itemprop microdata. This is very good, as we can use these attributes to locate the data we need.

BRAND
Step 1:
(?<=<span\ itemprop="brand").*?(?=<\/td>)
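The BRAND expression captures everything from the `itemprop="brand"` attribute up to the closing `</td>`, so the raw match includes nested tags. The sketch below, written with Python's `re` module, shows one way to reduce that match to plain text. The table row is a hypothetical fragment, and `TAG_RE` and `extract_brand` are illustrative names that do not appear in the guide.

```python
import re

# Pattern from the guide, unchanged.
BRAND_RE = re.compile(r'(?<=<span\ itemprop="brand").*?(?=<\/td>)')
# Assumed helper pattern: matches any single HTML tag.
TAG_RE = re.compile(r'<[^>]+>')

# Hypothetical "Item specifics" row; real markup may differ.
SAMPLE_HTML = (
    '<tr><td>Brand:</td>'
    '<td><span itemprop="brand" itemscope itemtype="http://schema.org/Brand">'
    '<span itemprop="name">NVIDIA</span></span></td></tr>'
)


def extract_brand(html):
    """Return the brand as plain text, or None when nothing matches."""
    match = BRAND_RE.search(html)
    if match is None:
        return None
    # Drop the remainder of the opening <span ...> tag, then strip inner tags.
    inner = match.group(0).split(">", 1)[-1]
    return TAG_RE.sub("", inner).strip()


print(extract_brand(SAMPLE_HTML))  # prints: NVIDIA
```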

What is itemprop?

Microdata (micro-markup) is markup embedded in a page's HTML that helps search bots better recognize the content of the site. The syntax is as follows:
itemprop="<property>"
The attribute is placed on the tag that contains the corresponding information. Property names are usually taken from the Schema.org vocabulary, a project founded by Google, Microsoft, Yahoo, and Yandex.
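A quick way to find out which properties a page exposes is to list every `itemprop` value in its HTML. The following Python sketch does that with a simple pattern; `ITEMPROP_RE`, `list_itemprops`, and the sample fragment are assumptions made for illustration.

```python
import re

# Assumed helper pattern: captures the value of every itemprop attribute,
# whether it is written with single or double quotes.
ITEMPROP_RE = re.compile(r'''itemprop=["']([^"']+)["']''')

# Hypothetical fragment; not copied from a real eBay page.
SAMPLE_HTML = (
    '<h1 itemprop="name">Example card</h1>'
    '<span itemprop="price" content="249.99">US $249.99</span>'
    "<span itemprop='availableAtOrFrom'>Austin, Texas</span>"
)


def list_itemprops(html):
    """Return all itemprop values in document order."""
    return ITEMPROP_RE.findall(html)


print(list_itemprops(SAMPLE_HTML))  # prints: ['name', 'price', 'availableAtOrFrom']
```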
Next, we will do the same for the rest of the fields. You can also read a guide on how to get product data from Amazon. The resulting regular expressions for scraping are:

COND
Step 1:
(?<=itemprop="itemCondition").*?(?=</div>)

PRICE
Step 1:
(?<=itemprop="price").*?(?=</)

LOC
Step 1:
(?<=itemprop=['"]availableAtOrFrom['"]>).*?(?=</)
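The three microdata-based expressions above follow the same shape, so they can be applied in one loop. Here is a sketch under the assumption that each value sits on a single line; the sample markup is hypothetical, and `FIELD_PATTERNS` and `extract_fields` are illustrative names.

```python
import re

# Patterns from the guide, unchanged.
FIELD_PATTERNS = {
    "COND": re.compile(r'(?<=itemprop="itemCondition").*?(?=</div>)'),
    "PRICE": re.compile(r'(?<=itemprop="price").*?(?=</)'),
    "LOC": re.compile(r'''(?<=itemprop=['"]availableAtOrFrom['"]>).*?(?=</)'''),
}

# Hypothetical fragment; not copied from a real eBay page.
SAMPLE_HTML = (
    '<div itemprop="itemCondition">Used</div>\n'
    '<span itemprop="price" content="249.99">US $249.99</span>\n'
    '<span itemprop="availableAtOrFrom">Austin, Texas, United States</span>\n'
)


def extract_fields(html):
    """Apply every pattern and return a dict of cleaned values (or None)."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(html)
        if match is None:
            result[name] = None
            continue
        # COND and PRICE matches still contain the end of the opening tag;
        # keep only the text after the first ">" (a no-op for LOC).
        result[name] = match.group(0).split(">", 1)[-1].strip()
    return result


print(extract_fields(SAMPLE_HTML))
```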

SOLD
Step 1:
(?<=id="why2buy")[\w\W]*?(?=Sold)

AVAL
Step 1:
(?<=<span\ id="qtySubTxt">)[\w\W]*?(?=</span)
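The SOLD and AVAL expressions use `[\w\W]` instead of `.` so that the match can span line breaks. The sketch below applies both with Python's `re` module and cleans up the raw matches. The HTML is a hypothetical fragment, and `TAG_RE`, `TRAILING_NUMBER_RE`, `extract_sold`, and `extract_available` are assumed helpers, not part of the guide.

```python
import re

# Patterns from the guide, unchanged.
SOLD_RE = re.compile(r'(?<=id="why2buy")[\w\W]*?(?=Sold)')
AVAL_RE = re.compile(r'(?<=<span\ id="qtySubTxt">)[\w\W]*?(?=</span)')
# Assumed helper patterns for cleaning the raw matches.
TAG_RE = re.compile(r'<[^>]+>')
TRAILING_NUMBER_RE = re.compile(r'(\d[\d,]*)\s*$')

# Hypothetical multi-line fragment; real markup may differ.
SAMPLE_HTML = """
<div id="why2buy">
  <div><span>37 Sold</span></div>
</div>
<span id="qtySubTxt">
  <span>More than 10 available</span>
</span>
"""


def extract_sold(html):
    """Return the number that directly precedes the word "Sold", or None."""
    match = SOLD_RE.search(html)
    if match is None:
        return None
    number = TRAILING_NUMBER_RE.search(match.group(0))
    if number is None:
        return None
    return int(number.group(1).replace(",", ""))


def extract_available(html):
    """Return the availability text with tags and whitespace removed."""
    match = AVAL_RE.search(html)
    if match is None:
        return None
    return TAG_RE.sub("", match.group(0)).strip()


print(extract_sold(SAMPLE_HTML), extract_available(SAMPLE_HTML))
```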

IMG
Step 1:
(?<=itemprop="image").*?(?=style\=)
Step 2:
(?<=src=").*?(?=")
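IMG is the only field that needs two steps: the second expression runs on the result of the first one, not on the whole page. A minimal Python sketch of this chaining follows; `apply_steps` is a hypothetical helper, and the image tag and URL are placeholders.

```python
import re

# Patterns from the guide, in the order they are applied.
IMG_STEPS = [
    re.compile(r'(?<=itemprop="image").*?(?=style\=)'),
    re.compile(r'(?<=src=").*?(?=")'),
]

# Hypothetical tag with a placeholder URL.
SAMPLE_HTML = (
    '<img id="mainImg" itemprop="image" '
    'src="https://example.com/images/card.jpg" style="max-width: 500px">'
)


def apply_steps(text, steps):
    """Feed the match of each step into the next one; None if any step fails."""
    for pattern in steps:
        match = pattern.search(text)
        if match is None:
            return None
        text = match.group(0)
    return text


print(apply_steps(SAMPLE_HTML, IMG_STEPS))  # prints: https://example.com/images/card.jpg
```

Step 1 narrows the input to the attributes of the image tag, which keeps step 2 from picking up the `src` of some unrelated element earlier in the page.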

Hopefully this guide will help you get the data you need for your business!