What is Kavunka?
Kavunka is a search engine and web scraper that you can install on your own server or virtual machine. Its built-in features let you parse websites and build a more informative SERP than a plain list of links. In response to a user query, the search engine can return not only a text snippet but also a picture, rating, price, product characteristics, etc.
Minimum System Requirements:
OS: CentOS 7 (64-bit)
CPU: 2.0 GHz
RAM: 2 GB
Kavunka crawls a website in several parallel streams and scrapes data from its web pages. It recognizes the language of each web page; currently supported: en, it, fr, pt, es, pl, uk, ru. Kavunka can correct errors in a user's search query and return hits that match the corrected query. Its search robots can bypass site protection by using proxies and random user agents.
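The query correction mentioned above can be illustrated with a minimal sketch. This is not Kavunka's actual algorithm, which this text does not describe. It assumes a hypothetical vocabulary of indexed words (VOCABULARY) and a hypothetical helper correct_query, and uses Python's standard difflib module to replace each unknown query word with its closest known word.

```python
import difflib

# Hypothetical vocabulary; a real engine would build this from its own index.
VOCABULARY = ["cheap", "laptop", "review", "price", "games", "cars", "medicine"]


def correct_query(query, vocabulary=VOCABULARY, cutoff=0.7):
    """Replace each word missing from the vocabulary with its closest match.

    Words with no sufficiently similar match are left unchanged.
    """
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns up to n candidates scoring at least cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


print(correct_query("cheep lapto reviw"))  # prints: cheap laptop review
```

The cutoff value is an assumption chosen for illustration: a lower cutoff corrects more aggressively but risks replacing valid words that simply are not in the vocabulary.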
How can this be used?
1. You can use Kavunka to provide search for individual sites. Add the site to a task and the crawlers will start indexing it. Then install a simple search form on the site that needs to be searched.
2. This technology is useful to website owners who want to attract additional traffic. You can hand-pick sites on a particular subject (for example, games, cars, real estate, or medicine) and add them to your Kavunka search. You can also configure the parser to extract the information you are interested in from the donor sites, such as prices, user ratings, and reviews. You can then customize the search so that this information is presented in a convenient form, which makes the SERP more interesting and informative for your users. As a result, your site gets a search service similar in functionality to Google or Bing. This can increase the credibility of your site in the eyes of users and lets them search the Internet without leaving it.
3. You can also use the software to scrape a large number of web pages. After a site has been crawled, you can receive a scraping report in one of several formats: JSON, CSV, or XML.
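To make the report formats concrete, here is a minimal sketch of how scraped records might be serialized to JSON, CSV, and XML. It does not reproduce Kavunka's actual report layout, which this text does not specify. The records, field names (url, title, price, rating), element names (report, page), and helper functions are all hypothetical, and only the Python standard library is used.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical scraped records; the field names are illustrative only.
records = [
    {"url": "https://example.com/item/1", "title": "Item one", "price": "19.99", "rating": "4.5"},
    {"url": "https://example.com/item/2", "title": "Item two", "price": "5.00", "rating": "3.8"},
]


def to_json(rows):
    """Serialize the records as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the records as CSV with a header row taken from the first record."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows):
    """Serialize the records as XML: one <page> element per record."""
    root = ET.Element("report")
    for row in rows:
        page = ET.SubElement(root, "page")
        for key, value in row.items():
            ET.SubElement(page, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_csv(records))
```

The same list of records feeds all three functions, so the choice of format depends only on what the consuming tool expects: CSV for spreadsheets, JSON or XML for other programs.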