What is the right way to get Google search results?

I want to get all the search results for a specific keyword on Google. I have seen suggestions to scrape the results, but that seems like a bad idea. I have also seen gems (I plan to use Ruby) that do the scraping, gems that use the API, and suggestions to use the API directly.

Does anyone know a better way to do this right now? The API is no longer supported, and I have seen people report that it returns unusable data. Do the gems help solve this or not?

Thanks in advance.

6 answers

According to http://code.google.com/apis/websearch/ , the Search API is deprecated, but there is a replacement: the Custom Search API. Will this do what you want?

If so, a quick web search turned up the google_custom_search gem (https://github.com/alexreisner/google_custom_search), among others.


I also use the scraping option: it is faster than asking Google for an API key, and you are not limited to 100 searches per day. Google's TOS is a problem though, as Richard points out. Here is an example I wrote for myself; it is useful if you want to connect through a proxy:

    require 'rubygems'
    require 'mechanize'

    agent = Mechanize.new
    # Optional: route requests through a proxy
    agent.set_proxy '78.186.178.153', 8080
    page = agent.get('http://www.google.com/')

    # Fill in and submit the search form
    google_form = page.form('f')
    google_form.q = 'new york city council'
    page = agent.submit(google_form, google_form.buttons.first)

    # Result links look like /url?q=<target>&...; print the target URL
    page.links.each do |link|
      href = link.href.to_s
      next unless href =~ /url.q/
      puts href.split(/=|&/)[1]
    end
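The example above pulls the target out of each result link by splitting the href on `=` and `&`, which would cut the URL short if the target itself contained either character unescaped. A minimal sketch of a stricter extraction, assuming result hrefs have the form `/url?q=<target>&...` as in the example (the helper name `extract_result_url` is hypothetical, and Google may change this markup at any time):

```ruby
require 'uri'

# Returns the target URL from a Google redirect href such as
# "/url?q=http://example.com/&sa=U", or nil if the href is not a redirect link.
def extract_result_url(href)
  uri = URI.parse(href.to_s)
  return nil unless uri.path == '/url' && uri.query

  # decode_www_form splits the query into [key, value] pairs
  # and percent-decodes each value.
  URI.decode_www_form(uri.query).to_h['q']
rescue URI::InvalidURIError, ArgumentError
  nil
end
```

Inside the `page.links.each` loop, `extract_result_url(link.href)` could stand in for the regex test and the split; links for which it returns `nil` would simply be skipped.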

You will end up with 503 errors if you scrape the Google search results page. A more scalable (and legal) approach is to use the Custom Search API.

The API provides 100 searches per day for free. If you need more, you can sign up for billing in the Google Developers Console. Additional requests cost $5 per 1,000 requests, up to 10,000 requests per day.
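To make the pricing concrete, here is a rough cost estimate using only the figures quoted above. It assumes the 100 free queries are deducted first and that billing is prorated per query; both are assumptions, as are the constant and method names, so check the current pricing page before relying on the numbers.

```ruby
FREE_QUERIES_PER_DAY = 100      # free tier quoted above
MAX_QUERIES_PER_DAY  = 10_000   # daily cap quoted above
PRICE_PER_1000_USD   = 5.0      # price quoted above

# Estimated cost in USD for a given number of queries in one day.
def estimated_daily_cost(queries)
  unless (0..MAX_QUERIES_PER_DAY).cover?(queries)
    raise ArgumentError, "queries must be between 0 and #{MAX_QUERIES_PER_DAY}"
  end

  # Only queries beyond the free tier are billed (assumption).
  billable = [queries - FREE_QUERIES_PER_DAY, 0].max
  billable * PRICE_PER_1000_USD / 1000
end
```

Under these assumptions, 2,500 queries in a day would come to about $12, and the 10,000-query cap to about $49.50.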

The following example fetches Google search results in JSON format:

    require 'uri'
    require 'httparty'
    require 'pp'

    def get_google_search_results(search_phrase)
      # API key from the Google Developers Console
      api_key = "Your api key here"
      # ID of your custom search engine (the API requires the cx parameter)
      engine_id = "Your search engine id here"
      # URL-encode the search phrase (URI::encode was removed in Ruby 3.0)
      search_phrase_encoded = URI.encode_www_form_component(search_phrase)
      # Request the results; num accepts at most 10 per request
      response = HTTParty.get("https://www.googleapis.com/customsearch/v1?q=#{search_phrase_encoded}&key=#{api_key}&cx=#{engine_id}&num=10")
      # Pretty-print the API response
      pp response
      # Return the URL of the first search result
      response["items"][0]["link"]
    end

    get_google_search_results("Top Movies in Theatres")
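The Custom Search JSON API returns at most 10 results per request, so collecting more means paging with the `start` parameter (a 1-based index of the first result to return). Below is a minimal sketch that only builds the request URLs; the method name `custom_search_page_urls` and its keyword arguments are hypothetical, and `cx` is the ID of your own search engine.

```ruby
require 'uri'

CUSTOM_SEARCH_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'

# Builds one request URL per page of results.
def custom_search_page_urls(search_phrase, api_key:, engine_id:, pages: 3, per_page: 10)
  (0...pages).map do |page|
    query = URI.encode_www_form(
      q: search_phrase,
      key: api_key,
      cx: engine_id,
      num: per_page,
      start: page * per_page + 1 # 1, 11, 21, ... when per_page is 10
    )
    "#{CUSTOM_SEARCH_ENDPOINT}?#{query}"
  end
end
```

Each URL can then be fetched with `HTTParty.get(url)` as in the example above, and the `"items"` arrays from the responses concatenated. Every page counts as a separate query against the daily quota.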

You can also use the SerpApi API. We take care of the hard parts of scraping and parsing Google search results. We have Ruby bindings, and using them is as simple as:

    query = GoogleSearchResults.new q: "coffee"
    hash_results = query.get_hash

Repository: https://github.com/serpapi/google-search-results-ruby


The Custom Search API is most likely not what you are looking for. I am fairly sure you have to set up a custom search engine that the API then queries, and that engine can only search a user-specified set of domains (i.e. you cannot perform a general web search).

If you need to do a general Google search, then scraping is currently the only way. It is very easy to write Ruby code that runs a Google search and scrapes the URLs of the search results (I did it myself for a summer research project), but it violates Google's TOS, so be warned.


Source: https://habr.com/ru/post/888212/

