The main approaches for getting data from a page or site for a mashup are:
Scraping via AJAX:
This works on almost all pages, although it will not work on pages that load the content you want via AJAX. It can also occasionally get tricky for sites that require authentication or that restrict requests by referrer.
Use GM_xmlhttpRequest() in most cases, to allow cross-domain requests. This approach is described in detail below.
Loading the resource page in an <iframe>:
This approach works on AJAX-ified pages and can be coded to let the user deal with login problems manually. But it is slower, more resource intensive, and more complicated to code.
Since this question does not appear to need this technique, see "How do I get an AJAX request to wait for a page to appear before returning an answer?" for more information about it.
Use the site's API, if it has one:
Alas, most sites do not have an API, so this is probably not an option for you, but it is worth checking whether one is offered. An API is generally the best approach, if available. Do a new search / question for more information on this approach.
Mimicking the site's AJAX calls, if it makes such calls for the information you want:
This option is also not applicable to most sites, but it can be a clean, effective method when it is. Do a new search / question for more information on this approach.
Getting value(s) from a sequence of web pages, via cross-domain-capable AJAX:
Use GM_xmlhttpRequest() to load pages and jQuery to process their HTML.
Use the GM_xmlhttpRequest() onload function to request the next page, if necessary; do not try to use synchronous AJAX calls.
The core logic from your original script moves inside the onload function, except that you no longer need to remember values between Greasemonkey runs.
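The chaining described in the steps above can be sketched in isolation. This is a minimal illustration rather than the script itself: `sumAcrossPages`, `extractValue`, and `hasNextPage` are hypothetical names, and `requestPage` is assumed to accept the same options-object shape as GM_xmlhttpRequest() so that the flow can also be exercised outside a userscript.

```javascript
// Sketch only. `requestPage` is a stand-in with the same options-object shape as
// GM_xmlhttpRequest(); `extractValue` and `hasNextPage` are hypothetical helpers.
function sumAcrossPages (requestPage, extractValue, hasNextPage, pageSize, onDone) {
    var startNum = 0;
    var total    = 0;

    function fetchPage () {
        requestPage ( {
            method:  'GET',
            url:     'https://play.google.com/store/account?start=' + startNum
                     + '&num=' + pageSize,
            onload:  function (resp) {
                total += extractValue (resp.responseText);

                if (hasNextPage (resp.responseText) ) {
                    //--- Chain the next request from inside onload; nothing blocks.
                    startNum += pageSize;
                    fetchPage ();
                }
                else {
                    onDone (null, total);
                }
            },
            //--- On failure, hand back the error plus the running total.
            onerror: function (resp) { onDone (resp, total); }
        } );
    }

    fetchPage ();
}
```

In a userscript, GM_xmlhttpRequest itself would be passed as `requestPage`. Because each request is started from the previous request's onload handler, the running total lives in an ordinary closure variable and nothing has to be stored between runs.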
Here's a complete Greasemonkey script, with some status and error reporting:
    // ==UserScript==
    // @name        _Total-value mashup
    // @include     https://play.google.com/apps*
    // @require     http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js
    // @grant       GM_addStyle
    // @grant       GM_xmlhttpRequest
    // ==/UserScript==

    var startNum    = 0;
    var totalValue  = 0;

    //--- Scrape the first account-page for item values:
    $("body").prepend (
        '<div id="gm_statusBar">Fetching total value, please wait...</div>'
    );
    scrapeAccountPage ();

    function scrapeAccountPage () {
        var accntPage   = 'https://play.google.com/store/account?start=0&num=40';
        accntPage       = accntPage.replace (/start=\d+/i, "start=" + startNum);

        $("#gm_statusBar").append (
            '<span class="gmStatStart">Fetching page ' + accntPage + '...</span>'
        );

        GM_xmlhttpRequest ( {
            method:     'GET',
            url:        accntPage,
            //--- getTotalValuesFromPage() also gets the next page, as appropriate.
            onload:     getTotalValuesFromPage,
            onabort:    reportAJAX_Error,
            onerror:    reportAJAX_Error,
            ontimeout:  reportAJAX_Error
        } );
    }

    function getTotalValuesFromPage (respObject) {
        if (respObject.status != 200  &&  respObject.status != 304) {
            reportAJAX_Error (respObject);
            return;
        }

        $("#gm_statusBar").append ('<span class="gmStatFinish">done.</span>');

        var respDoc     = $(respObject.responseText);
        var targetElems = respDoc.find ("#tab-body-account .rap-link");

        targetElems.each ( function () {
            //--- Strip currency symbols, etc. A missing attribute is treated as empty.
            var itmVal  = ($(this).attr ("data-docprice") || "").replace (/[^\d\.]/g, "");
            if (itmVal) {
                itmVal  = parseFloat (itmVal);
                //--- typeof NaN is "number", so test with isNaN() instead.
                if ( ! isNaN (itmVal) ) {
                    totalValue += itmVal;
                }
            }
        } );
        console.log ("totalValue: ", totalValue.toFixed(2) );

        if ( respDoc.find (".snippet.snippet-tiny").length ) {
            startNum += 40;
            //--- Scrape the next page.
            scrapeAccountPage ();
        }
        else {
            //--- All done! Report the total.
            $("#gm_statusBar").empty ().append (
                'Combined Value: $' + totalValue.toFixed(2)
            );
        }
    }

    function reportAJAX_Error (respObject) {
        $("#gm_statusBar").append (
            '<span class="gmStatError">Error ' + respObject.status + '! '
            + '"' + respObject.statusText + '" '
            + 'Total value, so far, was: ' + totalValue
            + '</span>'
        );
    }

    //--- Make it look "purty".
    GM_addStyle ( multilineStr ( function () {/*!
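One detail worth isolating from the script is how a scraped price string becomes a number. The sketch below assumes hypothetical helper names (`parsePrice`, `sumPrices`) that are not part of the script. It illustrates that parseFloat() yields NaN for unparseable input and that `typeof NaN` is "number", so isNaN() is the more reliable test before adding to a running total.

```javascript
// Hypothetical helpers for illustration; they are not part of the script above.
function parsePrice (rawPrice) {
    //--- A missing attribute arrives as undefined, so normalize to a string first.
    var cleaned = String (rawPrice == null ? "" : rawPrice).replace (/[^\d.]/g, "");
    var value   = parseFloat (cleaned);

    //--- parseFloat() returns NaN when nothing numeric is left; count that as 0.
    return isNaN (value) ? 0 : value;
}

function sumPrices (rawPrices) {
    return rawPrices.reduce ( function (sum, raw) {
        return sum + parsePrice (raw);
    }, 0);
}
```

Treating unparseable entries as zero keeps a single bad value from turning the whole total into NaN. Whether to skip such entries silently or report them is a design choice left open here.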
Important: Do not forget the @include, @exclude, and/or @match directives, so your script does not run on every page and iframe!
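As a rough illustration of that scoping, a metadata block might look like the following. The URL patterns are assumptions chosen for the example, not values taken from the script, and @noframes is an additional directive, supported by Greasemonkey and Tampermonkey, that goes beyond what is described here.

```javascript
// ==UserScript==
// @name        _Scoped mashup example
// Run only on the pages that need the script (pattern is an assumption):
// @match       https://play.google.com/store/account*
// Skip a sub-path that happens to match the pattern above (hypothetical path):
// @exclude     https://play.google.com/store/account/settings*
// Do not run inside iframes:
// @noframes
// @grant       GM_xmlhttpRequest
// ==/UserScript==
```

A narrow @match pattern is usually preferable to a broad @include, since it limits both where the script runs and which pages it can touch.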