Crawler HTML Snapshot - Understanding How It Works

I was reading this article today. Honestly, I'm really interested in case "2. Most of your content is generated by server-side technology such as PHP or ASP.NET."

I want to check whether I understood it correctly :)

I create a PHP script (gethtmlsnapshot.php) in which I include the server-side AJAX page (getdata.php) and escape the parameters (for security). Then I add it at the end of the static HTML page (index-movies.html). Is that right? Now...

1 - Where do I put gethtmlsnapshot.php? In other words, I need to call this page (or rather, the crawler needs to). But if I don't have a link to it on the main page, the crawler can't call it :O How can a crawler call a page with _escaped_fragment_ parameters? It can't know them if I don't specify them somewhere :)

2 - How can the crawler call this page with the parameters? As before, I need links to this script with the parameters, so that the crawler visits each page and saves the content of the dynamic result.

Could you help me? And what do you think of this technique? Wouldn't it be better if the crawler developers built their bots in some other way? :)

Let me know what you think. Greetings


, - , , , . , (, , - ).

AJAX , , ( XML, JSON), - .

.

, xmlhttpget JavaScript. . onclick AJAX ( ).

.

, - . . , , ( Google), . Google JavaScript Blackhat SEO (, , -, JavaScript, - -... html , , JavaScript ).

CSS JS ( CSS, ).

, " AJAX" -, Webcrawler , JavaScript, . ? , - JavaScript (, document.location any). Google , . ajax . , , , , URI .

, 3 , .

  • onclick href (imo , , ).
  • - Sitemap, , ( URL-, pagerank).
  • ajax

, JavaScript xmlhttpget href, : www.example.com/ajax.php#!key=value

:

<a href="http://www.example.com/ajax.php#!page=imprint" onclick="handleajax()">go to my imprint</a>

handleajax document.location . URL- - .
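A minimal sketch of how a handler could read the `#!` state from the link above. Only `handleajax` and `document.location` come from the original post; `parseHashbang` and `loadContent` are hypothetical names. Since an `onclick` handler runs before the browser updates the fragment, this sketch reacts to the `hashchange` event instead.

```javascript
// Parse the part after "#!" into key/value pairs,
// e.g. "#!page=imprint" gives { page: "imprint" }.
// parseHashbang is a hypothetical helper, not part of the original post.
function parseHashbang(hash) {
  if (typeof hash !== "string" || !hash.startsWith("#!")) {
    return {};
  }
  const params = {};
  for (const [key, value] of new URLSearchParams(hash.slice(2))) {
    params[key] = value;
  }
  return params;
}

// Browser-only wiring: skipped when no window object exists.
if (typeof window !== "undefined") {
  window.addEventListener("hashchange", () => {
    const params = parseHashbang(window.location.hash);
    // loadContent is an assumed function that would issue the AJAX
    // request and update the page; it is only called if it is defined.
    if (typeof window.loadContent === "function") {
      window.loadContent(params);
    }
  });
}
```

With this in place, a URL such as http://www.example.com/ajax.php#!page=imprint can be opened directly or bookmarked, and the page can still work out which content to load.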

ajax http://www.example.com/ajax.php.php?%23!page=imprint http://www.example.com/ajax.php#!page=imprint html, , . , http://www.example.com/ajax.php.php?%23!page=imprint -, - xmlhttpget .
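For reference, under the AJAX crawling scheme that the `_escaped_fragment_` parameter in the question belongs to, the crawler does not need a separate link to the snapshot script. As far as I know it rewrites every `#!` URL it finds into a query-string form and requests that instead. A rough sketch of that rewrite follows; `toEscapedFragmentUrl` is a hypothetical name, and the escaping is simplified to ASCII (non-ASCII characters are left untouched here).

```javascript
// Rewrite "base#!fragment" into "base?_escaped_fragment_=fragment",
// the form a crawler following the scheme would request.
// toEscapedFragmentUrl is a hypothetical helper name.
function toEscapedFragmentUrl(url) {
  const idx = url.indexOf("#!");
  if (idx === -1) {
    return url; // no hashbang, nothing to rewrite
  }
  const base = url.slice(0, idx);
  const fragment = url.slice(idx + 2);
  // Percent-escape control characters, space, '#', '%', '&' and '+'.
  // '=' is deliberately left as-is.
  const escaped = fragment.replace(/[\x00-\x20#%&+]/g, (ch) =>
    encodeURIComponent(ch)
  );
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(toEscapedFragmentUrl("http://www.example.com/ajax.php#!page=imprint"));
// prints "http://www.example.com/ajax.php?_escaped_fragment_=page=imprint"
```

On the server the value arrives as an ordinary query parameter, so a script like gethtmlsnapshot.php could presumably read it from `$_GET` and return the rendered HTML for that state.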

, URL-, , ajax, . script , .

, pr/con:

:

  • , .
  • - , .

  • , JavaScript.
  • SEO. , Google .

:

  • href, onclick - , .

  • ajax , - URI, , - , .

  • , ajax . , .


Source: https://habr.com/ru/post/1768601/

