Is it possible to make a proxy with file_get_contents() or cURL?

I was just messing around with file_get_contents() at school and noticed that it lets me open blacklisted sites.

Just a few questions:

Images don't load.

Clicking a link on a site simply takes me back to the original blocked page.

I think I know a way to fix the link problem, but I haven't really thought it through. I could run str_replace() over the content returned by file_get_contents() and replace every link with another file_get_contents() call on that link... right?
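A minimal sketch of that idea (the script name proxy.php and the url parameter are made-up names just for illustration, not anything from this question):

<?php
// proxy.php?url=http://example.com/
// "proxy.php" and "url" are illustrative names only.
$target = $_GET['url'];

// Fetch the remote page, exactly as described above.
$html = file_get_contents($target);

// Naive rewrite: point absolute links back at this script so that
// clicking them goes through the proxy again instead of the blocked site.
// This only catches plain href="http://..." attributes; relative URLs,
// src attributes, CSS and JS references all slip straight through.
$html = str_replace('href="http://', 'href="proxy.php?url=http://', $html);

echo $html;

That already hints at why the links and images misbehave: everything the page references has to be rewritten, not just the obvious href attributes.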

Would it be easier if I used cURL?
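For the fetching step itself, cURL is not much harder than file_get_contents(), and it gives more control (timeouts, redirects, error reporting). A rough equivalent, assuming nothing more than a plain GET of an example URL:

<?php
// Fetch a page with cURL instead of file_get_contents().
$ch = curl_init('http://example.com/');          // example URL, not from the question
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects like a browser would
curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // give up on dead hosts after 10 seconds
$html = curl_exec($ch);

if ($html === false) {
    echo 'Fetch failed: ' . curl_error($ch);
}
curl_close($ch);

Either way, fetching the page is the easy part; the rewriting described in the answers below is where the real work is.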

Is what I'm trying to do even possible, or am I just wasting my precious time?

I know this is not a good way to do something like this, but it's just a thought and I was curious.

2 answers

This is not a trivial task. It is possible, but you will need to parse the returned document(s) and rewrite everything that refers to external content so that it is also fetched through your proxy, and that is the difficult part (a rough sketch of that rewriting step follows the list below).

Keep in mind that you will need to deal with (for starters; this is not a complete list):

  • Relative and absolute paths, which may or may not fetch external content
  • Anchors, forms, images, and any number of other HTML elements that can point to external content, and may or may not state explicitly what they point to
  • CSS and JS code that references external content, including JS that modifies the DOM to create elements with click events acting as links, to name just one problem
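To make the first two points concrete, here is a rough sketch of the rewriting step. It assumes a hypothetical proxy.php?url=... entry point (the same made-up names as above) and only handles a few obvious tag/attribute pairs; real pages will need far more cases than this:

<?php
// Rewrite link-like attributes so they route back through the proxy.
// proxy.php and the url parameter are illustrative names only.
function rewriteUrls(string $html, string $baseUrl): string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html);   // suppress warnings from sloppy real-world markup

    // A handful of the elements that can point at external content.
    $map = ['a' => 'href', 'img' => 'src', 'script' => 'src',
            'link' => 'href', 'form' => 'action'];

    foreach ($map as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $el) {
            $value = $el->getAttribute($attr);
            if ($value === '') {
                continue;
            }
            // Very naive relative-to-absolute resolution; it ignores
            // "../", query-only links, protocol-relative URLs, etc.
            if (!preg_match('#^https?://#i', $value)) {
                $value = rtrim($baseUrl, '/') . '/' . ltrim($value, '/');
            }
            $el->setAttribute($attr, 'proxy.php?url=' . urlencode($value));
        }
    }

    return $doc->saveHTML();
}

Even this says nothing about CSS url(...) references or DOM manipulation in JS, which is exactly why the task balloons.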

This is a pretty gigantic task. Personally, I would say don't bother - you would probably be wasting your precious time.

Moreover, some nice people have already done most of the work for you:

; -)


Your "problem" comes from the fact that HTTP is a stateless protocol, and different resources like CSS, JS, images, etc. have their own URLs, so each one needs its own request. If you want to do it yourself and not use existing PHP proxy scripts or the like, it is "pretty trivial": you have to clean up the HTML and normalize it with Tidy to XML (XHTML), then process it with DOMDocument and XPath.
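A minimal sketch of that pipeline, assuming the tidy extension is available and using an example URL (everything else here is just illustrative):

<?php
// Normalize messy HTML with Tidy, then query the result with XPath.
$raw = file_get_contents('http://example.com/');   // example URL only

$clean = tidy_repair_string($raw, [
    'output-xhtml'     => true,   // produce well-formed XHTML
    'numeric-entities' => true,
], 'utf8');

$doc = new DOMDocument();
@$doc->loadHTML($clean);           // suppress warnings on any leftover quirks
$xpath = new DOMXPath($doc);

// Every one of these nodes points at a separate resource that would
// need its own proxied request - the statelessness mentioned above.
$query = '//a[@href] | //img[@src] | //link[@href] | //script[@src]';
foreach ($xpath->query($query) as $node) {
    $url = $node->getAttribute('href') ?: $node->getAttribute('src');
    echo $node->nodeName . ': ' . $url . PHP_EOL;
}

From there, rewriting each collected URL (or fetching and caching it) is the same job the first answer describes.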

You could learn a lot from this - it is not too difficult, but it involves some interesting "technologies".

What you will end up with is what is called a crawler or screen scraper.


Source: https://habr.com/ru/post/1393145/

