Wget and curl somehow modify the bencoded file when downloading

OK, so I have a bit of a strange problem that I'm not quite sure how to explain... Basically, I'm trying to decode a bencoded file (a .torrent file). I have tried 4 or 5 different scripts that I found through Google and SO, without any luck (they either return something that is not a dictionary or just error out).

Now I download the .torrent file this way:

 wget http://link_to.torrent 

and I have also tried it with curl like so:

 curl -C - -O http://link_to.torrent 

and I have concluded that something happens to the file when it is downloaded this way. The reason I think so is that I found a site that decodes a .torrent file you upload and displays the information contained in it. However, when I upload a .torrent file that was downloaded not by clicking the link in the browser but with one of the methods described above, it does not work there either. So has anyone experienced a similar problem with one of these methods and found a solution, or can anyone explain why this is happening? I can't figure out much about it and don't know of a workaround I could use on my server.

Update: Well, as @coder543 suggested, I compared the size of the file downloaded via the browser versus wget. They are not the same size: the wget result is smaller, so the problem lies with wget & curl and not something else... any ideas?

Update 2: Okay, so I have tried it several times now and have narrowed the problem down a bit: it seems to only happen with torcache and torrage links. Links from other sites seem to work as expected... so here are some links and my results with the different methods:

*** different sizes ***
http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
wget -> 7345, curl -> 7345, browser download -> 7376

*** same size ***
http://isohunt.com/torrent_details/224634397/south+park?tab=summary
wget -> 7491, curl -> 7491, browser download -> 7491

*** different sizes ***
http://torcache.net/torrent/B00BA420568DA54A90456AEE90CAE7A28535FACE.torrent?title=[kickass.to]the.simpsons.s24e12.hdtv.x264.lol.eztv
wget -> 4890, curl -> 4890, browser download -> 4985

*** same size ***
http://h33t.com/download.php?id=cc1ad62bbe7b68401fe6ca0fbaa76c4ed022b221&f=Game%20of%20Thrones%20S03E10%20576p%20HDTV%20x264-DGN%20%7B1337x%7D.torrent
wget -> 30632, curl -> 30632, browser download -> 30632

*** same size ***
http://dl7.torrentreactor.net/download.php?id=9499345&name=ubuntu-13.04-desktop-i386.iso
wget -> 32324, curl -> 32324, browser download -> 32324

*** different sizes ***
http://torrage.com/torrent/D7497C2215C9448D9EB421A969453537621E0962.torrent
wget -> 7856, curl -> 7556, browser download -> 7888
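For anyone who wants to reproduce the comparison, something along these lines should do it (using the first torrage URL from the list above; the output filenames are arbitrary):

 $ URL="http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent"
 $ wget -O via_wget.torrent "$URL"    # save the wget copy under a fixed name
 $ curl -o via_curl.torrent "$URL"    # save the curl copy under a fixed name
 $ ls -l via_wget.torrent via_curl.torrent    # compare the byte sizes

Then compare those numbers with the size of the file the browser saves for the same link.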

So it seems to me that it works fine with some sites, but the sites that actually serve the files are torcache.net and torrage.com. It would be nice if I could just use other sites and not rely on the cache directly, but I am working with the bitsnoop API (which pulls all of its data from torrage.com), so that is not an option. Anyway, if anyone has an idea of how to solve this, or what steps I could take to find a solution, I would be very grateful!

Even if someone can just reproduce the results, it would be appreciated!... My server runs Ubuntu 12.04 LTS on a 64-bit architecture, and the laptop I used for the browser downloads I compared against runs the same.

+4
1 answer

For the file obtained using the command line tools, I get:

 $ file 6760F0232086AFE6880C974645DE8105FF032706.torrent
 6760F0232086AFE6880C974645DE8105FF032706.torrent: gzip compressed data, from Unix

And of course, decompressing it with gunzip yields the correct output. Having a look at what the server sends gives an interesting hint:

 $ wget -S http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
 --2013-06-14 00:53:37--  http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
 Resolving torrage.com... 192.121.86.94
 Connecting to torrage.com|192.121.86.94|:80... connected.
 HTTP request sent, awaiting response...
   HTTP/1.0 200 OK
   Connection: keep-alive
   Content-Encoding: gzip

So the server reports that it is sending gzip-compressed data, but both wget and curl ignore this. curl has a --compressed switch that will properly decompress the data for you. It should be safe to use even for uncompressed files: it simply tells the HTTP server that the client supports compression, and curl then looks at the response headers to decide whether decompression is actually needed.
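As a sketch of what that looks like in practice (same example URL as above; fixed.torrent is just an example output name, and the manual gzip step is only needed because the wget shipped with 12.04 has no equivalent decompression switch):

 # curl can decode the gzip response itself:
 $ curl --compressed -O http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent

 # wget saves the gzip data untouched, so decompress it afterwards;
 # -dc writes to stdout, so the unusual .torrent suffix does not matter:
 $ wget http://torrage.com/torrent/6760F0232086AFE6880C974645DE8105FF032706.torrent
 $ gzip -dc 6760F0232086AFE6880C974645DE8105FF032706.torrent > fixed.torrent

Either way, the resulting file should match the browser download and decode as a normal bencoded dictionary.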

+6
