What is the rationale behind the individual reserved characters in a URL?

I noticed that these characters are illegal:

#%<>?\/*+|:" 

I also noticed that these get encoded (as %NN, where NN is the hexadecimal value), but can be substituted without problems:

 $,;=& @ 

(note the space, which is usually encoded as +, but may also be %20)
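To make the %NN encoding concrete, here is a minimal sketch using Python's standard `urllib.parse` module. Python is only chosen for illustration; nothing in the question depends on it. The variable names `illegal`, `encoded`, `space_percent` and `space_plus` are made up for this example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# The characters listed above as "illegal".
illegal = '#%<>?\\/*+|:"'

# safe="" tells quote() to escape everything outside its always-safe set
# (letters, digits and "_.-~"), so each character becomes %NN.
encoded = {ch: quote(ch, safe="") for ch in illegal}
for ch, enc in encoded.items():
    print(repr(ch), "->", enc)

# The two common encodings of a space.
space_percent = quote("a b")       # percent-encoding
space_plus = quote_plus("a b")     # form-style encoding
print(space_percent, space_plus)

# Both forms decode back to the same text with the matching decoder.
print(unquote(space_percent) == unquote_plus(space_plus))
```

Note that `quote_plus` follows the form-encoding convention, which is why the space comes out as + there and as %20 with plain `quote`.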

I understand #%?/+. But what do the following characters do? <>\*|":

Note: I understand what : does in the domain part (it introduces the port), just as @ marks the login, but why is : illegal after the first /? (@ isn't.)

+4
2 answers

RFC 2396 (Uniform Resource Identifiers (URI): Generic Syntax) says:

Many URIs include components consisting of, or delimited by, certain special characters. These characters are called "reserved", since their usage within the URI component is limited to their reserved purpose.

 reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | "," 

2.4.3. Excluded US-ASCII Characters

The angle-bracket "<" and ">" and double-quote (") characters are excluded because they are often used as the delimiters around URIs in text documents and protocol fields. The character "#" is excluded because it is used to delimit a URI from a fragment identifier in URI references (Section 4). The percent character "%" is excluded because it is used for the encoding of escaped characters.

 delims = "<" | ">" | "#" | "%" | <"> 

Other characters are excluded because gateways and other transport agents are known to sometimes modify such characters, or they are used as delimiters.

 unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`" 

I think that covers everything you mentioned. The asterisk "*" is not reserved and can be used. Paste this into your browser: http://en.wikipedia.org/wiki/*
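As a rough illustration, the three grammar rules quoted above can be turned into a small lookup. This is only a sketch in Python: the sets are copied from the quoted `reserved`, `delims` and `unwise` rules, while the function name `classify` and the label "other" are made up for this example.

```python
# Character classes copied from the RFC 2396 rules quoted above.
RESERVED = set(";/?:@&=+$,")
DELIMS = set('<>#%"')
UNWISE = set("{}|\\^[]`")


def classify(ch: str) -> str:
    """Return which of the quoted RFC 2396 classes a character falls into."""
    if ch in RESERVED:
        return "reserved"
    if ch in DELIMS:
        return "delims"
    if ch in UNWISE:
        return "unwise"
    # Not covered by any of the three quoted rules.
    return "other"


# The characters the question lists as illegal.
for ch in '#%<>?\\/*+|:"':
    print(repr(ch), classify(ch))
```

Running it over the characters from the question puts every one of them in one of the three classes except the asterisk, which matches the remark above.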

+3

I'm not sure about this, but could they be reserved so that, if you type a URL into a shell environment, it does not get split into separate parts unintentionally? For example, imagine that I try to run

 curl http://www.stackoverflow.com/this>that > myFile.txt 

This could trip up the command line: it would try to fetch the wrong URL, http://www.stackoverflow.com/this, and write the result to a file called that, and then confuse the interpreter when it reaches the second > . This explanation accounts for all the characters you listed (they all mean something in a shell environment), but it is just my first guess as to why this might be.
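If the shell really is the concern, quoting the URL is the usual way around it. Here is a minimal sketch using Python's standard `shlex` module to show the idea; the variable names are made up for this example, and `shlex.split` only imitates how a POSIX shell breaks a line into words; it does not perform redirection.

```python
import shlex

# The URL from the example above, containing a shell metacharacter.
url = "http://www.stackoverflow.com/this>that"

# shlex.quote() wraps the string in single quotes when it contains
# characters that are special to a POSIX shell.
quoted = shlex.quote(url)
command = f"curl {quoted} > myFile.txt"
print(command)

# Splitting the quoted command line shows the URL surviving as one word.
tokens = shlex.split(command)
print(tokens)
```

With the quotes in place, the > inside the URL stays part of the argument passed to curl, and only the standalone > is left for the shell to treat as a redirection.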

0

Source: https://habr.com/ru/post/1334018/

