Encode / Decode URLs

What is the recommended way to encode and decode entire URLs in Go? I am aware of url.QueryEscape and url.QueryUnescape, but they don't seem to be exactly what I'm looking for. Specifically, I'm looking for functions like JavaScript's encodeURIComponent and decodeURIComponent.

Thanks.

+65
url escaping go
Dec 11 '12 at 12:17
8 answers

You can do all the URL encoding you need with the net/url package. It does not expose separate encoding functions for the individual parts of a URL; you have to let it construct the whole URL. Having had a look at the source code, I think it does a very good, standards-compliant job.

Here is an example:

    package main

    import (
        "fmt"
        "net/url"
    )

    func main() {
        Url, err := url.Parse("http://www.example.com")
        if err != nil {
            panic("boom")
        }

        Url.Path += "/some/path/or/other_with_funny_characters?_or_not/"
        parameters := url.Values{}
        parameters.Add("hello", "42")
        parameters.Add("hello", "54")
        parameters.Add("vegetable", "potato")
        Url.RawQuery = parameters.Encode()

        fmt.Printf("Encoded URL is %q\n", Url.String())
    }

Which prints:

    Encoded URL is "http://www.example.com/some/path/or/other_with_funny_characters%3F_or_not/?hello=42&hello=54&vegetable=potato"
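For the decoding direction the question asks about, the same package can take a URL apart again. This is only a minimal sketch: decodeURL is a made-up helper name, and it assumes the input is a complete URL like the one printed above.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeURL parses raw and returns its decoded path and query values.
// decodeURL is a hypothetical helper used only for this sketch.
func decodeURL(raw string) (string, url.Values, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", nil, err
	}
	// u.Path holds the percent-decoded path; u.Query() decodes the query string.
	return u.Path, u.Query(), nil
}

func main() {
	path, query, err := decodeURL("http://www.example.com/some/path/or/other_with_funny_characters%3F_or_not/?hello=42&hello=54&vegetable=potato")
	if err != nil {
		panic(err)
	}
	fmt.Println(path)                   // /some/path/or/other_with_funny_characters?_or_not/
	fmt.Println(query["hello"])         // [42 54]
	fmt.Println(query.Get("vegetable")) // potato
}
```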
+88
Dec 11

From MDN on encodeURIComponent:

encodeURIComponent escapes all characters except the following: alphabetic, decimal digits, - _ . ! ~ * ' ( )

From Go's implementation of url.QueryEscape (specifically, the private function shouldEscape), it escapes all characters except the following: alphabetic, decimal digits, '-', '_', '.', '~'.

Unlike JavaScript, Go's QueryEscape() will escape '!', '*', "'", '(', ')'. Basically, Go's version follows RFC 3986 strictly here, while JavaScript's is looser. Again from MDN:

To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used:

    function fixedEncodeURIComponent(str) {
      return encodeURIComponent(str).replace(/[!'()]/g, escape).replace(/\*/g, "%2A");
    }
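To see the difference on the Go side, a small check along these lines prints which of the characters discussed above url.QueryEscape actually changes. The helper name escapedByQueryEscape is made up for this sketch.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapedByQueryEscape reports whether url.QueryEscape changes the string c.
// The helper name is hypothetical and exists only for this sketch.
func escapedByQueryEscape(c string) bool {
	return url.QueryEscape(c) != c
}

func main() {
	// The first four are left alone by both Go and JavaScript; the last
	// five are left alone by encodeURIComponent but escaped by QueryEscape.
	for _, c := range []string{"-", "_", ".", "~", "!", "*", "'", "(", ")"} {
		fmt.Printf("%q -> %q (escaped: %v)\n", c, url.QueryEscape(c), escapedByQueryEscape(c))
	}
}
```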
+13
Dec 11 '12 at 13:50

Starting with Go 1.8, this situation has changed. We now have PathEscape, in addition to the older QueryEscape, for encoding path components, along with its unescaping counterpart PathUnescape.

+7
Apr 15 '17 at 18:18

How about this:

 template.URLQueryEscaper(path) 
+6
Apr 12 '17 at 3:41

To mimic JavaScript's encodeURIComponent(), I created a string helper function.

Example: it turns "My String" into "My%20String".

https://github.com/mrap/stringutil/blob/master/urlencode.go

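    import "net/url"

    // UrlEncoded encodes a string like JavaScript's encodeURIComponent().
    // Note: the string is parsed as a URL, so characters that are valid
    // URL syntax (such as '/') are left unescaped.
    func UrlEncoded(str string) (string, error) {
        u, err := url.Parse(str)
        if err != nil {
            return "", err
        }
        return u.String(), nil
    }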
+5
Feb 09 '15 at 10:46

If you need exactly the same result as JS encodeURIComponent, try my function. It is dirty, but it works well.

https://gist.github.com/czyang/7ae30f4f625fee14cfc40c143e1b78bf

    // Warning! Use this code carefully, and at your own risk.
    package main

    import (
        "fmt"
        "net/url"
        "strings"
    )

    /*
    After hours of searching, I couldn't find any method that gives exactly
    the same result as the JS encodeURIComponent function. In my situation
    I needed to write a signing method that encodes user input exactly the
    same way as JS encodeURIComponent. This function solved my problem.
    */
    func main() {
        params := url.Values{
            "test_string": {"+!+'( )*-._~0-👿 👿9a-zA-Z 中文测试 test with ❤️ !@#$%^&&*()~<>?/.,;'[][]:{{}|{}|"},
        }
        urlEncode := params.Encode()
        fmt.Println(urlEncode)
        urlEncode = compatibleRFC3986Encode(urlEncode)
        fmt.Println("RFC3986", urlEncode)
        urlEncode = compatibleJSEncodeURIComponent(urlEncode)
        fmt.Println("JS encodeURIComponent", urlEncode)
    }

    // Compatible with RFC 3986: spaces become %20 instead of +.
    func compatibleRFC3986Encode(str string) string {
        resultStr := str
        resultStr = strings.Replace(resultStr, "+", "%20", -1)
        return resultStr
    }

    // This func mimics JS encodeURIComponent, which is looser than RFC 3986
    // and leaves ! ' ( ) * unescaped.
    func compatibleJSEncodeURIComponent(str string) string {
        resultStr := str
        resultStr = strings.Replace(resultStr, "+", "%20", -1)
        resultStr = strings.Replace(resultStr, "%21", "!", -1)
        resultStr = strings.Replace(resultStr, "%27", "'", -1)
        resultStr = strings.Replace(resultStr, "%28", "(", -1)
        resultStr = strings.Replace(resultStr, "%29", ")", -1)
        resultStr = strings.Replace(resultStr, "%2A", "*", -1)
        return resultStr
    }
+2
Sep 11 '17 at 12:28

Here is an implementation of escape and unescape (lifted from the Go source):

    package main

    import (
        "fmt"
        "strconv"
    )

    const (
        encodePath encoding = 1 + iota
        encodeHost
        encodeUserPassword
        encodeQueryComponent
        encodeFragment
    )

    type encoding int

    type EscapeError string

    func (e EscapeError) Error() string {
        return "invalid URL escape " + strconv.Quote(string(e))
    }

    func ishex(c byte) bool {
        switch {
        case '0' <= c && c <= '9':
            return true
        case 'a' <= c && c <= 'f':
            return true
        case 'A' <= c && c <= 'F':
            return true
        }
        return false
    }

    func unhex(c byte) byte {
        switch {
        case '0' <= c && c <= '9':
            return c - '0'
        case 'a' <= c && c <= 'f':
            return c - 'a' + 10
        case 'A' <= c && c <= 'F':
            return c - 'A' + 10
        }
        return 0
    }

    // Return true if the specified character should be escaped when
    // appearing in a URL string, according to RFC 3986.
    //
    // Please be informed that for now shouldEscape does not check all
    // reserved characters correctly. See golang.org/issue/5684.
    func shouldEscape(c byte, mode encoding) bool {
        // §2.3 Unreserved characters (alphanum)
        if 'A' <= c && c <= 'Z' || 'a' <= c && c <= 'z' || '0' <= c && c <= '9' {
            return false
        }

        if mode == encodeHost {
            // §3.2.2 Host allows
            //	sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
            // as part of reg-name.
            // We add : because we include :port as part of host.
            // We add [ ] because we include [ipv6]:port as part of host
            switch c {
            case '!', '$', '&', '\'', '(', ')', '*', '+', ',', ';', '=', ':', '[', ']':
                return false
            }
        }

        switch c {
        case '-', '_', '.', '~': // §2.3 Unreserved characters (mark)
            return false

        case '$', '&', '+', ',', '/', ':', ';', '=', '?', '@': // §2.2 Reserved characters (reserved)
            // Different sections of the URL allow a few of
            // the reserved characters to appear unescaped.
            switch mode {
            case encodePath: // §3.3
                // The RFC allows : @ & = + $ but saves / ; , for assigning
                // meaning to individual path segments. This package
                // only manipulates the path as a whole, so we allow those
                // last two as well. That leaves only ? to escape.
                return c == '?'

            case encodeUserPassword: // §3.2.1
                // The RFC allows ';', ':', '&', '=', '+', '$', and ',' in
                // userinfo, so we must escape only '@', '/', and '?'.
                // The parsing of userinfo treats ':' as special so we must escape
                // that too.
                return c == '@' || c == '/' || c == '?' || c == ':'

            case encodeQueryComponent: // §3.4
                // The RFC reserves (so we must escape) everything.
                return true

            case encodeFragment: // §4.1
                // The RFC text is silent but the grammar allows
                // everything, so escape nothing.
                return false
            }
        }

        // Everything else must be escaped.
        return true
    }

    func escape(s string, mode encoding) string {
        spaceCount, hexCount := 0, 0
        for i := 0; i < len(s); i++ {
            c := s[i]
            if shouldEscape(c, mode) {
                if c == ' ' && mode == encodeQueryComponent {
                    spaceCount++
                } else {
                    hexCount++
                }
            }
        }

        if spaceCount == 0 && hexCount == 0 {
            return s
        }

        t := make([]byte, len(s)+2*hexCount)
        j := 0
        for i := 0; i < len(s); i++ {
            switch c := s[i]; {
            case c == ' ' && mode == encodeQueryComponent:
                t[j] = '+'
                j++
            case shouldEscape(c, mode):
                t[j] = '%'
                t[j+1] = "0123456789ABCDEF"[c>>4]
                t[j+2] = "0123456789ABCDEF"[c&15]
                j += 3
            default:
                t[j] = s[i]
                j++
            }
        }
        return string(t)
    }

    // unescape unescapes a string; the mode specifies
    // which section of the URL string is being unescaped.
    func unescape(s string, mode encoding) (string, error) {
        // Count %, check that they're well-formed.
        n := 0
        hasPlus := false
        for i := 0; i < len(s); {
            switch s[i] {
            case '%':
                n++
                if i+2 >= len(s) || !ishex(s[i+1]) || !ishex(s[i+2]) {
                    s = s[i:]
                    if len(s) > 3 {
                        s = s[:3]
                    }
                    return "", EscapeError(s)
                }
                i += 3
            case '+':
                hasPlus = mode == encodeQueryComponent
                i++
            default:
                i++
            }
        }

        if n == 0 && !hasPlus {
            return s, nil
        }

        t := make([]byte, len(s)-2*n)
        j := 0
        for i := 0; i < len(s); {
            switch s[i] {
            case '%':
                t[j] = unhex(s[i+1])<<4 | unhex(s[i+2])
                j++
                i += 3
            case '+':
                if mode == encodeQueryComponent {
                    t[j] = ' '
                } else {
                    t[j] = '+'
                }
                j++
                i++
            default:
                t[j] = s[i]
                j++
                i++
            }
        }
        return string(t), nil
    }

    // EncodeUriComponent uses encodeFragment mode, so reserved characters
    // such as '/' are left unescaped.
    func EncodeUriComponent(rawString string) string {
        return escape(rawString, encodeFragment)
    }

    // DecodeUriComponent uses encodeQueryComponent mode, so '+' is decoded
    // as a space.
    func DecodeUriComponent(encoded string) (string, error) {
        return unescape(encoded, encodeQueryComponent)
    }

    // https://golang.org/src/net/url/url.go
    // http://remove-line-numbers.ruurtjan.com/
    func main() {
        // http://www.url-encode-decode.com/
        origin := "äöüHel/lo world"
        encoded := EncodeUriComponent(origin)
        fmt.Println(encoded)

        s, _ := DecodeUriComponent(encoded)
        fmt.Println(s)
    }

    // -------------------------------------------------------

    /*
    func UrlEncoded(str string) (string, error) {
        u, err := url.Parse(str)
        if err != nil {
            return "", err
        }
        return u.String(), nil
    }

    // http://stackoverflow.com/questions/13820280/encode-decode-urls
    // import "net/url"
    func old_main() {
        a, err := UrlEncoded("hello world")
        if err != nil {
            fmt.Println(err)
        }
        fmt.Println(a)

        // https://gobyexample.com/url-parsing
        //s := "postgres://user:pass@host.com:5432/path?k=v#f"
        s := "postgres://user:pass@host.com:5432/path?k=vbla%23fooa#f"
        u, err := url.Parse(s)
        if err != nil {
            panic(err)
        }

        fmt.Println(u.RawQuery)
        fmt.Println(u.Fragment)
        fmt.Println(u.String())
        m, _ := url.ParseQuery(u.RawQuery)
        fmt.Println(m)
        fmt.Println(m["k"][0])
    }
    */

    // -------------------------------------------------------
-1
Oct 27 '15 at 14:55

Hope this helps

    import "net/url"

    // url encoded
    func UrlEncodedISO(str string) (string, error) {
        u, err := url.Parse(str)
        if err != nil {
            return "", err
        }
        q := u.Query()
        return q.Encode(), nil
    }

* is encoded as %2A
# is encoded as %23
% is encoded as %25
< is encoded as %3C
> is encoded as %3E
+ is encoded as %2B
Enter key (#13#10) is encoded as %0D%0A
-1
Aug 23 '19 at 9:16


