Saturday, September 4, 2010

Re: Urlencode vs. iriencode

Here's the code for the two (the numbers at the start of each line are
just line numbers from the file) -

iriencode:
128 """
129 Convert an Internationalized Resource Identifier (IRI) portion to a URI
130 portion that is suitable for inclusion in a URL.
131
132 This is the algorithm from section 3.1 of RFC 3987. However, since we are
133 assuming input is either UTF-8 or unicode already, we can simplify things a
134 little from the full method.
135
136 Returns an ASCII string containing the encoded result.
137 """
138 # The list of safe characters here is constructed from the "reserved" and
139 # "unreserved" characters specified in sections 2.2 and 2.3 of RFC 3986:
140 #     reserved    = gen-delims / sub-delims
141 #     gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
142 #     sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
143 #                   / "*" / "+" / "," / ";" / "="
144 #     unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
145 # Of the unreserved characters, urllib.quote already considers all but
146 # the ~ safe.
147 # The % character is also added to the list of safe characters here, as the
148 # end of section 3.1 of RFC 3987 specifically mentions that % must not be
149 # converted.
150 if iri is None:
151     return iri
152 return urllib.quote(smart_str(iri), safe="/#%[]=:;$&()+,!?*@'~")


urlencode:
11 """
12 A version of Python's urllib.quote() function that can operate on unicode
13 strings. The url is first UTF-8 encoded before quoting. The returned string
14 can safely be used as part of an argument to a subsequent iri_to_uri() call
15 without double-quoting occurring.
16 """
17 return force_unicode(urllib.quote(smart_str(url), safe='/'))

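So iriencode leaves the reserved URI characters alone and only escapes
what is not legal in a URI (hence the longer list of safe characters),
while urlencode escapes every reserved character except "/", including
the "?", "&", "=" and "#" that delimit GET arguments and anchors.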
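
To make the difference concrete, here is a rough sketch that applies the
two safe lists quoted above to the same string. It uses Python 3's
urllib.parse.quote as a stand-in for urllib.quote plus smart_str, and the
names iriencode_sketch, urlencode_sketch and the sample URL are made up
for illustration, so treat it as an approximation of the filters rather
than Django's actual code.

```python
from urllib.parse import quote

# Safe lists copied from the two snippets above.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"
URL_SAFE = "/"


def iriencode_sketch(value):
    """Approximate iriencode: reserved characters and % are left alone."""
    return quote(value, safe=IRI_SAFE)


def urlencode_sketch(value):
    """Approximate urlencode: only "/" is kept among the reserved characters."""
    return quote(value, safe=URL_SAFE)


# Hypothetical URL with a non-ASCII character, a GET argument and an anchor.
sample = "/blog/café?next=/home#top"

iri_result = iriencode_sketch(sample)
url_result = urlencode_sketch(sample)

print(iri_result)  # /blog/caf%C3%A9?next=/home#top
print(url_result)  # /blog/caf%C3%A9%3Fnext%3D/home%23top

# Because % is in the IRI safe list, running the iriencode sketch over
# already-escaped output does not escape it a second time.
print(iriencode_sketch(url_result) == url_result)  # True
```

The last line is the "without double-quoting" property that the urlencode
docstring mentions: the second pass leaves the existing %XX sequences as
they are.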

As for usage, I haven't encountered any IRIs myself, but I believe IRIs
need to be encoded before inclusion in HTML (i.e. you can't just include
the non-ASCII characters in HTML). As for urlencode, its main use is
when a URL is itself a value inside another URL or a form submission,
e.g. the URL to go to after login. urlencode escapes everything that
iriencode escapes, plus the reserved characters, so you would not want
to apply it to a complete URL that you intend to link to directly.
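
Here is a small sketch of the "URL to go to after login" case. The
/login/ path, the next parameter name and the target URL are all
assumptions for the example, and urlencode_sketch is again a Python 3
approximation of the filter, not Django's own function. It shows what
happens to the nested URL when it is and is not escaped first.

```python
from urllib.parse import parse_qs, quote, urlsplit


def urlencode_sketch(value):
    """Approximate urlencode: escape everything reserved except "/"."""
    return quote(value, safe="/")


# Hypothetical page to return to after logging in.
target = "/accounts/profile/?tab=settings&lang=fr"

# Hypothetical login links, without and with escaping of the nested URL.
raw_link = "/login/?next=" + target
escaped_link = "/login/?next=" + urlencode_sketch(target)

raw_params = parse_qs(urlsplit(raw_link).query)
escaped_params = parse_qs(urlsplit(escaped_link).query)

# Unescaped: the "&" inside the target splits it into two parameters.
print(raw_params)
# {'next': ['/accounts/profile/?tab=settings'], 'lang': ['fr']}

# Escaped: the whole target survives as the single "next" value.
print(escaped_params)
# {'next': ['/accounts/profile/?tab=settings&lang=fr']}
```

With the iriencode safe list the "&" would be left as-is, so the first
result is what you would get there as well.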

On 5 September 2010 08:17, Jordon Wii <jordonwii@gmail.com> wrote:
> Anyone?  I haven't found anything that describes the difference
> (except that one is for URI's and the other for URLs).
>
> On Sep 4, 8:52 am, Jordon Wii <jordon...@gmail.com> wrote:
>> What's the difference between the template filters urlencode and
>> iriencode?  When should I use one over the other (or use both)?
>
> --
> You received this message because you are subscribed to the Google Groups "Django users" group.
> To post to this group, send email to django-users@googlegroups.com.
> To unsubscribe from this group, send email to django-users+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
>
>
