Monday, February 28, 2011

Re: How to check if string is in Hebrew

On Sun, Feb 27, 2011 at 3:01 PM, ydjango <traderashish@gmail.com> wrote:
> Hebrew is within the unicode range 0x590 to 0x5ff.
>
> I tried
> if lang_string[0] >= u'0x590' and lang_string[0] <= u'0x5ff':
>
> but it does not seem to work.

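That isn't the correct syntax for unicode string literals. u'0x590' is
just the five-character string "0x590", not the character U+0590, so
your comparison is really being made against the character '0'. What
you are trying to do should look like this: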

if lang_string[0] >= u'\u0590' and lang_string[0] <= u'\u05ff':

(See http://docs.python.org/reference/lexical_analysis.html#strings
for all the details on \u, \U, \x and their friends)
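
To make the difference concrete, here is a small sketch comparing the two
spellings. The variable names (literal, escaped, alef) are only for
illustration, and the sample character U+05D0 (alef) is my own choice:

```python
# u'0x590' is an ordinary string of five characters: '0', 'x', '5', '9', '0'.
literal = u'0x590'
# u'\u0590' is a single character, the code point U+0590.
escaped = u'\u0590'

print(len(literal))            # 5
print(len(escaped))            # 1
print(ord(escaped) == 0x590)   # True

# Strings compare character by character, so the original test ends up
# comparing the Hebrew letter against '0' (U+0030) and fails.
alef = u'\u05d0'
wrong = u'0x590' <= alef <= u'0x5ff'
right = u'\u0590' <= alef <= u'\u05ff'
print(wrong)                   # False
print(right)                   # True
```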

Testing just the first character of the string may or may not work for
general input; that depends entirely on your problem (and your users).
If it were me, I would define a utility function like this:

def char_is_hebrew(char):
    return char >= u'\u0590' and char <= u'\u05ff'

and then test all of the characters in the string, either with

if any(map(char_is_hebrew, lang_string)):
    <some character is in the range>

or

if all(map(char_is_hebrew, lang_string)):
    <every character is in the range>
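
Putting those pieces together, a self-contained sketch might look like the
following. The wrapper names contains_hebrew and is_all_hebrew and the
sample strings are hypothetical, picked only for this example. Two things
to keep in mind: all() returns True for an empty string, and spaces, digits
and punctuation fall outside the Hebrew block, so a strict "every
character" test will reject ordinary sentences.

```python
def char_is_hebrew(char):
    # True if the character lies in the Hebrew block U+0590..U+05FF.
    return u'\u0590' <= char <= u'\u05ff'

def contains_hebrew(text):
    # Hypothetical helper: at least one character is in the range.
    return any(char_is_hebrew(c) for c in text)

def is_all_hebrew(text):
    # Hypothetical helper: every character is in the range.
    # The bool(text) guard is an assumption on my part: it makes the
    # empty string count as "not Hebrew" instead of vacuously True.
    return bool(text) and all(char_is_hebrew(c) for c in text)

shalom = u'\u05e9\u05dc\u05d5\u05dd'   # the word "shalom"
mixed = u'abc ' + shalom

print(contains_hebrew(shalom), is_all_hebrew(shalom))   # True True
print(contains_hebrew(mixed), is_all_hebrew(mixed))     # True False
print(contains_hebrew(u'hello'), is_all_hebrew(u''))    # False False
```

Which of the two tests you want depends on whether mixed-language input
should count as Hebrew for your purposes.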

--
Regards,
Ian Clelland
<clelland@gmail.com>

