Monday, February 28, 2011

Re: How to check if string is in Hebrew

> Hebrew is within the unicode range 0x590 to 0x5ff.
> I tried
> if lang_string[0] >= u'0x590' and lang_string[0] <= u'0x5ff':
> but it does not seem to work.

That isn't the correct syntax for Unicode string literals: u'0x590' is
just the five-character string "0x590", not the character U+0590. What
you are trying to do should look like this:

if lang_string[0] >= u'\u0590' and lang_string[0] <= u'\u05ff':

(See the string literals section of the Python language reference
for all the details on \u, \U, \x and their friends.)
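A quick way to convince yourself of the difference is to compare the two literals side by side. This is only a sketch; the variable names are mine, and alef is just a convenient sample character from the Hebrew block.

```python
# -*- coding: utf-8 -*-
wrong_bound = u'0x590'    # five ordinary characters: '0', 'x', '5', '9', '0'
right_bound = u'\u0590'   # a single character, code point U+0590

alef = u'\u05d0'          # HEBREW LETTER ALEF, inside the Hebrew block

# Strings compare lexicographically, so alef is compared against the
# digit '0' first. It sorts after '0', which makes the upper-bound test fail.
original_test = alef >= u'0x590' and alef <= u'0x5ff'

# With \u escapes, both bounds are single characters and the range test works.
fixed_test = alef >= u'\u0590' and alef <= u'\u05ff'

print(len(wrong_bound), len(right_bound))
print(original_test, fixed_test)
```

That is why the original test quietly evaluates to false instead of raising an error.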

Testing just the first character of the string may or may not work for
general input; that depends entirely on your problem (and your users).
If it were me, I would define a utility function like this:

def char_is_hebrew(char):
    return char >= u'\u0590' and char <= u'\u05ff'

and then test all of the characters in the string, either with

if any(map(char_is_hebrew, lang_string)):
    <some character is in the range>

or

if all(map(char_is_hebrew, lang_string)):
    <every character is in the range>
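Two things to watch for with the all() version: real text usually contains spaces, digits and punctuation, which fall outside the Hebrew range, and all() returns True for an empty string. One possible way around both, as a rough sketch: letters_are_hebrew is a hypothetical helper name of my own, and skipping non-letters is an assumption about what you want, not the only sensible policy.

```python
# -*- coding: utf-8 -*-

def char_is_hebrew(char):
    # True if the single character lies in the Hebrew block U+0590..U+05FF.
    return u'\u0590' <= char <= u'\u05ff'

def letters_are_hebrew(text):
    # Hypothetical helper: look only at alphabetic characters, so spaces,
    # digits and punctuation do not count against the string.
    letters = [c for c in text if c.isalpha()]
    # Require at least one letter, because all() of an empty sequence is True.
    return bool(letters) and all(char_is_hebrew(c) for c in letters)

hebrew_greeting = u'\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd!'
mixed = u'shalom \u05e9\u05dc\u05d5\u05dd'

print(all(map(char_is_hebrew, hebrew_greeting)))  # the space and '!' fail the test
print(letters_are_hebrew(hebrew_greeting))
print(letters_are_hebrew(mixed))
print(any(map(char_is_hebrew, mixed)))
```

Whether a mixed string should count as Hebrew is again up to your problem; any() and the helper above simply sit at opposite ends of that choice.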

Ian Clelland

