A set of utility scripts for working with Unicode characters.
NOTE: This extension grew out of my own needs, so its feature list is rather short for the moment. I'll extend it as new requirements of my own come up, and feature requests are welcome.
Implementation Details: Unicode properties can be hard to decode: a huge number of codepoints count as letters, and those in turn have to be split into uppercase, lowercase, titlecase, and so on. For that reason I built a few data files, one per Unicode character property I considered useful. Each file is simply an encoded database that answers whether a given Unicode codepoint has that particular property. Each file takes up 136 KB and is cached in memory the first time it is needed, so the first query for a property has to load that property's entire file into memory. You can control this at a fine-grained level by loading and unloading caches explicitly as you see fit.
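To make the caching behaviour concrete, here is a minimal Python sketch of a per-property lookup table that is loaded on first use and can be loaded or unloaded by hand. It is not this extension's code or API: the names PropertyCache, build_table and has_property are hypothetical, and the one-bit-per-codepoint layout is only an assumption. It happens to match the stated size (0x110000 codepoints / 8 = 139,264 bytes = 136 KB), but the real file encoding may differ. The table is built in memory from Python's unicodedata module as a stand-in for reading a file from disk.

```python
import unicodedata

MAX_CODEPOINTS = 0x110000          # U+0000 through U+10FFFF
TABLE_BYTES = MAX_CODEPOINTS // 8  # 139,264 bytes, i.e. 136 KB per property


def build_table(predicate):
    """Pack predicate(codepoint) into a bitmap, one bit per codepoint.

    Stand-in for reading a prebuilt property file (assumed layout).
    """
    table = bytearray(TABLE_BYTES)
    for cp in range(MAX_CODEPOINTS):
        if predicate(cp):
            table[cp >> 3] |= 1 << (cp & 7)
    return bytes(table)


class PropertyCache:
    """Hypothetical cache: each property table is loaded at most once."""

    def __init__(self, loaders):
        self._loaders = loaders  # property name -> function returning bytes
        self._tables = {}        # property name -> loaded bitmap

    def load(self, name):
        # Explicit preload; does nothing if the table is already cached.
        if name not in self._tables:
            self._tables[name] = self._loaders[name]()

    def unload(self, name):
        # Release the memory held by one property table.
        self._tables.pop(name, None)

    def is_loaded(self, name):
        return name in self._tables

    def has_property(self, name, cp):
        # The first query for a property pays the full loading cost.
        self.load(name)
        table = self._tables[name]
        return bool(table[cp >> 3] & (1 << (cp & 7)))


def is_uppercase(cp):
    # General category "Lu" (uppercase letter) as an example property.
    return unicodedata.category(chr(cp)) == "Lu"


cache = PropertyCache({"uppercase": lambda: build_table(is_uppercase)})
```

With this layout a query is a single byte read and a bit test once the table is in memory, which is why the only noticeable cost is the initial load. Calling cache.load("uppercase") ahead of time moves that cost to a moment of your choosing, and cache.unload("uppercase") frees the table when it is no longer needed.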
Please contact me with feature requests or bug reports.