BTW, right at the moment, regexes are rather enemy territory for me.
Perhaps you or somebody else could step in and try to create the right "plugin" package, ready for subsequent integration with String (WString, maybe even Stream?).
BTW, a Stream variant would be usable for String as well by adapting it via StringStream; performance-wise, that is quite cheap. But then the Unicode problem remains... maybe use UTF-8 only? E.g. in CodeEditor (in TheIDE), all search operations are performed in UTF-8 anyway, ditto for files in most cases.
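
To illustrate the "write it once for Stream, adapt String through a string stream" idea, here is a rough sketch in plain standard C++ (C++17), not U++ code. std::istream, std::istringstream and std::regex stand in for Stream, StringStream and whatever regex engine the package would wrap, and the names FindInStream and FindInString are made up for this example. It treats the input as raw UTF-8 bytes and scans line by line, so a match cannot span a newline. Unlike what is assumed above for StringStream, std::istringstream copies the string, so this says nothing about the real cost of the adapter.

```cpp
#include <cstddef>
#include <istream>
#include <optional>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper (not a U++ API): returns the byte offset of the first
// match of `re` in `in`, or std::nullopt if there is none.
// Assumption: input is UTF-8 and is scanned line by line, so a match
// cannot span a newline.
std::optional<std::size_t> FindInStream(std::istream& in, const std::regex& re)
{
    std::string line;
    std::size_t base = 0;  // byte offset of the start of the current line
    while (std::getline(in, line)) {
        std::smatch m;
        if (std::regex_search(line, m, re))
            return base + static_cast<std::size_t>(m.position(0));
        base += line.size() + 1;  // +1 for the '\n' consumed by getline
    }
    return std::nullopt;
}

// String variant: wrap the string in a stream and reuse the stream code.
// Hypothetical name; std::istringstream plays the role of StringStream here.
std::optional<std::size_t> FindInString(const std::string& utf8, const std::regex& re)
{
    std::istringstream in(utf8);
    return FindInStream(in, re);
}
```

Calling FindInString(text, std::regex("bar[0-9]+")) gives a byte offset into the UTF-8 data, not a character index, which is exactly the Unicode question raised above: with UTF-8 only, offsets and matching stay byte-based, and any conversion to character positions would have to happen at the String/WString boundary.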