Most programming languages have APIs for regular expression searching and replacing. In my experience these APIs can be quite clunky, probably because of the number of actions available and efficiency considerations.
If you were going to implement an API, which one would you emulate?
Of particular interest are the methods and objects of the API, but also the regexp dialect and adherence to any standards.
If you emulate an API, it is going to be just as clunky as the original (if not more so). I don't see what you are getting at. If you are really worried about losing 100 KB to a regex API, you should implement only a minimalistic subset, which wouldn't resemble a large one. Check whether any existing APIs have configuration options to disable features you don't need.
I think the Lua pattern matching API is an excellent API to emulate. It has a superb balance of power versus simplicity. And there's one brilliant design choice: the escape character for regular expressions is different from the escape character for string literals—so there's no backslash hell.
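To make the "backslash hell" point concrete, here is a minimal sketch in Python (chosen only for illustration, since the question names no language). It shows what happens when the regex escape and the string-literal escape are the same character. In Lua the pattern escape is `%`, so this doubling never comes up. The helper name `split_on_backslash` is made up for the example.

```python
import re

# The string parser and the regex engine both consume backslashes, so
# matching ONE literal backslash takes FOUR in an ordinary string literal.
plain = "\\\\"   # string contents: two backslashes -> regex: one literal backslash
raw = r"\\"      # raw string: the same two characters, written less painfully

path = "C:\\temp\\file.txt"   # string contents: C:\temp\file.txt


def split_on_backslash(pattern, text):
    """Split text wherever the pattern (a literal backslash) matches."""
    return re.split(pattern, text)


print(split_on_backslash(plain, path))
print(split_on_backslash(raw, path))
```

Both calls split the path into the same three components; the raw-string form is Python's workaround for a problem that a separate escape character avoids by design.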
If I were to add one thing to the Lua API it would be or-patterns.
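For readers unfamiliar with the term, an or-pattern is what most regex dialects call alternation (`|`). A small sketch using Python's `re` module, purely as an illustration of the feature; the name `is_animal` is hypothetical.

```python
import re

# "cat|dog" matches either alternative; Lua patterns have no equivalent operator.
animal = re.compile(r"cat|dog")


def is_animal(word):
    """Return True if the whole word is exactly one of the alternatives."""
    return animal.fullmatch(word) is not None


print(is_animal("cat"), is_animal("dog"), is_animal("cow"))
```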
Having actually implemented a full regular expression engine (used in-house in my company's products such as RegexBuddy) and a publicly available "API" based on PCRE (the TPerlRegEx component for Delphi), I recommend that you not worry too much about emulating this or that, and instead focus on what your regex library will be used for. Unfortunately, you don't say much about this other than mentioning efficiency. A properly developed library doesn't have to be less efficient just because it has more features available. For example, PCRE offers a feature-rich regex flavor and excellent performance, but a limited set of library features around it (e.g. no search-and-replace). Adding more library features such as search-and-replace wouldn't make PCRE slower, because unused calls don't even have to be linked into the final .exe.
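As a rough illustration of why search-and-replace is a library layer rather than an engine feature, here is a sketch that builds a replace function from nothing but a "find the next match" primitive. It uses Python's `re.finditer` as a stand-in for the matching engine; the function name `replace_all` is hypothetical, the replacement is treated as literal text (no backreferences), and this is not a description of how PCRE or TPerlRegEx work internally.

```python
import re


def replace_all(pattern, replacement, text):
    """Replace every match of pattern in text with a literal replacement.

    Built only on top of match iteration, so the matching engine itself
    needs no knowledge of replacing.
    """
    pieces = []
    pos = 0
    for match in re.finditer(pattern, text):
        pieces.append(text[pos:match.start()])  # unmatched text before this match
        pieces.append(replacement)              # literal replacement text
        pos = match.end()
    pieces.append(text[pos:])                   # tail after the last match
    return "".join(pieces)


print(replace_all(r"\d+", "#", "a1b22c"))
```

A program that never calls `replace_all` pays nothing for its existence, which is the point being made about unused library calls.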
There are no regex standards, only conventions that are frequently flouted in subtle ways. If "standards" matter, simply use one of the popular regex libraries, even if it isn't perfect.
If you want something minimalistic off the shelf, dig up a copy of Henry Spencer's regex.c, which implements POSIX regular expressions.