var URL_REGEX = /(http:\/\/)?(\w+\.{0,1})*client\.org/i;
function stripUrlPrefix (url) {
  return url.replace(URL_REGEX, '');
}
stripUrlPrefix() removes the scheme, if present, and the server name, and it usually works like a champ.
However, Tony on the testing team found that the following input string (a real path name from one of our servers) sends the regex engines in IE 7 and Firefox 3 completely out to lunch:
/images/ap//AP_News_Wire:_World_News/3_Australia_Thirsty_Camels.sff_300.jpg
On my middle-of-the-line Windows XP laptop, IE 7 takes about 10 minutes to execute stripUrlPrefix() on this input string; Firefox just pegs the CPU and never returns. Jason is going to give this code a spin on Chrome to see what happens.
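I have somehow stumbled into some kind of backtracking morass with a regex that looks pretty vanilla to me, and an input string that's likewise not too gnarly. The likely culprit is the nested quantifier (\w+\.{0,1})*: because the dot is optional, a long run of word characters can be divided among the group's repetitions in exponentially many ways, and with no anchor the engine retries that whole search from every position in the string before giving up.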
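To get a feel for the size of that morass, here is a minimal sketch that counts how many ways the group (\w+\.{0,1})* can carve up a single run of word characters. The helper countSplits and the variable names are made up for illustration, and the count is only a rough model of the search space, not a measurement of what either browser's engine actually does.

```javascript
// Count the ways a run of n word characters can be cut into consecutive,
// non-empty \w+ pieces (the optional dot never matches inside the run).
function countSplits(n) {
  var ways = [1]; // exactly one way to split an empty run
  for (var i = 1; i <= n; i++) {
    var total = 0;
    // The last piece can start at any earlier cut point j.
    for (var j = 0; j < i; j++) {
      total += ways[j];
    }
    ways.push(total);
  }
  return ways[n];
}

var path = '/images/ap//AP_News_Wire:_World_News/3_Australia_Thirsty_Camels.sff_300.jpg';

// Find the longest run of word characters in the problem path.
var runs = path.match(/\w+/g);
var longest = runs.reduce(function (a, b) {
  return b.length > a.length ? b : a;
});

console.log(longest, longest.length, countSplits(longest.length));
// prints: 3_Australia_Thirsty_Camels 26 33554432
```

The longest run in Tony's path is 26 characters, which works out to roughly 33.5 million splits for that run alone, and the unanchored regex repeats a similar search from each later starting position.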
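It turns out that we can fix the problem by trimming leading whitespace from the input and adding a beginning-of-string anchor to the regular expression, so that the engine attempts the match only at the start of the string and fails right away on a path like the one above: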
var URL_REGEX = /^(http:\/\/)?(\w+\.{0,1})*client\.org/i;
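Put together, the fix might look something like the sketch below. The names ANCHORED_URL_REGEX and stripUrlPrefixAnchored are hypothetical, and the trim is done with a plain replace on the assumption that String.prototype.trim is unavailable (IE 7 does not have it).

```javascript
// Same pattern as before, but anchored to the start of the string.
var ANCHORED_URL_REGEX = /^(http:\/\/)?(\w+\.{0,1})*client\.org/i;

function stripUrlPrefixAnchored(url) {
  // Trim leading whitespace so the ^ anchor lines up with the first real character.
  var trimmed = url.replace(/^\s+/, '');
  return trimmed.replace(ANCHORED_URL_REGEX, '');
}

var problemPath = '/images/ap//AP_News_Wire:_World_News/3_Australia_Thirsty_Camels.sff_300.jpg';

console.log(stripUrlPrefixAnchored('  http://www.client.org/images/a.jpg'));
// prints: /images/a.jpg
console.log(stripUrlPrefixAnchored(problemPath) === problemPath);
// prints: true (no prefix to strip, and the match fails at the first character)
```

One caveat: the anchor stops the engine from retrying at every offset, but the nested quantifier is still there, so an input that starts with a long run of word characters and lacks client.org could still backtrack heavily. Making the dot mandatory inside the group, as in (\w+\.)*, would remove the ambiguity, though it also changes which strings match.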
I haven't checked to see whether explicitly using the RegExp class would make a difference.