javascript – RegExp.exec() returns NULL sporadically

Exception or error:

I am seriously going crazy over this, and I’ve already spent a disproportionate amount of time trying to figure out what’s going on here. So please give me a hand =)

I need to do some RegExp matching of strings in JavaScript. Unfortunately it behaves very strangely. This code:

var rx = /(cat|dog)/gi;
var w = new Array("I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.");

for (var i in w) {
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    }else{
        document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
}

It returns “cat” and “dog” for the first two elements, as it should, but then some exec() calls start returning null. I don’t understand why.

I posted a Fiddle here, where you can run and edit the code.

And so far I’ve tried this in Chrome and Firefox.

Cheers!

/Christofer

How to solve:

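Oh, here it is. Because you’re defining your regex as global, it remembers where the previous match ended: it matches cat first, and on the second pass of the loop it continues from that position and matches dog. So, basically, you just need to reset your regex (its internal pointer) between strings as well. Compare this version, which creates a fresh regex on each pass: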

var w = new Array("I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.");

for (var i in w) {
    var rx = /(cat|dog)/gi;
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<p>" + i + "<br/>INPUT: " + w[i] + "<br/>MATCHES: " + w[i].length + "</p>");
    }else{
        document.writeln("<p><b>" + i + "<br/>'" + w[i] + "' FAILED.</b><br/>" + w[i].length + "</p>");
    }
    document.writeln(m);
}

###

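The regex object has a property lastIndex, which is updated when you run exec with the g flag set. So when you exec the regex on e.g. “I have a cat and a dog too.”, lastIndex is set to 12. The next time you run exec on the same regex object, it starts looking from index 12, even if you pass it a different string. When no match is found from that position, exec returns null and lastIndex goes back to 0, which is why the failures look sporadic. So you have to reset the lastIndex property between each run.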
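To make the lastIndex behaviour above concrete, here is a small sketch that records lastIndex after each exec call on one shared global regex. The variable names (trace, inputs) are only for illustration and are not from the question.

```javascript
// One shared regex with the g flag, as in the question.
var rx = /(cat|dog)/gi;

var inputs = [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too.",
    "There once was a dog and a cat."
];

// Record what each exec() call matched and where the regex will resume.
var trace = [];
for (var i = 0; i < inputs.length; i++) {
    var m = rx.exec(inputs[i]);
    trace.push({ match: m ? m[1] : null, lastIndex: rx.lastIndex });
}

// trace[0]: "cat" found at index 9,  lastIndex becomes 12
// trace[1]: search starts at 12, "dog" found at index 17, lastIndex becomes 20
// trace[2]: search starts at 20, nothing left to match: null, lastIndex reset to 0
// trace[3]: search starts at 0 again, "dog" found, lastIndex becomes 20
console.log(trace);
```

The pattern then repeats: every other call after the first two fails, which matches the output described in the question.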

###

Two things:

  1. The reset mentioned above is needed when using the g (global) flag. To solve this I recommend simply assigning 0 to the lastIndex member of the RegExp object. This performs better than destroying and recreating the regex.
  2. Be careful when using the in keyword to walk an Array object, because it can lead to unexpected results with some libs. Sometimes you should check with something like isNaN(i), or, if you know the array doesn’t have holes, use the classic for loop.

The code can be:

var rx = /(cat|dog)/gi;
var w = ["I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat."];

for (var i in w)
 if(!isNaN(i))        // Optional, check it is an element if Array could have some odd members.
  {
   var m = null;
   m = rx.exec(w[i]); // Run
   rx.lastIndex = 0;  // Reset
   if(m)
    {
     document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    } else {
     document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
  }
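As a variation on the code above, here is a minimal sketch that uses the classic for loop and resets lastIndex before each exec call. The helper name firstMatches and the extra property are hypothetical, added only to illustrate both points from the list.

```javascript
// Hypothetical helper: first captured group for each string, or null.
function firstMatches(rx, strings) {
    var results = [];
    for (var i = 0; i < strings.length; i++) {
        rx.lastIndex = 0;                 // reset before each new input string
        var m = rx.exec(strings[i]);
        results.push(m ? m[1] : null);
    }
    rx.lastIndex = 0;                     // leave the regex in a clean state
    return results;
}

var w = [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too."
];
var results = firstMatches(/(cat|dog)/gi, w);
console.log(results);

// Why for...in can surprise: it also visits non-index properties.
w.extra = "not an element";               // stands in for something a lib might add
var keys = [];
for (var k in w) {
    keys.push(k);                         // keys are strings: "0", "1", "2", "extra"
}
var numericKeys = keys.filter(function (key) {
    return !isNaN(key);                   // the isNaN check drops "extra"
});
console.log(keys, numericKeys);
```

If only the first match per string is needed, dropping the g flag also avoids the problem, since exec on a regex without the g (or y) flag always starts at index 0.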

###

I had a similar problem using only the /g flag, and the solution proposed here did not work for me in Firefox 3.6.8. I got my script working with

var myRegex = new RegExp("my string", "g");

I’m adding this in case someone else has the same problem I did with the above solution.
