
JavaScript regular expression for a range of six-digit base36 numbers


Problem

I'm building a Node.js application and storing a six-digit base36 representation of a Unix timestamp (in seconds) as the first part of an _id in MongoDB. A typical _id looks like this:

"_id" : "lwhlzy/czwszasfgr/a4d18976c1/f835caa1c3/184d06b47f"

Several pieces of data are concatenated, the timestamp first and then a series of hashed values, to form both a GUID and a "materialized path".
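As a minimal sketch of the timestamp prefix described above: the helper names encodeTimestamp and decodeTimestamp are hypothetical, and the hashed path segments are left out because the post does not say how they are produced.

```javascript
// Hypothetical helpers; only the timestamp prefix of the _id is modelled here.
function encodeTimestamp(seconds) {
    var s = Math.floor(seconds).toString(36);
    // left-pad so every prefix has the same width and sorts lexicographically
    while (s.length < 6) {
        s = '0' + s;
    }
    return s;
}

function decodeTimestamp(prefix) {
    return parseInt(prefix, 36);
}

var now = Math.floor(Date.now() / 1000);
console.log(encodeTimestamp(now));
console.log(new Date(decodeTimestamp('lwhlzy') * 1000).toISOString());

// Six base36 digits run out 36^6 seconds after the epoch.
console.log(new Date(Math.pow(36, 6) * 1000).toISOString()); // 2038-12-24T05:45:36.000Z
```

Because digits sort before lowercase letters in ASCII, equal-width prefixes compare as strings in the same order as the timestamps they encode, which is what makes a rooted regex over the prefix meaningful.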

Later queries will select records by time range and then by path, to get the events that happened on a particular path during that period. These queries will rely on rooted regular expressions, so I need a regex that matches a range of base36 numbers.

This is the code I have so far (a test to run via node, and yes, it is hard-coded to six digits; the seventh digit won't be needed until Dec 24th 2038 UTC):

var base36 = "0123456789abcdefghijklmnopqrstuvwxyz";

// determine how many left-most characters from & to have in common
// this function works nicely, no problems here
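var getOverlap = function (from, to) {
    // declare locals explicitly so nothing leaks into the global scope
    var regex = '';
    var count = to.length;

    // alternation of every prefix of `to`, longest first
    for (var i = 0; i < to.length; i++) {
        regex += (i>0?'|':'')+'('+to.slice(0,count)+')';
        count--;
    }

    var result = from.match(RegExp(regex,"ig"));
    // match() returns null when the strings share no prefix
    return result ? result[0] : '';
};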

var from = "lec0s0"; 
var to = "lwhvqg"; // generated from: parseInt(Date.now()/1000,10).toString(36)

var overlap = getOverlap(from,to);

console.log(from);
console.log(to);

var regex = overlap;
var i = overlap.length;
// start immediately after the left-most common characters and append the rest of the regex
while (i

Which will output something like this:

l[efghijklmnopqrstuvw][cdefgh][0123456789abcdefghijklmnopqrstuv][stuvwxyz0123456789abcdefghijklmnopq][0123456789abcdefg]

After studying this I realized there are two main issues: 1) it's not quite right for a true range (it would skip huge chunks of records), and 2) I'd rather have character ranges like [e-w] instead of every character stated explicitly, though the explicit form still works.
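The listing above breaks off at the while loop. The following is a sketch of one loop that reproduces the output shown, not necessarily the original code; span and naivePattern are hypothetical names, and the wrap-around past "z" is inferred from the fifth character class in that output. It also demonstrates the first issue: a value inside the range is rejected.

```javascript
var base36 = "0123456789abcdefghijklmnopqrstuvwxyz";

// Hypothetical helper: the characters from `a` to `b` in base36 order,
// wrapping past "z" back to "0" when `a` sorts after `b`.
function span(a, b) {
    var start = base36.indexOf(a);
    var end = base36.indexOf(b);
    return start <= end
        ? base36.slice(start, end + 1)
        : base36.slice(start) + base36.slice(0, end + 1);
}

// Assumed shape of the loop: keep the shared prefix, then emit one
// explicit character class per remaining position.
function naivePattern(from, to, overlap) {
    var regex = overlap;
    var i = overlap.length;
    while (i < from.length) {
        regex += '[' + span(from[i], to[i]) + ']';
        i++;
    }
    return regex;
}

var pattern = naivePattern("lec0s0", "lwhvqg", "l");
console.log(pattern);

// "lf0000" lies between the two bounds, but its third character "0"
// is not in [cdefgh], so the rooted pattern does not match it.
console.log(new RegExp('^' + pattern).test("lf0000")); // false
```

Treating each position independently is what goes wrong: the allowed characters at one position depend on whether the earlier positions sit exactly on a bound.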

For input from="lec0s0" and to="lwhvqg" I realize I'm missing a large part of this regex. For example, the code above only allows the 3rd character a range from c-h, but that position will need to reach "z" before the 2nd character can increment. I've determined that I actually need a regex that looks more like this:

l[e-v][0-9a-z][0-9a-z][0-9a-z][0-9a-z]|l[e-w][c-g][0-9a-z][0-9a-z][0-9a-z]|l[e-w][c-h][0-9a-u][0-9a-z][0-9a-z]|l[e-w][c-h][0-9a-v][0-9a-o][0-9a-z]|l[e-w][c-h][0-9a-v][0-9a-q][0-9a-g]

So my question is: am I right to conclude the regex needs to look like the latter above? And if so, how might I modify the code to generate it?

Thanks in advance!

Problem courtesy of: talentedmrjones

Solution

Your proposed pattern will match everything from le0000 upward, which starts below lec0s0. What you actually want to match is:

lec0s[0-9a-z]|lec0[t-z][0-9a-z]{1}|lec[1-9a-z][0-9a-z]{2}|le[d-z][0-9a-z]{3}|l[f-v][0-9a-z]{4}|lw[0-9a-g][0-9a-z]{3}|lwh[0-9a-u][0-9a-z]{2}|lwhv[0-9a-p][0-9a-z]{1}|lwhvq[0-9a-g]
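A quick way to compare the two patterns is to test both against a few boundary values. This is only a sketch: the variable names and the sample values are chosen here for illustration, and both patterns are anchored at both ends for the test.

```javascript
// The pattern proposed in the question.
var proposed = new RegExp('^(?:' +
    'l[e-v][0-9a-z][0-9a-z][0-9a-z][0-9a-z]|' +
    'l[e-w][c-g][0-9a-z][0-9a-z][0-9a-z]|' +
    'l[e-w][c-h][0-9a-u][0-9a-z][0-9a-z]|' +
    'l[e-w][c-h][0-9a-v][0-9a-o][0-9a-z]|' +
    'l[e-w][c-h][0-9a-v][0-9a-q][0-9a-g]' +
    ')$');

// The pattern given in the answer.
var corrected = new RegExp('^(?:' +
    'lec0s[0-9a-z]|lec0[t-z][0-9a-z]{1}|lec[1-9a-z][0-9a-z]{2}|' +
    'le[d-z][0-9a-z]{3}|l[f-v][0-9a-z]{4}|lw[0-9a-g][0-9a-z]{3}|' +
    'lwh[0-9a-u][0-9a-z]{2}|lwhv[0-9a-p][0-9a-z]{1}|lwhvq[0-9a-g]' +
    ')$');

// Equal-width base36 strings compare correctly as plain strings.
function inRange(s) {
    return s >= 'lec0s0' && s <= 'lwhvqg';
}

['le0000', 'lec0rz', 'lec0s0', 'lw0000', 'lwhvqg', 'lwhvqh'].forEach(function (s) {
    console.log(s, 'expected:', inRange(s),
        'proposed:', proposed.test(s), 'corrected:', corrected.test(s));
});
```

On these samples the proposed pattern accepts le0000 and lec0rz, which are below the lower bound, and rejects lw0000, which is inside the range; the answer's pattern agrees with the string comparison on all six.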

The following function should give you the regex you need:

function getRegex(from,to) {
    var base36 = '0123456789abcdefghijklmnopqrstuvwxyz',
        getRange = function(f,t) {
            if(f == t) {
                return f;
            }
            if(base36.indexOf(f) >= base36.indexOf(t)) {
                return t;
            } 
            if(t <= '9' || f >= 'a'){
                return '[' +f+'-'+t+']';
            }
            return '[' +f+(f<'9'?'-9':'')+(t>'a'?'a-':'')+t+']';    
        },
        from = from.split(''),
        to = to.split(''),
        prefix='', 
        regex=[], 
        tmp,i,l;

    for(i=0,l=from.length;i

You can see it live here: http://jsfiddle.net/3cu52/3/
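Because the listing above is cut off, here is a self-contained sketch of one way to generate a pattern of the same shape. It is not the answer's getRegex: the names rangePattern, charClass and anyDigits are hypothetical, and it assumes both inputs are lowercase base36 strings of equal length with from <= to.

```javascript
var BASE36 = '0123456789abcdefghijklmnopqrstuvwxyz';

// Character class for base36 digit indices lo..hi (inclusive), or null if empty.
function charClass(lo, hi) {
    if (lo > hi) return null;
    if (lo === hi) return BASE36[lo];
    var parts = [];
    // digits and letters are not adjacent in ASCII, so emit them as separate runs
    [[lo, Math.min(hi, 9)], [Math.max(lo, 10), hi]].forEach(function (seg) {
        if (seg[0] > seg[1]) return;
        parts.push(seg[0] === seg[1]
            ? BASE36[seg[0]]
            : BASE36[seg[0]] + '-' + BASE36[seg[1]]);
    });
    return '[' + parts.join('') + ']';
}

// n arbitrary base36 digits
function anyDigits(n) {
    return n > 0 ? '[0-9a-z]{' + n + '}' : '';
}

// Unanchored alternation matching every s with from <= s <= to.
function rangePattern(from, to) {
    var len = from.length, p = 0, alts = [], i;
    var f = function (k) { return BASE36.indexOf(from[k]); };
    var t = function (k) { return BASE36.indexOf(to[k]); };
    var add = function (prefix, cls, rest) {
        if (cls !== null) alts.push(prefix + cls + anyDigits(rest));
    };

    while (p < len && from[p] === to[p]) p++;   // length of the shared prefix
    if (p === len) return from;                  // identical bounds
    if (p === len - 1) {                         // only the last digit differs
        add(from.slice(0, p), charClass(f(p), t(p)), 0);
        return alts.join('|');
    }
    // lower edge: `from` itself, then everything above it under each shorter prefix
    for (i = len - 1; i > p; i--) {
        add(from.slice(0, i), charClass(f(i) + (i === len - 1 ? 0 : 1), 35), len - 1 - i);
    }
    // middle: first differing digit strictly between the two bounds
    add(from.slice(0, p), charClass(f(p) + 1, t(p) - 1), len - 1 - p);
    // upper edge: everything below `to` under each longer prefix, then `to` itself
    for (i = p + 1; i < len; i++) {
        add(to.slice(0, i), charClass(0, t(i) - (i === len - 1 ? 0 : 1)), len - 1 - i);
    }
    return alts.join('|');
}

console.log(rangePattern('lec0s0', 'lwhvqg'));
```

For the inputs in the question this produces the same alternation as the pattern given in the answer. To use it as a rooted expression, wrap it in a group before anchoring, for example new RegExp('^(?:' + rangePattern(from, to) + ')'), so that the ^ applies to every alternative.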

Solution courtesy of: Martin Jespersen




This post first appeared on Node.js Recipes.
