Debuggex: A visual regex debugger

UnoriginalGuy · on Feb 22, 2013

This is the best RegEx web-site that I have seen. It is actually quite far ahead of everyone else.

Unfortunately if you don't "understand" RegEx it won't help much. It is more for people who already have it down.

For me I am still stuck in copy/paste land. I could never get my head around the "logic" of RegEx, it just seems completely random and arbitrary.

Plus they re-use the same characters but have multiple meanings (e.g. ^ for NOT and for START).

kgen · on Feb 22, 2013

I once wrote a website (http://regexone.com) to help people learn regular expressions using practical examples -- maybe you would like to give that a try and see if it helps you in understanding the different regexes a bit more?

zaptheimpaler · on Feb 23, 2013

I just went through all the examples, that was awesome! Best tutorial I've seen yet. With this and debuggex, I should be all set to parse HTML with regexes (kidding).

By the way, could someone explain this to me - do regexes match a string if a part of a string matches the regex, or if the whole string does?

For example, this regex "([+|-]* )(\d+[,|.]* )(\d+)(\.)?(e)?(\d+)" (intended to match decimal/scientific numbers - see http://regexone.com/example/0) matches "720p" on regexone, but fails to match on debuggex. So it seems like it varies depending on some configuration - is that right?

archangel_one · on Feb 23, 2013

That regex doesn't match 720p on regexone (and shouldn't - it has required spaces in it). You're getting a tick for the bottom item because it's not supposed to match that case.

As for your question, it generally depends how they're called - eg. in Python's re module, whether you call search() or match(). You can force a regex to match the entire string by adding ^ and $.
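A minimal sketch of the difference described above, using Python's re module. The pattern `\d+` and the sample string are chosen only for illustration; they are not the regexone exercise itself.

```python
import re

# Pattern for one or more digits; "720p" is the string from the question.
pattern = re.compile(r"\d+")
text = "720p"

# search(): succeeds if the pattern matches anywhere in the string.
found_anywhere = pattern.search(text) is not None

# match(): succeeds only if the pattern matches at the start of the string
# (the match does not have to reach the end).
found_at_start = pattern.match(text) is not None

# fullmatch(): succeeds only if the whole string matches.
found_whole = pattern.fullmatch(text) is not None

# Adding ^ and $ forces a whole-string match even when calling search().
found_anchored = re.search(r"^\d+$", text) is not None

print(found_anywhere, found_at_start, found_whole, found_anchored)
```

So whether "part of the string" counts as a match depends on which call the tool makes, and on whether the pattern is anchored.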

dancesdrunk · on Feb 23, 2013

Used your site a few months ago and have been recommending it to everyone since! Can I make a request for "best Solutions" & possibly more exercises? It's been the best resource I've found for getting the basics down, thanks!

UnoriginalGuy · on Feb 22, 2013

Holy shit, that is amazing. I'm currently working my way through it. I feel more confident already.

Seriously thank you.

heywire · on Feb 22, 2013

Thanks for the link, this looks very helpful!

sdepablos · on Feb 23, 2013

Just two comments:
* When there are cases that should match and others that should not, it would be good to differentiate them.
* In some lessons it would be good to force the full string to be matched. Otherwise it's too easy to write e.g. "file" in http://regexone.com/lesson/8? and pass to the next one.

sdepablos · on Feb 23, 2013

I was just taking a look at it as I'd like some of my non-technical staff to understand how regex works - they need it for Google Analytics filters - and it looks wonderful.

Jam0864 · on Feb 23, 2013

This is an amazing website, cheers!

tsergiu · on Feb 22, 2013

Yes, the intention was to help solve those cases where you are staring at your screen because you don't know where the match went wrong.

I'm thinking of doing some tutorials geared towards teaching students in grade school how to use them. I think a visual representation would help significantly.

anigbrowl · on Feb 22, 2013

Seconded - it's superb. Having the example button for new users of regexes or people who haven't used one in a while is an excellent addition, as are the diagrams.

I find it sort of sad that several people have responded by linking to their preferred (but clearly inferior) Regex pages, which detracts from the accomplishment of this one.

gizmo686 · on Feb 24, 2013

I understand RegEx, but am (almost) completely unable to read it. For me, this site is perfect.

The way I learned RegEx was simply spending 2 work days writing a parser with it. I think the problem is that there is a moment when RegEx suddenly makes sense, and you cannot understand how anyone can be confused by it (even when you yourself were confused just 5 minutes ago).

jacobparker · on Feb 22, 2013

Regular expression: ( a * )*

String: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab

This kills the browser.

ADDENDUM:

A good read on executing regular expressions in linear (and thus predictable) time is http://swtch.com/~rsc/regexp/regexp1.html

Many other algorithms have exponential edge cases. This can open you up to DoS attacks if you accept regular expressions from the user (e.g. a search feature).
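A rough way to observe this kind of edge case, sketched with Python's re module (a backtracking engine). The helper name `time_failed_match` is made up for this sketch, and the timings depend on the engine and machine, so no particular numbers are claimed.

```python
import re
import time

# The nested-quantifier pattern from the comment above. fullmatch() requires
# the whole string to match, so the trailing "b" forces a failure.
pattern = re.compile(r"(a*)*")

def time_failed_match(n):
    """Return (match result, seconds) for n 'a' characters followed by one 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = pattern.fullmatch(text)
    return result, time.perf_counter() - start

# Keep n small: on a backtracking engine the time may grow very quickly
# with each extra 'a', so a large n can hang the interpreter.
timings = [(n, *time_failed_match(n)) for n in range(10, 19, 2)]
for n, result, seconds in timings:
    print(f"n={n:2d} matched={result is not None} seconds={seconds:.6f}")
```

If the printed times climb steeply as n grows, the engine is exploring many equivalent ways to split the run of "a" characters before giving up.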

thomasahle · on Feb 23, 2013

No. While you are correct that a DFA is far superior for parsing this specific subset of JavaScript regex, that in no way makes it ideal for debugging purposes.

1) In the user's program the regex is not going to be run on a dfa (since we are talking about the javascript variation which has back references). It makes more sense to warn the user about bad performance, than making them believe they are safe.

2) A debugger has to be true to the input. If the user wants to debug (a*)* it doesn't help that the debugger just casually transforms it into a*. That wouldn't make the diagrams fun at all.

3) It is entirely possible that in the future, the author wants to expand the awesome tool to a larger subset of JavaScript regex. This would probably make it break out of the finite automata space.

I do however agree that it's a pity how many good regular expressions are run on stupid backtracking systems out there.

jacobparker · on Feb 23, 2013

That's silly.

Javascript doesn't mandate that this regular expression be slow. In fact, in Safari and Firefox on OSX this regular expression runs fine. Chrome OSX fails but I remember it running fine in Linux yesterday (but I might be wrong.)

1) Back references do not mean you need to have exponential edge cases for vanilla REs

2) There is no one true way to execute an RE. There are good ways and bad ways, though.

3) The Thompson algorithm does not preclude non-regular extensions.

Anyway, just wanted to add the rsc link to the discussion :)

abecedarius · on Feb 23, 2013

Worth mentioning that the Thompson algorithm can be coded more concisely: https://github.com/darius/sketchbook/blob/master/regex/nfa_s...

tsergiu · on Feb 23, 2013

Sorry for the delayed response. Spent all day yesterday responding to feedback. The reason this crashes is due to the internal javascript engine.

In order to ensure that my engine (I simulate a kind of NFA) matches what javascript's engine matches, every time I match on my engine, I also try to exact match using javascript's engine. Unfortunately, javascript's engine always uses backtracking, even when it doesn't need to. Obviously this code should have been turned off for production, and I'll fix it on the next push.

To replicate the crash on your own, try typing: 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab'.match(/^(a)$/) in your console.

jacobparker · on Feb 24, 2013

Interesting - it seems some browsers don't have this as an exponential edge case, and indeed in those yours doesn't execute exponentially (I guess I got mixed up when testing.)

Anyway, I don't consider it a "bug" or anything, just wanted to bring up the rsc paper for discussion :) Keep up the good work!

tsergiu · on Feb 23, 2013

Not sure what kind of formatting hn does, but that should be / ^ ( a * ) * $ / inside the match (remove spaces).

tsergiu · on Feb 22, 2013

Looking into it.

albemuth · on Feb 22, 2013

My personal favourite: http://rubular.com/

Surprised no one brought it up

andrewguenther · on Feb 22, 2013

As a long time Rubular user, I will be switching to Debuggex.

It just feels right to me. It explains your regex to you, which, in my mind, is a much better way to debug a regex than to supply a large set of test strings.

ulrikrasmussen · on Feb 23, 2013

Note though that the matching semantics are different: http://rubular.com/r/n4r7E7qMrQ http://www.debuggex.com/?re=%28a|ab%29%28b%3F%29&str=ab

therec · on Feb 23, 2013

Well, you didn't use the same regular expression. If you use the same one, the matching is the same. Here is the Rubular link with the same expression as Debuggex: http://rubular.com/r/rMu0nAnJai

gojomo · on Feb 22, 2013

(Obligatory relevant repost:)

My entry into this category:

http://regex.powertoy.org/

Important caveat: it makes use of a hidden Java applet -- so that it can support the somewhat larger Java regex syntax, doesn't send your data anywhere else for matching, and can hook into the string-probing to animate the process. So dig out whatever browser you use for Java applets (if you still have one) to test.

Regarding the animation, click the 'animate?' link to show the animation step/speed controls. For example, you can watch the regex that tests whether a number is prime (by failing) or composite (by succeeding) via these two animations:

49: http://regex.powertoy.org/?pat=/^1%3F%24|^%2811+%3F%29\1+%24....

47: http://regex.powertoy.org/?pat=/^1%3F%24|^%2811+%3F%29\1+%24....

I really want to get rid of the applet requirement; I might someday cross-compile the JDK7 regex support to JS so that the full syntax and animation can still be supported, without an applet.
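For readers without an applet-capable browser, here is a small sketch of the prime-testing idea in Python. The pattern is what the visible part of the two links appears to decode to (the links themselves are truncated, so treat this as an assumption), and `is_prime` is a helper name invented for the sketch.

```python
import re

# Assumed pattern: matches a string of n ones when n is 0, 1, or composite.
#   ^1?$          -> zero or one "1"
#   ^(11+?)\1+$   -> a block of two or more ones, repeated at least twice
not_prime = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n):
    """Return True if n is prime, by testing n written in unary."""
    return not_prime.match("1" * n) is None

print(is_prime(47), is_prime(49))
```

The backreference \1 is what makes this work: the engine tries each block size in turn and succeeds only if the whole string is a whole number of copies of that block.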

japaget · on Feb 22, 2013

Generally I try these sorts of things out on a small but non-trivial example. Unfortunately it failed, so while this regex debugger shows a great deal of potential, there is still a bit more work to be done. My inputs:

    Regex: TVo[12].\d.* [Aa] ..[^k]
    Test string: TVo1-0:01.0-1:01.0 A Nashville

tsergiu · on Feb 22, 2013

The current release only supports exact matches. Multiple matches are planned for a future release.

However, if you use the slider to slide to just past the "s" in Nashville, you can see that the end state does indeed light up.

Guillaume86 · on Feb 22, 2013

Looks like it's looking for a full match (^regex$).

Just use: TVo[12].\d.* [Aa] ..[^k].* and it works
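The same behaviour can be reproduced outside the tool. This is a sketch in Python, using fullmatch() as a stand-in for the tool's exact-match mode (an assumption about how it behaves, based on the comments above).

```python
import re

test_string = "TVo1-0:01.0-1:01.0 A Nashville"
original = r"TVo[12].\d.* [Aa] ..[^k]"
with_suffix = original + ".*"  # the suggested fix: allow trailing text

# search() accepts a partial match; here it stops just past the "s".
partial = re.search(original, test_string)

# fullmatch() acts roughly like an implicit ^regex$ and rejects the original...
exact_original = re.fullmatch(original, test_string)

# ...but accepts the pattern once ".*" consumes the rest of the string.
exact_fixed = re.fullmatch(with_suffix, test_string)

print(partial.group(0))
print(exact_original is None, exact_fixed is not None)
```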

DEinspanjer · on Feb 22, 2013

Very nice. I'd recommend making the text to match field a text area and doing line based matches.

When I need to haul out the big guns, I load up RegexBuddy in a Wine bottle and dump a screenfull of text into it along with the regex to figure out where I went wrong.

They have a very different way of visualizing the step by step, but both are great tools.

tsergiu · on Feb 22, 2013

The text to match field is a text area; it will auto-expand as you type into it.

However, only exact matches are supported for the first release. I wanted to get user feedback before I built any more features. I think I have an intuitive way to visualize findAll() type matches.

lilyball · on Feb 23, 2013

Neat. Here's the Daring Fireball email regex: http://www.debuggex.com/?re=%5Cb%28%28%3F%3A%5Ba-z%5D%5B%5Cw...

Sadly it doesn't seem to understand (?i).

tsergiu · on Feb 23, 2013

Sorry about the lacking support. The goal for the next release is to have full support for Javascript flavor regexes.

ajacksified · on Feb 22, 2013

This is awesome! The "random examples" is a nice touch. The visualization is great.

ulrikrasmussen · on Feb 23, 2013

This is really cool! One thing that would be really awesome would be if you added a way to switch between disambiguation strategies. At the moment, it seems like the default strategy is greedy parsing (i.e. the "Perl way"). For instance, when matching the string "ab" against (a|ab)(b?) the first group matches "a" while the second matches "b". With the POSIX strategy, the first group will match "ab" and the second will match the empty string.

I think these subtle differences lead to a lot of confusion when users are not aware that the underlying implementation is different from what they are used to.
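A quick way to see the "Perl way" in action, sketched with Python's re module and the (a|ab)(b?) pattern from the Debuggex link earlier in the thread. Python has no POSIX leftmost-longest mode in its standard library, so that side is only described in the comments.

```python
import re

# Python's re is a backtracking engine: alternatives are tried left to right,
# and the first one that leads to an overall match wins.
groups = re.fullmatch(r"(a|ab)(b?)", "ab").groups()
print(groups)  # ('a', 'b')

# Swapping the alternatives changes which group captures the "b",
# even though the set of strings matched is the same.
swapped = re.fullmatch(r"(ab|a)(b?)", "ab").groups()
print(swapped)  # ('ab', '')

# Under the POSIX strategy described above, the order of the alternatives
# would not matter: the first group would take "ab" in both cases.
```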

tsergiu · on Feb 23, 2013

There will be support for different flavors of regexes in an upcoming release.

arrakeen · on Feb 22, 2013

very nice visualization. for a similar interactive regexp debugger in perl, see Regexp::Debugger: https://metacpan.org/module/Regexp::Debugger

martin_ · on Feb 22, 2013

This is incredible - good job. What flavors of regex do you plan to support?

tsergiu · on Feb 22, 2013

Full support for Javascript will be built out first. After that, Python and PCRE. In the longer term, I plan to support .NET, Java, Ruby, and POSIX.

northisup · on Feb 22, 2013

Very cool visualization. Trying to match an e-mail with this (from django.core.validators) but it isn't working: http://www.debuggex.com/?re=%28%5E%5B-%21%23%24%25%26%27%2A%...

northisup · on Feb 22, 2013

now with ip address matching: http://www.debuggex.com/?re=%28%5E%5B-%21%23%24%25%26%27%2A%...

tsergiu · on Feb 22, 2013

I think you are missing lower case letters in your character sets.

Besides that, apologies for it stretching sideways and making you scroll. Will be fixed in a future release!

northisup · on Feb 22, 2013

In python you can add flags like 'case insensitive' when you compile the regex.

tsergiu · on Feb 22, 2013

Keep in mind that it doesn't support flags yet, which is why it wouldn't work out-of-the-box.
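For reference, this is roughly what the flag does on the Python side. The pattern below is a deliberately simplified, hypothetical one with upper-case-only character sets; it is not the Django validator from the links above.

```python
import re

# Hypothetical pattern whose character sets list only upper-case letters.
pattern_text = r"[A-Z0-9]+@[A-Z0-9]+\.[A-Z]{2,}"
address = "someone@example.com"

# Without a flag, the lower-case address is rejected.
strict = re.fullmatch(pattern_text, address)

# re.IGNORECASE at compile time lets the same sets match lower case.
relaxed = re.compile(pattern_text, re.IGNORECASE).fullmatch(address)

# The inline form (?i) at the start of the pattern has the same effect.
inline = re.fullmatch("(?i)" + pattern_text, address)

print(strict is None, relaxed is not None, inline is not None)
```

A tool without flag support sees only the pattern text, so it behaves like the strict case here.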

cromwellian · on Feb 23, 2013

Pretty cool, I like the automata diagram of the expression.

Here's another one http://ocpsoft.org/tutorials/regular-expressions/java-visual...

Done with GWT and Errai, source here: https://github.com/ocpsoft/regex-tester/tree/master/src/main...

greggman · on Feb 23, 2013

Not bad. Have you thought about supporting a much larger string area? The re editor in Slickedit lets me paste multiple lines of text and see what parts get matched by the regex which is super useful for searching and replacing code and also very useful for multi-line matches.

http://www.slickedit.com/demo/high/RegexEvaluator/RegexEvalu...

tsergiu · on Feb 23, 2013

Yup, it just didn't make it into the first release. Will definitely be in a future release.

NuZZ · on Feb 23, 2013

First off, nice work, this is legit useful stuff.

Suggestions: the regex reference could use a distinguishing feature such as a subtle light grey background and/or a line separating it, along with more whitespace.

Also, the boxes seem arbitrarily placed. I realize one is centered, and the others take up the remaining space on the next "line", but perhaps you could create better visual boundaries or something.

Lastly, apologies, but while the font Lato may look nice on your setup, it's rather jagged and unappealing on Windows.

tsergiu · on Feb 23, 2013

Thank you for the feedback.

The regex reference is only temporarily there. It'll eventually be replaced by a much better feature which is in the pipeline. I'll play around with the css to make it better.

I've played around with the positioning a bit, and it definitely needs iteration. However, an upcoming ui change will drastically change the demands on the ui, so it doesn't make sense to optimize that yet.

I'll replace the Lato font. I agree it looks terrible on windows.

Thanks for all the feedback!

ericcholis · on Feb 22, 2013

Regex is one of my weak points that I've always wanted to fix. I think this just might be the tool that accomplishes that.

tsergiu · on Feb 22, 2013

Great to hear that :) Let me know if there are any features I can add that would make them less confusing for you.

benth · on Feb 22, 2013

Nice. It looks like beginning and end of string anchors are included by default. Is there any way to turn that off?

michaelt · on Feb 23, 2013

Why not just put a .* at the start and end of your regexp?

Unless you want to do that match-across-newlines witchcraft.

tsergiu · on Feb 22, 2013

Not in this release. It's exact matches only for now.

spankalee · on Feb 23, 2013

This is really awesome, and it's immediately going into my batbelt bookmark folder.

One quick UI note: The reference table is much easier to read if the lines are left-aligned. With centering and two columns, it's hard to tell at first which descriptions the escape sequences belong to.

tsergiu · on Feb 23, 2013

Thanks for the feedback. Will fix it immediately.

fernly · on Feb 23, 2013

When a Kleene star follows a quantifier, the "n times" legend under the quantified diagram element gets cropped, e.g. http://www.debuggex.com/?re=a%7B2%7Db%2A&str=

Chrome, mac os.

tsergiu · on Feb 23, 2013

Thanks for the feedback. I'll fix this as soon as I can.

jebblue · on Feb 23, 2013

This makes regex fun, I could actually see myself relying on it more. If I'm not writing them every day, I'm not sure I'll remember a year from now what \dd does, but now I have a good site to go to to remember again. Nice site.

tsergiu · on Feb 23, 2013

Thank you. Let me know what I could do to make it even more fun :)

jes · on Feb 22, 2013

Great, now I have two problems.

tsergiu · on Feb 22, 2013

But at least you can fix them quickly ;)

speeder · on Feb 22, 2013

This is really, really, really awesome.

I think I will add this to the URLs that I know by heart.

tsergiu · on Feb 23, 2013

Thank you. Share generously :)

radicalbyte · on Feb 22, 2013

Showing the state machine is cool, but I'd recommend adding more room for test cases, ala regexpal.com.

A set of fail strings would be useful; it's something no one else does, but it's vital for a good user experience.

tsergiu · on Feb 22, 2013

Support for multiple matches is planned for a future release.

How would you recommend generating fail strings? The space of failing strings is really large. From my discussions with users, they usually have a specific failed string and they want to see why it doesn't work.

scriptproof · on Feb 23, 2013

Reminds me of http://www.xul.fr/javascript/regular-expression-tester.php (2008). That one is more extensive.

mhartl · on Feb 22, 2013

My favorite tool in this vein—and one of my favorite examples of an essentially perfect one-page web app—is Rubular (http://rubular.com/).

mukundmr · on Feb 23, 2013

I prefer http://gskinner.com/RegExr/. The UI is way more polished and helpful. It is quite accurate as well.

amenghra · on Feb 22, 2013

Very similar. Provides some "linting": http://regexp.quaxio.com/

tsergiu · on Feb 22, 2013

That is a good visualization. However, it doesn't let you debug if there is a problem with matching a string. That was the explicit goal in building Debuggex.

Linting is a planned feature for a future release.

kahoon · on Feb 22, 2013

I immediately thought about this one: http://www.regexper.com/

VaucGiaps · on Feb 22, 2013

I really like this one. Too bad it's Flash...

tsergiu · on Feb 22, 2013

It's done with SVG. Everything will work on your smartphone. (Try it!)

kahoon · on Feb 22, 2013

Flash? I think it's done with SVG.

malkia · on Feb 23, 2013

for a desktop regex "debugger" app, I've been using Edi Weitz's wonderful Regex Coach - http://www.weitz.de/regex-coach/ (Windows only version, written in LispWorks)

harpreets · on Feb 23, 2013

From hieroglyphics to the actual power on hands. Wow! Just wow.

tsergiu · on Feb 23, 2013

Thanks. Your support means a lot :)

JulianMorrison · on Feb 23, 2013

Thank you. It actually caught a mistake I made playing with it!

tsergiu · on Feb 23, 2013

Thanks for the support. May you catch many more!

jpettersson · on Feb 23, 2013

This is amazing. Thank you very much!

tsergiu · on Feb 23, 2013

Thanks for the support!

GaryGapinski · on Feb 23, 2013

Very nice. I'll be recommending this.

tsergiu · on Feb 23, 2013

Thank you for the support :)

fyolnish · on Feb 23, 2013

according to this, foo(bar|baz)? does not match foobar

tsergiu · on Feb 23, 2013

Not sure what you mean. It's working for me. Can you describe what happens in more detail please.

martinced · on Feb 23, 2013

It is way better than the other ones for a variety of reasons. A big one for me being the ability to directly share the URL of an example.

I'm sure they can do better: next, please give us the ability to use a tiny URL directly from within the domain (i.e. do not force me to lamely go to bit.ly or other nonsense).

tsergiu · on Feb 23, 2013

Thanks for the feedback. I'll keep that in mind for upcoming releases.

hacker789 · on Feb 22, 2013

This is fantastic. How on Earth are the random matching strings calculated?

tsergiu · on Feb 22, 2013

Imagine walking from the start to the end along the railroad diagram. Every time you come to a split in the road, you choose a random one. Every time you come to a character set, you choose something random inside there. That's all there is to it.
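The walk described above can be sketched in a few lines of Python. This is not Debuggex's code: the node names, the hand-built diagram, and the cap on repetitions (`max_repeat`) are all assumptions made for illustration.

```python
import random
import re

# A tiny hand-built "railroad diagram" as nested tuples:
#   ("lit", "foo")        literal text
#   ("set", "abc")        character set: pick one character
#   ("alt", [node, ...])  split in the road: pick one branch
#   ("seq", [node, ...])  follow each node in order
#   ("opt", node)         optional: take the node or skip it
#   ("star", node)        zero or more repetitions (capped for the sketch)

def random_walk(node, rng, max_repeat=3):
    """Return one random string accepted by the diagram rooted at node."""
    kind, body = node
    if kind == "lit":
        return body
    if kind == "set":
        return rng.choice(body)
    if kind == "alt":
        return random_walk(rng.choice(body), rng, max_repeat)
    if kind == "seq":
        return "".join(random_walk(child, rng, max_repeat) for child in body)
    if kind == "opt":
        return random_walk(body, rng, max_repeat) if rng.random() < 0.5 else ""
    if kind == "star":
        count = rng.randint(0, max_repeat)
        return "".join(random_walk(body, rng, max_repeat) for _ in range(count))
    raise ValueError(f"unknown node kind: {kind}")

# Diagram for foo(bar|baz)?[0-9]* written out by hand.
diagram = ("seq", [
    ("lit", "foo"),
    ("opt", ("alt", [("lit", "bar"), ("lit", "baz")])),
    ("star", ("set", "0123456789")),
])

rng = random.Random(0)
samples = [random_walk(diagram, rng) for _ in range(5)]
print(samples)
```

Every sample should match the regex the diagram was built from, since each step only ever follows a path that exists in the diagram.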

hacker789 · on Feb 22, 2013

It takes a great teacher to make something sound so simple. Thanks for making this tool!