Parsing and solving a numeric padded captcha

  • Admin

    Hi guys.

    Have you ever encountered a captcha like this?


    Despite its looks, this is not actually an image: you can highlight the numbers on the page.
    It gives each digit a different padding-left value, so the digits are displayed in an order that differs from their order in the source code.

    To solve this captcha we need to:

    • Parse the padding-left values into a list
    • Sort them numerically from the lowest to the highest
    • Use them to parse the HTML entities from the page (one by one in the correct order)
    • Join and decode the resulting HTML entity string

    Here's how the LoliScript looks:

    SET SOURCE "<td align=right><div style='width:80px;height:26px;font:bold 13px Arial;background:#ccc;text-align:left;direction:ltr;'><span style='position:absolute;padding-left:60px;padding-top:5px;'>&#55;</span><span style='position:absolute;padding-left:26px;padding-top:7px;'>&#51;</span><span style='position:absolute;padding-left:42px;padding-top:5px;'>&#52;</span><span style='position:absolute;padding-left:10px;padding-top:5px;'>&#54;</span></div></td>"
    PARSE "<SOURCE>" REGEX "padding-left:([0-9]+)px[^>]*>(&#[0-9]+;)" "[1]" Recursive=TRUE -> VAR "MATCHES" 
    UTILITY List "MATCHES" Sort Numeric=TRUE -> VAR "NUMBERS" 
    PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[0]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM1" 
    PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[1]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM2" 
    PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[2]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM3" 
    PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[3]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM4" 
    FUNCTION HTMLEntityDecode "<NUM1><NUM2><NUM3><NUM4>" -> VAR "FINAL" 

    The FINAL variable will contain the value 6347, which is the digits in the order they are displayed on the page.
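
    For anyone who wants to check the logic outside OpenBullet, the same steps can be sketched in Python. This is not LoliScript and not part of the config above, just a rough equivalent under a few assumptions: the function name solve_padded_captcha is made up for illustration, the regex is the one from the script, and the sample markup is the same SOURCE string.

```python
import html
import re

# Same sample markup as the SET SOURCE line in the script.
SOURCE = (
    "<td align=right><div style='width:80px;height:26px;font:bold 13px Arial;"
    "background:#ccc;text-align:left;direction:ltr;'>"
    "<span style='position:absolute;padding-left:60px;padding-top:5px;'>&#55;</span>"
    "<span style='position:absolute;padding-left:26px;padding-top:7px;'>&#51;</span>"
    "<span style='position:absolute;padding-left:42px;padding-top:5px;'>&#52;</span>"
    "<span style='position:absolute;padding-left:10px;padding-top:5px;'>&#54;</span>"
    "</div></td>"
)

# Same pattern as the script: group 1 is the padding-left value,
# group 2 is the HTML entity that follows it.
PAIR_RE = re.compile(r"padding-left:([0-9]+)px[^>]*>(&#[0-9]+;)")


def solve_padded_captcha(source: str) -> str:
    """Hypothetical helper: return the digits in their displayed order."""
    # Collect (padding, entity) pairs in source order.
    pairs = PAIR_RE.findall(source)
    # Sort numerically by padding; a plain string sort would put "100" before "26".
    pairs.sort(key=lambda pair: int(pair[0]))
    # Join the entities left to right and decode them.
    return html.unescape("".join(entity for _, entity in pairs))


print(solve_padded_captcha(SOURCE))  # prints 6347
```

    Because each padding value is captured together with its entity, this version does not need a second lookup per digit and does not assume the captcha has exactly four digits.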

    Until next time,

  • @Ruri I never thought of solving it with regex. Thank you. (Don't get stuck on the naming, it was from the years when I had just started.) 😛

  • Try it on sites that don't have this lol
