Parsing and solving a numeric padded captcha
-
Hi guys.
Have you ever encountered a captcha like this?
In spite of its looks, this is not actually an image; in fact, you can highlight the numbers on the page.
It gives each digit a different padding-left, so the digits are displayed in an order that is different from the one in the source code.
To solve this captcha we need to:
- Parse the padding-left values to a list
- Sort them numerically from the lowest to the highest
- Use them to parse the HTML entities from the page (one by one in the correct order)
- Join and decode the resulting HTML entity string
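For anyone who wants to check the logic outside of OpenBullet, the steps above can be sketched in Python. This is only an illustration, not part of the config: the function name solve_padded_captcha is made up, and the sample markup assumes the digits appear in the page source as numeric HTML entities (&#55; and so on), which is what the regex in this post expects. It also captures each entity together with its padding in a single pass instead of parsing the page again for every padding value.

```python
import html
import re

# Sample markup modelled on the post; digits are assumed to be numeric entities.
SOURCE = (
    "<td align=right><div style='width:80px;height:26px;font:bold 13px Arial;"
    "background:#ccc;text-align:left;direction:ltr;'>"
    "<span style='position:absolute;padding-left:60px;padding-top:5px;'>&#55;</span>"
    "<span style='position:absolute;padding-left:26px;padding-top:7px;'>&#51;</span>"
    "<span style='position:absolute;padding-left:42px;padding-top:5px;'>&#52;</span>"
    "<span style='position:absolute;padding-left:10px;padding-top:5px;'>&#54;</span>"
    "</div></td>"
)

# Group 1 is the padding-left value, group 2 is the entity inside the span.
SPAN_RE = re.compile(r"padding-left:([0-9]+)px[^>]*>(&#[0-9]+;)")


def solve_padded_captcha(source: str) -> str:
    """Hypothetical helper: return the digits in their on-screen order."""
    # Step 1: parse every padding-left value along with its entity.
    pairs = [(int(pad), entity) for pad, entity in SPAN_RE.findall(source)]
    # Step 2: sort numerically by padding, lowest (leftmost) first.
    pairs.sort(key=lambda pair: pair[0])
    # Steps 3 and 4: join the entities in that order and decode them.
    return html.unescape("".join(entity for _, entity in pairs))


print(solve_padded_captcha(SOURCE))  # 6347
```

Sorting (padding, entity) pairs keeps each digit attached to its offset, so it works for any number of digits without hard-coding one lookup per position.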
Here's how the LoliScript looks:
SET SOURCE "<td align=right><div style='width:80px;height:26px;font:bold 13px Arial;background:#ccc;text-align:left;direction:ltr;'><span style='position:absolute;padding-left:60px;padding-top:5px;'>&#55;</span><span style='position:absolute;padding-left:26px;padding-top:7px;'>&#51;</span><span style='position:absolute;padding-left:42px;padding-top:5px;'>&#52;</span><span style='position:absolute;padding-left:10px;padding-top:5px;'>&#54;</span></div></td>"
PARSE "<SOURCE>" REGEX "padding-left:([0-9]+)px[^>]*>(&#[0-9]+;)" "[1]" Recursive=TRUE -> VAR "MATCHES"
UTILITY List "MATCHES" Sort Numeric=TRUE -> VAR "NUMBERS"
PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[0]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM1"
PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[1]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM2"
PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[2]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM3"
PARSE "<SOURCE>" REGEX "padding-left:<NUMBERS[3]>px[^>]*>(&#[0-9]+;)" "[1]" -> VAR "NUM4"
FUNCTION HTMLEntityDecode "<NUM1><NUM2><NUM3><NUM4>" -> VAR "FINAL"
The FINAL variable will contain the value 6347, which is the correct order of digits we wanted.
Until next time,
Ruri
-
(https://i.postimg.cc/y8V30Dbt/gfgvb.png) @Ruri I never thought of solving it with regex. Thank you. (Don't get hung up on the naming; it's from the years when I was just starting out.)
-
Try it on sites that don't have this lol https://prnt.sc/ud5off