How to capture data from different pages from one source?

  • Hello everyone!

    I was wondering if it is possible to capture data from the different URLs contained in one source?

    For example:


    So that is the request URL; I get the source, which contains many other URLs like:

    and so on.

    I would like to enter the first URL, capture data, come back, then go to the second URL, capture data, then go to the third URL, and keep capturing until the last URL.

    It should be done via LoliScript. I also checked one of your guides but couldn't get any results.

    Can you please help?


  • Admin

    1. Parse URLs to a list variable
    2. Read this
    3. In the loop place a request block, a parse block, and a utility block that adds the newly parsed list to a master list you created outside the loop (I think it's called List > Zip)
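
    The three steps above can be sketched roughly as follows. This is Python rather than LoliScript, only to show the control flow; `fetch`, `FAKE_SITE`, and the example.com URLs are hypothetical stand-ins so the snippet runs without a network, and the `<title>` parse is just a placeholder for whatever you actually want to capture.

```python
import re

# Hypothetical stand-in for the pages; a real config would send HTTP requests instead.
FAKE_SITE = {
    "https://example.com/index": (
        '<a href="https://example.com/p1">1</a> '
        '<a href="https://example.com/p2">2</a>'
    ),
    "https://example.com/p1": "<title>First page</title>",
    "https://example.com/p2": "<title>Second page</title>",
}


def fetch(url):
    """Pretend request block: return the source for a URL."""
    return FAKE_SITE[url]


def crawl(start_url):
    # Step 1: parse the URLs from the first source into a list.
    urls = re.findall(r'href="([^"]+)"', fetch(start_url))
    # The master list is created outside the loop.
    master = []
    # Step 3: request each URL, parse its source, add the results to the master list.
    for url in urls:
        source = fetch(url)
        master.extend(re.findall(r"<title>(.*?)</title>", source))
    return master


titles = crawl("https://example.com/index")
print(titles)
```

    The point is only the shape: one list of URLs, one loop, and a master list that keeps growing across iterations.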

  • Will try and update you, thank you.

  • OK, I built the config up to the point where all the URLs are parsed. Now I need to make OB visit them one by one, so I created a request. But what should I put into it?

    <--- Executing Block UTILITY --->
    Executed action Length on file LIST
    SET command executed on field VAR
    <--- Executing Block FUNCTION --->
    Executed function Compute on input 0+1 with outcome 1
    Parsed variable | Name: INDEX | Value: 1
    Jumping to line 19
    <--- Executing Block FUNCTION --->
    Executed function Compute on input 1+1 with outcome 2
    Parsed variable | Name: INDEX | Value: 2
    Jumping to line 19
    <--- Executing Block FUNCTION --->
    Executed function Compute on input 2+1 with outcome 3
    Parsed variable | Name: INDEX | Value: 3
    Jumping to line 19
    <--- Executing Block FUNCTION --->
    Executed function Compute on input 3+1 with outcome 4
    Parsed variable | Name: INDEX | Value: 4
    Jumping to line 19
    Jumping to line 26
    WARNING: The test input data did not respect the validity regex for the selected wordlist type!

    I have modified the parsed URLs.

  • Admin

    In the request put as address <LIST[<INDEX>]>

  • Yeah, I tried that already and got:

    <--- Executing Block REQUEST --->
    Calling URL: <LIST[4]>
    Sent Headers:
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36
    Pragma: no-cache
    Accept: */*
    Content-Type: application/x-www-form-urlencoded
    Sent Cookies:
    Invalid URI: The hostname could not be parsed.
    ERROR: Invalid URI: The hostname could not be parsed.

  • Admin

    Maybe because your URL list is not called LIST but some other name? xD

  • Admin

    Also make sure you're on OB 1.2.2, because older versions didn't support this syntax.

  • I'm using 1.2.2 for that.



  • Works fine for me (ignore the first two lines; they only build an example list):


  • How stupid, I just needed to move the request before the function tab. It worked.
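
    A small sketch of why the order can matter, assuming the list is 0-indexed and has four entries (both are assumptions; the log above only shows INDEX reaching 4). If the index is incremented before the request uses it, the first entry is skipped and the last lookup falls past the end of the list. The function names and the placeholder entries are made up for illustration.

```python
urls = ["u0", "u1", "u2", "u3"]  # placeholder entries, not real URLs


def request_then_increment(items):
    """Use the current index for the request, then increment it."""
    visited = []
    index = 0
    while index < len(items):
        visited.append(items[index])
        index += 1
    return visited


def increment_then_request(items):
    """Increment first, then use the index: skips item 0 and overruns at the end."""
    visited = []
    index = 0
    while index < len(items):
        index += 1
        # None marks a lookup past the end of the list.
        visited.append(items[index] if index < len(items) else None)
    return visited


print(request_then_increment(urls))
print(increment_then_request(urls))
```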

  • Thanks a lot for helping me out, problem is solved.

    Kind regards.

  • Coming back to the parsing.

    If my request URL is:

    and the source URLs are in this format:

    <span class=" subject_new" id="tid_901"><a href="Thread-test1">347 - 2020</a></span></span>
    <span class=" subject_new" id="tid_902"><a href="Thread-test2">348 - 2020</a></span></span>
    <span class=" subject_new" id="tid_903"><a href="Thread-test3">349 - 2020</a></span></span>
    <span class=" subject_new" id="tid_904"><a href="Thread-test4">350 - 2020</a></span></span>

    How can I parse the URLs to add them to the list and capture data from them?

    Using [class*=subject_new] as the CSS selector captures the whole element and not just the URL.

  • You can simply use a regex afterwards to capture the desired elements.
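
    As a rough illustration of the regex approach, here is a Python sketch that pulls only the href values out of the `subject_new` spans and turns them into absolute URLs. The request URL is not shown in the thread, so `BASE_URL` is a hypothetical placeholder; the pattern is one possible choice, not the only one.

```python
import re
from urllib.parse import urljoin

# Hypothetical base; the real request URL is not given in the thread.
BASE_URL = "https://example.com/forum/"

source = """
<span class=" subject_new" id="tid_901"><a href="Thread-test1">347 - 2020</a></span></span>
<span class=" subject_new" id="tid_902"><a href="Thread-test2">348 - 2020</a></span></span>
"""

# Capture only the href value of the anchor that directly follows a subject_new span.
pattern = re.compile(r'class="\s*subject_new"[^>]*><a href="([^"]+)"')

# Relative links are resolved against the base URL.
links = [urljoin(BASE_URL, href) for href in pattern.findall(source)]
print(links)
```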

  • OK, that's done. Could someone please help me parse only the usernames:passwords from this?

    <meta name="description" content="[email protected]:xxxxxx83" />
    <link rel="canonical" href="" />
    <meta name="description" content="[email protected]:xxxxxx83 [email protected]:xxxxxx831 [email protected]:xxxxxx835 [email protected]:xxxxxx836" />
    <link rel="canonical" href="https://xxxx/xxxx-xxxx-xxxxx--xxxxxx" />
    <meta name="description" content="[email protected]:xxxxxx83 | SUP = testttttttttttting |" />
    <link rel="canonical" href="" />

    By using

    (?<="description" content=")(.*)(?=")

    I can capture the usernames:passwords, but on the last line it captures everything. So, how do I avoid that?

  • @kbilly what about 'span[a="href"]' ?

  • @L4roy please re-read my last post. I want to parse this; the second-to-last post has been solved.

  • This worked for me; it will only parse user:pass in all lines:

    meta name="description" content="(.[^\s]*)

    I just noticed that it doesn't capture everything in the second line. But you can still take the results you got so far, split them, and then use "removevalues" to remove the lines that don't contain "@" or ":".
