This forum is in read-only mode. The new forum is live at https://discourse.openbullet.dev and registrations are open!

Parsing nested HTML


  • Admin

    Imagine you have a highly nested element with no easy attribute or parent handle you can reference, like this one:

    <table>
      <tbody>
        <tr>
          <td>
            <a href="products.php?productID=67432">Product Name</a>
          </td>
        </tr>
      </tbody>
    </table>
    

    In this case you would struggle both with a normal CSS Selector and with LR, since the number 67432 usually changes.
    You could do it with regex, but if the markup spans multiple lines, the pattern is going to be very unpleasant to build.

    There is actually a very simple and clean trick for this. You can use the *= operator which, unlike =, doesn't check whether the attribute value matches exactly, but whether it CONTAINS a given substring.

    So we can use the regular syntax for the CSS Selector search via attribute value and replace the = with *= thus obtaining, in this case:

    Selector: [href*=productID]
    Output Attribute: innerHTML

    Which literally means "search for elements where the href attribute CONTAINS the text productID".
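
    For anyone who wants to see what the *= match does outside of OpenBullet, here is a minimal Python sketch using only the standard library. It is not how OpenBullet implements it. The names AttrContainsParser and select_contains are made up for this example, and it collects only the text inside the matching element, which equals innerHTML here because the link has no child tags.

```python
from html.parser import HTMLParser

HTML = """
<table>
  <tbody>
    <tr>
      <td>
        <a href="products.php?productID=67432">Product Name</a>
      </td>
    </tr>
  </tbody>
</table>
"""


class AttrContainsParser(HTMLParser):
    """Hypothetical helper: collects the text of every element whose
    attribute value contains a substring, like [attribute*=substring]."""

    def __init__(self, attribute, substring):
        super().__init__()
        self.attribute = attribute
        self.substring = substring
        self.matches = []    # text of each matching element, in document order
        self._tag = None     # tag name of the element currently being captured
        self._depth = 0      # nesting depth of that tag name inside the match
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            value = dict(attrs).get(self.attribute)
            # Substring test: this is the "contains" behaviour of *=
            if value is not None and self.substring in value:
                self._tag = tag
                self._depth = 1
                self._buffer = []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.matches.append("".join(self._buffer).strip())
                self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)


def select_contains(html, attribute, substring):
    parser = AttrContainsParser(attribute, substring)
    parser.feed(html)
    parser.close()
    return parser.matches


print(select_contains(HTML, "href", "productID"))  # ['Product Name']
```

    The match does not depend on the number after productID=, so it keeps working when the ID changes.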

    You can read more about the CSS Selector and its operators here https://www.w3schools.com/cssref/css_selectors.asp



  • I remember you have a video tut demonstrating something like this, do you still have those vids?


  • Admin

    No sorry I don't



  • I don't know why people struggle with CSS selectors. I love [attribute*=contains]; it is probably my most used selector, and combined with indexing I don't think there is a value I haven't been able to parse/select. You can also use [attribute^=The-Start-Of-A-Value] to select elements whose attribute starts with that value, which would achieve the same thing here.
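
    A rough Python sketch of the ^= idea combined with indexing, standard library only. The helper names AttrCollector and select_starts_with are invented for illustration, and so are the sample links, including the second product ID. It returns the attribute values themselves rather than innerHTML.

```python
from html.parser import HTMLParser

# Made-up sample page with several links; only two are product links.
HTML = """
<a href="index.php">Home</a>
<a href="products.php?productID=67432">First product</a>
<a href="products.php?productID=67433">Second product</a>
"""


class AttrCollector(HTMLParser):
    """Hypothetical helper: records every value of one attribute, in document order."""

    def __init__(self, attribute):
        super().__init__()
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == self.attribute and value is not None:
                self.values.append(value)


def select_starts_with(html, attribute, prefix):
    """Rough equivalent of [attribute^=prefix]: keep values that start with prefix."""
    collector = AttrCollector(attribute)
    collector.feed(html)
    collector.close()
    return [value for value in collector.values if value.startswith(prefix)]


hits = select_starts_with(HTML, "href", "products.php")
print(hits[1])  # products.php?productID=67433
```

    Indexing picks one result when several elements match, here the second product link.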



  • @Ruri In this case, how can I parse "Product Name"?


  • Admin

    @-solo- Exactly as I wrote in the example


  • Banned

    I have a problem getting the reCAPTCHA token. It comes from a different request (the Google request, not the login request) and it is hidden, so it does not appear in the page source, as shown in the screenshot.


  • Admin

    The reCAPTCHA token is always the same for a given website, so you can just copy it once and hardcode it. There's no need to parse it.


  • Banned

    @Ruri The login POST looks like recaptcha;user;pass;token. Every time I use LR or CSS I get nothing, and the site replies with a bad request saying the reCAPTCHA is missing. How can I use JSON or REGEX instead?

