Parsing nested HTML


  • Admin

    Imagine you have a highly nested element with no easy attribute or parent handle you can reference, like this one:

    <table>
      <tbody>
        <tr>
          <td>
            <a href="products.php?productID=67432">Product Name</a>
          </td>
        </tr>
      </tbody>
    </table>
    

    In this case you would struggle both with a normal CSS Selector and with LR, since the number 67432 usually changes from product to product.
    You could do it with regex, but if the markup spans multiple lines the pattern is going to be very unpleasant to build.

    There is actually a very simple and clean trick for this. You can use the *= operator which, unlike =, doesn't require the attribute value to match exactly, but only checks whether the value CONTAINS the given text.

    So we can take the regular CSS Selector syntax for searching by attribute value and replace the = with *=, which in this case gives:

    Selector: [href*=productID]
    Output Attribute: innerHTML

    Which literally means "search for elements whose href attribute CONTAINS the text productID".
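To make the "contains" behaviour concrete, here is a minimal sketch in Python that emulates what [href*=productID] with Output Attribute innerHTML would return for the snippet above. It uses only the standard library html.parser module, not a real CSS engine, and it only looks at <a> elements to keep things short. The names HrefContains and select_href_contains are made up for this illustration.

```python
from html.parser import HTMLParser

# The nested markup from the post, with the tags closed.
HTML = """
<table>
  <tbody>
    <tr>
      <td>
        <a href="products.php?productID=67432">Product Name</a>
      </td>
    </tr>
  </tbody>
</table>
"""


class HrefContains(HTMLParser):
    """Collects the text of <a> elements whose href contains a substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.capturing = False
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # A missing or valueless href is treated as an empty string.
        href = dict(attrs).get("href") or ""
        if tag == "a" and self.needle in href:
            self.capturing = True
            self.matches.append("")

    def handle_data(self, data):
        # Only text inside a matching <a> is recorded.
        if self.capturing:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self.capturing = False


def select_href_contains(html, needle):
    """Rough stand-in for the selector [href*=needle], limited to <a> tags."""
    parser = HrefContains(needle)
    parser.feed(html)
    parser.close()
    return parser.matches


print(select_href_contains(HTML, "productID"))  # ['Product Name']
```

The match does not depend on the number after productID=, which is why the selector keeps working when the ID changes.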

    You can read more about CSS Selectors and their operators here: https://www.w3schools.com/cssref/css_selectors.asp



  • I remember you had a video tutorial demonstrating something like this. Do you still have those videos?


  • Admin

    No, sorry, I don't.



  • I don't know why people struggle with CSS Selectors. I love [attribute*=contains]; it is probably my most used selector. Combine it with indexing and I don't think there is a value I haven't been able to parse/select. You can also use [attribute^=The-Start-Of-A-Value] to select elements whose attribute starts with that value. For the example above, a prefix such as [href^="products.php"] would achieve the same thing.
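For anyone unsure how the operators mentioned in this thread differ, here is a small Python sketch of their matching rules applied to the href from the first post. The function attribute_matches is a hypothetical helper written for this illustration, not part of any library, and it only covers =, *= and ^=.

```python
def attribute_matches(value, operator, operand):
    """Emulates the CSS attribute operators =, *= and ^= on plain strings."""
    if operator == "=":
        # Exact match of the whole attribute value.
        return value == operand
    if operator == "*=":
        # Substring match; in CSS an empty operand matches nothing.
        return operand != "" and operand in value
    if operator == "^=":
        # Prefix match; in CSS an empty operand matches nothing.
        return operand != "" and value.startswith(operand)
    raise ValueError(f"unsupported operator: {operator}")


href = "products.php?productID=67432"

print(attribute_matches(href, "=", "productID"))      # False
print(attribute_matches(href, "*=", "productID"))     # True
print(attribute_matches(href, "^=", "products.php"))  # True
print(attribute_matches(href, "^=", "productID"))     # False
```

Note that ^= only helps when the stable part of the value comes first, which is why the prefix here has to be products.php and not productID.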

