How can I parse complex HTML source



  • How can I parse the complex HTML source below?

    Sub (Single) = <table id="userRegTable" class="eacTable">            
                <thead>
                <tr>
                    <th>Product Name</th>
                    <th>Status</th>
                    <th>Licence</th>
                    <th>Registration</th>
                </tr>
                </thead>
                <!-- <tfoot/> -->
                <tbody>            
                
                   
                   
                   
                    
                        
                    
                    <tr >
                    	
                    	
                        <td >
                        	
                        		
                        			<strong> Advanced 1</strong>
                    			
                    			
                    		
                        </td>
                        <td><span class="hideForLarge hideForMedium"><strong>Status:</strong></span>Active</td>
                        
                        <td><span class="hideForLarge hideForMedium"><strong>Licence:</strong></span>Access from May 15, 2020 9:54:03 PM UTC to May 15, 2021 9:54:03 PM UTC</td>
                        <td>
                            
                                
                                &nbsp;
                            
                        </td>
                    </tr>
                
                   
                   
                   
                    
                        
                    
                    <tr >
                    	
                    	
                        <td >
                        	
                        		
                        			<strong>Complete 2</strong>
                    			
                    			
                    		
                        </td>
                        <td><span class="hideForLarge hideForMedium"><strong>Status:</strong></span>Active</td>
                        
                        <td><span class="hideForLarge hideForMedium"><strong>Licence:</strong></span>Access from May 15, 2020 1:24:10 PM UTC to May 15, 2021 1:24:10 PM UTC</td>
                        <td>
                            
                                
                                &nbsp;
                            
                        </td>
                    </tr>
                
                </tbody>
             </table>
    
    

    Can I make the output look like this?
    Advanced 1 - Active - May 15, 2021
    Complete 2 - Active - May 15, 2021

    I don't know how many Advanced and Complete rows there will be, so I just want to capture all of them.
    Please help me.
    Thank you all.


  • Admin

    Use a CSS selector.
    For example, #userRegTable > tbody > tr > td with the attribute innerText will give you what you are looking for.
    You can either tick the recursive option to get every match, or choose the index of the one you want to parse.
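
As a rough illustration of what that selector walks over, here is a standard-library Python sketch (not how the tool itself works). It collects the text of every td in every tbody row of the table with id userRegTable, then formats each row the way the question asks. The names TableRowParser, summarize and SAMPLE are made up for this example, SAMPLE is a condensed copy of the table above, and pulling the end date out with a regex assumes the licence text always reads "... to <Month> <day>, <year> ...".

```python
import re
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td>, row by row, inside <tbody> of one table."""

    def __init__(self, table_id):
        super().__init__(convert_charrefs=True)
        self.table_id = table_id
        self.in_table = False
        self.in_tbody = False
        self.in_td = False
        self.rows = []      # finished rows, each a list of cell texts
        self._row = None    # cells of the row currently being read
        self._cell = []     # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tbody":
            self.in_tbody = True
        elif self.in_tbody and tag == "tr":
            self._row = []
        elif self._row is not None and tag == "td":
            self.in_td = True
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self.in_td:
            # Join nested text (<strong>, <span>) and collapse whitespace,
            # including the non-breaking space produced by &nbsp;.
            self._row.append(" ".join("".join(self._cell).split()))
            self.in_td = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "tbody":
            self.in_tbody = False
        elif tag == "table":
            self.in_table = False

    def handle_data(self, data):
        if self.in_td:
            self._cell.append(data)


def summarize(html, table_id="userRegTable"):
    """Return one 'name - status - end date' line per table row."""
    parser = TableRowParser(table_id)
    parser.feed(html)
    lines = []
    for row in parser.rows:
        if len(row) < 3:
            continue  # skip rows that do not have name, status and licence
        name, status, licence = row[0], row[1], row[2]
        status = status.split(":", 1)[-1].strip()  # drop the "Status:" label
        match = re.search(r"\bto (\w+ \d{1,2}, \d{4})", licence)
        end = match.group(1) if match else licence  # fall back to raw text
        lines.append(f"{name} - {status} - {end}")
    return lines


# Condensed copy of the table from the question (assumption: same structure).
SAMPLE = """
<table id="userRegTable" class="eacTable">
  <thead><tr><th>Product Name</th><th>Status</th><th>Licence</th><th>Registration</th></tr></thead>
  <tbody>
    <tr >
      <td >
        <strong> Advanced 1</strong>
      </td>
      <td><span class="hideForLarge hideForMedium"><strong>Status:</strong></span>Active</td>
      <td><span class="hideForLarge hideForMedium"><strong>Licence:</strong></span>Access from May 15, 2020 9:54:03 PM UTC to May 15, 2021 9:54:03 PM UTC</td>
      <td>&nbsp;</td>
    </tr>
    <tr >
      <td >
        <strong>Complete 2</strong>
      </td>
      <td><span class="hideForLarge hideForMedium"><strong>Status:</strong></span>Active</td>
      <td><span class="hideForLarge hideForMedium"><strong>Licence:</strong></span>Access from May 15, 2020 1:24:10 PM UTC to May 15, 2021 1:24:10 PM UTC</td>
      <td>&nbsp;</td>
    </tr>
  </tbody>
</table>
"""

if __name__ == "__main__":
    for line in summarize(SAMPLE):
        print(line)
```

Because the loop runs over every row the parser finds, it does not matter how many Advanced or Complete rows the page contains.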



  • [image]

    I tried #userRegTable > tbody > tr > td, but it outputs nothing, bro.
    Thanks.


  • Admin

    I don't see the # symbol in your selector; you copied it wrong.



  • [image]

    Here bro


  • Admin

    Weird. Is Sub1 the big yellow thing or the "could not parse any data" one?



  • @Ruri said in How can I parse complex HTML source:

    Weird. Is Sub1 the big yellow thing or the "could not parse any data" one?

    Sub1 is the result I get when I parse the big yellow thing, bro: "could not parse any data".


  • Admin

    Maybe it needs a valid HTML page to process the CSS selector, not just a part of the page. Try it on the full page.
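
If only the fragment is at hand, one thing that might be worth trying is wrapping it in a minimal page skeleton before handing it to the selector. This is just a sketch of that idea in Python: wrap_fragment is a made-up helper name, and whether the parser here really needs a full document is only a guess, as said above.

```python
def wrap_fragment(fragment, title="fragment"):
    """Wrap an HTML fragment in a minimal, complete HTML document.

    Hypothetical helper: some selector engines may behave better with a
    full page (doctype, html, head, body) than with a bare <table>.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '<head><meta charset="utf-8"><title>{}</title></head>\n'
        "<body>\n"
        "{}\n"
        "</body>\n"
        "</html>\n"
    ).format(title, fragment)


if __name__ == "__main__":
    # Shortened stand-in for the table from the question.
    table = '<table id="userRegTable"><tbody><tr><td>Active</td></tr></tbody></table>'
    print(wrap_fragment(table))
```

Parsing the original full page source, as suggested, is still the simpler first thing to try.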



  • Thank you, bro. I will try it.

