Quantcast
Channel: Active questions tagged html - Stack Overflow
Viewing all articles
Browse latest Browse all 67411

How can I convert the html code with irregular columns into a nested json file?

$
0
0

I have the following HTML code. I want to convert the HTML code below:

<div class="company_data__list">

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">ABC Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Capital</div><div class="company_data__data">230000</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">103</div></div>

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">XYZ Company<br/>Subtitle</div> 
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">10</div></div>

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">CAT Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Capital</div><div class="company_data__data">430000</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">10233</div></div>
 <div class="company_data__row"><div class="company_data__head">URL</div><div class="company_data__data">www.abc.com</div></div>

</div>



into a Json file which looks like this:

{ id: '1',
  data:{
    name: 'ABC CAT Company',
    capital: '230000',
    total:'103'
  },
  id:'2',
  data: {
    name: 'XYZ CAT Company',
    total:'10'
  },
  id:'3',
  data: {
    name: 'CAT Company',
    capital: '430000',
    total:'10',
    url:'www.abc.com'
  },


}

I'm using python3, bs4, re (Regular Expression) I'm having trouble with matching the html head rows with data rows since it does not have a specific #id to differentiate with.


Viewing all articles
Browse latest Browse all 67411

Trending Articles