Channel: Active questions tagged html - Stack Overflow

How do I fetch particular items from HTML code using bs4? [on hold]


I want to convert the following HTML:

<div class="company_data__list">

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">ABC Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Capital</div><div class="company_data__data">230000</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">103</div></div>

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">XYZ Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">10</div></div>

 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">CAT Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Capital</div><div class="company_data__data">430000</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">10233</div></div>
 <div class="company_data__row"><div class="company_data__head">URL</div><div class="company_data__data">www.abc.com</div></div>

</div>



into a JSON file that looks like this:

[
  { "id": "1",
    "data": {
      "name": "ABC Company",
      "capital": "230000",
      "total": "103"
    }
  },
  { "id": "2",
    "data": {
      "name": "XYZ Company",
      "total": "10"
    }
  },
  { "id": "3",
    "data": {
      "name": "CAT Company",
      "capital": "430000",
      "total": "10233",
      "url": "www.abc.com"
    }
  }
]

I'm using Python 3, bs4 (BeautifulSoup), and re (regular expressions).
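
For illustration, here is a minimal sketch of one way this conversion might be approached with bs4 alone. It rests on several assumptions that the question does not confirm: every "Name" row starts a new company, only the text before the `<br/>` is wanted as the name, keys are the lowercased head text, and the output is a list of records (a single object cannot hold repeated `id` and `data` keys). The function name `rows_to_records` and the trimmed `HTML` sample are hypothetical.

```python
import json
from bs4 import BeautifulSoup

# Trimmed sample of the markup from the question (hypothetical test input).
HTML = """
<div class="company_data__list">
 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">ABC Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Capital</div><div class="company_data__data">230000</div></div>
 <div class="company_data__row"><div class="company_data__head">Name</div><div class="company_data__data">XYZ Company<br/>Subtitle</div></div>
 <div class="company_data__row"><div class="company_data__head">Total</div><div class="company_data__data">10</div></div>
</div>
"""


def rows_to_records(html):
    """Group head/data pairs into records, starting a new one at each 'Name'."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for head in soup.find_all("div", class_="company_data__head"):
        # The value cell is assumed to be the next sibling div of the head cell.
        value = head.find_next_sibling("div", class_="company_data__data")
        if value is None:
            continue
        key = head.get_text(strip=True).lower()
        # Take only the first text fragment, which drops the subtitle after <br/>.
        text = next(value.stripped_strings, "")
        # Assumption: a "Name" row marks the beginning of a new company.
        if key == "name" or not records:
            records.append({"id": str(len(records) + 1), "data": {}})
        records[-1]["data"][key] = text
    return records


records = rows_to_records(HTML)
print(json.dumps(records, indent=2))
```

Walking over the head cells rather than the row wrappers keeps the pairing local to each head and its sibling, so the sketch does not depend on how the row `div`s are nested. To write a file instead of printing, `json.dump(records, f, indent=2)` on an open file handle should work; `re` is not needed for this approach.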

