Extracting element text and inserting a space

Developer https://www.devze.com 2023-03-15 08:12 Source: Internet
I'm parsing HTML using BeautifulSoup in Python.

I don't know how to insert a space between pieces of text when extracting the text of an element.

This is the code:

import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')
print soup.text

The output is:

thisisexample

But I want a space inserted between the pieces, like this:

this is example

How do I insert a space?


Use getText instead:

import BeautifulSoup
soup=BeautifulSoup.BeautifulSoup('<html>this<b>is</b>example</html>')

print soup.getText(separator=u' ')
# u'this is example'


If your version of BeautifulSoup does not have getText, you could do this instead:

In [26]: ' '.join(soup.findAll(text=True))
Out[26]: u'this is example'
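
For the newer bs4 package, a rough equivalent of the join above might look like the following. This is only a sketch: it assumes bs4 (BeautifulSoup 4) is installed and uses the built-in "html.parser"; the variable names joined and stripped are just for illustration.

```python
from bs4 import BeautifulSoup  # assumes the bs4 package (BeautifulSoup 4) is installed

soup = BeautifulSoup("<html>this<b>is</b>example</html>", "html.parser")

# Collect every text node and join them with a single space
# (find_all(string=True) is the bs4 spelling of findAll(text=True)).
joined = " ".join(soup.find_all(string=True))

# stripped_strings yields each text node with surrounding whitespace removed,
# which helps when the markup itself contains stray spaces.
stripped = " ".join(soup.stripped_strings)

print(joined)
print(stripped)
```

Both lines should print the text with the pieces separated by spaces; stripped_strings is probably the safer choice when the HTML has extra whitespace inside tags.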


You may also want to pass the strip argument:

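from bs4 import BeautifulSoup

# Name the parser explicitly so bs4 does not have to guess one
bs = BeautifulSoup("<html>this<b>is  </b>example</html>", "html.parser")
print(bs.get_text())  # thisis  example
print(bs.get_text(separator=" "))  # this is   example
print(bs.get_text(separator=" ", strip=True))  # this is example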