程式扎記: [Python FAQ] BeautifulSoup - How to remove all tags from an element?

Tuesday, August 8, 2017

Source From Here
Question
How can I simply strip all tags from an element I find in BeautifulSoup?

How-To
Use get_text(). It returns all the text in a document or beneath a tag as a single Unicode string:

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

# Parse the document with Python's built-in HTML parser
soup = BeautifulSoup(html_doc, 'html.parser')

# get_text() drops every tag and returns only the text content
print(soup.get_text())

Execution output:

The Dormouse's story

The Dormouse's story
Once upon a time there were three little sisters; and their names were
Elsie,
Lacie and
Tillie;
and they lived at the bottom of a well.
...
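
The question asks about a single element rather than the whole document, so here is a minimal sketch of the same call on one tag. The `snippet` markup and the variable names `intro`, `joined`, and `spaced` are made up for illustration and are not from the original post. The optional separator argument and `strip=True` are part of `get_text()` and can help when the raw text carries leftover newlines and indentation.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; not part of the original post.
snippet = """
<div id="intro">
  <h1>Title</h1>
  <p>First <b>bold</b> line.</p>
</div>
"""

soup = BeautifulSoup(snippet, "html.parser")

# Find one element, then strip the tags beneath it only.
intro = soup.find("div", id="intro")

# Join the text pieces with a separator and trim whitespace around each piece.
joined = intro.get_text("|", strip=True)

# Alternative: take the raw text and collapse every run of whitespace to one space.
spaced = " ".join(intro.get_text().split())

print(joined)  # Title|First|bold|line.
print(spaced)  # Title First bold line.
```

Note that `strip=True` trims each text piece before joining, so inline tags such as `<b>` also produce separator breaks. When the text should read as a normal sentence, collapsing whitespace on the raw `get_text()` result is usually the safer choice.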

This message was edited 3 times. Last update was at 09/08/2017 09:36:14
