How to use BeautifulSoup to save a link's HTML to a file, and do the same for every link in that HTML file

2023-09-16


I figured it out. Here is the code for recursive URL parsing with Beautiful Soup:

import requests
import urllib2
from bs4 import BeautifulSoup

link_set = set()
give_url = raw_input("Enter url:\t")

def magic(give_url, link_set, count):
    # print "______________________________________________________"
    #
    # print "Count is: " + str(count)
    # count += 1
    # print "THE URL IT IS SCRAPPING IS:" + give_url
    page = urllib2.urlopen(give_url)
    page_content = page.read()
    with open('page_content.html', 'w') as fid:
        fid.write(page_content)
    response = requests.get(give_url)
    html_data = response.text
    soup = BeautifulSoup(html_data)
    list_items = soup.find_all('a')
    for each_item in list_items:
        html_link = each_item.get('href')
        if(html_link is None):
            pass
        else:
            if(not (html_link.startswith('http') or html_link.startswith('https'))):
                link_set.add(give_url + html_link)
            else:
                link_set.add(html_link)
    # print "Total links in the given url are: " + str(len(link_set))

magic(give_url,link_set,0)

link_set2 = set()
link_set3 = set()

for element in link_set:
    link_set2.add(element)

count = 1
for element in link_set:
    magic(element,link_set3,count)
    count += 1
    for each_item in link_set3:
        link_set2.add(each_item)
    link_set3.clear()

count = 1
print "Total links scraped are: " + str(len(link_set2))
for element in link_set2:
    count +=1
    print "Element number " + str(count) + "processing"
    print element
    print "\n"

The code has many errors, so please tell me where I can improve it.
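A few things stand out in the posted code: each page is downloaded twice (once with urllib2, once with requests), every page is written to the same page_content.html so only the last one survives, relative links are built by plain string concatenation, and BeautifulSoup is called without naming a parser. One possible way to address these is sketched below. It is only a sketch under stated assumptions, not a definitive rewrite: it targets Python 3 rather than the Python 2 of the original, and the names extract_links, fetch_html, crawl, out_dir and max_depth are hypothetical, as are the hash-based file names and the standard-library fallback used when bs4 is not installed.

```python
import hashlib
import os
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

try:
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
except ImportError:  # assumption: fall back to the stdlib so the sketch still runs
    BeautifulSoup = None


class _HrefCollector(HTMLParser):
    """Stdlib fallback: collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def extract_links(html, base_url):
    """Return the set of absolute http(s) links found in html."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly
        hrefs = [a["href"] for a in soup.find_all("a", href=True)]
    else:
        collector = _HrefCollector()
        collector.feed(html)
        hrefs = collector.hrefs
    links = set()
    for href in hrefs:
        # urljoin resolves relative paths against the page URL;
        # urldefrag drops "#fragment" so the same page is not queued twice.
        absolute = urldefrag(urljoin(base_url, href)).url
        if urlparse(absolute).scheme in ("http", "https"):
            links.add(absolute)
    return links


def fetch_html(url):
    """Download a page once and return it as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, out_dir="pages", max_depth=1, fetch=fetch_html):
    """Save start_url and every page within max_depth link hops of it.

    Returns a dict mapping each saved URL to the path of its file.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = {}
    visited = set()
    frontier = {start_url}
    for _ in range(max_depth + 1):
        next_frontier = set()
        for url in sorted(frontier - visited):
            visited.add(url)
            try:
                html = fetch(url)
            except Exception as exc:  # a dead link should not stop the crawl
                print("skipping", url, "->", exc)
                continue
            # One file per URL, named by a hash so pages never overwrite each other.
            name = hashlib.sha1(url.encode("utf-8")).hexdigest()[:16] + ".html"
            path = os.path.join(out_dir, name)
            with open(path, "w", encoding="utf-8") as fid:
                fid.write(html)
            saved[url] = path
            next_frontier |= extract_links(html, url)
        frontier = next_frontier
    return saved
```

Calling crawl(start_url, max_depth=1) roughly mirrors what the original script does: it saves the starting page plus every page it links to. The depth limit and the visited set keep the crawl bounded, and passing a different fetch function makes it possible to try the logic without touching the network.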


Original link: https://hbdhgg.com/1/67551.html
