Web Scraping: Scraping a Novel (All-You-Can-Eat Prison Food), To Be Continued
  
      
One tip a day for landing in jail
Detailed steps for scraping a novel
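import requests
from lxml import etree

# Book ID on readnovel.com. The "#Catalog" fragment only matters in the
# browser; URL fragments are not sent to the server.
a = "12739599504227601#Catalog"
url = "https://www.readnovel.com/book/" + a

# Send a browser-like User-Agent with every request.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.72 Safari/537.36"
}

# Fetch the catalog page and collect the link of every chapter.
paqv = requests.get(url, headers=headers).content.decode()
sdq1 = etree.HTML(paqv)
hrefs = sdq1.xpath('//div[@class = "volume"]/ul/li/a/@href')

for href in hrefs:
    # Prepending the scheme assumes the hrefs are protocol-relative ("//...").
    it2 = 'https:' + href
    resp = requests.get(it2, headers=headers).content.decode()
    sdq2 = etree.HTML(resp)
    # Each <p> under this div holds one paragraph of the chapter text.
    bcsj = sdq2.xpath('//div[@class = "ywskythunderfont"]/p/text()')

    # Strip all whitespace from each paragraph and append it to 1.txt.
    with open('1.txt', 'a', encoding='UTF-8') as f:
        for neirong in bcsj:
            s = ''.join(neirong.split())
            f.write(s + '\n')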
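Two small steps in the script are easy to try without touching the network: turning a chapter href into a full URL, and stripping whitespace from a paragraph. Here is a minimal sketch of both using only the standard library. The names `CATALOG_URL`, `chapter_url`, and `clean_line` are made up for illustration, and the sample href is hypothetical, not taken from the real site. `urljoin` is swapped in for the `'https:' + href` concatenation as one possible alternative, since it also copes with absolute and root-relative links.

```python
from urllib.parse import urljoin

# Assumed base: the catalog URL from the script, without the "#Catalog" fragment.
CATALOG_URL = "https://www.readnovel.com/book/12739599504227601"


def chapter_url(href: str, base: str = CATALOG_URL) -> str:
    """Resolve a chapter href against the catalog URL.

    Works for protocol-relative ("//host/path"), root-relative ("/path"),
    and already-absolute hrefs.
    """
    return urljoin(base, href)


def clean_line(text: str) -> str:
    """Remove every whitespace character, same as ''.join(text.split())."""
    return "".join(text.split())


if __name__ == "__main__":
    # Hypothetical href, shaped like a protocol-relative link.
    print(chapter_url("//www.readnovel.com/chapter/1/2"))
    # Full-width spaces (U+3000) count as whitespace for str.split().
    print(clean_line("\u3000\u3000第一章  开始\n"))
```

Note that `clean_line` also removes spaces inside a paragraph. That is harmless for Chinese text, but it would glue English words together.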
 
 
      
  
  
  
        
     
    
 
Author: 我叫史迪奇
  
Source: https://sdq3.link/reptile-novel.html
Blog content is licensed under Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).