css选择器

本章节是继上一小节的知识点,只是本人把它拆分了,如果你对css比较熟悉的话,可以直接使用这一章节的选择器

一、关于select选择器的使用

  • 1、直接获取元素节点

    print(soup.select('a'))
    
  • 2、根据类名查找,比如要查找class=sister的标签

    print(soup.select('.sister'))
    
  • 3、根据id查找

    print(soup.select("#link1"))
    
  • 4、多条件查找

    print(soup.select("p #link1")) # 查找p标签且是带id="link1"
    
  • 5、查找子节点

    print(soup.select("head > title"))
    
  • 6、通过属性查找

    print(soup.select('a[href="xx"]'))
    

二、获取内容

注意使用select选择的节点返回的都是list

soup = BeautifulSoup(html_doc, 'lxml')

    position = []

    trs = soup.select('tr')
    for tr in trs:
        tds = tr.select('td')
        post = {}
        title = tds[0].select('a')[0].get_text()
        type = tds[1].get_text()
        num = tds[2].get_text()
        city = tds[3].get_text()
        public_time = tds[4].get_text()

        post['title'] = title
        post['type'] = type
        post['num'] = num
        post['city'] = city
        post['public_time'] = public_time

        position.append(post)

    print(position)

results matching ""

    No results matching ""