html – Python: extracting only the first href link for every nth occurrence in the for loop

Exception or error:

I am trying some simple web scraping with Python, but I have a problem fetching the links: each item has two or three <a> tags with an href under the same class btn, as mentioned below, whereas I need only the first one to be printed for every new occurrence in the loop.

from bs4 import BeautifulSoup
import requests

url = ""

# Getting the webpage, creating a Response object.
response = requests.get(url)

# Extracting the source code of the page.
data = response.text

# Passing the source code to BeautifulSoup to create a BeautifulSoup object for it.
soup = BeautifulSoup(data, 'lxml')

# Extracting all the <a> tags into a list.
tags = soup.find_all('a', class_='btn')

# Extracting URLs from the attribute href in the <a> tags.
for tag in tags:
    print(tag['href'])

Output from the above code:

Desired output:

How to solve:

BeautifulSoup has excellent CSS selector support; just use that to pick every odd item:

soup = BeautifulSoup(data, 'lxml')
for tag in soup.select('a.btn:nth-of-type(odd)'):
    print(tag['href'])

>>> for tag in soup.select('a.btn:nth-of-type(odd)'): print(tag['href'])
... etc
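
As a self-contained illustration of the odd-item selector, here is a minimal sketch that runs against inline markup instead of the real page. The HTML, the href values, and the name odd_links are made up for the example, and the built-in 'html.parser' is used in place of 'lxml' only to avoid the extra dependency; the selector itself is the one from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two <a class="btn"> links per book (not the real page).
SAMPLE_HTML = """
<div class="book">
  <a class="btn" href="/book/1/read">Read</a>
  <a class="btn" href="/book/1/download">Download</a>
</div>
<div class="book">
  <a class="btn" href="/book/2/read">Read</a>
  <a class="btn" href="/book/2/download">Download</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# :nth-of-type(odd) matches the 1st, 3rd, 5th, ... <a> among its siblings,
# so with two links per parent only the first link of each pair is kept.
odd_links = [tag["href"] for tag in soup.select("a.btn:nth-of-type(odd)")]
print(odd_links)  # ['/book/1/read', '/book/2/read']
```

Note that the position is counted among sibling <a> elements within each parent, so this only yields "the first link per item" when every item has exactly two links.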

You also have a parent <div class="book"> element per group of links, which you could make use of:

for tag in soup.select('.book a.btn:first-of-type'):
    print(tag['href'])

which would work for any number of links per book.
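
To see why the per-book selector copes with a varying number of links, here is a small sketch under assumed markup. The HTML, the href values, and the names first_links and odd_links are hypothetical, and 'html.parser' stands in for 'lxml'. The second book has three links, which is where the odd-item selector starts returning an extra match.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second book has three links instead of two.
SAMPLE_HTML = """
<div class="book">
  <a class="btn" href="/book/1/read">Read</a>
  <a class="btn" href="/book/1/download">Download</a>
</div>
<div class="book">
  <a class="btn" href="/book/2/read">Read</a>
  <a class="btn" href="/book/2/download">Download</a>
  <a class="btn" href="/book/2/preview">Preview</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# First <a> inside each .book element, regardless of how many follow it.
first_links = [tag["href"] for tag in soup.select(".book a.btn:first-of-type")]
print(first_links)  # ['/book/1/read', '/book/2/read']

# For comparison: the odd-item selector also picks the third link of book 2.
odd_links = [tag["href"] for tag in soup.select("a.btn:nth-of-type(odd)")]
print(odd_links)  # ['/book/1/read', '/book/2/read', '/book/2/preview']
```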
