The question-
The file is a table of names and comment counts. You can ignore most of the data in the file except for lines like the following:
<tr><td>Modu</td><td><span class="comments">90</span></td></tr>
<tr><td>Kenzie</td><td><span class="comments">88</span></td></tr>
<tr><td>Hubert</td><td><span class="comments">87</span></td></tr>
You are to find all the <span> tags in the file and pull out the numbers from the tag and sum the numbers.
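As a minimal sketch of the task, the snippet below sums the three sample rows shown above without any network access. It uses Python 3's standard-library html.parser instead of BeautifulSoup so that it runs without extra installs. The class name SpanNumberCollector and the SAMPLE_ROWS string are illustrative names, not part of the assignment.

```python
from html.parser import HTMLParser

# The three sample rows quoted in the assignment text.
SAMPLE_ROWS = (
    '<tr><td>Modu</td><td><span class="comments">90</span></td></tr>'
    '<tr><td>Kenzie</td><td><span class="comments">88</span></td></tr>'
    '<tr><td>Hubert</td><td><span class="comments">87</span></td></tr>'
)


class SpanNumberCollector(HTMLParser):
    """Hypothetical helper: collects the integer text of every <span>."""

    def __init__(self):
        super().__init__()
        self.in_span = False
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False

    def handle_data(self, data):
        # Only text that sits inside a <span> is converted to an integer.
        if self.in_span and data.strip():
            self.numbers.append(int(data.strip()))


collector = SpanNumberCollector()
collector.feed(SAMPLE_ROWS)
collector.close()
print("Count", len(collector.numbers))
print("Sum", sum(collector.numbers))
```

For these three rows the count is 3 and the sum is 90 + 88 + 87 = 265.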
Look at the sample code provided. It shows how to find all of a certain kind of tag, loop through the tags and extract the various aspects of the tags.
...
# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    # Look at the parts of a tag
    print 'TAG:',tag
    print 'URL:',tag.get('href', None)
    print 'Contents:',tag.contents[0]
    print 'Attrs:',tag.attrs
You need to adjust this code to look for span tags, pull out the text content of each span tag, convert it to an integer, and add the values up to complete the assignment.
Sample Execution
$ python solution.py
Enter - http://python-data.dr-chuck.net/comments_42.html
Count 50
Sum 2...
My submission
# Python 2 with BeautifulSoup 3
import urllib
from BeautifulSoup import *

url = raw_input('Enter - ')
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)

# Retrieve all of the span tags
tags = soup('span')
total = 0
for tag in tags:
    # The first child of each span is its text; convert it to an integer
    total += int(tag.contents[0])

print 'Count = ', len(tags)
print 'Sum = ', total
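The loop above assumes every span holds exactly one numeric string: an empty span makes tag.contents[0] raise IndexError, and non-numeric text makes int() raise ValueError. The course files appear to be clean, so this is optional, but a more tolerant variant might look like the sketch below. The function name sum_numeric and its list-of-strings input are hypothetical; in the real script the strings would come from the contents of each span tag.

```python
def sum_numeric(texts):
    """Hypothetical helper: return (count, total) over strings that parse as integers."""
    count = 0
    total = 0
    for text in texts:
        try:
            value = int(text.strip())
        except ValueError:
            # Skip empty or non-numeric span text instead of crashing.
            continue
        count += 1
        total += value
    return count, total


count, total = sum_numeric(["90", " 88 ", "", "n/a", "87"])
print("Count", count)
print("Sum", total)
```

Note that this counts only the values that parsed, whereas len(tags) counts every span found.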
import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

url = input('Enter - ')
ht = urllib.request.urlopen(url).read()
total = 0
soup = BeautifulSoup(ht, "html.parser")
tags = soup('span')
for tag in tags:
    total += int(tag.contents[0])
print('Count = ', len(tags))
print('Sum = ', total)