HTML scraping using Python: top box office list from the IMDb website
I want to print the top box office list from the above site (every movie's rank, title, weekend, gross, and weeks, in that order).
Example output:
rank: 1
title: Godzilla
weekend: $93.2m
gross: $93.2m
weeks: 1
rank: 2
title: Neighbours
This is a simple way to extract those entities with BeautifulSoup:
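from bs4 import BeautifulSoup
import urllib2  # Python 2 only; on Python 3 use urllib.request instead

url = "http://www.imdb.com/chart/?ref_=nv_ch_cht_2"
data = urllib2.urlopen(url).read()
page = BeautifulSoup(data, 'html.parser')

# Each movie is a table row with class "odd" or "even".
rows = page.findAll("tr", {'class': ['odd', 'even']})
for tr in rows:
    # Print the title, weekend/gross and weeks cells in page order.
    for td in tr.findAll("td", {'class': ['titleColumn', 'weeksColumn', 'ratingColumn']}):
        print td.get_text()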
P.S. Arrange the output however you like.
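For the labelled layout asked for in the question, here is a rough Python 3 sketch of the arranging step. It is only an illustration under some assumptions: it uses the standard library's html.parser instead of BeautifulSoup so it runs without extra installs, it parses a small made-up SAMPLE_HTML string rather than the live page, and it assumes each row holds title, weekend, gross and weeks cells in that order, with the rank taken from the row position. The names BoxOfficeParser and format_rows are hypothetical, and the real IMDb markup may differ.

```python
from html.parser import HTMLParser

# Made-up sample markup; class names follow the snippet above, values in the
# second row are placeholders, and the real page may be laid out differently.
SAMPLE_HTML = """
<table>
  <tr class="odd">
    <td class="titleColumn">Godzilla</td>
    <td class="ratingColumn">$93.2m</td>
    <td class="ratingColumn">$93.2m</td>
    <td class="weeksColumn">1</td>
  </tr>
  <tr class="even">
    <td class="titleColumn">Placeholder Movie</td>
    <td class="ratingColumn">$1.0m</td>
    <td class="ratingColumn">$2.0m</td>
    <td class="weeksColumn">2</td>
  </tr>
</table>
"""


class BoxOfficeParser(HTMLParser):
    """Collects the text of the wanted <td> cells, one list per movie row."""

    WANTED = {"titleColumn", "ratingColumn", "weeksColumn"}

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell texts
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tr" and ("odd" in classes or "even" in classes):
            self._row = []
        elif tag == "td" and self._row is not None and self.WANTED.intersection(classes):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the pieces and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def format_rows(rows):
    """Turn [title, weekend, gross, weeks] rows into labelled output lines."""
    # Keep only rows that match the assumed four-cell layout.
    valid = [cells for cells in rows if len(cells) == 4]
    lines = []
    for rank, (title, weekend, gross, weeks) in enumerate(valid, start=1):
        lines += [
            "rank: %d" % rank,
            "title: %s" % title,
            "weekend: %s" % weekend,
            "gross: %s" % gross,
            "weeks: %s" % weeks,
        ]
    return lines


parser = BoxOfficeParser()
parser.feed(SAMPLE_HTML)
print("\n".join(format_rows(parser.rows)))
```

To try it on the real page, feed the parser the decoded HTML you downloaded instead of SAMPLE_HTML, and adjust the class names or cell order if the page turns out to be laid out differently.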