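Last edited by ywqzj on 2017-9-4 16:39
I am following the tutorial "Python 实战(5):拿来主义" using Python 2. I fetch only a single record, and the output looks garbled. Is this because Python 2 does not support it, so that I have to switch to Python 3?
Here is the code: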
import urllib
import json
import time
import web

movie_ids = [u'1292052']  # fetch only this one
db = web.database(dbn='sqlite', db='MovieSite.db')

def add_movie(data):
    # Parse the JSON response and store one movie row in the database
    movie = json.loads(data)
    print movie['title']
    db.insert('movie',
        id = int(movie['id']),
        title = movie['title'],
        origin = movie['original_title'],
        url = movie['alt'],
        rating = movie['rating']['average'],
        image = movie['images']['large'],
        directors = ','.join([d['name'] for d in movie['directors']]),
        casts = ','.join([c['name'] for c in movie['casts']]),
        year = movie['year'],
        genres = ','.join(movie['genres']),
        countries = ','.join(movie['countries']),
        summary = movie['summary']
    )

count = 0
for mid in movie_ids:
    print count, mid
    response = urllib.urlopen('http://api.douban.com/v2/movie/subject/%s' % mid)
    data = response.read()
    add_movie(data)
    count += 1
    time.sleep(3)  # pause between requests
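
The post does not show the actual output, so the cause can only be guessed. One possibility is that nothing is corrupted at all: `json.loads` returns Unicode strings, and what looks garbled may be either `\uXXXX` escape sequences or a console that cannot display the characters. The sketch below assumes the response body is UTF-8 encoded JSON. `parse_movie` and the sample payload are hypothetical, made up for illustration, and are not a real API response. It is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import json


def parse_movie(raw):
    """Decode raw response bytes (assumed to be UTF-8) and parse the JSON."""
    text = raw.decode('utf-8')
    return json.loads(text)


# Hypothetical sample payload: the title is two Chinese characters,
# given here as raw UTF-8 bytes the way response.read() would return them.
sample = b'{"id": "1", "title": "\xe7\x94\xb5\xe5\xbd\xb1"}'
movie = parse_movie(sample)

# The title is a 2-character Unicode string, not corrupted data.
print(len(movie['title']))

# json.dumps escapes non-ASCII characters by default, so this line shows
# \uXXXX sequences; they describe the same characters in escaped form.
print(json.dumps(movie['title']))
```

If the parsed title has the expected length and the escaped form matches the expected characters, the data itself is fine on Python 2, and the remaining suspect would be the console encoding rather than the interpreter version.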