白鹿大尺度娇喘床戏片段,男人日女人视频网站,色婷婷色综合,国产黃色AAA片,超碰www,亚洲无在线观看,国产精品露脸自拍,B想要叉叉

這次以豆瓣電影TOP250網(wǎng)為例編寫一個爬蟲程序，并將爬取到的數(shù)據(jù)（排名、電影名和電影海報網(wǎng)址）存入MySQL數(shù)據(jù)庫中。

在執(zhí)行程序前，先在MySQL中創(chuàng)建一個數(shù)據(jù)庫"pachong"。

import pymysqlimport requestsimport re  #獲取資源并下載def resp(listURL):    #連接數(shù)據(jù)庫    conn = pymysql.connect(        host = '127.0.0.1',        port = 3306,        user = 'root',        password = '******',  #數(shù)據(jù)庫密碼請根據(jù)自身實際密碼輸入        database = 'pachong',         charset = 'utf8'    )     #創(chuàng)建數(shù)據(jù)庫游標(biāo)    cursor = conn.cursor()     #創(chuàng)建列表t_movieTOP250（執(zhí)行sql語句）    cursor.execute('create table t_movieTOP250(id INT PRIMARY KEY                                               auto_increment NOT NULL ,movieName VARCHAR(20) NOT NULL                                     ,pictrue_address VARCHAR(100))')     try:        # 爬取數(shù)據(jù)        for urlPath in listURL:            # 獲取網(wǎng)頁源代碼            response = requests.get(urlPath)            html = response.text             # 正則表達(dá)式            namePat = r'alt="(.*?)" src='            imgPat = r'src="(.*?)" class='             # 匹配正則（排名【用數(shù)據(jù)庫中id代替，自動生成及排序】、電影名、電影海報（圖片地址））            res2 = re.compile(namePat)            res3 = re.compile(imgPat)            textList2 = res2.findall(html)            textList3 = res3.findall(html)             # 遍歷列表中元素,并將數(shù)據(jù)存入數(shù)據(jù)庫            for i in range(len(textList3)):                cursor.execute('insert into t_movieTOP250(movieName,pictrue_address)                                    VALUES("%s","%s")' % (textList2[i],textList3[i]))         #從游標(biāo)中獲取結(jié)果        cursor.fetchall()         #提交結(jié)果        conn.commit()        print("結(jié)果已提交")     except Exception as e:        #數(shù)據(jù)回滾        conn.rollback()        print("數(shù)據(jù)已回滾")     #關(guān)閉數(shù)據(jù)庫    conn.close() #top250所有網(wǎng)頁網(wǎng)址def page(url):    urlList = []    for i in range(10):        num = str(25*i)        pagePat = r'?start=' + num + '&filter='        urL = url+pagePat        urlList.append(urL)    return urlList  if __name__ == '__main__':    url = r"https://movie.douban.com/top250"    listURL = page(url)    resp(listURL)

結(jié)果如下圖：

以上就是我的分享，如果有什么不足之處請指出，多交流，謝謝！

搜索下方加老師微信

老師微信號：XTUOL1988【切記備注：學(xué)習(xí)Python】

領(lǐng)取Python web開發(fā)，Python爬蟲，Python數(shù)據(jù)分析，人工智能等精品學(xué)習(xí)課程。帶你從零基礎(chǔ)系統(tǒng)性的學(xué)好Python！

*聲明：本文于網(wǎng)絡(luò)整理，版權(quán)歸原作者所有，如來源信息有誤或侵犯權(quán)益，請聯(lián)系我們刪除或授權(quán)

python爬取豆瓣電影TOP250數(shù)據(jù)