作者：奧辰

博客：https://www.cnblogs.com/chenhuabin

最近需要將一批數(shù)據(jù)從csv文件中遷移到Oracle數(shù)據(jù)庫中，打算用Python來實(shí)現(xiàn)，趁著這個(gè)機(jī)會(huì)，寫一篇博客學(xué)習(xí)總結(jié)一些如何使用Python來操作Oracle數(shù)據(jù)庫。

1 安裝與導(dǎo)入

Python操作Oracle數(shù)據(jù)庫多用cx_Oracle這個(gè)第三方擴(kuò)展，總體而言，cx_Oracle的使用方式與Python操作MySQL數(shù)據(jù)庫的pymysql庫還是很相似的，如果還沒有安裝，可以通過下面的命令進(jìn)行安裝：
$ pip install -i https://pypi.douban.com/simple cx_oracle
使用前導(dǎo)入：

import?cx_Oracle

千萬注意，包名稱cx_Oracle中，字母“O”是大寫的，寫成小寫將會(huì)導(dǎo)入失敗。
這里再附帶多說一點(diǎn)，我在安裝好cx_Oracle第一次使用時(shí)，出現(xiàn)這個(gè)異常：DatabaseError: DPI-1047,可以按照官方的思路解決：https://oracle.github.io/odpi/doc/installation.html#linux

2 創(chuàng)建連接

cx_Oracle提供了兩種方式連接Oracle數(shù)據(jù)庫，分別是創(chuàng)建獨(dú)立的單一連接以及創(chuàng)建連接池。

2.1 單一連接

創(chuàng)建單一連接主要是通過cx_Oracle模塊中提供的connect()方法實(shí)現(xiàn)，雖然也可以直接通過Connection()類實(shí)現(xiàn)，但是不推薦。connect()方法參數(shù)有很多，說說其中最常用的四個(gè)：

user：用戶名
password：密碼
dsn：數(shù)據(jù)庫地址和服務(wù)名
encoding：編碼，合適的編碼可以避免出現(xiàn)亂碼
這里要重點(diǎn)說一下dsn，dsn是data source name的縮寫，用于描述數(shù)據(jù)源的詳細(xì)地址，一般由數(shù)據(jù)庫所在主機(jī)地址、端口和服務(wù)名組成。在默認(rèn)情況下，Oracle數(shù)據(jù)庫對(duì)應(yīng)1521端口，在這種情況下，dsn中可以省略端口：

connection?=?cx_Oracle.connect("username",?"password",?"192.168.1.2/helowin",?encoding="UTF-8")

其中，username是用戶名，password是密碼，192.168.1.2是數(shù)據(jù)庫所在主機(jī)IP，helowin是服務(wù)名。
在一般情況下，可以這么寫：

connection?=?cx_Oracle.connect("username",?"password",?"192.168.1.2:1521/helowin",?encoding="UTF-8")

有時(shí)候，我們需要以管理員身份登錄數(shù)據(jù)庫，這時(shí)候，直接連接時(shí)不行的，將會(huì)跑出異常：DatabaseError: ORA-28009: connection as SYS should be as SYSDBA or SYSOPER，這時(shí)候可以傳遞參數(shù)mode=cx_Oracle.SYSDBA。

connection?=?cx_Oracle.connect("sys",?"psdpassword",?"192.168.1.2:1521/helowin",mode=cx_Oracle.SYSDBA,
?????????encoding="UTF-8")

當(dāng)確定不在使用連接時(shí)，可以使用connection.close()關(guān)閉連接（這是個(gè)好習(xí)慣）。

connection.close()

2.2 連接池

cx_Oracle中提供SessionPool()創(chuàng)建連接池，連接池一般是在應(yīng)用程序初始化時(shí)創(chuàng)建。相比通過connect()方法創(chuàng)建單個(gè)數(shù)據(jù)庫連接，使用SessionPool()創(chuàng)建連接池時(shí)，需要額外指定最少連接數(shù)（min)和最大連接數(shù)(max)，連接池創(chuàng)建時(shí)會(huì)創(chuàng)建有min個(gè)數(shù)據(jù)庫連接，當(dāng)連接不夠用時(shí)會(huì)繼續(xù)新增連接，當(dāng)連接未被使用時(shí)連接池將會(huì)自動(dòng)減少連接的數(shù)量。在創(chuàng)建好連接池后，通過調(diào)用acquire()方法可以獲取一個(gè)數(shù)據(jù)庫連接，連接使用完畢之后，最好使用SessionPool.release(connection)或Connection.close()將連接放回連接池。

#?創(chuàng)建連接池
pool?=?cx_Oracle.SessionPool("username",?"password",
????????"192.168.1.2:1521/helowin",?min=2,?max=5,?increment=1,?encoding="UTF-8")

#?從連接池中獲取一個(gè)連接
connection?=?pool.acquire()

#?使用連接進(jìn)行查詢
cursor?=?connection.cursor()
for?result?in?cursor.execute("select?*?from?scott.students"):
????print(result)

#?將連接放回連接池
pool.release(connection)

#?關(guān)閉連接池
pool.close()

(1, '張三', 20)
(2, '李四', 30)

如果是在多線程下同時(shí)使用連接，那么在創(chuàng)建連接池時(shí)應(yīng)該傳遞一個(gè)threaded參數(shù)，并將值設(shè)置為True：

#?創(chuàng)建連接池
pool?=?cx_Oracle.SessionPool("username",?"password",
????????"192.168.1.2:1521/helowin",?min=2,?max=5,?increment=1,?threaded=True,?encoding="UTF-8")

pool.close()

3 游標(biāo)

有了數(shù)據(jù)庫連接之后，可以通過連接來獲取游標(biāo)：

cur?=?connection.cursor()

通過游標(biāo)，可以執(zhí)行SQL語句，實(shí)現(xiàn)與數(shù)據(jù)庫的交互，但是記住，游標(biāo)使用完之后記得關(guān)閉：

cur.close()

游標(biāo)對(duì)象中定義有Cursor.excute()方法和Cursor.executemany()兩個(gè)方法用于執(zhí)行SQL語句，前者一次只能執(zhí)行一條SQL語句，后者一次可執(zhí)行多條SQL。當(dāng)有類似的大量SQL語句需要執(zhí)行時(shí)，使用Cursor.executemany()而不是多次執(zhí)行Cursor.excute()可以極大提升性能。
另外，所有cx_Oracle執(zhí)行的語句都含有分號(hào)“;”或斜杠“/”：

connection?=?cx_Oracle.connect("username",?"password",?"192.168.1.2/helowin",?encoding="UTF-8")
cur?=?connection.cursor()

cur.execute("select?*?from?SCOTT.STUDENTS;")??#?含有分號(hào)，拋出異常

---------------------------------------------------------------------------

DatabaseError                             Traceback (most recent call last)

 in 
----> 1 cur.execute("select * from SCOTT.STUDENTS;")  # 含有分號(hào)，拋出異常


DatabaseError: ORA-00911: invalid character

4 執(zhí)行SQL

4.1 SQL語句拼接

（1）使用Python原生占位符拼接
在很多應(yīng)用場景中，我們查詢所用的SQL語句并不是固定的，而是根據(jù)當(dāng)時(shí)環(huán)境靈活的對(duì)SQL進(jìn)行拼接。最簡單的方式就是直接使用Python原生提供的占位符進(jìn)行拼接，不過要注意如果變量是字符串時(shí)，引號(hào)不能少。

cur.execute("insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?({student_id},?'{student_name}',?{student_age})".format(
????student_id=4,
????student_name='李六',
????student_age=15
))
connection.commit()

student_id?=?4
result?=?cur.execute("select?*?from?SCOTT.STUDENTS?where?id={}".format(student_id))

result.fetchone()

(4, '李六', 15)

student_name?=?"張三"
result?=?cur.execute("select?*?from?SCOTT.STUDENTS?where?name='{}'".format(student_name))

result.fetchone()

(1, '張三', 20)

(2)通過變量名拼接
使用這種拼接方式時(shí)，字符串中的名稱與真實(shí)變量名必須一一對(duì)應(yīng)。
所有變量可以統(tǒng)一存儲(chǔ)在一個(gè)字典中：

student?=?{'student_id':5,?'student_name':'陳七',?'student_age':?25}??#?將所有變量存儲(chǔ)到一個(gè)字典中
cur.execute('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:student_id,?:student_name,?:student_age)',student)
connection.commit()

也可以逐一賦值：

cur.execute('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:student_id,?:student_name,?:student_age)',
????????????student_id=6,student_name='毛八',student_age=60)
connection.commit()

(3)通過參數(shù)位置拼接
通過參數(shù)位置進(jìn)行拼接時(shí)，所有變量可以統(tǒng)一存儲(chǔ)在一個(gè)list中，list中的變量的順序必須與字符串中定義的順序保持一致。

cur.execute('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:student_id,?:student_name,?:student_age)',
????????????[7,'魏九',30])
connection.commit()

這時(shí)候，在字符串中也可以不顯式的出現(xiàn)參數(shù)名，而是以數(shù)字來代替出現(xiàn)位置：

cur.execute('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:1,?:2,?:3)',
????????????[8,'吳十',90])
connection.commit()

4.2 執(zhí)行語句

cx_Oracle的游標(biāo)中定義了execute()和executemany()兩個(gè)方法用于執(zhí)行SQL語句，區(qū)別在于execute()一次只能執(zhí)行一條SQL，而executemany()一次能執(zhí)行多條SQL。在大量結(jié)構(gòu)一樣，參數(shù)不同的語句需要執(zhí)行時(shí)，使用executemany()而不是多次調(diào)用execute()執(zhí)行可以大大提高代碼性能。
（1）execute()
對(duì)于execute()方法，其實(shí)在上面代碼實(shí)例中以及多次使用，大致形式如下：

cur.execute('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:1,?:2,?:3)',
????????????[9,'蕭十一',32])
connection.commit()

（2）executemany()

students?=?[
????[10,'蕭十一',32],
????[11,'何十二',40],
????[12,'穆十三',35]
]
cur.executemany('insert?into?SCOTT.STUDENTS?(id,?name,?age)?values?(:1,?:2,?:3)',
????????????students)
connection.commit()

cx_Oracle執(zhí)行SQL時(shí)需要注意，若是執(zhí)行查詢，可通過游標(biāo)獲取查詢結(jié)果，具體如何獲取請(qǐng)繼續(xù)看下文；若是執(zhí)行insert或update操作，需要在執(zhí)行后繼續(xù)一步connection.commit()操作。

5 獲取查詢結(jié)果

當(dāng)使用游標(biāo)進(jìn)行查詢后，可以直接迭代取出查詢結(jié)果

result?=?cur.execute("select?*?from?SCOTT.STUDENTS")?

for?row?in?result:
????print(row)

(1, '張三', 20)
(2, '李四', 30)
(3, '王五', 40)

注意，這里的游標(biāo)查詢結(jié)果對(duì)象result雖然不是生成器，但是可以當(dāng)做生成器來用，每一次使用next()方法時(shí)，可以獲取一條記錄。當(dāng)然，也與生成器一樣，查詢結(jié)果只能迭代遍歷一次，再次使用迭代不會(huì)有任何輸出：

result?=?cur.execute("select?*?from?SCOTT.STUDENTS")?

next(result)

(1, '張三', 20)

next(result)

(2, '李四', 30)

next(result)

(3, '王五', 40)

for?row?in?result:??#?沒有任何輸出結(jié)果
????print(row)

其實(shí)，通過循環(huán)來獲取查詢結(jié)果時(shí)，每一次調(diào)用next()方法，result對(duì)象都會(huì)對(duì)數(shù)據(jù)庫發(fā)起一次請(qǐng)求，獲取一條查詢記錄，如果查詢記錄數(shù)量比較大時(shí)，性能會(huì)比較低，這時(shí)候，可以通過設(shè)置cur.arraysize參數(shù)改善性能。cur.arraysize參數(shù)配置的是每次請(qǐng)求獲取的數(shù)據(jù)包大小，默認(rèn)為100，當(dāng)設(shè)置為更大值時(shí)，一次請(qǐng)求就可以獲取更多的記錄，減少客戶端與數(shù)據(jù)庫服務(wù)器端網(wǎng)絡(luò)往返次數(shù)，從而提高性能，當(dāng)然缺點(diǎn)就是消耗的內(nèi)存會(huì)更大。

cur.arraysize?=?500
for?row?in?cur.execute("select?*?from?SCOTT.STUDENTS"):
????print(row)

(1, '張三', 20)
(2, '李四', 30)
(3, '王五', 40)

除了在循環(huán)中直接遍歷外，還可以通過fetchone()、fetchmany()、fetchall()三個(gè)方法取出查詢結(jié)果。

fetchone()
fetchone()每次只取出一條記錄，功能效果與直接對(duì)result使用next()方法一樣。

cur?=?connection.cursor()
result?=?cur.execute("select?*?from?SCOTT.STUDENTS")

result.fetchone()

(1, '張三', 20)

result.fetchone()

(2, '李四', 30)

result.fetchone()

(3, '王五', 40)

fetchmany()
fetchmany()可以一次取出指定數(shù)量的記錄，如果不指定數(shù)量，表示一次性去除所有記錄。

cur?=?connection.cursor()
result?=?cur.execute("select?*?from?SCOTT.STUDENTS")

result.fetchmany(2)

[(1, '張三', 20), (2, '李四', 30)]

result.fetchmany(2)

[(3, '王五', 40)]

result.fetchmany(2)??#?若所有記錄都取出來了，返回空列表

[]

fetchall()
fetchall()一次性去除所有結(jié)果。

cur?=?connection.cursor()
result?=?cur.execute("select?*?from?SCOTT.STUDENTS")

result.fetchall()

[(1, '張三', 20), (2, '李四', 30), (3, '王五', 40)]

python爬蟲人工智能大數(shù)據(jù)公眾號(hào)

Python操作Oracle數(shù)據(jù)庫：cx_Oracle