
How to Draw Attractive Analysis Charts with matplotlib

2021-02-05 20:44


Today's tip

Feature tip: how to draw attractive analysis charts with matplotlib

Index

• Dataset import
• Line chart
• Pie chart
• Scatter plot
• Area chart
• Histogram
• Bar chart

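Earlier tips touched on plotting with matplotlib, but those charts were fairly crude and hardly presentable. A good analyst still needs a few tricks for making charts look good, so that they are something to be proud of when shown to others. Today's tip walks through how to draw a range of common charts.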

Dataset import

First, load the datasets. We use the same datasets as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity.

# Import some commonly used packages
import pandas as pd
import numpy as np
import seaborn as sns

# Jupyter/IPython magic: render figures inline (remove this line in a plain .py script)
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
plt.style.use('fivethirtyeight')

# Fix for displaying Chinese characters (Mac)
from matplotlib.font_manager import FontProperties

# List the plt styles available on this machine
print(plt.style.available)
# Pick one of the available styles; ggplot is known to look good, so it is the one chosen here
mpl.style.use(['ggplot'])

# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']

# Load the datasets

# Dataset 1: Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')

# Dataset 2: GlobalLandTemperaturesByCity
climate = pd.read_csv('./data/GlobalLandTemperaturesByCity.csv')
# Drop rows with missing values
climate.dropna(axis=0, inplace=True)
# Convert dt to a datetime and extract the year (note the use of map)
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)
# Keep only China; .copy() avoids a SettingWithCopyWarning when the new column is added
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x/100 + 1))
climate.head()
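The `Century` column above is computed as `int(x/100 + 1)`. As a small sketch of what that formula yields, the snippet below compares it with the conventional calendar definition, in which the years 1901 to 2000 form the 20th century. Both helper names are hypothetical and exist only for this illustration.

```python
def century_as_in_article(year: int) -> int:
    """Century as computed in the data-loading code: int(year/100 + 1)."""
    return int(year / 100 + 1)


def century_conventional(year: int) -> int:
    """Conventional century: years 1901-2000 belong to the 20th century."""
    return (year - 1) // 100 + 1


# The two definitions differ only for years that are exact multiples of 100.
for year in (1899, 1900, 1901, 2000, 2013):
    print(year, century_as_in_article(year), century_conventional(year))
```

For grouping temperatures by century either definition is workable, as long as the same one is used throughout.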

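Line chart

A line chart is about as simple as charts get, and there is not much to optimize: a color that is easy on the eyes is enough. You can pick one from a color table found online.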
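As a rough stand-in for such a color table, matplotlib's own named colors can be inspected in code. The sketch below looks up the hex value of a few color names that appear in this post; the `names` list is only an illustrative selection.

```python
import matplotlib.colors as mcolors

# A few of the named colors used in this post (illustrative selection)
names = ['lime', 'darkred', 'steelblue', 'lightgreen', 'lightblue']

for name in names:
    # to_hex converts any color matplotlib understands into a '#rrggbb' string
    print(name, mcolors.to_hex(name))

# CSS4_COLORS maps every named CSS4 color to its hex value
print(len(mcolors.CSS4_COLORS), 'named CSS4 colors available')
```

Any key of `mcolors.CSS4_COLORS` can be passed wherever the plotting calls in this post take a color.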

# Select part of the Shanghai weather data
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .set_index('dt')
df1.head()

# Line chart (the keyword is `color`, not `colors`)
df1.plot(color=['lime'])
plt.title('AverageTemperature Of ShangHai')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()

The chart above has a single line. Multiple lines can be drawn too: just add more columns to the DataFrame.

# Data for the multi-line chart: one column per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
                .merge(df3, how='inner', on=['dt'])\
                .set_index(['dt'])
df123.head()

# Multi-line chart
df123.plot()
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
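The "one column per line" idea can also be sketched without three separate filters and merges, by pivoting long-format data into wide format. The example below uses made-up random temperatures in place of the real CSV, so the values and the names `long_df` and `wide_df` are assumptions for illustration only.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the climate data: 3 cities x 12 months of random values
rng = np.random.default_rng(0)
dates = pd.date_range('2010-01-01', periods=12, freq='MS')
cities = ['Shanghai', 'Tianjin', 'Shenyang']
long_df = pd.DataFrame({
    'dt': list(dates) * len(cities),
    'City': np.repeat(cities, len(dates)),
    'AverageTemperature': rng.normal(15, 8, size=len(cities) * len(dates)),
})

# Pivot to wide format: one row per date, one column per city
wide_df = long_df.pivot(index='dt', columns='City', values='AverageTemperature')

# DataFrame.plot draws one line per column
ax = wide_df.plot(figsize=(10, 6))
ax.set_title('AverageTemperature Of 3 City')
ax.set_ylabel('AverageTemperature')
ax.set_xlabel('Years')
```

With the real data, the same pivot could be applied to the rows of `climate` filtered to the cities of interest.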

Pie chart

Next comes the pie chart, where there is more to optimize, for example how far the wedges are pulled apart. Let's start with a "basic" pie chart.

# Sum only the numeric 'Step' column per SetID
df1 = salary_ranges.groupby('SetID')[['Step']].sum()

# "Basic" pie chart
df1['Step'].plot(kind='pie', figsize=(7,7),
                  autopct='%1.1f%%',
                  shadow=True)
plt.axis('equal')
plt.show()

# "Premium" pie chart
colors = ['lightgreen', 'lightblue'] # wedge colors; more options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
explode = [0, 0.2] # how far each wedge is pulled out; larger means more separated

df1['Step'].plot(kind='pie', figsize=(7, 7),
                  autopct='%1.1f%%', startangle=90,
                  shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
plt.axis('equal')
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
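To see what the `explode` values do, here is a minimal sketch that draws the same two-wedge pie with three different offsets side by side. It calls matplotlib's `Axes.pie` directly, and the `sizes` and `labels` values are made up rather than taken from the salary dataset.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

sizes = [70, 30]               # hypothetical values, not from the salary data
labels = ['group A', 'group B']
colors = ['lightgreen', 'lightblue']

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, offset in zip(axes, [0, 0.2, 0.5]):
    # explode gives, per wedge, the fraction of the radius by which it is offset
    wedges, texts, autotexts = ax.pie(
        sizes, explode=[0, offset], colors=colors,
        autopct='%1.1f%%', startangle=90, pctdistance=1.12)
    ax.set_title(f'explode=[0, {offset}]')
    ax.axis('equal')  # keep the pie circular

# One shared legend instead of labels on every wedge
fig.legend(wedges, labels, loc='upper right')
```

An offset of 0 leaves a wedge in place, so `[0, 0.2]` pulls out only the second wedge, as in the "premium" chart above.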

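Scatter plot

There is less to optimize in a scatter plot. The colors of the ggplot style all look quite good. As the saying goes, pick a good style and you save a lot of effort!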

# Select part of the Shanghai and Shenyang weather data
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SH'})

df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()

# Scatter plot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()

Area chart

# Data for the area chart: one column per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .rename(columns={'AverageTemperature':'SY'})
# Merge on the date column
df123 = df1.merge(df2, how='inner', on=['dt'])\
                .merge(df3, how='inner', on=['dt'])\
                .set_index(['dt'])
df123.head()

colors = ['red', 'pink', 'blue'] # area colors; more options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
# stacked=False overlays the areas instead of stacking them (the keyword is `color`, not `colors`)
df123.plot(kind='area', stacked=False,
        figsize=(20, 10), color=colors)
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()

Histogram

# Select part of the Shanghai weather data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .set_index('dt')
df.head()

# The simplest histogram (the keyword is `color`, not `colors`)
df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')
plt.title('ShangHai AverageTemperature Of 2010-2013') # add a title to the histogram
plt.ylabel('Number of months') # add y-label
plt.xlabel('AverageTemperature') # add x-label
plt.show()
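A histogram counts how many values fall into each bin, so the choice of bins shapes the picture. The sketch below computes the bin edges explicitly with `np.histogram` and passes them to matplotlib's `Axes.hist`. The temperatures are random, made-up values standing in for the Shanghai series, and 10 bins is simply the pandas and matplotlib default.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly temperatures (48 months), not the real Shanghai data
rng = np.random.default_rng(1)
temps = rng.normal(17, 8, size=48)

# Compute counts and bin edges explicitly: 10 bins need 11 edges
counts, edges = np.histogram(temps, bins=10)

fig, ax = plt.subplots(figsize=(8, 5))
# Passing the edges makes the plotted bars match the counts computed above
n, bins, patches = ax.hist(temps, bins=edges, color='grey')
ax.set_title('AverageTemperature histogram (synthetic data)')
ax.set_ylabel('Number of months')
ax.set_xlabel('AverageTemperature')
```

The same `bins` argument is accepted by `df['AverageTemperature'].plot(kind='hist', bins=...)`.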

Bar chart

# Select part of the Shanghai weather data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                  .loc[:,['dt','AverageTemperature']]\
                  .set_index('dt')
df.head()

# Vertical bar chart
df.plot(kind='bar', figsize=(10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of shanghai')
plt.show()

# Horizontal bar chart
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of shanghai')
plt.show()
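With a datetime index, a bar chart labels every bar with the full timestamp, which tends to look cluttered. One possible tidy-up, sketched below, is to format the index as year-month strings before plotting. The six temperature values are made up for illustration and are not taken from the dataset.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly values standing in for the Shanghai series
dates = pd.date_range('2010-01-01', periods=6, freq='MS')
df = pd.DataFrame({'AverageTemperature': [4.1, 6.0, 9.5, 15.2, 20.3, 24.0]},
                  index=dates)

# Replace full timestamps with short 'YYYY-MM' strings for the tick labels
df.index = df.index.strftime('%Y-%m')

ax = df.plot(kind='bar', figsize=(10, 6), color='steelblue', legend=False)
ax.set_xlabel('Month')
ax.set_ylabel('AverageTemperature')
ax.set_title('AverageTemperature of shanghai')

labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

The same index formatting works for the horizontal `kind='barh'` variant, where the labels end up on the y-axis.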

Today's post is on the long side, so consider bookmarking it. When you have time, you can add these snippets to your own code library, which makes them easier to reuse.
