
Python Data Visualization Cookbook: Importing Data

王文波 > Python

2017.04.04


1. Importing data from a CSV file

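How it works: the with statement opens the file and binds it to the object f. There is no need to close the data file once you are done with it, because the with context manager takes care of that. csv.reader then returns a reader object, which is used to iterate over every row of the file.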

    #!/usr/bin/env python
    import csv
    import sys

    filename = 'ch02-data.csv'

    data = []
    header = None
    try:
        # newline='' lets the csv module handle line endings itself
        with open(filename, newline='') as f:
            reader = csv.reader(f)
            c = 0
            for row in reader:
                if c == 0:
                    # the first row holds the column names
                    header = row
                else:
                    data.append(row)
                c += 1
    except csv.Error as e:
        print('Error reading CSV file at line %s: %s' % (reader.line_num, e))
        sys.exit(-1)

    if header:
        print(header)
        print()

    for datarow in data:
        print(datarow)

[Figure: screenshot of the script's output]
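To try the same pattern without `ch02-data.csv` on disk, here is a minimal sketch that reads CSV text from an in-memory buffer. The sample rows and the helper name `read_csv_rows` are made up for illustration; `next(reader, None)` pulls off the header row instead of counting rows by hand.

```python
import csv
import io


def read_csv_rows(f):
    """Return (header, rows) from an open text file-like object.

    Hypothetical helper for illustration; not part of the csv module.
    """
    reader = csv.reader(f)
    header = next(reader, None)  # first row, or None if the input is empty
    rows = list(reader)          # remaining rows, each a list of strings
    return header, rows


# Made-up sample data standing in for a real CSV file.
sample = io.StringIO("name,value\nalpha,1\nbeta,2\n")
header, rows = read_csv_rows(sample)
print(header)
for row in rows:
    print(row)
```

Every field comes back as a string, so numeric columns still need an explicit conversion such as `int(row[1])`.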

2. Importing data from an Excel file

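An Excel file can be converted to CSV and then imported with the method above. But if you want to process a large number of files automatically in a data pipeline (as part of a continuous data-processing flow), converting each Excel file to CSV by hand is not workable.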

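How it works: the xlrd module opens the file's workbook, and the cell contents are then read according to the number of rows (nrows) and columns (ncols). The call to open_workbook returns an xlrd.book.Book instance.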

    from datetime import datetime
    from pprint import pprint

    import xlrd
    from xlrd.xldate import XLDateAmbiguous

    # Note: xlrd 2.0 and later read only .xls files; reading .xlsx
    # requires an older xlrd release or a different library.
    file = 'ch02-xlsxdata.xlsx'

    wb = xlrd.open_workbook(filename=file)

    ws = wb.sheet_by_name('Sheet1')

    dataset = []

    for r in range(ws.nrows):
        col = []
        for c in range(ws.ncols):
            col.append(ws.cell(r, c).value)
            if ws.cell_type(r, c) == xlrd.XL_CELL_DATE:
                try:
                    print(ws.cell_type(r, c))
                    # convert the Excel serial date using the workbook's date mode
                    date_value = xlrd.xldate_as_tuple(ws.cell(r, c).value, wb.datemode)
                    print(datetime(*date_value))
                except XLDateAmbiguous as e:
                    print(e)
        dataset.append(col)

    pprint(dataset)

[Figure: screenshot of the script's output]
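The date branch in the Excel listing relies on how Excel stores dates: as a serial number of days counted from an epoch chosen by the workbook's `datemode` (0 for the 1900 system, 1 for the 1904 system). The sketch below reproduces that arithmetic with the standard library only, so it runs without xlrd or a workbook. The function name `excel_serial_to_datetime` is hypothetical, and the sketch assumes the usual epochs of 1899-12-30 and 1904-01-01. It rejects 1900-system serials below 61, the range that xlrd reports as `XLDateAmbiguous` because of Excel's phantom 29 February 1900.

```python
from datetime import datetime, timedelta

# Assumed epochs: 1900 date system (datemode 0) and 1904 date system (datemode 1).
EPOCHS = {0: datetime(1899, 12, 30), 1: datetime(1904, 1, 1)}


def excel_serial_to_datetime(serial, datemode=0):
    """Convert an Excel date serial (days since the epoch) to a datetime.

    Hypothetical helper for illustration; in real code prefer xlrd's own
    conversion functions.
    """
    if datemode not in EPOCHS:
        raise ValueError("datemode must be 0 or 1")
    if datemode == 0 and serial < 61:
        # Serials below 61 straddle Excel's nonexistent 1900-02-29.
        raise ValueError("ambiguous date in the 1900 system: %r" % serial)
    return EPOCHS[datemode] + timedelta(days=serial)


print(excel_serial_to_datetime(42829))    # 2017-04-04 00:00:00
print(excel_serial_to_datetime(42829.5))  # 2017-04-04 12:00:00
```

The fractional part of the serial is the time of day, which is why 42829.5 lands on noon.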

3. Importing data from a fixed-width data file

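Event log files and time-series files are among the most common data sources in data visualization. Sometimes they can be read as tab-separated data using a CSV dialect, but sometimes the fields are not separated by any special character. Instead, each field has a fixed width, and we can describe that layout with a format string to match and extract the data.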

For example (the data for this example was generated with code):

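Procedure:

1. Specify the data file to read.
2. Define how the data is to be read.
3. Read the file line by line and parse each line into separate fields according to the format.
4. Print each line as separate fields.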

    import struct

    # three byte-string fields of widths 9, 14 and 5
    mask = '9s14s5s'
    parse = struct.Struct(mask).unpack_from
    print('formatstring {!r}, record size: {}'.format(
        mask, struct.calcsize(mask)))

    datafile = 'ch02-fixed-width-1M.data'

    # struct works on bytes, so open the file in binary mode
    with open(datafile, 'rb') as f:
        for line in f:
            fields = parse(line)
            print('fields: ', [field.decode().strip() for field in fields])

[Figure: screenshot of the script's output]
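To see what the mask `'9s14s5s'` does to a single record, the sketch below parses one made-up line from memory instead of reading `ch02-fixed-width-1M.data`. The sample values and the helper name `parse_record` are assumptions for illustration. The mask describes three byte-string fields of widths 9, 14, and 5, for a record size of 28 bytes.

```python
import struct

MASK = '9s14s5s'              # three fields: 9, 14 and 5 bytes wide
RECORD = struct.Struct(MASK)  # RECORD.size is 28


def parse_record(line):
    """Split one fixed-width record (bytes) into stripped text fields.

    Hypothetical helper for illustration.
    """
    if len(line) < RECORD.size:
        raise ValueError('record shorter than %d bytes' % RECORD.size)
    raw_fields = RECORD.unpack_from(line)  # bytes past RECORD.size are ignored
    return [field.decode('ascii').strip() for field in raw_fields]


# Made-up record: each value is left-aligned and space-padded to its width.
sample = '{:<9}{:<14}{:<5}\n'.format('sensor01', 'temperature', '21.5').encode('ascii')
print(parse_record(sample))
```

Because `unpack_from` only requires the buffer to be at least as long as the record, the trailing newline on each line does no harm.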

4. Importing data from a JSON data source

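The steps are as follows:

1. Specify the GitHub URL from which to read JSON data.
2. Use the requests module to access that URL and read the content.
3. Convert the content into a JSON object.
4. Iterate over the JSON object and, for each item, read the URL of each repository.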

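How it works: first, the requests module fetches the remote resource. Requests offers a simple API that mirrors the HTTP verbs, so all we need here is a get call. The only part of the response we care about is the Response.json method, which reads Response.content, parses it as JSON, and loads it into a Python object.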

The code is as follows:

    import requests
    from pprint import pprint

    url = 'https://api.github.com/users/justglowing'
    r = requests.get(url)
    # json is a method and has to be called to get the parsed object
    json_obj = r.json()
    pprint(json_obj)

[Figure: screenshot of the script's output]
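Step 4 of the procedure, iterating over the parsed JSON and reading each repository's URL, does not appear in the listing. A minimal sketch of that step follows. It works on already-parsed JSON, so it runs offline; the sample payload is made up, the helper name `repo_urls` is hypothetical, and the `html_url` key is an assumption about the shape of the repository objects.

```python
import json


def repo_urls(repos, key='html_url'):
    """Collect the value stored under `key` from each repository dict.

    Hypothetical helper for illustration; entries without the key are skipped.
    """
    return [repo[key] for repo in repos if key in repo]


# Made-up payload shaped like a list of repository objects.
payload = '''
[
  {"name": "demo-one", "html_url": "https://github.com/example/demo-one"},
  {"name": "demo-two", "html_url": "https://github.com/example/demo-two"},
  {"name": "no-url"}
]
'''
repos = json.loads(payload)  # the same kind of object that r.json() returns
for url in repo_urls(repos):
    print(url)
```

With requests, the list would come from calling `r.json()` on a response whose body is a JSON array of repositories, rather than from `json.loads` on a string.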

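Closing note: last month I was helping someone with a graduation project built on Flask, and this month I still have to write an online shop in Java EE, so I have been swamped and have not updated the blog. Today, a Sunday, I read about Python data visualization in the library and spent more than half the time daydreaming. Oh well, better to update the blog. It is a pity about that polished series I promised...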
