Writing Python code to download Baidu Space articles (source code included)

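This is a Baidu Space article downloader written by a complete Python beginner. The code is not well written, so please just look at what it does; I have not bothered to tidy it up. Experts, feel free to skip this post.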

How to use the Baidu Space article downloader:

In cmd, enter: python "f:\walkbox\python\mywork\baidu\getarticleid-R1.py" bspeng922 6

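Command format: python <script path> [username] [number of pages to download]

The page count is optional; if it is omitted, every page is downloaded. If it is larger than the actual number of pages, the first page's content is downloaded repeatedly.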
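The argument rules above can be sketched as a small function. This is an illustrative Python 3 sketch, not the script's own code: `parse_args` and `effective_pages` are hypothetical helper names, 0 is assumed to mean "download everything", and the sketch exits with a usage message where the script would prompt for input. Capping the request at the real page total is one possible way to avoid the repeated first page described above.

```python
def parse_args(argv):
    """Return (username, pages); pages == 0 means 'download everything'."""
    if len(argv) < 2:
        raise SystemExit("usage: python <script path> <username> [pages]")
    username = argv[1]
    pages = int(argv[2]) if len(argv) > 2 else 0
    return username, max(pages, 0)


def effective_pages(requested, total_records, page_size):
    """Number of listing pages to fetch, never more than actually exist."""
    total_pages = -(-total_records // page_size)  # ceiling division
    if requested == 0:
        return total_pages
    return min(requested, total_pages)


# Hypothetical invocation: 6 pages requested, but 25 records at 10 per page
username, pages = parse_args(["script.py", "bspeng922", "6"])
print(username, pages)
print(effective_pages(pages, total_records=25, page_size=10))
```

With 25 records and 10 per page only three listing pages exist, so the request for six is capped at three.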

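This code only targets the new version of Baidu Space and has been tested only with the "低调优雅" (Understated Elegance) template. The output is saved as HTML files.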
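The layout of those HTML files (one summary page, one folder per user, and an image subfolder) can be sketched as follows. This is a Python 3 sketch under assumptions: `prepare_output_dirs` is a hypothetical helper, and a temporary directory stands in for the real save location.

```python
import tempfile
from pathlib import Path


def prepare_output_dirs(base_dir, username, img_dir="img"):
    """Create <base>/<username>/<img_dir> and return the three output paths."""
    base = Path(base_dir)
    article_dir = base / username
    image_dir = article_dir / img_dir
    # parents=True also creates base and article_dir when they are missing
    image_dir.mkdir(parents=True, exist_ok=True)
    summary_file = base / ("%s.html" % username)
    return summary_file, article_dir, image_dir


# A temporary directory is used here so the example leaves nothing behind
with tempfile.TemporaryDirectory() as tmp:
    summary, articles, images = prepare_output_dirs(tmp, "bspeng922")
    print(summary.name, articles.name, images.name)
```

Using `mkdir(parents=True, exist_ok=True)` replaces the repeated "check, then create" steps with a single call.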

I also stumbled on an odd side effect: this code can be used to inflate the visit count of a Baidu Space. Nice.

The Python source code of the Baidu Space article downloader is as follows:

# -*- coding: utf8 -*-
import os
import re
import sys
import time
import urllib

def ArticleDownload(username, pageCount):
    # Validate the arguments passed in
    if username == '':
        username = 'bspeng922'
    if pageCount == '' or int(pageCount) <= 0:
        pageCount = 0  # 0 means "download everything"
    else:
        pageCount = int(pageCount) + 1  # exclusive upper bound for range()
    print 'blog: http://hi.baidu.com/new/%s' % username

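    # Directory where the files are stored; change as needed
    saveDrive = r'e:\test'  # directory to save html files
    # Directory for the HTML files
    if not os.path.exists(saveDrive):
        os.mkdir(saveDrive)
    myDrive = os.path.join(saveDrive, username)
    if not os.path.exists(myDrive):
        os.mkdir(myDrive)
    # Directory for the images
    imgDir = 'img'
    imgPath = os.path.join(saveDrive, username, imgDir)
    if not os.path.exists(imgPath):
        os.mkdir(imgPath)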

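    # A page count of 0 means: work out the total and download everything
    if pageCount == 0:
        fstBaidu = urllib.urlopen('http://hi.baidu.com/new/%s' % username)
        totalRecord, pagesize = 0, 0
        for fstline in fstBaidu:
            # Assumed: each value sits between single quotes (the delimiter is lost in the source)
            if fstline.find('allCount') > 0:  # only one tag
                totalRecord = int(fstline[fstline.index("'") + 1:fstline.rindex("'")])
            if fstline.find('pageSize') > 0:
                pagesize = int(fstline[fstline.index("'") + 1:fstline.rindex("'")])
        if pagesize != 0 and totalRecord != 0:
            pageCount = totalRecord / pagesize
            if totalRecord / float(pagesize) > totalRecord / pagesize:
                pageCount = pageCount + 2  # partial last page, plus one for range()
            else:
                pageCount = pageCount + 1  # exact multiple: still one past the last page
        fstBaidu.close()
    print 'pagecount:', pageCount - 1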

    # Collect the real article links from the article IDs
    articleCount = 0
    sumHtmlPath = os.path.join(saveDrive, '%s.html' % username)
    sumFile = open(sumHtmlPath, 'w')  # the summary (index) file
    aTagCmp = re.compile(r'''<a href="/%s/item/([\w]*?)" class="a-incontent a-title cs-contentblock-hoverlink" target=_blank>(.*?)</a>''' % username)
    for page in range(1, pageCount):
        thisPageUrl = urllib.urlopen('http://hi.baidu.com/new/%s?page=%d' % (username, page))
        print 'page:', page
        for line in thisPageUrl:
            if line.find('a-incontent a-title') > 0:
                articleCount += 1  # number of blog articles
                lineFind = aTagCmp.findall(line)
                # print lineFind
                for item in lineFind:
                    # Article ID and title
                    myUrl = item[0]
                    myTitle = item[1]
                    sumFile.write('''<a href="%s\\%s.html" target="_blank">%s</a><br />''' % (username, myUrl, myTitle))
                    # Fetch and save the actual article
                    thisPath = os.path.join(myDrive, '%s.html' % myUrl)
                    thisFile = open(thisPath, 'w')
                    thisArticle = urllib.urlopen('http://hi.baidu.com/%s/item/%s' % (username, myUrl))
                    for thisline in thisArticle:
                        imgCount = 0
                        badImg = 0
                        if thisline.find('content-head clearfix') > 0:  # keep only the article body
                            # Match the image tags
                            imgTagCmp = re.compile(r'''<img.*?src="(.*?)".*?>''')
                            imgList = imgTagCmp.findall(thisline)
                            for imgLink in imgList:
                                imageNewPath = ''
                                # print imgLink
                                if imgLink.find('://') > 0:
                                    imageName = imgLink[imgLink.rindex('/') + 1:]
                                    # Download the image
                                    try:
                                        urllib.urlretrieve(imgLink, os.path.join(imgPath, imageName))
                                        imgCount += 1
                                    except IOError:  # report the image if it cannot be downloaded
                                        print 'cannot download this image:', imageName
                                    # Rewrite the image link to point at the local copy
                                    imageNewPath = '''<img src="%s/%s" />''' % (imgDir, imageName)
                                    escName = re.escape(imageName)
                                    # The second alternative is partly lost in the source; its attributes are assumed
                                    thisImgCmp = re.compile(r'''<img width="\d{1,4}" height="\d{1,4}" src="http://.*?/%s" />|<img width="\d{1,4}" height="\d{1,4}" src="http://.*?/%s" small="0" />|<img src="http://.*?/%s" />|<img small="0" src="http://.*?/%s" />''' % (escName, escName, escName, escName))
                                    # print imageNewPath
                                    try:
                                        # print thisImgCmp.findall(thisline)
                                        thisline = thisImgCmp.sub(imageNewPath, thisline)  # replace the current image tag each time
                                        # print thisline
                                    except Exception:
                                        print 'UnExpect error'
                                else:
                                    badImg += 1
                            # Strip the unneeded trailing content
                            pos = thisline.find('mod-post-info clearfix')
                            if pos > 0:
                                thisline = thisline[0:pos - 12]
                            thisFile.write(thisline)
                    thisFile.close()
                    thisArticle.close()
                    # print 'image count: %d  bad image: %d' % (imgCount, badImg)
        thisPageUrl.close()
    sumFile.close()
    print 'article count:', articleCount

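if __name__ == '__main__':
    st = time.time()
    # Read the command-line arguments
    if len(sys.argv) == 2:
        uname = sys.argv[1]
        pages = 0
    elif len(sys.argv) > 2:
        uname = sys.argv[1]
        pages = int(sys.argv[2])  # ArticleDownload already adds the +1 needed for range()
    else:
        uname = raw_input('Username-')
        pages = raw_input('page-')
    ArticleDownload(uname, pages)
    et = time.time()
    print 'time used: %0.2fs' % (et - st)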
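Two steps of the listing above are easier to check in isolation: pulling article IDs and titles out of a listing page, and pointing image tags at local copies. The following is a Python 3 sketch rather than the script's own code. `extract_articles` and `localize_images` are hypothetical names, the sample HTML is an assumption modelled on the class names used above, and `localize_images` swaps only the `src` attribute instead of replacing the whole tag as the listing does.

```python
import re

# Absolute image sources: group 1 = tag up to src=", group 2 = URL, group 3 = closing quote
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")(https?://[^"]+)(")')


def extract_articles(html, username):
    """Return (item_id, title) pairs for the article links in a listing page."""
    pattern = re.compile(
        r'<a href="/%s/item/(\w*?)"[^>]*>(.*?)</a>' % re.escape(username)
    )
    return pattern.findall(html)


def localize_images(html, img_dir="img"):
    """Rewrite absolute <img> sources to img_dir/<file name>; return (html, urls)."""
    urls = []

    def repl(match):
        url = match.group(2)
        urls.append(url)  # remember what would have to be downloaded
        name = url.rsplit("/", 1)[-1]
        return "%s%s/%s%s" % (match.group(1), img_dir, name, match.group(3))

    return IMG_SRC.sub(repl, html), urls


# Assumed sample markup, not captured from a real page
listing = '<a href="/bspeng922/item/abc123" class="a-incontent a-title" target=_blank>Hello</a>'
body = '<p><img width="10" src="http://example.com/pics/a.jpg" small="0" /></p>'

print(extract_articles(listing, "bspeng922"))
print(localize_images(body))
```

Returning the collected URLs alongside the rewritten HTML keeps the download step separate from the text processing, so each part can be tested without network access.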