[Python] HTML/XML Parsing Library - BeautifulSoup

NineKY 2009. 10. 28. 13:59

BeautifulSoup is a library that makes it easier to work with HTML/XML in Python. It is not hard to use, so a quick Google search will answer most questions.
Author's site: http://www.crummy.com/software/BeautifulSoup/

API information for BeautifulSoup is available at the following site.
Reference: http://api.plone.org/Plone/3.0/private/frames/src/kss.core/kss/core/private/kss.core.BeautifulSoup-module.html

The source below is a simple piece of code written with BeautifulSoup.

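import socket
import urllib

import BeautifulSoup  # BeautifulSoup 3 module; this snippet is Python 2 code


# The function wrapper and its parameters are assumed; the original post
# shows only the try/except body.
def GetVirustotalResult(vitourl, timeout):
    try:
        socket.setdefaulttimeout(timeout)
        # Fetch the page HTML from vitourl.
        text = urllib.urlopen(vitourl).read()
        # Pass the HTML to BeautifulSoup as input.
        soup = BeautifulSoup.BeautifulSoup(text)
        # Search for <table ...>, matching only the one whose id is tablaMotores.
        table = soup.find("table", {"id": "tablaMotores"})
        # Search the table result for every <tr ...>.
        for TRs in table.findAll("tr"):
            # Within the row, search for <td ...> whose class is positivo.
            node = TRs.find("td", {"class": "positivo"})
            if node:
                TDs = TRs("td")  # calling a tag is shorthand for findAll
                print "%-20s : %s" % (TDs.pop(0).contents[0], node.contents[0])
    except Exception, msg:
        print "Error:Exception GetVirustotalResult : %s --> %s" % (msg, vitourl)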
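
For readers who want to try the same lookups without a network connection, here is a minimal sketch of the table search above. It assumes Python 3 and the newer bs4 package rather than the BeautifulSoup 3 module used in the post. The sample markup and the function name positive_rows are made up for illustration; only the id tablaMotores and the class positivo come from the code above.

```python
from bs4 import BeautifulSoup  # assumes the bs4 package (BeautifulSoup 4)

# Hypothetical sample markup; only the id and class names come from the post.
SAMPLE_HTML = """
<table id="tablaMotores">
  <tr><td>EngineA</td><td class="positivo">Trojan.Sample</td></tr>
  <tr><td>EngineB</td><td class="negativo">-</td></tr>
  <tr><td>EngineC</td><td class="positivo">Worm.Sample</td></tr>
</table>
"""


def positive_rows(html):
    """Return (first cell text, positivo cell text) for each flagged row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"id": "tablaMotores"})
    if table is None:
        # No matching table: return an empty result instead of raising.
        return []
    results = []
    for row in table.find_all("tr"):
        node = row.find("td", {"class": "positivo"})
        if node is not None:
            first_cell = row.find("td")
            results.append(
                (first_cell.get_text(strip=True), node.get_text(strip=True))
            )
    return results


for name, verdict in positive_rows(SAMPLE_HTML):
    print("%-20s : %s" % (name, verdict))
```

Checking for a missing table explicitly, instead of relying on a broad except clause, is a design choice of this sketch and not something the original code does.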

Below is the usage documentation provided by the author.

[ BeautifulSoup Documentation ]

The attributes of Tags

Navigating the Parse Tree

Searching the Parse Tree

The basic find method: findAll(name, attrs, recursive, text, limit, **kwargs)

Searching Within the Parse Tree

Modifying the Parse Tree

Troubleshooting

Advanced Topics

See Also

Conclusion
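
The findAll(name, attrs, recursive, text, limit, **kwargs) signature listed in the outline above can be exercised with a short sketch. It assumes Python 3 and the bs4 package, where the method is spelled find_all and the text argument is named string; the markup is hypothetical and exists only to give each parameter something to match.

```python
from bs4 import BeautifulSoup  # assumes the bs4 package (BeautifulSoup 4)

# Hypothetical markup used only for illustration.
HTML = """
<div id="root">
  <p class="note">first</p>
  <section><p class="note">nested</p></section>
  <p>last</p>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
root = soup.find("div", {"id": "root"})

# name: every <p> descendant of root
all_p = root.find_all("p")
# attrs: restrict matches by attribute values
notes = root.find_all("p", {"class": "note"})
# recursive=False: look at direct children only
direct = root.find_all("p", recursive=False)
# limit: stop after the given number of matches
first_two = root.find_all("p", limit=2)
# string: match text nodes (the `text` parameter of the older signature)
by_text = root.find_all(string="last")
# **kwargs: attribute filters as keywords (class_ because class is reserved)
by_kwarg = root.find_all("p", class_="note")

print([p.get_text() for p in all_p])
print([p.get_text() for p in direct])
```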

License: Attribution, NonCommercial, NoDerivatives
