[Python] Web crawling with the Python library Beautiful Soup

빨강소 2020. 8. 21. 14:43

Beautiful Soup is a Python library for parsing HTML and XML, commonly used when crawling or scraping data from the web.

Installing Beautiful Soup

Install the library with pip. On Windows, if pip is not on your PATH, you can run it from the Scripts folder of your Python installation.

C:\Users\82108\AppData\Local\Programs\Python\Python38\Scripts>pip install bs4
or
C:\Users\82108\AppData\Local\Programs\Python\Python38\Scripts>pip install beautifulsoup4

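Either pip command downloads and installs the library. beautifulsoup4 is the official package name; bs4 is a wrapper package on PyPI that depends on it.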
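
To confirm that the installation worked, one quick check is to import the package and parse a tiny document. This is a minimal sketch that assumes only that one of the pip commands above succeeded; the inline HTML string is made up for illustration.

```python
# If the install failed, the import below raises ModuleNotFoundError.
import bs4
from bs4 import BeautifulSoup

print("Beautiful Soup version:", bs4.__version__)

# Parse a tiny inline document with Python's built-in parser.
soup = BeautifulSoup("<p>hello</p>", "html.parser")
print(soup.p.get_text())  # prints: hello
```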

Basic Beautiful Soup setup

Import the package, then either load a local HTML file or fetch the page source directly from the web with the urllib or requests module.

1. Import the package

from bs4 import BeautifulSoup

2. Open an HTML file

# Assumes example.html is UTF-8 encoded; change the encoding if it is not.
with open("example.html", encoding="utf-8") as fp:
    soup = BeautifulSoup(fp, 'html.parser')
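
Once a file is parsed, the soup object can be searched for tags. The sketch below writes a small sample file first so that it runs on its own; the file contents, the tag names, and the `sample_html` variable are invented for illustration and are not from the original example.html.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample page, written to a temporary file so the example is self-contained.
sample_html = """<html>
<head><title>Sample page</title></head>
<body>
<h1 id="main">Hello</h1>
<a href="/first">First link</a>
<a href="/second">Second link</a>
</body>
</html>"""

path = os.path.join(tempfile.mkdtemp(), "example.html")
with open(path, "w", encoding="utf-8") as fp:
    fp.write(sample_html)

# Open the file with an explicit encoding and parse it.
with open(path, encoding="utf-8") as fp:
    soup = BeautifulSoup(fp, "html.parser")

title = soup.title.string                        # text inside <title>
heading = soup.find("h1", id="main").get_text()  # first <h1> whose id is "main"
links = [a["href"] for a in soup.find_all("a")]  # href of every <a> tag

print(title)    # Sample page
print(heading)  # Hello
print(links)    # ['/first', '/second']
```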

3. Fetch page source from the web with urllib

import urllib.request

# Placeholder URL; replace it with the URL of the page you want to fetch.
Url = "https://example.com"
with urllib.request.urlopen(Url) as response:
    html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
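
In practice a request can fail or hang, so it may help to wrap the urllib call in a small function. The following is a sketch under assumptions: the function name `fetch_soup`, the 10-second default timeout, and the User-Agent string are all illustrative choices, not part of the original post.

```python
import urllib.error
import urllib.request

from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Download url with urllib and return a BeautifulSoup object.

    Returns None if the request raises URLError (which includes HTTPError).
    """
    # Some sites reject urllib's default User-Agent, so send an explicit one.
    # The value used here is an arbitrary illustrative string.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            html = response.read()  # raw bytes; Beautiful Soup detects the encoding
    except urllib.error.URLError as exc:  # HTTPError is a subclass of URLError
        print(f"Request failed: {exc}")
        return None
    return BeautifulSoup(html, "html.parser")
```

For example, `fetch_soup("https://example.com")` would return a soup whose title can be read with `soup.title.string`, assuming the request succeeds.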

4. Fetch page source from the web with requests

>>> import requests
>>> # Assign the URL of the page you want to fetch to Url before running this.
>>> r = requests.get(Url)
>>> r.status_code
200
>>> r.headers['content-type']
'text/html; charset=UTF-8'
>>> r.encoding
'UTF-8'
>>> r.text
<!DOCTYPE html>
<html class="client-nojs" lang="en" dir="ltr">
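
The session above stops at printing `r.text`. To hand that text to Beautiful Soup, something along these lines should work. It is a sketch, not the original author's code: the function name `fetch_soup_with_requests` and the 10-second timeout are assumptions made for illustration.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup_with_requests(url, timeout=10):
    """Fetch url with requests and return the parsed page."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    # response.text is the body decoded using response.encoding.
    return BeautifulSoup(response.text, "html.parser")
```

Calling `fetch_soup_with_requests("https://example.com")` returns a soup object that can be searched with `find` and `find_all`, provided the request succeeds.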

License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)
