Data Engineering

PostgreSQL - ๊ด€๋ฆฌ

๊ตฌ์„ฑ, ๋ณต์ œ, ์ธ๋ฑ์‹ฑ(B-tree, GIN, GiST, BRIN), VACUUM, EXPLAIN ANALYZE, ํ™•์žฅ, ๋ฐฑ์—…

20 ๋ฉด์ ‘ ์งˆ๋ฌธยท
Mid-Level
1

shared_buffers ๋ฐ work_mem๊ณผ ๊ฐ™์€ ์ฃผ์š” ์„œ๋ฒ„ ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ ํฌํ•จ๋œ PostgreSQL ๊ตฌ์„ฑ ํŒŒ์ผ์€ ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

๋‹ต๋ณ€

postgresql.conf ํŒŒ์ผ์—๋Š” ๋ฉ”๋ชจ๋ฆฌ ํ• ๋‹น(shared_buffers, work_mem), ์—ฐ๊ฒฐ ์„ค์ •, ๋กœ๊น… ๋ฐ ์„ฑ๋Šฅ ํŠœ๋‹์„ ํฌํ•จํ•œ ์ฃผ์š” PostgreSQL ์„œ๋ฒ„ ๊ตฌ์„ฑ ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ํŒŒ์ผ์€ ์„œ๋ฒ„ ์‹œ์ž‘ ์‹œ ์ฝํžˆ๋ฉฐ ์ผ๋ถ€ ๋งค๊ฐœ๋ณ€์ˆ˜๋Š” ์ „์ฒด ์žฌ์‹œ์ž‘์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

2

๋ฐ์ดํ„ฐ ์บ์‹ฑ์— ์‚ฌ์šฉ๋˜๋Š” ๊ณต์œ  ๋ฉ”๋ชจ๋ฆฌ ์–‘์„ ์ •์˜ํ•˜๋Š” PostgreSQL ๋งค๊ฐœ๋ณ€์ˆ˜๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

๋‹ต๋ณ€

shared_buffers๋Š” PostgreSQL์ด ๋ฐ์ดํ„ฐ ์บ์‹ฑ์— ์‚ฌ์šฉํ•˜๋Š” ๋ฉ”๋ชจ๋ฆฌ ์–‘์„ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. ์ด ๋งค๊ฐœ๋ณ€์ˆ˜๋Š” ๋ชจ๋“  ์—ฐ๊ฒฐ์—์„œ ๊ณต์œ ๋˜๋Š” ๋ฒ„ํผ ํ’€์˜ ํฌ๊ธฐ๋ฅผ ๊ฒฐ์ •ํ•˜๋ฏ€๋กœ ์„ฑ๋Šฅ์— ๋งค์šฐ ์ค‘์š”ํ•ฉ๋‹ˆ๋‹ค. ์ผ๋ฐ˜์ ์œผ๋กœ PostgreSQL ์ „์šฉ ์„œ๋ฒ„์˜ ๊ฒฝ์šฐ ์ด ์„œ๋ฒ„ RAM์˜ 25%์—์„œ 40% ์‚ฌ์ด๋กœ ์„ค์ •ํ•˜๋Š” ๊ฒƒ์ด ๊ถŒ์žฅ๋ฉ๋‹ˆ๋‹ค.

3

PostgreSQL์˜ ์ŠคํŠธ๋ฆฌ๋ฐ ๋ณต์ œ์™€ ๋…ผ๋ฆฌ์  ๋ณต์ œ์˜ ์ฃผ์š” ์ฐจ์ด์ ์€ ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

๋‹ต๋ณ€

์ŠคํŠธ๋ฆฌ๋ฐ ๋ณต์ œ๋Š” ๋ณ€๊ฒฝ ์‚ฌํ•ญ์„ WAL(Write-Ahead Log) ์ˆ˜์ค€์—์„œ ๋ฐ”์ด๋„ˆ๋ฆฌ ํ˜•์‹์œผ๋กœ ์ „์†กํ•˜์—ฌ ๊ธฐ๋ณธ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์˜ ์ •ํ™•ํ•œ ๋ณต์‚ฌ๋ณธ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ๋…ผ๋ฆฌ์  ๋ณต์ œ๋Š” ๋ณ€๊ฒฝ ์‚ฌํ•ญ์„ ๋…ผ๋ฆฌ์  SQL ์ž‘์—…์œผ๋กœ ๋””์ฝ”๋”ฉํ•˜์—ฌ ํŠน์ • ํ…Œ์ด๋ธ”์˜ ์„ ํƒ์  ๋ณต์ œ์™€ ๋ณต์ œ ์ค‘ ๋ฐ์ดํ„ฐ ๋ณ€ํ™˜์„ ํ—ˆ์šฉํ•ฉ๋‹ˆ๋‹ค. ๋…ผ๋ฆฌ์  ๋ณต์ œ๋Š” ๋” ๋งŽ์€ ์œ ์—ฐ์„ฑ์„ ์ œ๊ณตํ•˜์ง€๋งŒ ์•ฝ๊ฐ„ ๋” ๋†’์€ ์˜ค๋ฒ„ํ—ค๋“œ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.

4

์ „์ฒด ํ…์ŠคํŠธ ๊ฒ€์ƒ‰ ๋ฐ JSONB ์ปฌ๋Ÿผ์— ๊ฐ€์žฅ ์ ํ•ฉํ•œ PostgreSQL ์ธ๋ฑ์Šค ์œ ํ˜•์€ ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

5

BRIN(Block Range Index)์ด ํŠนํžˆ ํšจ๊ณผ์ ์ธ ๊ฒฝ์šฐ๋Š” ์–ธ์ œ์ž…๋‹ˆ๊นŒ?

+17 ๋ฉด์ ‘ ์งˆ๋ฌธ

๊ธฐํƒ€ Data Engineering ๋ฉด์ ‘ ์ฃผ์ œ

Linux & Shell - ๊ธฐ์ดˆ

Junior
20๊ฐœ ์งˆ๋ฌธ

Git & GitHub - ๊ธฐ์ดˆ

Junior
20๊ฐœ ์งˆ๋ฌธ

๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง์„ ์œ„ํ•œ ๊ณ ๊ธ‰ Python

Junior
25๊ฐœ ์งˆ๋ฌธ

Docker - ๊ธฐ์ดˆ

Junior
25๊ฐœ ์งˆ๋ฌธ

Google Cloud Platform - ๊ธฐ์ดˆ

Junior
20๊ฐœ ์งˆ๋ฌธ

CI/CD ๋ฐ ์ฝ”๋“œ ํ’ˆ์งˆ

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Docker Compose

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

FastAPI - ๋ฐ์ดํ„ฐ API

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Data Engineering์„ ์œ„ํ•œ ๊ณ ๊ธ‰ SQL

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Data Lake - ์•„ํ‚คํ…์ฒ˜ ๋ฐ ์ˆ˜์ง‘

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง์„ ์œ„ํ•œ BigQuery

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Data Engineering์„ ์œ„ํ•œ Data Modeling

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Fivetran & Airbyte - ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

dbt - ๊ธฐ์ดˆ

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Apache Airflow - ๊ธฐ์ดˆ

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

Kubernetes - ๊ธฐ์ดˆ

Mid-Level
20๊ฐœ ์งˆ๋ฌธ

dbt - ๊ณ ๊ธ‰ ๊ธฐ๋Šฅ

Senior
20๊ฐœ ์งˆ๋ฌธ

ETL / ELT / ETLT ํŒจํ„ด

Senior
20๊ฐœ ์งˆ๋ฌธ

Apache Airflow - ๊ณ ๊ธ‰

Senior
20๊ฐœ ์งˆ๋ฌธ

Airflow + dbt - ํŒŒ์ดํ”„๋ผ์ธ ์˜ค์ผ€์ŠคํŠธ๋ ˆ์ด์…˜

Senior
20๊ฐœ ์งˆ๋ฌธ

PySpark - ๋Œ€๊ทœ๋ชจ ์ฒ˜๋ฆฌ

Senior
20๊ฐœ ์งˆ๋ฌธ

Google Pub/Sub - ๋ฐ์ดํ„ฐ ์ŠคํŠธ๋ฆฌ๋ฐ

Senior
20๊ฐœ ์งˆ๋ฌธ

Apache Beam & Dataflow

Senior
20๊ฐœ ์งˆ๋ฌธ

Kubernetes - ํ”„๋กœ๋•์…˜ ๋ฐ ์Šค์ผ€์ผ๋ง

Senior
20๊ฐœ ์งˆ๋ฌธ

Terraform - Infrastructure as Code

Senior
20๊ฐœ ์งˆ๋ฌธ

NoSQL ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค

Senior
20๊ฐœ ์งˆ๋ฌธ

๋ชจ๋˜ Data Architecture

Senior
20๊ฐœ ์งˆ๋ฌธ

๋ชจ๋‹ˆํ„ฐ๋ง ๋ฐ ๊ด€์ฐฐ ๊ฐ€๋Šฅ์„ฑ

Senior
20๊ฐœ ์งˆ๋ฌธ

IAM ๋ฐ ๋ฐ์ดํ„ฐ ๋ณด์•ˆ

Senior
20๊ฐœ ์งˆ๋ฌธ

๋‹ค์Œ ๋ฉด์ ‘์„ ์œ„ํ•ด Data Engineering์„ ๋งˆ์Šคํ„ฐํ•˜์„ธ์š”

๋ชจ๋“  ์งˆ๋ฌธ, flashcards, ๊ธฐ์ˆ  ํ…Œ์ŠคํŠธ, ์ฝ”๋“œ ๋ฆฌ๋ทฐ ์—ฐ์Šต, ๋ฉด์ ‘ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ์— ์ ‘๊ทผํ•˜์„ธ์š”.

๋ฌด๋ฃŒ๋กœ ์‹œ์ž‘ํ•˜๊ธฐ