Coding Standards¶
Team coding standards for data engineering projects.
Python Standards¶
Style Guide¶
We follow PEP 8 with these additions:
Line length: 88 characters (Black default)
Imports: Use isort with Black compatibility
Type hints: Required for all public functions
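As a small illustration of the type-hint rule, the sketch below fully annotates a public function and leaves a private helper unannotated. Both function names are hypothetical, and treating underscore-prefixed helpers as exempt is an assumption drawn from the rule naming only public functions.

```python
from collections.abc import Iterable


def count_rows(batches: Iterable[list[str]]) -> int:
    """Return the total number of rows across all batches."""
    return sum(_batch_size(batch) for batch in batches)


def _batch_size(batch):
    # Private helper: hints are not mandated here, though adding them is harmless.
    return len(batch)
```

Annotating the public entry point is what lets mypy check every caller, even when the helpers behind it are untyped.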
Tools¶
# Install development tools
pip install black isort mypy ruff
# Format code
black src/
isort src/
# Type check
mypy src/
# Lint
ruff check src/
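The same four commands can be chained from a small script, which may be convenient for a pre-commit hook or CI step. This is a minimal sketch: `build_commands` and `run_checks` are hypothetical helper names, and the tools are assumed to be installed and on `PATH`.

```python
import subprocess


def build_commands(target: str = "src/") -> list[list[str]]:
    """Return the format, type-check, and lint commands in the order listed above."""
    return [
        ["black", target],
        ["isort", target],
        ["mypy", target],
        ["ruff", "check", target],
    ]


def run_checks(target: str = "src/") -> int:
    """Run each tool in turn; return the first non-zero exit code, or 0."""
    for command in build_commands(target):
        result = subprocess.run(command, check=False)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_checks("src/")` stops at the first failing tool, so a formatting failure is reported before the slower type check runs.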
Function Documentation¶
Use Google-style docstrings:
def process_data(
    input_path: str,
    output_path: str,
    batch_size: int = 1000
) -> int:
    """
    Process data from input to output path.

    Args:
        input_path: Source data path
        output_path: Destination path
        batch_size: Number of rows per batch

    Returns:
        Number of rows processed

    Raises:
        FileNotFoundError: If input path doesn't exist
    """
    pass
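To show how a body might honor the documented contract, here is one possible implementation. It is a sketch only: it assumes line-oriented UTF-8 text files and treats each line as a row, neither of which the standard specifies, and the extra `ValueError` check is an addition for illustration.

```python
from pathlib import Path


def process_data(
    input_path: str,
    output_path: str,
    batch_size: int = 1000,
) -> int:
    """
    Copy rows from input to output path in batches (illustrative only).

    Args:
        input_path: Source data path
        output_path: Destination path
        batch_size: Number of rows per batch

    Returns:
        Number of rows processed

    Raises:
        FileNotFoundError: If input path doesn't exist
        ValueError: If batch_size is less than 1
    """
    source = Path(input_path)
    if not source.is_file():
        raise FileNotFoundError(f"Input path does not exist: {input_path}")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    rows_processed = 0
    batch: list[str] = []
    with source.open(encoding="utf-8") as src:
        with open(output_path, "w", encoding="utf-8") as dst:
            for line in src:
                batch.append(line)
                if len(batch) == batch_size:
                    dst.writelines(batch)
                    rows_processed += len(batch)
                    batch.clear()
            # Flush the final, possibly partial, batch.
            dst.writelines(batch)
            rows_processed += len(batch)
    return rows_processed
```

Each exception the body can raise appears in the `Raises` section, which keeps the docstring and the behavior in step.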
SQL Standards¶
Naming Conventions¶
| Object | Convention | Example |
|---|---|---|
| Tables | | |
| Columns | | |
| Primary Keys | | |
| Foreign Keys | | |
Formatting¶
-- Use uppercase for keywords
-- Align clauses for readability
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o
    ON c.customer_id = o.customer_id
WHERE c.status = 'active'
GROUP BY c.customer_id, c.name
HAVING COUNT(o.order_id) > 0
ORDER BY order_count DESC;
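To see what the formatted query returns, it can be run against an in-memory SQLite database. The schema and sample rows below are assumptions made up for this sketch; only the query text comes from the standard. Note that `HAVING COUNT(o.order_id) > 0` drops customers with no orders, so the `LEFT JOIN` here yields the same rows an inner join would.

```python
import sqlite3

QUERY = """
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o
    ON c.customer_id = o.customer_id
WHERE c.status = 'active'
GROUP BY c.customer_id, c.name
HAVING COUNT(o.order_id) > 0
ORDER BY order_count DESC;
"""

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, just enough to exercise every clause.
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        status TEXT
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER
    );
    INSERT INTO customers VALUES
        (1, 'Ada', 'active'),
        (2, 'Bo', 'active'),
        (3, 'Cy', 'inactive'),
        (4, 'Di', 'active');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3), (13, 4);
    """
)
rows = conn.execute(QUERY).fetchall()
conn.close()

print(rows)  # [(1, 'Ada', 2), (4, 'Di', 1)]
```

Bo is active but has no orders and is removed by `HAVING`; Cy has an order but is filtered out by `WHERE`.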
Tip
Use SQLFluff for automated SQL linting.