NexusCS

Terragrunt

DevOps
Terragrunt is a thin wrapper for Terraform/OpenTofu that provides extra tools for keeping configurations DRY, working with multiple modules, and managing remote state.

Getting started

Installation

# Quick install (Linux/macOS)
curl -sL https://terragrunt.gruntwork.io/install | bash

# Homebrew
brew install terragrunt

# Binary download
# Download from GitHub releases
# https://github.com/gruntwork-io/terragrunt/releases

Requirements

  • OpenTofu >= 1.6.0 or Terraform >= 0.12.0
  • Latest version: v0.99.1

Basic Commands

# Initialize
terragrunt init

# Plan changes
terragrunt plan

# Apply changes
terragrunt apply

# Destroy infrastructure
terragrunt destroy

# Show outputs
terragrunt output

Run-All Commands

# Run across all units
terragrunt run --all plan
terragrunt run --all apply
terragrunt run --all destroy

# Run on dependency graph
terragrunt run --graph apply

# Limit parallelism
terragrunt run --all apply --parallelism 5
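
When run --all is scripted, it can help to print the exact command line before executing it. The sketch below only builds the command as a string; run_all_cmd is a hypothetical helper name (not part of Terragrunt), and the default cap of 5 simply mirrors the example above.

```shell
#!/bin/sh
# Sketch: build a `terragrunt run --all` command line with a parallelism cap.
# run_all_cmd is a hypothetical helper, not a Terragrunt feature.

run_all_cmd() {
  cmd="$1"                 # OpenTofu/Terraform command, e.g. plan or apply
  parallelism="${2:-5}"    # cap on concurrent units (assumed default: 5)
  printf 'terragrunt run --all %s --parallelism %s --non-interactive\n' \
    "$cmd" "$parallelism"
}

# Usage:
#   run_all_cmd plan 3            # print the command for review
#   eval "$(run_all_cmd plan 3)"  # then execute it
```

Printing first and executing second keeps the destructive variants (apply, destroy) visible in CI logs.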

Quick Example

# terragrunt.hcl
terraform {
  source = "git::git@github.com:org/modules.git//vpc?ref=v1.0.0"
}

inputs = {
  vpc_name = "main"
  cidr     = "10.0.0.0/16"
  region   = "us-east-1"
}

Core CLI Commands

Primary Patterns

Command                        Description
terragrunt run [cmd]           Run with orchestration
terragrunt run --all [cmd]     Run across all units
terragrunt run --graph [cmd]   Run on dependency graph
terragrunt [cmd]               Direct shortcut (e.g., plan)

Common Commands

# Validation and formatting
terragrunt validate
terragrunt fmt

# State refresh
terragrunt refresh

# Import resources
terragrunt import <addr> <id>

# Show the OpenTofu/Terraform resource graph for the current unit
# (for the graph of units, see terragrunt dag graph)
terragrunt graph

# List providers
terragrunt providers

# Release state lock
terragrunt force-unlock <lock-id>

State Management

# List resources
terragrunt state list

# Show resource details
terragrunt state show <resource>

# Move resources
terragrunt state mv <source> <dest>

# Remove resources
terragrunt state rm <resource>

# Pull state
terragrunt state pull

# Push state
terragrunt state push

# Replace provider
terragrunt state replace-provider <old> <new>

Resource Management

# Mark resource for recreation
terragrunt taint <resource>

# Unmark tainted resource
terragrunt untaint <resource>

Backend Commands

Backend Operations

# Create backend resources (S3 bucket, DynamoDB table)
terragrunt backend bootstrap

# Migrate state between backends
terragrunt backend migrate

# Delete backend resources
terragrunt backend delete

Example Bootstrap

# Creates S3 bucket with versioning, encryption
# Creates DynamoDB table for locking
terragrunt backend bootstrap

Stack Commands

Stack Operations

# Generate from stack file
terragrunt stack generate

# Run on stack
terragrunt stack run [cmd]

# Show stack outputs
terragrunt stack output

# Clean generated files
terragrunt stack clean

HCL Operations

# Format HCL files
terragrunt hcl fmt

# Validate HCL syntax
terragrunt hcl validate

Discovery Commands

Finding and Listing

# Find configurations
terragrunt find

# List configurations
terragrunt list

# Browse modules (TUI)
terragrunt catalog

# Generate from catalog
terragrunt scaffold

Debug and Inspection

# Show merged config
terragrunt render

# Print debug info
terragrunt info print

# Show DOT format graph
terragrunt dag graph

# Generate visual graph
terragrunt dag graph | dot -Tpng > graph.png
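
The pipeline above assumes Graphviz (dot) is installed. A minimal sketch of a wrapper that falls back to saving the DOT source when it is not; render_dag is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: render the unit dependency graph, falling back to raw DOT output
# when Graphviz is missing. render_dag is a hypothetical helper.

render_dag() {
  out="${1:-graph.png}"
  if command -v dot >/dev/null 2>&1; then
    terragrunt dag graph | dot -Tpng > "$out"
    echo "wrote $out"
  else
    # Keep the DOT source so it can be rendered on another machine.
    terragrunt dag graph > "${out%.png}.dot"
    echo "wrote ${out%.png}.dot"
  fi
}

# Usage:
#   render_dag deps.png
```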

Filtering Units

# Run only on units that include a given file
terragrunt run --all plan --units-that-include path/to/file

Global Flags

Logging Flags

Flag                   Env Variable            Description
--log-level            TG_LOG_LEVEL            trace, debug, info, warn, error
--log-format           TG_LOG_FORMAT           bare, key-value, json
--log-custom-format    TG_LOG_CUSTOM_FORMAT    Custom format string
--log-disable          TG_LOG_DISABLE          Disable logging
--log-show-abs-paths   TG_LOG_SHOW_ABS_PATHS   Show absolute paths
--no-color             TG_NO_COLOR             Disable colors

Execution Flags

Flag                Env Variable         Description
--non-interactive   TG_NON_INTERACTIVE   Auto-approve prompts
--working-dir       TG_WORKING_DIR       Set working directory
--experiment        TG_EXPERIMENT        Enable experiment
--strict-mode       TG_STRICT_MODE       Enable all strict controls

Examples

# Debug logging
terragrunt plan --log-level debug

# JSON logging for automation
terragrunt apply --log-format json --non-interactive

# Custom working directory
terragrunt plan --working-dir /path/to/module

Run Command Flags (Hidden Gems)

Stack/Graph Control

Flag                 Description
--all / TG_ALL       Run on all units
--graph / TG_GRAPH   Run on dependency graph
--filter             Filter units by query (multiple)
--filter-affected    Filter by git changes

Queue Management

# Include/exclude specific directories
--queue-include-dir path/to/dir
--queue-exclude-dir path/to/exclude

# Include external dependencies
--queue-include-external

# Exclude external dependencies
--queue-exclude-external

# Include units reading specific file
--queue-include-units-reading path/to/file

# Load exclusion patterns from file
--queue-excludes-file .terragruntignore

# Ignore dependency order
--queue-ignore-dag-order

# Continue on errors
--queue-ignore-errors

# Only run explicitly included units
--queue-strict-include

# Max parallel executions
--parallelism 5

Filter Examples

# Filter by path glob
terragrunt run --all plan \
  --filter './prod/**'

# Filter affected by git changes
terragrunt run --all plan \
  --filter-affected

# Multiple filters (single quotes keep the shell from expanding "!")
terragrunt run --all apply \
  --filter './prod/us-east-1/**' \
  --filter '!./**/test/**'

Source Control Flags

Source Overrides

Flag              Env Variable       Description
--source          TG_SOURCE          Override module source
--source-map      TG_SOURCE_MAP      Map source paths
--source-update   TG_SOURCE_UPDATE   Update cached sources
--download-dir    TG_DOWNLOAD_DIR    Module cache directory

Examples

# Override source for local development
terragrunt plan --source ../local-modules/vpc

# Update cached modules
terragrunt init --source-update

# Custom cache directory
terragrunt apply --download-dir /tmp/tg-cache

Authentication Flags

AWS IAM Flags

Flag                                   Env Variable                            Description
--iam-assume-role                      TG_IAM_ASSUME_ROLE                      AWS IAM role ARN
--iam-assume-role-duration             TG_IAM_ASSUME_ROLE_DURATION             Session duration
--iam-assume-role-session-name         TG_IAM_ASSUME_ROLE_SESSION_NAME         Session name
--iam-assume-role-web-identity-token   TG_IAM_ASSUME_ROLE_WEB_IDENTITY_TOKEN   Web identity token

Auth Provider Command

Flag                  Env Variable           Description
--auth-provider-cmd   TG_AUTH_PROVIDER_CMD   Dynamic auth command

Examples

# Assume IAM role
terragrunt apply \
  --iam-assume-role arn:aws:iam::123456789012:role/Admin

# With custom session name
terragrunt apply \
  --iam-assume-role arn:aws:iam::123456789012:role/Admin \
  --iam-assume-role-session-name my-session

# Dynamic auth provider (the command must print credentials as JSON on stdout)
terragrunt apply \
  --auth-provider-cmd /path/to/auth-script.sh
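
Terragrunt reads the auth provider command's stdout as JSON. A minimal sketch of such a script follows, assuming the awsCredentials / envs response shape from the Terragrunt documentation; print_auth_json and DEPLOY_ENV are hypothetical names, and a real script would fetch short-lived credentials from a secrets manager or SSO tool instead of echoing environment variables.

```shell
#!/bin/sh
# Sketch of a script body for --auth-provider-cmd.
# Assumption: Terragrunt accepts JSON with "awsCredentials" and "envs" keys.
# print_auth_json and DEPLOY_ENV are illustrative names only.

print_auth_json() {
  cat <<EOF
{
  "awsCredentials": {
    "ACCESS_KEY_ID": "${AWS_ACCESS_KEY_ID:-}",
    "SECRET_ACCESS_KEY": "${AWS_SECRET_ACCESS_KEY:-}",
    "SESSION_TOKEN": "${AWS_SESSION_TOKEN:-}"
  },
  "envs": {
    "DEPLOY_ENV": "${DEPLOY_ENV:-dev}"
  }
}
EOF
}

# In a standalone script the last line would simply be:
#   print_auth_json
```

Only the JSON should go to stdout; send any diagnostics to stderr so they do not corrupt the response.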

Plan/Apply Flags

Output Directories

Flag             Env Variable      Description
--out-dir        TG_OUT_DIR        Directory for plan files
--json-out-dir   TG_JSON_OUT_DIR   Directory for JSON plans
--inputs-debug   TG_INPUTS_DEBUG   Write terragrunt-debug.tfvars.json

Auto Features

# Disable auto-init
--no-auto-init

# Disable auto-retry
--no-auto-retry

# Don't add -auto-approve
--no-auto-approve

# Disable auto provider cache
--no-auto-provider-cache-dir

Examples

# Save plans for review
terragrunt run --all plan \
  --out-dir /tmp/plans

# Save JSON plans for analysis
terragrunt run --all plan \
  --json-out-dir /tmp/json-plans

# Debug inputs
terragrunt apply --inputs-debug
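
Plans saved with --out-dir can be applied later by passing the same directory to apply. The sketch below guards against applying from an empty plan directory. PLAN_DIR and the helper names are hypothetical, and the *.tfplan file pattern is an assumption about how Terragrunt names saved plans.

```shell
#!/bin/sh
# Sketch: plan once, review, then apply the saved plans.
# Assumptions: plan files end in .tfplan; PLAN_DIR is a location you choose.

PLAN_DIR="${PLAN_DIR:-/tmp/plans}"

plan_all() {
  terragrunt run --all plan --out-dir "$PLAN_DIR"
}

# Count saved plan files so a missing or empty directory is caught early.
count_plans() {
  find "$PLAN_DIR" -type f -name '*.tfplan' 2>/dev/null | wc -l | tr -d ' '
}

apply_saved() {
  if [ "$(count_plans)" -eq 0 ]; then
    echo "no plan files under $PLAN_DIR" >&2
    return 1
  fi
  terragrunt run --all apply --out-dir "$PLAN_DIR"
}
```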

Provider Cache Flags

Cache Server

Flag                              Description
--provider-cache                  Enable provider cache server
--provider-cache-dir              Cache directory
--provider-cache-hostname         Server hostname
--provider-cache-port             Server port (int)
--provider-cache-token            Auth token
--provider-cache-registry-names   Registry names (multiple)

Example

# Enable provider cache
terragrunt run --all init \
  --provider-cache \
  --provider-cache-dir ~/.terragrunt/provider-cache \
  --provider-cache-port 5758

Benefits: Speeds up terraform init across multiple modules by sharing provider downloads.

Miscellaneous Flags

Config and Execution

# Custom config file
--config custom-terragrunt.hcl

# Path to tofu/terraform binary
--tf-path /usr/local/bin/tofu

# Forward TF stdout directly
--tf-forward-stdout

# Set feature flags
--feature flag_name=true

Dependency Optimization

# Fetch dependency outputs from state
--dependency-fetch-output-from-state

# Use partial parse config cache
--use-partial-parse-config-cache

# Disable backend resource updates
--disable-bucket-update

# Check dependencies on destroy
--destroy-dependencies-check

Example

# Faster dependency resolution
terragrunt apply \
  --dependency-fetch-output-from-state

# Use OpenTofu instead of Terraform
terragrunt apply --tf-path /usr/local/bin/tofu

HCL Syntax - terraform Block

Basic Configuration

terraform {
  # Module source
  source = "./modules/vpc"

  # Copy control
  include_in_copy = ["*.tf", "*.tfvars"]
  exclude_from_copy = ["test/**"]
  copy_terraform_lock_file = true
}

Extra Arguments

terraform {
  extra_arguments "custom_vars" {
    commands = ["plan", "apply"]

    # Additional arguments
    arguments = ["-var-file=custom.tfvars"]

    # Environment variables
    env_vars = {
      TF_VAR_foo = "bar"
    }

    # Variable files
    required_var_files = ["common.tfvars"]
    optional_var_files = ["optional.tfvars"]
  }
}

HCL Syntax - Hooks

Before Hook

terraform {
  before_hook "before_init" {
    commands     = ["init"]
    execute      = ["echo", "Before init"]
    working_dir  = "."
    run_on_error = false
  }
}

After Hook

terraform {
  after_hook "after_apply" {
    commands = ["apply"]
    execute  = ["echo", "After apply"]
  }
}

Error Hook

terraform {
  error_hook "on_error" {
    commands   = ["apply", "plan"]
    execute    = ["echo", "Error occurred"]
    on_errors  = [".*timeout.*"]
  }
}

Common Hook Patterns

# Hook blocks belong inside a terraform { } block.

# Validate before plan
before_hook "validate" {
  commands = ["plan"]
  execute  = ["terraform", "validate"]
}

# Notify after apply
after_hook "notify" {
  commands = ["apply"]
  execute  = ["slack-notify", "Applied changes"]
}

# Cleanup on error (error_hook fires when output matches on_errors)
error_hook "cleanup" {
  commands  = ["apply"]
  execute   = ["./cleanup.sh"]
  on_errors = [".*"]
}

HCL Syntax - remote_state Block

S3 Backend

remote_state {
  backend = "s3"

  config = {
    bucket         = "my-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
}
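
The key expression above resolves to the unit's path relative to the included root config, plus the state file name. A rough shell illustration of that mapping (not Terragrunt's implementation; state_key_for is a hypothetical helper):

```shell
#!/bin/sh
# Sketch: what "${path_relative_to_include()}/terraform.tfstate" evaluates to
# for a unit directory nested under the root config's directory.

state_key_for() {
  root_dir="${1%/}"   # directory holding the root config
  unit_dir="${2%/}"   # directory holding the unit's terragrunt.hcl
  rel="${unit_dir#"$root_dir"/}"
  printf '%s/terraform.tfstate\n' "$rel"
}

# Usage:
#   state_key_for /repo/live /repo/live/prod/us-east-1/vpc
```

Because the key follows the directory layout, moving a unit to a new folder changes its state key; migrate the state before or after such a move.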

S3-Specific Options

remote_state {
  backend = "s3"

  config = {
    # ... basic config ...

    # Skip specific configurations
    skip_bucket_versioning             = false
    skip_bucket_ssencryption           = false
    skip_bucket_root_access            = false
    skip_bucket_enforced_tls           = false
    skip_bucket_public_access_blocking = false
    disable_bucket_update              = false

    # Enable DynamoDB encryption
    enable_lock_table_ssencryption = true

    # Tags
    s3_bucket_tags      = { Team = "ops" }
    dynamodb_table_tags = { Team = "ops" }

    # Access logging
    accesslogging_bucket_name   = "my-logs"
    accesslogging_target_prefix = "tf-state/"

    # KMS encryption
    bucket_sse_algorithm  = "aws:kms"
    bucket_sse_kms_key_id = "alias/my-key"
  }
}

Other Options

remote_state {
  backend = "s3"
  config  = { ... }

  # Disable init during plan/apply
  disable_init = false

  # Disable dependency optimization
  disable_dependency_optimization = false
}

HCL Syntax - include Block

Basic Include

include "root" {
  path = find_in_parent_folders()
}

Advanced Include

include "root" {
  path = find_in_parent_folders()

  # Expose parent config to child
  expose = true

  # Merge strategy
  merge_strategy = "deep"  # shallow, deep, no_merge
}

Accessing Parent Config

include "root" {
  path   = find_in_parent_folders()
  expose = true
}

# Access parent locals
inputs = {
  region = include.root.locals.region
  tags   = include.root.locals.common_tags
}

Multiple Includes

include "root" {
  path   = find_in_parent_folders("root.hcl")
  expose = true
}

include "region" {
  path   = find_in_parent_folders("region.hcl")
  expose = true
}

inputs = merge(
  include.root.locals,
  include.region.locals,
)

HCL Syntax - dependency Block

Basic Dependency

dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}

With Mock Outputs

dependency "vpc" {
  config_path = "../vpc"

  # Mock outputs for plan without deployment
  mock_outputs = {
    vpc_id     = "vpc-fake"
    subnet_ids = ["subnet-fake-1", "subnet-fake-2"]
  }

  # Allow mocks for these commands
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]

  # Merge strategy with state
  mock_outputs_merge_strategy_with_state = "shallow"
}

Advanced Options

dependency "networking" {
  config_path = "../networking"

  # Skip fetching outputs (for non-output deps)
  skip_outputs = false

  mock_outputs = {
    vpc_id = "vpc-fake"
  }

  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

Multiple Dependencies

dependency "networking" {
  config_path = "../networking"
  mock_outputs = { vpc_id = "vpc-fake" }
  mock_outputs_allowed_terraform_commands = ["plan"]
}

dependency "security" {
  config_path = "../security"
  mock_outputs = { sg_id = "sg-fake" }
  mock_outputs_allowed_terraform_commands = ["plan"]
}

inputs = {
  vpc_id = dependency.networking.outputs.vpc_id
  sg_id  = dependency.security.outputs.sg_id
}

HCL Syntax - Other Core Blocks

locals Block

locals {
  environment = "prod"
  region      = "us-east-1"

  common_tags = {
    Environment = local.environment
    ManagedBy   = "Terragrunt"
  }
}

inputs Block

# Pass to Terraform as variables
inputs = {
  environment = local.environment
  region      = local.region
  tags        = local.common_tags
}

dependencies Block

# Non-output dependencies
# (run before this module, but don't fetch outputs)
dependencies {
  paths = [
    "../networking",
    "../security"
  ]
}

generate Block

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite"  # overwrite, overwrite_terragrunt, skip, error
  contents  = <<EOF
provider "aws" {
  region = "${local.region}"
}
EOF
}

engine Block

engine {
  source  = "github.com/gruntwork-io/terragrunt-engine-opentofu"
  version = "v0.1.0"
  type    = "rpc"
}

feature Block

feature "my_feature" {
  default = true
}

# Read the value elsewhere as feature.my_feature.value

exclude Block

# Exclude from run --all commands
# (get_env without a default fails when the variable is unset)
exclude {
  actions = ["apply", "destroy"]
  if      = get_env("ENV", "") == "prod"
}

errors Block

errors {
  retry "transient" {
    max_attempts       = 3
    sleep_interval_sec = 5

    retryable_errors = [
      ".*timeout.*",
      ".*connection reset.*"
    ]
  }
}

HCL Functions - Path/Directory

Finding Files

# Find parent terragrunt.hcl
find_in_parent_folders()

# Find specific file
find_in_parent_folders("root.hcl")

Path Functions

# Relative path to include
path_relative_to_include()

# Relative path from include
path_relative_from_include()

# Current directory
get_terragrunt_dir()

# Working directory for terraform
get_working_dir()

# Parent config directory
get_parent_terragrunt_dir()

# Original config directory
get_original_terragrunt_dir()

Repository Functions

# Git repo root
get_repo_root()

# Path from repo root
get_path_from_repo_root()

# Relative path to repo root
get_path_to_repo_root()

Example Usage

locals {
  # Path to the nearest parent config
  root_config = find_in_parent_folders()

  # Relative path for state key
  state_key = path_relative_to_include()

  # Repo-relative path
  module_path = get_path_from_repo_root()
}

HCL Functions - Environment/Platform

Environment Variables

# Get environment variable
get_env("VAR_NAME")

# With default value
get_env("VAR_NAME", "default")

Platform Detection

# OS platform (linux, darwin, windows)
get_platform()

Example Usage

locals {
  environment = get_env("ENV", "dev")
  region      = get_env("AWS_REGION", "us-east-1")

  # Platform-specific config
  binary_path = get_platform() == "darwin" ? "/usr/local/bin" : "/usr/bin"
}

HCL Functions - AWS

AWS Identity Functions

# AWS account ID
get_aws_account_id()

# AWS account alias
get_aws_account_alias()

# Caller identity ARN
get_aws_caller_identity_arn()

# Caller user ID
get_aws_caller_identity_user_id()

Example Usage

locals {
  account_id    = get_aws_account_id()
  account_alias = get_aws_account_alias()

  # Account-specific bucket name
  state_bucket = "terraform-state-${local.account_id}"
}

remote_state {
  backend = "s3"
  config = {
    bucket = "terraform-state-${get_aws_account_id()}"
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = "us-east-1"
  }
}

HCL Functions - Terraform Context

Command Detection

# Current command (plan, apply, etc)
get_terraform_command()

# Current command arguments
get_terraform_cli_args()

Command Lists

# Commands that need -var-file
get_terraform_commands_that_need_vars()

# Commands that need -input
get_terraform_commands_that_need_input()

# Commands that need -lock
get_terraform_commands_that_need_locking()

# Commands that need -parallelism
get_terraform_commands_that_need_parallelism()

Error Patterns

# Default retryable error patterns
get_default_retryable_errors()

Example Usage

terraform {
  extra_arguments "vars" {
    commands = get_terraform_commands_that_need_vars()
    arguments = ["-var-file=common.tfvars"]
  }

  extra_arguments "parallelism" {
    commands = get_terraform_commands_that_need_parallelism()
    arguments = ["-parallelism=10"]
  }
}

HCL Functions - Config/Files

Reading Files

# Parse terragrunt config
read_terragrunt_config("path/terragrunt.hcl")

# Read tfvars file
read_tfvars_file("vars.tfvars")

# Decrypt SOPS file
sops_decrypt_file("secrets.yaml")

File Tracking

# Mark file as read (for filtering)
mark_as_read("path/file")

Example Usage

locals {
  # Read hierarchical configs
  account_vars = read_terragrunt_config(
    find_in_parent_folders("account.hcl")
  )

  region_vars = read_terragrunt_config(
    find_in_parent_folders("region.hcl")
  )

  # Read secrets (sops_decrypt_file returns text, so decode it)
  secrets = yamldecode(sops_decrypt_file("secrets.enc.yaml"))

  # Extract values
  account_id = local.account_vars.locals.account_id
  region     = local.region_vars.locals.region
}

HCL Functions - Execution

run_cmd Function

# Execute command
run_cmd("echo", "hello")

# Multiple arguments
run_cmd("aws", "sts", "get-caller-identity")

Execution Flags

# Quiet mode (suppress output)
run_cmd("--terragrunt-quiet", "aws", "sts", "get-caller-identity")

# Global cache (cache result)
run_cmd("--terragrunt-global-cache", "git", "rev-parse", "HEAD")

# No cache
run_cmd("--terragrunt-no-cache", "date")

Example Usage

locals {
  # Get git commit hash
  git_hash = run_cmd("--terragrunt-quiet", "git", "rev-parse", "HEAD")

  # Get AWS account ID dynamically
  account_id = run_cmd(
    "--terragrunt-quiet",
    "aws", "sts", "get-caller-identity",
    "--query", "Account",
    "--output", "text"
  )

  # Get current timestamp
  timestamp = run_cmd("--terragrunt-no-cache", "date", "+%Y%m%d%H%M%S")
}

⚠️ Warning: run_cmd executes during parsing, not during Terraform execution. Use sparingly as it can slow down parsing.

HCL Functions - Miscellaneous

Other Functions

# Get --source flag value
get_terragrunt_source_cli_flag()

# Check version constraint
constraint_check("1.5.0", ">= 1.0")

Example Usage

locals {
  # Check if source override is set
  source_override = get_terragrunt_source_cli_flag()

  # Conditional logic based on version
  use_new_feature = constraint_check(
    get_env("TF_VERSION", "1.0.0"),
    ">= 1.5.0"
  )
}

Common Pattern - DRY Root Config

Directory Structure

.
├── terragrunt.hcl          # Root config
├── account.hcl             # Account-specific vars
├── region.hcl              # Region-specific vars
├── prod/
│   ├── us-east-1/
│   │   ├── vpc/
│   │   │   └── terragrunt.hcl
│   │   ├── eks/
│   │   │   └── terragrunt.hcl
│   │   └── rds/
│   │       └── terragrunt.hcl
└── dev/
    └── us-east-1/
        └── ...
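
To try the layout quickly, the tree above can be scaffolded with a few lines of shell. This is only a sketch: scaffold_live is a hypothetical helper, and it creates empty files that still need real configuration.

```shell
#!/bin/sh
# Sketch: create the directory layout shown above with empty config files.
# scaffold_live is a hypothetical helper; existing files are left untouched.

scaffold_live() {
  root="${1:-.}"
  mkdir -p "$root"
  for f in terragrunt.hcl account.hcl region.hcl; do
    touch "$root/$f"
  done
  for unit in vpc eks rds; do
    mkdir -p "$root/prod/us-east-1/$unit"
    touch "$root/prod/us-east-1/$unit/terragrunt.hcl"
  done
  mkdir -p "$root/dev/us-east-1"
}

# Usage:
#   scaffold_live ./live
```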

Root terragrunt.hcl

locals {
  # Read hierarchical configs
  account_vars = read_terragrunt_config(
    find_in_parent_folders("account.hcl")
  )
  region_vars = read_terragrunt_config(
    find_in_parent_folders("region.hcl")
  )

  # Extract values
  account_id = local.account_vars.locals.account_id
  region     = local.region_vars.locals.region
}

# Configure remote state for all children
remote_state {
  backend = "s3"
  config = {
    bucket         = "terraform-state-${local.account_id}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.region
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# Pass common inputs to all children
inputs = merge(
  local.account_vars.locals,
  local.region_vars.locals,
)

account.hcl

locals {
  account_id   = "123456789012"
  account_name = "production"
}

region.hcl

locals {
  region = "us-east-1"
  azs    = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

Child terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::git@github.com:org/modules.git//vpc?ref=v1.0.0"
}

inputs = {
  vpc_name = "main"
  cidr     = "10.0.0.0/16"

  # Inherits account_id, region, etc. from root
}

Common Pattern - Dependency Management

Basic Dependencies

dependency "networking" {
  config_path = "../networking"
}

dependency "security" {
  config_path = "../security"
}

inputs = {
  vpc_id = dependency.networking.outputs.vpc_id
  sg_id  = dependency.security.outputs.sg_id
}

With Mock Outputs

dependency "networking" {
  config_path = "../networking"

  # Mock for plan without dependencies deployed
  mock_outputs = {
    vpc_id     = "vpc-fake"
    subnet_ids = ["subnet-fake-1", "subnet-fake-2"]
  }

  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

dependency "security" {
  config_path = "../security"

  mock_outputs = {
    sg_id = "sg-fake"
  }

  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id     = dependency.networking.outputs.vpc_id
  subnet_ids = dependency.networking.outputs.subnet_ids
  sg_id      = dependency.security.outputs.sg_id
}

Avoiding Circular Dependencies

# Use dependencies block for non-output deps
dependencies {
  paths = ["../networking", "../security"]
}

# Or use skip_outputs
dependency "networking" {
  config_path  = "../networking"
  skip_outputs = true  # Don't fetch outputs
}

Fetching from State

# Faster dependency resolution
terragrunt apply --dependency-fetch-output-from-state

Common Pattern - IAM Role Assumption

Using CLI Flag

# Assume role for single command
terragrunt apply \
  --iam-assume-role arn:aws:iam::123456789012:role/Admin

# With session name and duration
terragrunt apply \
  --iam-assume-role arn:aws:iam::123456789012:role/Admin \
  --iam-assume-role-session-name my-session \
  --iam-assume-role-duration 3600
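
A typo in the role ARN usually surfaces only as an AWS error partway through a run. A small sketch of checking the ARN shape first; is_role_arn and assume_and_apply are hypothetical helpers, and the pattern is a simplified approximation of the IAM role ARN format.

```shell
#!/bin/sh
# Sketch: validate an IAM role ARN before handing it to Terragrunt.
# The regex is a simplification (12-digit account ID, role/ path).

is_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws(-[a-z]+)*:iam::[0-9]{12}:role/.+$'
}

assume_and_apply() {
  role="$1"
  duration="${2:-3600}"   # session duration in seconds
  if ! is_role_arn "$role"; then
    echo "not an IAM role ARN: $role" >&2
    return 1
  fi
  terragrunt apply \
    --iam-assume-role "$role" \
    --iam-assume-role-duration "$duration"
}
```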

Using HCL Config

# terragrunt.hcl
# Terragrunt assumes this role before calling OpenTofu/Terraform
iam_role                     = "arn:aws:iam::123456789012:role/Admin"
iam_assume_role_duration     = 3600
iam_assume_role_session_name = "my-session"

Using generate Block

locals {
  region   = "us-east-1"
  role_arn = "arn:aws:iam::123456789012:role/Admin"
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.region}"

  assume_role {
    role_arn = "${local.role_arn}"
  }
}
EOF
}

Common Pattern - Auto-Retry

Basic Auto-Retry

errors {
  retry "transient" {
    max_attempts       = 3
    sleep_interval_sec = 5

    retryable_errors = [
      "(?s).*timeout.*",
      "(?s).*connection reset.*",
    ]
  }
}

AWS-Specific Errors

errors {
  retry "aws_transient" {
    max_attempts       = 5
    sleep_interval_sec = 10

    retryable_errors = [
      # Throttling
      "(?s).*TooManyRequestsException.*",
      "(?s).*RequestLimitExceeded.*",
      "(?s).*Throttling.*",

      # Timeouts
      "(?s).*timeout.*",
      "(?s).*connection reset.*",
      "(?s).*EOF.*",

      # Transient errors
      "(?s).*InternalError.*",
      "(?s).*ServiceUnavailable.*",
    ]
  }
}

Global Default

Use get_default_retryable_errors() for standard patterns:

errors {
  retry "defaults" {
    max_attempts       = 3
    sleep_interval_sec = 5
    retryable_errors   = get_default_retryable_errors()
  }
}

Common Pattern - Provider Generation

AWS Provider

locals {
  region      = "us-east-1"
  environment = "prod"
  role_arn    = "arn:aws:iam::123456789012:role/Admin"
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.region}"

  assume_role {
    role_arn = "${local.role_arn}"
  }

  default_tags {
    tags = {
      Environment = "${local.environment}"
      ManagedBy   = "Terragrunt"
    }
  }
}
EOF
}

Multiple Providers

generate "providers" {
  path      = "providers.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.region}"
}

provider "aws" {
  alias  = "secondary"
  region = "us-west-2"
}

provider "kubernetes" {
  host = module.eks.cluster_endpoint
}
EOF
}

Versions

generate "versions" {
  path      = "versions.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
EOF
}

Gotchas and Solutions

1. Circular Dependencies

Error: Cycle detected in dependencies

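Solution: Break the cycle so dependencies flow in one direction. Where a unit needs only ordering, not outputs, use skip_outputs or a dependencies block: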

# Option 1: Skip outputs
dependency "networking" {
  config_path  = "../networking"
  skip_outputs = true
}

# Option 2: Use dependencies block
dependencies {
  paths = ["../networking"]
}

2. State Lock Issues

Error: Error acquiring the state lock

Solution:

# The lock ID is printed under "Lock Info" in the error output.
# Confirm no other run is in progress before releasing the lock.

# Force unlock (use lock ID from error)
terragrunt force-unlock <lock-id>
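
The lock ID can be pulled out of the captured error output rather than copied by hand. A minimal sketch, assuming the usual "Lock Info:" block with an indented "ID:" line; extract_lock_id is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: extract the lock ID from OpenTofu/Terraform lock error output.
# Assumes the error contains a line of the form "  ID:        <lock-id>".

extract_lock_id() {
  awk '/^[[:space:]]*ID:/ { print $2; exit }'
}

# Usage:
#   lock_id="$(terragrunt plan 2>&1 | extract_lock_id)"
#   [ -n "$lock_id" ] && terragrunt force-unlock "$lock_id"
```

Force-unlocking while another run still holds the lock can corrupt state, so treat the second usage line as a manual step rather than something to automate.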

3. Source Path Resolution

Error: Module not found

Solution: Use absolute paths or git URLs with refs:

terraform {
  # Git with version tag
  source = "git::git@github.com:org/modules.git//vpc?ref=v1.0.0"

  # Or a local absolute path (only one source per terraform block)
  # source = "/absolute/path/to/module"
}

4. Dependency Output Errors

Error: output "vpc_id" not found

Solution: Use mock_outputs:

dependency "vpc" {
  config_path = "../vpc"

  mock_outputs = {
    vpc_id = "vpc-fake"
  }

  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

5. Backend Init Failures

Error: Failed to get existing workspaces

Solution: Bootstrap backend first:

terragrunt backend bootstrap

6. Permission Issues

Error: AccessDenied

Solution: Assume correct IAM role:

terragrunt apply \
  --iam-assume-role arn:aws:iam::123456789012:role/Admin

7. Cache Directory Bloat

.terragrunt-cache taking too much space.

Solution:

# Clean specific unit
rm -rf .terragrunt-cache

# Clean all (from root)
find . -type d -name ".terragrunt-cache" -prune -exec rm -rf {} +

# Use custom location
terragrunt apply --download-dir /tmp/tg-cache
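
Before deleting anything, it can be worth listing the cache directories and their sizes. A small sketch; list_caches and cache_usage are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: list .terragrunt-cache directories and report their disk usage.

# -prune stops find from descending into a cache once it is matched.
list_caches() {
  find "${1:-.}" -type d -name '.terragrunt-cache' -prune -print
}

# Print size in kilobytes next to each cache directory.
cache_usage() {
  list_caches "${1:-.}" | while IFS= read -r dir; do
    du -sk "$dir"
  done
}

# Usage:
#   cache_usage . | sort -rn | head
```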

8. Working Directory Confusion

Error: No configuration files

Solution: Use --working-dir:

terragrunt apply --working-dir /path/to/module

9. Module Version Conflicts

Error: Module version constraint failed

Solution: Pin versions in source URL:

terraform {
  source = "git::git@github.com:org/modules.git//vpc?ref=v1.2.3"
}

10. Run-All Parallelism Issues

Error: Too many concurrent operations

Solution: Limit parallelism:

terragrunt run --all apply --parallelism 5

Best Practices

Directory Structure

.
├── terragrunt.hcl          # Root config
├── account.hcl             # Account vars
├── region.hcl              # Region vars
├── prod/
│   ├── us-east-1/
│   │   ├── vpc/
│   │   │   └── terragrunt.hcl
│   │   ├── eks/
│   │   │   └── terragrunt.hcl
│   │   └── rds/
│   │       └── terragrunt.hcl
└── dev/
    └── us-east-1/
        └── ...

DRY Principles

  • Use find_in_parent_folders() for root config
  • Use locals {} for shared variables
  • Use read_terragrunt_config() for hierarchical configs
  • Use include {} for config inheritance
  • Use generate {} for provider configs

State Management

  • Enable versioning on S3 buckets
  • Use encryption for state files
  • Enable DynamoDB for state locking
  • Use consistent key naming:
    key = "${path_relative_to_include()}/terraform.tfstate"
    
  • Enable access logging:
    accesslogging_bucket_name = "my-logs"
    

Dependencies

  • Use mock_outputs for faster planning
  • Use skip_outputs = true to avoid circular deps
  • Use dependencies {} for non-output dependencies
  • Document dependency graph:
    terragrunt dag graph | dot -Tpng > graph.png
    

Hooks

  • Use before_hook for validations
  • Use after_hook for notifications
  • Use error_hook for cleanup
  • Set run_on_error = true for critical hooks

Performance

  • Use --parallelism for run-all commands
  • Use --filter to target specific units
  • Use --dependency-fetch-output-from-state
  • Use provider cache for large stacks:
    terragrunt run --all init --provider-cache
    
  • Use custom cache directory:
    --download-dir /tmp/tg-cache
    

Security

  • Never commit secrets to version control
  • Use SOPS for encrypted secrets:
    sops_decrypt_file("secrets.enc.yaml")
    
  • Use IAM roles, not access keys
  • Enable encryption on state buckets
  • Use KMS for state encryption:
    bucket_sse_algorithm  = "aws:kms"
    bucket_sse_kms_key_id = "alias/terraform"
    

Version Control

  • Pin module versions:
    source = "git::...?ref=v1.0.0"
    
  • Pin provider versions:
    required_providers {
      aws = {
        version = "~> 5.0"
      }
    }
    
  • Use semantic versioning
  • Tag releases in git

Testing

  • Use mock_outputs for unit testing
  • Test with terragrunt plan first
  • Use --filter to test specific units:
    terragrunt run --all plan --filter './prod/**'
    
  • Validate before apply:
    terragrunt validate
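
The testing steps above can be chained into a single gate that stops at the first failure. A minimal sketch for one unit; preflight is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: run checks in order and stop at the first failure.
# preflight is a hypothetical helper; run it from a unit directory.

preflight() {
  terragrunt hcl validate || return 1   # Terragrunt HCL syntax
  terragrunt validate     || return 1   # OpenTofu/Terraform validation
  terragrunt plan         || return 1   # dry run against real state
  echo "preflight ok"
}

# Usage:
#   preflight && terragrunt apply
```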
    

Advanced Techniques

Dynamic Module Sources

locals {
  environment = get_env("ENV", "dev")

  # Use a pinned tag in prod and a branch elsewhere
  module_ref = local.environment == "prod" ? "v1.0.0" : "develop"
}

terraform {
  source = "git::git@github.com:org/modules.git//vpc?ref=${local.module_ref}"
}

Conditional Configuration

locals {
  # Pass a default so an unset variable does not abort parsing
  is_prod = get_env("ENV", "") == "prod"
}

# Skip apply in prod unless explicitly approved
exclude {
  actions = ["apply", "destroy"]
  if      = local.is_prod && get_env("APPROVED", "") != "true"
}

Environment-Specific Hooks

locals {
  environment = get_env("ENV", "dev")
}

terraform {
  # Only notify in production
  after_hook "notify_prod" {
    commands = ["apply"]
    execute  = local.environment == "prod" ? ["slack-notify"] : ["echo", "skipped"]
  }
}

Custom Backend Bootstrap

remote_state {
  backend = "s3"

  config = {
    bucket = "terraform-state-${get_aws_account_id()}"
    key    = "${path_relative_to_include()}/terraform.tfstate"
    region = local.region

    # Custom tags for compliance
    s3_bucket_tags = {
      Team        = "ops"
      CostCenter  = "engineering"
      Compliance  = "required"
    }

    dynamodb_table_tags = {
      Team = "ops"
    }

    # Enable all security features
    skip_bucket_versioning             = false
    skip_bucket_ssencryption           = false
    skip_bucket_enforced_tls           = false
    skip_bucket_public_access_blocking = false
    enable_lock_table_ssencryption     = true

    # KMS encryption
    bucket_sse_algorithm  = "aws:kms"
    bucket_sse_kms_key_id = "alias/terraform"
  }
}

Multi-Region Deployments

# Deploy to multiple regions
for region in us-east-1 us-west-2 eu-west-1; do
  TG_REGION=$region terragrunt run --all apply \
    --filter "./**/${region}/**"
done

Git-Based Change Detection

# Only apply changes affected by git diff
terragrunt run --all apply --filter-affected

Troubleshooting

Enable Debug Logging

# Trace-level logging
terragrunt plan --log-level trace

# JSON logging for parsing
terragrunt apply --log-format json --log-level debug

Inspect Merged Config

# Show final merged configuration
terragrunt render

Check Dependency Graph

# Show dependency graph
terragrunt dag graph

# Generate visual graph
terragrunt dag graph | dot -Tpng > graph.png

Debug Inputs

# Write inputs to terragrunt-debug.tfvars.json
terragrunt apply --inputs-debug
cat terragrunt-debug.tfvars.json

Validate HCL

# Validate HCL syntax
terragrunt hcl validate

# Format HCL files
terragrunt hcl fmt

Print Configuration

# Print debug info
terragrunt info print

Test Module Resolution

# Override source for local testing
terragrunt plan --source ../local-modules/vpc

# Update cached modules
terragrunt init --source-update

Check State

# List resources in state
terragrunt state list

# Show specific resource
terragrunt state show aws_instance.example
