Making bicep what-if an actual gate in Azure Pipelines: parsing, blocking, and posting the diff

The 2:08am page said a production storage account had vanished. The Bicep what-if had run and printed the deletion to a log nobody read. This is the gate that would have caught it.

04 Aug 2025 | 18 min read | Tags: Bicep, Azure Pipelines, IaC, Pull requests

The page hit my phone at 2:08am on a Wednesday. Storage account 'stprodinvoices01' returned 404 on the warehouse job. The warehouse job reads about 11 GB of invoice blobs out of a container called invoices-2024 and lands them into Synapse. It had been running fine for fourteen months. I sat up in bed because the storage account was not there. The resource group held a new storage account with the same name, created 47 minutes earlier, holding zero blobs.

What had happened was simple and stupid. Someone had merged a Bicep PR at 1:18am. The parameter file had been kind: 'StorageV2' for the life of the module. The PR's parameter file said kind: 'Storage'. Two characters. The deploy pipeline ran az deployment group create on merge, ARM saw a property that requires resource replacement, and deleted the old storage account and created a new one with the same name. The blobs in the old account were not migrated because there is no migration step in a Delete then Create. They were gone. We had a soft delete window of seven days on the container, which is the only reason I am writing this article in 2026 and not in front of a tribunal.

The sleepy ops engineer who fielded the page (me) caught it within forty minutes and restored from soft-delete. The internal impact was the postmortem, which produced the question this article exists to answer: --what-if was already running on every deploy. It had said Delete: stprodinvoices01 in plain text. Nobody saw it. The output was buried at line 1,847 of a build log that nobody opens unless something fails, and the deploy step did not fail because what-if's exit code is zero when it has changes to report.

So I rebuilt the pipeline. The headline is that bicep what-if is now an actual gate: it runs on the PR, not on merge, it parses its own JSON output, it blocks the PR on destructive deltas, and it posts the diff back to the PR as a markdown table that reviewers actually read. Twenty-two months in production across 14 repos. Zero recreated resources in that window.

Why what-if was already there and still failed us

The original pipeline had what-if inside the deploy stage, right before az deployment group create. The shape was the obvious one:

- task: AzureCLI@2
  displayName: 'what-if then deploy'
  inputs:
    azureSubscription: $(serviceConnection)
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      az deployment group what-if -g $(rg) -f main.bicep -p @prod.bicepparam
      az deployment group create  -g $(rg) -f main.bicep -p @prod.bicepparam

Two problems. First, what-if output goes to stdout interleaved with everything else the pipeline is logging. On a moderately busy template the output is two to three hundred lines of colorised diff, and there is no human in the loop between the two commands. Second, it runs on merge to main, after the PR is already approved. The reviewer who approved the PR never saw the what-if output, because the build that produced it had not started yet when they clicked "Approve".

The fix is structural. What-if has to run on the PR. The diff has to land in the PR conversation. The deploy stage on main should not be the place where dangerous changes are first surfaced. Once we made that mental shift, the rest was YAML and jq.

Splitting the validation pipeline from the deploy pipeline

We have two pipeline files. pr-validate.yml runs on PR triggers. deploy.yml runs on merge to main. They reference the same Bicep templates but use two different Service Connections built on workload identity federation, with the PR one scoped to Reader on the target resource group plus the read-only permissions the what-if API needs.

The permission split matters. The what-if call requires Microsoft.Resources/deployments/whatIf/action at the target scope. The built-in Reader role grants only */read, which does not cover that action, so the PR connection pairs Reader with a small custom role that adds it. If the PR pipeline somehow runs az deployment group create, the call fails with AuthorizationFailed, which is a circuit-breaker we want. The full list of permissions is on Microsoft Learn, which I bookmarked after forty minutes wondering why my read-only role assignment was returning 403 AuthorizationFailed.
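A minimal sketch of a role definition for the PR connection, assuming read access plus the what-if action is all the gate needs. The helper name, the role name, and the subscription placeholder are illustrative and not taken from our setup; check the action list against the current Microsoft Learn page before relying on it.

```shell
#!/usr/bin/env bash

# Hypothetical helper: writes a custom role definition that adds the what-if
# action on top of read access. It deliberately grants no deployment writes,
# so `az deployment group create` still fails with AuthorizationFailed.
write_whatif_role() {
  local out="$1" scope="$2"
  cat > "$out" <<EOF
{
  "Name": "Bicep What-If Reader (example)",
  "IsCustom": true,
  "Description": "Read access plus the what-if action; no deployment writes.",
  "Actions": [
    "*/read",
    "Microsoft.Resources/deployments/whatIf/action"
  ],
  "NotActions": [],
  "AssignableScopes": [ "${scope}" ]
}
EOF
}

# Usage (not run here; the subscription id is a placeholder):
#   write_whatif_role whatif-role.json "/subscriptions/<subscription-id>"
#   az role definition create --role-definition @whatif-role.json
```

Keeping the role in a file under source control means a change to the PR identity's permissions goes through review like any other infra change.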

Here is the spine of pr-validate.yml.

pr:
  branches:
    include:
      - main
  paths:
    include:
      - infra/**

trigger: none

pool:
  vmImage: ubuntu-latest

variables:
  - name: serviceConnection
    value: 'sc-platform-prod-readonly'
  - name: resourceGroup
    value: 'rg-invoices-prod-uks'
  - name: templateFile
    value: 'infra/main.bicep'
  - name: paramsFile
    value: 'infra/prod.bicepparam'

stages:
  - stage: WhatIf
    displayName: 'Bicep what-if gate'
    jobs:
      - job: Plan
        displayName: 'Plan and classify'
        steps:
          - checkout: self
            persistCredentials: true

          - task: AzureCLI@2
            name: runWhatIf
            displayName: 'az deployment group what-if'
            inputs:
              azureSubscription: $(serviceConnection)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                set -euo pipefail
                mkdir -p $(Build.ArtifactStagingDirectory)/whatif

                az deployment group what-if \
                  --resource-group "$(resourceGroup)" \
                  --template-file "$(templateFile)" \
                  --parameters "@$(paramsFile)" \
                  --result-format FullResourcePayloads \
                  --no-pretty-print \
                  > $(Build.ArtifactStagingDirectory)/whatif/raw.json

                echo "##[group]what-if raw JSON head"
                head -c 2000 $(Build.ArtifactStagingDirectory)/whatif/raw.json
                echo
                echo "##[endgroup]"

          - task: Bash@3
            name: classify
            displayName: 'Classify deltas'
            inputs:
              targetType: filePath
              filePath: scripts/classify-whatif.sh
              arguments: >-
                $(Build.ArtifactStagingDirectory)/whatif/raw.json
                $(Build.ArtifactStagingDirectory)/whatif/diff.md
                $(Build.ArtifactStagingDirectory)/whatif/verdict.txt

          - task: AzureCLI@2
            displayName: 'Post diff to PR'
            condition: and(succeededOrFailed(), eq(variables['Build.Reason'], 'PullRequest'))
            inputs:
              azureSubscription: $(serviceConnection)
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                bash scripts/post-pr-comment.sh \
                  "$(Build.ArtifactStagingDirectory)/whatif/diff.md" \
                  "$(Build.ArtifactStagingDirectory)/whatif/verdict.txt"
            env:
              SYSTEM_ACCESSTOKEN: $(System.AccessToken)

          - task: Bash@3
            displayName: 'Fail on destructive verdict'
            condition: succeededOrFailed()
            inputs:
              targetType: inline
              script: |
                VERDICT=$(cat "$(Build.ArtifactStagingDirectory)/whatif/verdict.txt" 2>/dev/null || echo "ERROR")
                echo "Verdict: $VERDICT"
                # Fail closed: anything other than an explicit PASS (BLOCK, ERROR, garbage) fails the build.
                if [[ "$VERDICT" != "PASS" ]]; then
                  echo "##vso[task.logissue type=error]What-if verdict is $VERDICT. PR cannot merge until reviewed."
                  exit 1
                fi

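trigger: none keeps this pipeline from running on ordinary pushes. One caveat on the pr: block: Azure Repos Git ignores YAML pr: triggers (they apply to GitHub and Bitbucket Cloud repositories), so what actually starts the run on a PR is the build validation policy on main. That policy references this pipeline as a required build, which is also why the PR cannot merge while the pipeline is red; the infra/** path filter has to be repeated on the policy for it to take effect. The four tasks form the gate: run what-if to JSON, classify the deltas, post a comment back to the PR, fail the build if the classification says block. Everything interesting lives in classify-whatif.sh and post-pr-comment.sh.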
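The required-build policy on main can be created from the CLI rather than clicked together in the portal. This is a sketch, assuming the azure-devops extension for az is installed and that the organization and project defaults are already configured; the function name is hypothetical, and the expiry and path-filter values are illustrative choices rather than our exact settings.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around `az repos policy build create`.
#   $1 = repository id (GUID), $2 = numeric id of the pr-validate pipeline
create_whatif_build_policy() {
  local repo_id="$1" build_definition_id="$2"
  az repos policy build create \
    --repository-id "$repo_id" \
    --branch main \
    --build-definition-id "$build_definition_id" \
    --display-name "Bicep what-if gate" \
    --blocking true \
    --enabled true \
    --manual-queue-only false \
    --queue-on-source-update-only true \
    --valid-duration 0 \
    --path-filter "/infra/*"
}

# Usage (not run here):
#   create_whatif_build_policy "<repository-guid>" 42
```

With --blocking true the policy is what makes a red run stop the merge; the path filter keeps the gate from firing on PRs that never touch infra/.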

The JSON shape, and why FullResourcePayloads matters

The what-if command output, when invoked with --no-pretty-print --result-format FullResourcePayloads, is JSON with a top-level changes array. Each change has a stable schema. The fields I care about are changeType, resourceId, before, after, and delta. The delta field is itself an array of path-based diffs (one entry per property change).

A truncated example, from one of our real what-if runs:

{
  "changes": [
    {
      "changeType": "Delete",
      "resourceId": "/subscriptions/.../resourceGroups/rg-invoices-prod-uks/providers/Microsoft.Storage/storageAccounts/stprodinvoices01",
      "before": { "name": "stprodinvoices01", "kind": "StorageV2", "sku": { "name": "Standard_GRS" } },
      "after": null,
      "delta": null
    },
    {
      "changeType": "Modify",
      "resourceId": "/subscriptions/.../resourceGroups/rg-invoices-prod-uks/providers/Microsoft.KeyVault/vaults/kv-invoices-prod",
      "delta": [
        {
          "path": "properties.networkAcls.defaultAction",
          "propertyChangeType": "Modify",
          "before": "Allow",
          "after": "Deny"
        }
      ]
    },
    {
      "changeType": "Create",
      "resourceId": "/subscriptions/.../resourceGroups/rg-invoices-prod-uks/providers/Microsoft.Storage/storageAccounts/stprodinvoices01",
      "after": { "name": "stprodinvoices01", "kind": "Storage", "sku": { "name": "Standard_LRS" } }
    }
  ]
}

Three things to notice. First, the Delete followed by Create with the same name is exactly the recreate pattern that caused our incident; the two entries are independent in the JSON, so the classifier has to spot the pair. Second, the Modify delta array gives a per-property breakdown, which makes per-property decisions possible. Third, only FullResourcePayloads gives the full before/after we need for the markdown table. The mode flag is documented in the what-if CLI reference.
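Before the full classifier, it helps to see how little jq is needed to turn that JSON into something a script can branch on. A small sketch, assuming jq is on the agent and the file has the shape shown above; both function names are hypothetical.

```shell
#!/usr/bin/env bash

# Print one "changeType<TAB>count" line per change type in a what-if JSON file.
summarize_changes() {
  jq -r '
    (if type == "object" then (.changes // []) else . end)
    | group_by(.changeType)
    | map("\(.[0].changeType)\t\(length)")
    | .[]
  ' "$1"
}

# Exit 0 if the plan contains at least one Delete, non-zero otherwise.
# what-if itself exits 0 either way, so the exit code has to come from the JSON.
has_deletes() {
  jq -e '
    (if type == "object" then (.changes // []) else . end)
    | any(.[]; .changeType == "Delete")
  ' "$1" > /dev/null
}

# Usage:
#   summarize_changes raw.json
#   if has_deletes raw.json; then echo "plan contains deletes"; fi
```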

Classification: what counts as destructive

The rules we settled on, after running the classifier in shadow mode for two weeks against historic PRs:

Any Delete blocks by default. The one exception is explicit: if a Bicep PR is deleting something on purpose, the author adds an [allow-delete] token to the PR title and we relax the rule for that run.
Any Modify whose delta touches kind, sku.name, sku.tier, properties.encryption, properties.accessTier (for storage), properties.networkAcls.defaultAction, properties.publicNetworkAccess, or properties.minimumTlsVersion blocks. These are the property categories that either recreate the resource silently, change the security posture, or change the billing tier in ways that have surprised us.
Any Create is informational. We surface it in the diff but do not block.
Anything in Ignore or NoChange is filtered out before the diff is rendered. These are the noise.
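The "touches" rule can be read two ways: an exact match on the delta path, or a match on the path and everything nested under it. The sketch below shows the second reading in plain bash, so that a change to properties.encryption.services.blob.enabled counts as touching properties.encryption. It is an illustrative variant, not the matching the full script performs; the function and array names are hypothetical.

```shell
#!/usr/bin/env bash

# Property paths treated as dangerous; a delta path matches if it equals an
# entry or is nested below it (entry followed by a dot).
DANGEROUS_PREFIXES=(
  "kind"
  "sku.name"
  "sku.tier"
  "properties.encryption"
  "properties.accessTier"
  "properties.networkAcls.defaultAction"
)

is_dangerous_path() {
  local path="$1" prefix
  for prefix in "${DANGEROUS_PREFIXES[@]}"; do
    # The quoted part is literal; the unquoted ".*" is a glob meaning
    # "a dot followed by anything".
    if [[ "$path" == "$prefix" || "$path" == "$prefix".* ]]; then
      return 0
    fi
  done
  return 1
}

# Usage:
#   is_dangerous_path "properties.encryption.keySource" && echo "blocks"
```

Anchoring on the dot matters: a path such as kindness must not match the kind entry.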

Here is classify-whatif.sh in full. It uses jq because the agent image already has it and because the JSON shape rewards path queries.

#!/usr/bin/env bash
set -euo pipefail

INPUT="$1"
DIFF_OUT="$2"
VERDICT_OUT="$3"

if [[ ! -s "$INPUT" ]]; then
  echo "what-if output is empty or missing: $INPUT" >&2
  echo "BLOCK" > "$VERDICT_OUT"
  exit 0
fi

# Real what-if output has one wrapper object with .changes; we tolerate either shape.
# has() errors on an array, so branch on the type instead.
CHANGES=$(jq 'if type == "object" then (.changes // []) else . end' "$INPUT")

DANGEROUS_PROPS=(
  "kind"
  "sku.name"
  "sku.tier"
  "properties.encryption"
  "properties.encryption.keySource"
  "properties.accessTier"
  "properties.networkAcls.defaultAction"
  "properties.publicNetworkAccess"
  "properties.minimumTlsVersion"
)

# Build a jq filter that returns true if any delta path matches a dangerous prop.
PROP_FILTER=$(printf '"%s",' "${DANGEROUS_PROPS[@]}")
PROP_FILTER="[${PROP_FILTER%,}]"

DELETES=$(echo "$CHANGES" | jq --argjson props "$PROP_FILTER" '
  [ .[] | select(.changeType == "Delete") | {
      resourceId: .resourceId,
      type: (.resourceId | split("/providers/")[1] // "unknown" | split("/")[0:2] | join("/"))
    } ]
')

MODIFIES=$(echo "$CHANGES" | jq --argjson props "$PROP_FILTER" '
  [ .[]
    | select(.changeType == "Modify")
    | { resourceId: .resourceId,
        dangerous: ( [ (.delta // [])[].path ] | any( . as $p | $props | index($p) ) ),
        delta: (.delta // []) }
  ]
')

CREATES=$(echo "$CHANGES" | jq '
  [ .[] | select(.changeType == "Create") | { resourceId: .resourceId } ]
')

DELETE_COUNT=$(echo "$DELETES"  | jq 'length')
DANGER_MOD=$(echo  "$MODIFIES" | jq '[ .[] | select(.dangerous == true) ] | length')
SAFE_MOD=$(echo    "$MODIFIES" | jq '[ .[] | select(.dangerous == false) ] | length')
CREATE_COUNT=$(echo "$CREATES" | jq 'length')

# Detect Delete+Create with identical resourceId. The 2:08am case.
RECREATES=$(jq -n --argjson d "$DELETES" --argjson c "$CREATES" '
  [ $d[].resourceId ] as $ids
  | [ $c[] | select(.resourceId as $r | $ids | index($r)) | .resourceId ]
')
RECREATE_COUNT=$(echo "$RECREATES" | jq 'length')

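ALLOW_DELETE="false"
# Assumes the PR title has been exported into this variable by an earlier step;
# if it is unset in your environment, fetch the title from the PR REST API instead.
PR_TITLE="${SYSTEM_PULLREQUEST_PULLREQUESTTITLE:-}"
if [[ "$PR_TITLE" == *"[allow-delete]"* ]]; then
  ALLOW_DELETE="true"
fi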

VERDICT="PASS"
if (( RECREATE_COUNT > 0 )); then
  VERDICT="BLOCK"
elif (( DELETE_COUNT > 0 )) && [[ "$ALLOW_DELETE" != "true" ]]; then
  VERDICT="BLOCK"
elif (( DANGER_MOD > 0 )); then
  VERDICT="BLOCK"
fi

echo "$VERDICT" > "$VERDICT_OUT"

# Render the PR markdown.
{
  echo "## Bicep what-if results"
  echo
  if [[ "$VERDICT" == "BLOCK" ]]; then
    echo "**Verdict: BLOCK.** This PR cannot merge until the destructive deltas below are reviewed and either justified or removed."
  else
    echo "**Verdict: PASS.** The deltas in this PR are non-destructive."
  fi
  echo
  echo "| Category | Count |"
  echo "|---|---|"
  echo "| Recreates (Delete + Create on same id) | $RECREATE_COUNT |"
  echo "| Deletes | $DELETE_COUNT |"
  echo "| Dangerous Modifies | $DANGER_MOD |"
  echo "| Safe Modifies | $SAFE_MOD |"
  echo "| Creates | $CREATE_COUNT |"
  echo

  if (( RECREATE_COUNT > 0 )); then
    echo "### Recreates"
    echo "| Resource |"
    echo "|---|"
    echo "$RECREATES" | jq -r '.[] | "| `\(.)` |"'
    echo
  fi

  if (( DELETE_COUNT > 0 )); then
    echo "### Deletes"
    echo "| Resource |"
    echo "|---|"
    echo "$DELETES" | jq -r '.[] | "| `\(.resourceId)` |"'
    echo
  fi

  if (( DANGER_MOD > 0 )); then
    echo "### Dangerous Modifies"
    echo "| Resource | Property | Before | After |"
    echo "|---|---|---|---|"
    echo "$MODIFIES" | jq -r '
      # tojson keeps false and null distinct; the alternative operator would render false as null.
      .[] | select(.dangerous == true) as $m
      | $m.delta[]
      | "| `\($m.resourceId | split("/") | .[-1])` | `\(.path)` | `\(.before | tojson)` | `\(.after | tojson)` |"
    '
    echo
  fi
} > "$DIFF_OUT"

echo "classify done. verdict=$VERDICT, recreates=$RECREATE_COUNT, deletes=$DELETE_COUNT, dangerous=$DANGER_MOD, safe=$SAFE_MOD, creates=$CREATE_COUNT"

A few notes. The [allow-delete] escape hatch is an explicit, auditable opt-out, not a default. The PR title is the right place because it forces the author to put the intent into the merge commit. The Delete-plus-Create-same-id pattern is its own category because that is what bit us at 2:08am; naming it surfaces the danger. Adding a new dangerous property is a one-line change to DANGEROUS_PROPS.
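The [allow-delete] check needs the PR title at classification time. If the title is not already available as an environment variable on your agent, one option is to read it from the pull request REST endpoint using System.AccessToken. A minimal sketch, assuming curl and jq are present, that SYSTEM_ACCESSTOKEN has been mapped into the step, and that the standard predefined variables below are set; the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Returns 0 when the PR title contains the [allow-delete] token.
pr_title_allows_delete() {
  local org="${SYSTEM_TEAMFOUNDATIONCOLLECTIONURI%/}"
  local url="${org}/${SYSTEM_TEAMPROJECT}/_apis/git/repositories/${BUILD_REPOSITORY_ID}/pullrequests/${SYSTEM_PULLREQUEST_PULLREQUESTID}?api-version=7.1"
  local title
  # The pull request resource carries the title in its .title field.
  title=$(curl -sS --fail \
            -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" \
            "$url" | jq -r '.title // ""')
  [[ "$title" == *"[allow-delete]"* ]]
}

# Usage inside the classifier:
#   if pr_title_allows_delete; then ALLOW_DELETE="true"; fi
```

Reading the title at run time also means a retitled PR is re-evaluated on the next run rather than keeping a stale decision.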

Posting the diff as a PR comment

Azure DevOps PRs use a "threads" model for review conversations. The right shape for a build comment is one thread per build, edited in place on subsequent runs so the PR does not get spammed.

The thread API lives at POST {project}/_apis/git/repositories/{repo}/pullRequests/{prId}/threads; the reference is on Microsoft Learn. Authentication uses the build's System.AccessToken, the OAuth token Azure DevOps mints for the running pipeline. No PAT required.

post-pr-comment.sh:

#!/usr/bin/env bash
set -euo pipefail

DIFF_FILE="$1"
VERDICT_FILE="$2"

if [[ "${BUILD_REASON:-}" != "PullRequest" ]]; then
  echo "Not a PR build, skipping comment."
  exit 0
fi

ORG="${SYSTEM_TEAMFOUNDATIONCOLLECTIONURI%/}"
PROJECT="${SYSTEM_TEAMPROJECT}"
REPO_ID="${BUILD_REPOSITORY_ID}"
PR_ID="${SYSTEM_PULLREQUEST_PULLREQUESTID}"
VERDICT=$(cat "$VERDICT_FILE")

CONTENT=$(jq -Rs . < "$DIFF_FILE")
STATUS=4
if [[ "$VERDICT" == "BLOCK" ]]; then
  STATUS=1
fi

BODY=$(cat <<EOF
{
  "comments": [
    {
      "parentCommentId": 0,
      "content": ${CONTENT},
      "commentType": 1
    }
  ],
  "status": ${STATUS},
  "properties": {
    "whatif.definition": { "type": "System.String", "value": "${BUILD_DEFINITIONNAME}" },
    "whatif.build": { "type": "System.String", "value": "${BUILD_BUILDID}" }
  }
}
EOF
)

# Look for an existing thread from a prior run of this same build definition.
EXISTING=$(curl -sS \
  -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" \
  "${ORG}/${PROJECT}/_apis/git/repositories/${REPO_ID}/pullRequests/${PR_ID}/threads?api-version=7.1" \
  | jq -r --arg def "${BUILD_DEFINITIONNAME}" '
      .value[]? | select((.properties // {})["whatif.definition"]["$value"]? == $def) | .id
    ' | head -n1)

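if [[ -n "$EXISTING" ]]; then
  THREAD_URL="${ORG}/${PROJECT}/_apis/git/repositories/${REPO_ID}/pullRequests/${PR_ID}/threads/${EXISTING}"
  # PATCH on the thread updates its status; the comment text is updated through
  # the comments endpoint (the first comment in a thread has id 1).
  curl -sS -X PATCH \
    -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" \
    -H "Content-Type: application/json" \
    -d "{\"status\": ${STATUS}}" \
    "${THREAD_URL}?api-version=7.1" \
    > /dev/null
  curl -sS -X PATCH \
    -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" \
    -H "Content-Type: application/json" \
    -d "{\"content\": ${CONTENT}}" \
    "${THREAD_URL}/comments/1?api-version=7.1" \
    > /dev/null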
else
  curl -sS -X POST \
    -H "Authorization: Bearer ${SYSTEM_ACCESSTOKEN}" \
    -H "Content-Type: application/json" \
    -d "$BODY" \
    "${ORG}/${PROJECT}/_apis/git/repositories/${REPO_ID}/pullRequests/${PR_ID}/threads?api-version=7.1" \
    > /dev/null
fi

echo "Posted diff to PR ${PR_ID}, verdict ${VERDICT}, status ${STATUS}."

The thread status field carries semantics. 1 is "active" (which in the PR UI shows the thread as needing resolution, blocking by branch policy if you require resolved comments). 4 is "closed". A passing verdict closes the thread; a blocking verdict opens it. With "all comments must be resolved" in branch policy, that gives a second backstop against merge.

jq -Rs . reads the diff markdown as a raw string and JSON-encodes it in one shot, avoiding the entire class of bug where a backtick or quote in the diff breaks the JSON body. I learned that one by reading Invalid character at line 3 column 47 for ten minutes on a Tuesday.
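The same idea can be pushed one step further by letting jq assemble the whole request body, so no string interpolation touches the JSON at all. This is an alternative sketch, not what the script above does; it assumes jq 1.6 or later for --rawfile, and the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Build the thread request body entirely in jq.
#   $1 = path to the markdown diff, $2 = numeric thread status
build_thread_body() {
  local diff_file="$1" status="$2"
  jq -n --rawfile content "$diff_file" --argjson status "$status" '
    { comments: [ { parentCommentId: 0, content: $content, commentType: 1 } ],
      status: $status }
  '
}

# Usage:
#   BODY=$(build_thread_body diff.md 1)
```

Backticks, double quotes, and pipes in the markdown all survive, because jq does the escaping when it serialises $content.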

The "find an existing thread" pass is what makes the PR comment update in place across reruns. Without it, a PR that goes through five iterations ends up with five identical threads. The reviewer will not read five threads.

Auth: `System.AccessToken` vs the service connection

There are two identities at play, and getting them straight took me longer than it should have.

The AzureCLI@2 task authenticates to Azure using the Service Connection. That runs the az deployment group what-if call and needs Azure read permissions on the target resource group.

The PR comment call authenticates to Azure DevOps, not Azure. The right identity is the build identity, exposed through System.AccessToken. To make that token available you pass SYSTEM_ACCESSTOKEN: $(System.AccessToken) through env: on the task (which the YAML above does), and grant the build's project identity the "Contribute to pull requests" permission on the repo. The second is per-project and the part teams miss; the failure mode is TF401019: The Git repository with name or identifier ... does not exist or you do not have permissions for the operation you are attempting. Despite the error claiming the repo does not exist, the actual cause is almost always the missing permission.

Using the Service Connection for the PR comment call does not work, because the connection is scoped to Azure (management.azure.com), not Azure DevOps (dev.azure.com). Trying it returns ERROR: Could not retrieve token from server. AADSTS50105: The signed in user is not assigned to a role for the application 499b84ac-1321-427f-aa17-267ca6975798. That GUID is Azure DevOps.

The split: Azure work uses the Service Connection. Azure DevOps work uses System.AccessToken. Never mix them.
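A cheap guard makes the split visible when it is violated. The sketch below fails fast if the Azure DevOps token was not mapped into the step, instead of letting curl send an empty bearer token and return a confusing error later; the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Fail early when the Azure DevOps token has not been mapped through env:.
require_devops_token() {
  if [[ -z "${SYSTEM_ACCESSTOKEN:-}" ]]; then
    echo "##vso[task.logissue type=error]SYSTEM_ACCESSTOKEN is empty; map \$(System.AccessToken) through env: on this task."
    return 1
  fi
}

# Usage at the top of post-pr-comment.sh:
#   require_devops_token || exit 1
```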

Filtering noise from the diff

The first week of running the gate, we had a parade of false positives from what-if reporting cosmetic deltas. The pattern was the same every time: the Bicep template referenced a child resource property that ARM normalises on read, so what-if showed apiVersion deltas, defaultEncryptionScope deltas on storage accounts, or kind deltas where the before was null and the after was the actual value (because the resource was created out-of-band).

The robust filter sits at the JSON layer, before classification: drop any Modify whose delta is exclusively properties marked as noise. Anything else flows through.

NOISE_PATHS='[
  "apiVersion",
  "properties.defaultEncryptionScope",
  "tags.createdBy",
  "tags.lastUpdatedBy",
  "identity.principalId",
  "identity.tenantId"
]'

CHANGES=$(echo "$CHANGES" | jq --argjson noise "$NOISE_PATHS" '
  map(
    if .changeType == "Modify" and .delta != null then
      .delta = ( .delta | map(select( .path as $p | $noise | index($p) | not )) )
    else . end
  )
  | map(select(.changeType != "Modify" or (.delta | length) > 0))
')

The list is small on purpose. Adding to it requires a one-line PR reviewed by the platform team. The temptation to add "everything noisy this week" leads to a filter that hides real changes; we resist it.

The `roleAssignments` gotcha

Microsoft.Authorization/roleAssignments is the resource type where what-if has lied to me most consistently. The symptom: a PR changes a role assignment from Reader to Contributor. Sometimes what-if reports Modify with delta on properties.roleDefinitionId. Other times it reports Delete plus Create of two role assignments with different name GUIDs because Bicep generated a new deterministic name from a guid(scope, principalId, role) expression. Both outputs are technically correct, but the second one trips the classifier into BLOCK, because any Delete blocks.

Mitigation: scope role-assignment changes into their own Bicep module (modules/rbac.bicep) and exclude that resource type from recreate detection:

RECREATES=$(jq -n --argjson d "$DELETES" --argjson c "$CREATES" '
  [ $d[].resourceId ] as $del_ids
  | [ $c[]
      | select(.resourceId as $r | $del_ids | index($r))
      | select(.resourceId | test("/providers/Microsoft.Authorization/roleAssignments/") | not)
      | .resourceId
    ]
')

The delete and create rows still appear in the diff table; the reviewer sees them and confirms. We just do not treat them as automatic blockers. An RBAC change is not data-destructive. A storage account recreate is. The risk profile is different.
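One caveat when adapting this: in classify-whatif.sh every Delete counts toward DELETE_COUNT, so a role-assignment Delete still produces BLOCK unless that count is filtered as well. A sketch of that extra filter, assuming you want RBAC deletes shown in the diff but excluded from the blocking count; the function name is hypothetical and the input is the DELETES array built by the classifier.

```shell
#!/usr/bin/env bash

# Count deletes that should block: everything except role assignments.
# Reads the DELETES JSON array on stdin.
count_blocking_deletes() {
  jq '
    [ .[]
      | select(.resourceId
               | test("/providers/Microsoft.Authorization/roleAssignments/"; "i")
               | not)
    ] | length
  '
}

# Usage inside the classifier:
#   BLOCKING_DELETES=$(echo "$DELETES" | count_blocking_deletes)
```

The "i" flag makes the match case-insensitive, since resource ids are not guaranteed to come back with consistent casing.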

The other limitation worth knowing is on the what-if API limitations page. The error I see most often is:

Bicep what-if API does not yet support resource type 'Microsoft.Web/sites/config/appsettings'. Skipping evaluation for the following resources: ...

What-if continues past this and emits a warning. The classifier surfaces the warning in the PR comment so reviewers know the diff is incomplete for those resource types. App service appsettings is the most common occurrence.
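Surfacing that warning takes a few lines, provided the what-if step keeps stderr. The sketch below assumes the step redirected stderr to a file (for example 2> whatif.stderr, which the YAML above does not do) and appends any "does not yet support" lines to the markdown diff; the function name and the file name are hypothetical.

```shell
#!/usr/bin/env bash

# Append unsupported-resource-type warnings from a captured stderr file to the diff.
#   $1 = file holding what-if stderr, $2 = markdown diff to append to
append_unsupported_warnings() {
  local stderr_file="$1" diff_file="$2"
  local matches line
  matches=$(grep -F "does not yet support resource type" "$stderr_file" || true)
  [[ -z "$matches" ]] && return 0
  {
    echo
    echo "### What-if warnings (diff is incomplete for these types)"
    while IFS= read -r line; do
      echo "- ${line}"
    done <<< "$matches"
  } >> "$diff_file"
}

# Usage:
#   append_unsupported_warnings whatif.stderr diff.md
```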

The end-to-end flow, from the reviewer's perspective

A PR opens. Within thirty seconds the pr-validate pipeline starts. Three to four minutes later, the pipeline finishes and a thread appears in the PR conversation with a heading "Bicep what-if results", a verdict line, a summary table, and per-section tables for recreates, deletes, and dangerous modifies. If PASS, the thread is closed and the build is green; the reviewer reads the diff, approves, and merges. The deploy pipeline picks up on main and runs az deployment group create without surprises.

If BLOCK, the build is red, the thread is active, the branch policy refuses the merge. The author either fixes the template (usual case, e.g. they typed 'Storage' when they meant 'StorageV2'), or adds [allow-delete] to the PR title for an intentional decommission, or opens a separate PR on the rules file. That third case has happened twice in 22 months.

The deploy pipeline on main still runs what-if before az deployment group create. It is no longer the gate; it is a safety net. If the plan on main differs from the PR's plan, something changed out of band in Azure and the deploy fails fast. We have caught two cases in 22 months: once a platform engineer changed an SKU in the portal between PR open and merge, once an Azure Policy added a deny on Standard_LRS storage accounts in the hour between approval and merge. In both cases the deploy halted before any resource was touched.
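One way to compare two plans is to reduce each to a normalised fingerprint and compare the fingerprints. This is a sketch under assumptions rather than our exact implementation: it assumes the PR's raw.json is available to the deploy pipeline (as a pipeline artifact, for instance), and it compares only change type, resource id, and the path/before/after of each delta. Both function names are hypothetical.

```shell
#!/usr/bin/env bash

# Reduce a what-if JSON file to a stable, order-independent fingerprint.
plan_fingerprint() {
  jq -S -c '
    (if type == "object" then (.changes // []) else . end)
    | map(select(.changeType != "Ignore" and .changeType != "NoChange")
          | { changeType,
              resourceId: (.resourceId | ascii_downcase),
              delta: [ (.delta // [])[] | { path, before, after } ] })
    | sort_by(.resourceId, .changeType)
  ' "$1"
}

# Exit 0 when both plans reduce to the same fingerprint.
plans_match() {
  [[ "$(plan_fingerprint "$1")" == "$(plan_fingerprint "$2")" ]]
}

# Usage in the deploy stage:
#   plans_match pr-plan.json main-plan.json || { echo "plan drift detected"; exit 1; }
```

Comparing before values as well as after values is what catches an out-of-band portal change, since the target state in the template has not moved.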

Troubleshooting

ERROR: (DeploymentWhatIfResourceError) The 'what-if' operation failed because the deployment 'main' is currently in progress. What-if cannot run while a deployment of the same name is in flight on the same scope. Solution: name your deployments with the build id (--name "main-$(Build.BuildId)") so the PR's what-if and the running deploy never collide. The deploy stage on main uses the same convention.
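A small helper keeps the naming convention in one place. The sketch assumes deployment names are capped at 64 characters (an assumption worth checking against the ARM documentation) and trims the prefix rather than the build id, so the unique part always survives; the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Compose "<prefix>-<build id>", trimming the prefix if the result would be too long.
deployment_name() {
  local prefix="$1" build_id="$2"
  # Assumption: 64-character cap on deployment names.
  local max_prefix=$(( 64 - ${#build_id} - 1 ))
  printf '%s-%s\n' "${prefix:0:max_prefix}" "$build_id"
}

# Usage in the pipeline script:
#   az deployment group what-if --name "$(deployment_name main "$BUILD_BUILDID")" ...
```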

AuthorizationFailed: The client does not have authorization to perform action 'Microsoft.Resources/deployments/whatIf/action' over scope '...'. The Service Connection's service principal lacks the what-if action on the target. Reader alone is not enough, because its */read wildcard does not cover Microsoft.Resources/deployments/whatIf/action. We grant Reader plus that action via a small custom role. The full list of actions for the role is on the deploy what-if docs.

Bicep what-if API does not yet support resource type 'Microsoft.Web/sites/config/appsettings'. Already covered: surface as a warning in the PR comment, do not block.

TF401019: The Git repository with name or identifier ... does not exist or you do not have permissions. The build identity is missing "Contribute to pull requests" on the repo. Project Settings → Repos → Security → look for the user named "Project Collection Build Service ({org})" or similar, set the permission to Allow.

ERROR: Could not retrieve token from server. AADSTS50105. You are calling Azure DevOps with an Azure-scoped token. Switch to System.AccessToken for the call, leave the Service Connection for Azure-side calls.

The deployment scope is /subscriptions/... but the template targets resourceGroup. Wrong --scope or wrong subcommand. az deployment group what-if is RG-scoped. az deployment sub what-if is subscription-scoped. az deployment mg what-if is management-group-scoped. The Bicep file's targetScope declaration must match. The full scope matrix is on Microsoft Learn.

jq: error (at <stdin>:1): Cannot iterate over null (null). The what-if call returned no JSON, which usually means az errored before producing output. The wrapper if [[ ! -s "$INPUT" ]] in the classifier catches this and emits a BLOCK verdict on the principle that "we have no idea what would happen, so do not let it merge". Look earlier in the pipeline log for the real az error.
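The empty-file check can be tightened into a shape check, so that a file containing an error message or a truncated payload is also treated as "no plan". A minimal sketch, assuming the wrapper-object shape shown earlier; the function name is hypothetical.

```shell
#!/usr/bin/env bash

# Return 0 only when the file is non-empty, parses as JSON, and has a .changes array.
validate_whatif_json() {
  local input="$1"
  if [[ ! -s "$input" ]]; then
    echo "what-if output is empty or missing: $input" >&2
    return 1
  fi
  if ! jq -e 'type == "object" and (.changes | type == "array")' "$input" > /dev/null 2>&1; then
    echo "what-if output is not the expected JSON shape: $input" >&2
    return 1
  fi
}

# Usage at the top of the classifier:
#   validate_whatif_json "$INPUT" || { echo "BLOCK" > "$VERDICT_OUT"; exit 0; }
```

jq -e exits non-zero both on a parse error and on a false result, so one check covers garbage input and the wrong shape.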

The AzureCLI@2 task reference is where I go when something about the task's plumbing surprises me. The addSpnToEnvironment: true flag is occasionally useful for scripts that want to inspect the SP identity at runtime.

Twenty-two months in

Across 14 repos and roughly 4,200 PRs validated since rollout, the gate has blocked 86 PRs that contained at least one Delete. Of those, 71 were the author having shipped a parameter typo or a property rename that flipped a recreate. The author fixed the template inside the same PR, the gate went green, the merge happened. Eight were intentional decommissions with [allow-delete] in the title. Six were the RBAC false-positive case, surfaced and waved through. One was a real bug in the Bicep module where a parent-child relationship had been refactored incorrectly; that one stayed red for three days while we figured out the right migration path, which was exactly what we wanted the gate to enable.

The 2:08am page has not recurred. The warehouse storage account has not been recreated, modified out of band, or paged me in eighteen months. The reviewer behavior changed too: the PR comment is the first thing reviewers read now, before they look at the Bicep diff itself, because the comment tells them whether to read the rest of the PR adversarially or trustingly. Two of the longest-tenured engineers on the platform team have told me, separately, that they review Bicep PRs faster now because they know the gate has the destructive cases. Their attention goes to the design questions in the diff, not to the question of whether the design accidentally vaporises something.

The piece that took me longest to internalise is the structural one. What-if was always there. The information was always there. What changed is where the information landed and whose attention it captured. A what-if buried in a build log is theatre. A what-if posted to the PR conversation with a verdict and a table and a hard block on destructive deltas is engineering. The code is mostly glue; the change is in the contract.