# Phase 0 — Foundation: Detailed Implementation Plan

**Parent document:** Databricks-Primary Implementation Plan (MDP v8.0)
**Duration:** Weeks 1–3 (15 working days)
**Objective:** Establish the Azure landing zone, network security, Databricks workspace, Unity Catalog metastore, and IaC foundation. At the end of Phase 0, the platform is secure, governed, and ready for data engineering workloads — but no data flows yet.

---

## 1. Week-by-Week Breakdown

### Week 1 — Azure Landing Zone & Networking

| Day | Task | Owner | Details |
|---|---|---|---|
| D1 | Subscription provisioning | Cloud Infra | Create subscription: `greenfield-mdp-prod` (Canada Central). Apply Greenfield management group policies (tagging, allowed regions, allowed SKUs). DR and DevTest subscriptions are out of scope for now. |
| D1 | Resource group structure | Cloud Infra | Create resource groups per component: `rg-mdp-network`, `rg-mdp-databricks`, `rg-mdp-storage`, `rg-mdp-governance`, `rg-mdp-keyvault`, `rg-mdp-monitoring`. |
| D1–D2 | VNet design & deployment | Cloud Infra | See §2 Network Architecture below. Deploy via Terraform. |
| D2–D3 | Private Endpoints | Cloud Infra | Deploy Private Endpoints for: ADLS Gen2, Key Vault, Purview, Databricks (back-end), Databricks (front-end via Transit VNet). Register private DNS zones in central DNS hub (if Greenfield uses hub-spoke). |
| D3 | NAT Gateway | Cloud Infra | Attach NAT Gateway to both Databricks subnets (host + container). Provides stable egress IPs for allowlisting at external source systems. |
| D3–D4 | NSG rules | Cloud Infra / Security | Apply NSGs to Databricks subnets. Allow Databricks control plane service tags. Deny all other inbound. Restrict outbound to ADLS, Key Vault, Purview, and Databricks control plane only. See §3 NSG Rule Set. |
| D4 | ADLS Gen2 storage account | Cloud Infra | Deploy storage account: `stglobalmdp` with HNS enabled, private endpoint, CMK (Key Vault-backed), soft delete (7 days), immutability policy on archive container. Container layout: `landing`, `bronze`, `silver`, `gold`, `archive`, `checkpoints`. |
| D4 | Key Vault | Cloud Infra | Deploy Key Vault with private endpoint, soft delete, purge protection. Create initial secrets: Databricks PAT (placeholder), source system JDBC credentials (placeholders). Enable diagnostic logging to Log Analytics. |
| D5 | Terraform state & CI/CD | DevOps | Azure Storage account for Terraform state (separate subscription, blob versioning, container-level lease lock). Azure DevOps / GitHub Actions pipeline: `terraform plan` on PR, `terraform apply` on merge to `main`. Branch protection rules enforced. |
| D5 | Security review checkpoint | Security Architect | Review: VNet topology, NSG rules, Private Endpoint configuration, RBAC assignments. Sign-off before proceeding to Databricks deployment. |

---

### Week 2 — Databricks Workspace & Identity

| Day | Task | Owner | Details |
|---|---|---|---|
| D6 | Databricks account setup | Platform Admin | Configure Azure Databricks account via account console. Enable Unity Catalog at account level. Set account-level admins (Entra ID group: `grp-mdp-account-admins`). |
| D6 | Workspace deployment | Cloud Infra | Deploy Databricks workspace via Terraform (`azurerm_databricks_workspace`). Tier: **Premium** (required for UC, Private Link, IP access lists, audit logs). VNet injection: bind to pre-created host + container subnets. Managed resource group: `rg-mdp-databricks-managed` (auto-created by Databricks, do not modify). Enable No Public IP (NPIP) for secure cluster connectivity. |
| D6–D7 | Private Link configuration | Cloud Infra | **Back-end Private Link:** Private endpoint from Databricks VNet to control plane (workspace, DBFS, artifact). **Front-end Private Link:** Private endpoint in Transit VNet for user browser/API access. DNS: register `adb-.azuredatabricks.net` in private DNS zone. |
| D7 | IP access lists | Platform Admin | Configure workspace IP access list to allow only Greenfield corporate IP ranges and the Transit VNet CIDR. Block all other access. |
| D7 | Conditional Access (Entra ID) | Security Architect | Entra ID Conditional Access policy for the Databricks enterprise application: require MFA, require compliant device, block access from non-compliant locations. |
| D7–D8 | Identity provisioning (SCIM) | Platform Admin | Enable **automatic identity management** (default for accounts created after Aug 2025) or configure the SCIM provisioning connector in Entra ID. Sync the following Entra ID groups to the Databricks account: |
| | | | `grp-mdp-account-admins` → Account Admin role |
| | | | `grp-mdp-platform-engineers` → Workspace Admin role |
| | | | `grp-mdp-data-engineers` → Workspace User, Can Manage clusters |
| | | | `grp-mdp-data-analysts` → Workspace User, SQL access only |
| | | | `grp-mdp-data-stewards` → Workspace User, UC metastore admin |
| | | | `grp-mdp-data-scientists` → Workspace User, ML runtime access |
| D8 | Workspace configuration | Platform Admin | Configure: audit logging enabled (Azure Monitor diagnostic settings), web terminal disabled (security), DBFS external access disabled, Repos enabled (Git integration), serverless compute enabled. |
| D8–D9 | Cluster policies | Platform Admin | Create cluster policies (see §5 Cluster Policies). Assign policies to groups: data engineers get `de-job-cluster` and `de-interactive`, analysts get `analyst-sql-only`, data scientists get `ds-ml-cluster`. |
| D9 | Git integration | DevOps | Configure workspace Git integration (Azure DevOps Repos or GitHub). Set default branch: `main`. Enable branch protection. Create repo structure for DABs (see §6). |
| D10 | Workspace validation | Platform Admin | Validate: user login via Private Link, cluster launch, notebook execution, ADLS access via managed identity, Key Vault secret scope. Document any issues. |

---

### Week 3 — Unity Catalog & Governance Foundation

| Day | Task | Owner | Details |
|---|---|---|---|
| D11 | Metastore creation | Platform Admin | Create Unity Catalog metastore for **Canada Central** region. Root storage: ADLS Gen2 container `unitycatalog` in the MDP storage account. Storage credential: **User-Assigned Managed Identity** (not service principal — supports storage behind network rules). Access connector: `databricks-access-connector-mdp-prod`. |
| D11 | Metastore assignment | Platform Admin | Assign the metastore to the production workspace. |
| D11–D12 | Catalog structure | Platform Admin / Data Steward | Create the initial catalog hierarchy: |

```
metastore: greenfield_canadacentral
├── catalog: raw              ← Bronze layer (ingested data)
│   ├── schema: customer
│   ├── schema: transactions
│   ├── schema: products
│   ├── schema: claims
│   └── schema: policies
├── catalog: curated          ← Silver layer (cleansed, conformed)
│   ├── schema: customer
│   ├── schema: transactions
│   ├── schema: products
│   ├── schema: claims
│   └── schema: policies
├── catalog: analytics        ← Gold layer (business-ready, Metric Views)
│   ├── schema: dimensional   ← Star schemas
│   ├── schema: metrics       ← Metric Views
│   ├── schema: reference     ← MDM, reference data
│   └── schema: customer360   ← Unified customer view
├── catalog: sandbox          ← Exploratory (data scientists, analysts)
│   └── schema:
└── catalog: system           ← UC system tables (audit, lineage, billing)
```

| Day | Task | Owner | Details |
|---|---|---|---|
| D12 | External locations | Platform Admin | Register ADLS containers as UC external locations: `abfss://bronze@stglobal...`, `abfss://silver@...`, `abfss://gold@...`. Bind each to the managed identity storage credential. |
| D12 | Default grants | Data Steward | Apply baseline grants: |
| | | | `grp-mdp-data-engineers`: `ALL PRIVILEGES` on `raw`, `curated`; `SELECT` on `analytics` |
| | | | `grp-mdp-data-analysts`: `SELECT` on `curated`, `analytics`; `USAGE` on `sandbox` |
| | | | `grp-mdp-data-scientists`: `SELECT` on `curated`, `analytics`; `ALL PRIVILEGES` on `sandbox` |
| | | | `grp-mdp-data-stewards`: `ALL PRIVILEGES` on all catalogs (metastore admin) |
| D13 | Audit logging | Platform Admin | Enable UC system tables: `system.access.audit`, `system.access.table_lineage`, `system.billing.usage`. Create a Databricks SQL dashboard for: login events, permission changes, table access patterns. Route audit logs to Azure Monitor via diagnostic settings. |
| D13 | Purview registration | Governance Specialist | Register the Databricks workspace as a data source in Microsoft Purview. Configure the UC scan using managed identity authentication. Run initial metadata scan (catalogs, schemas — tables will come in Phase 1). Validate that catalog structure appears in Purview Unified Catalog. |
| D14–D15 | Phase 0 exit review | All | **Go/No-Go gate.** Review against exit criteria (§8). Security sign-off. Architecture sign-off. Handover to Phase 1 team. |

---

## 2. Network Architecture

### VNet Layout (Canada Central — Production)

| VNet / Subnet | CIDR | Purpose |
|---|---|---|
| `vnet-mdp-prod` | `10.100.0.0/16` | Main MDP VNet |
| `snet-dbx-host` | `10.100.1.0/24` | Databricks host subnet (251 usable IPs; Azure reserves 5 per subnet) |
| `snet-dbx-container` | `10.100.2.0/24` | Databricks container subnet (251 usable IPs) |
| `snet-private-endpoints` | `10.100.3.0/24` | Private endpoints (ADLS, KV, Purview, etc.) |
| `snet-sas-viya` | `10.100.4.0/24` | SAS Viya integration (Phase 2) |
| `vnet-mdp-transit` | `10.101.0.0/24` | Front-end Private Link for user access |

### Peering

- `vnet-mdp-prod` ↔ `vnet-mdp-transit` (for front-end Private Link)
- `vnet-mdp-prod` ↔ Greenfield hub VNet (for on-prem connectivity via ExpressRoute/VPN)
- `vnet-mdp-prod` ↔ `vnet-sas-viya` (when SAS is deployed in Phase 2)

### DNS

- Private DNS zones hosted in Greenfield's central DNS hub (hub-spoke model)
- Zones: `privatelink.azuredatabricks.net`, `privatelink.dfs.core.windows.net`, `privatelink.vaultcore.azure.net`, `privatelink.purview.azure.com`

---

## 3. NSG Rule Set (Databricks Subnets)

### Inbound Rules

| Priority | Name | Source | Destination | Port | Action |
|---|---|---|---|---|---|
| 100 | AllowDatabricksControlPlane | `AzureDatabricks` service tag | `snet-dbx-*` | 443 | Allow |
| 200 | AllowInternalSubnet | `snet-dbx-host` | `snet-dbx-container` | * | Allow |
| 201 | AllowInternalSubnet2 | `snet-dbx-container` | `snet-dbx-host` | * | Allow |
| 4096 | DenyAllInbound | * | * | * | Deny |

### Outbound Rules

| Priority | Name | Source | Destination | Port | Action |
|---|---|---|---|---|---|
| 100 | AllowDatabricksControlPlane | `snet-dbx-*` | `AzureDatabricks` service tag | 443 | Allow |
| 110 | AllowSQL | `snet-dbx-*` | `Sql` service tag | 3306 | Allow |
| 120 | AllowStorage | `snet-dbx-*` | `Storage` service tag | 443 | Allow |
| 130 | AllowEventHub | `snet-dbx-*` | `EventHub` service tag | 9093 | Allow |
| 200 | AllowKeyVault | `snet-dbx-*` | `snet-private-endpoints` | 443 | Allow |
| 210 | AllowADLS | `snet-dbx-*` | `snet-private-endpoints` | 443 | Allow |
| 4096 | DenyAllOutbound | `snet-dbx-*` | `Internet` | * | Deny |

**Note:** The `DenyAllOutbound` to Internet is the data exfiltration protection control. All egress goes through Private Endpoints or service tags.

---

## 4. Terraform Module Structure

```
terraform/
├── environments/
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── terraform.tfvars
│       └── backend.tf            ← State in Azure Storage
├── modules/
│   ├── networking/
│   │   ├── main.tf               ← VNet, subnets, NSGs, NAT GW, peerings
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── databricks-workspace/
│   │   ├── main.tf               ← Workspace, VNet injection, Private Link
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── unity-catalog/
│   │   ├── main.tf               ← Metastore, access connector, storage credential,
│   │   │                           catalogs, schemas, external locations, grants
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── storage/
│   │   ├── main.tf               ← ADLS Gen2, containers, CMK, private endpoints
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── keyvault/
│   │   ├── main.tf               ← Key Vault, access policies, private endpoint
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── monitoring/
│   │   ├── main.tf               ← Log Analytics, diagnostic settings, alerts
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── identity/
│       ├── main.tf               ← Managed identities, RBAC assignments
│       ├── variables.tf
│       └── outputs.tf
└── ci/
    ├── azure-pipelines.yml       ← Plan on PR, Apply on merge
    └── scripts/
        └── validate.sh           ← terraform validate + tflint + checkov
```

### Key Terraform Resources

| Module | Key Resources |
|---|---|
| `networking` | `azurerm_virtual_network`, `azurerm_subnet`, `azurerm_network_security_group`, `azurerm_nat_gateway`, `azurerm_private_endpoint`, `azurerm_private_dns_zone` |
| `databricks-workspace` | `azurerm_databricks_workspace` (with `custom_parameters` for VNet injection), `azurerm_private_endpoint` (front-end + back-end) |
| `unity-catalog` | `databricks_metastore`, `databricks_metastore_assignment`, `databricks_metastore_data_access`, `databricks_catalog`, `databricks_schema`, `databricks_external_location`, `databricks_grants` |
| `storage` | `azurerm_storage_account` (HNS, CMK), `azurerm_storage_container`, `azurerm_private_endpoint` |
| `identity` | `azurerm_user_assigned_identity`, `azurerm_databricks_access_connector`, `azurerm_role_assignment` (Storage Blob Data Contributor on ADLS) |

---

## 5. Cluster Policies

| Policy Name | Target Group | Configuration |
|---|---|---|
| `de-job-cluster` | Data Engineers | Auto-termination: 20 min. Spot instances: up to 80% of workers. Node type: `Standard_DS3_v2` (fixed). Min/max workers: 2–8. Spark version: latest LTS. Unity Catalog enabled. |
| `de-interactive` | Data Engineers | Auto-termination: 60 min. Single-user access mode. Node type: `Standard_DS4_v2`. Max workers: 4. |
| `analyst-sql-only` | Analysts | Serverless SQL Warehouse only. No general-purpose clusters. Warehouse size: Small (2X-Small for dev). Auto-stop: 10 min. |
| `ds-ml-cluster` | Data Scientists | GPU-enabled option: `Standard_NC6s_v3`. Auto-termination: 30 min. ML runtime (latest). Max workers: 4. Single-user access mode. |
| `admin-unrestricted` | Platform Admins | Unrestricted — for platform debugging only. Requires justification tag. |

---

## 6. DABs Repository Structure

```
mdp-platform/                     ← Git repository
├── databricks.yml                ← DAB project config
├── bundles/
│   ├── infrastructure/
│   │   ├── cluster-policies/     ← Policy JSON definitions
│   │   ├── instance-pools/       ← Pool definitions
│   │   └── secret-scopes/        ← Secret scope config
│   ├── ingestion/                ← Phase 1 (empty in Phase 0)
│   │   ├── bronze/
│   │   └── silver/
│   ├── transformation/           ← Phase 2 (empty in Phase 0)
│   │   └── gold/
│   ├── metrics/                  ← Phase 3 (empty in Phase 0)
│   │   └── metric-views/
│   └── governance/
│       ├── grants/               ← UC grant definitions
│       └── quality/              ← DQ rule definitions
├── tests/
│   ├── unit/
│   └── integration/
└── .github/ or .azure-pipelines/
    └── ci.yml
```

---

## 7. RACI Matrix

| Task | Platform Admin | Cloud Infra | Security Architect | DevOps | Data Steward |
|---|---|---|---|---|---|
| Subscription provisioning | C | **R** | A | I | I |
| VNet & networking | C | **R** | A | I | I |
| Private Endpoints | C | **R** | A | I | I |
| NSG rules | I | **R** | **A** | I | I |
| ADLS Gen2 deployment | C | **R** | A | I | I |
| Key Vault deployment | C | **R** | A | I | I |
| Databricks workspace | **R** | C | A | I | I |
| Private Link config | C | **R** | A | I | I |
| SCIM / identity provisioning | **R** | I | A | I | I |
| IP access lists | **R** | I | **A** | I | I |
| Conditional Access | I | I | **R** | I | I |
| Cluster policies | **R** | I | C | I | I |
| UC metastore creation | **R** | I | C | I | C |
| Catalog/schema structure | C | I | I | I | **R** |
| External locations | **R** | C | C | I | I |
| Default grants | C | I | A | I | **R** |
| Audit logging | **R** | C | A | I | I |
| Purview registration | I | I | I | I | **R** |
| Terraform modules | C | **R** | C | **R** | I |
| CI/CD pipeline | I | C | I | **R** | I |
| Phase 0 exit review | **R** | **R** | **R** | C | C |

**R** = Responsible, **A** = Accountable, **C** = Consulted, **I** = Informed

---

## 8. Exit Criteria

| # | Criterion | Verification Method |
|---|---|---|
| 1 | Workspace accessible via Private Link from Greenfield corporate network | User logs in via browser → workspace UI loads. `curl` to workspace URL resolves to private IP. |
| 2 | No public internet access to workspace | Attempt access from non-corporate IP → blocked. IP access list denies connection. |
| 3 | Unity Catalog metastore operational | `SHOW CATALOGS` returns `raw`, `curated`, `analytics`, `sandbox`. |
| 4 | ADLS Gen2 accessible from workspace | Notebook: `dbutils.fs.ls("abfss://bronze@stglobal...")` succeeds via managed identity. |
| 5 | Key Vault secret scope functional | Notebook: `dbutils.secrets.get(scope="mdp-keyvault", key="test-secret")` returns value. |
| 6 | SCIM sync active | All 6 Entra ID groups visible in workspace admin console with correct members. |
| 7 | Cluster policies enforced | User in `grp-mdp-data-analysts` cannot create general-purpose cluster — only SQL Warehouse. |
| 8 | Audit logs flowing | Azure Monitor shows Databricks diagnostic logs. UC audit table `system.access.audit` contains login events. |
| 9 | Purview metadata scan completed | Purview Unified Catalog shows `raw`, `curated`, `analytics`, `sandbox` catalogs. |
| 10 | Terraform state clean | `terraform plan` returns "No changes. Your infrastructure matches the configuration." |
| 11 | Security review passed | Security architect sign-off document on file. No critical findings. |
| 12 | CI/CD pipeline operational | PR triggers `terraform plan`. Merge to `main` triggers `terraform apply`. Both succeed. |

---

## 9. Risks Specific to Phase 0

| Risk | Impact | Mitigation |
|---|---|---|
| Subscription provisioning delay (Greenfield IT) | Blocks everything | Engage cloud team 2 weeks before Phase 0 starts. Pre-approve subscription request. |
| Private DNS zone conflicts with existing hub | Workspace unreachable | Coordinate with central networking team. Use conditional forwarders if needed. |
| Databricks Premium contract not signed | Cannot deploy workspace | Procurement must close before D6. Escalate to VP if delayed. |
| SCIM sync issues (nested groups, group size limits) | Incomplete identity provisioning | Test with a small pilot group first. Flatten nested groups if needed. |
| NSG rules too restrictive | Cluster launch failures | Start permissive (log-only), harden iteratively based on traffic analysis. |
| Terraform provider version mismatch | Drift between plan and apply | Pin provider versions in `required_providers`. Validate in CI pipeline before applying. |
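The provider-pinning mitigation in the risk table above can be sketched as a Terraform fragment. Version constraints, the backend resource group, and the state storage account name are illustrative placeholders, not values mandated by this plan:

```hcl
# versions.tf (prod environment) — pin providers so `plan` and `apply`
# resolve identical provider versions in CI and locally.
terraform {
  required_version = "~> 1.9"            # illustrative; match the tested CLI version

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"                 # illustrative pin
    }
    databricks = {
      source  = "databricks/databricks"
      version = "~> 1.50"                # illustrative pin
    }
  }

  backend "azurerm" {
    # State lives in the dedicated Azure Storage account (D5, Week 1).
    resource_group_name  = "rg-terraform-state"   # hypothetical name
    storage_account_name = "sttfstatemdp"         # hypothetical name
    container_name       = "tfstate"
    key                  = "mdp-prod.tfstate"
  }
}
```

With the pins in place, the CI pipeline's `terraform plan` on PR and `terraform apply` on merge resolve the same provider builds, closing the drift window the risk table describes.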
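As a companion to §4, the core wiring of the `identity` and `unity-catalog` modules might look like the sketch below. Resource names come from this plan where it specifies them (access connector, metastore, catalog); the managed identity name, resource group placement, ADLS URL, and `var.workspace_id` are assumptions for illustration:

```hcl
# identity module: user-assigned managed identity exposed via an access connector.
resource "azurerm_user_assigned_identity" "uc" {
  name                = "id-uc-mdp-prod"          # hypothetical name
  resource_group_name = "rg-mdp-governance"       # assumed placement
  location            = "canadacentral"
}

resource "azurerm_databricks_access_connector" "mdp" {
  name                = "databricks-access-connector-mdp-prod"
  resource_group_name = "rg-mdp-governance"       # assumed placement
  location            = "canadacentral"
  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.uc.id]
  }
}

# unity-catalog module: metastore rooted in the `unitycatalog` container,
# assigned to the production workspace, with one example catalog and grant.
resource "databricks_metastore" "this" {
  name          = "greenfield_canadacentral"
  storage_root  = "abfss://unitycatalog@stglobalmdp.dfs.core.windows.net/"  # assumed URL
  region        = "canadacentral"
  force_destroy = false
}

resource "databricks_metastore_assignment" "prod" {
  metastore_id = databricks_metastore.this.id
  workspace_id = var.workspace_id                 # assumed input variable
}

resource "databricks_catalog" "raw" {
  metastore_id = databricks_metastore.this.id
  name         = "raw"
  comment      = "Bronze layer (ingested data)"
}

resource "databricks_grants" "raw" {
  catalog = databricks_catalog.raw.name
  grant {
    principal  = "grp-mdp-data-engineers"
    privileges = ["ALL_PRIVILEGES"]               # provider spelling of ALL PRIVILEGES
  }
}
```

The `curated`, `analytics`, and `sandbox` catalogs, schemas, external locations, and the remaining baseline grants would follow the same pattern, typically driven from `for_each` maps in `variables.tf`.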