Basic VPC design covers subnets and firewall rules. Production environments require advanced networking: connecting on-premises data centers, accessing SaaS services privately, routing traffic across multiple projects, and intelligent DNS failover. This guide covers the patterns GCP architects use at scale.

Network Architecture Evolution

Stage 1: Single VPC              → Dev/test
Stage 2: Shared VPC              → Multi-project, centralized networking
Stage 3: VPC Peering / VPN       → Cross-project or hybrid connectivity
Stage 4: Cloud Interconnect      → Dedicated on-premises connection
Stage 5: Multi-region + PSC      → Enterprise, private SaaS access
  

Cloud VPN (Site-to-Site)

Connect an on-premises data center to GCP over encrypted IPsec tunnels:

# Create Cloud Router
gcloud compute routers create onprem-router \
  --network=learning-vpc \
  --region=us-central1 \
  --asn=64512

# Create HA VPN gateway (recommended: two interfaces for redundancy)
gcloud compute vpn-gateways create ha-vpn-gateway \
  --network=learning-vpc \
  --region=us-central1

# Create external VPN gateway (your on-premises device, two public IPs)
gcloud compute external-vpn-gateways create onprem-gateway \
  --interfaces=0=203.0.113.10,1=203.0.113.11

# Create VPN tunnels, one per HA VPN gateway interface.
# --vpn-gateway is the local HA VPN gateway; --peer-gcp-gateway applies only to GCP-to-GCP tunnels.
gcloud compute vpn-tunnels create tunnel-1 \
  --vpn-gateway=ha-vpn-gateway \
  --peer-external-gateway=onprem-gateway \
  --peer-external-gateway-interface=0 \
  --interface=0 \
  --router=onprem-router \
  --region=us-central1 \
  --ike-version=2 \
  --shared-secret=YOUR_PRESHARED_KEY

gcloud compute vpn-tunnels create tunnel-2 \
  --vpn-gateway=ha-vpn-gateway \
  --peer-external-gateway=onprem-gateway \
  --peer-external-gateway-interface=1 \
  --interface=1 \
  --router=onprem-router \
  --region=us-central1 \
  --ike-version=2 \
  --shared-secret=YOUR_PRESHARED_KEY
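
HA VPN tunnels exchange no routes until each one has a Cloud Router interface and a BGP session. The sketch below wraps those two steps in a helper. The function name, the interface and peer names, the link-local /30 addresses, and the on-premises ASN 65001 are all assumptions for illustration; substitute the values agreed with your network team.

```shell
# Hypothetical helper: attach one HA VPN tunnel to Cloud Router and start BGP.
# Args: tunnel name, Cloud Router link-local IP, on-premises link-local IP.
add_tunnel_bgp() {
  tunnel="$1"; router_ip="$2"; peer_ip="$3"

  # Router interface bound to the tunnel (link-local /30 range)
  gcloud compute routers add-interface onprem-router \
    --interface-name="if-${tunnel}" \
    --vpn-tunnel="${tunnel}" \
    --ip-address="${router_ip}" \
    --mask-length=30 \
    --region=us-central1 || return 1

  # BGP session to the on-premises router (ASN 65001 is an assumed value)
  gcloud compute routers add-bgp-peer onprem-router \
    --peer-name="peer-${tunnel}" \
    --interface="if-${tunnel}" \
    --peer-ip-address="${peer_ip}" \
    --peer-asn=65001 \
    --region=us-central1
}
```

Call it once per tunnel, for example `add_tunnel_bgp tunnel-1 169.254.10.1 169.254.10.2` and `add_tunnel_bgp tunnel-2 169.254.11.1 169.254.11.2`.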
  

Cloud VPN vs. Cloud Interconnect

Aspect     | Cloud VPN                  | Cloud Interconnect
Bandwidth  | Up to ~3 Gbps per tunnel   | 10 Gbps – 200 Gbps
Latency    | Variable (internet path)   | Consistent, lower
Cost       | Low (VPN gateway + egress) | Port hours + partner fees
Setup time | Hours                      | Weeks (physical install)
Encryption | IPsec (built-in)           | MACsec optional; VPN over Interconnect
Best for   | Backup, dev/test hybrid    | Production hybrid, bulk data transfer

Best practice: Run HA VPN (two tunnels) for redundancy. Use Cloud Interconnect for steady high-bandwidth workloads and VPN as backup.
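
Because each tunnel tops out at roughly 3 Gbps, VPN capacity grows only by adding tunnels. A rough sizing sketch, assuming whole-number Gbps targets and the approximate per-tunnel figure from the comparison above (real throughput depends on packet size, so treat the result as an estimate; `tunnels_needed` is a hypothetical helper name):

```shell
# tunnels_needed TARGET_GBPS [PER_TUNNEL_GBPS]
# Ceiling division; per-tunnel capacity defaults to the approximate 3 Gbps figure.
tunnels_needed() {
  target="$1"; per_tunnel="${2:-3}"
  echo $(( (target + per_tunnel - 1) / per_tunnel ))
}

tunnels_needed 8     # prints 3
tunnels_needed 10    # prints 4
```

When the count climbs past a few tunnels for steady traffic, that is usually the signal to price out Cloud Interconnect instead.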

Cloud Interconnect (Dedicated)

Dedicated physical connection from on-premises to Google’s network:

  On-Premises DC
  → Cross-connect at colocation facility
    → Cloud Interconnect (Dedicated or Partner)
      → Cloud Router (BGP)
        → VPC subnets
  
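# Create VLAN attachment (after physical provisioning)
gcloud compute interconnects attachments dedicated create attachment-1 \
  --interconnect=my-interconnect \
  --router=onprem-router \
  --region=us-central1 \
  --vlan=100

# Add a Cloud Router interface for the attachment
gcloud compute routers add-interface onprem-router \
  --interface-name=if-attachment-1 \
  --interconnect-attachment=attachment-1 \
  --region=us-central1

# Cloud Router learns on-premises routes via BGP.
# --interface names the router interface created above, not the attachment.
# --peer-ip-address must match the attachment's customerRouterIpAddress.
gcloud compute routers add-bgp-peer onprem-router \
  --peer-name=onprem-peer \
  --interface=if-attachment-1 \
  --peer-ip-address=169.254.1.2 \
  --peer-asn=65001 \
  --region=us-central1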
  

Use Partner Interconnect when you cannot colocate — connect through a supported service provider (Equinix, Megaport, etc.).

Private Service Connect

Access Google APIs and third-party services without traversing the public internet:

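# Consumer: reserve a global internal IP for the Google APIs endpoint.
# PSC_ENDPOINT_IP is a placeholder: use an unused internal address outside every subnet range.
gcloud compute addresses create psc-endpoint-ip \
  --global \
  --purpose=PRIVATE_SERVICE_CONNECT \
  --addresses=PSC_ENDPOINT_IP \
  --network=learning-vpc

# Create the endpoint (name must be 1-20 lowercase letters and digits, no hyphens)
gcloud compute forwarding-rules create pscgoogleapis \
  --global \
  --network=learning-vpc \
  --address=psc-endpoint-ip \
  --target-google-apis-bundle=all-apis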
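
Clients reach a Google API through the endpoint by using a per-endpoint hostname of the form SERVICE-ENDPOINT.p.googleapis.com. The sketch below builds that name. It assumes the endpoint (forwarding rule) is called `pscgoogleapis` and that the automatically created p.googleapis.com DNS records are in place; `psc_api_host` is a hypothetical helper name.

```shell
# psc_api_host SERVICE ENDPOINT -> per-endpoint hostname for a Google API
psc_api_host() {
  echo "$1-$2.p.googleapis.com"
}

psc_api_host storage pscgoogleapis   # prints storage-pscgoogleapis.p.googleapis.com

# Example request through the endpoint (PROJECT_ID is a placeholder):
# curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   "https://$(psc_api_host storage pscgoogleapis)/storage/v1/b?project=PROJECT_ID"
```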
  

Private Service Connect Use Cases

Scenario                                   | Benefit
Access Google APIs privately               | Traffic stays on Google network
SaaS provider exposes service to customers | No public internet exposure
Cross-project service access               | No VPC peering needed
On-premises access to GCP services         | Over Interconnect, not internet
Compliance (PCI, HIPAA)                    | Traffic never leaves private network

PSC vs. VPC Peering vs. VPN

Feature                | PSC                     | VPC Peering                    | Cloud VPN
Transitive routing     | No                      | No                             | Via Cloud Router (BGP)
Access Google APIs     | Yes (native)            | No (use Private Google Access) | Yes (with routing)
Cross-project services | Yes                     | Yes (bidirectional)            | Yes
Bandwidth              | High                    | High                           | Limited (~3 Gbps/tunnel)
Best for               | Private API/SaaS access | Same-org project connectivity  | Hybrid on-premises

Cloud DNS Advanced Routing

Routing Policies

Policy                     | Behavior                      | Use Case
Standard                   | Single resource record        | Basic DNS
Weighted round robin (WRR) | Split traffic by weight       | A/B testing, gradual migration, load distribution
Geolocation (GEO)          | Route by user location        | Content localization, compliance
Failover                   | Primary + backup by geography | Regional DR

# Create a managed zone
gcloud dns managed-zones create example-com \
  --dns-name=example.com. \
  --description="Production DNS"

# Weighted routing: 90% to current, 10% to canary
# Items are semicolon-delimited: weight=rrdata;weight=rrdata
gcloud dns record-sets create api.example.com. \
  --zone=example-com \
  --type=A \
  --ttl=60 \
  --routing-policy-type=WRR \
  --routing-policy-data="0.9=203.0.113.10;0.1=203.0.113.20"

# Geolocation routing (region=rrdata items, also semicolon-delimited)
gcloud dns record-sets create app.example.com. \
  --zone=example-com \
  --type=A \
  --ttl=300 \
  --routing-policy-type=GEO \
  --routing-policy-data="us-central1=203.0.113.10;europe-west1=198.51.100.10"
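
The `--routing-policy-data` value is a semicolon-delimited list of key=rrdata items, where the key is a weight for WRR or a region for GEO. When the targets come from a script or inventory, a small helper can assemble the string and reject malformed items. This is a sketch, and `join_policy_items` is a hypothetical name:

```shell
# join_policy_items ITEM... -> "item1;item2;..." (each ITEM must look like key=rrdata)
join_policy_items() {
  joined=""
  for item in "$@"; do
    case "$item" in
      *=*) ;;                                   # accepted: key=rrdata
      *) echo "bad item: $item" >&2; return 1 ;;
    esac
    joined="${joined:+$joined;}$item"           # add ';' only between items
  done
  echo "$joined"
}

join_policy_items "0.9=203.0.113.10" "0.1=203.0.113.20"
# prints 0.9=203.0.113.10;0.1=203.0.113.20
```

The result can be passed directly, for example `--routing-policy-data="$(join_policy_items ...)"`.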
  

Health-Checked Failover

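# Create health check (HTTPS, because the probe targets port 443)
gcloud compute health-checks create https dns-health \
  --port=443 --request-path=/health --check-interval=10s --unhealthy-threshold=3

# Geolocation policy with health checking: when a region's target is unhealthy,
# queries are answered from the next-nearest healthy region.
# This reuses api.example.com. from the weighted example; a name/type pair holds
# only one record set, so delete or update the earlier record first.
gcloud dns record-sets create api.example.com. \
  --zone=example-com \
  --type=A \
  --ttl=60 \
  --routing-policy-type=GEO \
  --routing-policy-data="us-central1=203.0.113.10;europe-west1=198.51.100.10" \
  --enable-health-checking \
  --health-check=dns-health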
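
The health check settings and the record TTL together bound how long clients can keep hitting a failed region. As a rough estimate, detection takes about check interval × unhealthy threshold, and resolvers may serve the cached answer for up to one TTL after that. The sketch below does the arithmetic; it ignores propagation delay inside Cloud DNS, and `failover_window` is a hypothetical helper name.

```shell
# failover_window INTERVAL_S UNHEALTHY_THRESHOLD TTL_S -> approximate worst case in seconds
failover_window() {
  detect=$(( $1 * $2 ))     # time for the health check to mark the target unhealthy
  echo $(( detect + $3 ))   # plus one full TTL of cached answers
}

failover_window 10 3 60    # prints 90
failover_window 10 3 300   # prints 330
```

With the values used in this section (10 s interval, threshold 3, TTL 60) the window is about 90 seconds; a 300-second TTL stretches it to about five and a half minutes.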
  

Shared VPC Multi-Project Architecture

                      GCP Organization
                         │
              ┌──────────┼──────────┐
              │          │          │
         Host Project  Service    Service
         (networking)  Project:   Project:
                       Production  Development
              │          │          │
              └──── Shared VPC ─────┘
                         │
                    Cloud Interconnect
                         │
                   On-Premises DC
  
Project           | Purpose                       | Network
Host (networking) | Shared VPC, Interconnect, DNS | Central hub
Production        | GKE, Cloud SQL, Cloud Run     | Shared VPC subnet
Development       | Dev/test workloads            | Shared VPC subnet (isolated)
Shared Services   | Artifact Registry, logging    | Shared VPC subnet

gcloud compute shared-vpc enable host-project-id
gcloud compute shared-vpc associated-projects add prod-project-id \
  --host-project=host-project-id
gcloud compute shared-vpc associated-projects add dev-project-id \
  --host-project=host-project-id
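
Attaching a service project does not by itself let its users or service accounts place workloads on a shared subnet; they also need the Compute Network User role on that subnet (or on the host project). A minimal sketch, assuming the subnet `learning-subnet-us` in us-central1 and a hypothetical principal; `grant_subnet_access` is an illustrative helper name:

```shell
# grant_subnet_access MEMBER
# Grants roles/compute.networkUser on one shared subnet in the host project.
grant_subnet_access() {
  member="$1"
  gcloud compute networks subnets add-iam-policy-binding learning-subnet-us \
    --project=host-project-id \
    --region=us-central1 \
    --member="${member}" \
    --role=roles/compute.networkUser
}
```

For example, `grant_subnet_access "group:prod-deployers@example.com"` (a made-up group) limits that team to the one subnet rather than the whole host network.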
  

Cloud NAT and Hybrid Egress

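Cloud NAT gives VMs without external IPs in the VPC subnets outbound internet access. It does not translate traffic that arrives from on-premises over VPN or Interconnect, so on-premises hosts still need their own internet egress:

# Cloud NAT for internet egress from the subnet (VPC → internet)
gcloud compute routers nats create hybrid-nat \
  --router=onprem-router \
  --region=us-central1 \
  --nat-custom-subnet-ip-ranges=learning-subnet-us \
  --auto-allocate-nat-external-ips \
  --enable-logging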
  

Real-World Scenario: Global Enterprise Platform

Component               | Configuration
Cloud DNS               | Geolocation routing to us-central1, europe-west1, asia-east1
Cloud CDN               | Global CDN for static assets
External HTTP(S) LB     | Health-checked backends per region
Cloud SQL               | Cross-region read replicas; automated backups
Shared VPC              | 6 service projects on host VPC
Cloud Interconnect      | 10 Gbps dedicated to on-premises ERP
Private Service Connect | Internal payment service accessed by all projects
Cloud VPN               | Backup path over internet (HA VPN, 2 tunnels)

Common Mistakes

  1. VPC peering mesh with 10+ VPCs — use Shared VPC instead
  2. Single VPN tunnel — always configure HA VPN with two tunnels
  3. Overlapping CIDR blocks — plan IP addressing before multi-VPC design (use /16 per VPC)
  4. No VPC Flow Logs on hybrid connections — blind to cross-network traffic
  5. Public endpoints for internal services — use Private Service Connect
  6. Ignoring DNS TTL during failover — lower TTL (60s) before planned failover
  7. Cloud Interconnect without VPN backup — single point of failure
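
Mistake 3 is cheap to catch before anything is deployed. The sketch below checks two IPv4 CIDR blocks for overlap in plain POSIX shell by converting each block to an integer range. The function names are hypothetical, and it assumes well-formed input and a shell with 64-bit arithmetic.

```shell
# ip_to_int A.B.C.D -> 32-bit integer
ip_to_int() {
  set -- $(echo "$1" | tr '.' ' ')
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# cidr_range CIDR -> "first last" (inclusive integer bounds)
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  bits="${1#*/}"
  size=$(( 1 << (32 - bits) ))
  first=$(( base / size * size ))   # align to the network boundary
  echo "$first $(( first + size - 1 ))"
}

# cidrs_overlap CIDR_A CIDR_B -> exit status 0 if the ranges intersect
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")   # firstA lastA firstB lastB
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap 10.0.0.0/16 10.0.128.0/20; then echo overlap; else echo ok; fi   # prints overlap
if cidrs_overlap 10.0.0.0/16 10.1.0.0/16; then echo overlap; else echo ok; fi     # prints ok
```

Running every planned VPC and on-premises range through a pairwise check like this flags conflicts before peering or BGP advertises them.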

Troubleshooting

Issue                         | Check                                                | Fix
VPN tunnel down               | gcloud compute vpn-tunnels describe (status field)   | Verify preshared key and on-prem device config; firewall must allow UDP 500/4500 and ESP (IP protocol 50)
BGP routes not propagating    | Cloud Router BGP peer status                         | Check peer ASN, IP addresses, MD5 auth
PSC endpoint unreachable      | Forwarding rule, IP allocation                       | Verify subnet has available IPs; check firewall allows PSC range
DNS failover not working      | Health check status                                  | Verify health endpoint returns 200; lower TTL
High Interconnect costs       | Egress direction                                     | Use Cloud CDN for user-facing egress; Interconnect for hybrid only
Cross-project traffic blocked | Shared VPC IAM, firewall rules                       | Verify service project attachment and firewall allows traffic
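
The first two checks can be scripted. The sketch below assumes the resource names used earlier in this guide (`tunnel-1`, `tunnel-2`, `onprem-router`, region us-central1); `check_hybrid_links` is a hypothetical helper name.

```shell
# Print tunnel and BGP state for the hybrid links.
check_hybrid_links() {
  for tunnel in tunnel-1 tunnel-2; do
    # status should read ESTABLISHED; detailedStatus explains any failure
    gcloud compute vpn-tunnels describe "$tunnel" \
      --region=us-central1 \
      --format="value(name,status,detailedStatus)"
  done

  # Per-peer BGP session state and learned-route counts
  gcloud compute routers get-status onprem-router \
    --region=us-central1 \
    --format="yaml(result.bgpPeerStatus)"
}
```

A tunnel that is ESTABLISHED while its BGP peer is down usually points at the ASN or link-local address settings rather than at IPsec.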

Best Practices

  • Use Shared VPC as the hub for multi-project organizations
  • Plan non-overlapping CIDR blocks (/16 per VPC) before deployment
  • Implement defense in depth: Cloud Armor → firewall rules → VPC Service Controls
  • Use Private Service Connect for private API and SaaS access
  • Configure Cloud DNS health-checked failover for DR
  • Enable VPC Flow Logs on all production subnets
  • Run HA VPN alongside Cloud Interconnect as backup
  • Document network topology with diagrams updated quarterly
  • Test VPN failover and Interconnect failover regularly
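
As one concrete instance of the list above, VPC Flow Logs are switched on per subnet. A minimal sketch, assuming the subnet lives in the host project; the 0.5 sampling rate and 5-second aggregation interval are illustrative choices, and `enable_flow_logs` is a hypothetical helper name:

```shell
# enable_flow_logs SUBNET REGION
# Turns on VPC Flow Logs for one subnet in the Shared VPC host project.
enable_flow_logs() {
  gcloud compute networks subnets update "$1" \
    --project=host-project-id \
    --region="$2" \
    --enable-flow-logs \
    --logging-flow-sampling=0.5 \
    --logging-aggregation-interval=interval-5-sec
}
```

For example, `enable_flow_logs learning-subnet-us us-central1`. Lowering the sampling rate trades visibility for logging cost on busy subnets.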

Next: Disaster Recovery — backup strategies and multi-region failover.