vSphere Management Guide
Managing VMware vSphere infrastructure well requires an understanding of deployment strategy, operational procedures, monitoring, and established best practices. This guide covers vSphere administration from initial deployment through day-to-day operations and troubleshooting.
Whether you are a new administrator or an experienced vSphere professional, it offers practical guidance for managing enterprise virtualization infrastructure.
Deployment Planning
Careful planning underpins a successful vSphere deployment and stable long-term operations.
Infrastructure Assessment
Pre-Deployment Checklist
- Hardware Compatibility: Verify that all hardware is listed on the VMware Hardware Compatibility List (HCL)
- Network Requirements: Plan VLANs, IP addressing, DNS, NTP
- Storage Design: Determine storage protocols, capacity, performance needs
- Licensing: Calculate required licenses for hosts and features
- Naming Conventions: Establish consistent naming standards
- Security Requirements: Define access control, compliance needs
- Backup Strategy: Plan backup and disaster recovery approach
- Documentation: Create deployment and operational documentation
Design Considerations
- Cluster Design: Number of hosts, DRS/HA configuration, resource pools
- Network Design: Management, vMotion, VM, iSCSI/NFS networks
- Storage Design: RAID levels, multipathing, tiering strategy
- Resource Planning: CPU/memory ratios, over-subscription limits
- Scalability: Growth planning for compute, storage, network
- High Availability: Failure scenarios, recovery objectives (RTO/RPO)
- Disaster Recovery: Off-site replication, backup locations
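The resource-planning items above (CPU/memory ratios and over-subscription limits) can be checked with simple arithmetic. The following is a minimal sketch, not a sizing tool: the `ClusterPlan` fields and the example figures are illustrative assumptions, and the ratios are computed against the capacity that remains after the tolerated number of host failures.

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Hypothetical planning inputs; all field names are illustrative."""
    hosts: int
    cores_per_host: int
    ram_gb_per_host: int
    total_vcpus: int
    total_vram_gb: int
    host_failures_tolerated: int = 1


def overcommit_ratios(plan: ClusterPlan) -> tuple[float, float]:
    """Return (vCPU:pCPU ratio, vRAM:pRAM ratio) with failed hosts removed."""
    usable_hosts = plan.hosts - plan.host_failures_tolerated
    if usable_hosts <= 0:
        raise ValueError("no hosts left after tolerated failures")
    cpu_ratio = plan.total_vcpus / (usable_hosts * plan.cores_per_host)
    mem_ratio = plan.total_vram_gb / (usable_hosts * plan.ram_gb_per_host)
    return cpu_ratio, mem_ratio


if __name__ == "__main__":
    plan = ClusterPlan(hosts=4, cores_per_host=32, ram_gb_per_host=512,
                       total_vcpus=384, total_vram_gb=1152)
    cpu, mem = overcommit_ratios(plan)
    print(f"vCPU:pCPU = {cpu:.2f}, vRAM:pRAM = {mem:.2f}")
```

Compare the resulting ratios with the over-subscription limits chosen for the design; acceptable values depend on the workload.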
vCenter Server Deployment
VCSA Deployment Steps
- Download VCSA ISO: Obtain from VMware downloads portal
- Mount ISO: Extract installer files
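- Run Installer: Launch the GUI installer from the vcsa-ui-installer directory (installer.exe on Windows, or the matching installer for Mac/Linux); vcsa-deploy in the vcsa-cli-installer directory is the separate command-line installer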
- Stage 1 - Deploy Appliance:
  - Target ESXi host or vCenter
  - Appliance sizing (Tiny/Small/Medium/Large/X-Large)
  - Storage configuration
  - Network settings (IP, DNS, gateway)
- Stage 2 - Setup vCenter:
  - SSO configuration (new/join domain)
  - Configure CEIP participation
  - Complete deployment
- Post-Deployment: Access VAMI (port 5480) and vSphere Client
vCenter Sizing Guidelines
| Size | vCPUs | Memory | Hosts | VMs |
|---|---|---|---|---|
| Tiny | 2 | 12 GB | Up to 10 | Up to 100 |
| Small | 4 | 19 GB | Up to 100 | Up to 1,000 |
| Medium | 8 | 28 GB | Up to 400 | Up to 4,000 |
| Large | 16 | 37 GB | Up to 1,000 | Up to 10,000 |
| X-Large | 24 | 56 GB | Up to 2,000 | Up to 35,000 |
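As a small illustration of how to read the table, the sketch below picks the smallest deployment size whose host and VM limits both cover a planned inventory. It uses only the Hosts and VMs columns above; the function name is an assumption made for this example.

```python
# (size name, max hosts, max VMs), taken from the sizing table above
VCSA_SIZES = [
    ("Tiny", 10, 100),
    ("Small", 100, 1_000),
    ("Medium", 400, 4_000),
    ("Large", 1_000, 10_000),
    ("X-Large", 2_000, 35_000),
]


def pick_vcsa_size(hosts: int, vms: int) -> str:
    """Return the smallest size that satisfies both the host and VM limits."""
    for name, max_hosts, max_vms in VCSA_SIZES:
        if hosts <= max_hosts and vms <= max_vms:
            return name
    raise ValueError("inventory exceeds the largest size in the table")


if __name__ == "__main__":
    print(pick_vcsa_size(hosts=8, vms=500))  # VM count pushes this past Tiny
```

Size for expected growth rather than the current inventory, since both limits must hold for the life of the appliance.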
Cluster Configuration
Creating and Configuring Clusters
DRS Configuration
- Set automation level (Manual/Partial/Full)
- Configure migration threshold (Conservative to Aggressive)
- Enable/disable VM automation individually
- Create affinity/anti-affinity rules
- Configure VM-Host rules
- Enable Distributed Power Management (DPM)
HA Configuration
- Enable host monitoring
- Configure admission control policy
- Set VM restart priority
- Configure host isolation response
- Enable VM and application monitoring
- Configure datastore heartbeating
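When choosing an admission control policy, it helps to estimate how much capacity must stay free for failover. The sketch below is an approximation that assumes identically sized hosts, so the reserve is simply the share of hosts allowed to fail. The function names are illustrative and this is not how vCenter itself computes the value.

```python
def failover_reserve_percent(hosts: int, failures_to_tolerate: int = 1) -> float:
    """Share of cluster capacity to keep free, assuming equal-sized hosts."""
    if hosts < 2 or not 0 < failures_to_tolerate < hosts:
        raise ValueError("need at least 2 hosts and 0 < failures < hosts")
    return 100.0 * failures_to_tolerate / hosts


def max_safe_utilization_percent(hosts: int, failures_to_tolerate: int = 1) -> float:
    """Highest steady-state utilization that still leaves the failover reserve."""
    return 100.0 - failover_reserve_percent(hosts, failures_to_tolerate)


if __name__ == "__main__":
    for n in (3, 4, 8):
        print(n, "hosts:", round(failover_reserve_percent(n), 1), "% reserved")
```

The reserve shrinks as the cluster grows, which is one reason very small clusters carry a high failover overhead.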
Enhanced vMotion
- Configure EVC mode for cluster
- Enable vMotion on VMkernel adapters
- Configure vMotion network on all hosts
- Set number of concurrent vMotions
- Configure encrypted vMotion
- Test vMotion functionality
Fault Tolerance
- Configure FT logging network
- Verify CPU compatibility
- Enable FT on critical VMs
- Test FT failover
- Monitor FT bandwidth usage
- Document FT-protected VMs
Network Management
Standard Switch Configuration
- Create vSwitch: Add virtual switches per host
- Add Uplinks: Connect physical NICs to vSwitch
- Create Port Groups: Define VLAN-tagged networks
- Configure Security: Set promiscuous mode, MAC changes policies
- Setup NIC Teaming: Configure load balancing and failover
- Traffic Shaping: Set bandwidth limits if needed
Distributed Switch Configuration
- Create vDS: Define version and uplink count in vCenter
- Add Hosts: Add ESXi hosts to distributed switch
- Migrate Physical Adapters: Move uplinks from standard to distributed switch
- Create Port Groups: Define distributed port groups
- Migrate VM Networking: Move VMs to distributed port groups
- Configure Advanced Features: NetFlow, LACP, port mirroring
- Enable Health Check: Monitor configuration consistency
Network Best Practices
Networking Recommendations
- Separate management, vMotion, VM, and storage networks
- Use at least two physical NICs for redundancy
- Configure VLAN trunking on physical switch ports
- Enable jumbo frames for vMotion and iSCSI/NFS
- Use LACP for optimal load balancing on vDS
- Implement Network I/O Control for QoS
- Document VLAN assignments and IP schemes
- Test failover scenarios regularly
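Several of these recommendations can be checked mechanically against a documented network plan. The following sketch assumes a hypothetical plan format (a dict keyed by traffic type with `vlan`, `mtu`, and `uplinks` entries) and treats an MTU of 9000 as the jumbo-frame target; adjust both to match your own documentation.

```python
JUMBO_MTU = 9000                         # assumed jumbo-frame MTU
JUMBO_TRAFFIC = {"vmotion", "storage"}   # traffic types expected to use it


def check_network_plan(plan: dict) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    issues = []
    vlan_users: dict[int, list[str]] = {}
    for traffic, net in plan.items():
        vlan_users.setdefault(net["vlan"], []).append(traffic)
        if len(net["uplinks"]) < 2:
            issues.append(f"{traffic}: fewer than two uplinks")
        if traffic in JUMBO_TRAFFIC and net["mtu"] < JUMBO_MTU:
            issues.append(f"{traffic}: MTU {net['mtu']} is below {JUMBO_MTU}")
    # Traffic types should not share a VLAN
    for vlan, users in sorted(vlan_users.items()):
        if len(users) > 1:
            issues.append(f"VLAN {vlan} shared by {', '.join(sorted(users))}")
    return issues


if __name__ == "__main__":
    plan = {
        "management": {"vlan": 10, "mtu": 1500, "uplinks": ["vmnic0", "vmnic1"]},
        "vmotion": {"vlan": 10, "mtu": 1500, "uplinks": ["vmnic2"]},
    }
    for issue in check_network_plan(plan):
        print(issue)
```

Running a check like this whenever the plan changes keeps the documented VLAN and IP scheme consistent with the recommendations above.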
Storage Management
VMFS Datastore Creation
- Present LUN from storage array to ESXi hosts
- Rescan storage adapters on all hosts
- Navigate to Storage > New Datastore in vSphere Client
- Select VMFS as datastore type
- Select device/LUN for datastore
- Choose VMFS version (VMFS6 recommended)
- Configure partition size
- Mount datastore on all cluster hosts
NFS Datastore Configuration
- Configure NFS export on storage server
- Add NFS mount point on ESXi hosts
- Specify NFS server IP and export path
- Choose NFS version (3 or 4.1)
- Set datastore name
- Configure advanced options (auth, delegation)
- Verify connectivity and accessibility
Multipathing Configuration
Path Selection Policies
- Fixed: Use designated preferred path
- Most Recently Used (MRU): Use path until failure
- Round Robin (RR): Rotate I/O across all active paths
- Custom: Vendor-specific multipathing
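The difference between the first three policies is easiest to see in a toy model. The sketch below is a simplified simulation, not VMware's implementation: paths are plain strings, `alive` is the set of currently working paths, and the class names are assumptions for illustration.

```python
class Fixed:
    """Use the preferred path whenever it is alive; fail back when it returns."""
    def __init__(self, paths, preferred):
        self.paths, self.preferred = paths, preferred

    def select(self, alive):
        if self.preferred in alive:
            return self.preferred
        return _first_alive(self.paths, alive)


class MostRecentlyUsed:
    """Stay on the current path until it fails; do not fail back."""
    def __init__(self, paths):
        self.paths, self.current = paths, paths[0]

    def select(self, alive):
        if self.current not in alive:
            self.current = _first_alive(self.paths, alive)
        return self.current


class RoundRobin:
    """Rotate I/O across all alive paths."""
    def __init__(self, paths):
        self.paths, self.index = paths, 0

    def select(self, alive):
        for _ in range(len(self.paths)):
            path = self.paths[self.index % len(self.paths)]
            self.index += 1
            if path in alive:
                return path
        raise RuntimeError("no active path")


def _first_alive(paths, alive):
    path = next((p for p in paths if p in alive), None)
    if path is None:
        raise RuntimeError("no active path")
    return path


if __name__ == "__main__":
    mru = MostRecentlyUsed(["A", "B"])
    # Path A fails, then recovers: MRU stays on B after the failover
    print([mru.select(s) for s in ({"A", "B"}, {"B"}, {"A", "B"})])
```

The key behavioral difference is failback: in this model Fixed returns to the preferred path once it recovers, while MRU stays where it is.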
Virtual Machine Management
VM Creation Best Practices
- Right-sizing: Allocate appropriate CPU/RAM (avoid over-provisioning)
- Virtual Hardware: Use latest compatible VM hardware version
- Disk Format: Thin provisioning for development, thick for production
- SCSI Controller: Use Paravirtual (PVSCSI) for best performance
- Network Adapter: Use VMXNET3 for optimal network performance
- VMware Tools: Install immediately after OS installation
- Resource Reservations: Set only for business-critical VMs
- Limits: Avoid setting limits unless absolutely necessary
Template Management
- Build Master VM: Install and configure OS
- Install Updates: Apply all OS patches
- Install VMware Tools: Ensure latest version
- Configure for Cloning: Remove unique identifiers
- Sysprep (Windows): Generalize Windows installation
- Shutdown VM: Power off before conversion
- Convert to Template: Right-click > Template > Convert to Template
- Store in Content Library: Import to content library for sharing
Snapshot Management
Snapshot Best Practices
- Snapshots are not backups; use them only as temporary restore points
- Delete snapshots after testing/verification
- Avoid snapshots on production database VMs
- Monitor snapshot disk space consumption
- Limit snapshot retention to 24-72 hours
- Avoid multiple snapshots per VM
- Schedule snapshot deletion during maintenance windows
- Use scripts to identify old snapshots
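A script for finding old snapshots needs only each snapshot's creation time. The sketch below shows the filtering logic on plain records and uses the 72-hour upper bound from the list above as its default. The record layout (`vm`, `name`, `created`) is a hypothetical format; in practice you would fill it from your inventory tooling.

```python
from datetime import datetime, timedelta, timezone


def find_stale_snapshots(snapshots, now, max_age_hours=72):
    """Return snapshots older than max_age_hours, oldest first."""
    cutoff = now - timedelta(hours=max_age_hours)
    stale = [s for s in snapshots if s["created"] < cutoff]
    return sorted(stale, key=lambda s: s["created"])


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    inventory = [
        {"vm": "app01", "name": "pre-upgrade",
         "created": datetime(2024, 1, 6, 12, 0, tzinfo=timezone.utc)},
        {"vm": "db01", "name": "test",
         "created": datetime(2024, 1, 10, 6, 0, tzinfo=timezone.utc)},
    ]
    for snap in find_stale_snapshots(inventory, now):
        age = now - snap["created"]
        print(f"{snap['vm']}: {snap['name']} is {age.days} days old")
```

Use timezone-aware timestamps throughout so that ages are not skewed by the host's local time.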
Backup and Recovery
vCenter Backup Strategy
- File-Based Backup: Use VAMI (port 5480) native backup
- Backup Schedule: Schedule backups at least daily; the native file-based backup always takes a full backup, not an incremental one
- Backup Location: Network share (NFS, SMB, FTP, HTTP)
- Encryption: Enable encryption for sensitive data
- Retention: Maintain multiple restore points
- Test Restores: Regularly verify backup integrity
- Document Procedure: Maintain recovery documentation
ESXi Configuration Backup
- Use vCenter to back up host configuration
- Schedule automated backup via PowerCLI scripts
- Export configuration with: vim-cmd hostsvc/firmware/backup_config
- Download backup bundle from ESXi web interface
- Store backups in version-controlled repository
- Test restoration on non-production hosts
VM Backup Solutions
Agent-Based Backup
- Install backup agent in guest OS
- Application-aware backups
- File-level recovery
- Works with physical and virtual
Image-Based Backup
- VADP (vStorage APIs for Data Protection)
- VM-level granularity
- Changed Block Tracking (CBT)
- Minimal VM impact
Replication
- vSphere Replication
- Site Recovery Manager
- Array-based replication
- Near real-time protection
Third-Party Solutions
- Veeam Backup & Replication
- Commvault Complete Backup
- Veritas NetBackup
- Dell EMC Avamar
Performance Monitoring
Key Performance Metrics
| Resource | Metric | Healthy Range | Action Threshold |
|---|---|---|---|
| CPU | Usage % | < 80% | > 90% sustained |
| CPU | Ready % | < 5% | > 10% sustained |
| Memory | Usage % | < 90% | > 95% sustained |
| Memory | Ballooning | 0 MB | > 0 MB indicates pressure |
| Storage | Latency | < 15ms | > 20ms sustained |
| Network | Throughput | < 80% capacity | > 90% sustained |
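The thresholds in this table translate directly into a simple classifier. The sketch below is illustrative only: the metric keys are names invented for this example, values between the healthy range and the action threshold are labeled "watch", and the "sustained" qualifier is not modeled because the function looks at one sample at a time.

```python
# metric key -> (healthy below, action above), from the table above
THRESHOLDS = {
    "cpu_usage_pct": (80, 90),
    "cpu_ready_pct": (5, 10),
    "mem_usage_pct": (90, 95),
    "storage_latency_ms": (15, 20),
    "net_util_pct": (80, 90),
}


def classify(metric: str, value: float) -> str:
    """Return 'healthy', 'watch', or 'action' for a single sample."""
    if metric == "balloon_mb":
        # Any ballooning at all indicates memory pressure
        return "healthy" if value == 0 else "action"
    healthy_below, action_above = THRESHOLDS[metric]
    if value > action_above:
        return "action"
    if value < healthy_below:
        return "healthy"
    return "watch"


if __name__ == "__main__":
    sample = {"cpu_ready_pct": 7.5, "storage_latency_ms": 25, "balloon_mb": 0}
    for metric, value in sample.items():
        print(f"{metric}={value}: {classify(metric, value)}")
```

To honor the "sustained" condition, apply the classifier to a rolling window of samples and act only when most of them land in "action".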
esxtop Performance Tool
# Start esxtop from the ESXi Shell or an SSH session
esxtop
# CPU statistics
Press 'c' for CPU view
Key metrics: %USED, %RDY, %CSTP
# Memory statistics
Press 'm' for memory view
Key metrics: MEMSZ, GRANT, %ACTV, MCTLSZ
# Network statistics
Press 'n' for network view
Key metrics: MbTX/s, MbRX/s, %DRPTX, %DRPRX
# Storage statistics
Press 'd' for disk adapter view ('u' for disk device view, 'v' for disk VM view)
Key metrics: CMDS/s, READS/s, WRITES/s, DAVG/cmd
Update and Patch Management
vSphere Lifecycle Manager Workflow
- Import Image: Import vendor-provided ESXi image
- Create Cluster Image: Define desired cluster image
- Check Compliance: Scan cluster against desired state
- Remediate Cluster: Apply updates to non-compliant hosts
- Automated Process:
  - Put host in maintenance mode
  - Evacuate VMs with vMotion
  - Apply updates and reboot
  - Verify host health
  - Exit maintenance mode
  - Proceed to next host
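The automated process above is a rolling loop that works on one host at a time. The sketch below models that control flow with plain callbacks and is not the Lifecycle Manager API: `apply_update` and `is_healthy` are hypothetical hooks. Stopping at the first unhealthy host is a design choice of this sketch, not a claim about vLCM's behavior.

```python
def remediate_cluster(hosts, apply_update, is_healthy):
    """Walk hosts one at a time and return the list of (host, step) events."""
    events = []
    for host in hosts:
        events.append((host, "enter maintenance mode"))
        events.append((host, "evacuate VMs with vMotion"))
        apply_update(host)
        events.append((host, "apply updates and reboot"))
        if not is_healthy(host):
            # Leave the host in maintenance mode and halt the rollout
            events.append((host, "health check failed, stopping"))
            return events
        events.append((host, "exit maintenance mode"))
    return events


if __name__ == "__main__":
    updated = []
    log = remediate_cluster(["esx01", "esx02"], updated.append, lambda h: True)
    for host, step in log:
        print(f"{host}: {step}")
```

Processing hosts serially keeps the rest of the cluster available to absorb the evacuated VMs, so remediation depends on spare capacity for at least one host.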
Patching Best Practices
- Review release notes for each patch
- Test patches in non-production environment
- Create baseline snapshots before patching
- Schedule maintenance windows
- Patch vCenter before ESXi hosts
- Patch one host at a time initially
- Monitor for issues after patching
- Document patch levels and changes
Security Management
Access Control
Security Best Practices
- Least Privilege: Grant minimum required permissions
- Role-Based Access: Use built-in and custom roles
- Active Directory: Integrate with AD for centralized authentication
- Lockdown Mode: Enable on ESXi hosts managed by vCenter
- Certificate Management: Replace default certificates
- Session Timeouts: Configure appropriate timeout values
- Audit Logging: Enable and monitor audit logs
- VM Encryption: Encrypt sensitive VM workloads
vCenter Roles
| Role | Permissions | Use Case |
|---|---|---|
| Administrator | Full control | vSphere administrators |
| Read-Only | View only | Auditors, monitoring systems |
| Virtual Machine Power User | VM operations | Application administrators |
| Virtual Machine User | Limited VM interaction | End users |
| Resource Pool Administrator | Resource pool management | Department administrators |
| Datastore Consumer | Allocate space | VM deployment users |
Troubleshooting
Common Issues and Solutions
VM Performance
- Check CPU ready time
- Verify memory ballooning
- Review storage latency
- Check network drops
- Update VMware Tools
- Right-size VM resources
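vCenter performance charts report CPU ready as a summation in milliseconds, while guidance such as "under 5%" is expressed as a percentage. The conversion divides the ready time by the length of the sampling interval. The sketch below assumes the 20-second interval of real-time charts as its default; pass the actual interval for historical charts. The function name is illustrative.

```python
def cpu_ready_percent(ready_ms: float, interval_seconds: float = 20) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return ready_ms / (interval_seconds * 1000) * 100


if __name__ == "__main__":
    # 1,000 ms of ready time within a 20 s real-time sample
    print(f"{cpu_ready_percent(1000):.1f}%")
```

For a multi-vCPU VM the summation can cover all vCPUs, so consider dividing the result by the vCPU count before comparing it with a per-vCPU threshold.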
vMotion Failures
- Verify network connectivity
- Check EVC compatibility
- Review resource availability
- Validate shared storage access
- Check for snapshots
- Review vMotion logs
HA Failures
- Verify host heartbeats
- Check admission control
- Review isolation addresses
- Validate datastore access
- Check master election
- Review fdm.log files
Storage Issues
- Rescan storage adapters
- Check multipath status
- Verify LUN presentation
- Review storage latency
- Check for APD/PDL conditions
- Validate array health
Log Files Reference
Important Log Locations
- /var/log/vmkernel.log: VMkernel events and errors
- /var/log/hostd.log: Host management service logs
- /var/log/vpxa.log: vCenter agent logs
- /var/log/vmware/: Various service-specific logs
- vmware.log: Per-VM logs in VM directory
- vCenter logs: /var/log/vmware/ on VCSA
Capacity Planning
- Monitor Trends: Track resource utilization over time
- Peak Usage: Identify peak demand periods
- Growth Rate: Calculate VM growth rate
- Resource Pools: Allocate resources per business unit
- Forecasting: Project future capacity needs
- Budget Planning: Plan hardware refresh cycles
- Right-sizing: Identify oversized VMs
- Consolidation Ratio: Track VMs per host ratio
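Growth rate, forecasting, and consolidation ratio all come down to simple arithmetic on inventory counts. The sketch below assumes linear growth from monthly VM counts, which is a simplification since real growth is rarely linear. The function names and sample figures are illustrative.

```python
import math


def average_monthly_growth(vm_counts: list[int]) -> float:
    """Average change in VM count per month across the history."""
    if len(vm_counts) < 2:
        raise ValueError("need at least two monthly samples")
    return (vm_counts[-1] - vm_counts[0]) / (len(vm_counts) - 1)


def months_until_full(current_vms: int, monthly_growth: float, max_vms: int):
    """Whole months until max_vms is reached, or None if there is no growth."""
    if current_vms >= max_vms:
        return 0
    if monthly_growth <= 0:
        return None
    return math.ceil((max_vms - current_vms) / monthly_growth)


def consolidation_ratio(vms: int, hosts: int) -> float:
    """Average number of VMs per host."""
    return vms / hosts


if __name__ == "__main__":
    history = [200, 210, 220, 230]  # VM count at the end of each month
    growth = average_monthly_growth(history)
    print("growth/month:", growth)
    print("months until 300 VMs:", months_until_full(history[-1], growth, 300))
    print("VMs per host:", consolidation_ratio(history[-1], 10))
```

Feed the projected exhaustion date into budget planning so that hardware orders allow for procurement lead time.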
Documentation and Procedures
Essential Documentation
- Network diagrams with VLANs and IP assignments
- Storage architecture and LUN mappings
- Host and cluster configurations
- Disaster recovery procedures
- Backup and restore procedures
- Change management processes
- Contact lists for escalation
- Vendor support information
- License inventory and renewal dates
- Compliance and audit requirements
Note: Effective vSphere management requires continuous learning and adaptation to new features and best practices. Stay current with VMware documentation, community forums, and training resources to optimize your virtualization infrastructure.