# Amazon RDS Failover Events & Automatic Failover Mechanism

Amazon **Relational Database Service (RDS)** is designed for **high availability** and **automatic failover** to ensure minimal downtime during failures.

🔹 **When does RDS perform automatic failover?**\
✔ When a **Multi-AZ RDS deployment** detects a failure, **Amazon RDS automatically promotes the standby replica** to the primary instance.

🔹 **What happens during failover?**\
✔ The **standby replica becomes the new primary**.\
✔ The **CNAME (database endpoint) automatically updates** to point to the new primary.\
✔ Applications reconnect using the same **database endpoint** without manual intervention.

***

### **📌 Events That Trigger an Automatic RDS Failover**

Amazon RDS automatically performs failover in **the following scenarios**:

| **Event Type**                      | **Description**                                                                                          |
| ----------------------------------- | -------------------------------------------------------------------------------------------------------- |
| **Primary DB instance failure**     | The primary instance crashes due to an OS, hardware, or database engine issue.                           |
| **Network connectivity loss**       | RDS detects the primary instance is **unreachable** due to **network failures**.                         |
| **Availability Zone (AZ) failure**  | The AWS **AZ hosting the primary instance** becomes unavailable due to outages.                          |
| **Software or hardware failure**    | The database server experiences an **operating system crash**, storage failure, or instance-level issue. |
| **Planned maintenance or patching** | AWS **performs automatic patching** or maintenance that requires a restart.                              |
| **Manual failover initiation**      | A user manually triggers a failover using the AWS Console or CLI.                                        |
| **Storage volume failure**          | The primary instance’s **EBS storage volume fails**, triggering an automatic failover to the standby.    |

***

### **📌 SecureCart’s Amazon RDS Failover Strategy**

🔹 **Business Requirement:** SecureCart ensures that **customer orders, inventory, and transactions** are **always available**, even in the event of an RDS failure.

🔹 **How SecureCart Uses RDS Failover:**\
✔ **Deploys Multi-AZ RDS for high availability.**\
✔ **Uses Route 53 health checks to monitor RDS availability.**\
✔ **Implements database connection retry logic in applications.**\
✔ **Logs failover events in Amazon CloudWatch for real-time monitoring.**

✅ **Example Setup for SecureCart:**

* **Primary RDS Instance:** `db-securecart-primary (us-east-1a)`
* **Standby RDS Replica:** `db-securecart-standby (us-east-1b)`
* **Database Endpoint:** `securecart-db.cluster-xyz.us-east-1.rds.amazonaws.com`
* **Failover Process:**
  1. The primary **fails** (e.g., AZ outage).
  2. AWS **automatically promotes** the standby.
  3. The **database endpoint updates** to the new primary.
  4. SecureCart applications **automatically reconnect** to the new instance.

***

### **📌 Best Practices for SecureCart’s RDS Failover**

✅ **Use Multi-AZ RDS** for automatic failover capability.\
✅ **Implement database connection pooling** to minimize downtime.\
✅ **Use read replicas for performance, but not failover (for RDS except Aurora).**\
✅ **Monitor RDS failover events** using **Amazon CloudWatch & AWS EventBridge**.\
✅ **Automate failover testing** in a **staging environment** to ensure smooth transitions.

***

### **📌 Summary**

🚀 **SecureCart ensures database availability with:**\
✔ **Multi-AZ RDS failover for high availability**\
✔ **Automatic CNAME updates for seamless application recovery**\
✔ **CloudWatch monitoring for proactive failover detection**

### **Why Only Options D and E are Correct for Amazon RDS Automatic Failover?**

Amazon RDS **Multi-AZ** deployments are designed for **high availability (HA)**, meaning that **failover occurs only when the primary database is impacted**.

#### **🔹 Understanding Amazon RDS Failover Scenarios**

Failover happens **ONLY when the primary database becomes unavailable**, such as:\
✔ **Primary DB storage failure (Option D)**\
✔ **Loss of availability in the primary Availability Zone (Option E)**

***

#### **📌 Explanation of Each Option:**

| **Option**                                           | **Explanation**                                                                                                                                          | **Does it trigger failover?** |
| ---------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- |
| **A. Read Replica failure**                          | Read Replicas are used for **performance scaling**, not for high availability. A failure does not affect the primary instance.                           | ❌ No Failover                 |
| **B. Compute unit failure on secondary DB instance** | The secondary (standby) instance is not actively used until failover occurs. A failure of the standby instance **does not impact the primary instance**. | ❌ No Failover                 |
| **C. Storage failure on secondary DB instance**      | Similar to option B, **Multi-AZ RDS does not failover if the standby instance fails**. AWS **recreates a new standby automatically**.                    | ❌ No Failover                 |
| **✅ D. Storage failure on primary DB instance**      | If the **primary storage volume fails**, AWS **fails over to the standby** in another AZ.                                                                | ✅ Yes, Triggers Failover      |
| **✅ E. Loss of availability in primary AZ**          | If the **entire AZ hosting the primary instance goes down**, failover happens to a **standby in another AZ**.                                            | ✅ Yes, Triggers Failover      |

***

#### **🔹 Why Only D and E?**

✔ Failover only occurs if the **PRIMARY instance is affected**.\
✔ The **standby instance does not impact failover**—AWS will recreate it automatically.\
✔ **Read Replicas are not part of Multi-AZ failover**—they serve a different purpose.

***

#### **📌 Key Takeaways for SecureCart**

✅ **Ensure SecureCart’s production databases use Multi-AZ RDS for high availability.**\
✅ **Use Read Replicas for read-heavy workloads, but not for failover.**\
✅ **Monitor CloudWatch metrics (e.g., `DatabaseConnections`, `WriteLatency`) to detect primary failures.**\
✅ **Design applications to handle failover by retrying connections to the database endpoint.**
