# Amazon RDS Failover Events & Automatic Failover Mechanism

Amazon **Relational Database Service (RDS)** is designed for **high availability** and **automatic failover** to ensure minimal downtime during failures.

🔹 **When does RDS perform automatic failover?**\
✔ When a **Multi-AZ RDS deployment** detects a failure, **Amazon RDS automatically promotes the standby replica** to the primary instance.

🔹 **What happens during failover?**\
✔ The **standby replica becomes the new primary**.\
✔ The **DNS record (CNAME) behind the database endpoint is updated automatically** to point to the new primary.\
✔ Existing connections are dropped; applications reconnect using the same **database endpoint** without manual intervention.

***

### **📌 Events That Trigger an Automatic RDS Failover**

Amazon RDS automatically performs failover in **the following scenarios**:

| **Event Type**                      | **Description**                                                                                          |
| ----------------------------------- | -------------------------------------------------------------------------------------------------------- |
| **Primary DB instance failure**     | The primary instance crashes due to an OS, hardware, or database engine issue.                           |
| **Network connectivity loss**       | RDS detects the primary instance is **unreachable** due to **network failures**.                         |
| **Availability Zone (AZ) failure**  | The AWS **AZ hosting the primary instance** becomes unavailable due to outages.                          |
| **Software or hardware failure**    | The database server experiences an **operating system crash**, storage failure, or instance-level issue. |
| **Planned maintenance or patching** | AWS **performs automatic patching** or maintenance that requires a restart.                              |
| **Manual failover initiation**      | A user manually triggers a failover (a **reboot with failover**) using the AWS Console or CLI.           |
| **Storage volume failure**          | The primary instance’s **EBS storage volume fails**, triggering an automatic failover to the standby.    |
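
The manual failover row can be illustrated with a short sketch. It assumes the boto3 RDS client, whose `reboot_db_instance` call accepts `ForceFailover=True` for Multi-AZ deployments. The helper name `force_failover` and the instance identifier `securecart-db` are hypothetical and not part of the original setup.

```python
from typing import Any, Dict


def force_failover(client: Any, db_instance_id: str) -> Dict[str, Any]:
    """Reboot a Multi-AZ DB instance and force a failover to its standby.

    `client` is expected to behave like boto3.client("rds").
    """
    # ForceFailover=True is only valid for Multi-AZ deployments.
    return client.reboot_db_instance(
        DBInstanceIdentifier=db_instance_id,
        ForceFailover=True,
    )


# Typical use (needs boto3 and AWS credentials; identifier is hypothetical):
#   import boto3
#   force_failover(boto3.client("rds"), "securecart-db")
```

Passing the client in as a parameter keeps the helper easy to exercise in a staging test with a stub client.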

***

### **📌 SecureCart’s Amazon RDS Failover Strategy**

🔹 **Business Requirement:** SecureCart ensures that **customer orders, inventory, and transactions** are **always available**, even in the event of an RDS failure.

🔹 **How SecureCart Uses RDS Failover:**\
✔ **Deploys Multi-AZ RDS for high availability.**\
✔ **Uses Route 53 health checks to monitor RDS availability.**\
✔ **Implements database connection retry logic in applications.**\
✔ **Logs failover events in Amazon CloudWatch for real-time monitoring.**

✅ **Example Setup for SecureCart:**

* **Primary RDS Instance:** `db-securecart-primary (us-east-1a)`
* **Standby RDS Replica:** `db-securecart-standby (us-east-1b)`
* **Database Endpoint:** `securecart-db.cluster-xyz.us-east-1.rds.amazonaws.com`
* **Failover Process:**
  1. The primary **fails** (e.g., AZ outage).
  2. AWS **automatically promotes** the standby.
  3. The **database endpoint updates** to the new primary.
  4. SecureCart applications **automatically reconnect** to the new instance.
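
Step 4 depends on the application retrying its connection while the endpoint switches over. The following is a minimal sketch of that retry logic with exponential backoff, not SecureCart's actual implementation. `connect_with_retry` and its parameters are hypothetical, and real database drivers raise their own exception types (pass them via `retry_on`).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, backing off exponentially."""
    for attempt in range(attempts):
        try:
            # connect() should open a fresh connection to the DB endpoint,
            # so each retry resolves the endpoint's DNS name again.
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as 1s, 2s, 4s, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting.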

***

### **📌 Best Practices for SecureCart’s RDS Failover**

✅ **Use Multi-AZ RDS** for automatic failover capability.\
✅ **Implement database connection pooling** to minimize downtime.\
✅ **Use read replicas for read scaling, not for automatic failover** (Aurora is the exception: its replicas can serve as failover targets).\
✅ **Monitor RDS failover events** using **Amazon CloudWatch & Amazon EventBridge**.\
✅ **Automate failover testing** in a **staging environment** to ensure smooth transitions.
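
As one way to check for failover events, for example after a staging failover test, the sketch below queries the RDS event log. It assumes the boto3 RDS client and its `describe_events` call with the `failover` event category. The helper name `recent_failover_messages` is hypothetical, and only the first page of results is read.

```python
from typing import Any, List


def recent_failover_messages(
    client: Any, db_instance_id: str, minutes: int = 60
) -> List[str]:
    """Return messages of failover-category events for one DB instance.

    `client` is expected to behave like boto3.client("rds").
    """
    response = client.describe_events(
        SourceIdentifier=db_instance_id,
        SourceType="db-instance",
        EventCategories=["failover"],
        Duration=minutes,  # look-back window in minutes
    )
    # Only the first page is read; follow the response's Marker for more.
    return [event["Message"] for event in response.get("Events", [])]
```

A staging test could trigger a failover, wait, and then assert that this list is not empty.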

***

### **📌 Summary**

🚀 **SecureCart ensures database availability with:**\
✔ **Multi-AZ RDS failover for high availability**\
✔ **Automatic CNAME updates for seamless application recovery**\
✔ **CloudWatch monitoring for proactive failover detection**

### **Why Only Options D and E are Correct for Amazon RDS Automatic Failover?**

Amazon RDS **Multi-AZ** deployments are designed for **high availability (HA)**, meaning that **failover occurs only when the primary database is impacted**.

#### **🔹 Understanding Amazon RDS Failover Scenarios**

Among the options given, failover is triggered **ONLY by events that make the primary database unavailable**, namely:\
✔ **Primary DB storage failure (Option D)**\
✔ **Loss of availability in the primary Availability Zone (Option E)**

***

#### **📌 Explanation of Each Option:**

| **Option**                                           | **Explanation**                                                                                                                                          | **Does it trigger failover?** |
| ---------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- |
| **A. Read Replica failure**                          | Read Replicas are used for **performance scaling**, not for high availability. A failure does not affect the primary instance.                           | ❌ No Failover                 |
| **B. Compute unit failure on secondary DB instance** | The secondary (standby) instance is not actively used until failover occurs. A failure of the standby instance **does not impact the primary instance**. | ❌ No Failover                 |
| **C. Storage failure on secondary DB instance**      | Similar to option B, **Multi-AZ RDS does not fail over if the standby instance fails**. AWS **recreates a new standby automatically**.                   | ❌ No Failover                 |
| **✅ D. Storage failure on primary DB instance**      | If the **primary storage volume fails**, AWS **fails over to the standby** in another AZ.                                                                | ✅ Yes, Triggers Failover      |
| **✅ E. Loss of availability in primary AZ**          | If the **entire AZ hosting the primary instance goes down**, failover happens to a **standby in another AZ**.                                            | ✅ Yes, Triggers Failover      |

***

#### **🔹 Why Only D and E?**

✔ Failover only occurs if the **PRIMARY instance is affected**.\
✔ A **standby instance failure does not trigger failover**; AWS recreates the standby automatically.\
✔ **Read Replicas are not part of Multi-AZ failover**—they serve a different purpose.

***

#### **📌 Key Takeaways for SecureCart**

✅ **Ensure SecureCart’s production databases use Multi-AZ RDS for high availability.**\
✅ **Use Read Replicas for read-heavy workloads, but not for failover.**\
✅ **Monitor CloudWatch metrics (e.g., `DatabaseConnections`, `WriteLatency`) to detect primary failures.**\
✅ **Design applications to handle failover by retrying connections to the database endpoint.**
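
To make the CloudWatch takeaway concrete, here is a hedged sketch that builds the arguments for a `WriteLatency` alarm on one DB instance. It assumes the boto3 CloudWatch client's `put_metric_alarm` call and the `AWS/RDS` namespace. The function name, alarm name, threshold, and evaluation window are illustrative assumptions, not recommended values.

```python
from typing import Any, Dict


def write_latency_alarm(
    db_instance_id: str, threshold_seconds: float = 0.1
) -> Dict[str, Any]:
    """Build put_metric_alarm arguments for sustained high WriteLatency."""
    return {
        "AlarmName": f"{db_instance_id}-write-latency-high",
        "Namespace": "AWS/RDS",
        "MetricName": "WriteLatency",  # reported in seconds
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "Statistic": "Average",
        "Period": 60,            # one-minute datapoints
        "EvaluationPeriods": 3,  # alarm after three breaching periods
        "Threshold": threshold_seconds,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Typical use (needs boto3 and AWS credentials; identifier is hypothetical):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **write_latency_alarm("securecart-db"))
```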

