
DATA DESIGN ASSIGNMENT

Database Diagram:
Q. Explain searching performance. How will you handle replication in SQL for
searching and reporting?

Searching Performance:
Efficient searching performance is crucial for an E-Commerce application. Strategies include:

- Indexing: Index the columns used in search queries to speed up retrieval. Assuming
common search criteria involve fields like name, price, and discount, you can create
indexes on these columns. Additionally, if you often search by product_id, declare it
as the primary key, which creates an index on this column automatically.
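
The effect of an index on a search column can be sketched as follows. This is a minimal illustration that uses SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL; the table layout and the index name idx_product_name are assumptions made for the example, not part of the actual schema.

```python
import sqlite3

# Hypothetical product table; column names follow the search criteria named above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"  # the primary key is indexed automatically
    " name TEXT, price REAL, discount REAL)"
)
conn.executemany(
    "INSERT INTO product (name, price, discount) VALUES (?, ?, ?)",
    [(f"item-{i}", i * 1.5, i % 10) for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

search = "SELECT product_id FROM product WHERE name = 'item-42'"
plan_before = query_plan(search)  # no index on name yet: full table scan
conn.execute("CREATE INDEX idx_product_name ON product (name)")
plan_after = query_plan(search)   # the planner now uses idx_product_name
```

Comparing plan_before and plan_after shows the planner switching from a table scan to an index lookup. MySQL offers the same kind of check through its own EXPLAIN statement.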

- Caching: Implement caching mechanisms to store frequently accessed data and
reduce database queries.
Redis can be used as a caching layer alongside a MySQL database to improve overall
application performance by reducing the load on the database and speeding up access
to frequently accessed data. Here is how Redis caching can be integrated with
MySQL.
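
One common way to combine a cache with a database is the cache-aside pattern. The sketch below is only an illustration: a Python dict stands in for Redis and an in-memory SQLite database stands in for MySQL, and the function names, key format, and TTL value are assumptions made for the example.

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")  # stand-in for MySQL
db.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO product VALUES (1, 'Keyboard')")

cache = {}                 # stand-in for Redis: key -> (expires_at, value)
TTL_SECONDS = 60.0         # assumed time-to-live for cached entries
stats = {"db_reads": 0}    # counts how often the database is actually queried

def get_product_name(product_id):
    """Cache-aside read: try the cache, fall back to the database, then cache."""
    key = f"product:{product_id}"
    entry = cache.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1]  # cache hit
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT name FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()
    value = row[0] if row else None
    cache[key] = (time.monotonic() + TTL_SECONDS, value)
    return value

def rename_product(product_id, name):
    """Write path: update the database, then invalidate the cached entry."""
    db.execute("UPDATE product SET name = ? WHERE product_id = ?", (name, product_id))
    cache.pop(f"product:{product_id}", None)
```

Repeated calls to get_product_name(1) reach the database only once until the entry expires or rename_product invalidates it.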

We can insert Redis Enterprise between the application and the MySQL database
management system without disrupting the application. Redis Connect enables real-time
event streaming, transformation, and propagation of changed-data events from
various data platforms to Redis Enterprise.

Queries on secondary indexes can be time-consuming in MySQL, depending on the table
structure and data volume. Redis is commonly used for secondary indexing to build
relationships between records and to run queries beyond primary keys in real
time, while the raw data stays in MySQL.
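
The secondary-indexing idea can be sketched without a Redis server. In the illustration below, a dict of Python sets plays the role of Redis sets (one set of ids per field value), and intersecting those sets corresponds to what the Redis SINTER command does. The sample records and the key format are hypothetical.

```python
from collections import defaultdict

# Raw records, as they would live in MySQL (hypothetical sample data).
products = {
    1: {"name": "Keyboard", "category": "electronics", "brand": "acme"},
    2: {"name": "Mouse", "category": "electronics", "brand": "globex"},
    3: {"name": "Desk", "category": "furniture", "brand": "acme"},
}

# One set of product ids per "field:value" key, similar to one Redis set per key.
index = defaultdict(set)
for product_id, record in products.items():
    for field in ("category", "brand"):
        index[f"{field}:{record[field]}"].add(product_id)

def find(**criteria):
    """Return sorted ids matching all criteria by intersecting the id sets."""
    id_sets = [index.get(f"{field}:{value}", set()) for field, value in criteria.items()]
    return sorted(set.intersection(*id_sets)) if id_sets else []
```

A lookup such as find(category="electronics", brand="acme") touches only the index; the full rows can then be fetched from MySQL by primary key.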

Cache prefetching: Cache prefetching is a technique where data is read from its original
disk-based storage (MySQL) and written to a much faster in-memory database
(Redis Enterprise) before it is needed by your application. Using this
approach to offload reads to Redis Enterprise boosts application speed and lowers the
load on MySQL.

CQRS pattern: The command query responsibility segregation (CQRS) pattern
separates the data mutation, or the command part of a system, from the query part.
You can use the CQRS pattern to separate updates and queries if they have different
requirements for throughput, latency, or consistency. The CQRS pattern splits the
application into two parts—the command side and the query side—as shown in the
following diagram. The command side handles create, update, and delete requests.
The query side runs the query part by using the read replicas.

Separate Read and Write Models: We will design separate models for handling
read and write operations. This may involve denormalizing data for optimized read
queries.
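
The split between the command side and the query side can be sketched as follows. This is a simplified in-process illustration: the class names and the AddToCart command are hypothetical, and the direct call from the command side to the query side stands in for whatever replication or event mechanism keeps the read model up to date.

```python
from dataclasses import dataclass

@dataclass
class AddToCart:  # a command: expresses the intent to change state
    user_id: int
    product_id: int
    quantity: int

class QuerySide:
    """Serves reads from a model shaped for queries (e.g. on a read replica)."""
    def __init__(self):
        self.items_per_user = {}  # denormalized: user_id -> total quantity

    def apply(self, cmd):
        self.items_per_user[cmd.user_id] = (
            self.items_per_user.get(cmd.user_id, 0) + cmd.quantity
        )

    def cart_size(self, user_id):
        return self.items_per_user.get(user_id, 0)

class CommandSide:
    """Handles create, update, and delete requests against the write model."""
    def __init__(self, read_side):
        self.cart_rows = []  # normalized write model (primary database)
        self.read_side = read_side

    def handle(self, cmd):
        if cmd.quantity <= 0:
            raise ValueError("quantity must be positive")
        self.cart_rows.append((cmd.user_id, cmd.product_id, cmd.quantity))
        self.read_side.apply(cmd)  # in practice: replication or an event stream

queries = QuerySide()
commands = CommandSide(queries)
commands.handle(AddToCart(user_id=1, product_id=10, quantity=2))
commands.handle(AddToCart(user_id=1, product_id=11, quantity=1))
```

Validation and writes go through CommandSide only, while QuerySide answers reads from a structure that needs no joins.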

- Full-text Search (Elasticsearch): We can use a managed service such as AWS
Elasticsearch (now offered as Amazon OpenSearch Service) to improve search
performance. For example, if you have a product catalogue or user profiles that need
efficient and flexible search, Elasticsearch can be beneficial.
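
Full-text engines such as Elasticsearch are built around an inverted index, which maps each term to the documents that contain it. The toy version below illustrates only that core idea, not Elasticsearch's API; the catalogue entries and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical product catalogue: product id -> searchable text.
catalogue = {
    1: "Wireless mechanical keyboard",
    2: "Wireless optical mouse",
    3: "Mechanical pencil set",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index: term -> set of ids of the documents containing that term.
inverted_index = defaultdict(set)
for doc_id, text in catalogue.items():
    for term in tokenize(text):
        inverted_index[term].add(doc_id)

def search(query):
    """Return sorted ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(inverted_index.get(t, set()) for t in terms))
    return sorted(matches)
```

A real search engine adds relevance scoring, stemming, and distribution across nodes on top of this structure.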
Q. Explain what major factors are taken into consideration for performance:
When a design has multiple interconnected tables and performance and scalability are
concerns, it is important to choose a database that aligns with the application's
requirements.
We can use multiple types of databases, such as SQL, NoSQL, and key-value databases, for
different tables.
Inventory Module:
Database Type: Relational Database (e.g., PostgreSQL), optionally combined with NoSQL
Reasoning: Relational databases are suitable for handling structured data, such as product
price and images. We can use NoSQL to store product_sku data, which does not have a fixed
schema.
Order/Cart Module:
Database Type: Relational Database (e.g., PostgreSQL, MySQL)
Reasoning: Relational databases are well-suited for managing order-related data. ACID
transactions are crucial for keeping order and cart data consistent.
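
Why transactions matter for orders can be sketched as follows. The example uses SQLite as a stand-in for PostgreSQL or MySQL; the table names (product, purchase_order) and the place_order function are hypothetical. Recording the order and reducing the stock either both succeed or are both rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute(
    "CREATE TABLE purchase_order ("
    " order_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 5)")
conn.commit()

def place_order(product_id, quantity):
    """Insert the order and reduce stock atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO purchase_order (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            updated = conn.execute(
                "UPDATE product SET stock = stock - ? "
                "WHERE product_id = ? AND stock >= ?",
                (quantity, product_id, quantity),
            )
            if updated.rowcount != 1:
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False
```

When the stock check fails, the already inserted order row is rolled back, so the two tables never disagree.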
Notification Module:
Database Type: NoSQL Database (e.g., MongoDB), or a combination of NoSQL and
Relational Databases
Reasoning: Notifications often involve storing semi-structured or unstructured data. A
NoSQL database can provide flexibility in managing various types of notifications and
subscriber data. However, you might still use a relational database for certain structured
aspects.
Authentication & Authorization Module:
Database Type: Relational Database (e.g., PostgreSQL, MySQL)
Reasoning: User authentication and authorization typically involve structured user data and
relationships, making a relational database suitable for storing user information securely.

Query Optimization: Write efficient queries for the Inventory and Order/Cart modules to
ensure quick retrieval of product details and order information. Avoid unnecessary
operations, use appropriate indexes, and consider stored procedures or views for
complex queries related to order status and product availability.
Caching: Use caching mechanisms to reduce redundant queries.
Q. Mention about Indexing, Normalization and Denormalization:
1. Indexing:
Purpose:
Improving Retrieval Performance: Indexes speed up the retrieval of rows from a table,
especially when searching or filtering based on specific columns.
Type of Indexing:
B-tree Indexing: Commonly used in relational databases like MySQL and PostgreSQL.
Suitable for range queries and equality searches.
Considerations:
Column Selection: Index key columns that are frequently used in WHERE clauses of queries.
Impact on Write Performance: While indexing improves read performance, it can impact
write performance, as indexes need to be maintained.
2. Normalization:
Purpose:
Minimizing Redundancy: Reduces data redundancy by organizing data into related tables.
Ensuring Data Consistency: Helps maintain data consistency by avoiding anomalies in the
database.
Normal Forms:
First Normal Form (1NF): Ensures atomicity of data values.
Second Normal Form (2NF): Eliminates partial dependencies.
Third Normal Form (3NF): Removes transitive dependencies.
Considerations:
Application Type: Choose the normalization level based on the specific needs of the
application.
Read vs. Write Performance: Normalization improves data integrity but may involve joining
multiple tables, impacting read performance.

Denormalization: Denormalization can help improve performance in certain scenarios,
especially in read-heavy applications or use cases where complex joins and queries are a
bottleneck. In the given database design, you might consider denormalization in specific
tables based on your application's query patterns. Here are a few considerations.
Denormalization in Cart Table:
In the Cart table, you might consider denormalizing some product details to avoid frequent
joins when querying the cart. By denormalizing the product name, details, and price into the
Cart table, you eliminate the need to join with the product table for every cart item, which
can improve query performance when displaying the contents of the cart.
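
A denormalized Cart table along these lines could look like the sketch below, shown with SQLite as a stand-in for the real database. The column names product_name and product_price and the helper functions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE cart (
    cart_id INTEGER PRIMARY KEY,
    user_id INTEGER,
    product_id INTEGER REFERENCES product(product_id),
    quantity INTEGER,
    product_name TEXT,   -- denormalized copy of product.name
    product_price REAL   -- denormalized copy of product.price
);
INSERT INTO product VALUES (1, 'Keyboard', 50.0);
""")

def add_to_cart(user_id, product_id, quantity):
    """Copy the product's name and price into the cart row at insert time."""
    conn.execute(
        "INSERT INTO cart (user_id, product_id, quantity, product_name, product_price) "
        "SELECT ?, product_id, ?, name, price FROM product WHERE product_id = ?",
        (user_id, quantity, product_id),
    )

def cart_contents(user_id):
    """Read the cart without joining the product table."""
    return conn.execute(
        "SELECT product_name, product_price, quantity FROM cart WHERE user_id = ?",
        (user_id,),
    ).fetchall()

add_to_cart(user_id=7, product_id=1, quantity=2)
```

The copied values do not follow later changes to the product table, so a price update must either be propagated to open carts or deliberately left as the price at the time the item was added.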
Denormalization in Notification Table:
In the Notification table, you might consider denormalizing user details to avoid joining with the user
table for every notification. By denormalizing the user's username and email into the Notification
table, you can avoid frequent joins when fetching notification details, especially when displaying
notifications to users.

Considerations:
Trade-offs:
Denormalization comes with trade-offs. While it can improve read performance, it may complicate
write operations and require careful management to maintain data consistency.
Query Patterns:
Consider your application's specific query patterns. Denormalize only when it aligns with the most
common and performance-critical use cases.
Q. How will you handle scaling, if required at any point of time:
Handling scaling in a microservices architecture involves:
1. Service Auto Scaling:
Implement auto-scaling for individual microservices based on metrics.
Auto Scaling Groups: Use Auto Scaling Groups to automatically adjust the number of EC2
instances based on demand. Configure scaling policies to scale up or down based on
predefined conditions, such as CPU utilization or network traffic.
Amazon RDS Scaling: If you are using Amazon RDS for your database, consider enabling
Multi-AZ deployments for availability and using Read Replicas to offload read traffic.
Adjust the instance size based on the database load.

2. Load Balancing:
- Use load balancers to distribute traffic across microservice instances.
Load Balancing: Implement Elastic Load Balancing (ELB) to distribute incoming
application traffic across multiple EC2 instances. ELB automatically scales with demand and
provides fault tolerance.
Service Load Balancing: Use load balancers to distribute incoming traffic across multiple
instances of the same microservice. This ensures that the load is balanced and individual
instances can scale independently.
Global Load Balancing: Implement global load balancing to distribute traffic across
multiple regions or availability zones for improved availability and reduced latency.
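
The core job of a service load balancer can be sketched as a round-robin selection that skips unhealthy instances. This is a toy illustration rather than how ELB is implemented; the class and the instance names are hypothetical.

```python
import itertools

class RoundRobinBalancer:
    """Cycle through instances in order, skipping those marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._cycle = itertools.cycle(self.instances)

    def mark_down(self, instance):
        self.healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def next_instance(self):
        # Try each instance at most once per call before giving up.
        for _ in range(len(self.instances)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")

balancer = RoundRobinBalancer(["cart-1", "cart-2", "cart-3"])
first_round = [balancer.next_instance() for _ in range(3)]
balancer.mark_down("cart-2")  # e.g. after a failed health check
second_round = [balancer.next_instance() for _ in range(4)]
```

After cart-2 is marked down, traffic alternates between the remaining two instances until mark_up restores it.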

3. Caching and Content Delivery: Implement microservices-level caching and use CDNs.

4. Database Scaling: Use databases that support horizontal scaling and read replicas.
Vertical Scaling (Scaling Up): Increase the capacity of an existing server by adding
more resources such as CPU, RAM, or storage.
Horizontal Scaling (Scaling Out): Distribute the load across multiple servers or
nodes. In a relational database context, this may involve sharding or partitioning the
data across multiple database instances. Horizontal scaling can improve performance
and handle increased traffic.

5. Event-Driven Architecture: Design microservices to communicate asynchronously
through events.
AWS Lambda: Utilize serverless computing with AWS Lambda for specific
functions or microservices. Lambda automatically scales based on the number of
invocations.

6. Serverless Architecture: Implement serverless functions for specific microservices.

7. Microservices Resilience: Implement circuit breaker pattern and design for graceful
degradation.

8. Monitoring and Metrics: Use distributed tracing, logging, and metrics for insights.

9. Continuous Integration/Continuous Deployment (CI/CD): Automate deployment with
CI/CD pipelines.
10. Capacity Planning: Conduct performance testing and plan resources accordingly.

Read Replicas: If using Amazon RDS, set up Read Replicas to offload read traffic from the
primary database. This helps distribute the load and improves read performance.
Sharding: Consider database sharding for horizontal partitioning of data across multiple
database instances. This is particularly useful for tables with potentially large amounts
of data, such as inventory and orders.
For example, you could shard the inventory table based on product categories or some other
logical partitioning scheme.
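
Routing a row to a shard can be sketched as a function from a shard key to a database instance. The shard names below are hypothetical, and hashing the key is only one possible scheme; a fixed category-to-shard lookup table would also match the description above.

```python
import hashlib

SHARDS = ["inventory_db_0", "inventory_db_1", "inventory_db_2"]  # hypothetical names

def shard_for(shard_key):
    """Map a shard key (e.g. a product category) to a database instance.

    A stable hash is used instead of Python's built-in hash(), which is
    randomized per process for strings and would route inconsistently.
    """
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```

With simple modulo hashing, changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often preferred when shards are added over time.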
ElastiCache: Use Amazon ElastiCache for caching frequently accessed data. This helps
reduce the load on your database and improves response times.

11. Security Considerations: Implement security best practices at both microservices and
infrastructure levels.
Consider decomposing microservices, prioritizing statelessness, and implementing dynamic
configuration. Set up monitoring, alerts, and ensure a proactive approach to identify and
resolve scaling challenges promptly.
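
The circuit breaker pattern mentioned under microservices resilience can be sketched as below. This is a minimal single-threaded illustration; the class, its thresholds, and the injectable clock parameter are assumptions made for the example, and a production service would more likely rely on an established resilience library.

```python
import time

class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down period has passed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()  # open: fail fast and degrade gracefully
            # Cool-down elapsed (half-open): let one trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The fallback is where graceful degradation happens, for example returning cached recommendations when the recommendation service is down.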
Q. Mention all the assumptions you are taking for solutions:
Roles: Defines whether the user is a buyer, a seller, or has another role.
Authentication: Stores the user's otp_secret, which is used to create and validate the OTP
for the user.
Single Country: We assume this e-commerce website is used in a single country. If
we later expand it to cover several countries, we can add a currency rate table that
stores country names and their respective currency rates. Because the price of a
product is stored only once, it can be converted dynamically whenever necessary.
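
The dynamic conversion described above can be sketched as follows. The rate table contents, the country and currency names, and the two-decimal rounding rule are placeholder assumptions; real rates would come from the currency rate table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical currency rate table: country -> (currency code, rate vs. base currency).
CURRENCY_RATE = {
    "Homeland": ("BASE", Decimal("1")),
    "Examplestan": ("EXM", Decimal("2.5")),
}

def price_for_country(base_price, country):
    """Convert the single stored price into the country's currency."""
    code, rate = CURRENCY_RATE[country]
    converted = (base_price * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return converted, code
```

Decimal is used instead of float so that monetary values are not affected by binary floating-point rounding.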
User and address: We assume a single user can have multiple addresses and multiple users
can share the same address. An address can also have a type, such as Home or Office.
Product recommendation and promotion: A user is recommended a product based on source
and priority. We can promote a product based on product recommendations.
Notifications: A user receives notifications according to their subscriptions and promotions.
Naming Conventions: Assumed typical naming conventions, such as using singular names for
tables (user instead of users, product instead of inventory), camel case for column names,
and underscores in place of spaces in column names.
Relationships: Assumed relationships between tables based on foreign key references, such as
the relationship between users and roles, and between products and cart.
Timestamps: Assumed the use of timestamps (created_at, timestamp) to track when records
are created or modified.
Default Values: Assumed default values for certain columns, such as ‘status’ in the
subscriptions table, ‘available’ in subscription.
Passwords: Passwords are never stored in plain text; they are encrypted (ideally hashed
with a salted one-way algorithm) before being written to the database.
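
One way to protect stored passwords is salted key derivation, sketched below with PBKDF2 from Python's standard library. The iteration count and function names are assumptions made for illustration; the assignment does not state which algorithm is used.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware

def hash_password(password):
    """Return (salt, derived_key) for storage; the password itself is not stored."""
    salt = os.urandom(16)  # a fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Re-derive the key from the candidate password and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)
```

Both the salt and the derived key would be stored in the user table; login calls verify_password with the stored values.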
