mirror of
https://gitlab.com/djdietrick/docs
synced 2026-05-03 00:20:54 -04:00
Updated system design section
docs/interview/sd/efficiency.md
Normal file
@@ -0,0 +1,25 @@
# Measuring Efficiency
When describing the efficiency of a system, we look at a few different properties.
## Latency
Latency describes the time it takes for a machine to perform a certain operation. Every method of retrieving information has different time costs. A general rule of thumb:
- Reading 1MB from RAM - 0.25ms
- Reading 1MB from SSD - 1ms
- Transferring 1MB over the network - 10ms
- Reading 1MB from HDD - 20ms
- Intercontinental round trip - 150ms
## Throughput
While latency measures the time operations take, throughput describes the number of operations that can be processed in a given amount of time, such as requests per second.
## Availability
Availability describes the percentage of time that a system is up and running. It is usually expressed as a percentage or, more commonly, in **nines**: a server that is available 99% of the time has two nines of availability. A system is considered **highly available** when it has five or more nines of availability.
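To make the nines concrete, each additional nine shrinks the downtime budget by a factor of ten. A quick sketch (the helper name is ours):

```python
def downtime_per_year(nines: int) -> float:
    """Maximum minutes of downtime per year allowed by `nines` nines of availability."""
    unavailability = 10 ** -nines      # e.g. 2 nines -> 1% downtime allowed
    minutes_per_year = 365 * 24 * 60   # 525,600 minutes
    return minutes_per_year * unavailability

# Two nines (99%) allow roughly 3.65 days of downtime per year;
# five nines (99.999%) allow only about 5.26 minutes.
```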
### Redundancy
Redundancy is a means of increasing availability by keeping copies of the system ready to take over in case the main system fails.
docs/interview/sd/impl.md
Normal file
@@ -0,0 +1,95 @@
# Implementing Efficiency Techniques
## Leader Election
Leader election is the process by which the nodes in a cluster elect one node (the leader) to perform the primary functions of the service. Every node then knows who the leader is, and the remaining nodes can elect a new leader if the current one dies. This is done using a **consensus algorithm** such as Paxos or Raft, typically via a third-party strongly consistent key-value store such as etcd or ZooKeeper.
### Python Implementation using Etcd
```python
# Uses the python-etcd3 client (pip install etcd3); the original
# snippet imported `etcd` but called the etcd3 API.
import sys
import time
from threading import Event

import etcd3

LEADER_KEY = 'LEADER_KEY'


def main(server_name):
    client = etcd3.client(host="localhost", port=2379)

    while True:
        is_leader, lease = leader_election(client, server_name)
        if is_leader:
            print("I am the leader")
            on_leadership_gained(lease)
        else:
            print("I am a follower")
            wait_for_next_election(client)


def leader_election(client, server_name):
    print("New leader election happening")
    # The lease must be refreshed every 5 seconds, or the key expires
    # and a new leader is elected.
    lease = client.lease(5)
    is_leader = try_insert(client, LEADER_KEY, server_name, lease)
    return is_leader, lease


def try_insert(client, key, server_name, lease):
    # Atomically insert the key only if it does not exist yet
    # (version == 0); the node that wins this transaction is the leader.
    insert_succeeded, _responses = client.transaction(
        compare=[client.transactions.version(key) == 0],
        success=[client.transactions.put(key, server_name, lease)],
        failure=[],
    )
    return insert_succeeded


def on_leadership_gained(lease):
    while True:
        try:
            print("Refreshing lease, still the leader")
            lease.refresh()
            do_work()
        except KeyboardInterrupt:
            lease.revoke()
            sys.exit(1)
        except Exception:
            # On any failure, give up leadership so another node can take over.
            lease.revoke()
            return


def wait_for_next_election(client):
    election_event = Event()

    def watch_callback(resp):
        for event in resp.events:
            # The leader key being deleted means its lease expired.
            if isinstance(event, etcd3.events.DeleteEvent):
                print("Leader election required")
                election_event.set()

    watch_id = client.add_watch_callback(LEADER_KEY, watch_callback)

    try:
        while not election_event.is_set():
            time.sleep(1)
    except KeyboardInterrupt:
        client.cancel_watch(watch_id)
        sys.exit(1)

    client.cancel_watch(watch_id)


def do_work():
    time.sleep(1)
```
## Polling and Streaming
**Polling** is the act of requesting data updates at a regular interval, typically against a server's REST API. **Streaming** is the act of receiving continuous data updates from the server through an open connection. This is commonly achieved with WebSockets, which keep a connection open between the server and client so that either party can send data at any time. Streaming is preferred when the information is time sensitive and clients want each update as soon as it happens.
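A minimal polling loop can be sketched as follows; `fetch` is a stand-in for whatever REST call the client would actually make:

```python
import time

def poll(fetch, interval_seconds, max_polls):
    """Repeatedly call `fetch` at a fixed interval, collecting each result."""
    results = []
    for _ in range(max_polls):
        results.append(fetch())
        time.sleep(interval_seconds)
    return results

# Example with a stand-in data source:
counter = iter(range(100))
updates = poll(lambda: next(counter), interval_seconds=0.01, max_polls=3)
# updates == [0, 1, 2]
```

The interval is the key trade-off: shorter intervals approach the freshness of streaming but multiply request load on the server.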
### Pub Sub
Pub Sub, or publishing and subscribing, is a method of dividing streamed data into topics that clients can subscribe to. When a new event is published to a topic, every subscribed client receives the update. These systems often come with guarantees such as at-least-once delivery, persistent storage/queues, message ordering, and replayability of messages. Because of at-least-once delivery, message handling typically has to be **idempotent**: the outcome must be the same regardless of how many times the event is processed, so a message delivered multiple times has the same effect as one delivered once. Popular Pub Sub frameworks include Apache Kafka and Cloud Pub/Sub.
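Idempotent handling under at-least-once delivery can be sketched by tracking which message IDs have already been applied (class and field names are ours):

```python
class IdempotentConsumer:
    """Applies each message at most once, even if the broker redelivers it."""

    def __init__(self):
        self.processed_ids = set()
        self.total = 0

    def handle(self, message_id, amount):
        if message_id in self.processed_ids:
            return  # duplicate delivery: ignore
        self.processed_ids.add(message_id)
        self.total += amount

consumer = IdempotentConsumer()
consumer.handle("msg-1", 10)
consumer.handle("msg-1", 10)  # redelivered by the broker
consumer.handle("msg-2", 5)
# consumer.total == 15, not 25
```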
## Configuration
Configuration is a set of variables or parameters that determine certain behaviors within the application. **Static** configuration is hard-coded and shipped with the application, while **dynamic** configuration is kept outside of the system and can be edited more easily, without a redeploy. Static configuration is typically written in formats such as JSON or YAML, while dynamic configuration usually lives in a third-party key-value store.
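As an illustration, a static JSON configuration might look like the following (the field names are invented for the example):

```python
import json

# A static configuration shipped with the application, here as JSON.
CONFIG_JSON = """
{
    "max_connections": 100,
    "request_timeout_seconds": 30,
    "feature_flags": {"dark_mode": false}
}
"""

config = json.loads(CONFIG_JSON)
# config["max_connections"] == 100
```

Changing any of these values requires shipping a new build, which is exactly the limitation dynamic configuration avoids.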
## Rate Limiting
Rate limiting is the process of limiting the number of requests that can be made to the system, typically per IP address, to prevent clients from abusing the server and consuming all of its resources. An attack that floods a service with requests is known as a **DoS attack**, or denial of service; when the attack is performed from many machines at once it is a **DDoS attack**, or distributed denial of service, which is much harder to defend against. Rate limiting is typically implemented with a key-value store such as Redis that tracks how many times each IP accesses the service.
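A minimal fixed-window rate limiter keyed by IP can be sketched in memory; in production the counter would live in a shared store such as Redis (an `INCR` plus an `EXPIRE` on the per-window key), but a plain dict keeps the sketch self-contained:

```python
import time
from collections import defaultdict

class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` per client IP."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # (ip, window index) -> request count

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)  # current time bucket
        key = (ip, window)
        self.counts[key] += 1
        return self.counts[key] <= self.limit

limiter = FixedWindowRateLimiter(limit=3, window_seconds=60)
results = [limiter.allow("10.0.0.1", now=0) for _ in range(4)]
# results == [True, True, True, False]
```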
docs/interview/sd/improvements.md
Normal file
@@ -0,0 +1,29 @@
# Improving Efficiency
## Caching
Caching is the process of saving data so that it is faster to retrieve than going all the way to the data source or re-performing expensive computations. This can reduce overall processing time at the cost of using extra memory. A large-scale example is the **CDN**, or Content Delivery Network: a third-party cache for website content, such as Cloudflare or Google Cloud CDN. Some other popular caching and key-value software includes:
- Redis: a very fast in-memory key-value store with optional persistence, often used for caching and rate limiting.
- etcd: a strongly consistent, highly available key-value store, often used in leader election.
- ZooKeeper: another strongly consistent, highly available key-value store, often used for configuration or leader election.
## Proxies
Proxies are processes that sit between the client and the server. **Forward** proxies act on behalf of the client, such as a VPN. **Reverse** proxies act on behalf of the server, such as loggers, load balancers, and caches.
### Load balancers
Load balancers receive all traffic for a service and distribute it across multiple processes. They use a **server-selection strategy** to decide how to divide the traffic; common strategies include round robin, random selection, performance-based selection, and location/IP-based selection. If the workload becomes uneven and one server receives a disproportionate share, that server is a **hot spot** and the selection strategy may need tuning.
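Round robin, the simplest of the selection strategies above, can be sketched as follows (the class name is ours):

```python
import itertools

class RoundRobinBalancer:
    """Cycle through servers in order, giving each an equal share of requests."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick_server(self):
        return next(self._cycle)

balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.pick_server() for _ in range(4)]
# picks == ["app-1", "app-2", "app-3", "app-1"]
```

Round robin assumes all servers and requests are roughly equal; performance-based strategies drop that assumption by weighting servers by load or response time.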
## Replication
Replication is a method of increasing redundancy by actively duplicating data from one database server to another. It can also be used to decrease latency by copying data to servers closer to the user.
## Sharding
Sharding, or data partitioning, is the process of splitting a database into pieces to increase the throughput of the system. Because each shard holds only part of the data, queries against any single database server run faster. Data can be sharded by client region, by the type of data being stored, or by a hash function applied to some column. A routing layer, similar to a load balancer, is then required to send each request to the correct database server.
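Hash-based sharding on a column value can be sketched as follows (the function name is ours):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a row key to a shard index by hashing a column value.

    Uses a stable hash (md5) so every application server computes
    the same shard for the same key across restarts."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# All queries for a given key land on the same database server:
shard = shard_for("user:42", 4)
```

Note that a plain modulo scheme reshuffles almost every key when `num_shards` changes; consistent hashing is the usual refinement when shards are added or removed often.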
## Peer to Peer Networks
P2P networks use a collection of machines, or peers, to divide a workload amongst themselves and reduce the overall processing time. This is especially useful for file distribution: instead of downloading a file from a single server, the file is spread in chunks to peers across the network, and a new peer can request those chunks from the other peers, spreading the strain from one server to many. When peers coordinate which chunks they hold by talking directly to each other, without a centralized source of data, they are using a **gossip protocol**.
@@ -1,13 +0,0 @@
-# Latency
-
-Latency describes the time it takes for a machine to perform a certain operation. Every method of retrieving information has different time costs. A general rule of thumb:
-
-- Reading 1MB from RAM - 0.25ms
-- Reading 1MB from SSD - 1ms
-- Transfer 1MB over network - 10ms
-- Reading 1MB from HDD - 20ms
-- Intercontinental round-trip - 150ms
-
-## Throughput
-
-While latency measures the time operations take, throughput describes the number of operations that can be processed in a given amount of time, such as requests per second.
@@ -1,6 +1,6 @@
 # Scaling

-Scaling is the problem of supporting more and more users as your applications grow. In order to handle more requests, we need more hardware.
+Scaling is the problem of supporting more and more users as your applications grow. These are ways to increase your throughput. In order to handle more requests, we need more hardware.

 ## Horizontal Scaling
@@ -12,8 +12,7 @@ Some of the disadvantages of this approach are requiring load balancing, slower i
 ## Vertical Scaling

-Vertical scaling is a method of increasing scalability by adding capability to a single machine. To
-process more requests, you make your server faster by increasing the hardware.
+Vertical scaling is a method of increasing scalability by adding capability to a single machine. To process more requests, you make your server faster by increasing the hardware.

 Some of the advantages of this approach compared to horizontal scaling are that there is no load balancing required, communication between processes is faster, and data will be consistent among processes on the server.
@@ -24,8 +24,10 @@
     "text": "System Design",
     "items": [
         {"text": "Basics of the Internet", "link": "/interview/sd/basics"},
-        {"text": "Latency", "link": "/interview/sd/latency"},
-        {"text": "Scaling", "link": "/interview/sd/scaling"}
+        {"text": "Efficiency", "link": "/interview/sd/efficiency"},
+        {"text": "Scaling", "link": "/interview/sd/scaling"},
+        {"text": "Improvements", "link": "/interview/sd/improvements"},
+        {"text": "Implementation", "link": "/interview/sd/impl"}
     ]
 }
]