Current IO devices are far from able to satisfy the massive read and write requests of Internet applications. This is why caches exist: they use the high read/write performance of memory to absorb large volumes of query requests. However, memory is expensive, and storing the full data set in memory is obviously impractical. The common practice is therefore to combine memory and IO: memory stores only the hotspot data, while IO devices store the full data set. Cache design involves many subtleties, and a poor design can have serious consequences. This article introduces the three major problems commonly encountered when using a cache, and gives the corresponding solutions.
1. Cache penetration
In most Internet applications, a query first checks the cache and falls through to the database only on a cache miss. To sum up: when the business system queries data that does not exist at all, every such request misses the cache and lands on the database. This is called cache penetration.
We can slightly modify the business system's code so that when a database query returns an empty result, the key is still stored in the cache, with a null value. When a query for that key arrives again, the cache returns null directly, without querying the database.
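The scheme above can be sketched as follows. This is a minimal in-memory illustration: `cache` and `database` are hypothetical stand-ins for a real cache (e.g. Redis) and a real database, and `NULL_MARKER` is an assumed sentinel value meaning "this key does not exist".

```python
# Minimal sketch of null-result caching; all names are illustrative.
cache = {}
database = {"user:1": {"name": "Alice"}}
db_hits = 0                     # counts how often the database is actually read

NULL_MARKER = "__NULL__"        # sentinel meaning "key does not exist"

def get(key):
    """Look up a key, caching empty database results as well."""
    global db_hits
    if key in cache:
        value = cache[key]
        return None if value == NULL_MARKER else value
    db_hits += 1
    value = database.get(key)
    # Cache the miss too, so a repeated query for a nonexistent
    # key is answered by the cache and never reaches the database.
    cache[key] = NULL_MARKER if value is None else value
    return value
```

In production the null entry would also be given a short expiration time, so that a key which is created later does not stay shadowed by a stale null marker.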
1.4.2 BloomFilter
The idea is to add a barrier in front of the cache: a Bloom filter that records all the keys currently present in the database. If the filter reports a queried key as absent, the key is guaranteed not to exist, and the request can be rejected before it touches the cache or the database.
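A minimal Bloom filter sketch, using only the standard library. The sizes and hash count here are arbitrary illustrative choices; a real deployment would size the bit array from the expected key count and target false-positive rate.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact, probabilistic set of existing keys.
    might_contain() returning False means the key is definitely absent;
    True means the key is probably present (false positives are possible)."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive several bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))
```

At startup the filter is populated with every key in the database, and it must be kept up to date as keys are inserted; note that a plain Bloom filter does not support deletion.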
In some malicious attacks, the queried keys are almost always different, and there are very many of them. Here the first scheme costs too much: it would have to cache a key for every piece of empty data, yet because the attacker's keys rarely repeat, each cached null would be used at most once, and the database would not be protected. Therefore, for a scenario where the keys of the null data vary widely and the probability of a key being requested again is low, the second scheme (the Bloom filter) should be chosen. For a scenario where the number of null-data keys is limited and the probability of repeat requests is high, the first scheme (caching null results) should be chosen.
2. Cache avalanche
However, for hot data that receives very high request volumes, once the expiration time passes, a large number of requests will fall on the database at the same moment, which may crash the database. The process is as follows:
3.2 Solution
3.2.1 Mutex
We can use the locking mechanism provided by the cache. When the first database query request is initiated, the entry in the cache is locked; other query requests arriving for the same key cannot read it and block waiting. After the first request completes the database query and writes the fresh value into the cache, the lock is released; the blocked requests can then read the value directly from the cache.

When a hotspot entry expires, only the first query is sent to the database and all other queries block, which protects the database. However, because of the mutex, those other requests wait, and the throughput of the system drops. Whether this trade-off is acceptable must be weighed against the actual business requirements.
When we store such data in the cache, we can stagger the expiration times so the entries do not all expire at once. For example, add or subtract a random offset to a base expiration time, so that the caches expire at different moments.
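The jitter idea amounts to one small helper. The base TTL and jitter range below are assumed values for illustration; the commented-out Redis call shows where such a TTL would typically be applied.

```python
import random

BASE_TTL = 3600      # assumed base expiration: one hour, in seconds
JITTER = 300         # assumed jitter range: up to +/- 5 minutes

def jittered_ttl():
    """Base TTL plus a random offset, so keys written together
    do not all expire at the same instant."""
    return BASE_TTL + random.randint(-JITTER, JITTER)

# Typical use with a cache client (hypothetical):
# redis_client.set(key, value, ex=jittered_ttl())
```

Spreading expirations over a window turns one spike of simultaneous cache misses into a trickle the database can absorb.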