
Several Misunderstandings About Redis

A few days ago, Weibo suffered a major system failure, and many friends in the technical community were concerned about it. The cause is unlikely to fall outside the scope that James Hamilton outlined in On Designing and Deploying Internet-Scale Services (1). His first lesson, design for failure, is key to the success of any Internet architecture. The engineering theory behind Internet systems is actually quite simple; Hamilton's paper contains almost no theory, only a great deal of shared practical experience. How well each company understands and applies that experience determines whether it succeeds or fails.

Digression aside, I have recently been studying Redis. Last year I ran a performance test of MemcacheDB, Tokyo Tyrant, and Redis, and the results of that benchmark still hold today. Over the past year we have been tempted by a dazzling array of key-value storage products, from the fading of Cassandra (Twitter paused its use in its main business) to the rise of HBase (Facebook's new mailbox service uses HBase (2)). Looking back at Redis, I found that this program of just over 10,000 lines of source code is full of magical, unexplored features. Its performance is astonishing: I estimate that a single Redis instance could meet the storage and cache needs of a sub-product at any of China's top ten websites. Beyond the impression its performance makes, the industry holds some common misunderstandings about Redis. This article raises a few points for discussion.

1. What is Redis

The answer to this question affects how we use Redis. If you think of Redis as a key-value store, you might use it in place of MySQL; if you think of it as a cache that can be persisted, you might only keep frequently accessed temporary data in it. Redis is short for REmote DIctionary Server. The tagline on the official website is "A persistent key-value database with built-in net interface written in ANSI-C for Posix systems", a definition that leans toward the key-value store. Some regard Redis as an in-memory database, because its high performance rests on in-memory operations. Others regard it as a data structure server, because it supports complex data types such as List and Set. How you interpret the role of Redis determines how you use it.

Internet data is currently stored in two main ways: relational databases or key-value stores. However, some Internet data fits neither model well. A user's relationships on a social platform, for example, form a list. To store that list in a relational database, you have to convert it into multiple rows, which carries a lot of redundancy because every row repeats some of the same information. To store it as a key-value pair, modification and deletion are cumbersome, because you have to read out and write back the entire value. Redis provides a variety of data types in memory, so the business can access these structures at high speed without worrying about persistent storage. This avoids the detours that the other two kinds of storage require.
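
The difference can be sketched with a small in-memory model in Python. This is an illustration under assumptions, not Redis itself or a Redis client: the dictionaries `kv` and `structures` stand in for the two kinds of store, all function names are hypothetical, and `sadd`/`srem` are only loosely modeled on the Redis set commands SADD and SREM.

```python
import json

kv = {}          # plain key-value store: every value is an opaque string
structures = {}  # data-structure store: values are native sets


def kv_add_follower(user, follower):
    """Add a follower when the whole list lives in one serialized value."""
    followers = json.loads(kv.get(user, "[]"))  # read the entire list
    if follower not in followers:
        followers.append(follower)
    kv[user] = json.dumps(followers)            # write the entire list back


def kv_remove_follower(user, follower):
    """Removing one entry also costs a full read and a full write."""
    followers = json.loads(kv.get(user, "[]"))
    if follower in followers:
        followers.remove(follower)
    kv[user] = json.dumps(followers)


def sadd(key, member):
    """Add one member in place; return 1 if it was new, else 0."""
    members = structures.setdefault(key, set())
    if member in members:
        return 0
    members.add(member)
    return 1


def srem(key, member):
    """Remove one member in place; return 1 if it existed, else 0."""
    members = structures.get(key, set())
    if member not in members:
        return 0
    members.remove(member)
    return 1
```

In the key-value version, the cost of every change grows with the size of the list. In the set version, the caller only sends the one member that changed.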

2. Redis can't be faster than Memcached

Many developers assume that Redis cannot be faster than Memcached. Memcached is purely memory-based, while Redis has persistence features, so even if the persistence is asynchronous, the reasoning goes, Redis cannot be faster. Yet in test results Redis basically has an absolute advantage. I have been thinking about the reasons for some time, and so far I have come up with the following.

Libevent. Unlike Memcached, Redis did not choose libevent. To cater to generality, libevent carries a large code base (Redis's code is currently less than 1/3 the size of libevent's) and sacrifices a good deal of performance on any particular platform. Redis implemented its own epoll event loop (4) using two files taken from libevent. Many developers in the industry have also suggested that Redis use libev, a high-performance alternative to libevent, but the author still insists that Redis should stay small and free of dependencies. One impressive detail is that you do not need to run ./configure before compiling Redis.
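
The idea of a small, dependency-free event loop can be sketched with Python's standard `selectors` module, which picks epoll on Linux when it is available. This is only an illustration of the pattern, not Redis's C implementation; the function names `make_echo_handler` and `run_once` are hypothetical.

```python
import selectors
import socket


def make_echo_handler(sel):
    """Build a callback that echoes whatever a readable socket received."""
    def on_readable(conn):
        data = conn.recv(4096)
        if data:
            conn.sendall(data)      # echo the bytes back to the peer
        else:
            sel.unregister(conn)    # peer closed: stop watching the socket
            conn.close()
    return on_readable


def run_once(sel, timeout=1.0):
    """Wait for ready sockets, dispatch their callbacks, return the count."""
    events = sel.select(timeout=timeout)
    for key, _mask in events:
        callback = key.data         # the handler stored at registration time
        callback(key.fileobj)
    return len(events)
```

A caller registers each socket with `sel.register(sock, selectors.EVENT_READ, make_echo_handler(sel))` and calls `run_once` in a loop. A single thread then serves every connection, and the whole loop fits in a few dozen lines with no third-party library.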

CAS. CAS is a convenient mechanism in Memcached for preventing concurrent modifications of the same resource from conflicting. Implementing it requires a hidden cas token for every cached key. The token acts as a version number for the value and must be incremented on every set, which costs both CPU and memory. The overhead is small, but on a single machine with 10 GB+ of cache and tens of thousands of QPS, it produces some subtle performance differences between the two (5).
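
The per-key token described above can be modeled in a few lines of Python. This is a simplified sketch, not Memcached's actual code: the class `CasCache` is hypothetical, and its `gets` and `cas` methods are only named after the Memcached commands of the same name.

```python
class CasCache:
    """Toy cache that stores a version token next to every value."""

    def __init__(self):
        self._data = {}        # key -> (value, cas_token)
        self._next_token = 0   # counter used to issue fresh tokens

    def set(self, key, value):
        # Every write pays for one counter increment and one stored token.
        self._next_token += 1
        self._data[key] = (value, self._next_token)

    def gets(self, key):
        """Return (value, token), or None if the key is missing."""
        return self._data.get(key)

    def cas(self, key, value, token):
        """Write only if nobody changed the key since `token` was read."""
        entry = self._data.get(key)
        if entry is None or entry[1] != token:
            return False       # missing key or stale token: reject the write
        self.set(key, value)
        return True
```

The extra tuple slot and the increment in `set` are the memory and CPU costs in question. They are tiny per key, but they are paid on every key and every write.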