Notes on Memcached


Notes on Memcached
What is Memcached?
memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
How Memcached works?
Memcached's magic lies in its two-stage hash approach. It behaves as though it were a giant hash table, looking up key/value pairs. Give it a key, and set or get some arbitrary data.
Memcached is organized as a farm of N servers. The storage model can be considered as a huge HashTable partitioned among these N servers.
The servers within the farm doesn't know each other at all, and it doesn't replicate data between different server, so its overhead is slow, and  add, remove a memcached server is very easy.
When doing a memcached lookup, first the client hashes the key against the whole list of servers and choses a server, then the client sends this request, and the server does an internal hash key lookup for the actual item data, and return response.
If the response is not null, we get the data we need, if not, we need query database and get the data and add the data to memcached server.
Each server use asynchronous, non-blocking I/O and a thread pool to handle large number of incoming TCP sockets, the number of threads is independent of the number of incoming sockets.
maxiumum key length - 250 bytes
maxiumum value size - 1 MB
limits on expire time -  30 days
Slabs, Pages, Chunks and Memcached Memory Management
Memcached use slab allocator to do all the memory-management for item data in memcached. It still uses malloc/calloc/realloc for other memory management, just use slab allocator for its key/value pair.
When you ask memcached to store a value, it looks up the “slab” associated with that value. A slab holds values within a particular size range. Slabs are composed of 1MB pages, which are broken into chunks of the slab’s size.
In a simple word, a slab has many pages which has many chunks. Each chunk is a fixed size.
When we insert a value, memcached will look up all slabs to find the slab whose chunk size is larger than and closest to the value size, then will insert the value to that slab's chunk.

Memcached Protocol
1. Text based protocol
The original memcached protocol is a very simple, small, and an effective text based protocol
2. Binary Protocol
Binary protocol is added in memcached 1.3 to improve performance and extensibility.
Parsing binary string is more efficient than parsing text string, and its extensibility opens up new possibilities,
What is SASL?
SASL stands for Simple Authentication and Security Layer, which provides you a means of adding authentication support to connection-based protocols, such as the memcached client binary protocol in this case.
memcached server won't allow the user to connect unless they provide correct credential.
Install prerequisites such as sasl development libraries and as well as SASL utilities such as saslpasswd.
sudo apt-get -f install libsasl2-2 sasl2-bin libsasl2-2 libsasl2-dev libsasl2-modules
Compile and install memcached with --enable-sasl
./configure --enable-sasl
Configuring SASL
# Create a user for memcached.
saslpasswd2 -a memcached -c cacheuser
Running Memcached
In order to enable SASL support in the server you must use the -S flag.
The -S flag does a few things things:
Enable all of the SASL commands.
Require binary protocol only.
Require authentication to have been successful before commands may be issued on a connection.
What if memcached server is added, or removed.
The default server hashing algorithm doesn't handle the growth and shrink of the number of servers very well.
It computes hash value of the key and them module it by server number.
When the number of server changes, the ownership equation (key mod N) will all be wrong, as a same key now would be mapped and redistributed to different server.
There are several approaches to this problem:
1. Consistent hashing
Consistent hashing localize the ownership changes to just the neighbor of the added or removed server.
Under this schema, each server is assigned with an id under the same key space. The ownership of a key is determined by the closest server whose key is the first one encountered when walking in the anti-clockwise direction.
When a server crashes, its immediate upstream neighbor server (walking along the anti-clockwise direction) will adopt the key ownership of the dead server, while all other servers have the same ownership of key range unchanged.
When a new server is added, only its neighbor servers’ ownership would change.
2. Run logical servers within a farm of physical machines.
When a physical machine crashes, its logical servers will be restart in the surviving physical machines. In other words, the number of logical servers is unchanged even when crashes happens.
This logical server approach is also good when the underlying physical machines have different memory capacity. We can start more Memcached process in the machine with more memory and proportionally spread the cache according to memory capacity.
What is memcached's cache?
The cache structure is an LRU (Least Recently Used), plus expiration timeouts. When you store items into memcached, you may state how long it should be valid in the cache. Which is forever, or some time in the future. If the server is out of memory, expired slabs are replaced first, then the oldest unused slabs go next.
API
get(key)
set(key, value)
set(key, value, expiration)
add(key, value, expiration)
replace(key, value, expiration)
Start server: memcached -d -m 4096 -t 4  -p 11211
-d[deamon], -t[thread number], -m[memory size] -p[port]
Stop server:  ps -ef|grep memcached|grep -v grep|awk '{print $2}'|xargs kill -9
telnet localhost 11211
Query the current traffic statistics: stats
Query the current memory statistics: stats slabs
Query which keys are used: stats items
Command       Description                            Example
get                 Reads a value                         get mykey
set                  Set a key unconditionally         set mykey 0 60 5
add                 Add a new key                        add newkey 0 60 5
replace            Overwrite existing key            replace key 0 60 5
append            Append data to existing key     append key 0 60 15
prepend           Prepend data to existing key    prepend key 0 60 15
incr                 Increments numerical key value by given number
incr mykey 2
decr             Decrements numerical key value by given number
decr mykey 5
delete           Deletes an existing key          delete mykey
flush_all         Invalidate specific items immediately   flush_all
                Invalidate all items in n seconds        flush_all 900
stats               Prints general statistics                       stats
                Prints memory statistics                 stats slabs
                Prints memory statistics                 stats malloc
                                   stats items,stats detail,stats sizes
                Resets statistics                          stats reset
version          Prints server version.                    version
verbosity        Increases log level                      verbosity
quit             Terminate telnet session                 quit
Install memcached
1.     Install Libevent
libevent is an asynchronous event notification software library.
The libevent API provides a mechanism to execute a callback function when a specific event occurs on a file descriptor or after a timeout has been reached. Furthermore, libevent also supports callbacks due to signals or regular timeouts.
wget http://monkey.org/~provos/libevent-2.0.11-stable.tar.gz
cd ibevent-2.0.11-stable
autoconf
./configure --prefix=/usr/local
make
sudo make install
2. Install Memcached
Download the latest version of Memcached:
wget http://memcached.googlecode.com/files/memcached-1.4.5.tar.gz
cd memcached-1.4.5
autoconf
./configure --prefix=/usr/local
make
sudo make install
Using Memcached with Java
There are several memcached java client implementations, such as spymemcached, and Memcached-Java-Client.

Source code Digestion
// Implements Spring InitializingBean and DisposableBean
public class SpyMemcachedClient implements InitializingBean, DisposableBean {
      private MemcachedClient memcachedClient;
      private String memcachedNodes = "9.186.1.1:11211,9.186.1.2:11211,9.186.1.3:11211";
      private boolean isBinaryProtocol = false;
      private boolean isConsistentHashing = true;
      private long operationTimeout = 1000; //default value in Spy is 1000ms
      @Override
      public void afterPropertiesSet() throws Exception {
           ConnectionFactoryBuilder cfb = new ConnectionFactoryBuilder();
           cfb.setFailureMode(FailureMode.Redistribute);
           cfb.setDaemon(true);
           cfb.setProtocol(isBinaryProtocol ? Protocol.BINARY : Protocol.TEXT);
           if (isConsistentHashing) {
                 cfb.setLocatorType(Locator.CONSISTENT);
                 cfb.setHashAlg(HashAlgorithm.KETAMA_HASH);
           }
           cfb.setOpTimeout(operationTimeout);
           try {
                 memcachedClient = new MemcachedClient(cfb.build(), AddrUtil.getAddresses(memcachedNodes));
           } catch (IOException e) {
                 logger.error("MemcachedClient initilization error: ", e);
                 throw e;
           }
      }
      public void destroy() throws Exception {
           if (memcachedClient != null) {
                 memcachedClient.shutdown();
           }
      }
      public MemcachedClient getMemcachedClient() {
           return memcachedClient;
      }
    // capture cast exception, and return null instead
      @SuppressWarnings("unchecked")
      public T get(String key) {
           try {
                 return (T) memcachedClient.get(key);
           } catch (RuntimeException e) {
                 logger.warn("Get from memcached server fail,key is " + key, e);
                 return null;
           }
      }
    // getBulk,gets, set, cas(check and set),delete, incr, decr, asyncIncr, asyncDecr
      // setMemcachedNodes, setBinaryProtocol, setConsistentHashing, setOperationTimeout
}
@Component
public class AccountManager {
      private SpyMemcachedClient spyMemcachedClient;
      @Transactional(readOnly = true)
      public User getInitedUser(String id) {
           if (spyMemcachedClient != null) {
                 return getUserFromMemcached(id);
           } else {
                 return userJdbcDao.queryObject(id);
           }
      }
      private User getUserFromMemcached(String id) {
           String key = MemcachedObjectType.USER.getPrefix() + id;
           User user = null;
           String jsonString = spyMemcachedClient.get(key);
           if (jsonString == null) {
                 user = userJdbcDao.queryObject(id);
                 if (user != null) {
                      jsonString = jsonBinder.toJson(user);
                     spyMemcachedClient.set(key, MemcachedObjectType.USER.getExpiredTime(), jsonString);
                 }
           } else {
                 user = jsonBinder.fromJson(jsonString, User.class);
           }
           return user;
      }   
      @Autowired(required = false) //Autowired
      public void setSpyMemcachedClient(SpyMemcachedClient spyMemcachedClient) {
           this.spyMemcachedClient = spyMemcachedClient;
      }
}

Resources
Post a Comment

Labels

Java (159) Lucene-Solr (111) Interview (61) All (58) J2SE (53) Algorithm (45) Soft Skills (37) Eclipse (33) Code Example (31) Linux (24) JavaScript (23) Spring (22) Windows (22) Web Development (20) Nutch2 (18) Tools (18) Bugs (17) Debug (16) Defects (14) Text Mining (14) J2EE (13) Network (13) Troubleshooting (13) PowerShell (11) Chrome (9) Design (9) How to (9) Learning code (9) Performance (9) Problem Solving (9) UIMA (9) html (9) Http Client (8) Maven (8) Security (8) bat (8) blogger (8) Big Data (7) Continuous Integration (7) Google (7) Guava (7) JSON (7) ANT (6) Coding Skills (6) Database (6) Scala (6) Shell (6) css (6) Algorithm Series (5) Cache (5) Dynamic Languages (5) IDE (5) Lesson Learned (5) Programmer Skills (5) System Design (5) Tips (5) adsense (5) xml (5) AIX (4) Code Quality (4) GAE (4) Git (4) Good Programming Practices (4) Jackson (4) Memory Usage (4) Miscs (4) OpenNLP (4) Project Managment (4) Spark (4) Testing (4) ads (4) regular-expression (4) Android (3) Apache Spark (3) Become a Better You (3) Concurrency (3) Eclipse RCP (3) English (3) Happy Hacking (3) IBM (3) J2SE Knowledge Series (3) JAX-RS (3) Jetty (3) Restful Web Service (3) Script (3) regex (3) seo (3) .Net (2) Android Studio (2) Apache (2) Apache Procrun (2) Architecture (2) Batch (2) Bit Operation (2) Build (2) Building Scalable Web Sites (2) C# (2) C/C++ (2) CSV (2) Career (2) Cassandra (2) Distributed (2) Fiddler (2) Firefox (2) Google Drive (2) Gson (2) How to Interview (2) Html Parser (2) Http (2) Image Tools (2) JQuery (2) Jersey (2) LDAP (2) Life (2) Logging (2) Python (2) Software Issues (2) Storage (2) Text Search (2) xml parser (2) AOP (1) Application Design (1) AspectJ (1) Chrome DevTools (1) Cloud (1) Codility (1) Data Mining (1) Data Structure (1) ExceptionUtils (1) Exif (1) Feature Request (1) FindBugs (1) Greasemonkey (1) HTML5 (1) Httpd (1) I18N (1) IBM Java Thread Dump Analyzer (1) JDK Source Code (1) JDK8 (1) JMX (1) Lazy Developer (1) Mac (1) Machine Learning (1) Mobile (1) My Plan for 2010 (1) Netbeans (1) Notes (1) Operating System (1) Perl (1) Problems (1) Product Architecture (1) Programming Life (1) Quality (1) Redhat (1) Redis (1) Review (1) RxJava (1) Solutions logs (1) Team Management (1) Thread Dump Analyzer (1) Visualization (1) boilerpipe (1) htm (1) ongoing (1) procrun (1) rss (1)

Popular Posts