Monday, August 12, 2013

How to optimize 30G of redis

 

We suddenly noticed that our redis had grown to 30G. This is an awkward number, because the cache machine only has 32G of memory, so it was nearly exhausted. Fortunately the company bought a 90G machine last week, and we migrated to it during a midnight window. (As an aside, 90G of memory is really nice; apart from koding.com this is only the second time I have used a 90G machine. koding is a good site, by the way: an online programming IDE.) But with the data growing constantly, a single machine can never carry the load forever, so a redesign was imperative. After some initial thinking we arrived at a very simple plan, which can be summed up as: clean things up on the inside, and scale out on the outside.

 

1. Internal work (内功)

 
  

Let's start by looking at how our application layer uses redis and whether we can reclaim some keys. First, log into the redis server and run info (part of the output has been removed):

 
 
  
redis 127.0.0.1:6391> info
used_memory_human:35.58G
keyspace_hits:2580207188
db0:keys=2706740,expires=1440700
 
   

We currently only use one DB, but it holds far too many keys: 2.7 million of them, 1.44 million with an expire set. My first reaction was: wow, how did we end up with so many keys? My second thought was that some keys might simply be too big.

 

So can we optimize away the oversized keys? Unfortunately there is no official command that shows the size of each key in a DB, so we had to find our own way.

 

After some googling, we found that someone had already written a shell script for this.

 

Portal: https://gist.github.com/epicserve/5699837

 

It lists the size of every key. But it didn't work for us: we have too many keys, and after running for nine hours it still hadn't finished. Speechless. There is actually another option, a different tool:

 

Portal: https://github.com/sripathikrishnan/redis-rdb-tools

 

Unfortunately that one is too heavyweight, and we didn't want to bother the ops team, so we rolled up our sleeves and reinvented the wheel.

 

Reading through the shell script, we discovered that DEBUG OBJECT is quite handy. A quick google turned up the official docs: http://redis.io/commands/object

 

It gives simple debugging information about a key; the rest is just gluing things together:

 
  
   
#coding=utf-8
import redis

# ANSI color wrappers for the human-readable size output
COLOR_RED = "\033[31;49;1m %s \033[31;49;0m"
COLOR_GREED = "\033[32;49;1m %s \033[39;49;0m"
COLOR_YELLOW = "\033[33;49;1m %s \033[33;49;0m"
COLOR_BLUE = "\033[34;49;1m %s \033[34;49;0m"
COLOR_PINK = "\033[35;49;1m %s \033[35;49;0m"
COLOR_GREENBLUE = "\033[36;49;1m %s \033[36;49;0m"


def getHumanSize(value):
    # pretty-print a byte count, color-coded by magnitude
    gb = 1024 * 1024 * 1024.0
    mb = 1024 * 1024.0
    kb = 1024.0
    if value >= gb:
        return COLOR_RED % (str(round(value / gb, 2)) + " gb")
    elif value >= mb:
        return COLOR_YELLOW % (str(round(value / mb, 2)) + " mb")
    elif value >= kb:
        return COLOR_BLUE % (str(round(value / kb, 2)) + " kb")
    else:
        return COLOR_GREED % (str(value) + "b")


month = 3600 * 24 * 30
result = []
client = redis.Redis(host="XXXXX", port=XXXX)  # host/port redacted
client.info()

count = 0
for key in client.keys('*'):
    try:
        count += 1
        idleTime = client.object('idletime', key)   # seconds since last access
        refcount = client.object('refcount', key)
        length = client.debug_object(key)['serializedlength']
        value = idleTime * refcount
        print "%s key :%s , idletime : %s,refcount :%s, length : %s , humSize  :%s" % (count, key, idleTime, refcount, length, getHumanSize(length))
    except Exception:
        pass
      
 
 

This simple Python script prints the size, idle time, and refcount of each key. Combined with awk, all that output makes it easy to compute statistics per key. One thing to note: the size reported is the key's serialized size inside redis, not its actual size, because redis compresses the data. The analysis showed that we had no oversized keys, but there were keys that had not been accessed in six months. Orz.

 

The next step was straightforward: we set an expiration time on every key, and whenever a key is hit we push its expire time forward. That way cold data is phased out gradually, separating hot data from cold.
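
A minimal sketch of that idea (just an illustration under assumptions of mine: hypothetical set_with_expire/get_with_refresh wrappers and a 30-day TTL, not our production code):

#coding=utf-8
import redis

MONTH = 3600 * 24 * 30  # refresh window: 30 days

client = redis.Redis(host="XXXXX", port=6391)  # host redacted, port as in the info example

def set_with_expire(key, value):
    # every write gets a TTL, so nothing lives forever by default
    client.set(key, value)
    client.expire(key, MONTH)

def get_with_refresh(key):
    # on a hit, push the expire time forward so hot keys stay alive
    # while cold keys quietly age out
    value = client.get(key)
    if value is not None:
        client.expire(key, MONTH)
    return value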

 

 

2. External work (外功)

 
  

Having cleaned up invalid keys internally, we now wanted to scale out horizontally. A single machine can only carry so much, so we started the legendary "distributed" transformation.

 
Distributed systems look intimidating, and building one is even more intimidating, but fortunately we are just a caching service, so the CAP constraints are limited. For a distributed cache service the natural choice is of course consistent hashing. As it turned out, only after we finished the work did we discover that the redis team is already planning an official distributed solution (drooling); it is still under development, with Twemproxy given as an interim alternative. It is a pity we are already running our own; we will test the official one once it ships.
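
To make the idea concrete, here is a minimal consistent-hash ring sketched in Python (the node names are made up; our real implementation is linked further down):

#coding=utf-8
import bisect
import hashlib

class ConsistentHashRing(object):
    """Minimal consistent hash ring: a key maps to the first node clockwise
    from its hash, so adding or removing a node only remaps a small slice
    of the keyspace instead of reshuffling everything."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self.ring = {}             # hash -> node
        self.sorted_hashes = []
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        return int(hashlib.md5(value).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            h = self._hash("%s#%s" % (node, i))
            self.ring[h] = node
            bisect.insort(self.sorted_hashes, h)

    def get_node(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self.sorted_hashes, h) % len(self.sorted_hashes)
        return self.ring[self.sorted_hashes[idx]]

# made-up node names for illustration
ring = ConsistentHashRing(["redis-1:6391", "redis-2:6391", "redis-3:6391"])
print ring.get_node("user:12345")  # pick which redis instance holds this key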

 

Portal: http://redis.io/topics/cluster-spec

 

We achieved a smooth data migration with minimal impact from the server changes. Since we use phpredis, the code could transition smoothly as we scaled out.

 

Our own implementation: https://github.com/trigged/redis_con_hash

 

All of this really boils down to one thing: spreading redis data across multiple machines. A single machine's capacity will always be a bottleneck. Redis is not yet as polished as Memcached in this respect, but it will keep getting better.
