A Study of the YYCache Source Code

2016/03/06

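This article is not a general overview of the design ideas behind YYCache. It is a detailed walk through the code that implements it. The author of YYCache has already explained the design clearly, so there is no need for me to repeat it; if you are interested, read his blog post "YYCache 设计思路" (on the design of YYCache). I chose to analyze the code for two reasons. First, design ideas are abstract and usually easy to understand, but turning them into working code is often much harder. Second, reading the code is a way to learn the coding style of well-written software and to deepen my understanding of iOS internals, which will probably help me more.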

Code Structure

YYCache does not contain many files. There are four main ones:

YYCache

YYDiskCache

YYMemoryCache

YYKVStorage

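The relationship between them can be described with a diagram:

[Figure: relationships among YYCache, YYDiskCache, YYMemoryCache, and YYKVStorage]

YYCache is the core class of the whole caching framework. It is composed of YYDiskCache and YYMemoryCache, and YYDiskCache relies on YYKVStorage to read and write items. The sections below examine each class in turn. I find the code easier to follow from the bottom up, so the order is YYKVStorage, YYDiskCache, YYMemoryCache, YYCache.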

YYKVStorage

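The YYKVStorage file contains two classes, YYKVStorageItem and YYKVStorage. The former serves the latter by holding a key-value pair together with its metadata. YYKVStorageItem wraps each stored piece of data into an item and is static; YYKVStorage reads and writes those items and is dynamic. To use an analogy, YYKVStorageItem is the goods in a warehouse and YYKVStorage is the warehouse keeper. Note that YYKVStorage is not thread-safe, so you must make sure that only one thread accesses a YYKVStorage instance at a time.
The structure of YYKVStorageItem is simple: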

@interface YYKVStorageItem : NSObject
@property (nonatomic, strong) NSString *key;                ///< key
@property (nonatomic, strong) NSData *value;                ///< value
@property (nullable, nonatomic, strong) NSString *filename; ///< filename (nil if inline)
@property (nonatomic) int size;                             ///< value's size in bytes
@property (nonatomic) int modTime;                          ///< modification unix timestamp
@property (nonatomic) int accessTime;                       ///< last access unix timestamp
@property (nullable, nonatomic, strong) NSData *extendedData; ///< extended data (nil if no extended data)
@end

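key: the identifier that uniquely identifies the item
value: the stored binary data
filename: the name of the file that holds the data (nil if the data is stored inline)
size: the size of the stored data in bytes
modTime: Unix timestamp of the last modification
accessTime: Unix timestamp of the last access
extendedData: additional data attached to the item

The YYKVStorage class is declared as follows: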

@interface YYKVStorage : NSObject
#pragma mark - Attribute
@property (nonatomic, readonly) NSString *path;        ///< The path of this storage.
@property (nonatomic, readonly) YYKVStorageType type;  ///< The type of this storage.
@property (nonatomic) BOOL errorLogsEnabled;           ///< Set `YES` to enable error logs for debug.
@end

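path: the storage path
type: the storage type
errorLogsEnabled: whether to print error logs when debugging
Note: SQLite is faster than the file system for writes, but read performance depends on the size of the data. In the author's tests, files are read faster than SQLite once the data is larger than 20KB. The YYKVStorageType enum exists so that callers can make this performance trade-off.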

typedef NS_ENUM(NSUInteger, YYKVStorageType) {
    
    /// The `value` is stored as a file in file system.
    YYKVStorageTypeFile = 0,
    
    /// The `value` is stored in sqlite with blob type.
    YYKVStorageTypeSQLite = 1,
    
    /// The `value` is stored in file system or sqlite based on your choice.
    YYKVStorageTypeMixed = 2,
};

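If you need to store a large number of small values, YYKVStorageTypeSQLite performs better. If you need to store large files (an image cache, for example), use YYKVStorageTypeFile for better performance. You can also use YYKVStorageTypeMixed and decide the storage method for each item yourself.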
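To make the per-item decision concrete, here is a minimal C sketch of how a caller using the mixed type might choose where a value goes. The names `storage_target` and `choose_storage` are hypothetical and are not part of YYKVStorage; the 20KB threshold in the usage function is an assumption borrowed from the author's test result quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of YYKVStorage. */
typedef enum { STORE_INLINE_SQLITE, STORE_AS_FILE } storage_target;

/* Values larger than `threshold` bytes go to a file; the rest stay in SQLite. */
static storage_target choose_storage(size_t value_size, size_t threshold) {
    return value_size > threshold ? STORE_AS_FILE : STORE_INLINE_SQLITE;
}

/* Usage: a small record stays inline, a large image becomes a file. */
static void choose_storage_examples(void) {
    const size_t threshold = 20 * 1024;  /* assumed 20KB cut-off */
    assert(choose_storage(512, threshold) == STORE_INLINE_SQLITE);
    assert(choose_storage(200 * 1024, threshold) == STORE_AS_FILE);
}
```

A threshold of 0 sends every non-empty value to a file, and a threshold of SIZE_MAX keeps everything inline, which mirrors the two pure storage types.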

Following this idea, the YYKVStorage implementation has two parts, SQLite and files. Their storage layout is:

/*
 File:
 /path/
      /manifest.sqlite
      /manifest.sqlite-shm
      /manifest.sqlite-wal
      /data/
           /e10adc3949ba59abbe56e057f20f883e
           /e10adc3949ba59abbe56e057f20f883e
      /trash/
            /unused_file_or_folder
 
 SQL:
 create table if not exists manifest (
    key                 text,
    filename            text,
    size                integer,
    inline_data         blob,
    modification_time   integer,
    last_access_time    integer,
    extended_data       blob,
    primary key(key)
 ); 
 create index if not exists last_access_time_idx on manifest(last_access_time);
 */

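The file and SQL modes share one manifest table. When a value is stored in SQLite, the filename column of its row is empty. The advantage of this design is that looking up or updating an item's metadata only requires one table, which is a little more efficient.

Note: manifest.sqlite-shm and manifest.sqlite-wal come from a feature added in SQLite 3.7. The -wal file is the write-ahead log: when a database runs in WAL mode, every connection to it must use WAL, and a file with the -wal suffix is created next to the database file to hold the log. The -shm file is related to shared memory. The relevant passages of the SQLite documentation are quoted below for those interested.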

2.2 Write-Ahead Log (WAL) Files

A write-ahead log or WAL file is used in place of a rollback journal when SQLite is operating in WAL mode. As with the rollback journal, the purpose of the WAL file is to implement atomic commit and rollback. The WAL file is always located in the same directory as the database file and has the same name as the database file except with the 4 characters “-wal” appended. The WAL file is created when the first connection to the database is opened and is normally removed when the last connection to the database closes. However, if the last connection does not shutdown cleanly, the WAL file will remain in the filesystem and will be automatically cleaned up the next time the database is opened.

2.3 Shared-Memory Files

When operating in WAL mode, all SQLite database connections associated with the same database file need to share some memory that is used as an index for the WAL file. In most implementations, this shared memory is implemented by calling mmap() on a file created for this sole purpose: the shared-memory file. The shared-memory file, if it exists, is located in the same directory as the database file and has the same name as the database file except with the 4 characters “-shm” appended. Shared memory files only exist while running in WAL mode.

The shared-memory file contains no persistent content. The only purpose of the shared-memory file is to provide a block of shared memory for use by multiple processes all accessing the same database in WAL mode. If the VFS is able to provide an alternative method for accessing shared memory, then that alternative method might be used rather than the shared-memory file. For example, if PRAGMA locking_mode is set to EXCLUSIVE (meaning that only one process is able to access the database file) then the shared memory will be allocated from heap rather than out of the shared-memory file, and the shared-memory file will never be created.

The shared-memory file has the same lifetime as its associated WAL file. The shared-memory file is created when the WAL file is created and is deleted when the WAL file is deleted. During WAL file recovery, the shared memory file is recreated from scratch based on the contents of the WAL file being recovered.

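The SQLite part relies on sqlite3 prepared statements. Using a prepared statement takes the following steps:

1. Create a sqlite3_stmt object with sqlite3_prepare_v2()
2. Bind values to the statement's parameters with sqlite3_bind_*()
3. Execute the SQL statement with sqlite3_step()
4. Reset the prepared statement with sqlite3_reset(), then repeat from step 2 as many times as needed
5. Destroy the statement and free its resources with sqlite3_finalize()

sqlite3_bind_* comes in several forms, one for each value type: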

int sqlite3_bind_blob(sqlite3_stmt*, int, const void*, int n, void(*)(void*)); 
int sqlite3_bind_double(sqlite3_stmt*, int, double);
int sqlite3_bind_int(sqlite3_stmt*, int, int);
int sqlite3_bind_int64(sqlite3_stmt*, int, sqlite3_int64);
int sqlite3_bind_null(sqlite3_stmt*, int);
int sqlite3_bind_text(sqlite3_stmt*, int, const char*, int n, void(*)(void*));
int sqlite3_bind_text16(sqlite3_stmt*, int, const void*, int, void(*)(void*));
int sqlite3_bind_value(sqlite3_stmt*, int, const sqlite3_value*);
int sqlite3_bind_zeroblob(sqlite3_stmt*, int, int n);

YYDiskCache

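YYDiskCache is a thread-safe cache that stores key-value pairs, backed by SQLite and the file system (similar to the disk cache of NSURLCache).
YYDiskCache has the following features:

It uses LRU (least recently used) to remove objects.

It can be limited by cost, count, and age.

It can automatically evict objects when free disk space runs out.

It can automatically decide the storage type (SQLite or file) for each object to get better performance.

The structure of YYDiskCache is as follows: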

@interface YYDiskCache : NSObject
#pragma mark - Attribute
/** The name of the cache. Default is nil. */
@property (nullable, copy) NSString *name;
/** The path of the cache (read-only). */
@property (readonly) NSString *path;
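/**
 *  If an object's data size (in bytes) is larger than this value, the object is
 *  stored as a file; otherwise it is stored in sqlite.
 *  0 means all objects are stored as separate files; NSUIntegerMax means all
 *  objects are stored in sqlite.
 */
@property (readonly) NSUInteger inlineThreshold;
/**
 *  If this block is not nil, it is used to archive objects instead of NSKeyedArchiver.
 *  You can use this block to support objects that do not conform to the `NSCoding` protocol.
 */
@property (nullable, copy) NSData *(^customArchiveBlock)(id object);
/**
 *  Used to unarchive objects that do not conform to the `NSCoding` protocol.
 */
@property (nullable, copy) id (^customUnarchiveBlock)(NSData *data);
/**
 *  When an object needs to be saved as a file, this block is invoked to generate
 *  a file name for the given key.
 *  If the block is nil, the cache uses the MD5 hash of the key as the file name.
 */
@property (nullable, copy) NSString *(^customFileNameBlock)(NSString *key);
#pragma mark - Limit
/**
 *  The maximum number of objects the cache should hold.
 *  The default value is NSUIntegerMax. This is not a strict limit: if the cache
 *  goes over it, some objects are evicted later on a background queue.
 */
@property NSUInteger countLimit;
/**
 *  The maximum total cost the cache can hold before it starts evicting objects.
 *  If the cache goes over this limit, some objects are evicted later on a background queue.
 */
@property NSUInteger costLimit;
/**
 *  The maximum lifetime of objects in the cache.
 *  If an object goes over this limit, it is evicted on a background queue.
 */
@property NSTimeInterval ageLimit;
/**
 *  The minimum free disk space the cache should keep.
 *  If free disk space drops below this value, the cache removes objects to free some disk space.
 */
@property NSUInteger freeDiskSpaceLimit;
/**
 *  The cache holds an internal timer that checks whether the cache has reached
 *  its limits; if it has, it starts evicting objects.
 */
@property NSTimeInterval autoTrimInterval;
/**
 Set `YES` to enable error logs for debug.
 */
@property BOOL errorLogsEnabled;
@end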

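The YYDiskCache implementation uses dispatch_semaphore for synchronization instead of OSSpinLock, a spin lock with excellent performance. I looked into the reason.

OSSpinLock: because it neither enters the kernel nor suspends the thread, OSSpinLock performs extremely well. However, under heavy concurrency (frequent conflicts, fierce contention), or when the protected code is slow (for example kernel work such as file I/O, sockets, or threads), waiting threads keep spinning and CPU usage can spike. It is therefore better suited to short, cheap critical sections.
dispatch_semaphore: GCD's semaphore for controlling concurrency. Its wait/signal operations cap the number of threads that may run concurrently, and when that cap is 1 it can serve as a mutual-exclusion lock. Note that it does not support recursive locking. Its performance is lower than OSSpinLock's but still surprisingly good, and it does not suffer from OSSpinLock's CPU spikes. Keep in mind, though, that it is a semaphore designed for GCD concurrency control rather than a dedicated lock.

For slow, contention-prone read operations, dispatch_semaphore is a good fit. Where performance requirements are strict, OSSpinLock can be considered, provided the critical section is short enough. Because YYDiskCache holds its lock for relatively long periods, OSSpinLock would drive CPU usage up sharply; dispatch_semaphore performs much better by comparison.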
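The idea of a semaphore with a count of 1 acting as a lock can be sketched in plain C. This is not GCD: a POSIX unnamed semaphore (`sem_t`) stands in for dispatch_semaphore, the `sem_lock_*` names are hypothetical, and the code assumes a platform that supports unnamed POSIX semaphores, such as Linux.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* A lock built from a counting semaphore whose initial count is 1. */
typedef struct { sem_t sem; } sem_lock;

static int sem_lock_init(sem_lock *l) {
    assert(l != NULL);
    return sem_init(&l->sem, 0, 1);   /* 0: not shared between processes */
}

/* Sleeps until the count is positive, then decrements it to 0. */
static void sem_lock_acquire(sem_lock *l) {
    while (sem_wait(&l->sem) == -1 && errno == EINTR) {
        /* interrupted by a signal: retry */
    }
}

/* Non-blocking attempt; returns 1 on success, 0 if the lock is held. */
static int sem_lock_try(sem_lock *l) { return sem_trywait(&l->sem) == 0; }

/* Increments the count back to 1 and wakes one waiter, if any. */
static void sem_lock_release(sem_lock *l) { sem_post(&l->sem); }

static void sem_lock_destroy(sem_lock *l) { sem_destroy(&l->sem); }

/* Usage: returns 1 if the lock behaved as a non-recursive mutex. */
static int sem_lock_demo(void) {
    sem_lock lock;
    if (sem_lock_init(&lock) != 0) return -1;
    sem_lock_acquire(&lock);
    int while_held = sem_lock_try(&lock);    /* fails: count is already 0 */
    sem_lock_release(&lock);
    int after_release = sem_lock_try(&lock); /* succeeds: count is 1 again */
    sem_lock_release(&lock);
    sem_lock_destroy(&lock);
    return while_held == 0 && after_release == 1;
}
```

A waiter blocked in `sem_lock_acquire` sleeps instead of spinning, which is the property that matters when the critical section performs slow disk I/O. The failed second attempt in the demo also shows why such a lock is not recursive.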

YYMemoryCache

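YYMemoryCache is a fast in-memory cache that stores key-value pairs. Unlike NSDictionary, keys are retained rather than copied. Its API and performance are close to NSCache's, and all of its methods are thread-safe.

YYMemoryCache differs from NSCache in the following ways:

It uses LRU (least recently used) to remove objects; NSCache's eviction policy is non-deterministic.

It can be limited by cost, count, and age; NSCache's limits are imprecise.

It can automatically evict objects when a memory warning is received or the app enters the background.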

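YYMemoryCache uses pthread_mutex for synchronization. A read-write lock has no advantage in the cost of each lock operation. Its main benefit appears under heavy multithreaded load, where contention can be fierce and an ordinary lock noticeably reduces concurrency, whereas a read-write lock lets all readers proceed without blocking each other. In an LRU cache, however, a read also moves the accessed node to the head of the list, so reads modify shared state and cannot simply run in parallel. A memory cache is used from many threads at once, and a plain pthread_mutex is the more stable choice here.

pthread_mutex: the mutex provided by pthread, the POSIX threading library on Unix. It supports recursive locking (when created with the PTHREAD_MUTEX_RECURSIVE type). It is worth pointing out that the pthread_cond_wait() signalling mechanism also depends on this mutex; pthread_cond_wait() does not provide mutual exclusion on its own.

The implementation of YYMemoryCache is built mainly on a doubly linked list. Nodes are linked in reverse chronological order, so the most recently used node is at the head. When a node is accessed, it is moved to the head; when a new node is inserted and the cache is full, nodes are removed starting from the tail to free space.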

@interface _YYLinkedMapNode : NSObject {
    @package
    __unsafe_unretained _YYLinkedMapNode *_prev; // retained by dic
    __unsafe_unretained _YYLinkedMapNode *_next; // retained by dic
    id _key;
    id _value;
    NSUInteger _cost;
    NSTimeInterval _time;
}
@end
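The list behaviour described above can be sketched in plain C. This is an illustration under simplifying assumptions, not YYMemoryCache's code: the `lru_*` names are hypothetical, keys and values are plain ints, and the dictionary that maps a key to its node is left out so that only the list operations remain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of a linked-map node. */
typedef struct lru_node {
    struct lru_node *prev, *next;
    int key;
    int value;
} lru_node;

typedef struct {
    lru_node *head;   /* most recently used */
    lru_node *tail;   /* least recently used */
    size_t count;
} lru_list;

/* Link a detached node in at the head. */
static void lru_insert_at_head(lru_list *l, lru_node *n) {
    assert(l != NULL && n != NULL);
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
    l->count++;
}

/* Unlink a node that is currently in the list. */
static void lru_remove(lru_list *l, lru_node *n) {
    assert(l != NULL && n != NULL);
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    n->prev = n->next = NULL;
    l->count--;
}

/* On access: move the node to the head. */
static void lru_bring_to_head(lru_list *l, lru_node *n) {
    if (l->head == n) return;
    lru_remove(l, n);
    lru_insert_at_head(l, n);
}

/* On overflow: detach and return the least recently used node, or NULL. */
static lru_node *lru_remove_tail(lru_list *l) {
    lru_node *n = l->tail;
    if (n) lru_remove(l, n);
    return n;
}
```

Every operation is O(1) because each one only rewires a fixed number of pointers. Finding the node for a given key is the dictionary's job, which is why the cache pairs the list with a hash table.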

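YYMemoryCache defines YYMemoryCacheGetReleaseQueue(), which returns the queue on which objects such as the CFMutableDictionaryRef are released, so that releasing them does not block the main thread.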

static inline dispatch_queue_t YYMemoryCacheGetReleaseQueue() {
    return dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0);
}

One piece of the YYMemoryCache implementation caught my attention:

- (void)removeObjectForKey:(id)key {
    if (!key) return;
    pthread_mutex_lock(&_lock);
    _YYLinkedMapNode *node = CFDictionaryGetValue(_lru->_dic, (__bridge const void *)(key));
    if (node) {
        [_lru removeNode:node];
        if (_lru->_releaseAsynchronously) {
            dispatch_queue_t queue = _lru->_releaseOnMainThread ? dispatch_get_main_queue() : YYMemoryCacheGetReleaseQueue();
            dispatch_async(queue, ^{
                [node class]; //hold and release in queue
            });
        } else if (_lru->_releaseOnMainThread && !pthread_main_np()) {
            dispatch_async(dispatch_get_main_queue(), ^{
                [node class]; //hold and release in queue
            });
        }
    }
    pthread_mutex_unlock(&_lock);
}

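This code is clever. Once removeObjectForKey: returns, nothing else would hold the object that node points to, so it would normally be deallocated right there. But because the block passed to dispatch_async calls [node class];, the block captures and retains node, so the object stays alive. At that point the block is the object's only owner, so the object is released, and therefore deallocated, on the thread of the queue given to dispatch_async.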
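Plain C has no reference counting, but the same intent, making another thread pay the cost of destroying an object, can be sketched with an explicit hand-off. This is only a loose analogue: the names `node_t`, `release_async`, and `g_freed_on` are hypothetical, and a pthread worker stands in for the dispatch queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct { int key; int value; } node_t;   /* hypothetical node */

static pthread_t g_freed_on;   /* records which thread ran free() */

/* Runs on the worker thread, which is now the node's only owner. */
static void *release_on_worker(void *arg) {
    g_freed_on = pthread_self();
    free(arg);
    return NULL;
}

/* Hand ownership of `n` to a new background thread; returns 0 on success.
   After a successful call the caller must not touch `n` again. */
static int release_async(node_t *n, pthread_t *worker) {
    assert(n != NULL && worker != NULL);
    return pthread_create(worker, NULL, release_on_worker, n);
}
```

The caller unlinks the node while holding the lock, then passes the pointer on and returns immediately. In the Objective-C version the block capture does this hand-off implicitly, and the dispatch queue plays the role of the worker thread.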

YYCache

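YYCache is a thread-safe key-value cache. It uses YYMemoryCache to keep objects in a fast but small in-memory cache, and YYDiskCache to persist objects in a slow but large disk cache. Its class structure is also simple: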

@interface YYCache : NSObject
/** The name of the cache, readonly. */
@property (copy, readonly) NSString *name;
/** The underlying memory cache. see `YYMemoryCache` for more information.*/
@property (strong, readonly) YYMemoryCache *memoryCache;
/** The underlying disk cache. see `YYDiskCache` for more information.*/
@property (strong, readonly) YYDiskCache *diskCache;
@end
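How two such tiers can cooperate is sketched below in C. The read path shown here (try memory first, fall back to disk, then copy a disk hit back into memory) is an assumption about how a two-level cache is typically wired, not a quotation of YYCache's code. All names (`tier`, `two_level_cache`, `cache_get`, `cache_set`) are hypothetical, and both tiers are toy in-memory arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TIER_CAPACITY 8

/* A toy key-value tier; stands in for either the memory or the disk cache.
   Keys are stored by pointer, so the caller must keep the strings alive. */
typedef struct {
    const char *keys[TIER_CAPACITY];
    int values[TIER_CAPACITY];
    size_t count;
} tier;

static int tier_get(const tier *t, const char *key, int *out) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) { *out = t->values[i]; return 1; }
    }
    return 0;
}

static int tier_set(tier *t, const char *key, int value) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0) { t->values[i] = value; return 1; }
    }
    if (t->count == TIER_CAPACITY) return 0;  /* full; a real cache would evict */
    t->keys[t->count] = key;
    t->values[t->count] = value;
    t->count++;
    return 1;
}

typedef struct { tier memory; tier disk; } two_level_cache;

/* Write: store the value in both tiers. */
static void cache_set(two_level_cache *c, const char *key, int value) {
    assert(c != NULL && key != NULL);
    tier_set(&c->memory, key, value);
    tier_set(&c->disk, key, value);
}

/* Read: memory first, then disk; a disk hit is copied back into memory. */
static int cache_get(two_level_cache *c, const char *key, int *out) {
    assert(c != NULL && key != NULL && out != NULL);
    if (tier_get(&c->memory, key, out)) return 1;
    if (tier_get(&c->disk, key, out)) {
        tier_set(&c->memory, key, *out);
        return 1;
    }
    return 0;
}
```

With this arrangement, a value that exists only on disk is served from memory on the second read, so repeated reads avoid the slow tier.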