《数据结构、算法与应用 —— C++语言描述》学习笔记 — 字典 — 跳表实现
一、跳表
1、理想情况
在一个用有序链表描述的 n 个数对的字典中进行查找,至多需要 n 此关键字比较。如果在链表的中部节点加一个指针,则比较次数可以减少到
n
2
+
1
\frac{n}{2} + 1
2n?+1。这是,为了查找一个数对,首先与中间的数对比较。如果查找的数对关键字比较小,则仅在链表的左半部分继续查找;否则,在链表右半部分继续查找。其实这就是对应于二分查找的链表结构优化。
我们增加头结点和尾结点来表示跳表,其结构图示为: 为了查找关键字为30的记录,首先用2级链表查找,所需时间为 O(1)。因为 30 < 40,所以要在链表左半部分用1级链表查找,所需时间为 O(1)。又因为 30 > 24,所以要用0级链表与24的下一个数对比较。
对n个数对而言,0级链表包括所有数对,1级链表每2个数对取一个,2级链表每4个数对取一个,i 级链表每
2
i
2^i
2i 个数对取一个。一个数对属于 i 级链表,当且仅当它属于 0 ~ i 级链表,但不属于 i + 1 级链表。在上图中,关键字40是2级链表中唯一的一个数对。
上述的结构称为跳表。在该结构中有一组等级链表。0级链表包含所有数对,1级链表的数对是0级链表数对的一个子集。i 级链表的数对是 i - 1 级链表数对的子集。
2、插入和删除
在插入和删除时,要保持跳表的规则结构,需要耗时 O(n)。在规则的跳表结构中,i 级链表有
n
2
i
\frac{n}{2^i}
2in? 个记录,在插入时要尽量逼近这种结构。插入的新数对属于 i 级链表的概率为
1
2
i
\frac{1}{2^i}
2i1?。在实际确定新数对所属的链表级别时,应考虑各种可能的情况。把新数对插入 i 级链表的可能性为
p
i
p^i
pi。在上述图示中,
p
=
0.5
p = 0.5
p=0.5。对一般的 p,链表的级数为
?
l
o
g
1
p
n
?
+
1
\left \lfloor log_{\frac{1}{p}}{n} \right \rfloor + 1
?logp1??n?+1。在一个规则的跳表中,i 级链表包含
1
p
\frac{1}{p}
p1? 个 i - 1 级链表的节点。
数对的插入过程如图: 在插入时,要确定数对属于哪一级链表,分配过程由后面的随机数生成器来完成。若新数对属于 i 级链表,则插入结果仅影响前 i 级链表指针。
对于删除操作,我们无法控制结构。要删除前面插入的77。首先要找到77。然后所遇到的链表指针时节点40的2级链表指针、节点75的1级链表指针和0级链表指针。因为77为1级链表中数对的关键字,所以只需改变0级和1级链表指针即可。
3、级的分配
在规则的跳表结构中, i - 1 级链表的数对个数与 i 级链表的数对个数之比是一个分数 p。因此,属于 i - 1 级链表的数对同时属于 i 级链表的概率为 p。假设用一个统一的随机数生成器产生0和1之间的实数,产生的随机数
≤
p
\le p
≤p 的概率为 p。若下一个随机数
≤
p
\le p
≤p,则新数对应在1级链表上。要确定该数对是否在2级链表上,要由下一个随机数确定。若新的随机数
≤
p
\le p
≤p,则该元素也属于2级链表。重复这个过程,知道一个随机数
>
p
\gt p
>p 为止。
这种方法有潜在的确定啊,某些数对被分配的级数可能特别大,远远超过
l
o
g
1
p
N
log_{\frac{1}{p}}{N}
logp1??N,其中N为字典数对的最大预期数目。为避免这种情况,可以设定一个级数的上限maxLevel,最大值为
?
l
o
g
1
p
n
?
?
1
\left \lceil log_{\frac{1}{p}}{n} \right \rceil - 1
?logp1??n??1
这种方法还有一个缺点,即使采用了级数的上限 maxLevel,还可能出现这样的情况:在插入一个新数对之前有三个链表,而在插入之后就有了10个链表。也就是说,尽管3~8级没有数对,新数对却被分配到9级链表。我们可以把新纪录的链表等级调整为3。
二、跳表实现
1、节点定义
#pragma once
#include <utility>
template<typename Key, typename Value>
struct skipNode
{
using value_type = typename std::pair<Key, Value>;
value_type element;
skipNode** nextNodesAddressForDifferentLevels;
skipNode(value_type element, int level)
{
this->element = element;
nextNodesAddressForDifferentLevels = new skipNode* [level];
}
~skipNode()
{
delete[] nextNodesAddressForDifferentLevels;
}
};
书中的节点定义有些问题,主要体现在没有实现析构函数。因此在堆上创建的 nextNodesAddressForDifferentLevels 指针数组并不会得到释放,造成内存泄露。
2、声明
这次我们希望使用自解释的代码,因此基本不会写函数注释。
#pragma once
#include "skipNode.h"
#include "dictionary.h"
#include <cmath>
#include <exception>
#include <random>
#include <algorithm>
template<typename Key, typename Value>
class skipList : public dictionary<Key, Value>
{
public:
using value_type = typename std::pair<Key, Value>;
skipList(int maxPairsNums, float probabilityOfNodeNotBelongingToNextLevel, Key keyLargerThanAllKeysInDictionary);
skipList(const skipList& other);
skipList(skipList&& other);
virtual ~skipList() override;
skipList& operator=(const skipList& other);
skipList& operator=(skipList&& other);
virtual bool empty() const override;
virtual int size() const override;
virtual value_type* find(const Key & key) const override;
virtual void erase(const Key & key) override;
virtual void insert(const value_type & element) override;
private:
using node_type = typename skipNode<Key, Value>;
int dictionarySize;
int maxExpectedLevel;
int maxNoneEmptyLevel;
float probabilityOfNodeNotBelongingToNextLevel;
Key keyLargerThanAllKeysInDictionary;
node_type* headerNode;
node_type* tailNode;
skipList() = default;
void swap(skipList& other);
skipList makeCopy(const skipList& other);
void makeCopyAndSwap(const skipList& other);
node_type* findPrecursorNodeForNodeWithKeyAtLevel0(const Key& key) const;
node_type** findPrecursorNodeForNodeWithKeyAtAllLevels(const Key& key) const;
int getRandomLevelAndCompareWithMaxEmptyLevel();
int getRandomLevel() const;
void checkKeyAndThrow(const Key& key);
void deleteNextNodeAtLevel0(node_type** nodes);
void reduceLevel();
void insertElementAfterPrecursorNodesAtLevel(const value_type& element, int level, skipList<Key, Value>::node_type** precursorNodes);
void copyAllNodesAtLevel0FromAnotherSkipList(const skipList& other);
void setNextPointersForNodesAtAllLevelsAccordingToAnotherSkipList(const skipList& other);
};
3、拷贝控制
template<typename Key, typename Value>
inline skipList<Key, Value>::skipList(int maxPairs, float probabilityOfNodeNotBelongingToNextLevel, Key keyLargerThanAllKeysInDictionary)
{
this->probabilityOfNodeNotBelongingToNextLevel = probabilityOfNodeNotBelongingToNextLevel;
this->keyLargerThanAllKeysInDictionary = keyLargerThanAllKeysInDictionary;
this->maxExpectedLevel = ceil(-log(maxPairs) / log(probabilityOfNodeNotBelongingToNextLevel) - 1);
this->maxNoneEmptyLevel = -1;
this->dictionarySize = 0;
value_type tailElement;
tailElement.first = keyLargerThanAllKeysInDictionary;
this->headerNode = new node_type(tailElement, maxExpectedLevel);
this->tailNode = new node_type(tailElement, maxExpectedLevel);
for (int i = 0; i < maxExpectedLevel; ++i)
{
this->headerNode->nextNodesAddressForDifferentLevels[i] = this->tailNode;
}
}
template<typename Key, typename Value>
inline skipList<Key, Value>::skipList(const skipList& other)
{
makeCopyAndSwap(other);
}
template<typename Key, typename Value>
inline skipList<Key, Value>::skipList(skipList&& other)
{
swap(other);
}
template<typename Key, typename Value>
inline skipList<Key, Value>::~skipList()
{
while (headerNode != tailNode)
{
auto deleteNode = headerNode;
headerNode = headerNode->nextNodesAddressForDifferentLevels[0];
delete deleteNode;
}
delete tailNode;
}
template<typename Key, typename Value>
inline skipList<Key, Value>& skipList<Key, Value>::operator=(const skipList& other)
{
makeCopyAndSwap(other);
return *this;
}
template<typename Key, typename Value>
inline skipList<Key, Value>& skipList<Key, Value>::operator=(skipList&& other)
{
swap(other);
return *this;
}
实际的拷贝控制封装到内部接口进行实现。
4、拷贝控制内部接口
template<typename Key, typename Value>
inline void skipList<Key, Value>::swap(skipList& other)
{
using std::swap;
swap(dictionarySize, other.dictionarySize);
swap(maxExpectedLevel, other.maxExpectedLevel);
swap(maxNoneEmptyLevel, other.maxNoneEmptyLevel);
swap(probabilityOfNodeNotBelongingToNextLevel, other.probabilityOfNodeNotBelongingToNextLevel);
swap(keyLargerThanAllKeysInDictionary, other.keyLargerThanAllKeysInDictionary);
swap(headerNode, other.headerNode);
swap(tailNode, other.tailNode);
}
template<typename Key, typename Value>
inline skipList<Key, Value> skipList<Key, Value>::makeCopy(const skipList& other)
{
skipList returnSkipList;
if (other.headerNode == nullptr)
{
return returnSkipList;
}
returnSkipList.dictionarySize = other.dictionarySize;
returnSkipList.maxExpectedLevel, other.maxExpectedLevel;
returnSkipList.maxNoneEmptyLevel = other.maxNoneEmptyLevel;
returnSkipList.probabilityOfNodeNotBelongingToNextLevel = other.probabilityOfNodeNotBelongingToNextLevel;
returnSkipList.keyLargerThanAllKeysInDictionary = other.keyLargerThanAllKeysInDictionary;
returnSkipList.copyAllNodesAtLevel0FromAnotherSkipList(other);
returnSkipList.setNextPointersForNodesAtAllLevelsAccordingToAnotherSkipList(other);
return returnSkipList;
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::makeCopyAndSwap(const skipList& other)
{
auto copySortedChain = makeCopy(other);
swap(copySortedChain);
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::copyAllNodesAtLevel0FromAnotherSkipList(const skipList& other)
{
this->headerNode = new node_type(other.headerNode->element, other.maxExpectedLevel);
this->tailNode = new node_type(other.tailNode->element, other.maxExpectedLevel);
auto currentNodeThis = this->headerNode;
auto currentNodeOther = other.headerNode;
while (currentNodeOther->nextNodesAddressForDifferentLevels[0]->element.first != other.keyLargerThanAllKeysInDictionary)
{
currentNodeThis->nextNodesAddressForDifferentLevels[0] = new node_type(currentNodeOther->nextNodesAddressForDifferentLevels[0]->element, maxExpectedLevel);
currentNodeThis = currentNodeThis->nextNodesAddressForDifferentLevels[0];
currentNodeOther = currentNodeOther->nextNodesAddressForDifferentLevels[0];
}
currentNodeThis->nextNodesAddressForDifferentLevels[0] = this->tailNode;
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::setNextPointersForNodesAtAllLevelsAccordingToAnotherSkipList(const skipList& other)
{
for (int i = 1; i <= other.maxNoneEmptyLevel; i++)
{
auto currentLevelNodeThis = this->headerNode;
auto downLevelNodeThis = this->headerNode;
auto currentLevelNodeOther = other.headerNode;
auto downLevelNodeOther = other.headerNode;
while (currentLevelNodeOther->nextNodesAddressForDifferentLevels[i]->element.first != other.keyLargerThanAllKeysInDictionary)
{
currentLevelNodeOther = currentLevelNodeOther->nextNodesAddressForDifferentLevels[i];
while (downLevelNodeOther != currentLevelNodeOther)
{
downLevelNodeOther = downLevelNodeOther->nextNodesAddressForDifferentLevels[i - 1];
downLevelNodeThis = downLevelNodeThis->nextNodesAddressForDifferentLevels[i - 1];
}
currentLevelNodeThis->nextNodesAddressForDifferentLevels[i] = downLevelNodeThis;
currentLevelNodeThis = currentLevelNodeThis->nextNodesAddressForDifferentLevels[i];
}
currentLevelNodeThis->nextNodesAddressForDifferentLevels[i] = this->tailNode;
}
}
5、容量接口
template<typename Key, typename Value>
inline bool skipList<Key, Value>::empty() const
{
return this->headerNode->nextNodesAddressForDifferentLevels[0] == this->tailNode;
}
template<typename Key, typename Value>
inline int skipList<Key, Value>::size() const
{
return dictionarySize;
}
6、查找接口
template<typename Key, typename Value>
inline typename skipList<Key, Value>::value_type* skipList<Key, Value>::find(const Key& key) const
{
if (key >= keyLargerThanAllKeysInDictionary)
{
return nullptr;
}
auto nodePrecursorNodeForNodeWithKeyAtLevel0 = findPrecursorNodeForNodeWithKeyAtLevel0(key);
auto nextNode = nodePrecursorNodeForNodeWithKeyAtLevel0->nextNodesAddressForDifferentLevels[0];
return nextNode->element.first == key ? &(nextNode->element) : nullptr;
}
7、查找内部接口
template<typename Key, typename Value>
inline auto skipList<Key, Value>::findPrecursorNodeForNodeWithKeyAtLevel0(const Key& key) const -> skipList<Key, Value>::node_type*
{
auto precursorNode = headerNode;
for (int i = maxNoneEmptyLevel; i >= 0; --i)
{
while (precursorNode->nextNodesAddressForDifferentLevels[i]->element.first < key)
{
precursorNode = precursorNode->nextNodesAddressForDifferentLevels[i];
}
}
return precursorNode;
}
8、删除接口
template<typename Key, typename Value>
inline void skipList<Key, Value>::erase(const Key& key)
{
checkKeyAndThrow(key);
auto precursorNodes = findPrecursorNodeForNodeWithKeyAtAllLevels(key);
if (precursorNodes[0]->nextNodesAddressForDifferentLevels[0]->element.first != key)
{
return;
}
deleteNextNodeAtLevel0(precursorNodes);
reduceLevel();
dictionarySize--;
}
9、删除内部接口
template<typename Key, typename Value>
inline auto skipList<Key, Value>::findPrecursorNodeForNodeWithKeyAtAllLevels(const Key& key) const -> skipList<Key, Value>::node_type**
{
skipList<Key, Value>::node_type** precursorNodes = new node_type *[maxExpectedLevel];
std::fill(precursorNodes, precursorNodes + maxExpectedLevel, headerNode);
auto node = headerNode;
for (int i = maxNoneEmptyLevel; i >= 0; --i)
{
while (node->nextNodesAddressForDifferentLevels[i]->element.first < key)
{
node = node->nextNodesAddressForDifferentLevels[i];
}
precursorNodes[i] = node;
}
return precursorNodes;
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::checkKeyAndThrow(const Key& key)
{
if (key >= keyLargerThanAllKeysInDictionary)
{
throw std::invalid_argument("too large key");
}
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::deleteNextNodeAtLevel0(node_type** nodes)
{
auto deleteNode = nodes[0]->nextNodesAddressForDifferentLevels[0];
for (int i = 0; i <= maxNoneEmptyLevel && nodes[i]->nextNodesAddressForDifferentLevels[i] == deleteNode; ++i)
{
nodes[i]->nextNodesAddressForDifferentLevels[i] = nodes[i]->nextNodesAddressForDifferentLevels[i]->nextNodesAddressForDifferentLevels[i];
}
delete deleteNode;
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::reduceLevel()
{
while (maxNoneEmptyLevel > 0 && headerNode->nextNodesAddressForDifferentLevels[maxNoneEmptyLevel] == tailNode)
{
maxNoneEmptyLevel--;
}
}
10、插入接口
template<typename Key, typename Value>
inline void skipList<Key, Value>::insert(const value_type& element)
{
checkKeyAndThrow(element.first);
auto precursorNodes = findPrecursorNodeForNodeWithKeyAtAllLevels(element.first);
if (precursorNodes[0]->nextNodesAddressForDifferentLevels[0]->element.first == element.first)
{
precursorNodes[0]->nextNodesAddressForDifferentLevels[0]->element.second = element.second;
return;
}
int level = getRandomLevelAndCompareWithMaxEmptyLevel();
insertElementAfterPrecursorNodesAtLevel(element, level, precursorNodes);
dictionarySize++;
}
11、插入内部接口
template<typename Key, typename Value>
inline int skipList<Key, Value>::getRandomLevelAndCompareWithMaxEmptyLevel()
{
int level = getRandomLevel();
if (level > maxNoneEmptyLevel)
{
level = ++maxNoneEmptyLevel;
}
return level;
}
template<typename Key, typename Value>
inline int skipList<Key, Value>::getRandomLevel() const
{
static std::random_device rd;
static std::mt19937 gen(rd());
std::bernoulli_distribution bd(probabilityOfNodeNotBelongingToNextLevel);
int level = 0;
while (bd(gen))
{
level++;
}
return level <= maxExpectedLevel ? level : maxExpectedLevel;
}
template<typename Key, typename Value>
inline void skipList<Key, Value>::insertElementAfterPrecursorNodesAtLevel(const value_type& element, int level, skipList<Key, Value>::node_type** precursorNodes)
{
auto newNode = new node_type(element, level);
for (int i = 0; i <= level; ++i)
{
newNode->nextNodesAddressForDifferentLevels[i] = precursorNodes[i]->nextNodesAddressForDifferentLevels[i];
precursorNodes[i]->nextNodesAddressForDifferentLevels[i] = newNode;
}
}
|