Huffman Trees and Huffman Coding

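  • Huffman tree

Weighted path length (WPL): suppose a binary tree has n leaf nodes, the k-th leaf carries weight W_k, and the path from the root to that leaf has length L_k. The weighted path length of the tree is the sum of the weighted path lengths of all its leaves:

      WPL = \sum_{k=1}^{n} W_k L_k

Optimal binary tree (Huffman tree): a binary tree whose WPL is the smallest among all binary trees with the same leaf weights.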
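As a worked example, take the seven weights from the sample input in the problem below (1, 1, 1, 3, 3, 6, 6) together with the code lengths of the first sample submission (5, 5, 4, 3, 2, 2, 2), read as leaf depths L_k. This is only an illustration of the formula; the pairing of weights and depths follows the order in which the sample lists them.

```latex
\begin{aligned}
WPL &= \sum_{k=1}^{7} W_k L_k \\
    &= 1\cdot 5 + 1\cdot 5 + 1\cdot 4 + 3\cdot 3 + 3\cdot 2 + 6\cdot 2 + 6\cdot 2 \\
    &= 5 + 5 + 4 + 9 + 6 + 12 + 12 = 53
\end{aligned}
```

The sample marks that submission "Yes", so 53 is the minimum WPL for these weights.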

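  • Properties of a Huffman tree:

  ① It has no node of degree 1.
  ② A Huffman tree with n leaves has 2n-1 nodes in total.
  ③ Swapping the left and right subtrees of any non-leaf node still yields a Huffman tree.
  ④ For the same set of weights {w1, w2, ……, wn} there can be two non-isomorphic Huffman trees.
  • Building a Huffman tree

Repeatedly merge the two binary trees with the smallest weights into one, until a single tree remains (a min-heap makes each selection cheap).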
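The merge rule can be sketched with the standard library's `std::priority_queue` instead of a hand-written heap. This is a minimal sketch, not the code used later in this post: the function name `minWpl` is made up for illustration, and it only returns the minimum WPL rather than building the tree. It relies on the fact that the WPL equals the sum of the weights of all internal nodes created by the merges.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Minimum WPL over all binary trees whose leaves carry `weights`.
// Hypothetical helper name; each merge creates one internal node, and the
// sum of all internal-node weights equals the WPL of the resulting tree.
long long minWpl(const std::vector<int>& weights) {
    assert(!weights.empty());
    // Min-heap: std::greater puts the smallest weight on top.
    std::priority_queue<long long, std::vector<long long>, std::greater<long long>>
        heap(weights.begin(), weights.end());
    long long wpl = 0;
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();  // lightest tree
        long long b = heap.top(); heap.pop();  // second lightest tree
        wpl += a + b;                          // weight of the new internal node
        heap.push(a + b);                      // merged tree goes back into the heap
    }
    return wpl;
}
```

For the weights 1, 1, 1, 3, 3, 6, 6 the merges produce internal nodes 2, 3, 6, 9, 12 and 21, so `minWpl` returns 53.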

Basic operations

  • Building the HuffmanTree
  • Computing the weighted path length
  • HuffmanCoding
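#include <limits.h> /* UINT_MAX */
#include <stdio.h>  /* printf, scanf, puts */
#include <stdlib.h> /* malloc, free */
#include <string.h> /* strcpy */
typedef struct
{
    unsigned int weight;
    unsigned int parent, lchild, rchild;
} HTNode, *HuffmanTree; /* Huffman tree stored in a dynamically allocated array */
typedef char **HuffmanCode; /* Huffman code table stored in a dynamically allocated array */
int min1(HuffmanTree t, int i)
{ /* called by select2(): index of the lightest node in t[1..i] that has no parent yet */
    int j, flag = 0;
    unsigned int k = UINT_MAX; /* start k at an upper bound for any weight */
    for (j = 1; j <= i; j++)
        if (t[j].weight < k && t[j].parent == 0)
            k = t[j].weight, flag = j;
    t[flag].parent = 1; /* temporary mark so the next call skips it; the caller sets the real parent */
    return flag;
}
/* Named select2 because a function called select() can clash with the POSIX
   select() that <stdlib.h> declares on some systems. */
void select2(HuffmanTree t, int i, int *s1, int *s2)
{ /* s1 is the smaller index of the two lightest nodes */
    int j;
    *s1 = min1(t, i);
    *s2 = min1(t, i);
    if (*s1 > *s2)
    {
        j = *s1;
        *s1 = *s2;
        *s2 = j;
    }
}
void HuffmanCoding(HuffmanTree *HT, HuffmanCode *HC, int *w, int n)
{ /* w holds the weights (all > 0) of n characters; build the Huffman tree HT and the n codes HC */
    int m, i, s1, s2;
    unsigned c, cdlen;
    HuffmanTree p;
    char *cd;
    if (n <= 1)
        return;
    m = 2 * n - 1;
    *HT = (HuffmanTree)malloc((m + 1) * sizeof(HTNode)); /* slot 0 is unused */
    for (p = *HT + 1, i = 1; i <= n; ++i, ++p, ++w)
    {
        (*p).weight = *w;
        (*p).parent = 0;
        (*p).lchild = 0;
        (*p).rchild = 0;
    }
    for (; i <= m; ++i, ++p)
        (*p).parent = 0;
    for (i = n + 1; i <= m; ++i) /* build the Huffman tree */
    { /* pick the two parentless nodes of smallest weight in HT[1..i-1]; their indices are s1 and s2 */
        select2(*HT, i - 1, &s1, &s2);
        (*HT)[s1].parent = (*HT)[s2].parent = i;
        (*HT)[i].lchild = s1;
        (*HT)[i].rchild = s2;
        (*HT)[i].weight = (*HT)[s1].weight + (*HT)[s2].weight;
    }
    /* Traverse the tree without a stack or recursion to obtain the codes */
    *HC = (HuffmanCode)malloc((n + 1) * sizeof(char *));
    /* vector of n code pointers ([0] is unused) */
    cd = (char *)malloc(n * sizeof(char)); /* working buffer: a code has at most n-1 symbols plus '\0' */
    c = m;
    cdlen = 0;
    for (i = 1; i <= m; ++i)
        (*HT)[i].weight = 0; /* weight is reused as the visit state during the traversal */
    while (c)
    {
        if ((*HT)[c].weight == 0)
        { /* state 0: go left */
            (*HT)[c].weight = 1;
            if ((*HT)[c].lchild != 0)
            {
                c = (*HT)[c].lchild;
                cd[cdlen++] = '0';
            }
            else if ((*HT)[c].rchild == 0)
            { /* leaf: record the code of its character */
                (*HC)[c] = (char *)malloc((cdlen + 1) * sizeof(char));
                cd[cdlen] = '\0';
                strcpy((*HC)[c], cd); /* copy the code string */
            }
        }
        else if ((*HT)[c].weight == 1)
        { /* state 1: go right */
            (*HT)[c].weight = 2;
            if ((*HT)[c].rchild != 0)
            {
                c = (*HT)[c].rchild;
                cd[cdlen++] = '1';
            }
        }
        else
        { /* state 2: both subtrees done, go back up */
            (*HT)[c].weight = 0;
            c = (*HT)[c].parent;
            --cdlen; /* back at the parent, the code is one symbol shorter */
        }
    }
    free(cd);
}
int main(void)
{
    HuffmanTree HT;
    HuffmanCode HC;
    int *w, n, i;
    printf("Enter the number of weights (>1): ");
    if (scanf("%d", &n) != 1 || n <= 1)
        return 1; /* HuffmanCoding() builds nothing for n <= 1 */
    w = (int *)malloc(n * sizeof(int));
    printf("Enter the %d weights (integers):\n", n);
    for (i = 0; i <= n - 1; i++)
        scanf("%d", w + i);
    HuffmanCoding(&HT, &HC, w, n);
    for (i = 1; i <= n; i++)
        puts(HC[i]);
    return 0;
}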
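The program above reads the codes off by walking the tree from the root without a stack. A common alternative, sketched below in C++, walks from each leaf up to the root through the `parent` field and reverses the collected bits. This is a minimal sketch under assumptions: the names `Node` and `huffmanCodes` are made up for illustration, the layout (1-based slots, 0 meaning "none", 2n-1 nodes) mirrors the array used above, and ties between equal weights may be broken differently than in the C program, so individual codes can differ while the WPL stays the same.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Array-based tree node; index 0 means "no node" (hypothetical type for this sketch).
struct Node { unsigned weight; int parent, lchild, rchild; };

// Returns codes[1..n] for weights w[0..n-1]; codes[0] is unused.
std::vector<std::string> huffmanCodes(const std::vector<unsigned>& w) {
    const int n = static_cast<int>(w.size());
    assert(n >= 2);
    const int m = 2 * n - 1;                       // n leaves -> 2n-1 nodes in total
    std::vector<Node> t(m + 1, Node{0, 0, 0, 0});
    for (int i = 1; i <= n; ++i) t[i].weight = w[i - 1];
    for (int i = n + 1; i <= m; ++i) {
        int s1 = 0, s2 = 0;                        // lightest and second-lightest roots
        for (int j = 1; j < i; ++j) {
            if (t[j].parent != 0) continue;        // already merged into a bigger tree
            if (s1 == 0 || t[j].weight < t[s1].weight) { s2 = s1; s1 = j; }
            else if (s2 == 0 || t[j].weight < t[s2].weight) { s2 = j; }
        }
        t[s1].parent = t[s2].parent = i;
        t[i].lchild = s1;
        t[i].rchild = s2;
        t[i].weight = t[s1].weight + t[s2].weight;
    }
    std::vector<std::string> codes(n + 1);
    for (int i = 1; i <= n; ++i) {
        std::string rev;                           // bits collected from leaf to root
        for (int c = i, p = t[i].parent; p != 0; c = p, p = t[p].parent)
            rev += (t[p].lchild == c) ? '0' : '1';
        codes[i].assign(rev.rbegin(), rev.rend()); // reverse to get root-to-leaf order
    }
    return codes;
}
```

With the two weights 1 and 2, the lighter leaf becomes the left child, so the codes are "0" and "1".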

Problem

  • Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2 ≤ N ≤ 63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]
where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤ 1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]
where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.
  • Output Specification:

For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.

  • Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.
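The note means a submission has to pass two independent tests: its total cost must equal the optimal WPL, and no code may be a prefix of another. A minimal sketch of both tests follows; the names `isPrefixFree` and `codeCost` are made up for illustration, and the prefix test uses sorting rather than the trie used in the accepted code below. After a lexicographic sort, every string that has `a` as a prefix sits directly after `a`, so comparing neighbours is enough.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if no code is a prefix of (or equal to) another code.
bool isPrefixFree(std::vector<std::string> codes) {
    std::sort(codes.begin(), codes.end());
    for (std::size_t i = 0; i + 1 < codes.size(); ++i) {
        const std::string& a = codes[i];
        const std::string& b = codes[i + 1];
        // Compare the first a.size() characters of b with a.
        if (b.compare(0, a.size(), a) == 0) return false;
    }
    return true;
}

// Total cost of a submission: sum of frequency times code length.
long long codeCost(const std::vector<int>& freq, const std::vector<std::string>& codes) {
    assert(freq.size() == codes.size());
    long long cost = 0;
    for (std::size_t i = 0; i < codes.size(); ++i)
        cost += static_cast<long long>(freq[i]) * static_cast<long long>(codes[i].size());
    return cost;
}
```

A submission is then accepted when `isPrefixFree(codes)` holds and `codeCost(freq, codes)` equals the minimum WPL computed from the frequencies.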

  • Sample Input:

7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 1
  • Sample Output:
Yes
Yes
No
No

Accepted (AC) code


#include <iostream>
#include <cstdio>
#include <vector>
#include <string>
using namespace std;

#define MAXSIZE 64
int nodenum; // N, the number of distinct characters
int c[64];   // c[i]: frequency of the i-th character (names are swapped relative to the problem statement)
char f[64];  // f[i]: the i-th character

typedef struct TNode* HuffTree;
struct TNode
{
    HuffTree left;
    HuffTree right;
    int freq;
};

struct heap
{
    HuffTree* data;
    int size;
    int capacity;
};
typedef struct heap* Minheap;

Minheap CreatHeap()
{
    Minheap H = new struct heap;
    H->data = new HuffTree[MAXSIZE];
    H->size = 0;
    H->capacity = MAXSIZE - 1; // data[0] holds the sentinel, so only MAXSIZE - 1 slots are usable
    H->data[0] = new struct TNode;
    H->data[0]->freq = -1; // sentinel, smaller than any real frequency
    return H;
}

bool Insert(Minheap H, HuffTree f)
{
    int i;
    if (H->size == H->capacity) {
        printf("minheap is full");
        return false;
    }
    i = ++H->size;
    for (; H->data[i / 2]->freq > f->freq; i /= 2)
        H->data[i] = H->data[i / 2];
    H->data[i] = f;
    return true;
}

bool IsEmpty(Minheap H)
{
    return (H->size == 0);
}

HuffTree DeleteMin(Minheap H)
{
    int Parent, Child;
    HuffTree MinItem, X;
    if (IsEmpty(H)) {
        printf("minheap is empty");
        return NULL; // nothing to remove
    }
    MinItem = H->data[1];
    X = H->data[H->size--];
    for (Parent = 1; Parent * 2 <= H->size; Parent = Child) {
        Child = Parent * 2;
        if ((Child != H->size) && (H->data[Child]->freq > H->data[Child + 1]->freq))
            Child++;
        if (X->freq <= H->data[Child]->freq) break;
        else
            H->data[Parent] = H->data[Child];
    }
    H->data[Parent] = X;
    return MinItem;
}

HuffTree HuffmanTree()
{
    Minheap h = CreatHeap();
    for (int i = 0; i < nodenum; i++)
    {
        HuffTree f = new struct TNode;
        f->freq = c[i]; // taken from the global frequency array
        f->left = NULL;
        f->right = NULL;
        if (!Insert(h, f))
            break;
    }
    for (;;)
    {
        if (h->size == 1) break;
        HuffTree f = new struct TNode;
        f->left = DeleteMin(h);
        f->right = DeleteMin(h);
        f->freq = f->left->freq + f->right->freq;
        Insert(h, f);
        //printf("fleft=%d,fright=%d,ffreq=%d\n",f->left->freq,f->right->freq,f->freq);
    }
    return DeleteMin(h);
}

int WPL(HuffTree tree, int depth)
{
    if ((!tree->left) && (!tree->right))
    {
        //printf("depth:%d,leaf->freq:%d\n",depth,tree->freq);
        return depth * (tree->freq);
    }
    else
    {
        //printf("depth:%d,tree->freq:%d\n",depth,tree->freq);
        return WPL(tree->left, depth + 1) + WPL(tree->right, depth + 1);
    }
}

// Insert code s into the trie rooted at tree; return false if the prefix property is violated.
// In the trie, freq is reused as a flag: 1 marks a node where some code ends, 0 any other node.
bool check(HuffTree tree, string s)
{
    bool created = false;
    HuffTree p = tree;
    for (int i = 0; i < (int)s.size(); i++)
    {
        HuffTree &next = (s[i] == '0') ? p->left : p->right;
        if (!next)
        {
            next = new TNode;
            next->left = NULL;
            next->right = NULL;
            next->freq = 0;
            created = true;
        }
        p = next;
        if (p->freq == 1)
            return false; // an earlier code is a prefix of s (or equal to it)
    }
    if (!created)
        return false; // s is a prefix of an earlier code
    p->freq = 1;
    return true;
}

int main()
{
    int case_;
    cin >> nodenum;
    for (int i = 0; i < nodenum; i++)
    {
        cin >> f[i] >> c[i];
    }
    HuffTree tree = HuffmanTree();
    int wpl = WPL(tree, 0); // the WPL could also be accumulated while building the Huffman tree
    cin >> case_;
    for (int j = 0; j < case_; j++)
    {
        HuffTree root = new TNode;
        root->left = NULL;
        root->right = NULL;
        root->freq = 0;
        int s_wpl = 0;
        string judge = "";
        for (int i = 0; i < nodenum; i++)
        {
            char ch;
            string s;
            cin >> ch >> s;
            if (!judge.empty())
                continue; // already rejected, but the rest of this submission must still be read
            if ((int)s.size() > nodenum - 1) { // an optimal code never needs more than N-1 bits
                judge = "No";
                continue;
            }
            s_wpl += (int)s.size() * c[i];
            if (!check(root, s)) // check the prefix property
                judge = "No";
        }
        if (judge.empty() && s_wpl == wpl)
            judge = "Yes";
        else
            judge = "No";
        cout << judge << endl;
    }
    return 0;
}
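  • Supplement

// Supplement for understanding the idea; time complexity O(N*logN)
HuffTree HuffmanTree(Minheap H)
{
    // Assumes the H->size weights are already stored in H->data[]->freq.
    // BuildMinTree is not defined in this post; it is expected to turn H->data[] into a min-heap by weight.
    int i, n;
    HuffTree T;
    BuildMinTree(H);
    n = H->size; // save the leaf count, because H->size changes inside the loop
    for (i = 1; i < n; i++) // n leaves need exactly n-1 merges
    {
        T = (HuffTree)malloc(sizeof(struct TNode)); // malloc needs <cstdlib>
        T->left = DeleteMin(H);  // remove the smallest node from the heap; it becomes the left child of T
        T->right = DeleteMin(H); // remove the next smallest node; it becomes the right child of T
        T->freq = T->left->freq + T->right->freq; // weight of the new node
        Insert(H, T);
    }
    T = DeleteMin(H);
    return T;
}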

Problem source

05-树9 Huffman Codes

Reference

05-树9 Huffman Codes

05-树9 Huffman Codes (implemented with a priority queue)
