Problem Statement

In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

Input Format

Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]

where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]

where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.

Output Format

For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.

Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

Sample Input (corresponds to Figure 1):

7

A 1 B 1 C 1 D 3 E 3 F 6 G 6

4

A 00000

B 00001

C 0001

D 001

E 01

F 10

G 11

A 01010

B 01011

C 0100

D 011

E 10

F 11

G 00

A 000

B 001

C 010

D 011

E 100

F 101

G 110

A 00000

B 00001

C 0001

D 001

E 00

F 10

G 11

Sample Output

Yes

Yes

No

No

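Vocabulary

redundancy
UK /rɪ'dʌnd(ə)nsɪ/ US /rɪ'dʌndənsi/
n. [computing][math] redundancy (same as redundance); layoff; overstaffing

hence
UK /hens/ US /hɛns/
adv. therefore; from now on

professor
UK /prə'fesə/ US /prə'fɛsɚ/
n. professor; teacher; one who publicly professes a belief

respectively
UK /rɪ'spektɪvlɪ/ US /rɪ'spɛktɪvli/
adv. separately; each in turn, individually

compress
UK /kəm'pres/ US /kəm'prɛs/
vt. to compress, to press together; to condense
vi. to shrink under pressure

correct
UK /kə'rekt/ US /kə'rɛkt/
adj. (politically or ideologically) correct; proper; upright
v. to correct; to mark (students' work); to rectify; to point out errors; to offset; to calibrate (an instrument); to adjust (data)

submit
UK /səb'mɪt/ US /səb'mɪt/
vi. to yield, to obey
vt. to make obey; to contend; to present; to hand in

distinct
UK /dɪ'stɪŋ(k)t/ US /dɪ'stɪŋkt/
adj. obvious; unique; clear; different

submission
UK /səb'mɪʃ(ə)n/ US /səb'mɪʃən/
n. surrender; submission (something submitted); obedience; argument (presented to a judge); humility

optimal
UK /'ɒptɪm(ə)l/ US /'ɑptəml/
adj. best; most favorable

prefix
UK /'priːfɪks/ US /'prifɪks/
n. prefix
vt. to add a prefix; to put something in front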

Analysis

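This is probably the hardest of the tree problems. It exercises binary heaps and Huffman trees, and the key realization is that an optimal code does not have to be built by the Huffman algorithm: it only needs an optimal WPL (weighted path length).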

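From the hints in the problem statement, an optimal character code must satisfy two conditions:

1. The WPL must be minimal (matching the WPL of a constructed Huffman tree guarantees this).

2. No character's code may be a prefix of another character's code (that is, every coded character must sit at a leaf node).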

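For the first condition, we can build a Huffman tree and compute its WPL. (Note: although a minimum-WPL code does not have to come from a Huffman tree, a Huffman tree always achieves the minimum WPL.) Building the Huffman tree requires a binary min-heap; see the code below for details.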

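For the second condition, my approach is to build a tree from the codes the student submitted. Starting at the root, for each '0': if the current node has no left child, malloc a new node (new nodes always get frequency 0) and move to it; if it already has one, move to that existing left child. For each '1', do the same with the right child. Repeat until the code ends, then assign the character's frequency to the final node. (Because the codes are submitted in the same order as the input characters, I keep a global array fre[] that stores the frequencies in order.)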

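If a node with a nonzero frequency is encountered along the path, the path has run into a leaf, so this code violates the second condition. Likewise, if the final node of a path already has a left or right child, this code is a prefix of an earlier one, which is also a prefix conflict.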

Implementation

#include <stdio.h>

#include <stdlib.h>

#include <stdbool.h>

typedef struct HFMTreeNode *Node;

typedef Node Root;

struct HFMTreeNode

{

	int frequency;

	Node lchild;

	Node rchild;

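};// Huffman tree node

Node *MinHeap;

int HeapLength;

// min-heap

int M, N;

int *fre;// stores the frequency of each c[i], in input order

void create_heap(int maxsize);

void insert_heap(int pos, Node n);// inserts a whole node so its left and right subtrees survive, because DeleteMin is used to build the Huffman tree

Node DeleteMin_heap(void);

// min-heap operations

Root create_hfmtree(void);

int count_wpl(Root,int);

// Huffman tree operations

bool is_prefix_true(Root);

// checks that the submitted codes are prefix-free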

void create_heap(int maxsize)

{

	fre = (int*)malloc((maxsize - 1) * sizeof(int));

	MinHeap = (Node*)malloc(maxsize * sizeof(Node));

	HeapLength = 0;

	MinHeap[0] = (Node)malloc(sizeof(struct HFMTreeNode));// sentinel at index 0: frequency 0 here (a negative value also works), it only has to be smaller than every possible frequency

	MinHeap[0]->frequency = 0;

	MinHeap[0]->lchild = NULL;

	MinHeap[0]->rchild = NULL;

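	for (int i = 1; i < maxsize; i++)

	{

		char c;

		int frequency;

		scanf(" %c %d", &c, &frequency);// the leading space skips whitespace before the character

		fre[i - 1] = frequency;

		Node n = (Node)malloc(sizeof(struct HFMTreeNode));

		n->frequency = frequency;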

		n->lchild = NULL;

		n->rchild = NULL;

		insert_heap(i, n);

	}

}

void insert_heap(int pos, Node n)

{

	while (n->frequency < MinHeap[pos / 2]->frequency)

	{

		MinHeap[pos] = MinHeap[pos / 2];

		pos /= 2;

	}

	MinHeap[pos] = n;

	HeapLength++;

}

Node DeleteMin_heap(void)

{

	Node temp=MinHeap[1];

	MinHeap[1] = MinHeap[HeapLength];

	HeapLength--;

	int parents = 1;

	int child = 2 * parents;

	while (child <= HeapLength)

	{

		if (child + 1 <= HeapLength)

		{

			if (MinHeap[child]->frequency > MinHeap[child+1]->frequency)

				child++;

		}

		if (MinHeap[parents]->frequency > MinHeap[child]->frequency)

		{

			Node temp = MinHeap[parents];

			MinHeap[parents] = MinHeap[child];

			MinHeap[child] = temp;

		}

		else

			break;

		parents = child;

		child = 2 * parents;

	}

	return temp;

}

// min-heap operations

Root create_hfmtree(void)

{

	Node Root = NULL;

	scanf("%d", &N);

	create_heap(N + 1);

	while (1)

	{

		Node left = DeleteMin_heap();

		Node right = DeleteMin_heap();

		Node newnode = (Node)malloc(sizeof(struct HFMTreeNode));

		newnode->frequency = left->frequency + right->frequency;

		newnode->lchild = left;

		newnode->rchild = right;

		if (HeapLength != 0)

			insert_heap(HeapLength + 1, newnode);

		else

		{

			Root = newnode;

			break;

		}

	}

	return Root;

}

int count_wpl(Root t,int depth)

{

	if (t == NULL)

		return 0;

	else if (t->lchild == NULL && t->rchild == NULL)

		return t->frequency*depth;

	else

		return count_wpl(t->lchild, depth + 1) + count_wpl(t->rchild, depth + 1);

}

// Huffman tree operations

bool is_prefix_true(Root t)

{

	int flag = 0;

	char word;

	char string[64];

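	for (int i = 0; i < N; i++)

	{

		scanf(" %c %63s", &word, string);// one submission line: character, then its code

		char *p = string;

		Node Last = t;

		while (*p != '\0')

		{

			if (*p == '0')

			{

				if (Last->lchild == NULL)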

				{

					Node Nnode = (Node)malloc(sizeof(struct HFMTreeNode));

					Nnode->frequency = 0;

					Nnode->lchild = NULL;

					Nnode->rchild = NULL;

					Last->lchild = Nnode;

					Last = Nnode;

				}

				else

				{

					if (Last->lchild->frequency != 0)

						flag = 1;

					Last = Last->lchild;

				}

			}

			else if(*p == '1')

			{

				if (Last->rchild == NULL)

				{

					Node Nnode = (Node)malloc(sizeof(struct HFMTreeNode));

					Nnode->frequency = 0;

					Nnode->lchild = NULL;

					Nnode->rchild = NULL;

					Last->rchild = Nnode;

					Last = Nnode;

				}

				else

				{

					if (Last->rchild->frequency != 0)

						flag = 1;

					Last = Last->rchild;

				}

			}

			p++;

		}

		if (Last->lchild != NULL || Last->rchild != NULL)

			flag = 1;

		else

			Last->frequency = fre[i];

	}

	if (flag == 1)

		return false;

	else

		return true;

}

// checks that the submitted codes are prefix-free

bool is_wpl_true(Root t1, Root t2)

{

	if (count_wpl(t1, 0) == count_wpl(t2, 0))

		return true;

	else

		return false;

}

// checks that the two WPL values match

int main(void)

{

	Root t = create_hfmtree();

	scanf("%d", &M);

	for (int i = 0; i < M; i++)

	{

		Root check_t = (Root)malloc(sizeof(struct HFMTreeNode));

		check_t->frequency = 0;

		check_t->lchild = NULL;

		check_t->rchild = NULL;

		if (is_prefix_true(check_t))

		{

			if (is_wpl_true(t, check_t))

				printf("Yes\n");

			else

				printf("No\n");

		}

		else

			printf("No\n");

	}

}

Data Structures MOOC, PTA 05-Tree 9 Huffman Codes