PrefixTree

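208. Implement Trie (Prefix Tree)

A Trie (pronounced like "try"), or prefix tree, is a tree data structure for efficiently storing and retrieving keys in a set of strings. It has many applications, such as autocomplete and spell checking.

Implement the Trie class:

Trie() initializes the prefix tree object.
void insert(String word) inserts the string word into the prefix tree.
boolean search(String word) returns true if the string word is in the prefix tree (that is, it was inserted before the search); otherwise returns false.
boolean startsWith(String prefix) returns true if some previously inserted string word has prefix as a prefix; otherwise returns false.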

Example:

Input

["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]

Output

[null, null, true, false, true, null, true]

Explanation

Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // returns true
trie.search("app");     // returns false
trie.startsWith("app"); // returns true
trie.insert("app");
trie.search("app");     // returns true

Constraints:

1 <= word.length, prefix.length <= 2000
word and prefix consist only of lowercase English letters.
At most 3 * 10^4 calls in total will be made to insert, search, and startsWith.

Implementation: a Trie that supports a mutable wordList

class Trie {

    // Sentinel root; its data character is never matched against input.
    private final Node root = new Node('/');

    // Inserts text, creating child nodes along the path as needed.
    public void insert(char[] text) {
        Node p = root;
        for (char c : text) {
            int index = c - 'a'; // input is assumed to be lowercase a-z
            if (p.children[index] == null) {
                p.children[index] = new Node(c);
            }
            p = p.children[index];
        }
        p.isEndingChar = true;
    }

    // Returns true only if pattern was inserted as a complete word.
    public boolean find(char[] pattern) {
        Node p = root;
        for (char c : pattern) {
            int index = c - 'a';
            if (p.children[index] == null) {
                return false;
            }
            // Descend to the child; without this step p never leaves the root.
            p = p.children[index];
        }
        return p.isEndingChar;
    }

    private static class Node {
        public char data;
        public boolean isEndingChar = false;
        public Node[] children = new Node[26];

        public Node(char r) {
            this.data = r;
        }
    }
}
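
The class above takes char[] arguments and exposes find, while the problem statement asks for insert(String), search(String) and startsWith(String). The following is a minimal sketch of that interface, assuming lowercase a-z input as stated in the constraints. The names StringTrie, Node and walk are illustrative and not part of the original code.

```java
class StringTrie {
    // Hypothetical node type for this sketch.
    private static final class Node {
        boolean isEnd;                       // true if an inserted word ends here
        final Node[] children = new Node[26];
    }

    private final Node root = new Node();

    // Inserts word, creating nodes along the path as needed.
    public void insert(String word) {
        Node p = root;
        for (int i = 0; i < word.length(); i++) {
            int index = word.charAt(i) - 'a';
            if (p.children[index] == null) {
                p.children[index] = new Node();
            }
            p = p.children[index];
        }
        p.isEnd = true;
    }

    // True only if word itself was inserted.
    public boolean search(String word) {
        Node p = walk(word);
        return p != null && p.isEnd;
    }

    // True if any inserted word begins with prefix.
    public boolean startsWith(String prefix) {
        return walk(prefix) != null;
    }

    // Follows s from the root; returns the node reached, or null if the path breaks.
    private Node walk(String s) {
        Node p = root;
        for (int i = 0; i < s.length(); i++) {
            p = p.children[s.charAt(i) - 'a'];
            if (p == null) {
                return null;
            }
        }
        return p;
    }
}
```

search and startsWith differ only in whether the node reached must be marked as the end of a word, so both can share the same walk helper.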

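Interview Question 17.17. Multi Search

Given a longer string big and an array smalls of shorter strings, design a method that searches big for each string in smalls. Output positions, the positions at which the strings in smalls occur in big, where positions[i] lists every position at which smalls[i] occurs.

Example:

Input:

big = "mississippi"

smalls = ["is","ppi","hi","sis","i","ssippi"]

Output: [[1,4],[8],[],[3],[1,4,7,10],[5]]

Constraints:

0 <= len(big) <= 1000
0 <= len(smalls[i]) <= 1000
The total number of characters in smalls does not exceed 100000.
You may assume that smalls contains no duplicate strings.
All characters are lowercase English letters.

Solution idea: Trie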

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Solution {

    class TrieNode {
        String end; // the word that ends at this node, or null if none does
        TrieNode[] next = new TrieNode[26];
    }

    class Trie {
        TrieNode root;

        // Builds the trie from all words up front.
        public Trie(String[] words) {
            root = new TrieNode();
            for (String word : words) {
                TrieNode node = root;
                for (char r : word.toCharArray()) {
                    int i = r - 'a';
                    if (node.next[i] == null) {
                        node.next[i] = new TrieNode();
                    }
                    node = node.next[i];
                }
                node.end = word;
            }
        }

        // Returns every word in the trie that is a prefix of str.
        public List<String> search(String str) {
            TrieNode node = root;
            List<String> res = new ArrayList<>();
            for (char c : str.toCharArray()) {
                int i = c - 'a';
                if (node.next[i] == null) {
                    break;
                }
                node = node.next[i];
                if (node.end != null) {
                    res.add(node.end);
                }
            }
            return res;
        }
    }

    public int[][] multiSearch(String big, String[] smalls) {
        Trie trie = new Trie(smalls);
        Map<String, List<Integer>> hit = new HashMap<>();
        // A word that is a prefix of the suffix big[i..] occurs in big at position i.
        for (int i = 0; i < big.length(); i++) {
            List<String> matches = trie.search(big.substring(i));
            for (String word : matches) {
                hit.computeIfAbsent(word, k -> new ArrayList<>()).add(i);
            }
        }
        // Convert the hit lists to arrays, in the order of smalls.
        int[][] res = new int[smalls.length][];
        for (int i = 0; i < smalls.length; i++) {
            List<Integer> list = hit.get(smalls[i]);
            if (list == null) {
                res[i] = new int[0];
                continue;
            }
            int size = list.size();
            res[i] = new int[size];
            for (int j = 0; j < size; j++) {
                res[i][j] = list.get(j);
            }
        }
        return res;
    }
}
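
The solution above copies a suffix of big with substring for every start position and maps matched strings back to their slot through a HashMap. One possible variant, sketched here under the same lowercase a-z assumption, walks big in place and stores the index of each small string in its terminal node instead. The names MultiSearchSketch, Node and wordIndex are hypothetical and do not appear in the original solution.

```java
import java.util.ArrayList;
import java.util.List;

class MultiSearchSketch {
    // Hypothetical node type: wordIndex is the position of the word in smalls,
    // or -1 if no word ends at this node.
    private static final class Node {
        int wordIndex = -1;
        final Node[] next = new Node[26];
    }

    static int[][] multiSearch(String big, String[] smalls) {
        // Build the trie from smalls, remembering each word's index.
        Node root = new Node();
        for (int k = 0; k < smalls.length; k++) {
            Node node = root;
            for (int j = 0; j < smalls[k].length(); j++) {
                int c = smalls[k].charAt(j) - 'a';
                if (node.next[c] == null) {
                    node.next[c] = new Node();
                }
                node = node.next[c];
            }
            node.wordIndex = k;
        }

        List<List<Integer>> hits = new ArrayList<>();
        for (int k = 0; k < smalls.length; k++) {
            hits.add(new ArrayList<>());
        }

        // For each start i, walk the trie along big[i..] without copying it.
        for (int i = 0; i < big.length(); i++) {
            Node node = root;
            for (int j = i; j < big.length(); j++) {
                node = node.next[big.charAt(j) - 'a'];
                if (node == null) {
                    break;
                }
                if (node.wordIndex >= 0) {
                    hits.get(node.wordIndex).add(i);
                }
            }
        }

        int[][] res = new int[smalls.length][];
        for (int k = 0; k < smalls.length; k++) {
            res[k] = hits.get(k).stream().mapToInt(Integer::intValue).toArray();
        }
        return res;
    }
}
```

As in the original solution, an empty string in smalls produces an empty position list, because a match is only recorded after at least one character has been consumed.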

