187. Repeated DNA Sequences
题目:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return:
["AAAAACCCCC", "CCCCCAAAAA"].
链接: http://leetcode.com/problems/repeated-dna-sequences/
题解:
求repeating molecule of DNA sequence。最直接的想法就是从头遍历,把substring加入到HashMap中,然后进行比较。这样的话因为substring()的复杂度是O(n),所以整个算法复杂度是O(n2)。看到有讨论用Rabin-Karp的Rolling Hash自己做Hash Function。这样做的好处我觉得可能是减少了substring()的次数,但总得来说时间复杂度也还是O(n2)。而且做了Rabin-Karp的话要不要要不要判断collision,collision以后用Monte-carlo还是Las Vegas检测结果,也是问题。要再研究一下。
Time Complexity - O(n2), Space Complexity - O(n)。
public class Solution {
public List<String> findRepeatedDnaSequences(String s) {
List<String> res = new ArrayList<String>();
if(s == null || s.length() < 10)
return res;
Map<String, Integer> map = new HashMap<>();
for(int i = 0; i < s.length() - 9; i++) {
String subStr = s.substring(i, i + 10);
if(map.containsKey(subStr)) {
if(map.get(subStr) == 1)
res.add(subStr);
map.put(subStr, map.get(subStr) + 1);
} else
map.put(subStr, 1);
}
return res;
}
}
二刷:
还是用的老方法,利用HashMap进行比较。
看到Stefan的用Set也能做,更像Python的风格,速度更快。
Rolling Hash的话,Time Complexity是O(n),但每次都要计算10个数的hash value,速度也不快。有机会的话联系一下好了。
Java:
HashMap:
Time Complexity - O(n2), Space Complexity - O(n)。
public class Solution {
public List<String> findRepeatedDnaSequences(String s) {
List<String> res = new ArrayList<>();
if (s == null) return res;
Map<String, Integer> map = new HashMap<>();
for (int i = 0; i + 10 <= s.length(); i++) {
String str = s.substring(i, i + 10);
if (!map.containsKey(str)) {
map.put(str, 1);
} else {
if (map.get(str) == 1) res.add(str);
map.put(str, 2);
}
}
return res;
}
}
Reference:
https://leetcode.com/discuss/24595/short-java-rolling-hash-solution
https://leetcode.com/discuss/24557/just-7-lines-of-code
https://leetcode.com/discuss/24478/i-did-it-in-10-lines-of-c
https://leetcode.com/discuss/25399/clean-java-solution-hashmap-bits-manipulation
https://leetcode.com/discuss/25536/am-understanding-the-problem-wrongly-what-about-aaaaccccca
https://leetcode.com/discuss/29623/11ms-solution-with-unified-hash-fxn
https://leetcode.com/discuss/54777/easy-to-understand-java-solution-with-well-commented-code
https://leetcode.com/discuss/46948/accepted-java-easy-to-understand-solution
https://leetcode.com/discuss/64841/7-lines-simple-java-o-n
187. Repeated DNA Sequences的更多相关文章
- [LeetCode] 187. Repeated DNA Sequences 求重复的DNA序列
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- leetcode 187. Repeated DNA Sequences 求重复的DNA串 ---------- java
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- Java for LeetCode 187 Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- [LeetCode#187]Repeated DNA Sequences
Problem: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: ...
- [LeetCode] 187. Repeated DNA Sequences 解题思路
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- 【LeetCode】187. Repeated DNA Sequences
题目: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: " ...
- 187. Repeated DNA Sequences重复的DNA子串序列
[抄题]: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &qu ...
- 187. Repeated DNA Sequences (String; Bit)
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- 187. Repeated DNA Sequences(建立词典,遍历一遍 o(n))
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
随机推荐
- 输入整数n(n<=10000),表示接下来将会输入n个实数,将这n个实数存入数组a中。请定义一个数组拷贝函数将数组a中的n个数拷贝到数组b中。
代码一大串! #include<stdio.h> ],y[]; void arraycopy (double c[],double d[],int m); { ;i<=m;i++) ...
- Ubuntu、Sql Server卸载心得
这几天真是搞得亏大了! 首先是卸载Ubuntu,直接在Windows下格式化那个盘了,这就出岔子了……然后越来越糟糕,最后弄得一个系统都没有了……然后重装系统…… 然后装VS和Sql Server,因 ...
- 真正的inotify+rsync实时同步 彻底告别同步慢
真正的inotify+rsync实时同步 彻底告别同步慢 http://www.ttlsa.com/web/let-infotify-rsync-fast/ 背景 我们公司在用in ...
- 【转】DataGridView显示行号
ref:http://blog.csdn.net/xieyufei/article/details/9769631 方法一: 网上最常见的做法是用DataGridView的RowPostPaint事件 ...
- OWIN and Katana - 1
翻译自 http://www.asp.net/aspnet/overview/owin-and-katana/an-overview-of-project-katana 十多年来,基于ASP.NET框 ...
- JS禁用和启用鼠标滚轮滚动事件
// left: 37, up: 38, right: 39, down: 40, // spacebar: 32, pageup: 33, pagedown: 34, end: 35, home: ...
- Kakfa揭秘 Day4 Kafka中分区深度解析
Kakfa揭秘 Day4 Kafka中分区深度解析 今天主要谈Kafka中的分区数和consumer中的并行度.从使用Kafka的角度说,这些都是至关重要的. 分区原则 Partition代表一个to ...
- c++空类的大小
初学者在学习面向对象的程序设计语言时,或多或少的都些疑问,我们写的代码与最终生编译成的代码却大相径庭,我们并不知道编译器在后台做了什么工作.这些都是由于我们仅停留在语言层的原因,所谓语言层就是教会我们 ...
- Wpf 简单制作自己的窗体样式
最近一直在搞wpf相关的东东,由于还在门外徘徊,所以第一篇blog写了简单的制作扁平化的wpf button样式,这一篇也简单的制作属于自己wpf 窗体的样式. 废话少说,下面就开始制作自己的窗体样式 ...
- ubuntu安装kernel3.10.34
参考http://www.linuxidc.com/Linux/2014-03/98818.htm 32位系统安装 wget kernel.ubuntu.com/~kernel-ppa/mainlin ...