Repeated DNA Sequences 解答
Question
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
Solution -- Bit Manipulation
Original idea is to use a set to store each substring. Time complexity is O(n) and space cost is O(n). But for details of space cost, a char is 2 bytes, so we need 20 bytes to store a substring and therefore (20n) space.
If we represent DNA substring by integer, the space is cut down to (4n).
public List<String> findRepeatedDnaSequences(String s) {
List<String> result = new ArrayList<String>();
int len = s.length();
if (len < 10) {
return result;
}
Map<Character, Integer> map = new HashMap<Character, Integer>();
map.put('A', 0);
map.put('C', 1);
map.put('G', 2);
map.put('T', 3);
Set<Integer> temp = new HashSet<Integer>();
Set<Integer> added = new HashSet<Integer>();
int hash = 0;
for (int i = 0; i < len; i++) {
if (i < 9) {
//each ACGT fit 2 bits, so left shift 2
hash = (hash << 2) + map.get(s.charAt(i));
} else {
hash = (hash << 2) + map.get(s.charAt(i));
//make length of hash to be 20
hash = hash & (1 << 20) - 1;
if (temp.contains(hash) && !added.contains(hash)) {
result.add(s.substring(i - 9, i + 1));
added.add(hash); //track added
} else {
temp.add(hash);
}
}
}
return result;
}
Repeated DNA Sequences 解答的更多相关文章
- lc面试准备:Repeated DNA Sequences
1 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...
- LeetCode 187. 重复的DNA序列(Repeated DNA Sequences)
187. 重复的DNA序列 187. Repeated DNA Sequences 题目描述 All DNA is composed of a series of nucleotides abbrev ...
- [LeetCode] Repeated DNA Sequences 求重复的DNA序列
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- [Leetcode] Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- leetcode 187. Repeated DNA Sequences 求重复的DNA串 ---------- java
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- 【leetcode】Repeated DNA Sequences(middle)★
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- LeetCode() Repeated DNA Sequences 看的非常的过瘾!
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- Java for LeetCode 187 Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
随机推荐
- Zigzag Iterator 解答
Question Given two 1d vectors, implement an iterator to return their elements alternately. For examp ...
- mysql的日志
是否启用了日志mysql>show variables like ‘log_bin’; 怎样知道当前的日志mysql> show master status; 看二进制日志文件用mysql ...
- 【转】第一个MiniGUI程序:模仿QQ界面
最近几天在学MiniGui,最好的学习方法就是实践,先写个练练笔.其实只是一个界面,不知道什么时候才能真正写个完整的程序.初次写GUI程序,感觉写得不好,还请高手来指教. //============ ...
- 【转】RTSP协议学习笔记
第一部分:RTSP协议 一. RTSP协议概述 RTSP(Real-Time Stream Protocol )是一种基于文本的应用层协议,在语法及一些消息参数等方面,RTSP协议与HTTP协议类似. ...
- MVC 简单数据传递
Mode: namespace MVCDemo.Models { public class Data { //申明为静态 归类所有,取数据不要实例化 ; public static string st ...
- 用window.print()打印指定div里面的内容
用window.print()打印指定div里面的内容 今天客户让添加个打印证照功能,直接用window.print()打印的是整个页面,而用以下方法就可以只打印证明了 <!--window.p ...
- Lua内存泄漏应对方法[转]
转自http://blog.csdn.net/xocoder/article/details/42685685 由于目前正在负责的项目是一个二次开发项目,而且留给我们的代码质量实在让人无力吐槽,所以遇 ...
- [Immutable.js] Exploring Sequences and Range() in Immutable.js
Understanding Immutable.js's Map() and List() structures will likely take you as far as you want to ...
- ORACLE 视图的 with check option
ORACLE 视图的 with check option 我们来看下面的例子: create or replace view testview as select empno,ename from e ...
- NET基础课--对象的筛选和排序(NET之美)
1.数据量不大的时候取出数据缓存于服务器,然后排序,筛选等基于缓存进行以提高效率. 排序或筛选的方法是使用集合类型提供的,如List<T>.sort() List<T>.Fi ...