Repeated DNA Sequences 解答
Question
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
Solution -- Bit Manipulation
Original idea is to use a set to store each substring. Time complexity is O(n) and space cost is O(n). But for details of space cost, a char is 2 bytes, so we need 20 bytes to store a substring and therefore (20n) space.
If we represent DNA substring by integer, the space is cut down to (4n).
public List<String> findRepeatedDnaSequences(String s) {
List<String> result = new ArrayList<String>();
int len = s.length();
if (len < 10) {
return result;
}
Map<Character, Integer> map = new HashMap<Character, Integer>();
map.put('A', 0);
map.put('C', 1);
map.put('G', 2);
map.put('T', 3);
Set<Integer> temp = new HashSet<Integer>();
Set<Integer> added = new HashSet<Integer>();
int hash = 0;
for (int i = 0; i < len; i++) {
if (i < 9) {
//each ACGT fit 2 bits, so left shift 2
hash = (hash << 2) + map.get(s.charAt(i));
} else {
hash = (hash << 2) + map.get(s.charAt(i));
//make length of hash to be 20
hash = hash & (1 << 20) - 1;
if (temp.contains(hash) && !added.contains(hash)) {
result.add(s.substring(i - 9, i + 1));
added.add(hash); //track added
} else {
temp.add(hash);
}
}
}
return result;
}
Repeated DNA Sequences 解答的更多相关文章
- lc面试准备:Repeated DNA Sequences
1 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...
- LeetCode 187. 重复的DNA序列(Repeated DNA Sequences)
187. 重复的DNA序列 187. Repeated DNA Sequences 题目描述 All DNA is composed of a series of nucleotides abbrev ...
- [LeetCode] Repeated DNA Sequences 求重复的DNA序列
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- [Leetcode] Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- leetcode 187. Repeated DNA Sequences 求重复的DNA串 ---------- java
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- 【leetcode】Repeated DNA Sequences(middle)★
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- LeetCode() Repeated DNA Sequences 看的非常的过瘾!
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
- Java for LeetCode 187 Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...
随机推荐
- 3Sum Smaller 解答
Question Given an array of n integers nums and a target, find the number of index triplets i, j, k w ...
- UVa12096.The SetStack Computer
题目链接:http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&page=show_problem& ...
- pyqt listview基础学习01
from decimal import * from PyQt4.QtGui import * from PyQt4.Qt import * from PyQt4.QtCore import * im ...
- pyqt说明
我是个PHP程序员,不过有时候觉得需要写些小软件,对于我这种不太熟悉桌面软件开发的人来说,界面问题最让我头痛.听说Qt很强大,而且是跨平台,所以决定学习它用来弥补我写桌面软件的不足. Qt一般是通过C ...
- DreamWeaver文件保存时,提示"发生共享违例"问题的解决方法
在学习牛腩老师的JS视频中,视频中的例子要求实现一个是23个3相乘的结果,在用Dreamweaver制作时,, <script language="javascript" t ...
- [转]Laravel 4之控制器
Laravel 4之控制器 http://dingjiannan.com/2013/laravel-controller/ 控制器 通常Laravel控制器文件放在app/controllers/目录 ...
- [Ember] Wraming up
1.1: <!DOCTYPE html> <html> <head> <base href='http://courseware.codeschool.com ...
- The account is locked
SQL> select * from v$version where rownum=1; BANNER --------------------------------------------- ...
- linux在文件打包和压缩
1. 打包和压缩文件 linux现在经常使用gzip和bzip2要压缩的文件.tar压缩文件. 经常使用的扩展: *.gz gzip压缩文件 *.bz2 bzip2压缩的文件 *.tar t ...
- 传输中文乱码js解决方法
encodeURI要编码两次 var a="我的"; //编译两次 //window.location.href = "http://127.0.0.1:8080/kab ...