All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Solution 1: count how often each 10-letter substring occurs with a HashMap, then return the ones seen more than once.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Solution {
    // Find all 10-letter-long sequences that occur more than once in a DNA molecule.
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<String>();
        Map<String, Integer> map = new HashMap<String, Integer>();
        int n = s.length();
        int k = 10;
        if (n < k) return res;
        // A string of length n has n - k + 1 substrings of length k.
        for (int i = 0; i <= n - k; i++) { // e.g. n = 11, k = 10: i is 0, then 1
            String sub = s.substring(i, i + k);
            if (map.containsKey(sub)) {
                map.put(sub, map.get(sub) + 1);
            } else {
                map.put(sub, 1);
            }
        }
        // Collect every substring that was counted more than once.
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            if (entry.getValue() > 1) {
                res.add(entry.getKey());
            }
        }
        return res;
    }
}

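Solution 2: use two HashSets. A set never stores duplicates, and add() returns false when the element is already present, so a failed add marks a repeat.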

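// Goes inside class Solution; also needs java.util.HashSet and java.util.Set.
public List<String> findRepeatedDnaSequences(String s) {
    Set<String> seen = new HashSet<String>();
    Set<String> repeated = new HashSet<String>();
    for (int i = 0; i + 9 < s.length(); i++) {
        String ten = s.substring(i, i + 10);
        // add() returns true the first time a substring is seen;
        // false means it was seen before, so record it as repeated.
        if (!seen.add(ten)) {
            repeated.add(ten);
        }
    }
    return new ArrayList<String>(repeated);
}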
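
As a variation on Solution 2 (not part of the original solutions), each nucleotide can be packed into 2 bits so that a 10-letter window fits in 20 bits of an int, and the "seen" set stores integers instead of strings. The sketch below assumes the input contains only A, C, G, and T; the class name RepeatedDnaBits and the helper code() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RepeatedDnaBits {
    // Hypothetical helper: map each nucleotide to a 2-bit code.
    private static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // 'T' (assumes the input has only A, C, G, T)
        }
    }

    public static List<String> findRepeatedDnaSequences(String s) {
        final int k = 10;
        final int mask = (1 << (2 * k)) - 1; // keep only the lowest 20 bits
        Set<Integer> seen = new HashSet<>();
        Set<String> repeated = new HashSet<>();
        int window = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new nucleotide and drop the one that left the window.
            window = ((window << 2) | code(s.charAt(i))) & mask;
            // Once a full window is available, a failed add means a repeat.
            if (i >= k - 1 && !seen.add(window)) {
                repeated.add(s.substring(i - k + 1, i + 1));
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Order of the printed list is not guaranteed, since it comes from a HashSet.
        System.out.println(findRepeatedDnaSequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

The loop still makes a single pass; the difference is that a new String is only created for windows that actually repeat.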

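Subsequence vs. substring

Subsequence: any subset of the characters, kept in their original order. A string of length n has 2^n of them (counting by position and including the empty one).

Substring: a contiguous run of characters. A string of length n has n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 non-empty substrings, of which n-k+1 have length k.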
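
To make the two counts concrete, here is a small sketch that enumerates both for a short string. The class and method names are made up for illustration, and the lists count by position, so equal strings taken from different positions appear more than once.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringVsSubsequence {
    // All non-empty substrings: choose a start i and an end j > i.
    static List<String> substrings(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            for (int j = i + 1; j <= s.length(); j++) {
                out.add(s.substring(i, j));
            }
        }
        return out;
    }

    // All subsequences (including the empty one): each bit of mask
    // decides whether the character at that index is kept.
    // Only suitable for short strings, since there are 2^n of them.
    static List<String> subsequences(String s) {
        List<String> out = new ArrayList<>();
        int n = s.length();
        for (int mask = 0; mask < (1 << n); mask++) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    sb.append(s.charAt(i));
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "ACG";
        System.out.println(substrings(s).size());   // 3 * 4 / 2 = 6
        System.out.println(subsequences(s).size()); // 2^3 = 8
    }
}
```

For "ACG", the string "AG" shows up among the subsequences but not among the substrings, because A and G are not adjacent.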
