All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Solution: count the frequency of 10 letter words

class Solution {
//find all the 10-letter-long sequences that occur more than once in a DNA molecule
public List<String> findRepeatedDnaSequences(String s) {
//substring -- subset n +n-1+...+1: n-k+1
List<String> res = new ArrayList<String>();
Map<String,Integer> map = new HashMap<String,Integer>();
int n = s.length();
int k =10;
if(n < k) return res;
for(int i = 0; i<=n-k; i++){//11-10 1
String sub = s.substring(i, i+k);
if(map.containsKey(sub)){
map.put(sub, map.get(sub)+1);
}else {
map.put(sub, 1);
}
}
for(Map.Entry<String, Integer> entry : map.entrySet()){
if(entry.getValue() >1){
res.add(entry.getKey());
}
}
return res;
}
}

Solution 2: two HashSet with a non-duplicate feature.

public List<String> findRepeatedDnaSequences(String s) {
Set seen = new HashSet(), repeated = new HashSet();
for (int i = 0; i + 9 < s.length(); i++) {
String ten = s.substring(i, i + 10);
if (!seen.add(ten))//if add then first time, else add it
repeated.add(ten);
}
return new ArrayList(repeated);
}

subsequence & substring

subsequence: subset 2^n

substring: continous string : n+n-1+n-2+...+1

*187. Repeated DNA Sequences (hashmap, one for loop)(difference between subsequence & substring)的更多相关文章

  1. [LeetCode] 187. Repeated DNA Sequences 求重复的DNA序列

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  2. Java for LeetCode 187 Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  3. 187. Repeated DNA Sequences

    题目: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: " ...

  4. [LeetCode#187]Repeated DNA Sequences

    Problem: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: ...

  5. 【LeetCode】187. Repeated DNA Sequences

    题目: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: " ...

  6. 187. Repeated DNA Sequences重复的DNA子串序列

    [抄题]: All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &qu ...

  7. leetcode 187. Repeated DNA Sequences 求重复的DNA串 ---------- java

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  8. [LeetCode] 187. Repeated DNA Sequences 解题思路

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  9. 187. Repeated DNA Sequences (String; Bit)

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

随机推荐

  1. CF E. Vasya and a Tree】 dfs+树状数组(给你一棵n个节点的树,每个点有一个权值,初始全为0,m次操作,每次三个数(v, d, x)表示只考虑以v为根的子树,将所有与v点距离小于等于d的点权值全部加上x,求所有操作完毕后,所有节点的值)

    题意: 给你一棵n个节点的树,每个点有一个权值,初始全为0,m次操作,每次三个数(v, d, x)表示只考虑以v为根的子树,将所有与v点距离小于等于d的点权值全部加上x,求所有操作完毕后,所有节点的值 ...

  2. jQuery 全屏滚动插件 fullPage.js 参数说明

    fullPage.js 是一个基于 jQuery 的插件,它能够很方便.很轻松的制作出全屏网站,主要功能有: 支持鼠标滚动 支持前进后退和键盘控制 多个回调函数 支持手机.平板触摸事件 支持 CSS3 ...

  3. 用sphinx-doc优雅的写文档

    Sphinx 是一个工具,它使得创建一个智能而美丽的文档变得简单.作者Georg Brandl,基于BSD许可证. 起初为写 Python 文档而诞生的 Sphinx,支持为各种语言生成软件开发文档. ...

  4. SElinux学习记录

    1.SELinux:是一种基于域类型模型的强制访问控制安全系统,由NSA编写设计成内核模块包含到内核中,相应的某些安全相关的应用也被打了SE Linux补丁 查看Selinux: ps -Z #查看S ...

  5. shell 终端字符颜色

    终端的字符颜色是用转义序列控制的,是文本模式下的系统显示功能,和具体的语言无关,shell,python,perl等均可以调用. 转义序列是以 ESC 开头,可以用 \033 完成相同的工作(ESC ...

  6. Java基础14-多维数组

    1.二位数组可以看成以数组为元素的数组 2.java中多维数组的声明和初始化一样,应该从高维到低维的顺序进行,例如 int[][] a=new int[3][]; a[0]=new int[2]; a ...

  7. [转]AngularJS+UI Router(1) 多步表单

    本文转自:https://www.zybuluo.com/dreamapplehappy/note/54448 多步表单的实现   在线demo演示地址https://rawgit.com/dream ...

  8. tinkphp3.2.3 关于事务处理。

    自己做一个测试,关于事务处理的. 在对多表进行操作的时候 基本上都离不开事务. 有的操作,是要由上一操作后,产的值(如主表里插入后,要获取插入的主键ID值,返回给下面处理表用.)带到后面的表处理当中去 ...

  9. usually study notebook

    2018/01/02 删除用户经验: 1,vi /etc/passwd ,然后注释掉用户,观察一个月,以便于还原,相当于备份. 2,把登入shell改成/sbin/nologin. 3,openlda ...

  10. 使用git将自己的代码同时保存在多个代码托管平台

    现在有很多代码管理平台,例如github,oschina-git,coding.net,我的网速有时候访问github比较慢.这时候我使用国内的.但是只使用一家我已不知道我的代码在他们的管理平台是否足 ...