原题地址:

https://oj.leetcode.com/problems/repeated-dna-sequences/

题目内容:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

方法:

大概的方向是,遍历所有长度为10的子串,用一个hash表记录每个不同子串的出现次数,最后输出满足条件的子串。

关键问题是:如何更快

我的办法:将A、C、G、T映射到1、2、3、4,然后换算成大整数。为了方便计算,字符串的最左边是最低位。这么说有些语焉不详,举几个例子:

AACG = 4311

CGTA = 1432

然后计算出首个字符串的整数值并加入map,这样,每下一个子串都可以通过 整数值/10 + 下一个字符乘以十亿来得到。

这样,hash值的计算从字符串变成了整数,同时,获得下一个字符串的行为也可以在更快的常数次时间内完成,因为操作字符串的时间开支。

全部代码:

class Solution {
public:
vector<string> findRepeatedDnaSequences(string s) {
unordered_map<long long,int> dict;
unordered_map<long long,int> :: iterator it;
vector<string> res;
long long flag = 1000000000;
if (s.size() <= 10)
return res;
long long num = generateFirstNum(s);
dict[num] = 1;
for (int i = 10; i < s.size(); i ++) {
num /= 10;
long long now = getCharNum(s[i]);
num += now * flag;
it = dict.find(num);
if (it == dict.end()) {
dict[num] = 1;
} else {
dict[num] += 1;
}
}
for (it = dict.begin(); it != dict.end(); it ++) {
if (it->second > 1) {
generateRes(res,it->first);
}
}
return res;
} long long generateFirstNum(string s) {
long long res = 0;
long long power = 1;
for (int i = 0; i < 10; i ++) {
long long num = getCharNum(s[i]);
res += num * power;
power *= 10;
}
return res;
} long long getCharNum(char s) {
switch (s) {
case 'A' : return 1;
case 'C' : return 2;
case 'G' : return 3;
case 'T' : return 4;
}
} char getNumChar(long long s) {
switch (s) {
case 1 : return 'A';
case 2 : return 'C';
case 3 : return 'G';
case 4 : return 'T';
}
} void generateRes(vector<string> &res,long long target) {
string s;
while (target > 0) {
char now = getNumChar(target % 10);
s = s + now;
target /= 10;
}
res.push_back(s);
}
};

  

【原创】leetCodeOj --- Repeated DNA Sequences 解题报告的更多相关文章

  1. 【LeetCode】187. Repeated DNA Sequences 解题报告(Python)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 题目地址: https://leetcode.com/problems/repeated ...

  2. 【LeetCode】Repeated DNA Sequences 解题报告

    [题目] All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...

  3. [LeetCode] 187. Repeated DNA Sequences 解题思路

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  4. lc面试准备:Repeated DNA Sequences

    1 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...

  5. LeetCode 187. 重复的DNA序列(Repeated DNA Sequences)

    187. 重复的DNA序列 187. Repeated DNA Sequences 题目描述 All DNA is composed of a series of nucleotides abbrev ...

  6. 【原创】leetCodeOj --- Sliding Window Maximum 解题报告

    天,这题我已经没有底气高呼“水”了... 题目的地址: https://leetcode.com/problems/sliding-window-maximum/ 题目内容: Given an arr ...

  7. [Swift]LeetCode187. 重复的DNA序列 | Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  8. 【原创】leetCodeOj --- Word Ladder II 解题报告 (迄今为止最痛苦的一道题)

    原题地址: https://oj.leetcode.com/submissions/detail/19446353/ 题目内容: Given two words (start and end), an ...

  9. 【原创】leetCodeOj --- Factorial Trailing Zeroes 解题报告

    原题地址: https://oj.leetcode.com/problems/factorial-trailing-zeroes/ 题目内容: Given an integer n, return t ...

随机推荐

  1. 大白菜U盘启动制作工具装机维护版V5.0–大白菜U盘下载中心

    大白菜U盘启动制作工具装机维护版V5.0–大白菜U盘下载中心   大白菜U盘启动制作工具装机维护版V5.0

  2. HDU 1394 Minimum Inversion Number (线段树 单点更新 求逆序数)

    题目链接:http://acm.hdu.edu.cn/showproblem.php?pid=1394 题意:给你一个n个数的序列,当中组成的数仅仅有0-n,我们能够进行这么一种操作:把第一个数移到最 ...

  3. Android获取设备採用的时间制式(12小时制式或24小时制式)

    /** * 获取设备採用的时间制式(12小时制式或者24小时制式) * 注意: * 在模拟器上获取的时间制式为空 */ private void getTime_12_24(Context conte ...

  4. cocos2d 游戏开发:Cocos2d v3 &quot;hello world&quot;+显示飞船

    V3 RC4 版本号图片 显示一个飞船 将Chapter1中 SpaceCargoShip.png 文件 加入到项目里面. 代码在 init : CCSprite *spaceCargoShip = ...

  5. Theano+Keras+CUDA7.5+VS2013+Windows10x64配置

    Visual Studio 2013 正常安装,这里只要C++打勾就可以. ANACONDA ANACONDA是封装了Python的科学计算工具,装这个就可以不用额外装Python了.在安装之前建议先 ...

  6. Qt中提高sqlite的读写速度(使用事务一次性写入100万条数据)

    SQLite数据库本质上来讲就是一个磁盘上的文件,所以一切的数据库操作其实都会转化为对文件的操作,而频繁的文件操作将会是一个很好时的过程,会极大地影响数据库存取的速度.例如:向数据库中插入100万条数 ...

  7. rzsz不能大于4G,securefx传5.2G没有问题,

    rzsz不能大于4G,securefx传5.2G没有问题, 查看系统限制: $ulimit -acore file size          (blocks, -c) 0data seg size  ...

  8. oschina图形和图像工具开源软件

    图形和图像工具开源软件 http://www.oschina.net/project/tag/181/imagetools?sort=view&lang=21&os=0

  9. Cocos2d-x-lua游戏两个场景互相切换MainScene01切换到MainScene02

    /* 场景一lua代码 */ require "MainScene02" local dic_size = CCDirector:sharedDirector():getWinSi ...

  10. 物联网操作系统HelloX开发人员入门指南

    HelloX开发人员入门指南 HelloX是聚焦于物联网领域的操作系统开发项目,能够通过百度搜索"HelloX".获取具体信息. 当前开发团队正在进一步招募中,欢迎您的了解和添加. ...