All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].
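[Analysis]
A HashMap keyed on the 10-letter substrings themselves exceeds the memory limit.
Because there are only four letters, we can build our own hash key: each incoming character maps to two bits. Once the key grows past 20 bits, i.e. 10 characters, keep only the low 20 bits.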
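The rolling 20-bit key described above can be sketched as follows. This is a minimal illustration, not necessarily the original solution: the class name `RollingDnaKey`, the helper `code`, and the set names `seen` and `reported` are assumptions introduced here. A 20-bit key has at most 2^20 = 1,048,576 distinct values, so the sets hold small integers rather than 10-character strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; names are illustrative, not from the original post.
class RollingDnaKey {
    // Map a nucleotide to its 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeated(String s) {
        List<String> res = new ArrayList<>();
        // Fewer than 11 characters cannot hold two 10-letter windows.
        if (s == null || s.length() < 11) return res;

        final int mask = (1 << 20) - 1;          // keeps the low 20 bits (10 characters)
        Set<Integer> seen = new HashSet<>();     // keys of all windows seen so far
        Set<Integer> reported = new HashSet<>(); // keys already added to the result
        int key = 0;

        for (int i = 0; i < s.length(); i++) {
            // Shift in the next character's two bits, then drop anything above bit 19.
            key = ((key << 2) + code(s.charAt(i))) & mask;
            if (i < 9) continue;                 // the window is not yet 10 characters long
            // add() returns false when the key was already present.
            if (!seen.add(key) && reported.add(key)) {
                res.add(s.substring(i - 9, i + 1));
            }
        }
        return res;
    }
}
```

Calling `RollingDnaKey.findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")` returns `[AAAAACCCCC, CCCCCAAAAA]`, matching the example in the problem statement.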
[Note]
1. (hash<<2) + map.get(c): mind the operator precedence. In Java, + binds tighter than <<, so the shift must be parenthesized.
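To make the precedence point concrete, here is a small sketch; the class `ShiftPrecedence` and its method names are hypothetical and exist only for illustration. Without parentheses, Java parses `hash << 2 + code` as `hash << (2 + code)`. Using `|` instead of `+` sidesteps the problem, because `<<` binds tighter than `|` and the two low bits are zero after the shift.

```java
// Hypothetical demo class; not part of the original solution.
class ShiftPrecedence {
    // Intended behaviour: shift first, then add the 2-bit code.
    static int withParens(int hash, int code) {
        return (hash << 2) + code;
    }

    // Bug: + is evaluated first, so this is hash << (2 + code).
    static int withoutParens(int hash, int code) {
        return hash << 2 + code;
    }

    // Alternative: << binds tighter than |, so no parentheses are needed.
    static int withOr(int hash, int code) {
        return hash << 2 | code;
    }
}
```

With `hash = 1` and `code = 3`, `withParens` and `withOr` both give 7, while `withoutParens` gives 32.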
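import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        // Fewer than 11 characters cannot hold two 10-letter windows.
        if (s == null || s.length() < 11) return res;
        int hash = 0;
        // Two-bit code for each nucleotide.
        Map<Character, Integer> map = new HashMap<>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);
        Set<Integer> set = new HashSet<>();
        Set<Integer> unique = new HashSet<>();
        for(int i=0; i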