POJ3450 Corporate Identity: Suffix Array, Longest Common Substring
Problem link: https://vjudge.net/problem/POJ-3450
Corporate Identity
Time Limit: 3000MS | Memory Limit: 65536K
Total Submissions: 8046 | Accepted: 2710
Description
Beside other services, ACM helps companies to clearly state their “corporate identity”, which includes company logo but also other signs, like trademarks. One of such companies is Internet Building Masters (IBM), which has recently asked ACM for a help with their new identity. IBM do not want to change their existing logos and trademarks completely, because their customers are used to the old ones. Therefore, ACM will only change existing trademarks instead of creating new ones.
After several other proposals, it was decided to take all existing trademarks and find the longest common sequence of letters that is contained in all of them. This sequence will be graphically emphasized to form a new logo. Then, the old trademarks may still be used while showing the new identity.
Your task is to find such a sequence.
Input
The input contains several tasks. Each task begins with a line containing a positive integer N, the number of trademarks (2 ≤ N ≤ 4000). The number is followed by N lines, each containing one trademark. Trademarks will be composed only from lowercase letters, the length of each trademark will be at least 1 and at most 200 characters.
After the last trademark, the next task begins. The last task is followed by a line containing zero.
Output
For each task, output a single line containing the longest string contained as a substring in all trademarks. If there are several strings of the same length, print the one that is lexicographically smallest. If there is no such non-empty string, output the words “IDENTITY LOST” instead.
Sample Input
3
aabbaabb
abbababb
bbbbbabb
2
xyz
abc
0
Sample Output
abb
IDENTITY LOST
Source
CTU Open 2007
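Problem summary:
Given n strings, find their longest common substring (a contiguous run of letters, not a subsequence). If several have the same length, output the lexicographically smallest one.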
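To pin down what is being asked, here is a brute-force reference, a minimal sketch that is only practical for small inputs. It is not the method used in this post, and the function name `bruteForceCommon` is made up for illustration. It tries every substring of the first trademark, from longest to shortest, and keeps the lexicographically smallest one that occurs in all the others.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the longest string
// that is a substring of every trademark, breaking ties by lexicographic order.
// Returns an empty string when no non-empty common substring exists.
std::string bruteForceCommon(const std::vector<std::string>& marks) {
    assert(!marks.empty());
    const std::string& base = marks[0];
    // Try lengths from longest to shortest; the first length with a hit wins.
    for (std::size_t len = base.size(); len >= 1; --len) {
        std::string best;
        bool found = false;
        for (std::size_t start = 0; start + len <= base.size(); ++start) {
            const std::string cand = base.substr(start, len);
            bool inAll = true;
            for (std::size_t i = 1; i < marks.size() && inAll; ++i)
                inAll = marks[i].find(cand) != std::string::npos;
            // Among candidates of this length, keep the smallest one.
            if (inAll && (!found || cand < best)) {
                best = cand;
                found = true;
            }
        }
        if (found) return best;
    }
    return "";
}
```

On the first sample both "abb" and "bba" occur in all three trademarks, and the tie-break selects "abb". A checker like this is mainly useful for comparing against a faster solution on random small cases.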
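Solution:
1. Concatenate the n strings into one new string, placing a separator between each adjacent pair. The separators must all be distinct, so that no common prefix of two suffixes can extend across a string boundary.
2. Build the suffix array of the new string, then binary search on the length mid of the common substring. For a given mid, the suffixes in rank order split into groups in which every adjacent pair has a longest common prefix of at least mid, so all suffixes in a group share their first mid characters. Within each group, count how many distinct source strings appear. If some group covers all n strings, mid is feasible; otherwise it is not. Feasibility is monotone in mid, so the binary search finds the maximum length.
3. The problem also asks for the lexicographically smallest answer. So whenever mid is feasible, record the start and end positions of the common substring. The groups are scanned in rank order, that is, in increasing lexicographic order of the suffixes sa[i], so for a fixed mid the first group that satisfies the condition gives the lexicographically smallest common substring.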
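The three steps above can be sketched on small inputs as follows. This is not the doubling implementation used in the post: it swaps in a naive suffix sort with `std::sort`, roughly O(L² log L), so that the grouping and binary search stay easy to read. The function name `commonBySuffixGroups` and the separator encoding `27 + i` are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of steps 1-3 with a naive suffix sort (small inputs only).
std::string commonBySuffixGroups(const std::vector<std::string>& marks) {
    const int n = static_cast<int>(marks.size());
    assert(n >= 2);  // the problem guarantees at least two trademarks

    // Step 1: concatenate; letters map to 1..26, string i ends with separator 27+i.
    std::vector<int> text, owner;
    std::size_t shortest = marks[0].size();
    for (int i = 0; i < n; ++i) {
        for (char ch : marks[i]) {
            text.push_back(ch - 'a' + 1);
            owner.push_back(i);
        }
        text.push_back(27 + i);
        owner.push_back(i);
        shortest = std::min(shortest, marks[i].size());
    }
    const int len = static_cast<int>(text.size());

    // Step 2a: naive suffix array, then height[i] = LCP(suffix sa[i-1], suffix sa[i]).
    std::vector<int> sa(len);
    for (int i = 0; i < len; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(text.begin() + a, text.end(),
                                            text.begin() + b, text.end());
    });
    std::vector<int> height(len, 0);
    for (int i = 1; i < len; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < len && b + k < len && text[a + k] == text[b + k]) ++k;
        height[i] = k;
    }

    // Step 2b: for length k, return the start of the first group (in rank order)
    // whose suffixes come from all n strings, or -1 if there is none.
    auto firstStart = [&](int k) -> int {
        std::vector<char> seen(n, 0);
        int cnt = 0;
        for (int i = 1; i < len; ++i) {
            if (height[i] < k) {  // group boundary: start counting afresh
                std::fill(seen.begin(), seen.end(), 0);
                cnt = 0;
                continue;
            }
            if (!seen[owner[sa[i - 1]]]) { seen[owner[sa[i - 1]]] = 1; ++cnt; }
            if (!seen[owner[sa[i]]]) { seen[owner[sa[i]]] = 1; ++cnt; }
            if (cnt == n) return sa[i];
        }
        return -1;
    };

    // Step 3: binary search on the length; the first feasible group is the smallest.
    int lo = 1, hi = static_cast<int>(shortest), bestStart = -1, bestLen = 0;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        int s = firstStart(mid);
        if (s >= 0) { bestStart = s; bestLen = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    std::string out;
    for (int i = 0; i < bestLen; ++i)
        out += static_cast<char>('a' + text[bestStart + i] - 1);
    return out;  // empty means "IDENTITY LOST"
}
```

Since each separator occurs exactly once in the text, two different suffixes can never share a prefix that contains one, which is why a group with LCP at least mid always corresponds to a substring lying entirely inside single trademarks.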
The code is as follows:
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

const int MAXN = 1e6+100;   // enough for 4000 * (200 + 1) symbols plus the sentinel

int id[MAXN];               // id[p]: index of the string that position p belongs to
int r[MAXN], sa[MAXN], Rank[MAXN], height[MAXN];
int t1[MAXN], t2[MAXN], c[MAXN];

bool cmp(int *r, int a, int b, int l)
{
    return r[a]==r[b] && r[a+l]==r[b+l];
}

// Doubling algorithm. str[0..n-1] is the text and str[n] must be 0, a sentinel
// smaller than every other symbol; all symbols must be < m.
// Afterwards sa[1..n] holds the real suffixes in rank order (sa[0] is the sentinel)
// and height[i] is the LCP of suffixes sa[i-1] and sa[i].
void DA(int str[], int sa[], int Rank[], int height[], int n, int m)
{
    n++;
    int i, j, p, *x = t1, *y = t2;
    // Counting sort by the first symbol.
    for(i = 0; i<m; i++) c[i] = 0;
    for(i = 0; i<n; i++) c[x[i] = str[i]]++;
    for(i = 1; i<m; i++) c[i] += c[i-1];
    for(i = n-1; i>=0; i--) sa[--c[x[i]]] = i;
    for(j = 1; j<=n; j <<= 1)
    {
        // Order by the second key (the rank of the suffix j positions later).
        p = 0;
        for(i = n-j; i<n; i++) y[p++] = i;
        for(i = 0; i<n; i++) if(sa[i]>=j) y[p++] = sa[i]-j;

        // Stable counting sort by the first key.
        for(i = 0; i<m; i++) c[i] = 0;
        for(i = 0; i<n; i++) c[x[y[i]]]++;
        for(i = 1; i<m; i++) c[i] += c[i-1];
        for(i = n-1; i>=0; i--) sa[--c[x[y[i]]]] = y[i];

        // Recompute ranks for prefixes of length 2j.
        swap(x, y);
        p = 1; x[sa[0]] = 0;
        for(i = 1; i<n; i++)
            x[sa[i]] = cmp(y, sa[i-1], sa[i], j)?p-1:p++;

        if(p>=n) break;
        m = p;
    }

    // Build the height array in O(n).
    int k = 0;
    n--;
    for(i = 0; i<=n; i++) Rank[sa[i]] = i;
    for(i = 0; i<n; i++)
    {
        if(k) k--;
        j = sa[Rank[i]-1];
        while(str[i+k]==str[j+k]) k++;
        height[Rank[i]] = k;
    }
}

// vis[s]==stamp means string s has been seen in the current group. Bumping stamp
// starts a new group in O(1); clearing vis with memset at every group boundary
// could cost about 4040 operations per suffix in the worst case.
int vis[4040], stamp;
int Le, Ri;

// Is there a substring of length k that occurs in all n strings?
// On success, [Le, Ri] is the position of the first such substring in rank order.
bool test(int n, int len, int k)
{
    int cnt = 0;
    stamp++;
    for(int i = 2; i<=len; i++)
    {
        if(height[i]<k)
        {
            cnt = 0;
            stamp++;
        }
        else
        {
            if(vis[id[sa[i-1]]]!=stamp) vis[id[sa[i-1]]] = stamp, cnt++;
            if(vis[id[sa[i]]]!=stamp) vis[id[sa[i]]] = stamp, cnt++;
            if(cnt==n)
            {
                Le = sa[i]; Ri = sa[i]+k-1;
                return true;
            }
        }
    }
    return false;
}

char str[256];   // each trademark has at most 200 characters
int main()
{
    int n;
    // Check the scanf result so that end of input also ends the loop.
    while(scanf("%d", &n)==1 && n)
    {
        int len = 0;
        for(int i = 0; i<n; i++)
        {
            scanf("%s", str);
            int LEN = strlen(str);
            for(int j = 0; j<LEN; j++)
            {
                r[len] = str[j]-'a'+1;
                id[len++] = i;
            }
            r[len] = 30+i;  // separators must all be distinct
            id[len++] = i;
        }
        r[len] = 0;
        DA(r,sa,Rank,height,len,30+n);

        memset(vis, 0, sizeof(vis));
        stamp = 0;

        // The answer cannot be longer than any single string, so the length of
        // the last string read is a valid upper bound.
        int L = 0, R = strlen(str);
        while(L<=R)
        {
            int mid = (L+R)>>1;
            if(test(n,len,mid))
                L = mid + 1;
            else
                R = mid - 1;
        }

        // R is the largest feasible length; Le and Ri were set by the last
        // successful test, which is the one for mid == R.
        if(R==0) puts("IDENTITY LOST");
        else
        {
            for(int i = Le; i<=Ri; i++)
                printf("%c", r[i]+'a'-1);
            putchar('\n');
        }
    }
}
Reposted from: https://www.cnblogs.com/DOLFAMINGO/p/8480366.html