
thinking-in-java (13): Strings

Published: 2023/12/3 | Programming Q&A | 43 豆豆


【13.1】Immutable Strings

1) String objects are immutable; they are read-only.

【Example: String objects are immutable】

public class Immutable {
    public static String upcase(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        String q = "howdy";
        System.out.println(q);  // howdy
        String qq = upcase(q);
        System.out.println(qq); // HOWDY
        System.out.println(q);  // howdy (the original String is unchanged)
    }
}
/* Output:
howdy
HOWDY
howdy
*/

【Code notes】

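When the string q is passed to upcase(), what is actually passed is a copy of the reference q, received as the parameter s. The String object that the reference points to is neither copied nor modified.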
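To make the "copy of the reference" point concrete, here is a minimal sketch. The class name ReferenceCopyDemo and its two helper methods are hypothetical, introduced only for illustration.

```java
final class ReferenceCopyDemo {
    // Reassigning the parameter only rebinds the local copy of the reference;
    // the caller's variable still points at the original String.
    static void reassign(String s) {
        s = s + " world";
    }

    // String methods return a result instead of changing the receiver.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        String q = "howdy";
        reassign(q);
        System.out.println(q);        // still the original text
        System.out.println(shout(q)); // a different String object
        System.out.println(q);        // unchanged again
    }
}
```

Whatever a method does with its parameter, the caller's String stays the same; the only way to "change" it is to assign the returned value.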

// Stirng.toUpperCase() 源碼 public String toUpperCase(Locale locale) {if (locale == null) {throw new NullPointerException();}int firstLower;final int len = value.length;/* Now check if there are any characters that need to be changed. */scan: {for (firstLower = 0 ; firstLower < len; ) {int c = (int)value[firstLower];int srcCount;if ((c >= Character.MIN_HIGH_SURROGATE)&& (c <= Character.MAX_HIGH_SURROGATE)) {c = codePointAt(firstLower);srcCount = Character.charCount(c);} else {srcCount = 1;}int upperCaseChar = Character.toUpperCaseEx(c);if ((upperCaseChar == Character.ERROR)|| (c != upperCaseChar)) {break scan;}firstLower += srcCount;}return this;}/* result may grow, so i+resultOffset is the write location in result */int resultOffset = 0;char[] result = new char[len]; /* may grow *//* Just copy the first few upperCase characters. */System.arraycopy(value, 0, result, 0, firstLower);String lang = locale.getLanguage();boolean localeDependent =(lang == "tr" || lang == "az" || lang == "lt");char[] upperCharArray;int upperChar;int srcChar;int srcCount;for (int i = firstLower; i < len; i += srcCount) {srcChar = (int)value[i];if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&(char)srcChar <= Character.MAX_HIGH_SURROGATE) {srcChar = codePointAt(i);srcCount = Character.charCount(srcChar);} else {srcCount = 1;}if (localeDependent) {upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);} else {upperChar = Character.toUpperCaseEx(srcChar);}if ((upperChar == Character.ERROR)|| (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {if (upperChar == Character.ERROR) {if (localeDependent) {upperCharArray =ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);} else {upperCharArray = Character.toUpperCaseCharArray(srcChar);}} else if (srcCount == 2) {resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;continue;} else {upperCharArray = Character.toChars(upperChar);}/* Grow result if needed */int mapLen = upperCharArray.length;if 
(mapLen > srcCount) {char[] result2 = new char[result.length + mapLen - srcCount];System.arraycopy(result, 0, result2, 0, i + resultOffset);result = result2;}for (int x = 0; x < mapLen; ++x) {result[i + resultOffset + x] = upperCharArray[x];}resultOffset += (mapLen - srcCount);} else {result[i + resultOffset] = (char)upperChar;}}return new String(result, 0, len + resultOffset);}

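【13.2】The overloaded + operator vs. StringBuilder

1) Overloading means that an operator is given a special meaning when it is applied to a particular class.

(Note: the + and += used with String are the only two overloaded operators in Java, and Java does not let programmers overload any operator.)

【Example: string concatenation with +】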

// For the + operator, the compiler actually creates a StringBuilder;
// each + becomes a call to append().
public class Concatenation {
    public static void main(String[] args) {
        String mango = "mango";
        String s = "abc" + mango + "def" + 47;
        System.out.println(s);
    }
}
/* Output:
abcmangodef47
*/

【Code notes】

Implemented naively, the + concatenation operator would perform very poorly: producing the final String would create many intermediate objects that have to be garbage collected.

2) Decompiling Concatenation with javap

E:\bench-cluster\spring_in_action_eclipse\AThinkingInJava\src>javap -c chapter13.Concatenation
Compiled from "Concatenation.java"
public class chapter13.Concatenation {
  public chapter13.Concatenation();
    Code:
       0: aload_0
       1: invokespecial #1   // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #2   // String mango
       2: astore_1
       3: new           #3   // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4   // Method java/lang/StringBuilder."<init>":()V
      10: ldc           #5   // String abc
      12: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      15: aload_1
      16: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      19: ldc           #7   // String def
      21: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      24: bipush        47
      26: invokevirtual #8   // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
      29: invokevirtual #9   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      32: astore_2
      33: getstatic     #10  // Field java/lang/System.out:Ljava/io/PrintStream;
      36: aload_2
      37: invokevirtual #11  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      40: return
}

【Code notes】

Offset 3: the compiler automatically brought in the java.lang.StringBuilder class even though the source code never uses it, evidently because StringBuilder is more efficient. (This listing comes from an older javac; compilers for Java 9 and later may emit an invokedynamic call instead of explicit StringBuilder calls.)

3) How far can the compiler optimize String handling?

// Comparing implicit (+=) and explicit use of StringBuilder
public class WhitherStringBuilder {
    // Approach 1: concatenate String objects with +=
    public String implicit(String[] fields) {
        String result = "";
        for (int i = 0; i < fields.length; i++)
            result += fields[i]; // (slow) implicitly creates a StringBuilder
        return result;
    }
    // The StringBuilder is created inside the loop, so every iteration
    // creates a new StringBuilder object.

    // Approach 2: use one StringBuilder, which is more efficient
    public String explicit(String[] fields) {
        StringBuilder result = new StringBuilder(); // (fast) created once, explicitly
        for (int i = 0; i < fields.length; i++)
            result.append(fields[i]);
        return result.toString();
    }
}

4) A further note on StringBuilder: if you know roughly how long the final string will be, you can give the StringBuilder an initial capacity to avoid repeated reallocation of its buffer.
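The preallocation idea can be sketched as follows. PresizedBuilderDemo and join() are hypothetical names; the only library calls used are the StringBuilder(int) constructor, append(), capacity(), and length().

```java
final class PresizedBuilderDemo {
    // Joins the fields with one StringBuilder whose buffer is sized up front,
    // so append() should not need to grow the buffer along the way.
    static String join(String[] fields) {
        int total = 0;
        for (String f : fields)
            total += f.length();
        StringBuilder sb = new StringBuilder(total); // initial capacity, not length
        for (String f : fields)
            sb.append(f);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(64);
        System.out.println(sb.capacity()); // the capacity requested above
        System.out.println(sb.length());   // 0: no characters yet
        System.out.println(join(new String[] { "abc", "mango", "def" }));
    }
}
```

Capacity is only a hint about the buffer size; the builder still grows automatically if the estimate turns out to be too small.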

【StringBuilder example】

5) If toString() needs a loop, it is best to create a StringBuilder yourself.

/* Example: a loop inside toString() */
import java.util.Random;

public class UsingStringBuilder {
    public static Random rand = new Random(47);

    public String toString() {
        StringBuilder result = new StringBuilder("[");
        for (int i = 0; i < 25; i++) {
            result.append(rand.nextInt(100));
            result.append(", ");
        }
        result.delete(result.length() - 2, result.length()); // remove the trailing ", "
        result.append("]");
        return result.toString();
    }

    public static void main(String[] args) {
        UsingStringBuilder usb = new UsingStringBuilder();
        System.out.println(usb);
    }
}
/* Output:
[58, 55, 93, 61, 61, 29, 68, 0, 22, 7, 88, 28, 51, 89, 9, 78, 98, 61, 20, 58, 16, 40, 11, 22, 4]
*/

6) StringBuilder methods include insert(), replace(), substring(), and reverse(); the most commonly used are append() and toString().

7) StringBuilder vs. StringBuffer

7.1) StringBuilder: not thread-safe, fast (introduced in Java SE 5).

7.2) StringBuffer: thread-safe, but slower because its methods are synchronized (this is what was used before Java SE 5).

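【13.3】Unintended recursion

1) The root class of every Java class is Object, so every container class has a toString() method. A container's toString() represents both the container itself and the objects it holds.

【Example】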

// Coffee and Generator<T> are defined elsewhere and are not shown here;
// judging by the output, Coffee.toString() prints the class name and an id.
import java.util.Iterator;
import java.util.Random;

class Latte extends Coffee {}
class Americano extends Coffee {}
class Cappuccino extends Coffee {}
class Mocha extends Coffee {}
class Breve extends Coffee {}

public class CoffeeGenerator implements Generator<Coffee>, Iterable<Coffee> {
    private Class[] types = { Latte.class, Mocha.class, Cappuccino.class,
            Americano.class, Breve.class, };
    private static Random rand = new Random(47);

    public CoffeeGenerator() {}

    // For iteration:
    private int size = 0;

    public CoffeeGenerator(int sz) {
        size = sz;
    }

    public Coffee next() {
        try {
            return (Coffee) types[rand.nextInt(types.length)].newInstance();
            // Report programmer errors at run time:
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    class CoffeeIterator implements Iterator<Coffee> { // inner iterator class
        int count = size;

        public boolean hasNext() {
            return count > 0;
        }

        public Coffee next() {
            count--;
            return CoffeeGenerator.this.next();
        }

        public void remove() { // Not implemented
            throw new UnsupportedOperationException();
        }
    }

    public Iterator<Coffee> iterator() { // returns the iterator
        return new CoffeeIterator();
    }

    public static void main(String[] args) {
        CoffeeGenerator gen = new CoffeeGenerator();
        for (int i = 0; i < 10; i++)
            System.out.print(gen.next() + " ");
    }
}
/* Output:
Americano 0 Latte 1 Americano 2 Mocha 3 Mocha 4 Breve 5 Americano 6 Latte 7 Cappuccino 8 Cappuccino 9
*/

【Example: printing an object's address from toString()】

// Infinite recursion fills up the JVM stack, and then an error is thrown.
import java.util.ArrayList;
import java.util.List;

public class InfiniteRecursion {
    @Override
    public String toString() {
        // Using `this` here is what causes the infinite recursion:
        // return " InfiniteRecursion address: " + this + "\n";
        // Exception in thread "main" java.lang.StackOverflowError
        return " InfiniteRecursion address: " + super.toString() + "\n";
    }

    public static void main(String[] args) {
        List<InfiniteRecursion> v = new ArrayList<InfiniteRecursion>();
        for (int i = 0; i < 10; i++)
            v.add(new InfiniteRecursion());
        System.out.println(v);
    }
}
/* Output:
[ InfiniteRecursion address: chapter13.InfiniteRecursion@15db9742
,  InfiniteRecursion address: chapter13.InfiniteRecursion@6d06d69c
,  InfiniteRecursion address: chapter13.InfiniteRecursion@7852e922
,  InfiniteRecursion address: chapter13.InfiniteRecursion@4e25154f
,  InfiniteRecursion address: chapter13.InfiniteRecursion@70dea4e
,  InfiniteRecursion address: chapter13.InfiniteRecursion@5c647e05
,  InfiniteRecursion address: chapter13.InfiniteRecursion@33909752
,  InfiniteRecursion address: chapter13.InfiniteRecursion@55f96302
,  InfiniteRecursion address: chapter13.InfiniteRecursion@3d4eac69
,  InfiniteRecursion address: chapter13.InfiniteRecursion@42a57993
]
*/

【Code notes】 An automatic type conversion happens here, from InfiniteRecursion to String. Because this is preceded by a string and followed by a newline string, the compiler converts this to a String by calling this.toString(), so toString() ends up calling itself. The unbounded recursion fills the JVM stack, and a StackOverflowError is thrown. Replacing this with super.toString() makes the program run successfully.

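【13.4】Operations on Strings

1) The basic methods of String objects are as follows:

2) When the content needs to change, String methods return a new String object; when nothing changes, they return a reference to the original object.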
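A short sketch of this behavior, using a few standard String methods (trim(), replace(), substring(), concat()). StringOpsDemo and derive() are hypothetical names used only for this illustration.

```java
final class StringOpsDemo {
    // Each call below produces its result as a return value;
    // none of them modifies the String it is called on.
    static String[] derive(String original) {
        String trimmed = original.trim();
        return new String[] {
            trimmed,                   // surrounding whitespace removed
            trimmed.replace('l', 'L'), // every 'l' replaced
            trimmed.substring(7),      // from index 7 to the end
            original.concat("!")       // original text plus a suffix
        };
    }

    public static void main(String[] args) {
        String original = "  Hello, World  ";
        for (String s : derive(original))
            System.out.println("[" + s + "]");
        System.out.println("[" + original + "]"); // the original is untouched
    }
}
```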

【13.5】Formatted output

【13.5.1】printf()

【13.5.2】System.out.format(): the format() method is available on both PrintStream and PrintWriter objects.

【Example: output formatting with System.out.format()】

// Output formatting with System.out.format()
public class SimpleFormat {
    public static void main(String[] args) {
        int x = 5;
        double y = 5.332542;
        // The old way:
        System.out.println("Row 1: [" + x + " " + y + "]");
        // The new way:
        System.out.format("Row 1: [%d %f]\n", x, y); // format() example
        // or
        System.out.printf("Row 1: [%d %f]\n", x, y); // printf() example
    }
}
/* Output:
Row 1: [5 5.332542]
Row 1: [5 5.332542]
Row 1: [5 5.332542]
*/

【Note】 PrintStream.printf() simply calls format().

// Source of PrintStream.printf()
public PrintStream printf(String format, Object ... args) {
    return format(format, args);
}

// System.out is declared as a final PrintStream field
// (the runtime initializes it, and System.setOut() can redirect it)
public final static PrintStream out = null;

【13.5.3】The java.util.Formatter class

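1) All formatting is handled by the java.util.Formatter class.

1.1) Formatter is a translator: it turns a format string and data into the desired result.

1.2) The Formatter constructor takes the output destination as an argument; the most commonly used destinations are PrintStream, OutputStream, and File.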
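Besides streams and files, a Formatter can also write into any Appendable such as a StringBuilder, which is convenient when the formatted text is needed as a value. The sketch below assumes that use case; FormatterToBufferDemo and describe() are hypothetical names, and Locale.ROOT is passed only to keep the output independent of the machine's locale.

```java
import java.util.Formatter;
import java.util.Locale;

final class FormatterToBufferDemo {
    // Formats into an in-memory StringBuilder instead of System.out.
    static String describe(String name, int x, int y) {
        StringBuilder sink = new StringBuilder();
        // Formatter is Closeable, so try-with-resources releases it afterwards.
        try (Formatter f = new Formatter(sink, Locale.ROOT)) {
            f.format("%s The Turtle is at (%d,%d)", name, x, y);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("Tommy", 3, 4));
        System.out.println(describe("Terry", 2, 5));
    }
}
```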

【Formatter example】

// Formatter example
import java.io.PrintStream;
import java.util.Formatter;

public class Turtle {
    private String name;
    private Formatter f;

    public Turtle(String name, Formatter f) {
        this.name = name;
        this.f = f;
    }

    public void move(int x, int y) {
        f.format("%s The Turtle is at (%d,%d)\n", name, x, y);
    }

    public static void main(String[] args) {
        PrintStream outAlias = System.out;
        // new Formatter(dest) sets the output destination
        Turtle tommy = new Turtle("Tommy", new Formatter(System.out));
        Turtle terry = new Turtle("Terry", new Formatter(outAlias));
        tommy.move(0, 0);
        terry.move(4, 8);
        tommy.move(3, 4);
        terry.move(2, 5);
        tommy.move(3, 3);
        terry.move(3, 3);
    }
}
/* Output:
Tommy The Turtle is at (0,0)
Terry The Turtle is at (4,8)
Tommy The Turtle is at (3,4)
Terry The Turtle is at (2,5)
Tommy The Turtle is at (3,3)
Terry The Turtle is at (3,3)
*/

// Formatter constructor
public Formatter(PrintStream ps) {
    this(Locale.getDefault(Locale.Category.FORMAT),
         (Appendable)Objects.requireNonNull(ps));
}

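【13.5.4】Format specifiers (such as %d and %s)

1) Controlling spacing and alignment: output is right-aligned by default; the - flag makes it left-aligned.

2) Format specifier syntax: %[argument_index$][flags][width][.precision]conversion

2.1) argument_index: the position of the argument in the argument list;

2.2) flags: for example - for left alignment (the + flag forces a sign on numbers; it does not control alignment);

2.3) width: the minimum field width;

2.4) precision: for strings, the maximum number of characters printed; for floating-point numbers, the number of digits after the decimal point (6 by default, padded with zeros if there are fewer and rounded if there are more); it cannot be applied to integers (an exception is thrown);

2.5) conversion: the conversion character: d, c, b, s, f, e, x, h, %.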
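The pieces of the specifier syntax can be tried one at a time with String.format(). This is a small sketch; FormatSpecifierDemo and its method names are hypothetical, and Locale.ROOT is used for the floating-point case so that the decimal separator does not depend on the default locale.

```java
import java.util.Locale;

final class FormatSpecifierDemo {
    static String leftAligned(String s)  { return String.format("[%-6s]", s); } // flag - with width 6
    static String rightAligned(String s) { return String.format("[%6s]", s); }  // width 6, default alignment
    static String truncated(String s)    { return String.format("[%.3s]", s); } // precision 3 on a string
    static String fixedPoint(double d)   { return String.format(Locale.ROOT, "[%8.2f]", d); } // width 8, 2 decimals
    static String swapped(String a, String b) { return String.format("%2$s %1$s", a, b); }   // argument_index

    public static void main(String[] args) {
        System.out.println(leftAligned("ab"));
        System.out.println(rightAligned("ab"));
        System.out.println(truncated("abcdef"));
        System.out.println(fixedPoint(3.14159));
        System.out.println(swapped("a", "b"));
    }
}
```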

【Format specifier example】

// Left (-) and right (default) alignment in formatted output
import java.util.Formatter;

public class Receipt {
    private double total = 0;
    private Formatter f = new Formatter(System.out);

    public void printTitle() {
        // - means left-aligned
        f.format("%-15s %5s %10s\n", "Item", "Qty", "Price");
        f.format("%-15s %5s %10s\n", "----", "---", "-----");
    }

    public void print(String name, int qty, double price) {
        f.format("%-15.15s %5d %10.2f\n", name, qty, price);
        total += price;
    }

    public void printTotal() {
        f.format("%-15s %5s %10.2f\n", "Tax", "", total * 0.06);
        f.format("%-15s %5s %10s\n", "", "", "-----");
        f.format("%-15s %5s %10.2f\n", "Total", "", total * 1.06);
    }

    public static void main(String[] args) {
        Receipt receipt = new Receipt();
        receipt.printTitle();
        receipt.print("Jack's Magic Beans", 40, 4.25);
        receipt.print("Princess Peas", 311, 5.1);
        receipt.print("Three Bears Porridge", 1, 14.29);
        receipt.printTotal();
    }
}
/* Output:
Item              Qty      Price
----              ---      -----
Jack's Magic Be    40       4.25
Princess Peas     311       5.10
Three Bears Por     1      14.29
Tax                         1.42
                           -----
Total                      25.06
*/

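【13.5.5】Formatter conversions

1) The table below lists the most commonly used conversions:

2) The conversion characters are d, c, b, s, f, e, x, h, and %:

d: integer (decimal);

c: Unicode character;

b: Boolean value;

s: String;

f: floating point (decimal);

e: floating point (scientific notation);

x: integer (hexadecimal);

h: hash code (hexadecimal);

%: a literal percent sign, written as %% (a single % introduces a format specifier).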
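A compact way to see these conversions side by side is to apply each one to a suitable value. In this sketch, ConversionDemo and show() are hypothetical names; Locale.ROOT keeps the numeric output stable across machines.

```java
import java.util.Locale;

final class ConversionDemo {
    // Applies a single format specifier to a single argument.
    static String show(String spec, Object arg) {
        return String.format(Locale.ROOT, spec, arg);
    }

    public static void main(String[] args) {
        System.out.println(show("%d", 121));      // decimal integer
        System.out.println(show("%x", 121));      // hexadecimal integer
        System.out.println(show("%c", 121));      // the character with code point 121
        System.out.println(show("%b", "text"));   // true for any non-null, non-Boolean argument
        System.out.println(show("%s", 179.543));  // String form of the argument
        System.out.println(show("%e", 179.543));  // scientific notation
        System.out.println(show("%h", "hi"));     // hash code in hex
        System.out.println(String.format("100%%")); // %% prints a literal percent sign
    }
}
```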

【Conversion example】

/* Formatter 對各種數(shù)據(jù)類型轉(zhuǎn)換的荔枝 */ public class Conversion {public static void main(String[] args) {Formatter f = new Formatter(System.out);char u = 'a'; System.out.println("u = 'a'"); // u = 'a'f.format("%%s: %s\n", u); // %s: af.format("%%c: %c\n", u); // %c: af.format("%%b: %b\n", u); // %b: truef.format("%%h: %h\n", u); // %h: 61 // f.format("d: %d\n", u); // java.util.IllegalFormatConversionException: d != java.lang.Character // f.format("f: %f\n", u); // java.util.IllegalFormatConversionException: f != java.lang.Character // f.format("e: %e\n", u); // java.util.IllegalFormatConversionException: e != java.lang.Character // f.format("x: %x\n", u); // java.util.IllegalFormatConversionException: x != java.lang.Characterint v = 121;System.out.println();System.out.println("v = 121"); // v = 121f.format("%%d: %d\n", v); // %d: 121f.format("%%c: %c\n", v); // %c: yf.format("%%b: %b\n", v); // %b: true f.format("%%s: %s\n", v); // %s: 121f.format("%%x: %x\n", v); // %x: 79f.format("%%h: %h\n", v); // %h: 79 // f.format("f: %f\n", v); // java.util.IllegalFormatConversionException: f != java.lang.Integer // f.format("e: %e\n", v); // java.util.IllegalFormatConversionException: e != java.lang.IntegerBigInteger w = new BigInteger("50000000000000");System.out.println();System.out.println("w = new BigInteger(\"50000000000000\")"); // w = new BigInteger("50000000000000")f.format("%%d: %d\n", w); // %d: 50000000000000f.format("%%b: %b\n", w); // %b: truef.format("%%s: %s\n", w); // %s: 50000000000000f.format("%%x: %x\n", w); // %x: 2d79883d2000f.format("%%h: %h\n", w); // %h: 8842a1a7 // f.format("c: %c\n", w); // java.util.IllegalFormatConversionException: c != java.math.BigInteger // f.format("f: %f\n", w); // java.util.IllegalFormatConversionException: f != java.math.BigInteger // f.format("e: %e\n", w); // java.util.IllegalFormatConversionException: e != java.math.BigIntegerdouble x = 179.543;System.out.println();System.out.println("x = 179.543"); // x = 
179.543f.format("%%b: %b\n", x); // %b: truef.format("%%s: %s\n", x); // %s: 179.543f.format("%%f: %f\n", x); // %f: 179.543000f.format("%%e: %e\n", x); //%e: 1.795430e+02, 科學(xué)表示法f.format("%%h: %h\n", x); // %h: 1ef462c // f.format("d: %d\n", x); // java.util.IllegalFormatConversionException: d != java.lang.Double // f.format("c: %c\n", x); // java.util.IllegalFormatConversionException: c != java.lang.Double // f.format("x: %x\n", x); // java.util.IllegalFormatConversionException: x != java.lang.DoubleConversion y = new Conversion();System.out.println();System.out.println("y = new Conversion()"); // y = new Conversion()f.format("%%b: %b\n", y); // %b: truef.format("%%s: %s\n", y); // %s: chapter13.Conversion@4aa298b7f.format("%%h: %h\n", y); // %h: 4aa298b7 // f.format("d: %d\n", y); // java.util.IllegalFormatConversionException: d != chapter13.Conversion // f.format("c: %c\n", y); // java.util.IllegalFormatConversionException: c != chapter13.Conversion // f.format("f: %f\n", y); // java.util.IllegalFormatConversionException: f != chapter13.Conversion // f.format("e: %e\n", y); // java.util.IllegalFormatConversionException: e != chapter13.Conversion // f.format("x: %x\n", y); // java.util.IllegalFormatConversionException: x != chapter13.Conversionboolean z = false;System.out.println();System.out.println("z = false"); // z = falsef.format("%%b: %b\n", z); // %b: falsef.format("%%s: %s\n", z); // %s: falsef.format("%%h: %h\n", z); // %h: 4d5 // f.format("d: %d\n", z); // java.util.IllegalFormatConversionException: d != java.lang.Boolean // f.format("c: %c\n", z); // java.util.IllegalFormatConversionException: c != java.lang.Boolean // f.format("f: %f\n", z); // java.util.IllegalFormatConversionException: f != java.lang.Boolean // f.format("e: %e\n", z); // java.util.IllegalFormatConversionException: e != java.lang.Boolean // f.format("x: %x\n", z); // java.util.IllegalFormatConversionException: x != java.lang.Boolean} }

【13.5.6】String.format()

1) String.format() takes the same arguments as Formatter.format(), but returns a String object.

【String.format() example】

public class DatabaseException extends Exception {
    public DatabaseException(int transactionID, int queryID, String message) {
        super(String.format("(t%d, q%d) %s", transactionID, queryID, message));
    }

    /*
     * String.format() source: it also creates a Formatter object.
     *
     * public static String format(String format, Object... args) {
     *     return new Formatter().format(format, args).toString();
     * }
     */

    public static void main(String[] args) {
        try {
            throw new DatabaseException(3, 7, "Write failed");
        } catch (Exception e) {
            System.out.println(e);
            System.out.println(e.getMessage());
        }
    }
}
/* Output:
chapter13.DatabaseException: (t3, q7) Write failed
(t3, q7) Write failed
*/

2) A hex dump tool

【Example: using String.format() to print a byte array in readable hexadecimal】

// Hex dump tool
import java.io.File;

public class Hex {
    public static String format(byte[] data) {
        StringBuilder result = new StringBuilder();
        int n = 0;
        for (byte b : data) {
            if (n % 16 == 0)
                result.append(String.format("%05X: ", n)); // offset, 5 hex digits
            result.append(String.format("%02X ", b));      // byte, 2 hex digits
            n++;
            if (n % 16 == 0)
                result.append("\n");
        }
        result.append("\n");
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0)
            // MyConstant.path is the author's own constant (not shown here)
            System.out.println(format(BinaryFile.read(MyConstant.path + "Hex.class")));
        else
            System.out.println(format(BinaryFile.read(new File(args[0]))));
    }
}
/* Output (beginning):
00000: CA FE BA BE 00 00 00 34 00 58 0A 00 05 00 26 07
00010: 00 27 0A 00 02 00 26 08 00 28 07 00 29 0A 00 2A
......
*/

// BinaryFile.java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class BinaryFile {
    public static byte[] read(File bFile) throws IOException {
        BufferedInputStream bf = new BufferedInputStream(new FileInputStream(bFile));
        try {
            byte[] data = new byte[bf.available()];
            bf.read(data);
            return data;
        } finally {
            bf.close();
        }
    }

    public static byte[] read(String bFile) throws IOException {
        return read(new File(bFile).getAbsoluteFile());
    }
}

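【13.6】Regular expressions (regex)

【13.6.1】Basics

1) Java treats the backslash '\' differently:

1.1) In other languages, \\ means "insert a literal backslash '\' into the regex";

1.2) In Java, \\ means "insert a regex backslash, so the character that follows has a special meaning".

2) Examples of backslashes in Java:

2.1) The regex for a digit: \\d;

2.2) The regex for a literal backslash: \\\\;

2.3) Newline and tab: \n\t (single backslashes; no doubling needed).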
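The three escaping cases above can be checked directly with String.matches(). BackslashDemo and its method names are hypothetical, chosen only for this sketch.

```java
final class BackslashDemo {
    // "\\d" in Java source is the two-character regex \d (one digit).
    static boolean isSingleDigit(String s) {
        return s.matches("\\d");
    }

    // "\\\\" in Java source is the regex \\ which matches one literal backslash.
    static boolean isSingleBackslash(String s) {
        return s.matches("\\\\");
    }

    // "\t" is an actual tab character placed in the regex; it matches a tab.
    static boolean isTabSeparatedPair(String s) {
        return s.matches("\\w+\t\\w+");
    }

    public static void main(String[] args) {
        System.out.println("\\d".length());               // the regex itself has two characters
        System.out.println(isSingleDigit("7"));
        System.out.println(isSingleBackslash("\\"));       // the input is one backslash
        System.out.println(isTabSeparatedPair("key\tvalue"));
    }
}
```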

3) The simplest way to use a regex is the functionality built into the String class: String.matches(regex).

4) Example of String's built-in regex matching

/* Example of String's built-in regex matching */
public class IntegerMatch {
    public static void main(String[] args) {
        System.out.println("-1234".matches("-?\\d+"));       // true
        System.out.println("5678".matches("-?\\d+"));        // true
        System.out.println("+911".matches("-?\\d+"));        // false
        System.out.println("+911".matches("(-|\\+)?\\d+"));  // true
    }
}
/* Output:
true
true
false
true
*/

// Source of String.matches(); it simply calls Pattern.matches()
public boolean matches(String regex) {
    return Pattern.matches(regex, this);
}

【Code notes】 (-|\\+)? means that the string may start with a - or a + (\\+ escapes the +, turning it into an ordinary character), or with neither (? means zero or one occurrence).

5) String.split(regex): cuts the string apart wherever the regex matches; the regex can be as simple as a single space.

【Example: splitting a string with String.split(regex)】

/* Example: splitting a string with String.split(regex) */
public class Splitting {
    public static String knights = "Then, when you have found the shrubbery, you must "
            + "cut down the mightiest tree in the forest... "
            + "with... a herring!";

    public static void split(String regex) {
        String[] array = knights.split(regex);
        for (String s : array) {
            System.out.print(s + " ");
        }
    }

    public static void main(String[] args) {
        System.out.println("knights = \"" + knights + "\"\n");
        split(" ");      // split on spaces; the argument doesn't have to contain regex chars
        System.out.println();
        split("\\W+");   // (uppercase W) split on runs of non-word characters
        System.out.println();
        split("n\\W+");  // (uppercase W) split on 'n' followed by non-word characters
        // String.split(regex) does not modify the string; it creates new Strings
        System.out.println("\nknights = \"" + knights + "\"\n");
    }
    // Whatever text the split is based on is removed from the result.
}
/* Output:
knights = "Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!"

Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!
Then when you have found the shrubbery you must cut down the mightiest tree in the forest with a herring
The whe you have found the shrubbery, you must cut dow the mightiest tree i the forest... with... a herring!
knights = "Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!"
*/

【Code notes】

1) \W matches a non-word character;

2) \w matches a word character.

6) The overloaded version of String.split() lets you limit the number of splits.
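The limit argument of split(regex, limit) can be sketched as follows: a positive limit caps the number of resulting pieces, and a negative limit keeps trailing empty strings that the one-argument version drops. SplitLimitDemo and its method names are hypothetical.

```java
import java.util.Arrays;

final class SplitLimitDemo {
    // At most two pieces: the text is cut at the first comma only.
    static String[] headAndRest(String csv) {
        return csv.split(",", 2);
    }

    // A negative limit applies the pattern as often as possible
    // and keeps trailing empty strings.
    static String[] keepEmpties(String csv) {
        return csv.split(",", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(headAndRest("a,b,c,d")));
        System.out.println(Arrays.toString("a,b,,".split(",")));   // trailing empties dropped
        System.out.println(Arrays.toString(keepEmpties("a,b,,"))); // trailing empties kept
    }
}
```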

7) Replacing text with a regex: you can replace only the first substring that matches the regex (replaceFirst), or every match (replaceAll).

【Example: replacing text with a regex (replaceFirst, replaceAll)】

/* Replacing text with a regex (replaceFirst, replaceAll) */
public class Replacing {
    static String s = Splitting.knights;

    public static void main(String[] args) {
        System.out.println(s);
        // The first match of y followed by word characters ("you") becomes Tom, once only
        System.out.println(s.replaceFirst("y\\w+", "Tom")); // String.replaceFirst example
        System.out.println();
        // Every shrubbery, tree, or herring becomes banana
        System.out.println(s.replaceAll("shrubbery|tree|herring", "banana")); // String.replaceAll example
    }
}
/* Output:
Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!
Then, when Tom have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!

Then, when you have found the banana, you must cut down the mightiest banana in the forest... with... a banana!
*/

【13.6.2】Creating regular expressions

1) Regex characters, character classes, logical operators, and boundary matchers:

【Example: matching a character sequence with a regex】

// Pattern matching with regular expressions
public class Rudolph {
    public static void main(String[] args) {
        for (String pattern : new String[] { "Rudolph", "[rR]udolph",
                "[rR][aeiou][a-z]ol.*", "R.*" })
            System.out.println("Rudolph".matches(pattern)); // all true: every pattern matches
    }
}
/* Source of String.matches(String regex):
public boolean matches(String regex) {
    return Pattern.matches(regex, this);
}
*/

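【13.6.3】Quantifiers

1) A quantifier describes how a pattern absorbs input text:

Greedy: matches as much of the input as it can;

Reluctant (written with a trailing ?): matches as little of the input as it can;

Possessive (written with a trailing +): matches as much as it can and never gives characters back.

【Note】 The expression X usually has to be wrapped in parentheses for the quantifier to apply to all of it; for example, abc+ matches ab followed by one or more c, while (abc)+ matches one or more repetitions of abc.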
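The difference between the three quantifier types shows up when the same input is searched with <.+>, <.+?>, and <.++>. This is a minimal sketch; QuantifierDemo and firstMatch() are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        System.out.println(firstMatch("<.+>", input));  // greedy: takes as much as possible
        System.out.println(firstMatch("<.+?>", input)); // reluctant: stops at the first '>'
        System.out.println(firstMatch("<.++>", input)); // possessive: .++ swallows the final '>' and cannot give it back

        // Parentheses decide what the quantifier applies to.
        System.out.println("abcabc".matches("(abc)+")); // the whole group repeats
        System.out.println("abcabc".matches("abc+"));   // only 'c' repeats
    }
}
```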

【CharSequence】 The CharSequence interface abstracts the general notion of a character sequence out of the CharBuffer, String, StringBuffer, and StringBuilder classes; many regex operations accept CharSequence arguments.

/* CharSequence接口源碼 */ public interface CharSequence {int length();char charAt(int index);CharSequence subSequence(int start, int end);public String toString();public default IntStream chars() {class CharIterator implements PrimitiveIterator.OfInt {int cur = 0;public boolean hasNext() {return cur < length();}public int nextInt() {if (hasNext()) {return charAt(cur++);} else {throw new NoSuchElementException();}}@Overridepublic void forEachRemaining(IntConsumer block) {for (; cur < length(); cur++) {block.accept(charAt(cur));}}}return StreamSupport.intStream(() ->Spliterators.spliterator(new CharIterator(),length(),Spliterator.ORDERED),Spliterator.SUBSIZED | Spliterator.SIZED | Spliterator.ORDERED,false);}public default IntStream codePoints() {class CodePointIterator implements PrimitiveIterator.OfInt {int cur = 0;@Overridepublic void forEachRemaining(IntConsumer block) {final int length = length();int i = cur;try {while (i < length) {char c1 = charAt(i++);if (!Character.isHighSurrogate(c1) || i >= length) {block.accept(c1);} else {char c2 = charAt(i);if (Character.isLowSurrogate(c2)) {i++;block.accept(Character.toCodePoint(c1, c2));} else {block.accept(c1);}}}} finally {cur = i;}}public boolean hasNext() {return cur < length();}public int nextInt() {final int length = length();if (cur >= length) {throw new NoSuchElementException();}char c1 = charAt(cur++);if (Character.isHighSurrogate(c1) && cur < length) {char c2 = charAt(cur);if (Character.isLowSurrogate(c2)) {cur++;return Character.toCodePoint(c1, c2);}}return c1;}}return StreamSupport.intStream(() ->Spliterators.spliteratorUnknownSize(new CodePointIterator(),Spliterator.ORDERED),Spliterator.ORDERED,false);} }

【13.6.4】Pattern and Matcher

1) How do you build a more powerful regex object?

Step 1: Pattern.compile(regex) compiles the regex and produces a Pattern object;

Step 2: Pattern.matcher(input) produces a Matcher object for the string to be searched.

2) The Matcher object has many methods; for example:

// Testing regular expressions with Pattern and Matcher
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegularExpression {
    public static void main(String[] args) {
        String[] array = {"aabbcc", "aab", "aab+", "(b+)"};
        for (String arg : array) {
            System.out.println();
            System.out.println("Regular expression: \"" + arg + "\"");
            Pattern p = Pattern.compile(arg);  // step 1: Pattern is the compiled regex
            Matcher m = p.matcher("aabbcc");   // step 2: the Pattern searches the input and yields a Matcher
            while (m.find()) {
                System.out.println("Match \"" + m.group()   // the matched substring
                        + "\" at positions " + m.start()   // index where the match starts
                        + "-" + (m.end() - 1));             // index of the last matched character
            }
        }
    }
}
/* Output:

Regular expression: "aabbcc"
Match "aabbcc" at positions 0-5

Regular expression: "aab"
Match "aab" at positions 0-2

Regular expression: "aab+"
Match "aabb" at positions 0-3

Regular expression: "(b+)"
Match "bb" at positions 2-3
*/

【Code notes】 A Pattern object represents the compiled regex; it is a more capable regular expression object.

【Pattern: the compiled regex】

1) Pattern provides a static matches() method: it calls Pattern.compile(regex) to create a Pattern object, then pattern.matcher(input) to create a Matcher object, and finally returns the result of matcher.matches(), that is, whether the input matches the regex.

// Source of Pattern.matches()
public static boolean matches(String regex, CharSequence input) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);
    return m.matches();
}

// Source of Pattern.compile()
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

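2) Pattern methods:

split(): splits the string wherever the regex matches and returns the resulting array of strings;

pattern(): returns the regex string from which the Pattern was compiled.

3) Matcher methods:

boolean matches();        // whether the entire input matches the regex
boolean lookingAt();      // whether the beginning of the input (not necessarily all of it) matches the regex
boolean find();           // finds the next match in the CharSequence input; call it repeatedly for multiple matches
boolean find(int start);  // resets the matcher and searches for a match starting at index start
String group();           // returns the substring of the input matched by the previous match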
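The following sketch contrasts matches(), lookingAt(), and find() on the same input. MatcherMethodsDemo and probe() are hypothetical names; reset() is called before find() so that the search starts from the beginning rather than after the previous successful match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherMethodsDemo {
    // Reports how the three matching methods judge the same regex and input.
    static String probe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean whole = m.matches();    // the entire input must match
        boolean prefix = m.lookingAt(); // the match must start at index 0
        m.reset();                      // start searching from the beginning again
        boolean anywhere = m.find();    // a match anywhere in the input is enough
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(probe("[a-z]+", "abc123"));
        System.out.println(probe("\\d+", "abc123"));
        System.out.println(probe("\\w+", "abc123"));
    }
}
```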

【Example: Matcher.find()】

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Finding {
    public static void main(String[] args) {
        // Step 1: compile the regex into a Pattern.
        // Step 2: the Pattern searches the input string and yields a Matcher.
        Matcher m = Pattern.compile("\\w+")   // (lowercase w) matches word characters
                .matcher("Evening is full of the linnet's wings");
        while (m.find())
            System.out.print(m.group() + ", ");
        System.out.println("\n======");
        int i = 0;
        while (m.find(i)) {
            System.out.print(m.group() + " \n");
            i++;
        }
    }
}
/* Output (the matches after ====== are printed one per line; they are joined here to save space):
Evening, is, full, of, the, linnet, s, wings,
======
Evening vening ening ning ing ng g is is s full full ull ll l of of f
the the he e linnet linnet innet nnet net et t s s wings wings ings ngs gs s
*/

【Code notes】 The pattern \\w+ splits the input into words. find() moves forward through the input string; find(int start) uses start as the index at which the search begins.

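4) Groups: a group is a part of the regex delimited by parentheses, and it can be referred to by its group number. Group 0 is the whole expression, group 1 is the group in the first pair of parentheses, and so on.

【Example: groups】

A(B(C))D has three groups:

Group 0: ABCD

Group 1: BC

Group 2: C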
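The group numbering for A(B(C))D can be verified with Matcher.groupCount() and Matcher.group(int). GroupNumberDemo and groups() are hypothetical names used for this sketch.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupNumberDemo {
    // Returns group 0..groupCount() for a full match, or an empty array if the input does not match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches())
            return new String[0];
        String[] out = new String[m.groupCount() + 1]; // groupCount() excludes group 0
        for (int i = 0; i <= m.groupCount(); i++)
            out[i] = m.group(i);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("A(B(C))D", "ABCD")));
    }
}
```

Note that groupCount() reports 2 here: group 0, the whole match, is always available and is not included in the count.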

【Example: regex groups】

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Groups {
    static public final String POEM = "Twas brillig, and the slithy toves\n"
            + "Did gyre and gimble in the wabe.\n"
            + "All mimsy were the borogoves,\n"
            + "And the mome raths outgrabe.\n\n"
            + "Beware the Jabberwock, my son,\n"
            + "The jaws that bite, the claws that catch.\n"
            + "Beware the Jubjub bird, and shun\n"
            + "The frumious Bandersnatch.";

    public static void main(String[] args) {
        // \S is a non-whitespace character, \s a whitespace character;
        // parentheses delimit the groups.
        // The goal is to capture the last three words of each line; $ marks the end of a line.
        // (?m) is a pattern flag that turns on multiline mode, so $ matches at every line end.
        Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
                .matcher(POEM); // match the regex against the input string POEM
        while (m.find()) {
            for (int j = 0; j <= m.groupCount(); j++)
                System.out.print("[" + m.group(j) + "]");
            System.out.println();
        }
    }
}
/* Output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[mome raths outgrabe.][mome][raths outgrabe.][raths][outgrabe.]
[Jabberwock, my son,][Jabberwock,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
*/

5) The start() and end() methods:

5.1) Return values: start() returns the index at which the previous match begins; end() returns the index of the last matched character plus one.

5.2) After a failed match (or before any match has been attempted), calling start() or end() throws IllegalStateException.
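Both points can be seen in a few lines: the half-open span returned by start() and end(), and the IllegalStateException when no match is available. StartEndDemo and its two methods are hypothetical names for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StartEndDemo {
    // Returns {start, end} of the first match, or null when nothing matches.
    static int[] span(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find())
            return null;
        return new int[] { m.start(), m.end() }; // end is one past the last matched character
    }

    // Returns true if start() refuses to answer after a failed find().
    static boolean startFailsAfterMiss(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean found = m.find();
        try {
            m.start();
            return false; // a match was available
        } catch (IllegalStateException e) {
            return !found; // thrown because no match is available
        }
    }

    public static void main(String[] args) {
        int[] s = span("cat", "a cat");
        System.out.println(s[0] + "-" + s[1]);
        System.out.println(startFailsAfterMiss("dog", "a cat"));
    }
}
```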

【Example: Matcher methods】

public class StartEnd {public static String input = "As long as there is injustice, whenever a\n"+ "Targathian baby cries out, wherever a distress\n"+ "signal sounds among the stars ... We'll be there.\n"+ "This fine ship, and this fine crew ...\n"+ "Never give up! Never surrender!";private static class Display {private boolean regexPrinted = false;private String regex;Display(String regex) {this.regex = regex;}void display(String message) {if (!regexPrinted) { // print(regex);regexPrinted = true;}print(message);}}/* 校驗輸入字符串s 是否匹配 regex */static void examine(String s, String regex) {Display d = new Display(regex);Pattern p = Pattern.compile(regex);Matcher m = p.matcher(s);/* find() 遍歷輸入字符串，并以匹配regex的輸入字符串子串的終點作為下次遍歷的起點 */while (m.find())/* Matcher.group() 返回的是匹配regex的輸入字符串的子串*/d.display("find() '" + m.group() + "' start = " + m.start()+ " end = " + m.end());/* 判斷輸入字符串的開始部分是否匹配regex */if (m.lookingAt()){ // No reset() necessary System.out.println("\n m.lookingAt() : ");d.display("lookingAt() start = " + m.start() + " end = " + m.end());}/* 判斷整個輸入字符串是否匹配 regex */if (m.matches()) // No reset() necessaryd.display("matches() start = " + m.start() + " end = " + m.end());}public static void main(String[] args) {int i = 0;for (String in : input.split("\n")) {System.out.println("[" + ++i +"]====================================");print("input : " + in);int j = 0;for (String regex : new String[] { "\\w*ere\\w*", "\\w*ever","T\\w+", "Never.*?!" }) {System.out.println("regex" + ++j + " = " + regex);examine(in, regex);}}} } /* [1]==================================== input : As long as there is injustice, whenever a regex1 = \w*ere\w* find() 'there' start = 11 end = 16 regex2 = \w*ever find() 'whenever' start = 31 end = 39 regex3 = T\w+ regex4 = Never.*?! 
[2]==================================== input : Targathian baby cries out, wherever a distress regex1 = \w*ere\w* find() 'wherever' start = 27 end = 35 regex2 = \w*ever find() 'wherever' start = 27 end = 35 regex3 = T\w+ find() 'Targathian' start = 0 end = 10m.lookingAt() : lookingAt() start = 0 end = 10 regex4 = Never.*?! [3]==================================== input : signal sounds among the stars ... We'll be there. regex1 = \w*ere\w* find() 'there' start = 43 end = 48 regex2 = \w*ever regex3 = T\w+ regex4 = Never.*?! [4]==================================== input : This fine ship, and this fine crew ... regex1 = \w*ere\w* regex2 = \w*ever regex3 = T\w+ find() 'This' start = 0 end = 4m.lookingAt() : lookingAt() start = 0 end = 4 regex4 = Never.*?! [5]==================================== input : Never give up! Never surrender! regex1 = \w*ere\w* regex2 = \w*ever find() 'Never' start = 0 end = 5 find() 'Never' start = 15 end = 20m.lookingAt() : lookingAt() start = 0 end = 5 regex3 = T\w+ regex4 = Never.*?! find() 'Never give up!' start = 0 end = 14 find() 'Never surrender!' start = 15 end = 31m.lookingAt() : lookingAt() start = 0 end = 14 matches() start = 0 end = 31 */ 【代碼解說-Matcher方法列表】

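1) find(): looks for a match of the regex anywhere in the input; find(int start) begins looking at index start of the input;

2) lookingAt(): succeeds only if the regex matches starting at the very beginning of the input;

3) matches(): succeeds only if the entire input matches the regex.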

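【Pattern flags】

1）An overloaded version of Pattern.compile() takes a flags argument that adjusts how the regex is matched:

// Source of Pattern.compile(String, int)
public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}

2）The flags argument selects the matching behavior. It must be one of the Pattern class constants, or several of them combined with the bitwise OR operator (|).

3）Commonly used Pattern flags:

3.1）Pattern.CASE_INSENSITIVE: matching ignores case (by default only for US-ASCII characters).

3.2）Pattern.MULTILINE: ^ and $ match at the beginning and end of every line, not only at the beginning and end of the whole input.

3.3）Pattern.COMMENTS: whitespace in the pattern is ignored, and a comment starting with # is ignored up to the end of the line.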
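To see what each flag changes, the sketch below counts matches of ^java with and without MULTILINE and compiles a date pattern with COMMENTS. The class FlagsDemo, its helper methods and the date pattern are illustrative assumptions, not code from the book.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // Counts how many times regex (compiled with the given flags) is found in input.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    // With COMMENTS, the spaces and the trailing "# ..." text are not part of the regex.
    static boolean matchesDate(String input) {
        Pattern p = Pattern.compile(
            "(\\d{2}) / (\\d{2}) / (\\d{4})   # month / day / year",
            Pattern.COMMENTS);
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        String text = "java\nJava\nJAVA";
        // 1: without MULTILINE, ^ matches only at the start of the whole input
        System.out.println(countMatches("^java", Pattern.CASE_INSENSITIVE, text));
        // 3: with MULTILINE, ^ also matches after each line break
        System.out.println(countMatches("^java",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, text));
        System.out.println(matchesDate("02/10/2005"));     // true
        System.out.println(matchesDate("02 / 10 / 2005")); // false: the pattern has no spaces
    }
}
```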

【Example: Pattern flags】

/* Example of Pattern flags */
import java.util.regex.*;

public class ReFlags {
    public static void main(String[] args) {
        // Pattern.CASE_INSENSITIVE: ignore case when matching
        // Pattern.MULTILINE: ^ and $ also match at the start and end of every line
        Pattern p = Pattern.compile("^java",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher("java has regex\nJava has regex\n"
            + "JAVA has pretty good regular expressions\n"
            + "Regular expressions are in Java");
        /* Matcher.find() searches the input for the next substring that matches regex */
        while (m.find())
            System.out.println(m.group()); // m.group() returns the matched substring
    }
}
/*
java
Java
JAVA
*/

【Note】A Pattern object represents a compiled regex.

【13.6.5】Pattern.split()

1）Pattern.split() splits the input string into an array of String objects. The split boundaries are determined by regex, and the boundaries themselves are removed from the result.

// Source of Pattern.split(CharSequence input)
public String[] split(CharSequence input) {
    return split(input, 0);
}

// Source of Pattern.split(CharSequence input, int limit)
public String[] split(CharSequence input, int limit) {
    int index = 0;
    boolean matchLimited = limit > 0;
    ArrayList<String> matchList = new ArrayList<>();
    Matcher m = matcher(input);

    // Add segments before each match found
    while (m.find()) {
        if (!matchLimited || matchList.size() < limit - 1) {
            if (index == 0 && index == m.start() && m.start() == m.end()) {
                // no empty leading substring included for zero-width match
                // at the beginning of the input char sequence.
                continue;
            }
            String match = input.subSequence(index, m.start()).toString();
            matchList.add(match);
            index = m.end();
        } else if (matchList.size() == limit - 1) { // last one
            String match = input.subSequence(index, input.length()).toString();
            matchList.add(match);
            index = m.end();
        }
    }

    // If no match was found, return this
    if (index == 0)
        return new String[] {input.toString()};

    // Add remaining segment
    if (!matchLimited || matchList.size() < limit)
        matchList.add(input.subSequence(index, input.length()).toString());

    // Construct result
    int resultSize = matchList.size();
    if (limit == 0)
        while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
            resultSize--;
    String[] result = new String[resultSize];
    return matchList.subList(0, resultSize).toArray(result);
}

【Example: splitting an input string with Pattern.split()】

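// Test case for Pattern.split()
import java.util.*;
import java.util.regex.*;
import static net.mindview.util.Print.*; // print(), from the book's utility library

public class SplitDemo {
    public static void main(String[] args) {
        String input = "This!!unusual use!!of exclamation!!points";
        // split(input) is the same as split(input, 0): no limit on the number of pieces
        print(Arrays.toString(Pattern.compile("!!").split(input)));
        // limit = 3 caps the size of the resulting array at 3,
        // so only the first two "!!" act as boundaries.
        // Note: the boundaries themselves are removed from the result.
        print(Arrays.toString(Pattern.compile("!!").split(input, 3)));
    }
}
/*
[This, unusual use, of exclamation, points]
[This, unusual use, of exclamation!!points]
*/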
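The split() source shown earlier treats the limit argument in three ways: a positive limit caps the array size, zero drops trailing empty strings, and a negative limit keeps them. A small sketch of the three cases follows; the class SplitLimitDemo and the sample input are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {
    // Splits input on commas with the given limit and renders the array as text.
    static String split(String input, int limit) {
        return Arrays.toString(Pattern.compile(",").split(input, limit));
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";
        System.out.println(split(input, 0));  // [a, b, c]      trailing empty strings dropped
        System.out.println(split(input, 2));  // [a, b,c,,]     at most 2 elements
        System.out.println(split(input, -1)); // [a, b, c, , ]  trailing empty strings kept
    }
}
```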

【13.6.6】Replace operations

1）An example of Matcher.appendReplacement() and Matcher.appendTail()

Note: the program reads its own source file, so the two Chinese comment lines below are kept untranslated; they are the text that the regex matches, and the printed output depends on them. They mean "match all text between /*! and !*/" and "e.g. /*! today is 2017-11-26, i love you. !*/".

import java.util.regex.*;
import net.mindview.util.*;              // TextFile, from the book's utility library
import static net.mindview.util.Print.*; // print()

public class TheReplacements {
    public static void main(String[] args) throws Exception {
        // MyConstant.path is the author's own constant for the source directory
        String s = TextFile.read(MyConstant.path + "TheReplacements.java");
        // 匹配在 /*! 和 !*/ 之間的所有文字。
        // 如 /*! 今天 2017 年11月26日 , i love you. !*/
        Matcher mInput = Pattern.compile("/\\*!(.*)!\\*/", Pattern.DOTALL).matcher(s);
        if (mInput.find()) {
            s = mInput.group(1); // Captured by parentheses
            System.out.println("matched.");
            System.out.println("s1 = " + s);
        }
        // Replace two or more spaces with a single space (tab indentation is not affected):
        s = s.replaceAll(" {2,}", " ");
        System.out.println("after s.replaceAll(\" {2,}\", \" \"), s2 = " + s);
        // Replace one or more spaces at the beginning of each line with no spaces.
        // Must enable MULTILINE mode:
        s = s.replaceAll("(?m)^ +", "");
        System.out.println("after s = s.replaceAll(\"(?m)^ +\", \"\"), s3 = " + s);
        // Replace the first vowel found with "(Tr)"; this calls String.replaceFirst()
        s = s.replaceFirst("[aeiou]", "(Tr)");
        System.out.println("after s.replaceFirst(\"[aeiou]\", \"(Tr)\"), s4 = " + s);
        // Build the pattern, i.e. the compiled regex
        StringBuffer sbuf = new StringBuffer();
        Pattern p = Pattern.compile("[aeiou]");
        Matcher m = p.matcher(s);
        // Process the find information as you perform the replacements:
        while (m.find())
            m.appendReplacement(sbuf, m.group().toUpperCase()); // upper-case each vowel found
        print("s = " + s); // s itself is unchanged
        // Put in the remainder of the text:
        m.appendTail(sbuf);
        print(sbuf); // sbuf now holds s with every matched vowel upper-cased
    }
}

// Output:
matched.
s1 = 和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , i love you.
after s.replaceAll(" {2,}", " "), s2 = 和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , i love you.
after s = s.replaceAll("(?m)^ +", ""), s3 = 和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , i love you.
after s.replaceFirst("[aeiou]", "(Tr)"), s4 = 和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , (Tr) love you.
s = 和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , (Tr) love you.
和 !*/ 之間的所有文字。
// 如 /*! 今天 2017 年11月26日 , (Tr) lOvE yOU.

【Code notes】Method list:

1）Matcher.appendReplacement(StringBuffer sb, String replacement): appends to sb the input text between the previous match and the current one, followed by the replacement for the current match.

2）Matcher.appendTail(StringBuffer sb): after one or more calls to appendReplacement(), copies the remainder of the input to sb.

3）Matcher.replaceFirst(String replacement): replaces the first match; internally it calls appendReplacement() once and then appendTail() once.

// Source of Matcher.replaceFirst()
public String replaceFirst(String replacement) {
    if (replacement == null)
        throw new NullPointerException("replacement");
    reset();
    if (!find())
        return text.toString();
    StringBuffer sb = new StringBuffer();
    appendReplacement(sb, replacement);
    appendTail(sb);
    return sb.toString();
}

4）Matcher.replaceAll(String replacement): replaces every part of the input that matches regex with replacement; internally it calls appendReplacement() once per match and then appendTail() once.

// Source of Matcher.replaceAll()
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}

5）The statement s = s.replaceFirst("[aeiou]", "(Tr)"); calls String.replaceFirst(), which in turn delegates to Matcher.replaceFirst(), as its source shows:

// Source of String.replaceFirst()
public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}
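The find / appendReplacement / appendTail loop described above can be written by hand. This sketch assumes a hypothetical class VowelReplacer; it is an illustration of the pattern, not the JDK implementation. Matcher.quoteReplacement() is used in the second helper because a computed replacement could contain $ or \, which appendReplacement() would otherwise interpret.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VowelReplacer {
    // Hand-written equivalent of Matcher.replaceAll(replacement).
    static String replaceAllByHand(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find())
            m.appendReplacement(sb, replacement); // text before the match + replacement
        m.appendTail(sb);                         // whatever follows the last match
        return sb.toString();
    }

    // The replacement is computed per match, which replaceAll(String) cannot do.
    static String upperVowels(String input) {
        Matcher m = Pattern.compile("[aeiou]").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find())
            m.appendReplacement(sb, Matcher.quoteReplacement(m.group().toUpperCase()));
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllByHand("[aeiou]", "i love you.", "*")); // * l*v* y**.
        System.out.println(upperVowels("i love you."));                      // I lOvE yOU.
        System.out.println("i love you.".replaceFirst("[aeiou]", "(Tr)"));   // (Tr) love you.
    }
}
```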

【13.6.7】reset()

1）reset(CharSequence) applies an existing Matcher to a new character sequence; reset() with no arguments moves the Matcher back to the beginning of the current sequence.

// reset(CharSequence) gives the Matcher a new input string
import java.util.regex.*;

public class Resetting {
    public static void main(String[] args) throws Exception {
        // The initial input is "fix the rug with bags"
        Matcher m = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");
        while (m.find()) {
            System.out.println(m.group() + " " + m.start() + " to " + m.end());
        }
        System.out.println("\nafter m.reset(\"fix the rig with rags\") by regex [frb][aiu][gx]");
        /* reset(CharSequence) loads the new input and starts again from its beginning */
        m.reset("fix the rig with rags");
        while (m.find())
            System.out.println(m.group() + " " + m.start() + " to " + m.end());
    }
}
/*
fix 0 to 3
rug 8 to 11
bag 17 to 20

after m.reset("fix the rig with rags") by regex [frb][aiu][gx]
fix 0 to 3
rig 8 to 11
rag 17 to 20
*/
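The two forms of reset can be compared side by side. In this sketch the class ResetDemo and the helper findAll() are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetDemo {
    // Collects every remaining match from the matcher's current position.
    static List<String> findAll(Matcher m) {
        List<String> found = new ArrayList<>();
        while (m.find())
            found.add(m.group());
        return found;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");
        m.find();                       // consumes the first match, "fix"
        System.out.println(findAll(m)); // [rug, bag]
        m.reset();                      // same input, back to the beginning
        System.out.println(findAll(m)); // [fix, rug, bag]
        m.reset("fix the rig with rags"); // new input
        System.out.println(findAll(m)); // [fix, rig, rag]
    }
}
```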

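【13.6.8】Regular expressions and Java I/O

1）How can a regex be used to search a file for matches?

2）The example below is modeled on the Unix grep utility: it takes two arguments, a file name and a regex, and prints each matched part together with its position in the line.

// An example of Matcher.reset(CharSequence)
import java.util.regex.*;
import net.mindview.util.*; // TextFile, from the book's utility library

public class JGrep {
    public static void main(String[] args) throws Exception {
        // The command-line arguments are overwritten with fixed values for this demo
        args = new String[2];
        args[1] = "[a-z]+"; // the regex
        Pattern p = Pattern.compile(args[1]);
        // Iterate through the lines of the input file:
        int index = 1;
        Matcher m = p.matcher(""); // start with an empty input string
        args[0] = MyConstant.path + "JGrep.java"; // path of the file to search
        for (String line : new TextFile(args[0])) {
            /* reset(line) loads the next line into the existing Matcher */
            m.reset(line);
            while (m.find())
                System.out.println(index++ + " , " + m.group() + " , start = "
                    + m.start());
        }
    }
}

【Code notes】An empty Matcher is created once outside the loop, and Matcher.reset() is called inside the for loop to load each line of input. Reusing one Matcher instead of creating a new one per line is a small performance optimization.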
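The same reuse pattern works without the book's TextFile helper. The following self-contained sketch searches the lines of an in-memory string; the class MiniGrep, its grep() method and the sample text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniGrep {
    // Returns "lineNumber: match @start" for every match in every line of text.
    static List<String> grep(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(""); // created once, reused per line
        List<String> hits = new ArrayList<>();
        int lineNo = 0;
        for (String line : text.split("\n")) {
            lineNo++;
            m.reset(line); // point the existing Matcher at the next line
            while (m.find())
                hits.add(lineNo + ": " + m.group() + " @" + m.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Sir Robin of Camelot\nsignal sounds among the stars";
        // [1: Sir @0, 2: signal @0, 2: sounds @7, 2: stars @24]
        System.out.println(grep("\\b[Ss]\\w+", text));
    }
}
```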

【13.7】Scanning input

1）How do you read data from a file or from standard input? The traditional approach is to read a line of text, tokenize it, and then parse the pieces with the parse methods of Integer, Double and so on.

【Example: scanning input the original way, with BufferedReader】

// Scanning input the original way
import java.io.*;

public class SimpleRead {
    public static BufferedReader input = new BufferedReader(
        new StringReader("Sir Robin of Camelot\n22 1.61803"));

    public static void main(String[] args) {
        try {
            System.out.println("\n1.What is your name?");
            String name = input.readLine();
            System.out.println(name); // Sir Robin of Camelot
            System.out.println("\n2.input: <age> <double>");
            String numbers = input.readLine();
            System.out.println("input.readLine() = " + numbers); // 22 1.61803
            String[] numArray = numbers.split(" ");
            int age = Integer.parseInt(numArray[0]); // 22
            double favorite = Double.parseDouble(numArray[1]); // 1.61803
            System.out.format("Hi %s.\n", name);
            System.out.format("In 5 years you will be %d.\n", age + 5);
            System.out.format("My favorite double is %f.", favorite / 2);
        } catch (IOException e) {
            System.err.println("I/O exception");
        }
    }
}
/*

1.What is your name?
Sir Robin of Camelot

2.input: <age> <double>
input.readLine() = 22 1.61803
Hi Sir Robin of Camelot.
In 5 years you will be 27.
My favorite double is 0.809015.
*/

【Code notes】The code above has an obvious drawback: when the integer and the double are on the same line, the line must first be split by hand before each value can be parsed.

For this reason Java SE5 added the Scanner class, which takes much of the work out of scanning input.

【Example: scanning input with Scanner, introduced in Java SE5】

// Scanning input with Scanner
import java.util.*;

public class BetterRead {
    public static void main(String[] args) {
        // Scanner accepts any Readable as its input source
        Scanner stdin = new Scanner(SimpleRead.input);
        System.out.println("What is your name?");
        // Reading, tokenizing and parsing are all hidden inside the various next methods
        String name = stdin.nextLine(); // nextLine() returns a String
        System.out.println(name);
        System.out.println("How old are you? What is your favorite double?");
        System.out.println("(input: <age> <double>)");
        // Scanner reads the int and the double directly
        int age = stdin.nextInt();
        double favorite = stdin.nextDouble();
        System.out.println(age);
        System.out.println(favorite);
        System.out.format("Hi %s.\n", name);
        System.out.format("In 5 years you will be %d.\n", age + 5);
        System.out.format("My favorite double is %f.", favorite / 2);
    }
}
/*
What is your name?
Sir Robin of Camelot
How old are you? What is your favorite double?
(input: <age> <double>)
22
1.61803
Hi Sir Robin of Camelot.
In 5 years you will be 27.
My favorite double is 0.809015.
*/

【Code notes】

1）The Scanner constructors accept many kinds of input source: a File, an InputStream, a String, or any Readable object. Readable is an interface introduced in Java SE5.
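As a quick illustration of the constructor overloads, the sketch below feeds the same summing routine from a String, a Readable and an InputStream. The class ScannerSources and its sum() helper are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerSources {
    // Adds up every int token the scanner can read.
    static int sum(Scanner sc) {
        int total = 0;
        while (sc.hasNextInt())
            total += sc.nextInt();
        return total;
    }

    public static void main(String[] args) {
        // String source
        System.out.println(sum(new Scanner("1 2 3")));                   // 6
        // Readable source (StringReader is a Reader, and Reader implements Readable)
        System.out.println(sum(new Scanner(new StringReader("4 5 6")))); // 15
        // InputStream source, with an explicit charset name
        byte[] bytes = "7 8".getBytes(StandardCharsets.UTF_8);
        System.out.println(sum(new Scanner(new ByteArrayInputStream(bytes), "UTF-8"))); // 15
    }
}
```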

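【13.7.1】Scanner delimiters (a delimiter is the separator between tokens; this concept is important)

1）By default Scanner splits the input on whitespace, but a custom regex can be set as the delimiter.

2）An example that sets a custom delimiter regex with Scanner.useDelimiter(regex)

// Use a regex to set the delimiter for Scanner
// (\s matches a whitespace character, \S matches a non-whitespace character)
import java.util.*;

public class ScannerDelimiter {
    public static void main(String[] args) {
        Scanner scanner = new Scanner("12, 42, 78, 99, 42");
        scanner.useDelimiter("\\s*,\\s*"); // a comma with optional surrounding whitespace
        while (scanner.hasNextInt())
            System.out.print(scanner.nextInt() + " ");
    }
}
/* Output:
12 42 78 99 42
*/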

【13.7.2】Scanning with regular expressions

1）Besides scanning for primitive types, Scanner can also scan for tokens that match a regex you define.

【Example: using a regex to scan threat data recorded in a log file】

// Scanning an input string with Scanner combined with a regular expression
import java.util.*;
import java.util.regex.*;

public class ThreatAnalyzer {
    static String threatData = "58.27.82.161@02/10/2005\n"
        + "204.45.234.40@02/11/2005\n" + "58.27.82.161@02/11/2005\n"
        + "58.27.82.161@02/12/2005\n" + "58.27.82.161@02/12/2005\n"
        + "[Next log section with different data format]";

    public static void main(String[] args) {
        Scanner scanner = new Scanner(threatData);
        String pattern = "(\\d+[.]\\d+[.]\\d+[.]\\d+)@(\\d{2}/\\d{2}/\\d{4})"; // the regex
        /* Note the Scanner methods used here: hasNext(), next(), match() */
        while (scanner.hasNext(pattern)) {
            scanner.next(pattern);
            MatchResult match = scanner.match();
            String ip = match.group(1);
            String date = match.group(2);
            // PrintStream.format() writes to the stream (here the console) and returns it
            System.out.format("Threat on %s from %s\n", date, ip);
        }
    }
}
/*
Threat on 02/10/2005 from 58.27.82.161
Threat on 02/11/2005 from 204.45.234.40
Threat on 02/11/2005 from 58.27.82.161
Threat on 02/12/2005 from 58.27.82.161
Threat on 02/12/2005 from 58.27.82.161
*/

// Source of PrintStream.format()
public PrintStream format(String format, Object ... args) {
    try {
        synchronized (this) {
            ensureOpen();
            if ((formatter == null)
                || (formatter.locale() != Locale.getDefault()))
                formatter = new Formatter((Appendable) this);
            formatter.format(Locale.getDefault(), format, args);
        }
    } catch (InterruptedIOException x) {
        Thread.currentThread().interrupt();
    } catch (IOException x) {
        trouble = true;
    }
    return this;
}

【Code notes】

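Scanner.next(pattern): returns the next token if it matches the given pattern.

Scanner.match(): returns the MatchResult of the most recent scanning operation.

Note: when scanning with Scanner and a regex, the pattern is matched against the next input token only. If your regex contains the delimiter, it will never match.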
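The pitfall is easy to reproduce. In this sketch (the class ScannerTokenPitfall, its helper and the sample inputs are made up for illustration), a pattern containing spaces fails under a whitespace delimiter but succeeds once the delimiter is changed to a line break, so that a whole line becomes one token.

```java
import java.util.Scanner;

class ScannerTokenPitfall {
    // Does the next token, as defined by the delimiter, match the regex?
    static boolean nextTokenMatches(String input, String delimiter, String regex) {
        return new Scanner(input).useDelimiter(delimiter).hasNext(regex);
    }

    public static void main(String[] args) {
        // true: the token "58.27.82.161@02/10/2005" contains no delimiter
        System.out.println(nextTokenMatches(
            "58.27.82.161@02/10/2005 next", "\\s+", "\\S+@\\S+"));
        // false: the next token is only "58.27.82.161", so a regex with spaces cannot match
        System.out.println(nextTokenMatches(
            "58.27.82.161 @ 02/10/2005", "\\s+", "\\S+ @ \\S+"));
        // true: with a line-break delimiter the whole line is a single token
        System.out.println(nextTokenMatches(
            "58.27.82.161 @ 02/10/2005", "\n", "\\S+ @ \\S+"));
    }
}
```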

【13.8】StringTokenizer

1）Regular expressions were introduced in J2SE 1.4.

2）The Scanner class was introduced in Java SE5.

Before regex and Scanner were available, the usual way to split a string into parts was to tokenize it with StringTokenizer.

Because regex and Scanner can split strings with far more complex patterns, StringTokenizer is effectively obsolete.

【An example with StringTokenizer anyway】

Comparing the tokens produced by StringTokenizer, regex and Scanner:

import java.util.*;

public class ReplacingStringTokenizer {
    public static void main(String[] args) {
        String input = "But I'm not dead yet! I feel happy!";
        /* Tokenize with StringTokenizer */
        StringTokenizer stoke = new StringTokenizer(input);
        System.out.print("StringTokenizer: ");
        while (stoke.hasMoreElements())
            System.out.print(stoke.nextToken() + " ");
        /* Tokenize with String.split() and a regex */
        System.out.print("\nString.split() with a regex: ");
        System.out.println(Arrays.toString(input.split("\\W+")));
        /* Tokenize with Scanner; its default delimiter is whitespace */
        System.out.print("Scanner with a whitespace delimiter: ");
        Scanner scanner = new Scanner(input);
        scanner.useDelimiter("\\s+"); // set the delimiter explicitly to runs of whitespace
        while (scanner.hasNext())
            System.out.print(scanner.next() + ";");
    }
}
/*
StringTokenizer: But I'm not dead yet! I feel happy! 
String.split() with a regex: [But, I, m, not, dead, yet, I, feel, happy]
Scanner with a whitespace delimiter: But;I'm;not;dead;yet!;I;feel;happy!;
*/
