thinking-in-java (13): Strings

Published: 2023/12/3

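【13.1】Immutable Strings

1) String objects are immutable: they are read-only.

【Example: String objects are immutable】

import static net.mindview.util.Print.*; // print() comes from the book's utility library

public class Immutable {
    public static String upcase(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        String q = "howdy";
        print(q);  // howdy
        String qq = upcase(q);
        print(qq); // HOWDY
        print(q);  // howdy (the original String is unchanged)
    }
}
/* Output:
howdy
HOWDY
howdy
*/

【Code notes】

When q is passed to upcase(), what is actually passed is a copy of the reference held in q (called s inside the method). The String object itself is neither copied nor modified.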

// String.toUpperCase(Locale) source
public String toUpperCase(Locale locale) {
    if (locale == null) {
        throw new NullPointerException();
    }

    int firstLower;
    final int len = value.length;

    /* Now check if there are any characters that need to be changed. */
    scan: {
        for (firstLower = 0 ; firstLower < len; ) {
            int c = (int)value[firstLower];
            int srcCount;
            if ((c >= Character.MIN_HIGH_SURROGATE)
                    && (c <= Character.MAX_HIGH_SURROGATE)) {
                c = codePointAt(firstLower);
                srcCount = Character.charCount(c);
            } else {
                srcCount = 1;
            }
            int upperCaseChar = Character.toUpperCaseEx(c);
            if ((upperCaseChar == Character.ERROR)
                    || (c != upperCaseChar)) {
                break scan;
            }
            firstLower += srcCount;
        }
        return this;
    }

    /* result may grow, so i+resultOffset is the write location in result */
    int resultOffset = 0;
    char[] result = new char[len]; /* may grow */

    /* Just copy the first few upperCase characters. */
    System.arraycopy(value, 0, result, 0, firstLower);

    String lang = locale.getLanguage();
    boolean localeDependent =
            (lang == "tr" || lang == "az" || lang == "lt");
    char[] upperCharArray;
    int upperChar;
    int srcChar;
    int srcCount;
    for (int i = firstLower; i < len; i += srcCount) {
        srcChar = (int)value[i];
        if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
                (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
            srcChar = codePointAt(i);
            srcCount = Character.charCount(srcChar);
        } else {
            srcCount = 1;
        }
        if (localeDependent) {
            upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
        } else {
            upperChar = Character.toUpperCaseEx(srcChar);
        }
        if ((upperChar == Character.ERROR)
                || (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
            if (upperChar == Character.ERROR) {
                if (localeDependent) {
                    upperCharArray =
                            ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
                } else {
                    upperCharArray = Character.toUpperCaseCharArray(srcChar);
                }
            } else if (srcCount == 2) {
                resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
                continue;
            } else {
                upperCharArray = Character.toChars(upperChar);
            }

            /* Grow result if needed */
            int mapLen = upperCharArray.length;
            if (mapLen > srcCount) {
                char[] result2 = new char[result.length + mapLen - srcCount];
                System.arraycopy(result, 0, result2, 0, i + resultOffset);
                result = result2;
            }
            for (int x = 0; x < mapLen; ++x) {
                result[i + resultOffset + x] = upperCharArray[x];
            }
            resultOffset += (mapLen - srcCount);
        } else {
            result[i + resultOffset] = (char)upperChar;
        }
    }
    return new String(result, 0, len + resultOffset);
}
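The point about passing a copy of the reference can be checked with a small sketch. The class and method names here (ImmutabilitySketch, shout) are made up for illustration and are not from the book:

```java
class ImmutabilitySketch {
    // Reassigning the parameter only changes the method's local copy of the reference.
    static String shout(String s) {
        s = s.toUpperCase(); // s now refers to a new String; the caller's variable is untouched
        return s;
    }

    public static void main(String[] args) {
        String q = "howdy";
        String loud = shout(q);
        System.out.println(q);    // howdy
        System.out.println(loud); // HOWDY
    }
}
```

Even though shout() assigns to its parameter, the caller's q still refers to the original "howdy" object.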


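【13.2】The overloaded + operator vs. StringBuilder

1) Overloading means that an operator is given a special meaning when it is applied to a particular class.

(Attention: the + and += used with String are the only two overloaded operators in Java, and Java does not allow programmers to overload any operator.)

【Example: the overloaded + for strings】

// For the + operator, the compiler actually creates a StringBuilder;
// its append() calls implement the overloaded +
public class Concatenation {
    public static void main(String[] args) {
        String mango = "mango";
        String s = "abc" + mango + "def" + 47;
        System.out.println(s);
    }
}
/* Output:
abcmangodef47
*/

【Code notes】

If + were implemented naively, every step of the concatenation would create an intermediate String that later has to be garbage collected, which would perform very poorly. For a single expression like the one above, the compiler avoids this by using a StringBuilder, as the decompiled code shows.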

2) Decompiling Concatenation with javap

E:\bench-cluster\spring_in_action_eclipse\AThinkingInJava\src>javap -c chapter13.Concatenation
Compiled from "Concatenation.java"
public class chapter13.Concatenation {
  public chapter13.Concatenation();
    Code:
       0: aload_0
       1: invokespecial #1   // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #2   // String mango
       2: astore_1
       3: new           #3   // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4   // Method java/lang/StringBuilder."<init>":()V
      10: ldc           #5   // String abc
      12: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      15: aload_1
      16: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      19: ldc           #7   // String def
      21: invokevirtual #6   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      24: bipush        47
      26: invokevirtual #8   // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
      29: invokevirtual #9   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      32: astore_2
      33: getstatic     #10  // Field java/lang/System.out:Ljava/io/PrintStream;
      36: aload_2
      37: invokevirtual #11  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      40: return
}

【Code notes】

Offset 3: the compiler automatically introduces the java.lang.StringBuilder class even though the source code never mentions it, because building the string this way is clearly more efficient.


3) How far can the compiler optimize String handling?

// StringBuilder.append() is what implements the + operator
public class WhitherStringBuilder {
    // Approach 1: concatenate with += on a String
    public String implicit(String[] fields) {
        String result = "";
        for (int i = 0; i < fields.length; i++)
            result += fields[i]; // (inefficient) a StringBuilder is created implicitly
        return result;
    }
    // The implicit StringBuilder is created inside the loop,
    // so every iteration creates a new StringBuilder object.

    // Approach 2: use a single StringBuilder explicitly (more efficient)
    public String explicit(String[] fields) {
        StringBuilder result = new StringBuilder(); // (efficient) created once, explicitly
        for (int i = 0; i < fields.length; i++)
            result.append(fields[i]);
        return result.toString();
    }
}

4) A note on StringBuilder: you can give a StringBuilder an initial capacity. If you know roughly how long the final string will be, specifying the capacity up front avoids repeatedly reallocating the buffer.
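A minimal sketch of pre-sizing, assuming the total length can be computed cheaply beforehand. PresizedBuilder and join are hypothetical names; StringBuilder(int capacity) is the standard constructor:

```java
class PresizedBuilder {
    // Hypothetical helper: concatenates all fields after reserving the exact capacity.
    static String join(String[] fields) {
        int total = 0;
        for (String f : fields) {
            total += f.length();
        }
        // The argument is the initial capacity of the buffer, not an initial length.
        StringBuilder sb = new StringBuilder(total);
        for (String f : fields) {
            sb.append(f);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] { "abc", "mango", "def" })); // abcmangodef
    }
}
```

Whether the extra pass over the input pays off depends on the sizes involved; for a handful of short strings the default capacity is usually fine.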


【StringBuilder example】

5) If toString() needs a loop, it is best to create the StringBuilder yourself.

import java.util.Random;

/* Example: a loop inside toString() */
public class UsingStringBuilder {
    public static Random rand = new Random(47);

    public String toString() {
        StringBuilder result = new StringBuilder("[");
        for (int i = 0; i < 25; i++) {
            result.append(rand.nextInt(100));
            result.append(", ");
        }
        result.delete(result.length() - 2, result.length()); // remove the last two characters (", ")
        result.append("]");
        return result.toString();
    }

    public static void main(String[] args) {
        UsingStringBuilder usb = new UsingStringBuilder();
        System.out.println(usb);
    }
}
/* Output:
[58, 55, 93, 61, 61, 29, 68, 0, 22, 7, 88, 28, 51, 89, 9, 78, 98, 61, 20, 58, 16, 40, 11, 22, 4]
*/

6) StringBuilder methods include insert(), replace(), substring(), reverse() and delete(); the most commonly used are append() and toString().

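7) StringBuilder and StringBuffer

7.1) StringBuilder: not thread-safe, faster (introduced in Java SE 5);

7.2) StringBuffer: thread-safe, and therefore slower because of the synchronization overhead (this was the class used before Java SE 5);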


【13.3】Unintended recursion

1) Every Java class ultimately inherits from Object, so all container classes have a toString() method. A container's toString() produces a string that represents both the container itself and the objects it holds.

【Example】

import java.util.Iterator;
import java.util.Random;

// Coffee and Generator<T> come from the book's sources and are not shown here.
class Latte extends Coffee {}
class Americano extends Coffee {}
class Cappuccino extends Coffee {}
class Mocha extends Coffee {}
class Breve extends Coffee {}

public class CoffeeGenerator implements Generator<Coffee>, Iterable<Coffee> {
    private Class[] types = { Latte.class, Mocha.class, Cappuccino.class,
            Americano.class, Breve.class, };
    private static Random rand = new Random(47);

    public CoffeeGenerator() {}

    // For iteration:
    private int size = 0;

    public CoffeeGenerator(int sz) {
        size = sz;
    }

    public Coffee next() {
        try {
            return (Coffee) types[rand.nextInt(types.length)].newInstance();
            // Report programmer errors at run time:
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    class CoffeeIterator implements Iterator<Coffee> { // inner iterator class
        int count = size;

        public boolean hasNext() {
            return count > 0;
        }

        public Coffee next() {
            count--;
            return CoffeeGenerator.this.next();
        }

        public void remove() { // Not implemented
            throw new UnsupportedOperationException();
        }
    };

    public Iterator<Coffee> iterator() { // returns the iterator
        return new CoffeeIterator();
    }

    public static void main(String[] args) {
        CoffeeGenerator gen = new CoffeeGenerator();
        for (int i = 0; i < 10; i++)
            System.out.print(gen.next() + " ");
    }
}
/* Output:
Americano 0 Latte 1 Americano 2 Mocha 3 Mocha 4 Breve 5 Americano 6 Latte 7 Cappuccino 8 Cappuccino 9
*/


【Example: printing an object's address from toString()】

import java.util.ArrayList;
import java.util.List;

// Infinite recursion fills up the JVM stack, and then an error is thrown
public class InfiniteRecursion {
    @Override
    public String toString() {
        // Using this here is what causes the infinite recursion:
        // return " InfiniteRecursion address: " + this + "\n"; // Exception in thread "main" java.lang.StackOverflowError
        return " InfiniteRecursion address: " + super.toString() + "\n";
    }

    public static void main(String[] args) {
        List<InfiniteRecursion> v = new ArrayList<InfiniteRecursion>();
        for (int i = 0; i < 10; i++)
            v.add(new InfiniteRecursion());
        System.out.println(v);
    }
}
/* Output:
[ InfiniteRecursion address: chapter13.InfiniteRecursion@15db9742
,  InfiniteRecursion address: chapter13.InfiniteRecursion@6d06d69c
,  InfiniteRecursion address: chapter13.InfiniteRecursion@7852e922
,  InfiniteRecursion address: chapter13.InfiniteRecursion@4e25154f
,  InfiniteRecursion address: chapter13.InfiniteRecursion@70dea4e
,  InfiniteRecursion address: chapter13.InfiniteRecursion@5c647e05
,  InfiniteRecursion address: chapter13.InfiniteRecursion@33909752
,  InfiniteRecursion address: chapter13.InfiniteRecursion@55f96302
,  InfiniteRecursion address: chapter13.InfiniteRecursion@3d4eac69
,  InfiniteRecursion address: chapter13.InfiniteRecursion@42a57993
]
*/

【Code notes】 In the commented-out version an automatic type conversion takes place, from InfiniteRecursion to String. A string precedes this and a newline string follows it, so the compiler converts this to a String by calling this.toString(). That call is made from inside toString() itself, so the method recurses without end until the JVM stack is full and a StackOverflowError is thrown. Replacing this with super.toString() makes the program run correctly.


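【13.4】Operations on Strings

1) The basic methods of String objects are listed below:

2) Whenever a String method needs to change the content, it returns a new String object; if nothing needs to change, it returns a reference to the original object.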


【13.5】Formatting output

【13.5.1】The printf() method

【13.5.2】The System.out.format() method: format() is available on both PrintStream and PrintWriter objects.

【Example: output formatting with System.out.format()】

// Output formatting with System.out.format()
public class SimpleFormat {
    public static void main(String[] args) {
        int x = 5;
        double y = 5.332542;
        // The old way:
        System.out.println("Row 1: [" + x + " " + y + "]");
        // The new way:
        System.out.format("Row 1: [%d %f]\n", x, y); // format() example
        // or
        System.out.printf("Row 1: [%d %f]\n", x, y); // printf() example
    }
}
/* Output:
Row 1: [5 5.332542]
Row 1: [5 5.332542]
Row 1: [5 5.332542]
*/

【Note】 PrintStream.printf() simply calls format().

// PrintStream.printf() source
public PrintStream printf(String format, Object ... args) {
    return format(format, args);
}

// System.out is declared as a final static PrintStream field of System
public final static PrintStream out = null;

【13.5.3】The java.util.Formatter class

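1) All formatting is handled by the java.util.Formatter class.

1.1) A Formatter is a translator: it turns a format string plus data into the desired result;

1.2) The Formatter constructor takes the destination the output should go to; the most commonly used destinations are PrintStream, OutputStream and File;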
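Besides streams and files, a Formatter can also write into any Appendable such as a StringBuilder, which is handy when the result is needed as a String. The sketch below assumes Locale.ROOT for predictable output; FormatterToBuffer and describe are illustrative names only:

```java
import java.util.Formatter;
import java.util.Locale;

class FormatterToBuffer {
    // Formats into an in-memory buffer instead of System.out.
    static String describe(String name, int x, int y) {
        StringBuilder sink = new StringBuilder();
        // Formatter is Closeable, so try-with-resources releases it when done.
        try (Formatter f = new Formatter(sink, Locale.ROOT)) {
            f.format("%s The Turtle is at (%d,%d)", name, x, y);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("Tommy", 0, 0)); // Tommy The Turtle is at (0,0)
    }
}
```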

【Formatter example】

import java.io.PrintStream;
import java.util.Formatter;

// Formatter example
public class Turtle {
    private String name;
    private Formatter f;

    public Turtle(String name, Formatter f) {
        this.name = name;
        this.f = f;
    }

    public void move(int x, int y) {
        f.format("%s The Turtle is at (%d,%d)\n", name, x, y);
    }

    public static void main(String[] args) {
        PrintStream outAlias = System.out;
        // new Formatter(dest) sets the output destination
        Turtle tommy = new Turtle("Tommy", new Formatter(System.out));
        Turtle terry = new Turtle("Terry", new Formatter(outAlias));
        tommy.move(0, 0);
        terry.move(4, 8);
        tommy.move(3, 4);
        terry.move(2, 5);
        tommy.move(3, 3);
        terry.move(3, 3);
    }
}
/* Output:
Tommy The Turtle is at (0,0)
Terry The Turtle is at (4,8)
Tommy The Turtle is at (3,4)
Terry The Turtle is at (2,5)
Tommy The Turtle is at (3,3)
Terry The Turtle is at (3,3)
*/

// Formatter constructor
public Formatter(PrintStream ps) {
    this(Locale.getDefault(Locale.Category.FORMAT),
         (Appendable)Objects.requireNonNull(ps));
}


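【13.5.4】Format specifiers (such as %d, %s)

1) Controlling spacing and alignment: output is right-aligned by default; the - flag makes it left-aligned;

2) Format specifier syntax: %[argument_index$][flags][width][.precision]conversion

2.1) argument_index: the position of the argument to use;

2.2) flags: for example - (left-align) or + (always print a sign);

2.3) width: the minimum field width;

2.4) precision: for strings, the maximum number of characters to output; for floating-point numbers, the number of digits after the decimal point (6 by default; padded with zeros if there are fewer, rounded if there are more); it cannot be applied to integers (an exception is thrown);

2.5) conversion: the conversion character: d, c, b, s, f, e, x, h, %;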
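The pieces of the specifier syntax can be tried out in isolation. This is a small sketch using String.format with Locale.ROOT (an assumption, chosen so the decimal separator is always a dot); SpecifierSketch, row and swapped are made-up names:

```java
import java.util.Locale;

class SpecifierSketch {
    static String row(String name, int qty, double price) {
        // %-10.5s : left-aligned, width 10, at most 5 characters of the string
        // %4d     : right-aligned (the default), width 4
        // %8.2f   : width 8, two digits after the decimal point
        return String.format(Locale.ROOT, "%-10.5s|%4d|%8.2f|", name, qty, price);
    }

    static String swapped(String a, String b) {
        // argument_index$ selects which argument a specifier refers to
        return String.format("%2$s %1$s", a, b);
    }

    public static void main(String[] args) {
        System.out.println(row("Porridge", 7, 14.289)); // Porri     |   7|   14.29|
        System.out.println(swapped("a", "b"));          // b a
    }
}
```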


【Format specifier example】

import java.util.Formatter;

// Left (-) and right (default) alignment of formatted output
public class Receipt {
    private double total = 0;
    private Formatter f = new Formatter(System.out);

    public void printTitle() {
        // - means left-aligned
        f.format("%-15s %5s %10s\n", "Item", "Qty", "Price");
        f.format("%-15s %5s %10s\n", "----", "---", "-----");
    }

    public void print(String name, int qty, double price) {
        f.format("%-15.15s %5d %10.2f\n", name, qty, price);
        total += price;
    }

    public void printTotal() {
        f.format("%-15s %5s %10.2f\n", "Tax", "", total * 0.06);
        f.format("%-15s %5s %10s\n", "", "", "-----");
        f.format("%-15s %5s %10.2f\n", "Total", "", total * 1.06);
    }

    public static void main(String[] args) {
        Receipt receipt = new Receipt();
        receipt.printTitle();
        receipt.print("Jack's Magic Beans", 40, 4.25);
        receipt.print("Princess Peas", 311, 5.1);
        receipt.print("Three Bears Porridge", 1, 14.29);
        receipt.printTotal();
    }
}
/* Output:
Item              Qty      Price
----              ---      -----
Jack's Magic Be    40       4.25
Princess Peas     311       5.10
Three Bears Por     1      14.29
Tax                         1.42
                           -----
Total                      25.06
*/


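【13.5.5】Formatter conversions

1) The most commonly used conversions are listed below.

2) The conversion characters are: d, c, b, s, f, e, x, h, %;

d: integer (decimal);

c: Unicode character;

b: Boolean value;

s: String;

f: floating-point number (decimal);

e: floating-point number (scientific notation);

x: integer (hexadecimal);

h: hash code (hexadecimal);

%: a single % introduces a conversion; to output a literal % character, write %%;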


【Conversion example】

import java.math.BigInteger;
import java.util.Formatter;

/* How Formatter converts various data types */
public class Conversion {
    public static void main(String[] args) {
        Formatter f = new Formatter(System.out);

        char u = 'a';
        System.out.println("u = 'a'"); // u = 'a'
        f.format("%%s: %s\n", u); // %s: a
        f.format("%%c: %c\n", u); // %c: a
        f.format("%%b: %b\n", u); // %b: true
        f.format("%%h: %h\n", u); // %h: 61
        // f.format("d: %d\n", u); // java.util.IllegalFormatConversionException: d != java.lang.Character
        // f.format("f: %f\n", u); // java.util.IllegalFormatConversionException: f != java.lang.Character
        // f.format("e: %e\n", u); // java.util.IllegalFormatConversionException: e != java.lang.Character
        // f.format("x: %x\n", u); // java.util.IllegalFormatConversionException: x != java.lang.Character

        int v = 121;
        System.out.println();
        System.out.println("v = 121"); // v = 121
        f.format("%%d: %d\n", v); // %d: 121
        f.format("%%c: %c\n", v); // %c: y
        f.format("%%b: %b\n", v); // %b: true
        f.format("%%s: %s\n", v); // %s: 121
        f.format("%%x: %x\n", v); // %x: 79
        f.format("%%h: %h\n", v); // %h: 79
        // f.format("f: %f\n", v); // java.util.IllegalFormatConversionException: f != java.lang.Integer
        // f.format("e: %e\n", v); // java.util.IllegalFormatConversionException: e != java.lang.Integer

        BigInteger w = new BigInteger("50000000000000");
        System.out.println();
        System.out.println("w = new BigInteger(\"50000000000000\")"); // w = new BigInteger("50000000000000")
        f.format("%%d: %d\n", w); // %d: 50000000000000
        f.format("%%b: %b\n", w); // %b: true
        f.format("%%s: %s\n", w); // %s: 50000000000000
        f.format("%%x: %x\n", w); // %x: 2d79883d2000
        f.format("%%h: %h\n", w); // %h: 8842a1a7
        // f.format("c: %c\n", w); // java.util.IllegalFormatConversionException: c != java.math.BigInteger
        // f.format("f: %f\n", w); // java.util.IllegalFormatConversionException: f != java.math.BigInteger
        // f.format("e: %e\n", w); // java.util.IllegalFormatConversionException: e != java.math.BigInteger

        double x = 179.543;
        System.out.println();
        System.out.println("x = 179.543"); // x = 179.543
        f.format("%%b: %b\n", x); // %b: true
        f.format("%%s: %s\n", x); // %s: 179.543
        f.format("%%f: %f\n", x); // %f: 179.543000
        f.format("%%e: %e\n", x); // %e: 1.795430e+02 (scientific notation)
        f.format("%%h: %h\n", x); // %h: 1ef462c
        // f.format("d: %d\n", x); // java.util.IllegalFormatConversionException: d != java.lang.Double
        // f.format("c: %c\n", x); // java.util.IllegalFormatConversionException: c != java.lang.Double
        // f.format("x: %x\n", x); // java.util.IllegalFormatConversionException: x != java.lang.Double

        Conversion y = new Conversion();
        System.out.println();
        System.out.println("y = new Conversion()"); // y = new Conversion()
        f.format("%%b: %b\n", y); // %b: true
        f.format("%%s: %s\n", y); // %s: chapter13.Conversion@4aa298b7
        f.format("%%h: %h\n", y); // %h: 4aa298b7
        // f.format("d: %d\n", y); // java.util.IllegalFormatConversionException: d != chapter13.Conversion
        // f.format("c: %c\n", y); // java.util.IllegalFormatConversionException: c != chapter13.Conversion
        // f.format("f: %f\n", y); // java.util.IllegalFormatConversionException: f != chapter13.Conversion
        // f.format("e: %e\n", y); // java.util.IllegalFormatConversionException: e != chapter13.Conversion
        // f.format("x: %x\n", y); // java.util.IllegalFormatConversionException: x != chapter13.Conversion

        boolean z = false;
        System.out.println();
        System.out.println("z = false"); // z = false
        f.format("%%b: %b\n", z); // %b: false
        f.format("%%s: %s\n", z); // %s: false
        f.format("%%h: %h\n", z); // %h: 4d5
        // f.format("d: %d\n", z); // java.util.IllegalFormatConversionException: d != java.lang.Boolean
        // f.format("c: %c\n", z); // java.util.IllegalFormatConversionException: c != java.lang.Boolean
        // f.format("f: %f\n", z); // java.util.IllegalFormatConversionException: f != java.lang.Boolean
        // f.format("e: %e\n", z); // java.util.IllegalFormatConversionException: e != java.lang.Boolean
        // f.format("x: %x\n", z); // java.util.IllegalFormatConversionException: x != java.lang.Boolean
    }
}

【13.5.6】The String.format() method

1) String.format() takes the same arguments as Formatter.format(), but returns a String object.

【String.format() example】

public class DatabaseException extends Exception {
    public DatabaseException(int transactionID, int queryID, String message) {
        super(String.format("(t%d, q%d) %s", transactionID, queryID, message));
    }

    /*
     * String.format() source: it also just creates a Formatter object.
     *
     * public static String format(String format, Object... args) {
     *     return new Formatter().format(format, args).toString();
     * }
     */
    public static void main(String[] args) {
        try {
            throw new DatabaseException(3, 7, "Write failed");
        } catch (Exception e) {
            System.out.println(e);
            System.out.println(e.getMessage());
        }
    }
}
/* Output:
chapter13.DatabaseException: (t3, q7) Write failed
(t3, q7) Write failed
*/


2) A hex dump tool

【Example: printing a byte array in readable hexadecimal with String.format()】

import java.io.File;

// Hex dump tool
public class Hex {
    public static String format(byte[] data) {
        StringBuilder result = new StringBuilder();
        int n = 0;
        for (byte b : data) {
            if (n % 16 == 0)
                result.append(String.format("%05X: ", n)); // 5 hex digits, zero-padded
            result.append(String.format("%02X ", b)); // 2 hex digits, zero-padded
            n++;
            if (n % 16 == 0)
                result.append("\n");
        }
        result.append("\n");
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0)
            // MyConstant.path is the author's own constant (not shown here)
            System.out.println(format(BinaryFile.read(MyConstant.path + "Hex.class")));
        else
            System.out.println(format(BinaryFile.read(new File(args[0]))));
    }
}
/* Output:
00000: CA FE BA BE 00 00 00 34 00 58 0A 00 05 00 26 07
00010: 00 27 0A 00 02 00 26 08 00 28 07 00 29 0A 00 2A
......
*/

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class BinaryFile {
    public static byte[] read(File bFile) throws IOException {
        BufferedInputStream bf = new BufferedInputStream(new FileInputStream(bFile));
        try {
            byte[] data = new byte[bf.available()];
            bf.read(data);
            return data;
        } finally {
            bf.close();
        }
    }

    public static byte[] read(String bFile) throws IOException {
        return read(new File(bFile).getAbsoluteFile());
    }
}

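【13.6】Regular expressions (regex)

【13.6.1】Basics

1) Java treats the backslash '\' differently

1.1) In other languages, \\ means: insert a literal backslash '\' into the regex;

1.2) In Java, \\ means: insert a regex backslash, so the character after it has a special meaning;

2) Examples of backslashes in Java:

2.1) regex for a digit: \\d;

2.2) regex for a literal backslash: \\\\;

2.3) regex for newline and tab: \n\t (single backslash, no extra escaping needed);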
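The three backslash cases above can be checked directly with String.matches(). A minimal sketch; the class and method names are illustrative only:

```java
class BackslashSketch {
    // "\\d" in Java source is the two-character regex \d (a digit).
    static boolean isSingleDigit(String s) {
        return s.matches("\\d");
    }

    // "a\\\\b" in Java source is the regex a\\b, which matches the three characters a \ b.
    static boolean isABackslashB(String s) {
        return s.matches("a\\\\b");
    }

    // "\t" and "\n" are ordinary Java escapes; the regex receives a real tab and a real newline.
    static boolean isTabThenNewline(String s) {
        return s.matches("\t\n");
    }

    public static void main(String[] args) {
        System.out.println(isSingleDigit("7"));       // true
        System.out.println(isABackslashB("a\\b"));    // true
        System.out.println(isTabThenNewline("\t\n")); // true
    }
}
```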

3) The simplest way to use a regex is the functionality built into String: String.matches(regex);

4) Example of String's built-in regex matching

/* Example of String's built-in regex matching */
public class IntegerMatch {
    public static void main(String[] args) {
        System.out.println("-1234".matches("-?\\d+"));       // true
        System.out.println("5678".matches("-?\\d+"));        // true
        System.out.println("+911".matches("-?\\d+"));        // false
        System.out.println("+911".matches("(-|\\+)?\\d+"));  // true
    }
}
/* Output:
true
true
false
true
*/

// String.matches() source: it simply calls Pattern.matches()
public boolean matches(String regex) {
    return Pattern.matches(regex, this);
}

【Code notes】 (-|\\+)? means that the string may start with a - or a + (\\+ escapes the +, turning it into an ordinary character), or with neither (? means zero or one occurrence).

5) String.split(regex) cuts the string at every place where the regex matches; the regex can be as simple as a single space.

【Example: splitting a string with String.split(regex)】

/* Example: splitting a string with String.split(regex) */
public class Splitting {
    public static String knights = "Then, when you have found the shrubbery, you must "
            + "cut down the mightiest tree in the forest... "
            + "with... a herring!";

    public static void split(String regex) {
        String[] array = knights.split(regex);
        for (String s : array) {
            System.out.print(s + " ");
        }
    }

    public static void main(String[] args) {
        System.out.println("knights = \"" + knights + "\"\n");
        split(" "); // split on spaces; the argument doesn't have to contain regex chars
        System.out.println();
        split("\\W+"); // (uppercase W) split on non-word characters
        System.out.println();
        split("n\\W+"); // (uppercase W) split on 'n' followed by non-word characters
        // split() does not modify the original String; it creates new Strings
        System.out.println("\nknights = \"" + knights + "\"\n");
    }
    // Whatever the split is based on is removed from the result.
}
/* Output:
knights = "Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!"

Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!
Then when you have found the shrubbery you must cut down the mightiest tree in the forest with a herring
The whe you have found the shrubbery, you must cut dow the mightiest tree i the forest... with... a herring!
knights = "Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!"
*/

【Code notes】

1) \W: matches a non-word character;

2) \w: matches a word character;

6) The overloaded version of String.split() lets you limit the number of splits.
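A short sketch of the two-argument String.split(regex, limit). With a positive limit the result has at most limit elements, and the last element keeps the unsplit remainder. The class and method names are made up:

```java
import java.util.Arrays;

class SplitLimitSketch {
    // Splits at most once, so the result has at most two elements.
    static String[] headAndRest(String csv) {
        return csv.split(",", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("a,b,c,d".split(",")));  // [a, b, c, d]
        System.out.println(Arrays.toString(headAndRest("a,b,c,d"))); // [a, b,c,d]
    }
}
```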

7) Replacing text with a regex: you can replace only the first substring that matches the regex, or every place it matches.

【Example: replacing text with a regex (replaceFirst, replaceAll)】

import static net.mindview.util.Print.*; // print() comes from the book's utility library

/* Replacing text with a regex (replaceFirst, replaceAll) */
public class Replacing {
    static String s = Splitting.knights;

    public static void main(String[] args) {
        System.out.println(s);
        // The first 'y' followed by one or more word characters is replaced with Tom (once only)
        print(s.replaceFirst("y\\w+", "Tom")); // String.replaceFirst example
        System.out.println();
        // Every shrubbery, tree or herring is replaced with banana
        print(s.replaceAll("shrubbery|tree|herring", "banana")); // String.replaceAll example
    }
}
/* Output:
Then, when you have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!
Then, when Tom have found the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!

Then, when you have found the banana, you must cut down the mightiest banana in the forest... with... a banana!
*/


【13.6.2】Creating regular expressions

1) Regex characters, character classes, logical operators and boundary matchers;

【Example: matching a character sequence with a regex】

// Pattern matching with regular expressions
public class Rudolph {
    public static void main(String[] args) {
        for (String pattern : new String[] { "Rudolph", "[rR]udolph",
                "[rR][aeiou][a-z]ol.*", "R.*" })
            System.out.println("Rudolph".matches(pattern)); // all true: every pattern matches
    }
}
/* String.matches(String regex) source:
public boolean matches(String regex) {
    return Pattern.matches(regex, this);
}
*/


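【13.6.3】Quantifiers

1) A quantifier describes how a pattern absorbs the input text:

Greedy (the default): finds as many matching characters as possible;

Reluctant (quantifier followed by ?): matches the minimum number of characters needed to satisfy the pattern;

Possessive (quantifier followed by +): like greedy, but keeps no intermediate state, so it never backtracks;

【Note】: the expression X usually has to be enclosed in parentheses so that the quantifier applies to all of it;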
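The difference between the three kinds of quantifier shows up clearly on a small input. In this sketch, firstMatch is a hypothetical helper that returns the first substring a regex finds, or null if it finds nothing:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierSketch {
    // Returns the first match of regex in input, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        System.out.println(firstMatch("<.+>", input));  // greedy:     <a><b>
        System.out.println(firstMatch("<.+?>", input)); // reluctant:  <a>
        System.out.println(firstMatch("<.++>", input)); // possessive: null
    }
}
```

The possessive version finds nothing because .++ swallows everything up to the end of the input, including the final >, and never gives a character back for the closing > in the pattern.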


【CharSequence: a sequence of characters】 The CharSequence interface abstracts a general definition of a character sequence out of the CharBuffer, String, StringBuffer and StringBuilder classes. Many regex operations accept arguments of type CharSequence.

/* CharSequence interface source */
public interface CharSequence {
    int length();

    char charAt(int index);

    CharSequence subSequence(int start, int end);

    public String toString();

    public default IntStream chars() {
        class CharIterator implements PrimitiveIterator.OfInt {
            int cur = 0;

            public boolean hasNext() {
                return cur < length();
            }

            public int nextInt() {
                if (hasNext()) {
                    return charAt(cur++);
                } else {
                    throw new NoSuchElementException();
                }
            }

            @Override
            public void forEachRemaining(IntConsumer block) {
                for (; cur < length(); cur++) {
                    block.accept(charAt(cur));
                }
            }
        }

        return StreamSupport.intStream(() ->
                Spliterators.spliterator(
                        new CharIterator(),
                        length(),
                        Spliterator.ORDERED),
                Spliterator.SUBSIZED | Spliterator.SIZED | Spliterator.ORDERED,
                false);
    }

    public default IntStream codePoints() {
        class CodePointIterator implements PrimitiveIterator.OfInt {
            int cur = 0;

            @Override
            public void forEachRemaining(IntConsumer block) {
                final int length = length();
                int i = cur;
                try {
                    while (i < length) {
                        char c1 = charAt(i++);
                        if (!Character.isHighSurrogate(c1) || i >= length) {
                            block.accept(c1);
                        } else {
                            char c2 = charAt(i);
                            if (Character.isLowSurrogate(c2)) {
                                i++;
                                block.accept(Character.toCodePoint(c1, c2));
                            } else {
                                block.accept(c1);
                            }
                        }
                    }
                } finally {
                    cur = i;
                }
            }

            public boolean hasNext() {
                return cur < length();
            }

            public int nextInt() {
                final int length = length();

                if (cur >= length) {
                    throw new NoSuchElementException();
                }
                char c1 = charAt(cur++);
                if (Character.isHighSurrogate(c1) && cur < length) {
                    char c2 = charAt(cur);
                    if (Character.isLowSurrogate(c2)) {
                        cur++;
                        return Character.toCodePoint(c1, c2);
                    }
                }
                return c1;
            }
        }

        return StreamSupport.intStream(() ->
                Spliterators.spliteratorUnknownSize(
                        new CodePointIterator(),
                        Spliterator.ORDERED),
                Spliterator.ORDERED,
                false);
    }
}
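Because Pattern.matcher() takes a CharSequence, the same code works on a String, a StringBuilder or any other implementation. A small sketch with made-up names (CharSequenceSketch, firstNumber):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceSketch {
    // Accepts any CharSequence, not just String.
    static String firstNumber(CharSequence text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("abc123"));                  // 123
        System.out.println(firstNumber(new StringBuilder("x42y"))); // 42
    }
}
```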


【13.6.4】Pattern and Matcher

1) How do you build a more powerful regex object?

step1: Pattern.compile(regex) compiles the regex and produces a Pattern object;

step2: Pattern.matcher(input string) produces a Matcher object;

2) A Matcher object offers many methods, as the following example shows:

import static net.mindview.util.Print.*; // print() comes from the book's utility library
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Testing regular expressions with Pattern and Matcher
public class TestRegularExpression {
    public static void main(String[] args) {
        String[] array = { "aabbcc", "aab", "aab+", "(b+)" };
        for (String arg : array) {
            System.out.println();
            print("Regular expression: \"" + arg + "\"");
            Pattern p = Pattern.compile(arg); // step1: a Pattern is the compiled regular expression
            Matcher m = p.matcher("aabbcc");  // step2: the Pattern searches the input and produces a Matcher
            while (m.find()) {
                print("Match \"" + m.group()             // the matched substring
                        + "\" at positions " + m.start() // index where the match starts
                        + "-" + (m.end() - 1));          // index where the match ends
            }
        }
    }
}
/* Output:

Regular expression: "aabbcc"
Match "aabbcc" at positions 0-5

Regular expression: "aab"
Match "aab" at positions 0-2

Regular expression: "aab+"
Match "aabb" at positions 0-3

Regular expression: "(b+)"
Match "bb" at positions 2-3
*/

【Code notes】 A Pattern object represents a compiled regular expression; it is a regex object with more capabilities than the plain String methods offer.


【Pattern: the compiled regex】

1) Pattern provides a static matches() method. Internally it calls Pattern.compile(regex) to create a Pattern object, then pattern.matcher(input) to create a Matcher object, and finally returns the result of matcher.matches(), that is, whether the whole input matches the regex.

// Pattern.matches() source
public static boolean matches(String regex, CharSequence input) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);
    return m.matches();
}

// Pattern.compile() source
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

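2) Pattern methods:

split(): splits the input string at every place the regex matches and returns the resulting array of strings;

pattern(): returns the regex string the Pattern was compiled from;

3) Matcher methods:

boolean matches();        // does the entire input string match the regex?
boolean lookingAt();      // does the beginning of the input (not necessarily all of it) match the regex?
boolean find();           // finds successive matches in the CharSequence input
boolean find(int start);  // resets the matcher and searches the input starting at index start
String group();           // returns the substring of the input matched by the previous match

【Example: Matcher.find()】

import static net.mindview.util.Print.*; // printnb() comes from the book's utility library
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Finding {
    public static void main(String[] args) {
        // step1: compile the regex to get a Pattern
        // step2: the Pattern searches the input string and produces a Matcher
        Matcher m = Pattern.compile("\\w+").matcher( // (lowercase w) matches word characters
                "Evening is full of the linnet's wings");
        while (m.find())
            printnb(m.group() + ", ");
        System.out.println("\n======");
        int i = 0;
        while (m.find(i)) {
            printnb(m.group() + " \n");
            i++;
        }
    }
}
/* Output (the second part prints one match per line; it is shown space-separated here):
Evening, is, full, of, the, linnet, s, wings,
======
Evening vening ening ning ing ng g is is s full full ull ll l of of f
the the he e linnet linnet innet nnet net et t s s wings wings ings ngs gs s
*/

【Code notes】 The pattern \\w+ splits the input into words. find() walks forward through the input string; find(int start) uses start as the index at which the search begins.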


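4) Groups: a group is a part of the regex delimited by parentheses, and it can be referred to by its group number. Group 0 is the whole expression, group 1 is the first parenthesized group, and so on.

【Group example】

A(B(C))D has 3 groups:

Group 0: ABCD;

Group 1: BC

Group 2: C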
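The numbering above can be confirmed in code. Note that Matcher.groupCount() does not count group 0, so it returns 2 for this pattern. GroupNumberingSketch and groups are illustrative names:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingSketch {
    // Returns group 0..groupCount() for a full match, or an empty array if there is none.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1]; // +1 for group 0
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("A(B(C))D", "ABCD"))); // [ABCD, BC, C]
    }
}
```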

【Example: regex groups】

import static net.mindview.util.Print.*; // print() and printnb() come from the book's utility library
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Groups {
    static public final String POEM = "Twas brillig, and the slithy toves\n"
            + "Did gyre and gimble in the wabe.\n"
            + "All mimsy were the borogoves,\n"
            + "And the mome raths outgrabe.\n\n"
            + "Beware the Jabberwock, my son,\n"
            + "The jaws that bite, the claws that catch.\n"
            + "Beware the Jubjub bird, and shun\n"
            + "The frumious Bandersnatch.";

    public static void main(String[] args) {
        // \S is a non-whitespace character, \s a whitespace character; parentheses delimit groups.
        // The goal is to capture the last three words of every line.
        // (?m) is the multiline flag: it makes $ match at the end of each line, not only at the end of the input.
        Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
                .matcher(POEM); // match the regex against the input string POEM
        while (m.find()) {
            for (int j = 0; j <= m.groupCount(); j++)
                printnb("[" + m.group(j) + "]");
            print();
        }
    }
}
/* Output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[mome raths outgrabe.][mome][raths outgrabe.][raths][outgrabe.]
[Jabberwock, my son,][Jabberwock,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
*/

5) The start() and end() methods:

5.1) Return values: start() returns the start index of the previous match, and end() returns the index of the last matched character plus one;

5.2) If the match operation failed (or no match has been attempted yet), calling start() or end() throws an IllegalStateException;
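A brief sketch of both points: the indices returned after a successful find(), and the exception after a failed one. The class and helper names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartEndSketch {
    // Returns {start, end} of the first match; end is one past the last matched character.
    static int[] span(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(), m.end() };
    }

    // True when start() throws because the preceding find() did not succeed.
    static boolean startThrowsAfterFailedFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find();
        try {
            m.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] s = span("\\d+", "ab12cd");
        System.out.println(s[0] + "-" + s[1]);                          // 2-4
        System.out.println(startThrowsAfterFailedFind("\\d+", "abcd")); // true
    }
}
```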


【Example: Matcher methods】

import static net.mindview.util.Print.*; // print() comes from the book's utility library
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartEnd {
    public static String input = "As long as there is injustice, whenever a\n"
            + "Targathian baby cries out, wherever a distress\n"
            + "signal sounds among the stars ... We'll be there.\n"
            + "This fine ship, and this fine crew ...\n"
            + "Never give up! Never surrender!";

    private static class Display {
        private boolean regexPrinted = false;
        private String regex;

        Display(String regex) {
            this.regex = regex;
        }

        void display(String message) {
            if (!regexPrinted) {
                // print(regex);
                regexPrinted = true;
            }
            print(message);
        }
    }

    /* Checks how the input string s matches the regex */
    static void examine(String s, String regex) {
        Display d = new Display(regex);
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        /* find() scans the input; each search starts where the previous match ended */
        while (m.find())
            /* Matcher.group() returns the substring of the input that matched the regex */
            d.display("find() '" + m.group() + "' start = " + m.start()
                    + " end = " + m.end());
        /* Does the beginning of the input match the regex? */
        if (m.lookingAt()) { // No reset() necessary
            System.out.println("\n m.lookingAt() : ");
            d.display("lookingAt() start = " + m.start() + " end = " + m.end());
        }
        /* Does the entire input match the regex? */
        if (m.matches()) // No reset() necessary
            d.display("matches() start = " + m.start() + " end = " + m.end());
    }

    public static void main(String[] args) {
        int i = 0;
        for (String in : input.split("\n")) {
            System.out.println("[" + ++i + "]====================================");
            print("input : " + in);
            int j = 0;
            for (String regex : new String[] { "\\w*ere\\w*", "\\w*ever",
                    "T\\w+", "Never.*?!" }) {
                System.out.println("regex" + ++j + " = " + regex);
                examine(in, regex);
            }
        }
    }
}
/* Output:
[1]====================================
input : As long as there is injustice, whenever a
regex1 = \w*ere\w*
find() 'there' start = 11 end = 16
regex2 = \w*ever
find() 'whenever' start = 31 end = 39
regex3 = T\w+
regex4 = Never.*?!
[2]====================================
input : Targathian baby cries out, wherever a distress
regex1 = \w*ere\w*
find() 'wherever' start = 27 end = 35
regex2 = \w*ever
find() 'wherever' start = 27 end = 35
regex3 = T\w+
find() 'Targathian' start = 0 end = 10

 m.lookingAt() : 
lookingAt() start = 0 end = 10
regex4 = Never.*?!
[3]====================================
input : signal sounds among the stars ... We'll be there.
regex1 = \w*ere\w*
find() 'there' start = 43 end = 48
regex2 = \w*ever
regex3 = T\w+
regex4 = Never.*?!
[4]====================================
input : This fine ship, and this fine crew ...
regex1 = \w*ere\w*
regex2 = \w*ever
regex3 = T\w+
find() 'This' start = 0 end = 4

 m.lookingAt() : 
lookingAt() start = 0 end = 4
regex4 = Never.*?!
[5]====================================
input : Never give up! Never surrender!
regex1 = \w*ere\w*
regex2 = \w*ever
find() 'Never' start = 0 end = 5
find() 'Never' start = 15 end = 20

 m.lookingAt() : 
lookingAt() start = 0 end = 5
regex3 = T\w+
regex4 = Never.*?!
find() 'Never give up!' start = 0 end = 14
find() 'Never surrender!' start = 15 end = 31

 m.lookingAt() : 
lookingAt() start = 0 end = 14
matches() start = 0 end = 31
*/

【Code notes: Matcher methods】

1) find(): looks for a match of the regex anywhere in the input string, while find(int start) starts looking at index start of the input string;

2) lookingAt(): checks whether the input string matches the regex starting from its very beginning;

3) matches(): checks whether the entire input string matches the regex;


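【Pattern flags】

1) The overloaded version of Pattern.compile() lets you adjust how the regex matches:

// Pattern.compile(String, int) source
public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}

2) The flags argument above describes the matching behavior and must be built from the constants of the Pattern class.

3) Commonly used Pattern flags:

3.1) Pattern.CASE_INSENSITIVE: matching ignores case;

3.2) Pattern.MULTILINE: multiline mode; ^ and $ match at the beginning and end of every line, not only at the beginning and end of the whole input;

3.3) Pattern.COMMENTS: whitespace in the pattern is ignored, and comments starting with # are ignored up to the end of the line;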
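The flags can be combined with | or used one at a time. The sketch below is only an illustration: the phone-number pattern and the names FlagSketch, PHONE, LINE_START and countLineStarts are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagSketch {
    // COMMENTS: whitespace and "# ..." comments inside the pattern are ignored.
    static final Pattern PHONE = Pattern.compile(
            "(\\d{3})  # first three digits\n"
          + "-?        # optional dash\n"
          + "(\\d{4})  # last four digits",
            Pattern.COMMENTS);

    // CASE_INSENSITIVE | MULTILINE: ^ matches at the start of every line, ignoring case.
    static final Pattern LINE_START = Pattern.compile(
            "^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    static int countLineStarts(String text) {
        Matcher m = LINE_START.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(PHONE.matcher("555-1234").matches());    // true
        System.out.println(countLineStarts("java\nJava\nnot java")); // 2
    }
}
```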

【Example: Pattern flags】

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/* Pattern flags example */
public class ReFlags {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE
                | Pattern.MULTILINE);
        // Pattern.CASE_INSENSITIVE: matching ignores case;
        // Pattern.MULTILINE: ^ matches at the start of every line of the input;
        Matcher m = p.matcher("java has regex\nJava has regex\n"
                + "JAVA has pretty good regular expressions\n"
                + "Regular expressions are in Java");
        /* Matcher.find() looks for a match of the regex anywhere in the input */
        while (m.find())
            System.out.println(m.group()); // m.group() returns the matched substring
    }
}
/* Output:
java
Java
JAVA
*/

【Note】 A Pattern represents the compiled regex.


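【13.6.5】The Pattern.split() method

1) Pattern.split() splits the input string into an array of String objects; the boundaries are determined by the regex, and the delimiters themselves are removed from the result.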

// Source of Pattern.split(CharSequence input)
public String[] split(CharSequence input) {
    return split(input, 0);
}

// Source of Pattern.split(CharSequence input, int limit)
public String[] split(CharSequence input, int limit) {
    int index = 0;
    boolean matchLimited = limit > 0;
    ArrayList<String> matchList = new ArrayList<>();
    Matcher m = matcher(input);

    // Add segments before each match found
    while(m.find()) {
        if (!matchLimited || matchList.size() < limit - 1) {
            if (index == 0 && index == m.start() && m.start() == m.end()) {
                // no empty leading substring included for zero-width match
                // at the beginning of the input char sequence.
                continue;
            }
            String match = input.subSequence(index, m.start()).toString();
            matchList.add(match);
            index = m.end();
        } else if (matchList.size() == limit - 1) { // last one
            String match = input.subSequence(index,
                input.length()).toString();
            matchList.add(match);
            index = m.end();
        }
    }

    // If no match was found, return this
    if (index == 0)
        return new String[] {input.toString()};

    // Add remaining segment
    if (!matchLimited || matchList.size() < limit)
        matchList.add(input.subSequence(index, input.length()).toString());

    // Construct result
    int resultSize = matchList.size();
    if (limit == 0)
        while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
            resultSize--;
    String[] result = new String[resultSize];
    return matchList.subList(0, resultSize).toArray(result);
}

【Example: splitting an input string with Pattern.split()】

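// Test case for Pattern.split()
import java.util.regex.*;
import java.util.*;
import static net.mindview.util.Print.*; // print() helper from the book's utility library

public class SplitDemo {
    public static void main(String[] args) {
        String input = "This!!unusual use!!of exclamation!!points";
        // split(input) is split(input, 0): no limit on the number of splits.
        print(Arrays.toString(Pattern.compile("!!").split(input)));
        // A limit of 3 caps the result array at 3 elements, so only the
        // first two "!!" are used as boundaries.
        // Note: the delimiters themselves are removed from the result.
        print(Arrays.toString(Pattern.compile("!!").split(input, 3)));
    }
}
/*
[This, unusual use, of exclamation, points]
[This, unusual use, of exclamation!!points]
*/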
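The limit parameter has three distinct behaviors that follow from the split() source shown earlier: a positive limit caps the array size, zero drops trailing empty strings, and a negative limit keeps them. The sketch below illustrates this with a comma delimiter; SplitLimitSketch and splitOnComma() are hypothetical names, not part of the book.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitSketch {
    // Hypothetical helper: splits on a literal comma with the given limit.
    static String[] splitOnComma(String input, int limit) {
        return Pattern.compile(",").split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,,c,,";
        // limit == 0: unlimited splits, trailing empty strings are removed.
        System.out.println(Arrays.toString(splitOnComma(input, 0)));  // [a, b, , c]
        // limit == 2: at most 2 elements; the last one holds the unsplit rest.
        System.out.println(Arrays.toString(splitOnComma(input, 2)));  // [a, b,,c,,]
        // limit < 0: unlimited splits, trailing empty strings are kept.
        System.out.println(Arrays.toString(splitOnComma(input, -1))); // [a, b, , c, , ]
    }
}
```

The empty string in the middle of the input is kept in all three cases; only trailing empty strings are affected by the limit.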


【13.6.6】Replace operations

1) Example of Matcher.appendReplacement() and Matcher.appendTail()

public class TheReplacements {
    public static void main(String[] args) throws Exception {
        // TextFile is the book's net.mindview.util helper;
        // MyConstant.path is the author's own constant.
        String s = TextFile.read(MyConstant.path + "TheReplacements.java");
        // The two comment lines below describe the regex and give a sample.
        // They are kept in the original Chinese because this program reads
        // its own source file, so the output shown below depends on them.
        // 匹配在 /*! 和 !*/ 之間的所有文字。
        // 如 /*! 今天 2017 年11月26日 , i love you. !*/
        Matcher mInput = Pattern.compile("/\\*!(.*)!\\*/", Pattern.DOTALL).matcher(s);
        if (mInput.find()) {
            s = mInput.group(1); // the text captured by the parentheses
            System.out.println("matched.");
            System.out.println("s1 = " + s);
        }
        // Replace two or more spaces with a single space
        // (tab characters are not affected):
        s = s.replaceAll(" {2,}", " ");
        System.out.println("after s.replaceAll(\" {2,}\", \" \"), s2 = " + s);
        // Remove one or more spaces at the beginning of each line.
        // MULTILINE mode must be enabled, here with the embedded flag (?m):
        s = s.replaceAll("(?m)^ +", "");
        System.out.println("after s = s.replaceAll(\"(?m)^ +\", \"\"), s3 = " + s);
        // Replace the first vowel found with "(Tr)";
        // this calls String.replaceFirst().
        s = s.replaceFirst("[aeiou]", "(Tr)");
        System.out.println("after s.replaceFirst(\"[aeiou]\", \"(Tr)\"), s4 = " + s);
        // Build the pattern, that is, the compiled regex.
        StringBuffer sbuf = new StringBuffer();
        Pattern p = Pattern.compile("[aeiou]");
        Matcher m = p.matcher(s);
        // Process the find information as you perform the replacements:
        while (m.find())
            m.appendReplacement(sbuf, m.group().toUpperCase()); // upper-case each vowel found
        print("s = " + s); // s itself is unchanged
        // Put in the remainder of the text:
        m.appendTail(sbuf); // copy the unprocessed tail into sbuf
        print(sbuf); // sbuf now holds s with every matched vowel upper-cased
    }
}
// Output:
matched.
s1 = 和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , i love you.
after s.replaceAll(" {2,}", " "), s2 = 和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , i love you.
after s = s.replaceAll("(?m)^ +", ""), s3 = 和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , i love you.
after s.replaceFirst("[aeiou]", "(Tr)"), s4 = 和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , (Tr) love you.
s = 和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , (Tr) love you.
和 !*/ 之間的所有文字。// 如 /*! 今天 2017 年11月26日 , (Tr) lOvE yOU.

【Code notes】Methods used:

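1) Matcher.appendReplacement(StringBuffer sb, String replacement): appends to sb the input text between the previous match and the current one, followed by the replacement for the current match; because you call it once per match, you can compute a different replacement each time;

2) Matcher.appendTail(StringBuffer sb): after one or more calls to appendReplacement(), appendTail() copies the remainder of the input string into sb;

3) Matcher.replaceFirst(String replacement): replaces the first part of the input that matches the regex; internally it calls appendReplacement() once and then appendTail() once;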

// Source of Matcher.replaceFirst()
public String replaceFirst(String replacement) {
    if (replacement == null)
        throw new NullPointerException("replacement");
    reset();
    if (!find())
        return text.toString();
    StringBuffer sb = new StringBuffer();
    appendReplacement(sb, replacement);
    appendTail(sb);
    return sb.toString();
}

4) Matcher.replaceAll(String replacement): replaces every part of the input that matches the regex with replacement; internally it calls appendReplacement() once per match and then appendTail() once;

// Source of Matcher.replaceAll()
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}

5) The statement s = s.replaceFirst("[aeiou]", "(Tr)"); calls String.replaceFirst(), which simply delegates to Matcher.replaceFirst(). Its source is:

// Source of String.replaceFirst()
public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}
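The practical reason to call appendReplacement() yourself is that the replacement can be computed from each match, which replaceAll() with a fixed string cannot do. The sketch below shows that, plus the $n group references that replacement strings support. ReplacementSketch and its methods are hypothetical names; Matcher.quoteReplacement() is used so that a computed replacement containing $ or \ would be taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementSketch {
    // Doubles every integer in the input; the replacement depends on the match.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // quoteReplacement() escapes $ and \ so the text is inserted literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(doubled)));
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    // Reorders a yyyy-mm-dd date using group references in the replacement.
    static String reorderDate(String date) {
        return date.replaceAll("(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // 6 apples and 24 pears
        System.out.println(reorderDate("2017-11-26"));              // 26/11/2017
    }
}
```

StringBuffer is used here to match the JDK source quoted above; newer JDKs also offer StringBuilder overloads of these methods.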


【13.6.7】reset()

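1) reset(CharSequence) applies an existing Matcher object to a new character sequence; reset() with no arguments moves the Matcher back to the start of its current sequence;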

// reset(CharSequence) gives a Matcher a new input string
import java.util.regex.*;

public class Resetting {
    public static void main(String[] args) throws Exception {
        // The initial input string is "fix the rug with bags".
        Matcher m = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");
        while (m.find()) {
            System.out.println(m.group() + " " + m.start() + " to " + m.end());
        }
        System.out.println("\nafter m.reset(\"fix the rig with rags\") by regex [frb][aiu][gx]");
        // reset(CharSequence) replaces the input and restarts matching at its beginning.
        m.reset("fix the rig with rags");
        while (m.find())
            System.out.println(m.group() + " " + m.start() + " to " + m.end());
    }
}
/*
fix 0 to 3
rug 8 to 11
bag 17 to 20

after m.reset("fix the rig with rags") by regex [frb][aiu][gx]
fix 0 to 3
rig 8 to 11
rag 17 to 20
*/
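The two forms of reset() can be contrasted in a few lines. In this sketch (MatcherResetSketch and trace() are hypothetical names) the matcher is first rewound on the same input and then pointed at a new input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherResetSketch {
    // Records which match is current after each step.
    static List<String> trace() {
        List<String> seen = new ArrayList<>();
        Matcher m = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");
        m.find();
        seen.add(m.group());              // first match: fix
        m.find();
        seen.add(m.group());              // second match: rug
        m.reset();                        // same input, back to the beginning
        m.find();
        seen.add(m.group());              // fix again, not bag
        m.reset("fix the rig with rags"); // new input, also from the beginning
        m.find();
        m.find();
        seen.add(m.group());              // second match of the new input: rig
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(trace()); // [fix, rug, fix, rig]
    }
}
```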


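【13.6.8】Regular expressions and Java I/O

1) How can a regex be used to search a file for matches?

2) The JGrep example below is modeled on the Unix grep utility: it takes two arguments, a file name and a regex, and prints each match together with its position in the line;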

// Example of Matcher.reset(CharSequence)
public class JGrep {
    public static void main(String[] args) throws Exception {
        // The arguments are hard-coded here instead of being read from the command line.
        args = new String[2];
        args[1] = "[a-z]+"; // the regex
        Pattern p = Pattern.compile(args[1]);
        // Iterate through the lines of the input file:
        int index = 1;
        Matcher m = p.matcher(""); // start with an empty input; reset() replaces it below
        args[0] = MyConstant.path + "JGrep.java"; // path of the file to search
        for (String line : new TextFile(args[0])) {
            m.reset(line); // point the existing Matcher at the current line
            while (m.find())
                System.out.println(index++ + " , " + m.group() + " , start = "
                    + m.start());
        }
    }
}

【Code notes】An empty Matcher is created once outside the loop, and Matcher.reset(line) is called inside the for loop to load each line. This avoids creating a new Matcher for every line, which is a small performance optimization.
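JGrep depends on the book's TextFile class and the author's MyConstant.path. As a rough, self-contained variant of the same idea, the sketch below searches an in-memory list of lines with one reused Matcher; GrepSketch and grep() are hypothetical names, and the "lineNumber: match @start" format is an arbitrary choice for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GrepSketch {
    // Returns one "lineNumber: match @start" entry per match found.
    static List<String> grep(String regex, Iterable<String> lines) {
        Matcher m = Pattern.compile(regex).matcher(""); // created once, reused per line
        List<String> hits = new ArrayList<>();
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            m.reset(line); // load the next line into the same Matcher
            while (m.find())
                hits.add(lineNo + ": " + m.group() + " @" + m.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("no hit here", "regex reset");
        // Words that start with "re": none on line 1, two on line 2.
        System.out.println(grep("\\bre\\w*", lines)); // [2: regex @0, 2: reset @6]
    }
}
```

To search a real file, the lines could come from java.nio.file.Files.readAllLines(path) instead of a hard-coded list.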


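【13.7】Scanning input

1) How do you read data from a file or from standard input? The traditional approach: read a line of text, tokenize it, then use the parse methods of Integer, Double, and similar classes to convert the tokens;

【Example: basic input scanning with BufferedReader】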

// Example of basic input scanning
import java.io.*;

public class SimpleRead {
    public static BufferedReader input = new BufferedReader(
        new StringReader("Sir Robin of Camelot\n22 1.61803"));
    public static void main(String[] args) {
        try {
            System.out.println("\n1.What is your name?");
            String name = input.readLine();
            System.out.println(name); // Sir Robin of Camelot
            System.out.println("\n2.input: <age> <double>");
            String numbers = input.readLine();
            System.out.println("input.readLine() = " + numbers); // 22 1.61803
            String[] numArray = numbers.split(" ");
            int age = Integer.parseInt(numArray[0]); // 22
            double favorite = Double.parseDouble(numArray[1]); // 1.61803
            System.out.format("Hi %s.\n", name);
            System.out.format("In 5 years you will be %d.\n", age + 5);
            System.out.format("My favorite double is %f.", favorite / 2);
        } catch (IOException e) {
            System.err.println("I/O exception");
        }
    }
}
/*

1.What is your name?
Sir Robin of Camelot

2.input: <age> <double>
input.readLine() = 22 1.61803
Hi Sir Robin of Camelot.
In 5 years you will be 27.
My favorite double is 0.809015.
*/

【Code notes】The code above has an obvious drawback: when the int and the double are on the same line, the line has to be split by hand before the values can be parsed.

For this reason Java SE5 added the Scanner class, which takes much of the work out of scanning input.

【Example: scanning input with Scanner, introduced in Java SE5】

// Scanning input with Scanner
import java.util.*;

public class BetterRead {
    public static void main(String[] args) {
        // A Scanner can take any Readable object as its input.
        Scanner stdin = new Scanner(SimpleRead.input);
        System.out.println("What is your name?");
        // Reading, tokenizing and converting the input are all hidden
        // inside the various next methods.
        String name = stdin.nextLine(); // nextLine() returns a String
        System.out.println(name);
        System.out.println("How old are you? What is your favorite double?");
        System.out.println("(input: <age> <double>)");
        // Scanner reads the int and the double directly.
        int age = stdin.nextInt();
        double favorite = stdin.nextDouble();
        System.out.println(age);
        System.out.println(favorite);
        System.out.format("Hi %s.\n", name);
        System.out.format("In 5 years you will be %d.\n", age + 5);
        System.out.format("My favorite double is %f.", favorite / 2);
    }
}
/*
What is your name?
Sir Robin of Camelot
How old are you? What is your favorite double?
(input: <age> <double>)
22
1.61803
Hi Sir Robin of Camelot.
In 5 years you will be 27.
My favorite double is 0.809015.
*/

【Code notes】

1) Scanner has constructors for several kinds of input source, including File, InputStream, String, and any Readable object; Readable is an interface introduced in Java SE5;


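【13.7.1】Scanner delimiters (the delimiter, also called the separator, is an important concept)

1) By default, Scanner splits the input on whitespace, but you can specify your own regex as the delimiter;

2) Example: using Scanner.useDelimiter(regex) to set a custom regex as the delimiter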

// Use a regex to specify the Scanner delimiter
// (lowercase \s matches a whitespace character, uppercase \S a non-whitespace character)
import java.util.*;

public class ScannerDelimiter {
    public static void main(String[] args) {
        Scanner scanner = new Scanner("12, 42, 78, 99, 42");
        // Use a comma, with optional surrounding whitespace, as the delimiter.
        scanner.useDelimiter("\\s*,\\s*");
        while (scanner.hasNextInt())
            System.out.print(scanner.nextInt() + " ");
    }
}
/* Output:
12 42 78 99 42
*/
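To see why the custom delimiter is needed, compare what hasNextInt() does with and without it. The sketch below is only an illustration; ScannerDelimiterSketch and readInts() are hypothetical names, and passing null as the delimiter is this sketch's own convention for "keep the default".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDelimiterSketch {
    // Reads leading ints from the input; a null delimiterRegex keeps the
    // default whitespace delimiter.
    static List<Integer> readInts(String input, String delimiterRegex) {
        List<Integer> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            if (delimiterRegex != null)
                sc.useDelimiter(delimiterRegex);
            while (sc.hasNextInt())
                out.add(sc.nextInt());
        }
        return out;
    }

    public static void main(String[] args) {
        // Default delimiter: the first token is "12," which is not an int.
        System.out.println(readInts("12, 42, 78", null));           // []
        // Comma delimiter: the tokens are 12, 42 and 78.
        System.out.println(readInts("12, 42, 78", "\\s*,\\s*"));    // [12, 42, 78]
        // Whitespace-separated input works with the default delimiter.
        System.out.println(readInts("12 42 78", null));             // [12, 42, 78]
    }
}
```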


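【13.7.2】Scanning with regular expressions

1) Besides scanning for primitive types, you can also scan with your own regex patterns;

【Example: using a regex to scan threat data recorded in a log file】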

// Combining Scanner with a regular expression to scan an input string
import java.util.regex.*;
import java.util.*;

public class ThreatAnalyzer {
    static String threatData = "58.27.82.161@02/10/2005\n"
        + "204.45.234.40@02/11/2005\n" + "58.27.82.161@02/11/2005\n"
        + "58.27.82.161@02/12/2005\n" + "58.27.82.161@02/12/2005\n"
        + "[Next log section with different data format]";
    public static void main(String[] args) {
        Scanner scanner = new Scanner(threatData);
        // The regex: group 1 is the IP address, group 2 is the date.
        String pattern = "(\\d+[.]\\d+[.]\\d+[.]\\d+)@(\\d{2}/\\d{2}/\\d{4})";
        // Note the Scanner and MatchResult methods used: hasNext(), next(), match(), group().
        while (scanner.hasNext(pattern)) {
            scanner.next(pattern);
            MatchResult match = scanner.match();
            String ip = match.group(1);
            String date = match.group(2);
            // System.out.format() writes directly to the console and
            // returns the PrintStream itself.
            System.out.format("Threat on %s from %s\n", date, ip);
        }
    }
}
/*
Threat on 02/10/2005 from 58.27.82.161
Threat on 02/11/2005 from 204.45.234.40
Threat on 02/11/2005 from 58.27.82.161
Threat on 02/12/2005 from 58.27.82.161
Threat on 02/12/2005 from 58.27.82.161
*/

// Source of PrintStream.format()
public PrintStream format(String format, Object ... args) {
    try {
        synchronized (this) {
            ensureOpen();
            if ((formatter == null)
                || (formatter.locale() != Locale.getDefault()))
                formatter = new Formatter((Appendable) this);
            formatter.format(Locale.getDefault(), format, args);
        }
    } catch (InterruptedIOException x) {
        Thread.currentThread().interrupt();
    } catch (IOException x) {
        trouble = true;
    }
    return this;
}

【Code notes】

Scanner.next(pattern): returns the next token if it matches the given pattern;

Scanner.match(): returns the MatchResult of the last scanning operation, from which the captured groups can be read;

Note: when you scan with a regex, Scanner matches the pattern against the next input token only. If your regex contains a delimiter, it will never match.
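The delimiter caveat can be checked directly. In the sketch below the log format has spaces around the @ sign, so with the default whitespace delimiter the first token is only the IP address and the pattern cannot match; switching the delimiter to a newline makes each whole line one token. ScannerTokenSketch, nextTokenMatches() and the spaced log format are assumptions made for this illustration.

```java
import java.util.Scanner;

class ScannerTokenSketch {
    // A pattern that contains spaces, i.e. characters of the default delimiter.
    static final String SPACED = "\\d+[.]\\d+[.]\\d+[.]\\d+ @ \\d{2}/\\d{2}/\\d{4}";

    // Reports whether the next token matches tokenRegex; a null
    // delimiterRegex keeps the default whitespace delimiter.
    static boolean nextTokenMatches(String input, String delimiterRegex, String tokenRegex) {
        try (Scanner sc = new Scanner(input)) {
            if (delimiterRegex != null)
                sc.useDelimiter(delimiterRegex);
            return sc.hasNext(tokenRegex);
        }
    }

    public static void main(String[] args) {
        String log = "58.27.82.161 @ 02/10/2005\n204.45.234.40 @ 02/11/2005";
        // Default delimiter: the token is "58.27.82.161", so there is no match.
        System.out.println(nextTokenMatches(log, null, SPACED)); // false
        // Newline delimiter: the token is the whole first line, so it matches.
        System.out.println(nextTokenMatches(log, "\n", SPACED)); // true
    }
}
```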


【13.8】StringTokenizer (the string tokenizer)

1) Regular expressions were introduced in J2SE 1.4;

2) The Scanner class was introduced in Java SE5;

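Before regular expressions and Scanner were introduced, the standard way to split a string into parts was to tokenize it with StringTokenizer;

Because regular expressions and Scanner can split strings with much more complex patterns, StringTokenizer is effectively obsolete;

【Example of StringTokenizer, for completeness】

A comparison of the tokens produced by StringTokenizer, a regex, and Scanner: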

import java.util.*;

public class ReplacingStringTokenizer {
    public static void main(String[] args) {
        String input = "But I'm not dead yet! I feel happy!";
        StringTokenizer stoke = new StringTokenizer(input);
        /* Tokenize with StringTokenizer */
        System.out.print("Tokenizing with StringTokenizer: ");
        while (stoke.hasMoreElements())
            System.out.print(stoke.nextToken() + " ");
        /* Tokenize with String.split() and a regex */
        System.out.print("\nTokenizing with String.split() + regex: ");
        System.out.println(Arrays.toString(input.split("\\W+")));
        /* Tokenize with Scanner, whose default delimiter is whitespace */
        System.out.print("Tokenizing with Scanner, whitespace delimiter: ");
        Scanner scanner = new Scanner(input);
        scanner.useDelimiter("\\s+"); // set the delimiter to whitespace explicitly
        while (scanner.hasNext())
            System.out.print(scanner.next() + ";");
    }
}
/*
Tokenizing with StringTokenizer: But I'm not dead yet! I feel happy!
Tokenizing with String.split() + regex: [But, I, m, not, dead, yet, I, feel, happy]
Tokenizing with Scanner, whitespace delimiter: But;I'm;not;dead;yet!;I;feel;happy!;
*/
