Regular expressions: splitting a comma-separated string but ignoring commas in quotes

Categories: Java; Tagged with: ; @ January 30th, 2013 16:04

Requirement: parse a CSV line, splitting on commas but ignoring any commas inside quotes.

Solution:

public static void main(String[] args) {
	String s = "a,b,c,\"1,000\"";
	String[] result = s.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
	for (String str : result) {
		System.out.println(str);
	}
}

Output:
a
b
c
"1,000"
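
The lookahead in the pattern accepts a comma as a separator only when the rest of the line contains an even number of double quotes, that is, when the comma sits outside any quoted field. The sketch below is one way to wrap that idea in a helper; the class name CsvSplitSketch, the method splitLine, and the quote-stripping step are illustrative assumptions, not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class CsvSplitSketch {
    // A comma is a separator only if an even number of double quotes follows it.
    private static final Pattern SEPARATOR =
            Pattern.compile(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");

    // Hypothetical helper: splits one CSV line and strips enclosing quotes.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        // A limit of -1 keeps trailing empty fields, which the default split drops.
        for (String field : SEPARATOR.split(line, -1)) {
            // Strip one pair of enclosing quotes, if present.
            if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
                field = field.substring(1, field.length() - 1);
            }
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        for (String field : splitLine("a,b,c,\"1,000\"")) {
            System.out.println(field);
        }
    }
}
```

Because the lookahead rescans the remainder of the line for every comma, this approach suits short lines. It also does not handle quoted fields that span several lines, so a dedicated CSV parser is probably the better choice for large or irregular files.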

Java regex replacement: watch out for the $ and \ characters

Categories: Java; Tagged with: ; @ September 3rd, 2011 14:34

When using a regular expression to replace content, check whether the replacement string contains $ or \. For example:

Pattern pattern = Pattern.compile("\\{C0\\}");
Matcher matcher = pattern.matcher("Price: [{C0}].");
System.out.println(matcher.replaceAll("€6.99"));
System.out.println(matcher.replaceAll("$6.99"));
Output:
Price: [€6.99].
Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 6
at java.util.regex.Matcher.group(Unknown Source)
at java.util.regex.Matcher.appendReplacement(Unknown Source)
at java.util.regex.Matcher.replaceAll(Unknown Source)
at TestExcel2Xml.main(TestExcel2Xml.java:10)

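The first replaceAll works as expected, but the dollar sign in the second one causes trouble: "$6" is read as a reference to capture group 6, which the pattern does not have.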

Java API:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

Matcher.quoteReplacement(String) can be used to pre-process the replacement text: (API)
Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes ('\') and dollar signs ('$') will be given no special meaning.

Changed to:

Pattern pattern = Pattern.compile("\\{C0\\}");
Matcher matcher = pattern.matcher("Price: [{C0}].");
System.out.println(matcher.replaceAll("€6.99"));
System.out.println(matcher.replaceAll(Matcher.quoteReplacement("$6.99")));

Correct output:

Price: [€6.99].
Price: [$6.99].
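
For comparison, here is a small sketch of three ways to get a literal "$" or "\" into the output: Matcher.quoteReplacement, String.replace (which treats both arguments literally), and escaping the dollar sign by hand. The class name LiteralReplacementSketch, the helper fill, and the sample values are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplacementSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{C0\\}");

    // Hypothetical helper: regex match, but the value is inserted literally.
    static String fill(String template, String value) {
        return PLACEHOLDER.matcher(template)
                .replaceAll(Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String template = "Price: [{C0}].";

        // quoteReplacement neutralises both '$' and '\' in the value.
        System.out.println(fill(template, "$6.99"));
        System.out.println(fill(template, "C:\\temp"));

        // String.replace(CharSequence, CharSequence) involves no regex at all.
        System.out.println(template.replace("{C0}", "$6.99"));

        // Escaping by hand: the replacement string \$6.99 yields a literal '$'.
        System.out.println(PLACEHOLDER.matcher(template).replaceAll("\\$6.99"));
    }
}
```

When the text to find is a fixed string such as {C0}, String.replace is likely the simplest option; quoteReplacement matters once the search side really needs a pattern.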

Regular expressions: validating the format of an email address

Categories: Java; Tagged with: ; @ February 17th, 2011 15:50

The one I use myself:

\\w+([-.]\\w+)*@\\w+([-.]\\w+)*\\.[a-z]{2,3}

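It may not be perfect, but it works without major problems. Note that it only accepts a lowercase top-level domain of two or three letters.

Here is the Java test class to go with it: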

	// Regular expression used to validate email addresses
	private static final String REGEX_EMAIL = "\\w+([-.]\\w+)*@\\w+([-.]\\w+)*\\.[a-z]{2,3}";
	// Earlier variants, kept for reference:
	//   [\\w]+[\\w.]*@(\\w+\\.)+[A-Za-z]+
	//   \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*

	/**
	 * Prints whether the sample address matches REGEX_EMAIL.
	 * @param args unused
	 */
	public static void main(String[] args) {
		String s = "[email protected]";
		System.out.println(s.matches(REGEX_EMAIL));
	}
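
To see where the pattern draws the line, the sketch below runs it against a few made-up addresses on the reserved example.com/.org/.info names. The class name EmailRegexCheck, the helper isValid, and the sample addresses are assumptions for illustration only.

```java
class EmailRegexCheck {
    // Same pattern as REGEX_EMAIL in the test class above.
    static final String REGEX_EMAIL =
            "\\w+([-.]\\w+)*@\\w+([-.]\\w+)*\\.[a-z]{2,3}";

    // String.matches requires the whole input to match the pattern.
    static boolean isValid(String address) {
        return address.matches(REGEX_EMAIL);
    }

    public static void main(String[] args) {
        String[] samples = {
            "john.doe@example.com",   // plain address
            "john@mail.example.org",  // subdomain
            "john@example.info",      // four-letter top-level domain
            "john+tag@example.com",   // '+' in the local part
            "JOHN@EXAMPLE.COM"        // uppercase top-level domain
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

The first two samples are accepted. The last three are rejected: [a-z]{2,3} allows neither a four-letter nor an uppercase top-level domain, and "+" is not part of \w or [-.]. If those cases matter, one option is to compile the pattern with Pattern.CASE_INSENSITIVE and widen the final quantifier.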

Email Parser: extracts email addresses from the given content, removes duplicates automatically, and filters out specified mailboxes

Categories: Flex; Tagged with: ; @ September 10th, 2010 21:41

Features:

Parses email addresses out of the given text and displays them. Extracting QQ numbers is also supported.


Using regular expressions in PHP (example: finding email addresses in a string)

Categories: PHP; Tagged with: ; @ September 8th, 2010 20:29

The following code is a simple demonstration of using a regular expression in PHP to find email addresses in a string:

	/**
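	 * Finds all email addresses in a string using a regular expression.
	 * @param $str the text to search
	 * @return array the email addresses found, keyed by address so duplicates collapse.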
	 */
	public static function parseEmails($str) {
		$emails = array();
		preg_match_all("(([\w\.-]{1,})@([\w-]{1,}\.+[a-zA-Z]{2,}))", $str, $matches, PREG_PATTERN_ORDER);
		
		// var_dump($matches);
		
		foreach($matches[0] as $email) {
			$emails[$email] = $email;
 		}
 		return $emails;
	}

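$matches contains every group that was matched; the flags argument controls how the resulting array is ordered. With PREG_PATTERN_ORDER, as above, $matches[0] holds the full matches.

For details, see: http://php.net/manual/en/function.preg-match-all.php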
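
For readers working in Java, the same extract-and-deduplicate idea can be sketched with Matcher.find and a LinkedHashSet. This is a rough counterpart to the PHP function, not a port taken from the original post; the class name EmailExtractorSketch and the sample text are assumptions.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailExtractorSketch {
    // Same pattern as the PHP snippet, written as a Java string literal.
    private static final Pattern EMAIL =
            Pattern.compile("([\\w.-]+)@([\\w-]+\\.+[a-zA-Z]{2,})");

    // Returns each address once, in order of first appearance.
    static Set<String> parseEmails(String text) {
        Set<String> emails = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            // group() is the whole match, the counterpart of $matches[0] in PHP.
            emails.add(matcher.group());
        }
        return emails;
    }

    public static void main(String[] args) {
        String text = "Contact alice@example.com, bob@example.org or alice@example.com.";
        for (String email : parseEmails(text)) {
            System.out.println(email);
        }
    }
}
```

The set plays the role of the PHP array keyed by address. One limitation carried over from the pattern: the domain part stops after the first dot-separated label pair, so an address such as a@mail.example.com is captured only as a@mail.example.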

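Related reading:

An Eclipse plugin for writing and testing regular expressions, based on java.util.regex

Flex: replacing a String using regular expressions

Notes on using Java regular expressions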
