Why do I need to call find() twice?
I'm writing a program to parse a bunch of data (you can get an example of the data set itself here: https://explore.data.gov/Geography-and-Environment/Worldwide-M1-Earthquakes-Past-7-Days/7tag-iwnu).
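The following class works perfectly well, but I'm not sure why I need to call matcher.find() an extra time between each item in my parseEarthquake() method. Why is that? Is this a normal quirk I just have to deal with, or have I set up my pattern/matcher incorrectly?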
The method takes a string containing one line of the data (for example, nc,71958020,1,"Thursday, March 21, 2013 17:13:34 UTC",38.8367,-122.8298,1.4,2.60,28,"Northern California") and returns an Earthquake object for that data.
    import java.text.DecimalFormat;
    import java.text.FieldPosition;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.TimeZone;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Earthquake {
        String src="xx";
        String eqid="00000000";
        short version;
        long dateTime;
        float lat, lon;
        float mag, dep;
        short nst;
        String region="Nowhere";

        private Earthquake(){
            date.setTimeZone(TimeZone.getTimeZone("UTC"));
        }

        private static DecimalFormat
            coords = new DecimalFormat("##0.0000"),
            magnitude = new DecimalFormat("###0.0"),
            depth = new DecimalFormat("###0.00");
        private static SimpleDateFormat date = new SimpleDateFormat("'\"'EEEE', 'MMMM' 'dd', 'yyyy' 'HH':'mm':'ss' 'zzz'\"'");

        // Src, Eqid, Version, Datetime, Lat, Lon, Magnitude, Depth, NST, Region;
        public static Earthquake parseEarthquake(String string){
            Earthquake result = new Earthquake();
            Matcher matcher = Pattern.compile("(\".*?\")|([^,]*)").matcher(string);
            try {
                matcher.find(); result.src = matcher.group();
                matcher.find(); matcher.find(); result.eqid = matcher.group();
                matcher.find(); matcher.find(); result.version = Short.parseShort(matcher.group());
                matcher.find(); matcher.find(); result.dateTime = date.parse(matcher.group()).getTime();
                matcher.find(); matcher.find(); result.lat = coords.parse(matcher.group()).floatValue();
                matcher.find(); matcher.find(); result.lon = coords.parse(matcher.group()).floatValue();
                matcher.find(); matcher.find(); result.mag = magnitude.parse(matcher.group()).floatValue();
                matcher.find(); matcher.find(); result.dep = depth.parse(matcher.group()).floatValue();
                matcher.find(); matcher.find(); result.nst = Short.parseShort(matcher.group());
                matcher.find(); matcher.find(); result.region = matcher.group();
            } catch (ParseException e) {
                e.printStackTrace();
            } catch (NumberFormatException e) {
                e.printStackTrace();
            }
            return result;
        }

        public String toString(){
            StringBuffer buf = new StringBuffer();
            buf.append(src);
            buf.append(','); buf.append(eqid);
            buf.append(','); buf.append(version);
            buf.append(','); date.format(dateTime, buf, new FieldPosition(0));
            buf.append(','); coords.format(lat, buf, new FieldPosition(0));
            buf.append(','); coords.format(lon, buf, new FieldPosition(0));
            buf.append(','); magnitude.format(mag, buf, new FieldPosition(0));
            buf.append(','); depth.format(dep, buf, new FieldPosition(0));
            buf.append(','); buf.append(nst);
            buf.append(','); buf.append('"'); buf.append(region); buf.append('"');
            return buf.toString();
        }
    }
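One way to see what the extra find() call is doing is to print every match the pattern produces on a sample line. The sketch below is only a diagnostic. The class name FindDemo and the helper allMatches are made up for illustration, and the sample line is a shortened version of the one above. It runs the pattern from parseEarthquake() next to a variant that uses + instead of *, on the assumption that the zero-length match allowed by [^,]* is what consumes the extra call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the Earthquake code above.
public class FindDemo {
    // Collects every match find() reports, including zero-length ones.
    public static List<String> allMatches(String regex, String line) {
        List<String> out = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(line);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "nc,71958020,1,\"Thursday, March 21, 2013 17:13:34 UTC\",38.8367";
        // Pattern from the question: [^,]* is allowed to match the empty string.
        System.out.println(allMatches("(\".*?\")|([^,]*)", line));
        // Variant with + in place of *: each match needs at least one character.
        System.out.println(allMatches("(\".*?\")|([^,]+)", line));
    }
}
```

With the original pattern, each field is followed by an empty match found at the position of the comma (or at the end of the line), which would account for the second find(). The + variant does not report those empty matches, but it also skips fields that are genuinely empty, so the columns would shift on a line such as a,,b.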
By the way, as you may have guessed, there is more that I plan to add to this class. It is by no means finished. – AJMansfield 2013-03-25 18:15:13
If you use a library to read the CSV file format, you can remove most of this code (and the bugs). – 2013-03-25 18:15:47
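@IvanNevostruev Is that library part of the Java standard library? I intend to eventually use this as part of a processing.org sketch, so if it isn't in the standard library, it could prove very difficult. – AJMansfield 2013-03-25 18:18:58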
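The JDK does not ship a CSV parser, so where an external library is not an option, a small hand-written splitter is one possibility. What follows is a rough sketch under a few assumptions: fields are separated by commas, a field may be wrapped in double quotes, a doubled quote inside a quoted field stands for one literal quote, and a record never spans more than one line. The class name CsvLine and the method split are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stdlib-only helper; not a full CSV implementation.
public class CsvLine {
    // Splits one line into fields, honouring double-quoted fields.
    // Surrounding quotes are dropped from the returned values.
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(cur.toString()); // comma outside quotes ends the field
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "nc,71958020,1,\"Thursday, March 21, 2013 17:13:34 UTC\","
                + "38.8367,-122.8298,1.4,2.60,28,\"Northern California\"";
        for (String field : split(line)) {
            System.out.println(field);
        }
    }
}
```

Because this sketch strips the surrounding quotes, the SimpleDateFormat pattern in the Earthquake class, which expects the literal quote characters, would need them removed before it could parse the date field returned here.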