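Breaking up a lined-up list of words

I have a file that is a list of words, formatted as shown below. Because of the way it is formatted, when I open it in a program like Notepad the words do not appear to be separated at all, so to the human eye the first part looks like this: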

ATHROCYTESDISHLIKEIRRECOVERABLENESSESEMBRITTLEMENTSYOUNGSOVER

But when I copy and paste it, it comes out formatted like this:

ATHROCYTES 
    DISHLIKE 
    IRRECOVERABLENESSES 
    EMBRITTLEMENTS 
    YOUNGS 
    OVER

I want to load this file into an array so that I can sort it. I am struggling with how to break it up properly. I found that this code:

while (dis.available() != 0) { 
      System.out.println(dis.readLine()); 
     }

prints the document correctly formatted, just as if I had copied and pasted it. I tried using this code to load it into an array:

String[] store = sb.toString().split(",");

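Since there are no commas, the words are not separated correctly. Realizing this, I also tried this code, attempting to break it up at each new line: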

String[] store = sb.toString().split(scan.nextLine());

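Both give me the same result: the words are printed on a single line. Does anyone know how I can get my result into an array?

I have included the rest of my code, since the problem may originate somewhere else: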

public class InsertionSort { 

public static String[] InsertSort(String[] args) { 
    int i, j; 
    String key; 

    for (j = 1; j < args.length; j++) { //the condition has changed 
     key = args[j]; 
     i = j - 1; 
     while (i >= 0) { 
      if (key.compareTo(args[i]) > 0) {//here too 
       break; 
      } 
      args[i + 1] = args[i]; 
      i--; 
     } 
     args[i + 1] = key; 
     return args; 
    } 

    return args; 
} 

/** 
* @param args the command line arguments 
*/ 
public static void main(String[] args) throws FileNotFoundException, IOException { 
    Scanner scan = new Scanner(System.in); 
    System.out.println("Insertion Sort Test\n"); 


    int n; 
    String name, line; 


    System.out.println("Enter name of file to sort: "); 
    name = scan.next(); 

    BufferedReader reader = new BufferedReader(new FileReader(new File(name))); 
    //The StringBuffer will be used to create a string if your file has multiple lines 
    StringBuffer sb = new StringBuffer(); 

    File file = new File(name); 
    FileInputStream fis = null; 
    BufferedInputStream bis = null; 
    DataInputStream dis = null; 

    try { 
     fis = new FileInputStream(file); 

     // Here BufferedInputStream is added for fast reading. 
     bis = new BufferedInputStream(fis); 
     dis = new DataInputStream(bis); 

     // dis.available() returns 0 if the file does not have more lines. 
     while (dis.available() != 0) { 

    // this statement reads the line from the file and print it to 
      // the console. 
      System.out.println(dis.readLine()); 
     } 

     // dispose all the resources after using them. 
     fis.close(); 
     bis.close(); 
     dis.close(); 

    } catch (FileNotFoundException e) { 
     e.printStackTrace(); 
    } catch (IOException e) { 
     e.printStackTrace(); 
    } 

    while((line = reader.readLine())!= null){ 

    sb.append(line); 

} 

    //We now split the line on the "," to get a string array of the values 
    String[] store = sb.toString().split("/n"); 
    System.out.println(Arrays.toString(store)); 
    /* Call method sort */ 
    InsertSort(store); 

    n = store.length; 
    FileWriter fw = new FileWriter("sorted.txt"); 


for (int i = 0; i < store.length; i++) { 
    fw.write(store[i] + "\n"); 
} 
fw.close(); 
    } 

}

Source

2015-09-13 user3068177

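Have you tried Notepad++? It is better than Notepad. The lines are probably separated by newline characters (\n). That should be your delimiter. I am not familiar with Java, but this does look like your problem. –

I am only using Notepad because it is a .txt file. I am doing all of the code in NetBeans. That said, I tried editing my code to split on \n, which gives me: String[] store = sb.toString().split("/n"); but I still get the same result, they are all on the same line. – user3068177

Then you are using the wrong slash. Also, Notepad++ reads files better, which is why I suggested it. –

You have a premature return statement here: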

args[i + 1] = key; 
    return args; // the cause 
}

Remove it, and it should be fixed:

[ATHROCYTES, DISHLIKE, IRRECOVERABLENESSES, EMBRITTLEMENTS, YOUNGS, OVER] 

DISHLIKE -> ATHROCYTES = 3 
IRRECOVERABLENESSES -> DISHLIKE = 5 
EMBRITTLEMENTS -> IRRECOVERABLENESSES = -4 
EMBRITTLEMENTS -> DISHLIKE = 1 
YOUNGS -> IRRECOVERABLENESSES = 16 
OVER -> YOUNGS = -10 
OVER -> IRRECOVERABLENESSES = 6 

[ATHROCYTES, DISHLIKE, EMBRITTLEMENTS, IRRECOVERABLENESSES, OVER, YOUNGS]
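
To make the effect of that stray return visible, here is a minimal sketch that runs the same insertion loop with and without it. The class name, method names and the four sample words are only for illustration; the comparison logic mirrors the question's code.

```java
import java.util.Arrays;

class EarlyReturnDemo {

    // Insertion sort as in the question, with the stray "return" inside the for loop.
    static String[] sortWithEarlyReturn(String[] a) {
        for (int j = 1; j < a.length; j++) {
            String key = a[j];
            int i = j - 1;
            // Shift larger (or equal) elements one slot to the right.
            while (i >= 0 && key.compareTo(a[i]) <= 0) {
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
            return a; // leaves the method after the first pass (j == 1)
        }
        return a;
    }

    // The same loop with the stray return removed.
    static String[] sortFixed(String[] a) {
        for (int j = 1; j < a.length; j++) {
            String key = a[j];
            int i = j - 1;
            while (i >= 0 && key.compareTo(a[i]) <= 0) {
                a[i + 1] = a[i];
                i--;
            }
            a[i + 1] = key;
        }
        return a;
    }

    public static void main(String[] args) {
        String[] words = {"YOUNGS", "OVER", "DISHLIKE", "ATHROCYTES"};
        // Only the first two elements end up ordered relative to each other.
        System.out.println(Arrays.toString(sortWithEarlyReturn(words.clone())));
        // All elements are sorted.
        System.out.println(Arrays.toString(sortFixed(words.clone())));
    }
}
```

With the early return, only the first two entries are ever compared, so any array longer than two elements comes back mostly untouched.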

Full code:

public static String[] InsertSort(String[] args) { 
    int i, j; 
    String key; 

    System.out.println(Arrays.toString(args)); 

    for (j = 1; j < args.length; j++) { //the condition has changed 
    key = args[j]; 
    i = j - 1; 
    while (i >= 0) { 
     System.out.printf(" %s -> %s = %d\n", key, args[i], key.compareTo(args[i])); 
     if (key.compareTo(args[i]) > 0)//here too 
     break; 
     args[i + 1] = args[i]; 
     i--; 
    } 
    args[i + 1] = key; 
    } 

    return args; 
} 

public static void main(String[] args) throws FileNotFoundException, IOException { 
    Scanner scan = new Scanner(System.in); 
    System.out.println("Insertion Sort Test\n"); 

    System.out.println("Enter name of file to sort: "); 
    String name = scan.nextLine(); 

    File file = new File(name); 
    String sb = (new Scanner(file)).useDelimiter("\\Z").next(); 

    //Split the file contents on line breaks to get a list of the words
    List<String> list = Arrays.asList(sb.split("\n\r?")); 

    ArrayList<String> list2 = new ArrayList<>(); 
    list.stream().forEach((s) -> { 
    list2.add(s.trim()); 
    }); 

    System.out.println(list2); 
    /* Call method sort */ 
    String[] store = list2.toArray(new String[]{}); 

    InsertSort(store); 

    System.out.println(Arrays.asList(store)); 

    int n = store.length; 

    try (FileWriter fw = new FileWriter("sorted.txt")) { 
    StringBuilder b = new StringBuilder(); 
    for (String s: store) 
     b.append(s).append("\n"); 

    fw.write(b.toString()); 
    } 
}
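
A side note on why the original attempt kept producing one long string no matter which delimiter was passed to split(): BufferedReader.readLine() returns each line without its line terminator, so appending every line to a StringBuffer glues the words together and leaves nothing to split on. The sketch below shows the difference; it reads from an in-memory string instead of a file so that it is self-contained, and the class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {

    // Mirrors the question's loop: readLine() drops the line terminator,
    // so appending the lines produces one unbroken string.
    static String glue(String contents) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(contents))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Collects each line as its own element instead, trimming stray whitespace.
    static List<String> collect(String contents) {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(contents))) {
            String line;
            while ((line = reader.readLine()) != null) {
                words.add(line.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        String contents = "YOUNGS\nOVER\n";
        System.out.println(glue(contents));    // words run together
        System.out.println(collect(contents)); // one element per line
    }
}
```

For a real file, the StringReader would be swapped for a FileReader as in the question; the resulting list can be turned into an array with toArray(new String[0]) before sorting.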

Source

2015-09-13 01:30:26 ankhzet

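This does not seem to change the result. – user3068177

Found the cause, see the update – ankhzet

So I just need to remove the "return args"? If so, I tried that earlier and got the same result. – user3068177

The reason your file shows up as a single line in Windows Notepad is probably that Notepad only recognizes CRLF (\r\n) as a line break, whereas most UNIX programs treat a lone LF (\n) as the line break. Your text file was most likely generated by a UNIX program. A further explanation can be found here.

Now, on to your code.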
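
If you want to check which convention a file actually uses, a rough sketch like the following counts the terminators in a string that has already been read into memory. The class name, method name and sample strings are invented for the example.

```java
class LineEndingCheck {

    // Counts CRLF pairs, lone LFs and lone CRs in the given text.
    static String describe(String text) {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF
                } else {
                    cr++;
                }
            } else if (c == '\n') {
                lf++;
            }
        }
        return "CRLF=" + crlf + " LF=" + lf + " CR=" + cr;
    }

    public static void main(String[] args) {
        System.out.println(describe("ATHROCYTES\nDISHLIKE\nOVER"));     // UNIX-style sample
        System.out.println(describe("ATHROCYTES\r\nDISHLIKE\r\nOVER")); // Windows-style sample
    }
}
```

A file that reports only lone LFs is the kind that older versions of Notepad display as one unbroken line.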

String[] store = sb.toString().split(scan.nextLine());

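This line of code feeds whatever the next line from your Scanner happens to be into split(). I do not know what that might be, but what split() does is look for occurrences of that pattern and partition the string at those occurrences.

What you want is

String[] store = sb.toString().split("\n\r?");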

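String.split() accepts a Java regular expression. The regular expression

"\n\r?"

amounts to saying "split on a line feed, optionally followed by a carriage return". Note that a Windows CRLF is the sequence \r\n, so on such a file this pattern leaves a trailing \r on each piece; "\r?\n" consumes the whole line ending under both conventions.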
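
As a quick, hedged illustration of how such patterns behave, the sketch below splits a UNIX-style (LF) and a Windows-style (CRLF, i.e. \r\n) sample with both "\n\r?" and "\r?\n". The class and method names are hypothetical.

```java
class SplitPatterns {

    // LF, optionally followed by CR.
    static String[] splitLfThenOptionalCr(String text) {
        return text.split("\n\r?");
    }

    // Optional CR, followed by LF (covers LF and CRLF line endings).
    static String[] splitOptionalCrThenLf(String text) {
        return text.split("\r?\n");
    }

    public static void main(String[] args) {
        String unix = "YOUNGS\nOVER";
        String windows = "YOUNGS\r\nOVER";

        // Both patterns split the UNIX-style sample cleanly.
        System.out.println(splitLfThenOptionalCr(unix).length);
        System.out.println(splitOptionalCrThenLf(unix).length);

        // On the Windows-style sample, the first pattern leaves a trailing CR behind.
        System.out.println(splitLfThenOptionalCr(windows)[0].endsWith("\r"));
        System.out.println(splitOptionalCrThenLf(windows)[0].endsWith("\r"));
    }
}
```

Calling trim() on each piece, as the other answer does, is another way to get rid of a leftover \r.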

Also, I would suggest parsing your string with a Scanner rather than trying to split it into an array.

Scanner scan = new Scanner(sb.toString()); 
while(scan.hasNextLine()) { 
    //Do stuff with scan.nextLine() 
}
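
Filled out a little, the Scanner loop could look like the sketch below. It assumes the words should be trimmed and blank lines skipped, neither of which the question states; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerLines {

    // Reads the text line by line; Scanner.nextLine() handles both LF and CRLF endings.
    static List<String> readWords(String text) {
        List<String> words = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNextLine()) {
                String word = scan.nextLine().trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = readWords("ATHROCYTES \n    DISHLIKE \r\n    OVER");
        System.out.println(words);

        // Convert to an array if the sort method expects String[].
        String[] store = words.toArray(new String[0]);
        System.out.println(store.length);
    }
}
```

This avoids having to choose a line-break regex at all, since the Scanner already knows the common terminators.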

Edit: remember that escape sequences use a backslash, not a forward slash. For example, \n or \r.

Source
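
To see the difference the slash makes, here is a tiny sketch (names invented for the example): "/n" is the two literal characters slash and n, which never occur in the word list, so split() hands back the whole string as a single element.

```java
class SlashDemo {

    // Number of pieces produced when splitting the text on the given regex.
    static int pieces(String text, String regex) {
        return text.split(regex).length;
    }

    public static void main(String[] args) {
        String text = "YOUNGS\nOVER";
        System.out.println(pieces(text, "/n")); // forward slash: literal "/n", no match
        System.out.println(pieces(text, "\n")); // backslash: newline escape, splits the words
    }
}
```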

2015-09-13 00:35:49

"\n\r|[\n\r]" can be shortened to "\n\r?", afaik – ankhzet

"\n\r|[\n\r]" covers both UNIX and Windows line endings. In this case "\n\r" will work, but it is better to use an approach that always works. [Java Scanner](http://stackoverflow.com/questions/5918896/java-scanner-newline-recognition) uses "\r\n|[\n\r\u2028\u2029\u0085]" as its default regular expression. –

eh, the regex "\n\r?" is equal to "\n\r|[\n\r]"; they both capture the same sequences ('\n', '\n\r'). Or did you miss the '?' modifier after the '\r' character? – ankhzet
