2013-07-23 28 views
7

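C#: List all "leaf" subdirectories with EnumerateDirectories

Good morning. I have a folder containing thousands of subdirectories at varying depths. I need to list every directory that contains no subdirectories (the proverbial "end of the line"). It is fine if they contain files. Is there a way to do this with EnumerateDirectories?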

For example, if a fully recursive EnumerateDirectories returns:

/files/ 
/files/q 
/files/q/1 
/files/q/2 
/files/q/2/examples 
/files/7 
/files/7/eb 
/files/7/eb/s 
/files/7/eb/s/t 

I only care about:

/files/q/1 
/files/q/2/examples 
/files/7/eb/s/t 

Answers

14

This should work:

var folderWithoutSubfolder = Directory.EnumerateDirectories(root, "*.*", SearchOption.AllDirectories) 
    .Where(f => !Directory.EnumerateDirectories(f, "*.*", SearchOption.TopDirectoryOnly).Any()); 
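
To try the query without touching real data, the following is a minimal sketch that rebuilds the question's example layout in a scratch directory and prints the leaves. The class name `LeafDemo`, the helper `FindLeaves`, and the temp-directory name are assumptions made for illustration; the sorting step is only there to make the output order stable.

```csharp
using System;
using System.IO;
using System.Linq;

static class LeafDemo
{
    // Returns every directory below root that has no subdirectories of its own.
    public static string[] FindLeaves(string root)
    {
        return Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories)
            .Where(d => !Directory.EnumerateDirectories(d).Any())
            .OrderBy(d => d, StringComparer.Ordinal)
            .ToArray();
    }

    static void Main()
    {
        // Hypothetical scratch location; the layout mirrors the question's example.
        string root = Path.Combine(Path.GetTempPath(), "leafdemo-" + Guid.NewGuid().ToString("N"));
        string files = Path.Combine(root, "files");
        Directory.CreateDirectory(Path.Combine(files, "q", "1"));
        Directory.CreateDirectory(Path.Combine(files, "q", "2", "examples"));
        Directory.CreateDirectory(Path.Combine(files, "7", "eb", "s", "t"));

        try
        {
            // Print each leaf relative to the scratch root.
            foreach (string leaf in FindLeaves(files))
                Console.WriteLine(leaf.Substring(root.Length));
        }
        finally
        {
            // Clean up the scratch tree.
            Directory.Delete(root, recursive: true);
        }
    }
}
```

Note that the query only looks at directories below `root`, so if `root` itself has no subdirectories the result is empty rather than containing `root`.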
+3

Counter-vote to cancel out the -1... this seems to work for me – Sayse

+0

+1, it works. Just makes me proud! –

+0

Nice one-liner – m1m1k

3

If you want to avoid calling EnumerateDirectories() twice for each directory, you can implement it like this:

public IEnumerable<string> EnumerateLeafFolders(string root)
{
    bool anySubfolders = false;

    // Enumerate the immediate children once, recursing into each of them.
    foreach (var subfolder in Directory.EnumerateDirectories(root))
    {
        anySubfolders = true;

        foreach (var leafFolder in EnumerateLeafFolders(subfolder))
            yield return leafFolder;
    }

    // No children were seen, so root itself is a leaf.
    if (!anySubfolders)
        yield return root;
}

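I did some timing tests, and for me this approach is more than twice as fast as the LINQ approach.

I ran this test with a release build, outside of any debugger. I ran it on an SSD containing a large number of folders; the total number of folders was 25035.

My results for the second run of the program (the first run was to warm up the disk cache):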

Calling Using linq. 1 times took 00:00:08.2707813 
Calling Using yield. 1 times took 00:00:03.6457477 
Calling Using linq. 1 times took 00:00:08.0668787 
Calling Using yield. 1 times took 00:00:03.5960438 
Calling Using linq. 1 times took 00:00:08.1501002 
Calling Using yield. 1 times took 00:00:03.6589386 
Calling Using linq. 1 times took 00:00:08.1325582 
Calling Using yield. 1 times took 00:00:03.6563730 
Calling Using linq. 1 times took 00:00:07.9994754 
Calling Using yield. 1 times took 00:00:03.5616040 
Calling Using linq. 1 times took 00:00:08.0803573 
Calling Using yield. 1 times took 00:00:03.5892681 
Calling Using linq. 1 times took 00:00:08.1216921 
Calling Using yield. 1 times took 00:00:03.6571429 
Calling Using linq. 1 times took 00:00:08.1437973 
Calling Using yield. 1 times took 00:00:03.6606362 
Calling Using linq. 1 times took 00:00:08.0058955 
Calling Using yield. 1 times took 00:00:03.6477621 
Calling Using linq. 1 times took 00:00:08.1084669 
Calling Using yield. 1 times took 00:00:03.5875057 

As you can see, the yield approach is significantly faster. (Probably because it doesn't enumerate each folder twice.)

My test code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

namespace Demo
{
    class Program
    {
        private void run()
        {
            string root = "F:\\TFROOT";

            Action test1 = () => leafFolders1(root).Count();
            Action test2 = () => leafFolders2(root).Count();

            for (int i = 0; i < 10; ++i)
            {
                test1.TimeThis("Using linq.");
                test2.TimeThis("Using yield.");
            }
        }

        static void Main()
        {
            new Program().run();
        }

        static IEnumerable<string> leafFolders1(string root)
        {
            var folderWithoutSubfolder = Directory.EnumerateDirectories(root, "*.*", SearchOption.AllDirectories)
                .Where(f => !Directory.EnumerateDirectories(f, "*.*", SearchOption.TopDirectoryOnly).Any());

            return folderWithoutSubfolder;
        }

        static IEnumerable<string> leafFolders2(string root)
        {
            bool anySubfolders = false;

            foreach (var subfolder in Directory.EnumerateDirectories(root))
            {
                anySubfolders = true;

                foreach (var leafFolder in leafFolders2(subfolder))
                    yield return leafFolder;
            }

            if (!anySubfolders)
                yield return root;
        }
    }

    static class DemoUtil
    {
        public static void Print(this object self)
        {
            Console.WriteLine(self);
        }

        public static void Print(this string self)
        {
            Console.WriteLine(self);
        }

        public static void Print<T>(this IEnumerable<T> self)
        {
            foreach (var item in self)
                Console.WriteLine(item);
        }

        public static void TimeThis(this Action action, string title, int count = 1)
        {
            var sw = Stopwatch.StartNew();

            for (int i = 0; i < count; ++i)
                action();

            Console.WriteLine("Calling {0} {1} times took {2}", title, count, sw.Elapsed);
        }
    }
}
+0

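EnumerateDirectories is lazily evaluated, so the extra calls made in Tim's answer are fairly cheap. When I benchmarked your code against Tim's, Tim's ran in less than half the time. I suspect this is because using iterators recursively adds a lot of overhead. – Brian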

+0

@Brian Did you run the test multiple times to eliminate the artifacts that the disk cache causes on the first run? –

+0

Yes. I ran the test a few hundred times, compiled with optimizations, and ran the test once before starting the timer (to avoid JIT artifacts). – Brian
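
If the overhead of recursively nested iterators is a concern, as the comments above suggest it might be, one variant worth benchmarking replaces the recursion with an explicit stack. The following is a minimal sketch, not a measured improvement: the names `LeafFolders` and `EnumerateIteratively` are assumptions for illustration, and no timings were taken for it. It still enumerates each directory exactly once, and like the recursive version it yields `root` itself when `root` has no subfolders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LeafFolders
{
    // Walks the tree with an explicit stack instead of nested iterators.
    // Each directory's children are enumerated exactly once.
    public static IEnumerable<string> EnumerateIteratively(string root)
    {
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            bool anySubfolders = false;

            foreach (string subfolder in Directory.EnumerateDirectories(current))
            {
                anySubfolders = true;
                pending.Push(subfolder); // visit this child later
            }

            // A directory with no children is a leaf.
            if (!anySubfolders)
                yield return current;
        }
    }

    static void Main(string[] args)
    {
        // Uses the first argument as the root, or the current directory if none is given.
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        foreach (string leaf in EnumerateIteratively(root))
            Console.WriteLine(leaf);
    }
}
```

Because the stack reverses the order in which siblings are visited, the leaves come back in a different order than from the recursive version; sort the results if the order matters.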
