2016-05-15 64 views
0

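Java microbenchmark to find the average from a list

I have a file containing a number of different strings (about 100,000, taken from production). I need to find the 99th and 99.9th percentiles for a function that processes each string from this file.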

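I tried writing a benchmark with jmh. However, I could only get the required percentiles either for the batch function (which processes the whole file) or for the required function with one specific string.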

public String process1(String str) {
    ...process...
}

public void processBatch(List<String> strings) {
    for (String str : strings) {
        process1(str);
    }
}

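In addition, I tried passing the whole list of strings via @Param. This makes jmh run dozens of iterations for each string, but it does not produce the required result.

Is there anything in jmh that can help find the required statistics? If not, what tool could be used instead?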

+0

@CConard96 Thanks, I checked that information at the start, but I could not find out how to run a specific benchmark. – Natalia

Answers

2

Is this what you are looking for?

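import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@Warmup(iterations = 1, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 1, time = 5, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class MyBenchmark {

    ClassUnderBenchmark classUnderBenchmark = new ClassUnderBenchmark();

    @State(Scope.Benchmark)
    public static class MyTestState {

        int counter = 0;
        List<String> list = Arrays.asList("aaaaa", "bbbb", "ccc");
        String currentString;

        // Runs before every benchmark call and picks the next string in turn.
        // Level.Invocation adds overhead of its own; the JMH Javadoc recommends
        // it only when a single call takes long enough (about a millisecond or more).
        @Setup(Level.Invocation)
        public void init() {
            this.currentString = list.get(counter++);
            if (counter == list.size()) { // wrap around at the end of the list
                counter = 0;
            }
        }
    }

    // SampleTime mode records per-call times and reports a histogram and percentiles.
    @Benchmark
    @Threads(1)
    @BenchmarkMode(Mode.SampleTime)
    public String test(MyBenchmark.MyTestState myTestState) {
        // Returning the result lets JMH consume it, so the call is not optimized away.
        return classUnderBenchmark.toUpper(myTestState.currentString);
    }

    public static class ClassUnderBenchmark {

        Random r = new Random();

        // Stand-in for the real function: sleeps 0-99 ms to simulate variable latency.
        public String toUpper(String name) {
            try {
                Thread.sleep(r.nextInt(100));
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            return name.toUpperCase();
        }
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(MyBenchmark.class.getSimpleName())
                .jvmArgs("-XX:+UseG1GC", "-XX:MaxGCPauseMillis=50")
                .build();
        new Runner(opt).run();
    }
}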
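
The three hard-coded strings above stand in for the real input. For the roughly 100,000 strings from the question, the same round-robin selection could be factored into a small helper. The following is only a sketch: InputCycler is a hypothetical name, not part of JMH, and it deliberately contains no JMH code so that it can be tried on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of JMH): hands out inputs round-robin. */
final class InputCycler {
    private final List<String> inputs;
    private int index = 0;

    InputCycler(List<String> inputs) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("inputs must not be empty");
        }
        // Defensive copy so later changes to the caller's list do not affect us.
        this.inputs = new ArrayList<>(inputs);
    }

    /** Returns the current input and advances, wrapping to the start after the last one. */
    String next() {
        String s = inputs.get(index);
        index = (index + 1) % inputs.size();
        return s;
    }
}
```

In a JMH state class, the list could presumably be read once in a @Setup(Level.Trial) method (for example with java.nio.file.Files.readAllLines) and next() called from the @Setup(Level.Invocation) method, so that the file is not re-read on every call.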

See the Javadoc (org.openjdk.jmh.annotations.Mode):

/** 
* <p>Sample time: samples the time for each operation.</p> 
* 
* <p>Runs by continuously calling {@link Benchmark} methods, 
* and randomly samples the time needed for the call. This mode automatically adjusts the sampling 
* frequency, but may omit some pauses which missed the sampling measurement. This mode is time-based, and it will 
* run until the iteration time expires.</p> 
*/ 
SampleTime("sample", "Sampling time"), 

This test will give you the following output:

Result "test": 

    N = 91 
    mean =  0,056 ±(99.9%) 0,010 s/op 

    Histogram, s/op: 
    [0,000, 0,010) = 6 
    [0,010, 0,020) = 9 
    [0,020, 0,030) = 3 
    [0,030, 0,040) = 11 
    [0,040, 0,050) = 8 
    [0,050, 0,060) = 11 
    [0,060, 0,070) = 9 
    [0,070, 0,080) = 9 
    [0,080, 0,090) = 14 

    Percentiles, s/op: 
     p(0,0000) =  0,003 s/op 
    p(50,0000) =  0,059 s/op 
    p(90,0000) =  0,092 s/op 
    p(95,0000) =  0,095 s/op 
    p(99,0000) =  0,100 s/op 
    p(99,9000) =  0,100 s/op 
    p(99,9900) =  0,100 s/op 
    p(99,9990) =  0,100 s/op 
    p(99,9999) =  0,100 s/op 
    p(100,0000) =  0,100 s/op 


Benchmark         Mode    Cnt  Score    Error  Units
MyBenchmark.test  sample   91  0,056 ±  0,010   s/op
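
As for the question of what could be used if jmh were not an option: percentiles like the ones listed above can also be computed by hand from per-call timings. The sketch below is plain Java and is not how JMH computes its figures. It uses the nearest-rank definition of a percentile, and LatencyPercentiles, percentile and timeEach are hypothetical names. It does no warmup or forking, so its numbers would be much less trustworthy than JMH's, particularly for fast functions where System.nanoTime overhead dominates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical plain-Java sketch; no warmup, no forks, not a JMH replacement. */
final class LatencyPercentiles {

    // Written on every call so the JIT cannot discard the measured work.
    static volatile int sink;

    /**
     * Nearest-rank percentile over samples sorted in ascending order:
     * the smallest sample such that at least p percent of samples are <= it.
     */
    static long percentile(long[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("p must be in [0, 100]");
        }
        // Floating-point rounding can shift the rank by one at exact boundaries.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        rank = Math.min(Math.max(rank, 1), sorted.length);
        return sorted[rank - 1];
    }

    /** Times one call per input and returns the durations in nanoseconds, sorted ascending. */
    static long[] timeEach(List<String> inputs, Function<String, String> fn) {
        long[] nanos = new long[inputs.size()];
        for (int i = 0; i < nanos.length; i++) {
            long start = System.nanoTime();
            String out = fn.apply(inputs.get(i));
            nanos[i] = System.nanoTime() - start;
            sink = out.length();
        }
        Arrays.sort(nanos);
        return nanos;
    }
}
```

Usage would look roughly like `long[] t = LatencyPercentiles.timeEach(strings, obj::process1);` followed by `LatencyPercentiles.percentile(t, 99)` and `LatencyPercentiles.percentile(t, 99.9)`, where strings and obj are placeholders for the list from the file and the object that owns process1.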