Changes between Version 4 and Version 5 of benchmarks
Timestamp: 11/24/11 15:34:57
* queueing system: Torque 2.4.12 + Maui 3.3,
* about 800 nodes,
* about 3-4k jobs present in the system,
* Maui "RMPOLLINTERVAL": 3.5 minutes,
* for the purpose of the tests, a special partition (WP4) was set aside: 64 cores / 8 nodes - 64 slots,

…

''Pros:''
* The test reflects the natural situation in production environments:
  * an approximately constant number of jobs,
  * "the job flow" (when one job is finished, another begins).
* The program may be used to measure the overall capacity of the system.

==== Plan of the tests ====
* 50 jobs x 10 users = 500 jobs, 30 minutes, SLEEP_COEF = 10
* 100 jobs x 10 users = 1000 jobs, 30 minutes, SLEEP_COEF = 10
* 200 jobs x 10 users = 2000 jobs, 30 minutes, SLEEP_COEF = 10
* 400 jobs x 10 users = 4000 jobs, 30 minutes, SLEEP_COEF = 10

==== Results ====

…

== Test 2 - Throughput ==
The test is based on the methodology described in the paper [[http://dl.acm.org/citation.cfm?id=1533455|Benchmarking of Integrated OGSA-BES with the Grid Middleware]]: it measures, from the user's perspective, the time at which the last of N jobs submitted at (almost) the same moment finishes. Beyond the paper's setup, the presented test also used the following elements:
* submitting the jobs by N processes/users,
* using a consistent SDK, not the command-line clients,
* a single test environment.
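The throughput measurement described above (N jobs submitted at almost the same moment, timed to the finish of the last one) can be sketched as follows. This is only an illustrative sketch: `submit_job` is a hypothetical stand-in for the middleware SDK's blocking submit-and-wait call, not an API of QCG, gLite, or UNICORE.

```python
import threading
import time

def run_throughput_test(submit_job, n_threads, jobs_per_thread):
    """Submit jobs from n_threads concurrent workers and return the
    wall-clock time until the last job has finished (user's view)."""
    start = time.monotonic()

    def worker():
        for _ in range(jobs_per_thread):
            submit_job()  # stand-in: blocks until the job completes

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # joins return once the last of the N jobs is done
    return time.monotonic() - start

# Example with a dummy "job" that just sleeps briefly:
elapsed = run_throughput_test(lambda: time.sleep(0.01),
                              n_threads=10, jobs_per_thread=5)
print(f"50 jobs finished after {elapsed:.2f}s")
```

Measuring the join of all workers, rather than per-job timestamps, matches the paper's user-perspective metric: only the finish time of the last job matters.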
…

==== Results ====
* 1 user, 1 thread, 500 jobs:
[[Image(zeus-throughput-500x1-1.png, center, width=640px)]]
* 1 user, 10 threads, 500 jobs (50x10):
[[Image(zeus-throughput-50x1-1.png, center, width=640px)]]
* 10 users, 10 threads, 500 jobs (50x10):
[[Image(zeus-throughput-50x10-0.png, center, width=640px)]]
* 10 users, 10 threads, 1000 jobs (10x100):
[[Image(zeus-throughput-100-10-0.png, center, width=640px)]]

=== Notes ===
1. The machine where CREAM (gLite) was running had more resources (in particular CPU cores and virtual memory) than the machines with QCG and UNICORE.
2. ... however, this machine was additionally loaded by external jobs (about 500-2000 jobs; the tests were performed over 2 weeks).
3. QCG returns the job status only once the job is already in the queueing system; gLite and UNICORE do not necessarily. Thus, e.g. in the throughput tests, new jobs appeared after the test had finished.
4. The bottleneck (especially in the second group of tests) was the throughput of the WP4 partition and Maui, which meant that only 64 jobs could be scheduled per scheduling cycle (at least 3.5 minutes).

[=#n]
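The "job flow" pattern from the first test (a roughly constant number of jobs in the system, where each finished job is immediately replaced by a new one) can be sketched as below. Again a hedged illustration only: the job body is a stand-in `sleep`, and `pool_size`/`total_jobs` are hypothetical parameters, not values from the benchmark harness.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def job_flow(submit_job, pool_size, total_jobs):
    """Keep a constant number of jobs in the system: whenever one
    finishes, submit another, until total_jobs have been run."""
    finished = 0
    submitted = min(pool_size, total_jobs)
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        # Fill the system up to the constant pool size.
        pending = {executor.submit(submit_job) for _ in range(submitted)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            finished += len(done)
            # Refill: one new job for each one that just finished.
            while submitted < total_jobs and len(pending) < pool_size:
                pending.add(executor.submit(submit_job))
                submitted += 1
    return finished

count = job_flow(lambda: time.sleep(0.005), pool_size=8, total_jobs=40)
print(count)  # → 40
```

Holding the in-flight count constant is what lets such a run estimate overall system capacity: the scheduler always has a full pool to draw from, so the completion rate reflects the system, not the submitter.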