Changes between Version 4 and Version 5 of benchmarks

Timestamp: 11/24/11 15:34:57
Author: bartek

* queueing system: Torque 2.4.12 + Maui 3.3,
* about 800 nodes,
* about 3-4k jobs present in the system,
* Maui "RMPOLLINTERVAL": 3.5 minutes,
* for the purpose of the tests, a special partition (WP4) was set aside: 64 cores / 8 nodes - 64 slots,
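The poll interval above is Maui's RMPOLLINTERVAL parameter, set in maui.cfg. A fragment such as the following would give the quoted 3.5 minutes; the surrounding configuration is an assumption, only the parameter name and value come from the text above:

```ini
# maui.cfg (fragment) - illustrative; only RMPOLLINTERVAL is taken from
# the setup described above, using Maui's [[[DD:]HH:]MM:]SS time format
RMPOLLINTERVAL  00:03:30
```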
     
''Pros:''
* The test reflects the natural situation in production environments:
  * approximately constant number of jobs,
  * "the job flow" (when one job is finished, another begins).
* The program may be used to measure the overall capacity of the system.
     
==== Plan of the tests ====
* 50 jobs x 10 users = 500 jobs, 30 minutes, SLEEP_COEF = 10
* 100 jobs x 10 users = 1000 jobs, 30 minutes, SLEEP_COEF = 10
* 200 jobs x 10 users = 2000 jobs, 30 minutes, SLEEP_COEF = 10
* 400 jobs x 10 users = 4000 jobs, 30 minutes, SLEEP_COEF = 10

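A minimal driver for the plan above might look like the following sketch; `submit_job` is a hypothetical stand-in for the real middleware submission call, and the interpretation of SLEEP_COEF as a per-submission pacing factor is an assumption, not something stated in the plan:

```python
import random
import threading
import time

SLEEP_COEF = 10  # pacing factor from the test plan; exact semantics assumed

submitted = []  # records (user, job index) for each submission

def submit_job(user, index):
    # Hypothetical stand-in for the real submission call made
    # through the middleware SDK (QCG / gLite / UNICORE).
    submitted.append((user, index))

def user_workload(user, jobs_per_user, sleep_coef):
    # One simulated user submitting its share of jobs, pausing a
    # random interval scaled by sleep_coef between submissions.
    for i in range(jobs_per_user):
        submit_job(user, i)
        time.sleep(random.random() * sleep_coef)

def run_plan(users=10, jobs_per_user=50, sleep_coef=SLEEP_COEF):
    # E.g. 50 jobs x 10 users = 500 jobs, as in the first plan entry.
    threads = [threading.Thread(target=user_workload,
                                args=(u, jobs_per_user, sleep_coef))
               for u in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return users * jobs_per_user  # total jobs submitted
```

The `sleep_coef` parameter is exposed so a dry run can be made without the pauses.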
==== Results ====
     
== Test 2 - Throughput ==
The test is grounded in the methodology described in the paper [[http://dl.acm.org/citation.cfm?id=1533455|Benchmarking of Integrated OGSA-BES with the Grid Middleware]] and is based on measuring, from the user's perspective, the finish time of the last of N jobs submitted at (almost) the same moment. In addition to the paper, the presented test also utilized the following elements:
* submitting the jobs by N processes/users,
* using a consistent SDK, not the command-line clients,
* a single test environment.
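The measurement itself can be sketched as below: N jobs are started at (almost) the same moment and the reported value is the time until the last one finishes. `run_job` is a hypothetical placeholder; in the real tests each job went through the middleware SDK:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_job(duration):
    # Hypothetical stand-in: submit one job and block until it completes.
    time.sleep(duration)

def measure_makespan(n_jobs, job_duration=0.01, n_threads=10):
    # Submit all jobs at (almost) the same moment and measure the
    # time until the last of the n_jobs finishes (the makespan).
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(run_job, job_duration) for _ in range(n_jobs)]
        for f in futures:
            f.result()  # wait for every job; the last one bounds the makespan
    return time.monotonic() - start
```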
     
==== Results ====
* 1 user, 1 thread, 500 jobs:
[[Image(zeus-throughput-500x1-1.png, center, width=640px)]]
* 1 user, 10 threads, 500 jobs (50x10):
[[Image(zeus-throughput-50x1-1.png, center, width=640px)]]
* 10 users, 10 threads, 500 jobs (50x10):
[[Image(zeus-throughput-50x10-0.png, center, width=640px)]]
* 10 users, 10 threads, 1000 jobs (10x100):
[[Image(zeus-throughput-100-10-0.png, center, width=640px)]]

=== Notes ===
1. The machine where CREAM (gLite) was running had more resources (in particular CPU cores and virtual memory) than the machines with QCG and UNICORE.
2. ... however, this machine was additionally loaded by external jobs (about 500-2000 jobs - the tests were performed over 2 weeks).
3. QCG returns the job status once the job is already in the queueing system; gLite and UNICORE do not necessarily. Thus, e.g. in the throughput tests, new jobs appeared after the test finished.
4. The bottleneck (especially in the second group of tests) was the throughput of the WP4 partition and Maui, which meant that only 64 jobs could be scheduled per scheduling cycle (at least 3.5 minutes).
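Note 4 implies a simple lower bound on any run's duration: with at most 64 jobs scheduled per cycle and a cycle of at least 3.5 minutes, 500 jobs need at least ceil(500/64) = 8 cycles, i.e. about 28 minutes. A quick check of that arithmetic:

```python
import math

def min_schedule_minutes(jobs, slots_per_cycle=64, cycle_minutes=3.5):
    # Lower bound from note 4: at most slots_per_cycle jobs enter the
    # WP4 partition per Maui scheduling cycle of cycle_minutes each.
    cycles = math.ceil(jobs / slots_per_cycle)
    return cycles * cycle_minutes

print(min_schedule_minutes(500))   # 8 cycles  -> 28.0 minutes
print(min_schedule_minutes(1000))  # 16 cycles -> 56.0 minutes
```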
    145145 
[=#n]