User Tools

Site Tools


gams:parallel_versus_sequential_assignments_-_or_beware_of_loop_-statements

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
gams:parallel_versus_sequential_assignments_-_or_beware_of_loop_-statements [2009/05/08 15:45]
support
gams:parallel_versus_sequential_assignments_-_or_beware_of_loop_-statements [2020/05/28 14:23] (current)
Michael Bussieck [A third example]
Line 5: Line 5:
 ===== A simple example ===== ===== A simple example =====
 <​code>​ <​code>​
-set i / i1 * i100000 /;+set i / i1*i100000 /;
 parameter u(i); parameter u(i);
 * bad! * bad!
-loop(i, +loop(i, u(i) = uniform(0,​2));​
-        ​u(i) = uniform(0,​2)+
-     );+
 * good * good
 u(i) = uniform(0,​2);​ u(i) = uniform(0,​2);​
Line 17: Line 15:
 If you run this fragment with the profile option set to one, you'll get something like: If you run this fragment with the profile option set to one, you'll get something like:
 <​code>​ <​code>​
-----      4 Loop                    61.172    61.172 SECS     11 Mb  +----      4 Loop                     7.235     7.235 SECS     14 MB  
-----      ​Assignment u             0.031    61.203 SECS     11 Mb  ​100000+----      ​Assignment u             0.000     7.235 SECS     14 MB  ​100000
 </​code>​ </​code>​
-The execution of the loop statement was approximately ​ 2000 times slower than the parallel assignment! Thus parallel assignments should be used wherever possible. ​+The execution of the loop statement was significantly ​slower than the parallel assignment! Thus parallel assignments should be used wherever possible. ​
  
-Note: The order in which a "​parallel"​ assignment ​is executed ​is not definedTo force an order one must use a loop statement. Here is an example, which illustrates this:+Note: That "​parallel"​ assignment ​that use the same symbol on the left and on the right creates a copy of the symbol for the use on the right. Hence the updated value on the left is not used when referencing on the right as shown in the next exampleIf this is required ​a loop statement ​must be used. Here is an example, which illustrates this:
 <​code>​ <​code>​
 set j      /j1*j4/ set j      /j1*j4/
Line 31: Line 29:
 display '​Wrong:',​a;​ display '​Wrong:',​a;​
 * correct! * correct!
-loop(j$(ord(j)>​1), ​ a(j)=a(j-1)+1;   );+loop(j$(ord(j)>​1),​ a(j)=a(j-1)+1);​
 display '​Correct:',​a;​ display '​Correct:',​a;​
 </​code>​ </​code>​
Line 49: Line 47:
 <​code>​ <​code>​
 found =0; found =0;
- loop((stores,​ products, weeks) $ (condition1(weeks) +loop((stores,​ products, weeks) $ (condition1(weeks) 
-                                   ​and condition2(stores) +                              and condition2(stores) 
-                                   ​and condition3(products)),​ +                              and condition3(products)),​ found =1);
-       found =1+
-     );+
 </​code>​ </​code>​
-//Assuming that we have 6000 stores, 20 products, and 100 weeks. And only  one week, week1, satisfies condition1; only one store, store1, satisfies ​ condition2; only one product, product1, satisfies condition3. Will the  iteration go through ALL the stores, products, and weeks even though only  one element satisfies all the conditions. ​ Given the conditional control, ​ will the dimensions of the sets have a big impact on run time? //+//Assuming that we have 6000 stores, 20 products, and 100 weeks. And only  one week, ''​week1''​, satisfies condition1; only one store, ​''​store1''​, satisfies ​ condition2; only one product, ​''​product1''​, satisfies condition3. Will the iteration go through ALL the stores, products, and weeks even though only  one element satisfies all the conditions. ​ Given the conditional control, ​ will the dimensions of the sets have a big impact on run time? //
  
 GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You GAMS tries to be efficient when it comes to restricting the domains it needs to search. There is some optimization in place, but with a general construct as you have it there, GAMS will go through all the tuples of the loop. You
Line 61: Line 57:
 <​code>​ <​code>​
 set w(weeks), s(stores), p(products);​ set w(weeks), s(stores), p(products);​
-w(weeks) = condition1(weeks);​ +w(weeks) ​   = condition1(weeks);​ 
-s(stores) = condition2(stores);​+s(stores) ​  ​= condition2(stores);​
 p(products) = condition3(products);​ p(products) = condition3(products);​
-found = sum((w,s,p), 1)// or card(w)*card(s)*card(p) +found = card(w) and card(s) and card(p) // or card(w)*card(s)*card(p) 
-</​code>​ Even if GAMS can't be smart about the individual conditions, it only walks them once (''​card(weeks)+card(stores)+card(products)''​ in the original loop you probably get ''​*''​ instead of ''​+''​). If the conditions ​ are simple and can be optimized by the GAMS compiler you get constant time. It is a little like code optimization in other language compilers (C, Fortran, etc).+</​code>​ Even if GAMS can't be smart about the individual conditions, it only walks them once (''​card(weeks)+card(stores)+card(products)''​ in the original loop you probably get ''​*''​ instead of ''​+''​). If the conditions ​ are simple and can be optimized by the GAMS compiler you get constant time. It is a little like code optimization in other language compilers (C, FORTRAN, etc).
  
 ===== A third example ===== ===== A third example =====
Line 71: Line 67:
 Here we have an example, where we have to match elements from two different sets. We show different formulation,​ which are much faster than using a loop statement. Here we have an example, where we have to match elements from two different sets. We show different formulation,​ which are much faster than using a loop statement.
 <​code>​ <​code>​
-$eolcom // +set k /k1*k90000/t /t1*t10000/;
-set k /k1*k9000/+
-set t /t1*t1000/+
- +
-parameter a(k), b(t);+
  
 +parameter a(k), b(t), bref(t);
 a(k)=normal(0,​1);​ a(k)=normal(0,​1);​
  
-// expansive loop +expansive loop 
-loop(k, +loop(k, b(t)$(ord(t) eq ord(k)-card(t))=a(k)); 
-    ​b(t)$(ord(t) eq ord(k)-card(t))=a(k);​ +bref(t) = b(t);
-    );+
 option clear=b; option clear=b;
  
-// better, but not yet optimal+better, but not yet optimal
 b(t) = sum(k$(ord(k)=ord(t)),​ a(k+card(t)));​ b(t) = sum(k$(ord(k)=ord(t)),​ a(k+card(t)));​
 +abort$(smax(t,​abs(b(t)-bref(t)))>​1e-6) b,bref;
 option clear=b; option clear=b;
  
-// much faster using element matching introduced in distribution ​227 +much faster using element matching introduced in distribution 227 
-$version ​227 +set sk(k) '​shifted k', kt(k,t);
-set sk(k) '​shifted k', +
-    ​kt(k,t);+
 sk(k+card(t)) =  yes; sk(k+card(t)) =  yes;
-option kt(sk:​t); ​// or  set kt(k,t) / #k:#t /+option kt(sk:t);
 b(t)=sum(k$kt(k,​t),​a(k));​ b(t)=sum(k$kt(k,​t),​a(k));​
 +abort$(smax(t,​abs(b(t)-bref(t)))>​1e-6) b,bref;
 option clear=b, clear=kt; option clear=b, clear=kt;
  
-// this formulation is more compact+this formulation is more compact
 option kt(k:t); option kt(k:t);
 b(t)=sum(kt(k,​t),​ a(k+card(t)));​ b(t)=sum(kt(k,​t),​ a(k+card(t)));​
 +abort$(smax(t,​abs(b(t)-bref(t)))>​1e-6) b,bref;
 </​code>​ </​code>​
  
 Running the code using the GAMS profiler (''​profile 1''​) yields: Running the code using the GAMS profiler (''​profile 1''​) yields:
 <​code>​ <​code>​
-----      1 ExecInit ​                ​0.000 ​    0.000 SECS      3 Mb  +----      7 Loop                    29.218    29.234 SECS     14 MB  
-----      7 Assignment a             ​0.000 ​    0.000 SECS      5 Mb   ​9000 +----     12 Assignment b            20.188    49.422 SECS     16 MB  10000 
-----     ​10 ​Loop                     1.388     1.388 SECS      5 Mb  +----     18 Assignment sk            0.015    49.437 SECS     17 MB  80000 
-----     ​13 Clear      b             ​0.000 ​    1.388 SECS      5 Mb  +----     20 Assignment b             ​0.000 ​   49.437 SECS     21 MB  10000 
-----     16 Assignment b             0.796     2.184 SECS      5 Mb   ​1000 +----     26 Assignment b             0.125    49.562 SECS     21 MB  10000 
-----     ​17 Clear      b             ​0.000 ​    2.184 SECS      5 Mb  +</​code>​
-----     23 Assignment sk            0.000     2.184 SECS      5 Mb   ​8000 +
-----     ​24 Other                    0.000     2.184 SECS      5 Mb  +
-----     25 Assignment b             ​0.000 ​    2.184 SECS      5 Mb   1000 +
-----     ​26 ​Clear      kt            0.000     2.184 SECS      5 Mb  +
-----     29 Other                    0.000     2.184 SECS      5 Mb  +
-----     ​30 ​Assignment b             0.015     2.199 SECS      5 Mb   1000+
  
 +===== A fourth example =====
 +
 +In this example, a large matrix with a small band of non-zeros off the diagonal is copied from the name space ''​j,​jj''​ to a new parallel matching name space ''​x,​xx''​ where '​j'​ and '​x'​ match perfectly. The band matrix (''​PDmatrix_jj''​) is created and the copied to the same matrix with different labels (''​PDmatrix_xx''​). This is done once with the combination of two ''​j:​x''​ mappings (''​jx''​ and ''​jxx''​) in a loop statement and once with a sequence of parallel asignment utilizing a temporary 4-dimensional map symbol ''​j_jj_x_xx':​
 +<​code>​
 +set x / x1*x50000 /;
 +singleton set xone(x) / x1 /;
 +$eval xCard card(x)
 +set j /1*%xCard% /;
 +set jx(j,x) / #j:#x /;
 +alias (x,xx,xxx), (j,jj,jjj), (jx,jxx);
 +
 +parameter
 + LTmatrix_jj(j,​jj)
 + PDmatrix_jj(j,​jj)
 +    ​
 +scalar LTwidth / 3 /;
 +$eval LTWIDTH LTwidth
 +set jsub(j) / 1*%LTWIDTH% /;
 +
 +LTmatrix_jj(j,​jj+(ord(j)-LTwidth))$[jsub(jj)] = normal(0,​1);​
 +PDmatrix_jj(j,​jj+(ord(j)-LTwidth))$[jsub(jj)] = sum(jjj, LTmatrix_jj(j,​jjj)*LTmatrix_jj(jj+(ord(j)-LTwidth),​jjj));​
 +
 +Parameter
 +    PDmatrix_xx1(x,​xx)
 +    PDmatrix_xx2(x,​xx);​
 +
 +* Slow loop
 +loop((jx(j,​x),​jxx(jj,​xx))$PDmatrix_jj(j,​jj),​ PDmatrix_xx1(x,​xx) = PDmatrix_jj(j,​jj));​
 +
 +* Fast sequence of 
 +set j_jj_x_xx(j,​jj,​x,​xx),​ x_xx(x,xx);
 +j_jj_x_xx(j,​jj,​x+(ord(j)-1),​x+(ord(jj)-1))$xone(x) = PDmatrix_jj(j,​jj);​
 +option x_xx<​j_jj_x_xx;​
 +PDmatrix_xx2(x_xx(x,​xx)) = SUM(j_jj_x_xx(j,​jj,​x,​xx),​ PDmatrix_jj(j,​jj));​
 +
 +abort$(smax(x_xx,​ abs(PDmatrix_xx1(x_xx)-PDmatrix_xx2(x_xx)))>​1e-6) 'bad calculation';​
 </​code>​ </​code>​
  
 +The relevant part of the GAMS profiler report yields:
 +<​code>​
 +----     24 Loop                    14.625 ​   18.657 SECS     32 MB 
 +----     28 Assignment j_jj_x_xx ​    ​3.250 ​   21.907 SECS     45 MB  149997
 +----     29 Other                    0.000    21.907 SECS     49 MB 
 +----     30 Assignment PDmatrix_xx2 ​ 0.093    22.000 SECS     69 MB  149997
 +</​code>​
 +So the loop is more than 4 times slower than the sequence of parallel assignment statements.
IMPRESSUM / LEGAL NOTICEPRIVACY POLICY gams/parallel_versus_sequential_assignments_-_or_beware_of_loop_-statements.1241790307.txt.gz · Last modified: 2009/05/08 15:45 by support