Differences between revisions of "Timing"
From WikiLICC
Revision as of 01:18, 20 June 2012
Testing vectorization:
- Largest allocatable problem: 150 Mi elements (150*1024*1024) * 3 arrays * 4 bytes = 1.8 GB
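! http://goparallel.sourceforge.net/optimizing-loops-vectorization/
program Vectorization
  use portlib
  real(4),dimension(:),allocatable :: x,y,z
  integer :: len=150*1024*1024      ! 150 Mi elements per array
  integer :: i,j,ierr
  real(4) :: timing
  allocate( x(len) ,stat=ierr)
  allocate( y(len) ,stat=ierr)
  allocate( z(len) ,stat=ierr)
  x = 1.0                           ! give the inputs defined values before timing
  y = 2.0
  do j=1,10                         ! repeat the measurement 10 times
    timing = secnds(0.0)            ! start of the timed region
    do i=1,len
      z(i)=sqrt(x(i))+sqrt(y(i))    ! loop that the compiler may vectorize
    end do
    timing = secnds(timing)*1000    ! elapsed time in milliseconds
    print *,' Timing =',timing,'/1000 s'
  end do
end program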
- Results: using real(4) or real(8) takes approximately the same time.
Memory 1.75 GB

Debug (no optimization) (32 bits)    2.13 s
Debug (no optimization) (64 bits)    2.00 s
Release (32 bits)                    0.143 s
Release (64 bits)                    0.140 s   <=========

Release (64 bits):
/Qvec- (no vectorization)            0.909 s   (NOT the default)
Threshold for vectorization 0        0.140 s
Threshold for parallelization 0      0.140 s
Parallelization yes                  0.171 s   /Qparallel (or any combination above), uses 8 processors
Inline directive                     0.145 s   /Ob1, uses 4 processors
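As a rough reading of the real(4) timings, the ratios below are computed directly from the measured values in the table (0.909 s without vectorization, 0.140 s for the default 64-bit Release build, 2.00 s for 64-bit Debug, 0.171 s with /Qparallel). They are simple quotients of single measurements, not a statistical analysis:

```latex
\begin{align*}
\text{vectorization speedup} &= \frac{0.909\ \text{s}}{0.140\ \text{s}} \approx 6.5 \\
\text{Debug vs. Release (64 bits)} &= \frac{2.00\ \text{s}}{0.140\ \text{s}} \approx 14.3 \\
\text{/Qparallel vs. default Release} &= \frac{0.171\ \text{s}}{0.140\ \text{s}} \approx 1.22 \quad \text{(about 22\% slower)}
\end{align*}
```

On these numbers, vectorization accounts for most of the gain from optimization, and automatic parallelization does not improve this loop.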
- real(16): things get sss...l...o...w.. . .. . . (vectorization doesn't help)
Memory 7.3 GB (perfmon, the Windows performance monitor)

Release (64 bits)                    7.00 s
/Qvec- (no vectorization)            7.02 s    (NOT the default)
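The memory figures can be checked against the array sizes in the program. The sketch below assumes that only the three arrays x, y, z contribute, with N = 150*1024*1024 elements each; the values reported by the performance monitor cover the whole process, so a small difference is expected:

```latex
\begin{align*}
N &= 150 \cdot 1024^2 = 157\,286\,400 \ \text{elements} \\
\text{real(4):}\quad 3 \cdot 4\,\text{B} \cdot N &= 1\,887\,436\,800\ \text{B} \approx 1.76\ \text{GiB} \\
\text{real(16):}\quad 3 \cdot 16\,\text{B} \cdot N &= 7\,549\,747\,200\ \text{B} \approx 7.03\ \text{GiB} \\
\text{slowdown, real(16) vs. real(4)} &= \frac{7.00\ \text{s}}{0.140\ \text{s}} = 50
\end{align*}
```

The real(16) data is only 4 times larger than the real(4) data, yet the loop runs about 50 times slower, and the times with and without /Qvec- are nearly identical (7.00 s vs. 7.02 s). This is consistent with quadruple-precision arithmetic not being vectorized, probably because it is done in software rather than by hardware SIMD instructions.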