当前位置:文档之家› An Extended Set of Fortran Basic Linear Algebra Subprograms

An Extended Set of Fortran Basic Linear Algebra Subprograms

An Extended Set of Fortran Basic Linear Algebra Subprograms
An Extended Set of Fortran Basic Linear Algebra Subprograms

ARGONNE NATIONAL LABORATORY

9700South Cass Avenue

Argonne,Illinois60439

An Extended Set of Fortran

Basic Linear Algebra Subprograms

Jack J.Dongarra,Jeremy Du Croz,Sven Hammarling,and Richard J.Hanson

Mathematics and Computer Science Division

Technical Memorandum No.41(Revision3)

September,1986

ABSTRACT

This paper describes an extension to the set of Basic Linear Algebra Subprograms.The extensions are targeted at matrix-vector operations which should provide for ef?cient and portable implementations of algorithms for high performance computers.

An Extended Set of Fortran

Basic Linear Algebra Subprograms

Jack J.Dongarra?

Mathematics and Computer Science Division

Argonne National Laboratory

Argonne,Illinois60439

Jeremy Du Croz

Numerical Algorithms Group Ltd.

NAG Central Of?ce,May?eld House

256Banbury Road,Oxford OX27DE

Sven Hammarling

Numerical Algorithms Group Ltd.

NAG Central Of?ce,May?eld House

256Banbury Road,Oxford OX27DE

Richard J.Hanson

Applied Math2646

Sandia National Laboratory

Albuquerque,New Mexico87185

Abstract—This paper describes an extension to the set of Basic Linear Algebra Subprograms.The exten-sions are targeted at matrix-vector operations which should provide for ef?cient and portable implementa-tions of algorithms for high performance computers.

1.Introduction

In1973Hanson,Krogh,and Lawson wrote an article in the SIGNUM Newsletter(Vol.8,no.4, page16)describing the advantages of adopting a set of basic routines for problems in linear algebra.The original basic linear algebra subprograms,now commonly referred to as the BLAS and fully described in Lawson,Hanson,Kincaid,and Krogh[9,10],have been very successful and have been used in a wide range of software including LINPACK[4]and many of the algorithms published by the ACM Transac-tions on Mathematical Software.In particular they are an aid to clarity,portability,modularity and maintenance of software and they have become a de facto standard for the elementary vector operations. An excellent discussion of the raison d’?e tre of the BLAS is given in Dodson and Lewis[1].

Special versions of the BLAS,in some cases machine code versions,have been implemented on a number of computers,thus improving the ef?ciency of the BLAS.However,with some of the modern

__________________

?

Work supported in part by the Applied Mathematical Sciences subprogram of the Of?ce of Energy

Research,U.S.Department of Energy,under Contract W-31-109-Eng-38.

Typeset on October14,1992.

machine architectures,the use of the BLAS is not the best way to improve the ef?ciency of higher level codes.On vector machines,for example,one needs to optimize at least at the level of matrix-vector operations in order to approach the potential ef?ciency of the machine(see[2and3]);and the use of the BLAS inhibits this optimization because they hide the matrix-vector nature of the operations from the compiler.

During the Gatlinburg meeting of June,1984(Waterloo,Ontario)discussions among the partici-pants encouraged two of us(Dongarra and Hammarling)to prepare a proposed set of extended BLAS.At about the same time IFIP Working Group2.5started a project on the same subject at their annual meeting in Pasadena,CA.

An initial proposal was drafted and presented at the Parvec IV workshop held at Purdue Univ.,Oct. 29-30,1984.At that time two more authors joined the project(Hanson and du Croz).A series of meet-ings were planned so that the project would re?ect the best thinking of the mathematical software com-munity.Three meetings soliciting input were held.These occurred at SIAM conferences:The Spring Meeting of the Society,(Seatle,WA,July16-20,1984);The Conference on Applied Linear Algebra (Raleigh,NC,April29-May2,1985);The Fall Meeting of the Society,(Tempe,AZ,October28-30, 1985);The Conference on Parallel Processing for Scienti?c Computing(Norfolk,VA,November18-21, 1985).

Earlier,a modi?ed proposal was printed in the SIGNUM Newletter,[6].In that document we invited readers to send us their views and suggestions for changes to the design of the extended BLAS. Thus we have appealed to a wide audience within the mathematical software community.Our hope is that the proposed set of names that constitue the extended BLAS will?nd wide application in the future software of numerical linear algebra and provide a useful tool for implementors and users.

We believe that the time is right to specify an additional set of BLAS designed for matrix-vector operations.It has been our experience that a small set of matrix-vector operations occur frequently in the implementation of many of the most common algorithms in linear algebra.We de?ne here the basic operations for that set,together with the naming conventions and the calling sequences.Routines at this level should provide a reasonable compromise between the sometimes con?icting aims of ef?ciency and modularity and it is our hope that ef?cient implementations will become available on a wide range of computer architectures.

In this paper we shall refer to the existing BLAS of Lawson et al.as‘‘Level1BLAS’’,and the new extended set as‘‘Level2BLAS’’.The Level2BLAS involve O(mn)scalar operations where m and n are the dimensions of the matrix involved.These could be programmed by a series of calls to the Level1 BLAS,though we do not recommend that they be implemented in that way.Hence,in a natural sense,the Level2BLAS are performing basic operations at one level higher than the Level1BLAS.

In[7]we present a model implementation of the Level2BLAS in Fortran77(extended to include a COMPLEX*16data type),and also a set of rigorous test programs.

2.Scope of the Level2BLAS

The following three types of basic operation are performed by the Level2BLAS:

a)Matrix-vector products of the form

y←αAx + βy, y←αA T x + βy,and y←αA T x + βy

whereαandβare scalars,x and y are vectors and A is a matrix,and

x← Tx, x← T T x,and x← T T x,

where x is a vector and T is an upper or lower triangular matrix.

b)Rank-one and rank-two updates of the form

A←αxy T + A, A←αxy T + A, H←αxx T + H,and H←αxy T + α yx T + H, where H is a Hermitian matrix.

c)Solution of triangular equations of the form

x← T?1x, x← T?T x,and x← T ?T x,

where T is a non-singular upper or lower triangular matrix.

Where appropriate,the operations are applied to general,general band,Hermitian,Hermitian band, triangular,and triangular band matrices in both real and complex arithmetic,and in single and double precision.

In Appendix B we propose corresponding sets of routines in which the internal computation is per-formed in extended precision,and the vectors x and/or y are stored in extended precision,so that the extra internal precision is not all discarded on return from the routine.This proposal is aimed at machines with extended precision arithmetic registers;for example machines performing IEEE arithmetic[12].We propose these routines as an optional extension to the Level2BLAS because it is not possible to specify a complete set within the con?nes of ANSI Fortran77.The only case that can be realized is where the matrix is real single precision and the extended precision vectors are real double precision.Code for these routines is not included in[7].

3.Naming Conventions

The name of a Level2BLAS is in the LINPACK style and consists of?ve characters,the last of which may be blank.The fourth and?fth characters in the name denote the type of operation,as follows:

MV-Matrix-vector product

R-Rank-one update

R2-Rank-two update

SV-Solve a system of linear equations

Characters two and three in the name denote the kind of matrix involved,as follows:

GE General matrix

GB General band matrix

HE Hermitian matrix

SY Symmetric matrix

HP Hermitian matrix stored in packed form

SP Symmetric matrix stored in packed form

HB Hermitian band matrix

SB Symmetric band matrix

TR Triangular matrix

TP Triangular matrix in packed form

TB Triangular band matrix

The?rst character in the name denotes the Fortran data type of the matrix,as follows:

S REAL

D DOUBL

E PRECISION

C COMPLEX

Z COMPLEX*16or DOUBLE COMPLEX(if available)

The available combinations are indicated in Table3.1below.In the?rst column,under complex,the ini-tial C may be replaced by Z.In the second column,under real,the initial S may be replaced by D.See Appendix C for the full subroutine calling sequences.

The collection of routines can be thought of as being divided into four separate parts,real,double precision,complex,and complex*16.The routines can be written in ANSI standard Fortran77,with the exception of the routines that use COMPLEX*16variables.These routines are included for complete-ness and for their usefulness on those systems which support this data type;but because they do not con-form to the Fortran standard,they may not be available on all machines.

Table3.1

complex real MV R R2SV

CGE SGE**

CGB SGB*

CHE SSY***

CHP SSP***

CHB SSB*

CTR STR**

CTP STP**

CTB STB**

For the general rank-1update(GER)we specify two complex routines:CGERC for A←αxy T + A and CGERU for A←αxy T + A.This is the only exception to the one to one correspondence between real and complex routines.See section7for further discussion.

We do not specify routines for rank-one and rank-two updates applied to band matrices because these can be obtained by calls to the rank-one and rank-two full matrix routines.This is illustrated in Appendix A.

4.Argument Conventions

We follow a similar convention for the argument lists to that for the Level1BLAS,but with exten-sions where comparable arguments are not present in the Level1BLAS.The order of arguments is as fol-lows:

a)Arguments specifying options.

b)Arguments de?ning the size of the matrix.

c)Input scalar.

d)Description of the input matrix.

e)Description of input vector(s).

f)Input scalar(associated with input-output vector).

g)Description of the input-output vector.

h)Description of the input-output matrix.

Note that not each category is present in each of the routines.

The arguments that specify options are character arguments with the names TRANS,UPLO,and DIAG.TRANS is used by the matrix-vector product routines as follows:

Value Meaning

_________________________________________________

‘N’Operate with the matrix.

‘T’Operate with the transpose of the matrix.

‘C’Operate with the conjugate transpose of the matrix.

In the real case the values‘T’and‘C’have the same meaning.

UPLO is used by the Hermitian,symmetric,and triangular matrix routines to specify whether the upper or lower triangle is being referenced as follows:

Value Meaning

____________________

‘U’Upper triangle

‘L’Lower triangle

DIAG is used by the triangular matrix routines to specify whether or not the matrix is unit triangu-lar,as follows:

Value Meaning

_______________________

‘U’Unit triangular

‘N’Non-unit triangular

When DIAG is supplied as‘U’the diagonal elements are not referenced.

We recommend that the equivalent lower case characters be accepted with the same meaning, although,because they are not included in the standard Fortran character set,their use may not be sup-ported on all systems.See Section7for further discussion.

It is worth noting that actual character arguments in Fortran may be longer than the corresponding dummy arguments.So that,for example,the value‘T’for TRANS may be passed as‘TRANSPOSE’.

The size of the matrix is determined by the arguments M and N for an m by n rectangular matrix; and by the argument N for an n by n symmetric,Hermitian,or triangular matrix.Note that it is permissi-ble to call the routines with M or N=0,in which case the routines exit immediately without referencing their vector or matrix arguments.The bandwidth is determined by the arguments KL and KU for a rec-tangular matrix with kl sub-diagonals and ku super-diagonals;and by the argument K for a symmetric, Hermitian,or triangular matrix with k sub-diagonals and/or super-diagonals.

The description of the matrix consists either of the array name(A)followed by the leading dimen-sion of the array as declared in the calling(sub)program(LDA),when the matrix is being stored in a two-dimensional array;or the array name(AP)alone when the matrix is being stored as a(packed)vec-tor.In the former case the actual array must contain at least((n?1)d + l)elements,where d is the lead-ing dimension of the array,d≥ l,and l = m for the GE routines,l = n for the SY,HE and TR routines, l = kl + ku + 1for the GB routines,and l = k+1for the SB,HB or TB routines.For packed storage the actual array must contain at least n(n+1)/2elements.

The scalars always have the dummy argument names ALPHA and BETA.

As with the existing BLAS the description of a vector consists of the name of the array(X or Y)fol-lowed by the storage spacing(increment)in the array of the vector elements(INCX or INCY).The incre-ment is allowed to be positive or negative-but not zero(see section7).When the vector x consists of k elements,then the corresponding actual array argument X must be of length at least(1+(k-1)|INCX|). For those routines that include the argument BETA,when BETA is supplied as zero then the array Y need not be set on input,so that operations such as y←αAx may be performed without initially setting y to zero.

The following values of arguments are invalid:

Any value of the character arguments DIAG,TRANS,or UPLO

whose meaning is not speci?ed.

M<0

N<0

KL<0

KU<0

K<0

LDA

LDA

LDA

LDA

INCX=0

INCY=0

If a routine is called with an invalid value for any of its arguments,then it must report the fact and ter-minate execution of the program.In the model implementation(see[7]),each routine,on detecting an error,calls a common error handling routine XERBLA,passing to it the name of the routine and the number of the?rst argument which is in error.Specialized implementations may call system-speci?c exception-handling and diagnostic facilities,either via an auxiliary routine XERBLA or directly from the routines.One advantage of using XERBLA is that the test program can then test that all errors are detected(see[7]).

5.Storage Conventions

Unless otherwise stated it is assumed that matrices are stored conventionally in a2-dimensional

stored in array-element A(I,J).

array with matrix-element a

ij

The routines for real symmetric and complex Hermitian matrices allow for the matrix to be stored in either the upper(UPLO=‘U’)or lower triangle(UPLO=‘L’)of a two dimensional array,or to be packed in a one dimensional array.In the latter case the upper triangle may be packed sequentially

column by column(UPLO=‘U’),or the lower triangle may be packed sequentially column by column (UPLO=‘L’).Note that for real symmetric matrices packing the upper triangle by column is equivalent to packing the lower triangle by rows,and packing the lower triangle by columns is equivalent to packing the upper triangle by rows.(For complex Hermitian matrices the only difference is that the off-diagonal elements are conjugated.)

For triangular matrices the argument UPLO serves to de?ne whether the matrix is upper(UPLO=‘U’)or lower(UPLO=‘L’)triangular.In packed storage the triangle has to be packed by column.

The band matrix routines allow storage in the same style as with LINPACK,so that the j th column of the matrix is stored in the j th column of the Fortran array.For a general band matrix the diagonal of the matrix is stored in the ku+1th row of the array.For a Hermitian or symmetric matrix either the upper triangle(UPLO=‘U’)may be stored in which case the leading diagonal is in the k+1th row of the array, or the lower triangle(UPLO=‘L’)may be stored in which case the leading diagonal is in the?rst row of the array.For an upper triangular band matrix(UPLO=‘U’)the leading diagonal is in the k+1th row of the array and for a lower triangular band matrix(UPLO=‘L’)the leading diagonal is in the?rst row.

For a Hermitian matrix the imaginary parts of the diagonal elements are of course zero and thus the imaginary parts of the corresponding Fortran array elements need not be set,but are assumed to be zero. In the R and R2routines these imaginary parts will be set to zero on return,except whenα = 0,in which case the routines exit immediately.

For packed triangular matrices the same storage layout is used whether or not DIAG=‘U’,i.e. space is left for the diagonal elements even if those array elements are not referenced.

6.Speci?cation of the Level2BLAS

Type and dimension for variables occurring in the subroutine speci?cations are as follows:

INTEGER INCX,INCY,K,KL,KU,LDA,M,N

CHARACTER*1DIAG,TRANS,UPLO

For routines whose?rst letter is an S:

REAL ALPHA,BETA

REAL X(*),Y(*)

REAL A(LDA,*)

REAL AP(*)

For routines whose?rst letter is a D

DOUBLE PRECISION ALPHA,BETA

DOUBLE PRECISION X(*),Y(*)

DOUBLE PRECISION A(LDA,*)

DOUBLE PRECISION AP(*)

For routines whose?rst letter is a C:

COMPLEX ALPHA

COMPLEX BETA

COMPLEX X(*),Y(*)

COMPLEX A(LDA,*)

COMPLEX AP(*)

except that for CHER and CHPR the scalarαis real so that the?rst declaration above is replaced by:

REAL ALPHA

For routines whose?rst letter is Z:

COMPLEX*16ALPHA DOUBLE COMPLEX ALPHA

COMPLEX*16BETA DOUBLE COMPLEX BETA

COMPLEX*16X(*),Y(*)or DOUBLE COMPLEX X(*),Y(*)

COMPLEX*16A(LDA,*)DOUBLE COMPLEX A(LDA,*)

COMPLEX*16AP(*)DOUBLE COMPLEX AP(*)

except that for ZHER and ZHPR the?rst declaration above is replaced by:

DOUBLE PRECISION ALPHA

The generic names,argument lists and speci?cations for the extended BLAS now follow.Refer to table 3.1for the speci?c subroutine names.

a)General Matrix Vector Products

for a general matrix:

_GEMV(TRANS,M,N,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)

for a general band matrix:

_GBMV(TRANS,M,N,KL,KU,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)

Operation:

if TRANS = ‘N’, y←αAx + βy

if TRANS = ‘T’, y←αA T x + βy

if TRANS = ‘C’, y←αA T x + βy.

b)Symmetric or Hermitian Matrix Vector Products

for a symmetric or Hermitian matrix:

_SYMV(UPLO,N,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)

_HEMV(UPLO,N,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)

for a symmetric or Hermitian matrix in packed storage:

_SPMV(UPLO,N,ALPHA,AP,X,INCX,BETA,Y,INCY)

_HPMV(UPLO,N,ALPHA,AP,X,INCX,BETA,Y,INCY)

for a symmetric or Hermitian band matrix:

_SBMV(UPLO,N,K,ALPHA,A,LDA,X,INCX,BETA,Y,INCY) _HBMV(UPLO,N,K,ALPHA,A,LDA,X,INCX,BETA,Y,INCY)

Operation:

y←αAx + βy .

c)Triangular Matrix Vector Products

for a triangular matrix:

_TRMV(UPLO,TRANS,DIAG,N,A,LDA,X,INCX)

for a triangular matrix in packed storage:

_TPMV(UPLO,TRANS,DIAG,N,AP,X,INCX)

for a triangular band matrix:

_TBMV(UPLO,TRANS,DIAG,N,K,A,LDA,X,INCX) Operation:

if TRANS = ‘N’, x← Tx

if TRANS = ‘T’, x← T T x

if TRANS = ‘C’, x← T T x

d)Triangular equation solvers

for a triangular matrix:

_TRSV(UPLO,TRANS,DIAG,N,A,LDA,X,INCX)

for a triangular matrix in packed storage:

_TPSV(UPLO,TRANS,DIAG,N,AP,X,INCX)

for a triangular band matrix:

_TBSV(UPLO,TRANS,DIAG,N,K,A,LDA,X,INCX) Operation:

if TRANS = ‘N’, x← T?1x

if TRANS = ‘T’, x← T?T x

if TRANS = ‘C’, x← T ?T x.

e)General Rank-1Updates:

for a general matrix:

_GER_(M,N,ALPHA,X,INCX,Y,INCY,A,LDA)

for real matrices:

SGER or DGER performs the operation A←αxy T + A.

for complex matrices:

CGERC or ZGERC performs the operation A←αxy T + A and

CGERU or ZGERU performs the operation A←αxy T + A.

f)Symmetric or Hermitian Rank-1Updates:

for a symmetric or Hermitian matrix:

_SYR(UPLO,N,ALPHA,X,INCX,A,LDA)

_HER(UPLO,N,ALPHA,X,INCX,A,LDA)

for symmetric or Hermitian matrix in packed storage:

_SPR(UPLO,N,ALPHA,X,INCX,AP)

_HPR(UPLO,N,ALPHA,X,INCX,AP)

Operation:

A←αxx T + A

for real symmetric matrices this is simply

A←αxx T + A .

g)Symmetric or Hermitian Rank-2Updates:

for a symmetric or Hermitian matrix:

_SYR2(UPLO,N,ALPHA,X,INCX,Y,INCY,A,LDA)

_HER2(UPLO,N,ALPHA,X,INCX,Y,INCY,A,LDA)

for symmetric or Hermitian matrix in packed storage:

_SPR2(UPLO,N,ALPHA,X,INCX,Y,INCY,AP)

_HPR2(UPLO,N,ALPHA,X,INCX,Y,INCY,AP)

Operation:

A←αxy T + α yx T + A

for real symmetric matrices this is simply

A←αxy T + αyx T + A .

7.Rationale

The three basic matrix-vector operations chosen(Section2)were obvious candidates because they occur in a wide range of linear algebra applications,and they occur at the innermost level of many algo-rithms.The hard decision was to restrict the scope only to these operations,since there are many other potential candidates,such as matrix scaling and sequences of plane rotations.Similarly,we could have extended the scope by applying the operations to other types of matrices such as complex symmetric or augmented band matrices.We have aimed at a reasonable compromise between a much larger number of routines each performing one type of operation(e.g.x← L?T x),and a smaller number of routines with a more complicated set of options.There are in fact,in each precision,16real routines performing alto-gether43different operations,and17complex routines performing58different operations.

We feel that to extend the scope further would signi?cantly reduce the chances of having the rou-tines implemented ef?ciently over a wide range of machines,because it would place too heavy a burden on implementors.On the other hand,to restrict the scope further would place too narrow a limit on the potential applications of the level2BLAS.

We have adhered to the conventions of the level 1BLAS in allowing an increment argument to be associated with each vector,so that a vector could,for example,be a row of a matrix.This increment may be negative,in which case the elements of the vectors are taken in reverse order.This affects the de?nition of the operation.For example,if m = n = 3and INCX and INCY are both negative,the _GEMV routines with TRANS=’N’perform the operation:

y 1y 2y 3 ← α a 31a 21a 11 a 32a 22a 12 a 33

a 23a 13 x 1x 2x 3 + β y 1y 2y 3 .However,in contrast to the level 1BLAS we do not allow INCX or INCY to be zero.This feature would have little usefulness,would complicate implementation of the routines on many vector machines,and,when the associated vector is an output vector,its meaning is ambiguous.

As noted earlier,corresponding to the real routine SGER we specify two complex routines CGERC

(for A ← αxy

T + A )and CGERU (for A ← αxy T + A ).Both are frequently required.An alternative would be to provide a single complex routine CGER with an option argument;however this argument would have become redundant in the real routine SGER.Rather than have redundant arguments,or different argument lists for the real and complex routines,we have chosen two distinct complex routines;

they are analogous to the level 1BLAS CDOTC (c ← c + x T y )and CDOTU (c ← c + x T y ).

Note that no check has been included for singularity,and near singularity,in the triangular equation solving routines.The requirements for such a test depend upon the application and so we felt that this should not be included,but should instead be performed before calling the triangular solver.

On certain machines,which do not use the ASCII sequence on all of their Fortran systems,lower case characters may not exist,so that the innocent looking argument ‘t’,passed through the argument TRANS for designating a transposed matrix,is not in the Fortran character set.Some UNIVAC systems do not have a lower case representation using the ‘?eld data’character set.On the CDC NOS-2system,a mechanism is provided for a full 128ASCII character set by using pairs of 6-bit host characters for cer-tain 7-bit ASCII characters.This means that there is a ‘2for 1’physical extension of the logical records that contain lower case letters.This fact can hamper portability of codes written on ASCII machines that are later moved to CDC systems.The only safe way to proceed is to convert the transported text entirely into the Fortran character set.On the other hand we believe that users on ASCII character set systems may wish to treat upper and lower case letters as equivalent in meaning.If this is done,it means that text that will be transported to machines of unknown types must have the ASCII set mapped into the Fortran character set before the text is moved.

The band storage scheme used by the GB,HB,SB,and TB routines has columns of the matrix stored in columns of the array,and diagonals of the matrix stored in rows of the array.This is the storage scheme used by LINPACK.An alternative scheme (used in some EISPACK [8,11]routines)has rows of the matrix stored in rows of the array,and diagonals of the matrix stored in columns of the array.The latter scheme has the advantage that a band matrix-vector product of the form y ← αAx + βy can be

computed using long vectors(the diagonals of the matrix)stored in contiguous elements,and hence is much more ef?cient on some machines(e.g.CDC Cyber205)than the?rst scheme.However other com-putations involving band matrices,such as x← Tx,x← T?1x and LU and U T U factorization,cannot be organized‘by diagonals’;instead the computation sweeps along the band,and the LINPACK storage scheme has the advantage of reducing the number of page swaps and allowing contiguous vectors(the columns of the matrix)to be used.

We considered the possibility of generalizing the rank-1and rank-2updates to rank-k updates. Rank-k updates with k > 1(but k << n)can achieve signi?cantly better performance on some machines than rank-1[5].But to take advantage of this usually requires complicating the calling algorithm;and moreover rank-k updates with k~~ n would allow an even higher level operation such as matrix multipli-cation‘in by the back door’.We prefer to keep to a clean concept of genuine matrix-vector operations.

8.Acknowledgements

A draft proposal that led to this speci?cation was discussed at the Parvec IV Workshop organized by John Rice at Purdue University on October29-30,1984and at various SIAM conferences.We wish to thank all the participants at the workshop and meetings for their comments,discussions,and encourage-ment,as well as the many people who have sent us comments separately.

13.References

[1] D.S.Dodson and J.G.Lewis,"Issues relating to extension of the Basic Linear Algebra Subpro-

grams",ACM SIGNUM Newsletter,vol20,no1,(1985),2-18.

[2]J.J.Dongarra and S.C.Eisenstat,‘‘Squeezing the Most out of an Algorithm in CRAY Fortran,’’

ACM Transactions on Mathematical Software,Vol.10,No.3,(1984),221-230.

[3]J.J.Dongarra,Increasing the Performance of Mathematical Software through High-Level Modular-

ity.Proceedings of the Sixth International Symposium on Computing Methods in Engineering and Applied Sciences.(Versailles,France).North-Holland(1984),pp239-248.

[4]J.J.Dongarra,J.R.Bunch,C.B.Moler,and G.W.Stewart,LINPACK Users’Guide,SIAM Publica-

tions,Philadelphia,1979.

[5]J.J.Dongarra,L.Kaufman,and S.Hammarling,Squeezing the Most out of Eigenvalue Solvers on

High-Performance Computers,Linear Algebra and Its Applications,77,pp113-136,(1986).

[6]J.J.Dongarra,J.J.Du Croz,S.Hammarling,R.J.Hanson,A Proposal for an Extended Set of Fortran

Basic Linear Algebra Subprograms,ACM SIGNUM Newsletter,vol.20,no.1,1985.

[7]J.J.Dongarra,J.J.Du Croz,S.Hammarling,R.J.Hanson,Model Implementation and Test Package

for the Extended BLAS,Argonne National Laboratory Report,ANL MCS-TM81,August1986.

[8] B.S.Garbow,J.M.Boyle,J.J.Dongarra,C.B.Moler,Matrix Eigensystem Routines-EISPACK

Guide Extension,Lecture Notes in Computer Science,Vol.51,Springer-Verlag,Berlin,1977.

[9] https://www.doczj.com/doc/361198287.html,wson,R.Hanson,D.Kincaid,and F.Krogh,‘‘Basic Linear Algebra Subprograms for Fortran

Usage,’’ACM Transactions on Mathematical Software5(1979),308-323.

[10] https://www.doczj.com/doc/361198287.html,wson,R.Hanson,D.Kincaid,and F.Krogh,‘‘Algorithm539:Basic Linear Algebra Subpro-

grams for Fortran Usage,’’ACM Transactions on Mathematical Software5(1979),324-325.

[11] B.T.Smith,J.M.Boyle,J.J.Dongarra,B.S.Garbow,Y.Ikebe,V.C.Klema,and C.B.Moler,Matrix

Eigensystem Routines-EISPACK Guide,Lecture Notes in Computer Science,Vol.6,2nd edition, Springer-Verlag,Berlin,1976.

[12]IEEE Standard for Binary Floating-Point Arithmetic,ANSI/IEEE Std754-1985,The IEEE Inc.,

New York,(1985).

Appendix A

In this appendix we illustrate how to use the full matrix update routines to obtain rank-one and rank-two updates to band matrices.We assume that the vectors x and y are such that no?ll-in occurs out-side the band,in which case the update affects only a full rectangle within the band matrix A.This is illustrated in Fig.A.1for the case where m=n=9,kl=2,ku=3and the update commences in row(and column)l=3.

(****)(0)(00****000) (***__*__*__)(0)

(**|****|)(*)

(*|****|*)(*)

(|*__*__*__*|**)+(*)

(******)(0)

(*****)(0)

(****)(0)

(***)(0)

A + xy T

Figure A.1

We see that the update affects only that part of A indicated by the dotted lines,that is,the(kl+1)by (ku+1)part of A starting at a

ll

.

The routines that we could have included are_GBR,_SBR,and_SBR2(in the complex case_HBR and _HBR2).Their argument lists could have been

_GBR(M,N,KL,KU,L,ALPHA,X,INCX,Y,INCY,A,LDA)

_SBR(UPLO,N,K,L,ALPHA,X,INCX,A,LDA)

_SBR2(UPLO,N,K,L,ALPHA,X,INCX,Y,INCY,A,LDA)

where the argument L denotes the starting row and column for the update and the elements x

l and y

l

,of

the vectors x and y,are in elements X(1)and Y(1)of the arrays X and Y.

Calls to SGBR can be achieved by

KM=MIN(KL+1,M-L+1)

KN=MIN(KU+1,N-L+1)

CALL SGER(KM,KN,ALPHA,X,INCX,Y,INCY,A(KU+1,L),MAX(KM,LDA-1))

Calls to SSBR can be achieved by

KN=MIN(K+1,N-L+1)

IF(UPLO.EQ.’U’)THEN

CALL SSYR(’U’,KN,ALPHA,X,INCX,A(K+1,L),MAX(1,LDA-1)) ELSE

CALL SSYR(’L’,KN,ALPHA,X,INCX,A(1,L),MAX(1,LDA-1)) ENDIF

and similarly for calls to SSBR2.

Appendix B

In this appendix we propose an additional set of real and complex level2routines which allow extended precision matrix-vector operations to be performed.The names of these routines are obtained by preceding the character representing the Fortran data type(S or C),by the character E.The matrix is always stored in working precision(which is single precision for the ES-or EC-set of routines,and dou-ble precision for the ED-or EZ-set).The computation must be performed in extended precision(which is at least double precision for ES-or EC-set,and at least quadruple precision for the ED-or EZ-set).

Such routines are useful,for example,in the accurate computation of residuals in iterative re?nement.Many machines have extended precision registers in which extended precision computation is performed at little or no extra cost.However,in order to allow the additional precision to be carried through a series of calls to these routines,at least one,in some cases both,of the vectors x and y must be stored in extended precision.

These routines are to perform the operations described in section2as follows.

For the matrix-vector operations

y←αAx + βy, y←αA T x + βy, y←αA T x + βy

α,β, A,and x are working precision,y is extended precision and the computation of y is to be performed in extended precision.

For the triangular operations

x← Tx, x← T T x, x← T T x,

x← T?1x, x← T?T x, x← T ?T x

T is working precision,x is extended precision and the computation of x is to be performed in extended precision.

For the rank-one and rank-two updates

A←αxy T + A, A←αxy T + A,

H←αxx T + H, H←αxy T + α yx T + H,

α,A,and H are working precision,x and y are extended precision and the computation is to be per-formed in extended precision.

fortran调试经验

FORTRAN调试程序的时候注意的问题 调试程序的时候注意的问题。 程序编好,能够直接运行而且完全正确,基本不可能,这就有调试程序的问题。主要有一下几个方面: 其一,在每个子程序被调用的时候留个心眼,写个输出语句,表示程序已经运行到了这里。这样一个小提示会给调试带来巨大的方便,如果程序运行出错,至少你可以知道它是在运行到哪里出的错,这样,直接去检查那个程序就可以了。不必重头开始检查。 其二,注意对中间计算结果的输出。有时候,而且是很多的时候,程序编译成功,运行也没有问题,就是结果不对,这肯定是计算原理有问题,此时,输入一些重要步骤的中间结果,往往可以检查出问题所在。同时,就算查出了问题所在,也可以不删除这段输出中间计算结果的代码,有可能后面还会有用处,此时,在每行输出语句前加一个感叹号,把代码变成注释的绿体字就可以了。等到再次需要输出的时候,直接删除一个“!”比再写一遍输出代码,当然要简单的多。 其三,对WATCH功能的应用,FORTRAN提供的这个功能很实用,可以查很多问题,尤其是程序中间计算值,这个和上述的中间结果的输出有点相似。但两者的不同是前者可以进行中间结果的输出控制,就是只有符合了某个条件的才能被输出,这样可以便捷程序的调试,同时对中间结果输出后可以直接用STOP停止程序的运行,这样对于大型程序来说,节省了很多后面继续计算的时间——因为前面的结果已经不对了,后面的算也是白算。 其四,对中间结算结果输出形式的控制,一般来说,FORTRAN计算结果可以输出到文件里面和计算界面两个地方。对较大的计算结果,可以输出到文件里面,反之较少的结果可以直接输出到屏幕上,为了增强数据的可读性,最好进行有格式的数据输出,以利于相同性质的数据的比较。输出到屏幕上的结果直接用WRITE(6,*)就可以(无格式),对于输出到文件里面的数据,可以省些事情,直接用WRITE(X,*)就可以,其中X是一个任意的正整数,最好大于10,也不用事先对这个X设备进行说明,程序会将结果输出到一个FORT.X的文件里面,例如10,就是FORT.10,此时,用NOTEPAD或者ULTRA-EDIT都可以把它打开——FORT.10实质上就是一个.DAT的文件,你可以把它重命名。 3.对数据计算时的误差控制。 以前觉得小数点后的误差不是那回事,没有太在意,可经过实战,终于明白了小小的误差完全可以改变整个计算的结果。因此,如果程序能够输入结果而不正确时,除了寻找算法的问题,不要忽略的误差。一般认为,FORTRAN的REAL变量小数点后8位数字误差定义已经足够,而事实上,这个精度可能在一些情况下不满足,这个时候,需要用更精确的变量类型——REAL(8),同理,当要判断两个数是否相等的时候,一定要慎用相等判断(.EQ.)这个比较运算符,因为任何数据,别看着在现实中它们一定相等,在程序中就不一定了。一旦经过了计算,就不可避免的产生了舍入误差,对于整数和有限几位循环的有理数都问题不大,可一旦是一个无理数或者无限循环的小数,只有在判断了小数点后的每一位都相等的时候,程序才判断为相等成立。这个相等的标准是非常苛刻的,所以一般情况下,可行的方法是将

FORTRAN内部函数库

FORTRAN内部函数 用FORTRAN解题往往要用到一些专门运算,如求三角函数sinx, cosx,对数lnx,指数ex,求一组数中最大数和最小数等。 FORTRAN提供了一些系统函数(称为内部函数)来完成这些运算。程序设计者不必自己设计进行这些运算的语句组(即程序段或子程序),只需写出一个函数的名字以及给出一个或若干个自变量,就可以得到所需的值,例如: SQRT(4.0)求出4.0的平方根, SIN(2.0)求出2(弧度)的正弦值, EXP(3.5)求出e3.5, LOG(3.0)求出3, 常用的函数如下表,FORTRAN77提供的全部函数明细见FORTRAN77内部函数。 函数名含义应用例子相当于数学上的运算 ABS 求绝对值ABS(A) |a| EXP 指数运算EXP(A) e^a SIN 正弦值SIN(X) sin x COS 余弦值COS(X) cos x ASIN 反正弦ASIN(X) sin^(-1)a ACOS 反余弦ACOS(X) cos-1a TAN 正切TAN(X) tan x ATAN 反正切ATAN(A) tan^(-1)a LOG 自然对数LOG(A) lna,或loge(a) LOG10 常用对数LOG10(A) log10a INT 取整INT(A) int(a),取a的整数部分 MOD 求余MOD(A1,A2) a - int(a1/a2)*a2 SIGN 求符号SIGN(A1,A2) |a1|(若a2>0) -|a1|(若a2<0) REAL 转换为实型REAL(I) MAX 求最大值MAX(A1,A2,A3) max(a1,a2,a3) MIN 求最小值 MIN(A1,A2,A3) min(a1,a2,a3) 说明: (1)FORTRAN77将这些系统函数分别编成一个个子程序,组成函数库,存贮于外部介质(如磁盘)上。在完成源程序的编译之后,用LINK命令实现连接,即将已翻译成二进制指令的目标程序与函数库连接。也就是将程序中出现函数名的地方用函数库中相应的一组指令代入之,组成一个统一的“可执行目标块”。例如,程序中出现一个SIN函数,在连接时就将一组二进制指令(它们是实现求正弦值的运算的)直接插入到程序中出现SIN的地方。由于是插入到程序内部的,所以称为“内部函数”。 (2)一个内部函数要求一个或多个自变量。例如,SQRT函数只能有一个自变量SQRT(4.6),MOD函数要求两个自变量MOD(8,3),MAX和MIN函数要求两个以上自变量MAX(6,-8,10),MIN(-6,8,0)。当自变量个数规定为2个时,自变量的顺序不应任意颠倒,MOD(8,3)表示8被3除的余数,其值为2,而MOD(3,8)则表示3被8除的余数,其值为3。当自变量个数>2时,自变量的顺序无关,MAX(6,8,10)和MAX(8,10,6)结果是一样的。

MATLAB 与C C + + 、FORTRAN语言混合编程

MATLAB 与C/ C + + 、FORTRAN语言混合编程摘要:对MATLAB 与C/ C + + 和FORTRAN 语言进行混合编程的常用方法进行了介绍,分析了其实现方式和各自的利弊,并用实例对MEX 文件实现方式进行了较详细的论述. 关键词: MATLAB ; C/ C + + ; FORTRAN ; 混合编程 中图分类号: TP313 文献标识码: A 文章编号:16722948X(2004) 0620547205 1 混合编程的意义及其实现方式 1. 1 混合编程的意义 MATLAB 语言具有功能强大、开发效率高等诸 多优点, 已在工程实际中得到广泛应用, 但是与 FORTRAN、C/ C + + 等语言相比,其执行效率较低, 特别是当程序中含有大量循环语句(例如迭代计算) 时,MATLAB 就有些力不从心, 速度很慢, 而运用 FORTRAN 等擅长数值计算语言进行编程,其运行效 率高. 一方面,如果放弃MATLAB 强大功能和数量 众多的应用工具箱,无疑是资源的极大浪费. 另一方 面,针对工程实际,人们用FORTRAN、C/ C + + 语言 已编写了大量实用程序,如果将其重新改写成M 文 件移植到MATLAB 环境中,不仅要花费大量的时间 和精力,而且有时还降低了其运行效率. 可否将二者 优势互补呢? 混合编程就是其有效的解决途径. 1. 2 混合编程的实现 正是考虑到上面这些原由,MATLAB 系统提供 了其应用程序接口(Application Program Interface) 来 解决这些问题. API 主要包括3 部分:MEX 文件——— 外部程序调用接口,用来在MATLAB 环境下调用 FORTRAN、C/ C + + 语言编写的程序;MAT 文件应 用程序———数据输入输出接口,用于MATLAB 系统 与外部环境之间互传数据; 计算引擎函数库——— MATLAB 处于后台作为一个计算引擎,与其它应用 程序建立客户机/ 服务器关系,在其它应用程序中调 用[1 ,2 ] . 1. 2. 1 MEX 文件 MEX 文件是按照一定格式,用FORTRAN 或C/ C + + 语言编写的源程序,在MATLAB 下借助相应 的编译器,生成的动态链接函数的统称. 在Windows 操作系统下,是用MATLAB 附带的批处理mex. bat 来编译生成文件后缀名为. dll (Dynamic Link Li2 brary) 动态链接库文件,该文件可在MATLAB 环境 下,像命令函数一样直接运行和调用,使用起来极为 方便. 采取MEX 文件方式,是重复利用已有FOR2 TRAN、C/ C + + 程序,让MATLAB 和FORTRAN、

fortran语法手册

1 FORTRAN77四则运算符 + - * / ** (其中**表示乘方) 在表达式中按优先级次序由低到高为:+或-→*或/→**→函数→() 2 FORTRAN77变量类型 2.1 隐含约定:I-N规则 凡是以字母I,J,K,L,M,N六个字母开头的,即认为是整型变量,其它为实型变量。 如IMPLICIT REAL (I,J) 三种定义的优先级别由低到高顺序为:I-N规则→IMPLICIT语句→类型说明语句,因此,在程序中IMPLICIT语句应放在类型说明语句之前。 2.4 数组的说明与使用 使用I-N规则时用DIMENSION说明数组,也可在定义变量类型同时说明数组,说明格式为:数组名(下标下界,下标上界),也可省略下标下界,此时默认为1,例: DIMENSION IA(0:9),ND(80:99),W(3,2),NUM(-1:0),A(0:2,0:1,0:3) REAL IA(10),ND(80:99)使用隐含DO循环进行数组输入输出操作:例如WRITE(*,10) ('I=',I,'A=',A(I),I=1,10,2) 10FORMAT(1X,5(A2,I2,1X,A2,I4)) 2.5 使用DATA语句给数组赋初值 变量表中可出现变量名,数组名,数组元素名,隐含DO循环,但不许出现任何形式的表达式:例如 DATA A,B,C/-1.0,-1.0,-1.0/ DATA A/-1.0/,B/-1.0/,C/-1.0/ DATA A,B,C/3*-1.0/CHARACTER*6 CHN(10)

DATA CHN/10*' '/INTEGER NUM(1000) DATA (NUM(I),I=1,500)/500*0/,(NUM(I),I=501,1000)/500*1/ 3 FORTRAN77程序书写规则 程序中的变量名,不分大小写; 变量名称是以字母开头再加上1到5位字母或数字构成,即变更名字串中只有前6位有效; 一行只能写一个语句; 程序的第一个语句固定为PROGRAM 程序名称字符串 某行的第1个字符至第5个字符位为标号区,只能书写语句标号或空着或注释内容; 某行的第1个字符为C或*号时,则表示该行为注释行,其后面的内容为注释内容; 某行的第6个字符位为非空格和非0字符时,则该行为上一行的续行,一个语句最多可有19个续行; 某行的第7至72字符位为语句区,语句区内可以任加空格以求美观; 某行的第73至80字符位为注释区,80字符位以后不能有内容。 4 FORTRAN77关系运算符 .GT. 大于 .GE. 天于或等于 .LT. 小于 .LE. 小于或等于 .EQ. 等于 .NE. 不等于 .AND. 逻辑与 .OR. 逻辑或 .NOT. 逻辑非 .EQV. 逻辑等 .NEQV. 逻辑不等 运算符优先级由高到低顺序为:()→**→*或/→+或-→.GT.或.GE.或.LT. 或.LE.或.EQ.或.NE.→.NOT.→.AND.→.OR.→.EQV.或.NEQV 5 FORTRAN77语句

fortran常见问题解决

楼主为了减少重复回答问题,特编此帖,并不定期添加和更新内容。 错误难免,欢迎讨论,仅供参考。 很多人问哪里可以找到Fortran编译器,有不少热心学友提供网址,特汇集在这里。虽然俺检验过这些链接,但是它们不一定总有效。 Fortran编译器下载: CVF? FTN95(License:Freeforpersonaluse) 以下操作,如无特别说明,都是以为例。 1.如何加大Stacksize? 选Project=>Settings=>Link=>Category:Output=>? Stackallocations Reserve:这里填新值(默认为1M,若需要10M,则填) 2.如何用Fortran批量生成文件? 设要生成4000个文件,文件名为AA1-AA4000,如何写循环生成文件,而不用写4000次write 命令呢? 用内部文件: character(len=80)::filename,form integer::i doi=1,4000 selectcase(i) case(1:9) write(form,'(i1)')i case(10:99) write(form,'(i2)')i case(100:999) write(form,'(i3)')i case(1000:9999) write(form,'(i4)')i endselect write(filename,*)"AA",trim(form),".TXT" open(10,file=filename) write(10,*)i close(10)

enddo? stop end 3.如何用Fortran动态生成输出格式? 设有一个数组data(100),输出时,希望每行输出num个数,而num由用户输入,如何实现? 用内部文件: character(len=80)::form real::data(100) integer::i,num data=(/(i,i=1,100)/)/ read(*,*)num write(form,*)"(",num,"" write(*,form)data stop end 4.MS是不是很垃圾? 是垃圾,其中Bug太多,多到不可用的地步! 在这个主题里,换了CVF后问题就没了的人已有相当的数目。 如果你用,遇到莫名其妙的错误,建议换,这是一个比较成熟的编译器。 5.如何用F90/95生成随机数? 注意: 现在计算机产生的随机数都是伪随机数。 random_number(x)产生一个0到1之间的随机数(x可以是向量),但是每次总是那几个数。用了random_seed()后,系统根据日期和时间随机地提供种子,使得随机数更随机了。 programrandom implicitnone real::x callrandom_seed()!系统根据日期和时间随机地提供种子 callrandom_number(x)!每次的随机数就都不一样了 write(*,*)x stop endprogramrandom 6.函数/子程序超载的例子

(完整)Fortran经典编程语言笔记(你值得拥有)

FORTRAN笔记 2014.10.07 目录 第七讲_FORTRAN的基本知识.ppt (2) FORTRAN语言程序设计初步 (2) FORTRAN源程序的书写格式(以77为例) (2) 变量 (2) 变量类型 (2) 算术运算符和运算优先级 (3) 赋值语句 (3) 参数语句(PARAMETER语句) (3) END语句 (3) PAUSE语句 (3) 逻辑运算和选择结构 (4) 关系表达式 (4) FORTRAN中数组的定义及使用 (4) 其他 (5) 1. fortran语言定义CHARACTER*6 TTL(14,3),CNAM(400)是什么意思? (5) 2. fortran里character*10 是什么意思 (5) 3. Fortran中kind是什么函数? (5)

第七讲_FORTRAN的基本知识.ppt FORTRAN语言程序设计初步 FORTRAN是Formula Translation的缩写,意为“公式翻译”,它是为科学、工程问题或企事业管理中的那些能够用 数学公式表达的问题而设计的,其数值计算的功能较强。 常用的是FORTRAN77和FORTRAN90两种标准。 1、一个程序由若干个程序单位组成。主程序和每一个子程序分别是一个独立的程序单位。 2、每一个程序单位都是以“END”结束的。 3、一个程序单位包括若干行。 1)语句行。由一个FORTRAN语句组成。 2)非语句行,即注释行。 4、FORTRAN程序中的语句可以没有标号,也可以有标号,根据需要而定。标号的作用是标志一个语句以便被其 他语句引用。 5、一个程序单位中各类语句的位置是有一定规定的。 6、FORTRAN源程序必须按一定的格式书写。 FORTRAN源程序的书写格式(以77为例) 每一行有80列,分别如下: 1、第1-5列为标号区。一行中第一列为“C”或“*”,该行即被认为是注释行。 2、第6列为“续行标志区”,如果在一行的第6列上写一个非空格和非零的字符,则该行作为其上一行的续行。 3、第7-72列为语句区。 4、第73-80列,注释区。 变量 变量名:一个变量需要用一个名字(变量名)来识别。在同一个程序单位中不能用同一个变量名代表两个不同的变 量。 FORTRAN的变量名按以下规则选定: 1)第一个字符必须是字母,即变量名必须以字母开头; 2)在一个字母后面可以跟1-5为数字或字母。 如果选定的变量名超过6个字符,则只有前面6个字符有效。 注:在变量名中大写与小写字母是等价的。 变量类型 整型变量Integer、实型变量Real、双精度变量Double Precision、复型变量Complex、逻辑型变量Logical和字符型变量Character。 1、隐含约定(I-N规则) FORTRAN规定:在程序中的变量名,凡以字母I,J,K,L,M,N六个字母开头的,即认为该变量为整型变量。 在程序中,凡是变量名以字母I,J,K,L,M,N,i,j,k,l,m,n开头的变量被默认为整型变量,以其他字母开头的变量被 默认为实型变量。 2、用类型说明语句确定变量类型 1)INTEGER语句(整型说明语句) 2)REAL语句(实型说明语句) 3)DOUBLE PRECISION语句(双精度说明语句) 4)COMPLEX语句(复型说明语句) 5)LOGICAL语句(逻辑型说明语句)

C语言转换为fortran语言

C/C++采用的是缺省调用约定是STDCALL约定.在C程序中,可以在函数原型的声明中使用_stdcall关键字来指明过程采用STDCALL调用约定。 Fortran过程采用的缺省标识符是,全部大写的过程名加上“_”前缀和“@n”后缀。在C程序中保留标识符的大小写。编译程序会给采用STDCALL约定的过程标识符加上“_”前缀和“@n”后缀。 Fortran过程缺省的参数传递方式是引用方式是。对于下面这个Fortarn过程:SUBROUTINE ForSub(ivar,rvar) INTEGER ivar REAL rvar WRITE(*,*) ivar,rvar END 在C语言程序中应给出过程的函数原型及调用方式为: void main() { extern void__stdcall FORSUB(int*I,float*f); int iCV AR=1; float rCV AR=2.0; FORSUB(&iCV AR,&rCV AR); } 在C++中调用Fortan的过程,在声明函数原型时需要用extern“C”语句,以避免C++编译程序对标识符的修饰;并且C++也可以通过引用方式传递参数。对于上面的Fortran过程,C++程序应给出的函数原型及调用方法是: void main() { extern “C”{void__stdcall FORSUB(int*I,float*f);} int iCV AR=1; float rCV AR=2.0; FORSUB(&iCV AR,&rCV AR); } 另外,也可以在Fortran中用!MS$ATTRIBUTES编译伪指令来改变Fortran子过程的调用约定,以便于被其他语言的程序调用。在下面的例子中,过程ForSub具有C语言的调用约定。 SUBROUTINE ForSub(ivar,rvar) !MS$ATTRIBUTES C::ForSub INTEGER ivar REAL rvar WRITE(*,*) ivar,rvar END 这样,这个过程使用的是C调用约定,并且参数传递方式也变为传值方式,过程的标识符变为全部小写且有_前缀而无后缀的方式。在C语言源程序中的函数原型及调用方法为:void main() { extern void FORSUB(int ivar,float rvar); int iVar=1;

fortran课后习题答案

第一章 FORTRAN程序设计基础第15页 1、2 1.简述程序设计的步骤。 “程序设计”:反映了利用计算机解决问题的全过程,通常要经过以下四个基本步骤:(1)分析问题,确定数学模型或方法;(2)设计算法,画出流程图;(3)选择编程工具,编写程序;(4)调试程序,分析输出结果。 2. 什么是算法?它有何特征?如何描述算法? 解决问题的方法和步骤称为算法。 算法的五个特征:(1) 有穷性。 (2) 确定性。 (3) 有效性。 (4) 要有数据输入。(5) 要有结果输出。 算法的描述有许多方法,常用的有:自然语言、一般流程图、N-S图等。 第二章顺序结构程序设计 第29页 1、2、3、4、5、6、7、8、9 1.简述符号常量与变量的区别? 符号常量在程序运行过程中其值不能改变。变量在程序运行过程中其值可以改变。 2. 下列符号中为合法的FORTRAN 90标识符的有哪些? (1) A123B (2) M%10 (3) X_C2 (4) 5YZ (5) X+Y (6) F(X) (7) COS(X) (8) A.2 (9) ‘A’ONE (10) U.S.S.R. (11) min*2 (12) PRINT 3. 下列数据中哪一些是合法的FORTRAN常量? (1) 9,87 (2) .0 (3) 25.82(4) -356231 (5) 3.57*E2 (6) 3.57E2.1 (7) 3.57E+2(8) 3,57E-2 4. 已知A=2,B=3,C=5(REAL);且I=2,J=3(INTEGER),求下列表达式的值: (1) A*B+C 表达式的值: 11 (2) A*(B+C) 表达式的值: 16 (3) B/C*A 表达式的值: 1.2 (4) B/(C*A) 表达式的值: 0.3 (5) A/I/J 表达式的值: 0.33 (6) I/J/A 表达式的值: 0 (7) A*B**I/A**J*2 表达式的值: 4.5 (8) C+(B/A)**3/B*2. 表达式的值: 7.25 (9) A**B**I 表达式的值: 512 5. 将下列数学表达式写成相应的FORTRAN表达式: (1) 1E-2 (2)(-B+SQRT(B*B-4*A*C)/(2*A) (3) 1+X+X*X/2+X**3/2/3 (4) COS(ATAN((A**3+B**3)**(1.0/3)/(C*C+1))) (5) EXP(A*X**2+B*X+C) (6) COS(X*Y/SQRT(X*X+Y*Y))**3 6. 用FORTRAN语句完成下列操作: (1) 将变量I的值增加1。I=I+1 (2) I的立方加上J,并将结果保存到I中。 I=I**3+J (3) 将E和F中大者存储到G中。G=Max(E,F) (4) 将两位自然数N的个位与十位互换,得到一个新的数存储到M中(不考虑个位为0的情况) M=MOD(N,10)*10+N/10 第三章选择结构程序设计第43页 1、2、3、5、6、7、9 1.分析下列程序运行结果 (1) LOGICAL P INTEGER I,I1,I2,I3 P=.FALSE. READ*,I I1=MOD(I,10) I2=MOD(I/10,10) I3=I/100

fortran基本函数

FORTRAN 90标准函数(一) (2012-07-03 17:14:57) 转载▼ 分类:学习 标签: fortran 函数 教育 符号约定: ●I代表整型;R代表实型;C代表复型;CH代表字符型;S代表字符串;L代表逻辑型;A代表数组;P代表指针;T代表派生类型;AT为任意类型。 ●s:P表示s类型为P类型(任意kind值)。s:P(k)表示s类型为P类型(kind值=k)。 ●[…]表示可选参数。 ●*表示常用函数。

注:三角函数名前有C、D的函数为复数、双精度型函数。 注:指数函数名、平方根函数名、对数函数名前有C、D的函数为复数、双精度型函数。 表4 参数查询函数

atan2函数的值域是多少?我从网上找到一个fortran函数的日志,说此值域是-π~π,但正常反正切函数的值域应该是-π/2~π/2。对atan2函数不够了解,所以不知道你的答案对不对,我个人认为不对。我是用正常的反正切函数atan(v/u)来算的: FORTRAN: if (u>0..and.v>0.) dir=270-atan(v/u)*180/pi if (u<0..and.v>0.) dir=90-atan(v/u)*180/pi if (u<0..and.v<0.) dir=90-atan(v/u)*180/pi if (u>0..and.v<0.) dir=270-atan(v/u)*180/pi if (u==0..and.v>0.) dir=180 if (u==0..and.v<0.) dir=0 if (u>0..and.v==0.) dir=270 if (u<0..and.v==0.) dir=90 if (u==0..and.v==0.) dir=999 其中uv等于零的五种情况要单独挑出来,不然程序会有瑕疵。atan函数换成atand函数的话直接是度数,不用*180/pi 我四个象限和轴都试了,应该没错。 最需要注意的问题,一个是函数值域,另一个是uv矢量方向和风向是反着的,并且风向角度数是从正Y轴开始顺时针算,和三角函数里度数从正X轴开始逆时针算不一样。

C与C++与FORTRAN混合编程

C/C++/FORTRAN 混合编程 混合编程在软件编程中是经常遇到的问题,尤其是C/C++/FORTRAN的混合编程,本文主要说明以上三种语言混合编程中经常遇到的问题,同时,也说明了不同平台下混合编程应注意的问题。 混合语言编程要注意的问题主要体现在:函数调用和数据结构的存储。 1 Windows平台 函数:由于Fortran编程语言没有大小写之分,Windows平台下的混合语言编程要注意的主要是大小写的问题。考虑到编译器的差异,可以用下面的方式进行跨平台编程的函数声明。( C/C++编译器使用Microsoft Visual C++ 6.0, Fortran编译器使用 Digital Visual Fortran 6.0)。 假设一个C的函数为 void cFunction(); 那么,只需要在它的头文件里面进行如下定义即可: #ifdef __cplusplus extern “C” void { #endif extern void __stdcall CFunction(); #define cFunction CFUNCTION #ifdef __cplusplus } #endif 这样,在Fortran或者C++的程序里面就可以直接调用了。 假设是一个Fortran函数SUBROUTINE FFUNCTION(); 那么,在C++头文件里进行如下的定义就可以了: #ifdef __cplusplus extern “C” void { #endif extern void __stdcall ffunction(); #define ffunction FFUNCTION #ifdef __cplusplus } #endif 这样,就可以在C++的程序里面直接调用。由于C编译器里面,没有定义__cplusplus这个环境变量,因此,C文件里面,也可以直接使用这个头文件。如果是一个C++函数,如: void cPlusplusFunction();和c函数一样,进行下面的定义即可: #ifdef __cplusplus extern “C” void { #endif extern void __stdcall cPlusplusFunction (); #define cPlusplusFunction CPLUSPLUSFUNCTION #ifdef __cplusplus }

fortran常见错误

FAQ之常见错误 2014-02-02 13:45:35 来源:Fcode研讨团队评论:2点击:4419 本文从编译错误,链接错误,运行时错误,计算结果错误等四个方面介绍了常见的错误及解决思路。适合初学者阅读。 首先应该明确:错误有哪几种?我们当前遇到的是何种错误? 阐述这些问题前,我们先讨论一下常规的应用程序开发的过程: 1>>编写代码,使用一个或多个源代码文件。 2>>对第一步的每一个源代码文件执行编译操作。得到一个或若干个目标代码。 3>>将目标代码,运行时库(Run-time Library)和其他使用到的函数库链接起来。得到一个可执行文件(EXE 或其他) 4>>编写程序的说明书,必要的(输入)数据文件 5>>将上述得到的结果发布给用户。(发布的方式可以是刻录成光盘,销售,放在网站上供别人下载,或者其他) 6>>用户得到程序后,运行,输入数据,得到计算结果。 对于很多 Fortran 程序员来说,可能用户就是自己,也可能仅仅是自己教研室的同事同学。所以第4,5,6步骤很多时候不明显。而如果使用集成开发环境(IDE)进行开发,第1,2,3步骤又可以一键完成。因此,很多初学者就认为,写程序就是:输入代码,运行,得到结果。这样的理解太狭义。 不管我们面对什么使用者来写代码,程序开发应该是上述的过程。我们的编译器,编译环境,也是为这个过程而设计的。 于是,我们将错误分为四种: 一. 编译错误(发生在第2步) 编译错误,一般是源代码书写格式不正确,不符合语法要求。 二. 链接错误(发生在第3步) 链接错误,一般是源代码结构不完整,运行时库或函数库使用不合理。 三. 运行时错误(发生在第6步) 运行时错误,一般是执行代码时,遇到了事先未料及的错误。比如内存不足了,磁盘空间不够了,输入文件格式不对了,输出文件写入失败了等等。 四. 计算结果不符合预期(程序代码不规范,或不符合你的设想) 计算结果不符合预期,可能性就很多了。语法与你的想法不一致,超出函数库的适用范围,执行流程控制不当等等。 这四种错误,其排查难度依次增大。也就是,编译错误最容易排查和修改,而计算结果不正确,最让人头疼。

fortran中批处理实现

********************************************* fortran中批处理命令的实现函数: 利用systemqq命令(需要调用DFLIB 数据库) ********************************************* 例1: USE DFLIB character*100 CMD LOGICAL(4) res CMD="dir/a-d/b/s "//trim(fPath)//" >"//trim(outPut) res=SYSTEMQQ(CMD) 例2: USE DFLIB LOGICAL(4) result result = SYSTEMQQ('copy e:\dir.txt e:\test\dir.txt') !将e:\dir.txt 复制到e:\test\dir.txt文件中。!****************实例3:复制文件************************* 例3: programmain_pro USE DFLIB implicit none integer,parameter::sta_num=123 character(5),dimension(sta_num)::sta_ID character(500)::filein,fileout character(5000)::cmd logical(4)::judge

integer::status,is open(1,file='山东.txt',status='old',action='read',iostat=status) read(1,*) do is=1,sta_num read(1,*) sta_ID(is) filein='Z:\data\降水逐小时数据-戴至修\precip_data\'//sta_ID(is)//'_precip.txt' open(2,file= filein,status='old',action='read',iostat=status) if(status/=0) goto 1000 fileout='Z:\data\降水逐小时数据-戴至修\山东省-降水数据\'//sta_ID(is)//'_precip.txt' cmd='copy '//filein//' '//fileout judge=SYSTEMQQ( cmd) 1000 continue enddo end program

Fortran常用函数

1、RANDOM_NUMBER Syntax ['sint?ks] n. 语法 CALL RANDOM_NUMBER (harvest结果) Intrinsic Subroutine(固有子程序):Returns a pseudorandom number greater than or equal to zero and less than one from the uniform distribution. 返回大于或等于0且小于1,服从均匀分布的随机数 2、RNNOA/ DRNNOA (Single/Double precision) Generate pseudorandom numbers from a standard normal distribution using an acceptance/rejection method. 产生服从标准正态分布的随机数 Usage(用法) CALL RNNOA (NR, R) Arguments(参数) NR— Number of random numbers to generate. (Input) 要产生随机数的个数 R— Vector of length NR containing the random standard normal deviates. (Output) 输出长度为NR,随机正态分布的向量 Comments(注解) The routine RNSET can be used to initialize the seed of the random number generator. The routine RNOPT can be used to select the form of the generator. 程序RNSET可以用来初始化随机数发生器的种子 Example In this example, RNNOA is used to generate five pseudorandom deviates from a standard normal distribution. INTEGER ISEED, NOUT, NR REAL R(5) EXTERNAL RNNOA, RNSET, UMACH C CALL UMACH (2, NOUT) NR = 5 ISEED = 123457 CALL RNSET (ISEED) CALL RNNOA (NR, R) WRITE (NOUT,99999) R 99999 FORMAT (' Standard normal random deviates: ', 5F8.4) END Output Standard normal random deviates: 2.0516 1.0833 0.0826 1.2777 -1.2260

fortran语言内部函数

附录 FORTRAN 90标准函数 符号约定: ●I代表整型;R代表实型;C代表复型;CH代表字符型;S代表字符串;L代表逻辑型;A代表数组;P代表指针;T代表派生类型;AT为任意类型。 ●s:P表示s类型为P类型(任意kind值)。s:P(k)表示s类型为P类型(kind值=k)。 ●[…]表示可选参数。 ●*表示常用函数。 表1 数值和类型转换函数 函数名说明 ABS(x)*求x的绝对值∣x∣。x:I、R,结果类型同x; x:C,结果:R AIMAG(x)求x的实部。x:C,结果:R AINT(x[,kind])*对x取整,并转换为实数(kind)。x:R, kind:I,结果:R(kind) AMAX0(x1,x2,x3,…)*求x1,x2,x3,…中最大值。x I:I,结果:R AMIN0(x1,x2,x3,…)*求x1,x2,x3,…中最小值。x I:I,结果:R ANINT(x[,kind])*对x四舍五入取整,并转换为实数(kind)。x:R, kind:I,结果:R(kind) CEILING(x)*求大于等于x的最小整数。x:R,结果:I CMPLX(x[,y][,kind]))将参数转换为x、(x,0.0)或(x,y)。x:I、R、C, y:I、R,kind:I,结果:C(kind) CONJG(x)求x的共轭复数。x:C,结果:C DBLE(x)*将x转换为双精度实数。x:I、R、C,结果:R(8) DCMPLX(x[,y])将参数转换为x、(x,0.0)或(x,y)。x:I、R、C, y:I、R,结果:C(8) DFLOAT(x)将x转换为双精度实数。x:I,结果:R(8) DIM(x,y)*求x-y和0中最大值,即MAX(x-y,0)。x:I、R, y的类型同x,结果类型同x DPROD(x,y)求x和y的乘积,并转换为双精度实数。x:R, y:R,结果:R(8)

fortran问题

imsl7.0中用use linear_operators这句话是会出错的,当时intel论坛上也有人问,后来intel 给出了个X64 imsl的补丁,但是32位的没有。 要使用包含在linear_operators这个库中的函数时,要用use+原函数 例如:上面那个例子把use linear_operators改成use operation_xt即可 另外,imsl7.0引用函数和以前版本不一样的 补充一点: linear_operators这个文件是这样子的(一看就知道原因了): modulelinear_operators usecond_int usedet_int usediag_int usediagonals_int usefft_int useifft_int useeye_int uselin_eig_self_int uselin_sol_self_int usenorm_int useoperation_i useoperation_ix useoperation_t useoperation_h useoperation_tx useoperation_hx useoperation_x useoperation_xi useoperation_xt useoperation_xh useorth_int userand_int userank_int usesvd_int useunit_int useeig_int usechol_int useisnan_int end module 1. 如何加大Stack size? 选Project => Settings => Link => Category: Output =>

常用基本函数

R语言基本函数 一、数据管理 vector:向量 numeric:数值型向量 logical:逻辑型向量 character;字符型向量 list:列表 data.frame:数据框 c:连接为向量或列表 length:求长度 subset:求子集 seq,from:to,sequence:等差序列 rep:重复 NA:缺失值 NULL:空对象 sort,order,unique,rev:排序 unlist:展平列表 attr,attributes:对象属性 mode,typeof:对象存储模式与类型 names:对象的名字属性 二、字符串处理 character:字符型向量 nchar:字符数 substr:取子串 format,formatC:把对象用格式转换为字符串paste,strsplit:连接或拆分 charmatch,pmatch:字符串匹配 grep,sub,gsub:模式匹配与替换 三、复数 complex,Re,Im,Mod,Arg,Conj:复数函数 四、因子 factor:因子 codes:因子的编码 levels:因子的各水平的名字 nlevels:因子的水平个数 cut:把数值型对象分区间转换为因子 table:交叉频数表 split:按因子分组 aggregate:计算各数据子集的概括统计量 tapply:对“不规则”数组应用函数 数学 一、计算 +, -, *, /, ^, %%, %/%:四则运算 ceiling,floor,round,signif,trunc,zapsmall:舍入 max,min,pmax,pmin:最大最小值 range:最大值和最小值 sum,prod:向量元素和,积 cumsum,cumprod,cummax,cummin:累加、累乘 sort:排序 approx和approx fun:插值 diff:差分 sign:符号函数 二、数学函数 abs,sqrt:绝对值,平方根 log, exp, log10, log2:对数与指数函数 sin,cos,tan,asin,acos,atan,atan2:三角函数 sinh,cosh,tanh,asinh,acosh,atanh:双曲函数 beta,lbeta,gamma,lgamma,digamma,trigamma,tetragamma,pentagamma,choose ,lchoose:与贝塔函数、伽玛函数、组合数有关的特殊函数 fft,mvfft,convolve:富利叶变换及卷积 polyroot:多项式求根 poly:正交多项式 spline,splinefun:样条差值 besselI,besselK,besselJ,besselY,gammaCody:Bessel函数 deriv:简单表达式的符号微分或算法微分

Fortran 运行中给出的系统错误及解决方法

. Fortran 运行中给出的系统错误及解决方法 以下均为linker tools errors and warnings Linker Tools Error LNK1000 unknown error; consult documentation for technical support options Note the circumstances of the error, try to isolate the problem and create a reproducible test case, then contact technical support. Linker Tools Error LNK1101 incorrect MSPDBxx.DLL version; recheck installation of this product The version of MSPDBxx.DLL available on your system does not match the version required by this tool. Linker Tools Error LNK1102 out of memory There was not enough memory for the tool to run. Probably the paging file exceeded available disk space. If a shortage of disk space is not the cause, note the circumstances of the error, try to isolate the problem and create a reproducible test case, then request technical support. Linker Tools Error LNK1103 debugging information corrupt; recompile module Probably the compilation was terminated before a valid object file was created.Recompile the given object file. If recompiling does not correct the problem,note the circumstances of the error, try to isolate the problem and create a reproducible test case, then consult technical support. Linker Tools Error LNK1104 cannot open file "filename" The tool could not open the given file. One of the following may be a cause: l There was not enough disk space. l The file does not exist. l The filename or its path was incorrectly specified. l The specified drive is invalid. l The file does not have the appropriate permissions. l The path for filename expands to more than 260 characters. l If the given file is named LNKn, which is a filename generated by the linker for a temporary file, then the directory specified in the TMP environment variable may not exist, or more than one directory is specified for the TMP environment variable. (Only one directory path should be specified for the TMP environment variable.) l If the error occurs on the executable filename, an earlier version of the executable may still be running. You will need to terminate the executable before linking it. In Windows NT (including Windows 2000) or Windows 95,you can use the utility PVIEW to look for and kill instances of the application. l If the error message occurs for a library name, and you recently ported the .MAK file from a previous Microsoft Visual C++ development

相关主题
文本预览
相关文档 最新文档