
An Infrastructure for Profile-Driven Dynamic Recompilation*

Robert G. Burger    R. Kent Dybvig

Indiana University Computer Science Department

Lindley Hall 215

Bloomington, Indiana 47405

{burger,dyb}@cs.indiana.edu

September 1997

Abstract

Dynamic optimization of computer programs can dramatically improve their performance on a variety of applications. This paper presents an efficient infrastructure for dynamic recompilation that can support a wide range of dynamic optimizations, including profile-driven optimizations. The infrastructure allows any section of code to be optimized and regenerated on-the-fly, even code for currently active procedures. The infrastructure incorporates a low-overhead edge-count profiling strategy that supports first-class continuations and reinstrumentation of active procedures. Profiling instrumentation can be added and removed dynamically, and the data can be displayed graphically in terms of the original source to provide useful feedback to the programmer.

Keywords: edge-count profiling, dynamic compilation, run-time code generation, basic block reordering

1 Introduction

In the traditional model of program optimization and compilation, a program is optimized and compiled once, prior to execution. This allows the cost of program optimization and compilation to be amortized, possibly, over many program runs. On the other hand, it prevents the compiler from exploiting properties of the program, its input, and its execution environment that can be determined only at run time. Recent research has shown that dynamic compilation can dramatically improve the performance of a wide range of applications, including network packet demultiplexing, sparse matrix computations, pattern matching, and mobile code [9, 7, 12, 15, 19, 23]. Dynamic optimization works when a program is staged in such a way that the cost of dynamic recompilation can be amortized over many runs of the optimized code [21].

This paper presents an infrastructure for dynamic recompilation of computer programs that permits optimization of code at run time. The infrastructure allows any section of code to be

*This material is based on work supported in part by the National Science Foundation under grant numbers CDA-9312614 and CCR-9711269. Robert G. Burger was supported in part by a National Science Foundation Graduate Research Fellowship.

optimized and regenerated on-the-fly, even the code for currently active procedures. In our Scheme-based implementation of the infrastructure, the garbage collector is used to find and relocate all references to recompiled procedures, including return addresses in the active portion of the control stack.

Since many optimizations benefit from profiling information, we have included support for profiling as an integral part of the recompilation infrastructure. Instrumentation for profiling can be inserted or removed dynamically, again by regenerating code on-the-fly. A variant of Ball and Larus's low-overhead edge-count profiling strategy [2], extended to support first-class continuations and reinstrumentation of active procedures, is used to obtain accurate execution counts for all basic blocks in instrumented code. Although profiling overhead is fairly low, profiling is typically enabled only for a portion of a program run, allowing subsequent execution to benefit from optimizations guided by the profiling information without the profiling overhead.

A side benefit of the profiling support is that profile data can be made available to the programmer, even as a program is executing. In our implementation, the programmer can view profile data graphically in terms of the original source (possibly split over many files) to identify the "hot spots" in a program. The source is color-coded according to execution frequency, and the programmer can "zoom in" on portions of the program or portions of the frequency range.

As a proof of concept, we have used the recompilation infrastructure and profiling information to support run-time reordering of basic blocks to reduce the number of mispredicted branches and instruction cache misses, using a variant of Pettis and Hansen's basic block reordering algorithm [24].

The mechanisms described in this paper are directly applicable to garbage-collected languages such as Java, ML, Scheme, and Smalltalk, in which all references to a procedure may be found and relocated at run time. The mechanisms can be adapted to languages like C and Fortran, in which storage management is done manually, by maintaining a level of indirection for all procedure entry points and return addresses. Although a level of indirection is common for entry points to support dynamic linking and shared libraries, the level of indirection for return addresses would be a cost incurred entirely to support run-time recompilation.

Section 2 describes the edge-count profiling algorithm incorporated into the dynamic recompilation infrastructure. Section 3 presents the infrastructure in detail and its proof-of-concept application to basic-block reordering based on profile information. Section 4 gives some performance data for the profiler and dynamic recompiler. Section 5 briefly discusses how profile data is associated with source code and presented to the programmer. Section 6 describes related work. Section 7 summarizes our results and discusses future work.

2 Edge-Count Profiling

This section describes a low-overhead edge-count profiling strategy based on one described by Ball and Larus [2]. Like Ball and Larus's, it minimizes the total number of profile counter increments at run time. Unlike Ball and Larus's, it supports first-class continuations and reinstrumentation of active procedures. Additionally, our strategy employs a fast log-linear algorithm to determine

(define remq
  (lambda (x ls)
    (cond
      [(null? ls) '()]
      [(eq? (car ls) x) (remq x (cdr ls))]
      [else (cons (car ls) (remq x (cdr ls)))])))

     bne arg2,nil,L1              B1
     ld ac,empty-list             B2
     jmp ret
L1:  ld ac,car-offset(arg2)       B3
     bne ac,arg1,L2
     ld arg2,cdr-offset(arg2)     B4
     jmp reloc remq
L2:  st arg2,4(fp)                B5
     st ret,0(fp)
     ld arg2,cdr-offset(arg2)
     ld ret,L3
     add fp,8,fp
     jmp reloc remq
L3:  sub fp,8,fp                  B6
     ld arg2,4(fp)
     ld ret,0(fp)
     ld arg1,ac
     ld ac,car-offset(arg2)
     sub ap,car-offset,xp
     add ap,8,ap
     st ac,car-offset(xp)
     st arg1,cdr-offset(xp)
     ld ac,xp
     jmp ret

> (remq 'b '(b a b c d))
|(remq b (b a b c d))
|(remq b (a b c d))
||(remq b (b c d))
||(remq b (c d))
|||(remq b (d))
||||(remq b ())
||||()
|||(d)
||(c d)
|(a c d)
(a c d)

[Control-flow graph: blocks B1-B6 and Exit, with thick, uninstrumented spanning-tree edges and encircled counts 1, 13, 3, 3, 2, 2, 5, 6 on the remaining, instrumented edges]

Figure 1: remq source, sample trace, basic blocks, and control-flow graph with thick, uninstrumented edges from one of the twelve maximal spanning trees and encircled counts for the remaining, instrumented edges

optimal counter placement. The recompiler uses the profiling information to guide optimization, and it may also use the data to optimize counter placement, as described in Section 3. The programmer can view the data graphically in terms of the original source to identify the "hot spots" in a program, as described in Section 5.

Section 2.1 summarizes Ball and Larus's optimal edge-count placement algorithm. Section 2.2 presents modifications to support first-class continuations and reinstrumentation of active procedures. Section 2.3 describes our implementation.

2.1 Background

Figure 1 illustrates Ball and Larus's optimal edge-count placement algorithm using a Scheme procedure that removes all occurrences of a given item from a list.

A procedure is represented by a control-flow graph composed of basic blocks and weighted

(define fact
  (lambda (n done)
    (if (< n 2)
        (done 1)
        (* n (fact (- n 1) done)))))

     bge arg1,fixnum 2,L1         B1
     ld cp,arg2                   B2
     ld arg1,fixnum 1
     jmp entry-offset(cp)
L1:  st arg1,4(fp)                B3
     st ret,0(fp)
     sub arg1,fixnum 1,arg1
     add fp,8,fp
     ld ret,L2
     jmp reloc fact
L2:  sub fp,8,fp                  B4
     ld arg1,4(fp)
     ld ret,0(fp)
     ld arg2,ac
     jmp reloc *

Figure 2: Source and basic blocks for fact, a variant of the factorial function which invokes the done procedure to compute the value of the base case

edges. The assembly language instructions for the procedure are split into basic blocks, which are sequential sections of code for which control enters only at the top and exits only from the bottom. The branches at the bottoms of the basic blocks determine how the blocks are connected, so they become the edges in the graph. The weight of each edge represents the number of times the corresponding branch is taken.

The key property needed for optimal profiling is conservation of flow, i.e., the sum of the flow coming into a basic block is equal to the sum of the flow going out of it. In order for the control-flow graph to satisfy this property, it must represent all possible control paths. Consequently, a virtual "exit" block is added so that all exits are explicitly represented as edges to the exit block. Entry to the procedure is explicitly represented by an edge from the exit block to the entry block, and the weight of this edge represents the number of times the procedure is invoked.

Because the augmented control-flow graph satisfies the conservation of flow property, it is not necessary to instrument all the edges. Instead, the weights for many of the edges can be computed from the weights of the other edges using addition and subtraction, provided that the uninstrumented edges do not form a cycle. The largest cycle-free set of edges is a spanning tree. Since the sum of the weights of the uninstrumented edges represents the savings in counting, a maximal spanning tree determines the inverse of the lowest-cost set of edges to instrument.
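The recovery of uninstrumented counts by conservation of flow can be sketched as follows. This is an illustrative Python sketch, not the paper's Scheme implementation; the tiny graph, block names, and counter values are invented for the example.

```python
# Sketch: tree edges carry None (uninstrumented); other edges carry
# measured counts. Because inflow equals outflow at every block, each
# unknown edge can be derived by subtraction once its neighbors are known.
edges = {
    ("Exit", "B1"): None,  # seed edge Exit -> entry, on the spanning tree
    ("B1", "B2"): 3,       # instrumented
    ("B1", "B3"): None,    # on the spanning tree
    ("B2", "Exit"): 3,     # instrumented
    ("B3", "Exit"): 7,     # instrumented
}

def propagate(edges):
    """Fill in uninstrumented edge counts using conservation of flow."""
    blocks = {b for e in edges for b in e}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            ins = [e for e in edges if e[1] == b]
            outs = [e for e in edges if e[0] == b]
            for group, other in ((ins, outs), (outs, ins)):
                unknown = [e for e in group if edges[e] is None]
                if len(unknown) == 1 and all(edges[e] is not None for e in other):
                    known = sum(edges[e] for e in group if e != unknown[0])
                    edges[unknown[0]] = sum(edges[e] for e in other) - known
                    changed = True
    return edges

propagate(edges)
print(edges[("B1", "B3")])    # derived count
print(edges[("Exit", "B1")])  # number of procedure invocations
```

Here only two of the five edges (including the virtual entry edge) are counted at run time; the rest are derived after the fact.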

The edge from the exit block to the entry block does not correspond to an actual instruction in the procedure, so it cannot be instrumented. Consequently, the maximal spanning tree algorithm is seeded with this edge, and the resulting spanning tree is still maximal [2].

> (call/cc
    (lambda (k)
      (fact 5 k)))
|(fact 5 #)
||(fact 4 #)
|||(fact 3 #)
||||(fact 2 #)
|||||(fact 1 #)
1

[Control-flow graphs for fact (blocks B1-B4 and Exit): without an exit edge, incorrect counts (shown in bold); with an exit edge, all counts correct]

Figure 3: Trace and control-flow graphs illustrating nonlocal exit from fact with and without an exit edge

2.2 Control-Flow Aberrations

The conservation of flow property is met only under the assumption that each procedure call returns exactly once. First-class continuations, however, cause some procedure activations to exit prematurely and others to be reinstated one or more times. Even when there are no continuations, reinstrumenting a procedure while it has activations on the stack is problematic because some of its calls have not returned yet.

Figure 2 gives a variant of the factorial function that illustrates the effects of nonlocal exit and re-entry using Scheme's call-with-current-continuation (call/cc) function [10].

Ball and Larus handle the restricted class of exit-only continuations, e.g., setjmp/longjmp in C, by adding to each call block an edge pointing to the exit block. The weight of an exit edge represents the number of times its associated call exits prematurely. Figure 3 demonstrates why exit edges are needed for exit-only continuations. Ball and Larus measure the weights of exit edges directly by modifying longjmp to account for aborted activations. We use a similar strategy, but to avoid the linear stack-walk overhead, we measure the weights indirectly by ensuring that exit edges are on the spanning tree (see Section 2.3).

With fully general continuations, the weight of an exit edge represents the net number of premature exits, and a negative weight indicates reinstatement. Figure 4 demonstrates the utility of exit edges for continuations that reinstate procedure activations.

2.3 Implementation

To support profiling, the compiler organizes the generated code for each procedure into basic blocks linked by edges that represent the flow of control within the procedure. It adds the virtual exit block and corresponding edges and seeds the spanning tree with the edge from the exit block to the start block as described in Section 2.1. It then computes the maximal spanning tree and assigns counters

> (fact 4
    (lambda (n)
      (call/cc
        (lambda (k)
          (set! redo k)
          (k n)))))
|(fact 4 #)
||(fact 3 #)
|||(fact 2 #)
||||(fact 1 #)
||||1
|||2
||6
|24
24
> (redo 2)
||||2
|||4
||12
|48
48

[Control-flow graphs for fact (blocks B1-B4 and Exit): without an exit edge, incorrect counts (shown in bold); with an exit edge, all counts correct, the exit edge carrying a negative net weight]

Figure 4: Trace and control-flow graphs illustrating re-entry into fact with and without an exit edge

to all edges not on the maximal spanning tree. Finally, the compiler inserts counter increments into the generated code, placing them into existing blocks whenever possible.

For efficiency, our compiler uses the priority-first search algorithm for finding a maximal spanning tree [27]. Its worst-case behavior is O((E + B) log B), where B is the number of blocks and E is the number of edges. Since each block has no more than two outgoing edges, E is O(B). Consequently, the priority-first algorithm performs very well, with a worst-case behavior of O(B log B).
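A priority-first (Prim-style) search for a maximal spanning tree, seeded with a forced edge as described in Section 2.1, might be sketched as below. This is an illustrative Python rendition under assumed data structures, not the implementation from [27]; the graph is the same invented four-block example used above.

```python
import heapq

def max_spanning_tree(blocks, weights, seed):
    """Grow a maximal spanning tree from a seed edge using a max-heap of
    frontier edges (priority-first search). `weights` maps an undirected
    edge (a, b) to its weight; `seed` is forced onto the tree, like the
    uninstrumentable exit->entry edge."""
    adj = {b: [] for b in blocks}
    for (a, b), w in weights.items():
        adj[a].append((w, b, (a, b)))
        adj[b].append((w, a, (a, b)))
    tree, in_tree, heap = [seed], set(seed), []
    for node in seed:
        for w, other, edge in adj[node]:
            heapq.heappush(heap, (-w, other, edge))  # negate weight: max-heap
    while heap and len(in_tree) < len(blocks):
        negw, node, edge = heapq.heappop(heap)
        if node in in_tree:
            continue                   # stale entry; node already reached
        tree.append(edge)
        in_tree.add(node)
        for w, other, e in adj[node]:
            if other not in in_tree:
                heapq.heappush(heap, (-w, other, e))
    return tree

blocks = ["Exit", "B1", "B2", "B3"]
weights = {("B1", "B2"): 3, ("B1", "B3"): 7,
           ("B2", "Exit"): 3, ("B3", "Exit"): 7}
tree = max_spanning_tree(blocks, weights, ("Exit", "B1"))
instrumented = [e for e in weights if e not in tree]
print(len(tree), len(instrumented))
```

The edges left off the tree are the ones that receive counters; with B blocks and at most two outgoing edges per block, the heap never holds more than O(B) entries.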

Another benefit of this algorithm is that it adds uninstrumented edges to the tree in precisely the reverse order for which their weights need to be computed using the conservation of flow property. As a result, count propagation does not require a separate depth-first search as described in [2]. Instead, the maximal spanning tree algorithm generates the list used to propagate the counts quickly and easily. This list is especially important to the garbage collector, which must propagate the counts of recompiled procedures (see Sections 3.1 and 3.3).

Figure 5 illustrates how our compiler minimizes the number of additional blocks needed to increment counters by placing as many increments as possible in existing blocks. The increment instructions refer to the edge data structures by actual address.

Instrumenting exit edges is more difficult because there are no branches in the procedure associated with them. We solved this problem for the edge from the exit block to the entry block by seeding the maximal spanning tree algorithm with this edge and proving that the resulting tree is still maximal. Unfortunately, exit edges rarely lie on a maximal spanning tree because their weights are usually zero. Consequently, there are two choices for measuring the weights of exit edges.

First, we could modify continuation invocation to update the weights directly. Ball and Larus

If A's only outgoing edge is e, the increment code is placed in A.

If B's only incoming edge is e (and B is not the exit block), the increment code is placed in B.

Otherwise, the increment code is placed in a new block C that is spliced into the control-flow graph.

Figure 5: Efficient instrumentation of edge e from block A to block B

use this technique for exit-only continuations by incrementing the weights of the exit edges associated with each activation that will exit prematurely [2]. This approach would support fully general continuations if it would also decrement the weights of the exit edges associated with each activation that would be reinstated. The pointer to the exit edge would have to be stored either in each activation record or in a static location associated with the return address.

Second, we could seed the maximal spanning tree algorithm with all the exit edges. The resulting spanning tree might not be maximal, but it would be maximal among spanning trees that include all the exit edges.

Our system's segmented stack implementation supports constant-time continuation invocation [16, 5]. Implementing the first approach would destroy this property. Moreover, if any procedure is profiled, the system must traverse all activations to be sure it finds all profiled ones. Although the second approach increases profiling overhead, it affects only profiled procedures, the overhead is still reasonable, and the programmer may turn off accurate continuation profiling when it is not needed. Consequently, we implemented only the second approach.

3 Dynamic Recompilation

This section presents the dynamic recompilation infrastructure. Section 3.1 gives an overview of the recompilation process. Section 3.2 describes how the representation of procedures is modified to accommodate dynamic recompilation. Section 3.3 describes how the garbage collector uses the modified representation to replace code objects with recompiled ones. Section 3.4 presents the recompilation process, which uses a variant of Pettis and Hansen's basic block reordering algorithm to reduce the number of mispredicted branches and instruction cache misses [24].

3.1 Overview

Dynamic recompilation proceeds in three phases. First, the candidate procedures are identified, either by the user or by a program that selects among all the procedures in the heap. Second, these procedures are recompiled and linked to separate, new procedures. Third, the original procedures are replaced by the new ones during the next garbage collection.

Because a procedure's entry and return points may change during recompilation, the recompiler creates a translation table that associates the entry- and return-point offsets of the original and the new procedure and attaches it to the original procedure. The collector uses this table to relocate call instructions and return addresses. Because the collector translates return addresses, procedures can be recompiled while they are executing.
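The offset translation can be sketched as a binary search over sorted pairs. A hedged Python illustration follows; the offsets are invented, since the real table lives inside the obsolete code object's reloc field (see Section 3.2).

```python
import bisect

# Hypothetical translation table: (original offset, new offset) pairs,
# sorted by original offset for fast binary searching during relocation.
table = [(0, 0), (24, 32), (56, 72), (96, 104)]
originals = [orig for orig, _ in table]

def translate(offset):
    """Map an entry- or return-point offset in the obsolete code object
    to the corresponding offset in the new code object."""
    i = bisect.bisect_left(originals, offset)
    if i == len(originals) or originals[i] != offset:
        raise KeyError("not a recorded entry or return point")
    return table[i][1]

print(translate(56))
```

Only offsets that are actual entry or return points appear in the table, so an exact match is required.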

Before the next collection, only the original procedures are used. This invariant allows each original procedure to share its control-flow graph and associated profile counts, if profiling is enabled, with the new procedure. Because the new procedure's maximal spanning tree may be different from the original's, the new procedure may increment different counts. The collector accounts for the difference by propagating the counts of the original procedure so that the new procedure starts with a complete set of accurate counts.

3.2 Representation

Figure 6 illustrates our representation of procedures, highlighting the minor changes needed to support dynamic recompilation. A more detailed description of our object representation is given elsewhere [13].

Procedures are represented as closures and code objects. A closure is a variable-length array whose first element contains the address of the procedure's anonymous entry point and whose remaining elements contain the procedure's free variables. The anonymous entry point is the first instruction of the procedure's machine code, which is stored in a code object. A code object has a six-word header before the machine code. The info field stores the code object's block structure and debugging information.

The relocation table is used by the garbage collector to relocate items stored in the instruction stream. Each item has an entry that specifies the item's offset within the code stream (the code offset), the offset from the item's address to the address actually stored in the code stream (the item offset), and how the item is encoded in the code stream (the type). The code pointer is used to relocate items stored as relative addresses.

Because procedure entry points may change during recompilation, relocation entries for calls use the long format so that the translated offset cannot exceed the field width. Consequently, the long format provides a convenient location for the "d" bit, which indicates whether an entry corresponds to a call. Use of the long format does not significantly increase the table size because calls account for a minority of relocation entries. Because short entries cannot describe calls, they need not be checked.

Three of the code object's type bits are used to encode the status of a code object. The first

Figure 6: Representation of closures, code objects, and stacks with the infrastructure for dynamic recompilation highlighted

bit, set by the recompiler, indicates whether the code object has been recompiled to a new one and is thus obsolete. Since an obsolete code object will not be copied during collection, its relocation table is no longer needed. Therefore, its reloc field is used to point to the translation table instead. The list of original/new offset pairs is sorted by original offset to enable fast binary searching during relocation.

The second bit, denoted by "n" in the figure, indicates whether the code object is new. This bit, set by the recompiler and cleared by the collector, is used to prevent a new code object from being recompiled before the associated obsolete one has been eliminated. The third bit, denoted by "b" in the figure, serves a similar purpose. It indicates whether an old code object is busy in that it is either being or has been recompiled. This bit is used to prevent multiple recompilation of the same code object. The recompiler sets this bit at the beginning of recompilation. Because the recompiler may trigger a garbage collection before it creates a new code object and marks the original one obsolete, this bit must be preserved by the collector. It is possible to recompile a code object multiple times; however, each subsequent recompilation must occur after the code object has been recompiled and collected. (It would be possible to relax this restriction.)

In order to support multiple return values, first-class continuations, and garbage collection efficiently, four words of data are placed in the instruction stream immediately before the single-value return point from each nontail call [16, 1, 5]. The live mask is a bit vector describing which frame locations contain live data. The code pointer is used to find the code object associated with a given return address. The frame size is used during garbage collection, continuation invocation, and debugging to walk down a stack one frame at a time.

3.3 Collection

Only a few modifications are necessary for the garbage collector to support dynamic recompilation. The primary change involves how a code object is copied. If the obsolete bit is clear, the code object and associated relocation table are copied as usual, but the new bit of the copied code object's type field is cleared if set.

If the obsolete bit is set, the code object is not copied at all. Instead, the pointer to the new code object is relocated and stored as the forwarding address for the obsolete code object so that all other pointers to the obsolete code object will be forwarded to the new one. Moreover, if the obsolete code object is instrumented for profiling, the collector propagates the counts so that they remain accurate when the new code object increments a possibly different subset of the counts. The list of edge/block pairs used for propagation is found in the info structure. Since the propagation occurs during collection, some of the objects in the list may be forwarded, so the collector must check for pointers to forwarded objects as it propagates the counts.

The translation table is used to relocate call instructions and return addresses. Call instructions are found only in code objects, so they are handled when code objects are swept. The "d" bit of long-format relocation entries identifies the candidates for translation. Return addresses are found only in stacks, so they are handled when continuations are swept.

Since our collector is generational, we must address the problem of potential cross-generational pointers from obsolete to new code objects. Our segmented heap model allows us to allocate new objects in older generations when necessary [13]; thus, we always allocate a new code object in the same generation as the corresponding obsolete code object. Otherwise, we would have to add the pointer to our remembered set and promote the new code object to the older generation during collection.

3.4 Block Reordering

To demonstrate the dynamic recompilation infrastructure, we have applied it to the problem of basic-block reordering. We use a variant of Pettis and Hansen's basic block reordering algorithm to reduce the number of mispredicted branches and instruction cache misses [24]. We also use the edge-count profile data to decrease profiling overhead by re-running the maximal spanning tree algorithm to improve counter placement.

If both edges point to blocks in the same chain, no preferences are added.

If x >= y, A should come before B with weight x - y. Otherwise, B should come before A with weight y - x.

If x >= y, B should come before A before C with weight x - y. Otherwise, C should come before A before B with weight y - x.

Figure 7: Adding branch prediction preferences for architectures that predict all backward conditional branches taken and all forward conditional branches not taken

The block reordering algorithm proceeds in two steps. First, blocks are combined into chains according to the most frequently executed edges to reduce the number of instruction cache misses. Second, the chains are ordered to reduce the number of mispredicted branches.

Initially, every block comprises a chain of one block. Using a list of edges sorted by decreasing weight, distinct chains A and B are combined when an edge's source block is at the tail of A and its sink block is at the head of B. When all the edges have been processed, a set of chains is left.
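The chain-building step might be sketched as follows. This is a hedged Python illustration with invented block names and edge weights, not the paper's implementation.

```python
def build_chains(blocks, weighted_edges):
    """Greedily merge singleton chains: for each edge in decreasing
    weight order, splice chain B onto chain A when the edge runs from
    A's tail block to B's head block."""
    chain_of = {b: [b] for b in blocks}            # block -> its chain (shared list)
    for _, src, dst in sorted(weighted_edges, reverse=True):
        a, b = chain_of[src], chain_of[dst]
        if a is not b and a[-1] == src and b[0] == dst:
            a.extend(b)                            # B's blocks now follow A's tail
            for blk in b:
                chain_of[blk] = a
    seen, chains = set(), []                       # collect the distinct chains
    for c in chain_of.values():
        if id(c) not in seen:
            seen.add(id(c))
            chains.append(c)
    return chains

blocks = ["B1", "B2", "B3", "B4", "B5", "B6"]
edges = [(13, "B1", "B3"), (3, "B3", "B4"), (2, "B3", "B5"),
         (2, "B5", "B6"), (1, "B1", "B2")]
print(build_chains(blocks, edges))
```

In this example the hot path B1 -> B3 -> B4 becomes one chain, while B2 and the B5 -> B6 pair form separate chains because their edges lose to heavier ones at the same tail or head.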

The algorithm places ordering preferences on the chains based on the conditional branches emitted from the blocks within the chains and the target architecture's branch prediction strategy. Blocks with two outgoing edges always generate a conditional branch for one of the edges, and they generate an unconditional branch when the other edge does not point to the next block in the chain. Figure 7 illustrates how the various conditional branch possibilities generate preferences for a common prediction strategy. The preferences are implemented using a weighted directed graph with nodes representing chains and edges representing the "should come before" relation.

As each block with two outgoing edges is processed, its preferences (if any) are added to the weights of the graph. Suppose there is an edge of weight x from chain A to B and an edge of weight y from chain B to A, and x > y. The second edge is removed, and the first edge's weight becomes x - y, so that there is only one positive-weighted edge between any two nodes. A depth-first search then topologically sorts the chains, omitting edges that cause cycles. The machine code for the chains is placed in a new code object, and the old code object is marked obsolete and has its reloc field changed to point to the translation table.
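The preference-merging and chain-ordering steps might be sketched like this. Again a hedged Python illustration: the chain names and weights are invented, and cycle-causing edges are simply skipped when the depth-first search finds a back edge.

```python
def order_chains(chains, preferences):
    """preferences: (weight, a, b) meaning 'a should come before b'.
    Opposite preferences cancel to at most one positive edge per pair;
    a DFS then topologically sorts, skipping edges that close cycles."""
    raw = {}
    for w, a, b in preferences:
        raw[(a, b)] = raw.get((a, b), 0) + w
    succ = {c: [] for c in chains}
    for (a, b), w in raw.items():
        if w - raw.get((b, a), 0) > 0:     # keep only the net positive direction
            succ[a].append(b)
    order, state = [], {}                   # state: 1 = on stack, 2 = finished
    def dfs(c):
        if state.get(c):                    # back edge (cycle) or already done
            return
        state[c] = 1
        for nxt in succ[c]:
            dfs(nxt)
        state[c] = 2
        order.append(c)                     # postorder
    for c in chains:
        dfs(c)
    return order[::-1]                      # reverse postorder = final layout

chains = ["C1", "C2", "C3"]
prefs = [(5, "C1", "C2"), (2, "C2", "C1"), (3, "C2", "C3")]
print(order_chains(chains, prefs))
```

Here the opposing C1/C2 preferences cancel to a net "C1 before C2" of weight 3, and the reverse postorder yields the layout C1, C2, C3.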

Benchmark    Lines   Description
compiler     35,901  Chez Scheme 5.0g recompiling itself
softscheme   10,073  Andrew Wright's soft typer [29] checking match
ddd           9,578  Digital Design Derivation System 1.0 [4] deriving hardware
                     for a Scheme machine [6]
similix       7,305  self-application of the Similix 5.0 partial evaluator [3]
nucleic       3,475  3-D structure determination of a nucleic acid
slatex        2,343  SLaTeX 2.2 typesetting its own manual
maze            730  Hexagonal maze maker by Olin Shivers
earley          655  Earley's parser by Marc Feeley
peval           639  Feeley's simple Scheme partial evaluator

Table 1: Description of benchmarks

             Initial Count Placement  Optimal Count Placement  Optimal Count Placement
             Initial Block Ordering   Initial Block Ordering   Block Reordering
Benchmark    Run Time  Compile Time   Run Time  Recompile Time Run Time  Recompile Time
compiler       1.75       1.29          0.90        0.15         0.92        0.14
softscheme     1.56       1.26          1.31        0.09         1.30        0.10
ddd            1.09       1.31          1.02        0.18         1.00        0.21
similix        1.57       1.23          1.33        0.26         1.31        0.28
nucleic        1.24       1.03          1.21        0.05         1.06        0.05
slatex         1.18       1.20          1.09        0.15         1.05        0.16
maze           1.37       1.23          1.15        0.10         1.10        0.11
earley         1.07       1.28          0.96        0.11         0.92        0.11
peval          1.61       1.31          1.30        0.15         1.17        0.15
Average        1.38       1.24          1.14        0.14         1.09        0.15

Table 2: Run-time and (re)compile-time costs of edge-count profiling relative to a base compiler that does not instrument for profiling or reorder basic blocks

4 Performance

We used the set of benchmarks listed in Table 1 to assess both compile-time and run-time performance of the dynamic recompiler and profiler. All measurements were taken on a DEC Alpha 3000/600 running Digital UNIX V4.0A.

Table 2 shows the cost of profiling and recompiling each of the benchmark programs. Initial instrumentation has an average run-time overhead of 38% and an average compile-time overhead of 24%. Without block reordering, optimal count placement reduces the average run-time overhead to 14%, and the average recompile time is only 14% of the base compile time. With block reordering, the average run-time overhead drops to 9%, while the average recompile time increases very slightly to 15%.

The reduction of profiling overhead from 38% to 14% is quite respectable and shows that the

             Mispredicted Branches     Unconditional Branches  Reordered Performance
Benchmark    Ideal  Reordered  Initial  Reordered  Initial     Run Time  Recompile Time
compiler     0.06     0.09      0.72      0.09      0.11         0.93        0.14
softscheme   0.07     0.07      0.38      0.01      0.03         0.98        0.07
ddd          0.04     0.04      0.62      0.00      0.13         0.98        0.12
similix      0.07     0.07      0.61      0.00      0.02         0.96        0.17
nucleic      0.01     0.02      0.97      0.00      0.01         0.94        0.03
slatex       0.02     0.03      0.77      0.04      0.27         0.99        0.11
maze         0.10     0.10      0.79      0.09      0.10         0.98        0.07
earley       0.20     0.24      0.65      0.15      0.19         0.77        0.08
peval        0.11     0.11      0.66      0.00      0.06         0.87        0.09
Average      0.08     0.09      0.69      0.04      0.10         0.93        0.10

Table 3: Effectiveness of dynamic block reordering on reducing the number of mispredicted branches and unconditional branches, the relative run time of the recompiled code, and the recompile time relative to the base compile time. The number of unconditional branches is given relative to the number of conditional branches. Run-time checks for conditions such as heap and stack overflow account for one third to one half of the conditional branches. For simplicity, these checks were designed to test and branch around the code that handles the condition. Because these tests almost always fail, the mispredicted forward branches inflate the initial misprediction rate. Factoring them out yields an initial misprediction rate of about 50%.

optimal counter placement can be an effective tool for use in long program runs intended to gather precise overall profiling information. The block reordering optimization does not, however, fully compensate for the profiling overhead, and even if additional run-time optimizations justified by the profiling information reduce the overhead still further, it is not clear that profiling will pay for itself in production runs if left enabled indefinitely. This suggests a strategy for dynamic optimization in which a program is dynamically optimized and profiling instrumentation removed after an initial portion of a program run during which profiling information is gathered.

To assess the effectiveness of the block reordering algorithm, we measured the number of mispredicted conditional branches and the number of unconditional branches. The Alpha architecture encourages hardware and compiler implementors to predict backward conditional branches taken and forward conditional branches not taken [28]. Current Alpha implementations use this static model as a starting point for dynamic branch prediction. We computed the mispredicted branch percentage using the static model as a metric for determining the effectiveness of the block reordering algorithm. Table 3 gives the results. Like Pettis and Hansen's algorithm, ours achieves near-optimal misprediction rates, significantly reduces the number of unconditional branches, and provides a modest 7% reduction in run time. Our algorithm is also fast, for the average recompile time is just 10% of the base compile time. Because the Alpha performs dynamic branch prediction, the 7% reduction in run time is likely due mostly to reduction in instruction cache misses.

5 Graphical Display of Profile Information

Although the profile data is intended primarily for use by the dynamic recompiler, it is also useful for providing feedback to the programmer. In order to provide useful feedback, some way of associating

profile data with source code is needed. The compiler attaches a source record containing a source file descriptor and a byte offset from the beginning of the file to each source expression as it is read. The compiler propagates these source records through the intermediate compilation passes and associates each source record with an appropriate basic block. Within the representation of a basic block, each source record associated with the block is included within a special "source" pseudo instruction. These pseudo instructions do not result in the generation of any additional machine code.

The default location for an expression's source record is in the basic block that begins its evaluation. For procedure calls, however, the source expression corresponds to the basic block that makes the call. In most situations, the count for this block is the same as the count for the block that begins to evaluate the entire call expression. A difference arises when continuations are involved, and in this case we have found it more useful to know how many times the call is actually made.

Each constant and variable reference occurs in just one basic block, so the source record goes there. When a constant or reference occurs in nontail position, its count can usually be determined from the count of the closest enclosing expression. An exception arises when the constant or reference occurs as the “then” or “else” part of an if expression in nontail position. Consequently, the compiler generates source instructions for constants and references only when they occur in tail position or as the “then” or “else” part of an if expression. By eliminating the source instructions for the remaining cases, the compiler significantly reduces the number of source records stored in basic blocks without sacrificing useful information.

To display the block counts in terms of the original source, we follow three steps. First, we determine the count for each block by summing the weights of all its outgoing edges. Second, we build an association list of source expressions and block counts from the source records stored in each basic block. Third, we use this list to determine the source files that must be displayed by the graphical user interface and to guide the color-coding of the code according to the counts. The programmer can “zoom in” on portions of the program or portions of the frequency range and can also click on an expression to obtain its precise count. Figure 8 gives an example.
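The first two steps can be sketched as follows (the record layout is hypothetical; the paper does not specify one). Each block's count is the sum of its outgoing edge weights, assuming exit blocks carry a virtual exit edge so the sum is well defined, and the association list pairs each stored source record with its block's count.

```python
def source_counts(blocks):
    """blocks maps a block name to (outgoing edge weights, source records).

    Returns per-block counts and a (source record, count) association
    list sorted by decreasing frequency, as used to color-code the source.
    """
    counts = {name: sum(out.values()) for name, (out, _) in blocks.items()}
    assoc = [(src, counts[name])
             for name, (_, sources) in blocks.items()
             for src in sources]
    assoc.sort(key=lambda pair: pair[1], reverse=True)
    return counts, assoc

# Hypothetical CFG: block "B1" falls through to "B2" 90 times and
# branches to "B3" 10 times; "B2" and "B3" leave via a virtual exit edge.
# Source records are (file, byte offset) pairs, as in the text.
counts, assoc = source_counts({
    "B1": ({"B2": 90, "B3": 10}, [("scan.ss", 120)]),
    "B2": ({"exit": 90}, [("scan.ss", 155)]),
    "B3": ({"exit": 10}, [("scan.ss", 201)]),
})
```

The sorted association list is exactly what the frequency-ordered display described below needs.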

The programmer is not limited to displaying information for one procedure at a time. The data for all profiled procedures can be displayed at one time with a separate window for each source file. The data is sorted by frequency to help programmers identify hot spots. This technique proved useful in profiling the profiler itself, helping us identify inefficiencies in the maximal spanning tree and block look-up algorithms.

6 Related Work

Our edge-count profiling algorithm is based on one described by Ball and Larus [2], who applied their algorithm in a traditional static compilation environment. We have extended their algorithm to support first-class procedures and reinstrumentation of active procedures. We have also identified an algorithm for determining optimal counter placement that is faster and eliminates the need for a separate depth-first search for count propagation.

Figure 8: A section of a lexical scanner displayed using darker shades for more frequently executed code and lighter shades for less frequently executed code. Note to the program committee: We hope to improve the translation from color to grey scales for the final paper.
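The counter-placement idea can be illustrated with a small sketch (a simplification under assumed data structures, not the paper's implementation): treat the control-flow graph, augmented with a virtual exit-to-entry edge, as an undirected graph; build a maximal spanning tree over it; instrument only the non-tree edges; and recover the tree-edge counts by flow conservation, since at every block the incoming counts must equal the outgoing counts.

```python
def place_counters(nodes, edges):
    """edges: dict mapping (src, dst) -> estimated execution frequency.

    Returns the edges needing counters: the non-tree edges of a
    maximal spanning tree built with Kruskal's algorithm.
    """
    parent = {n: n for n in nodes}

    def find(n):                      # union-find with path compression
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = set()
    for (a, b), _w in sorted(edges.items(), key=lambda item: -item[1]):
        ra, rb = find(a), find(b)
        if ra != rb:                  # heaviest edges first; skip cycles
            parent[ra] = rb
            tree.add((a, b))
    return set(edges) - tree

def propagate(nodes, edges, measured):
    """Recover every edge count from the measured (non-tree) counts by
    repeatedly solving flow conservation at any block with exactly one
    unknown incident edge; the spanning-tree structure guarantees that
    such a block always exists until all counts are known."""
    counts = dict(measured)
    while len(counts) < len(edges):
        for n in nodes:
            incident = [e for e in edges if n in e]
            unknown = [e for e in incident if e not in counts]
            if len(unknown) != 1:
                continue
            e = unknown[0]
            # known in-flow minus known out-flow at this block
            flow = sum(counts[(s, d)] * (1 if d == n else -1)
                       for (s, d) in incident if (s, d) in counts)
            counts[e] = flow if e[0] == n else -flow
    return counts

# Hypothetical diamond-shaped CFG with a virtual exit->entry edge.
true_counts = {("entry", "B"): 7, ("entry", "C"): 3,
               ("B", "exit"): 7, ("C", "exit"): 3,
               ("exit", "entry"): 10}
nodes = ["entry", "B", "C", "exit"]
counters = place_counters(nodes, true_counts)   # 2 counters, not 5
recovered = propagate(nodes, true_counts,
                      {e: true_counts[e] for e in counters})
```

Only two of the five edges carry counters, yet all five counts are recovered exactly.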

We have used Pettis and Hansen's intraprocedural block-reordering strategy with only minor modifications to apply it in the context of dynamic recompilation. Samples [26] explores a similar intraprocedural algorithm that reduces instruction cache miss rates by up to 50%. Pettis and Hansen [24] also describe an interprocedural algorithm that places procedures in memory such that those that execute close together in time will also be close together in memory. Since determining the optimal ordering is NP-complete, they use a greedy “closest is best” strategy.

Several recent research projects have focused on light-weight run-time code generation [7, 14, 19, 20, 25], with code generation costs on the order of five to 100 instructions executed for each instruction generated. Although the overhead inherent in our model is greater, the potential benefits are greater as well.

Little attention has previously been paid to profile-driven dynamic optimizations, with the notable exception of work on Self [8, 17]. This work uses a specialized form of profiling to guide generation of special-purpose code to avoid generic dispatch overhead. Our infrastructure incorporates a more general profiling strategy that is not targeted to any particular optimization technique, although we have so far applied it only to the limited problem of block reordering.

Keppel has investigated the application of run-time code generation to value-specific optimizations, in which code is special-cased to particular input values [18]. Although we have so far focused on profile-driven optimizations, we believe that our infrastructure is well suited to value-specific optimizations as well.

7 Conclusions

We have described an efficient infrastructure for dynamic recompilation that incorporates a low-overhead edge-count profiling strategy supporting first-class continuations and reinstrumentation of active procedures. We have shown how the profile data can be used to improve performance by reordering basic blocks and optimizing counter placement. In addition, we have explained how the profile data can be associated with the original source to provide graphical feedback to the programmer.

The mechanisms described in this paper have all been implemented and incorporated into Chez Scheme. The recompiler can profile and regenerate all code in the system, including itself, as it runs. Performance results show that the average run-time profiling overhead is initially 38% and decreases significantly with recompilation. The average compile-time profiling overhead is 24%, and the average recompile time relative to the base compile time is 10% without profiling instrumentation and 14–15% with profiling instrumentation. The block-reordering algorithm is fast and effective at reducing the number of mispredicted branches and unconditional branches.

The techniques and algorithms are directly applicable to garbage-collected languages such as Java, ML, Scheme, and Smalltalk, and can be adapted to other languages as described in Section 1.

Dynamic recompilation need not be limited to low-level optimizations such as block reordering. A promising area of future work involves associating the profile data with earlier passes of the compiler so that higher-level optimizations can take advantage of the information. For example, register allocation, flow analysis, and inlining could benefit from profile data. An unoptimized active code segment with no analogue in the optimized version of a program, such as code for an active procedure that has been inlined at its call sites, can be retained by the system until no longer needed. This will, in fact, occur naturally in garbage-collected systems.

Another area of future work involves a generalization of edge-count profiling to measure other dynamic program characteristics such as the number of procedure calls, variable references, and so forth. For example, profile data can be used to measure how many times a procedure is called versus how many times it is created, and this ratio could be used to guide lambda lifting [11]. Combined with an estimate of the cost of generating code for the procedure, this ratio could also help determine when run-time code generation [22] would be profitable. Our system can also be extended to recompile (specialize) a closure based on the run-time values of its free variables.
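As an illustration of how such a ratio might feed a recompilation decision (the function name and threshold are entirely hypothetical; the paper proposes no specific policy):

```python
def should_lambda_lift(call_count, creation_count, ratio_threshold=8.0):
    """Hypothetical heuristic: a procedure called far more often than it
    is created is a candidate for lambda lifting, since per-call savings
    can amortize the cost of passing free variables explicitly."""
    return (creation_count > 0
            and call_count / creation_count >= ratio_threshold)

# A closure created once per scanner but called once per token is a
# good candidate; one called about as often as it is created is not.
hot = should_lambda_lift(call_count=10_000, creation_count=3)
cold = should_lambda_lift(call_count=5, creation_count=5)
```

A similar comparison against an estimated code-generation cost could gate run-time code generation.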

References

[1] J. Michael Ashley and R. Kent Dybvig. An efficient implementation of multiple return values in Scheme. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pages 140–149, June 1994.

[2] Thomas Ball and James R. Larus. Optimally profiling and tracing programs. In Proceedings of the 19th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 59–70, January 1992.

[3] Anders Bondorf. Similix Manual, System Version 5.0. DIKU, University of Copenhagen, Denmark, 1993.

[4] Bhaskar Bose. DDD—A transformation system for Digital Design Derivation. Technical Report 331, Indiana University, Computer Science Department, May 1991.

[5] Carl Bruggeman, Oscar Waddell, and R. Kent Dybvig. Representing control in the presence of one-shot continuations. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, pages 99–107, May 1996.

[6] Robert G. Burger. The Scheme Machine. Technical Report 413, Indiana University, Computer Science Department, August 1994.

[7] Craig Chambers, Susan J. Eggers, Joel Auslander, Matthai Philipose, Markus Mock, and Przemyslaw Pardyak. Automatic dynamic compilation support for event dispatching in extensible systems. In WCSSS '96 Workshop on Compiler Support for System Software, February 1996.

[8] Craig Chambers. The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages. PhD thesis, Stanford University, April 1992.

[9] Craig Chambers and David Ungar. Customization: Optimizing compiler technology for SELF, a dynamically-typed object-oriented programming language. In PLDI '89 Conference on Programming Language Design and Implementation, pages 146–160, June 1989.

[10] William Clinger and Jonathan Rees (editors). Revised⁴ report on the algorithmic language Scheme. LISP Pointers, IV(3):1–55, July–September 1991.

[11] William D. Clinger and Lars Thomas Hansen. Lambda, the ultimate label, or a simple optimizing compiler for Scheme. In Proceedings of the 1994 ACM Conference on LISP and Functional Programming, pages 128–139, 1994.

[12] L. Peter Deutsch and Allan M. Schiffman. Efficient implementation of the Smalltalk-80 system. In POPL '84 Symposium on Principles of Programming Languages, Salt Lake City, pages 297–302, January 1984.

[13] R. Kent Dybvig, David Eby, and Carl Bruggeman. Don't stop the BIBOP: Flexible and efficient storage management for dynamically typed languages. Technical Report 400, Indiana University, Computer Science Department, March 1994.

[14] Dawson R. Engler. vcode: A retargetable, extensible, very fast dynamic code generation system. In PLDI '96 Conference on Programming Language Design and Implementation, May 1996.

[15] Dawson R. Engler, Deborah Wallach, and M. Frans Kaashoek. Efficient, safe, application-specific message processing. Technical Memorandum MIT/LCS/TM533, MIT Laboratory for Computer Science, March 1995.

[16] Robert Hieb, R. Kent Dybvig, and Carl Bruggeman. Representing control in the presence of first-class continuations. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 66–77, June 1990.

[17] Urs Hölzle and David Ungar. Optimizing dynamically-dispatched calls with run-time type feedback. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pages 326–335, 1994.

[18] David Keppel, Susan J. Eggers, and Robert R. Henry. Evaluating runtime-compiled value-specific optimizations. Technical Report 93-11-02, Department of Computer Science and Engineering, University of Washington, November 1993.

[19] Peter Lee and Mark Leone. Optimizing ML with run-time code generation. In ACM Conference on Programming Language Design and Implementation, pages 137–148, May 1996.

[20] Mark Leone. A Principled and Practical Approach to Run-Time Code Generation. PhD thesis, School of Computer Science, Carnegie Mellon University, 1997. In preparation.

[21] Mark Leone and R. Kent Dybvig. Dynamo: A staged compiler architecture for dynamic program optimization. Technical Report 490, Indiana University, Computer Science Department, September 1997.

[22] Mark Leone and Peter Lee. Optimizing ML with run-time code generation. In Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, pages 137–148, May 1996.

[23] Henry Massalin. Synthesis: An Efficient Implementation of Fundamental Operating System Services. PhD thesis, Department of Computer Science, Columbia University, 1992.

[24] Karl Pettis and Robert C. Hansen. Profile guided code positioning. In Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 16–27, June 1990.

[25] Massimiliano Poletto, Dawson R. Engler, and M. Frans Kaashoek. tcc: A template-based compiler for 'C. In WCSSS '96 Workshop on Compiler Support for System Software, pages 1–7, February 1996.

[26] A. Dain Samples. Profile-driven compilation. Technical Report 627, University of California, Berkeley, 1991.

[27] Robert Sedgewick. Algorithms, chapter 31. Addison-Wesley Publishing Company, second edition, 1988.

[28] Richard L. Sites, editor. Alpha Architecture Reference Manual. Digital Press, 1992.

[29] Andrew K. Wright and Robert Cartwright. A practical soft type system for Scheme. In Proceedings of the 1994 ACM Conference on LISP and Functional Programming, pages 250–262, 1994.
