Compiler-Directed Static Classification of Value Locality Behavior
Qing Zhao and David J. Lilja†

Dept. of Computer Science and Engineering, Univ. of Minnesota, Minneapolis, MN 55455

†Dept. of Electrical and Computer Engineering, Univ. of Minnesota, Minneapolis, MN 55455


Abstract

Predicting the values that are likely to be produced by instructions has been suggested as a way of increasing the instruction-level parallelism available in a wide-issue processor. One of the potential difficulties in exploiting the predictability of values, however, is selecting the proper type of predictor, such as a last-value predictor, a stride predictor, or a context-based predictor, for a given instruction. We propose a compiler-directed classification scheme that statically partitions all of the instructions in a program into several groups, each of which is associated with a specific value predictability pattern. This value predictability pattern is encoded into the instructions to identify the type of value predictor that will be best suited for predicting the values that are likely to be produced by each instruction at run-time. Both an idealized profile-based compiler implementation and an implementation based on the GCC compiler are studied to show the performance bounds for the proposed technique. Our simulations based on the SimpleScalar tool set and the SPEC95 integer benchmarks indicate that this approach can substantially reduce the number of read/write ports needed in the value predictor for a given level of performance. This static partitioning approach also produces better performance than a dynamically partitioned approach for a given hardware configuration. Finally, this work demonstrates the connection between value locality behavior and source-level program structures, thereby leading to a deeper understanding of the causes of this behavior.

1 Introduction

Several constraints limit the instruction-level parallelism (ILP) that can be exploited in programs, including true data dependencies (read-after-write hazards), artificial dependencies (write-after-read and write-after-write hazards), control dependencies (branches), and resource conflicts (such as limited numbers of register ports or function units). While the performance impact of control dependencies and artificial dependencies can be reduced or sometimes even eliminated using various hardware and software techniques, such as branch prediction and register renaming, true data dependencies remain a serious limitation in exploiting ILP in superscalar processors. Value prediction has been proposed as a mechanism to break true data dependencies [2,3, 6-15].

Recent studies have shown that wide-issue processors benefit more from value prediction than traditional four-instruction-issue superscalar processors since more true data dependencies typically appear in the larger instruction windows needed to support wider issue widths [2]. Eliminating data dependencies in these larger instruction windows potentially can increase performance substantially. On the other hand, several problems arise when applying value prediction in wide-issue processors. For instance, since multiple instructions need to access the value predictor in each cycle, the predictors must be able to support very high read/write bandwidths. Our measurements on a set of SPEC95 integer programs, which are summarized in Table 1, show that a simple value predictor (the SVP described in Section 4.3) must support an average of 5 accesses per cycle for an 8-issue processor, increasing to 7 accesses per cycle for a 16-issue processor. If the value predictor is unable to support this request bandwidth, many predictions that could otherwise have been made will not be available to the instruction issue logic. Furthermore, entries in the value predictor will not be updated with the newly generated values in time for the next prediction request, which will cause a corresponding increase in the misprediction rate.

issue width   go    m88ksim  gcc   compress  li    ijpeg  perl  vortex  AVG
8-issue       5.2   5.5      4.6   5.2       4.5   6.1    4.7   4.2     5.0
16-issue      6.1   10.7     6.3   6.7       6.3   11.0   6.1   6.0     7.0

Table 1. The average number of requests per cycle made to a simple value predictor in a wide-issue superscalar processor.

There are several approaches that can be used to resolve this value prediction bandwidth problem. A straightforward approach is to increase the number of read/write ports in the value predictor. While multiple independent ports will increase the number of requests per cycle that can be handled by the predictor, each additional port increases the complexity of the design with additional decoding and multiplexing logic and, perhaps more importantly, it substantially increases the wiring density. This additional circuitry and increased wire density will increase the implementation cost. Furthermore, these changes most likely increase the critical path delays, which then will increase the predictor's access time and the processor's clock period. A variation of this approach interleaves the predictor using several banks with a fast distribution network [2]. This scheme still suffers from bank conflicts, however, since several instructions may try to access the same bank simultaneously. Moreover, this interleaving requires a scaling factor that is a power of two, which severely restricts the development of a cost-effective design for value predictors in high-issue width processors [4].
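
To see why bank conflicts persist even with interleaving, consider a rough model (a sketch of our own with made-up addresses, not the interleaved design of [2]): same-cycle requests whose PCs map to the same bank cannot all be serviced.

```python
from collections import Counter

def serviced_per_cycle(pcs, n_banks, ports_per_bank=1):
    """Count how many same-cycle prediction requests an interleaved
    predictor can service when entries are banked by low-order PC bits."""
    demand = Counter((pc >> 2) % n_banks for pc in pcs)  # word-aligned PCs
    return sum(min(count, ports_per_bank) for count in demand.values())

# Eight instructions fetched in one cycle (hypothetical addresses).
pcs = [0x100, 0x104, 0x108, 0x10c, 0x110, 0x114, 0x118, 0x200]
print(serviced_per_cycle(pcs, 4))   # 4: conflicts drop half the requests
print(serviced_per_cycle(pcs, 8))   # 7: more banks, fewer conflicts
```

Doubling the banks or the ports per bank both raise the serviced count, but only at a power-of-two granularity, which is the scaling restriction noted above.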

A decoupled value predictor scheme has been proposed [3] that reduces the bandwidth requirement of the predictor by dynamically classifying each instruction into one of several different component predictors. One drawback of this scheme, though, is that it requires a modified trace cache to store the classification information for each instruction. Additionally, this dynamic classification scheme suffers from a relatively long latency required to classify instructions. Due to changes in the predictability behavior of some instructions during a program's execution, these instructions migrate among the several different component predictors at run-time. These frequent migrations will introduce many transient states into the predictors, thereby reducing their effectiveness.

This paper proposes a new compiler-directed classification scheme for statically partitioning individual instructions among several different component value predictors. The compiler attaches information about the value predictability pattern for each instruction to its binary object code through a simple extension of the instruction format. This compiler-directed static classification scheme reduces the hardware cost and complexity required to build a large prediction cache with multiple ports. Instead, several smaller value predictors with only a limited number of ports in each can provide the equivalent bandwidth of a single larger multiported predictor while maintaining approximately the same, and in some cases even slightly better, prediction accuracy. Since each instruction is statically marked with its prediction classification, no additional hardware classification table is needed. This preclassification also eliminates the time required by existing dynamic classification schemes to classify the value predictability behavior of instructions and it eliminates the start-up overhead incurred when instructions migrate among several different component predictors. It also eliminates the need for a single instruction to occupy multiple entries simultaneously in the component predictors of the overall predictor.

Unlike other compiler-directed techniques that rely on profile-guided heuristics [5], our static classification scheme identifies the salient characteristics of predictable instructions at compile time without the need to profile the program beforehand.

The remainder of this paper is organized as follows: Section 2 summarizes recent related work on value prediction schemes. Section 3 then studies the inherent characteristics of programs that serve as the basis for the static classification. This section also describes the compiler classification heuristics and presents classification results for an idealized compiler. Our performance evaluation methodology is described in Section 4 with the simulation results presented in Section 5. Finally, the results and conclusions are summarized in Section 6.

2 Related Work

There are three main areas of previous work related to the scheme proposed in this paper. The first area focuses on building more accurate and efficient value predictors, such as the last-value predictor [6], the stride value predictor [7], and the context-based value predictor [8,9]. Each of these predictors produces good performance for certain value locality patterns, but poor performance for others. To obtain higher prediction rates for instructions with varying value locality patterns, several simple value predictors can be combined into one hybrid value predictor [9,10,11]. Although these hybrid value predictors can provide higher prediction rates than single predictors, they can waste resources since every instruction being predicted occupies a unique entry in each of the component predictors. To address this inefficient use of the predictor resources, dynamic classification schemes [3,10,11] can distribute the instructions into the different component predictors at run-time.

Another important area of related work has studied how to incorporate value predictors in complete machine models. Gabbay and Mendelson [2] showed that the performance potential of value prediction tends to increase as the instruction issue width increases. However, it has been pointed out [2,3] that using value prediction in wide-issue processors requires very high bandwidth predictors. To address this problem, a highly interleaved prediction table has been proposed [2]. Additionally, Lee et al. [3] combined a dynamic classification table with a trace cache to distribute instructions to different component predictors, thereby increasing the predictor bandwidth available in each cycle. Another approach is to reduce the bandwidth requirement of the predictors by giving priority to those instructions that belong to the longest data dependence chains [12]. Rychlik et al. [11] have shown that over 30% of the value predictions in the SPEC95 integer programs produced no performance improvement, which further supports the selective prediction scheme.

Finally, the last area of important related work has studied the program characteristics that lead to value locality [14,15]. For instance, Sazeides and Smith [14] divided prediction models into two groups, computational and context-based, while Sodani and Sohi [15] studied the sources of instruction repetition at both the global level and the local level.

3 Static Classification

A program's execution-time behavior determines the predictability of the values its instructions produce. There are two basic types of value predictability in programs [14]:

• Value repetition, in which subsequent dynamic instances of the same static instruction repeat the values produced by previous dynamic instances. The repetition distance is the number of times an instruction is executed between two repetitions of the same value (including the repeating instance itself). For instance, an instruction that always produces the same value has a repetition distance of one, while an instruction that alternately produces only two distinct values has a repetition distance of two. Note that the repetition distance need not be constant for a particular static instruction.

• Value computability, in which values produced by later dynamic instances of the same static instruction can be computed using a simple function of the previous values.
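
The repetition distance defined above is easy to compute from a value trace. The following sketch (our own illustration, not part of the profiler described later) reports the distance between consecutive productions of each value:

```python
def repetition_distances(values):
    """Return the repetition distance for every repeated value in a trace.

    The distance counts the executions between two productions of the
    same value, including the repeating instance itself, so a constant
    sequence yields distances of one.
    """
    last_seen = {}   # value -> index of its most recent production
    distances = []
    for i, v in enumerate(values):
        if v in last_seen:
            distances.append(i - last_seen[v])
        last_seen[v] = i
    return distances
```

For the constant trace {7, 7, 7} every distance is one, while the trace {1, 4, 17, 23, 1, 4, 17, 23} yields a distance of four throughout.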

Last-value predictors [6] and context-based value predictors [8,9] have been proposed to capture the value repetition that exists in programs. The primary difference between these two types of predictors is the repetition distance that can be exploited by each. The last-value predictors assume that the immediately previous value produced by an instruction will be produced again the next time it is executed. Thus, it assumes a repetition distance of one. The context-based predictors, on the other hand, generalize to arbitrary repetition distances by selecting one of several previous values as the next value predicted to be produced by an instruction. Computational predictors, such as the stride value predictor [7], compute the next predicted value using some predefined function of previously seen values, such as an increment by a constant stride.
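
The behavioral difference among these three predictor types can be sketched in a few lines (a functional illustration only; real predictors are finite hardware tables indexed by instruction address, and practical context predictors track longer histories than the order-1 scheme here):

```python
class LastValuePredictor:
    """Predicts the previous value (assumes a repetition distance of one)."""
    def __init__(self):
        self.last = None
    def predict(self):
        return self.last
    def update(self, value):
        self.last = value

class StridePredictor:
    """Predicts the last value plus the most recently observed stride."""
    def __init__(self):
        self.last = None
        self.stride = 0
    def predict(self):
        return None if self.last is None else self.last + self.stride
    def update(self, value):
        if self.last is not None:
            self.stride = value - self.last
        self.last = value

class ContextPredictor:
    """Order-1 context predictor: remembers which value followed each value,
    so it can exploit repetition distances larger than one."""
    def __init__(self):
        self.followers = {}   # value -> value that followed it last time
        self.last = None
    def predict(self):
        return self.followers.get(self.last)
    def update(self, value):
        if self.last is not None:
            self.followers[self.last] = value
        self.last = value
```

On the trace {1, 2, 3, ...} only the stride predictor is accurate, while on {4, 7, 9, 4, 7, 9, ...} only the context predictor is, which is exactly why matching each instruction to the right component predictor matters.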

In this section, we further categorize the predictability behavior of instructions with the goal of developing compiler algorithms to statically identify the value predictability pattern of each instruction.

3.1 Value Predictability Patterns for Static Instructions

We classify static instructions into three different groups according to the predictability behavior that they exhibit. This classification was developed by observing both the run-time behavior of programs and the source code level structures that can be naturally analyzed by compilers.

• Single_Pattern -- These instructions have exactly one type of value predictability during the execution of the program. For example, the sequence {1, 2, 3, 4, 5, 6, ...} demonstrates a computable value predictability pattern while the sequence {4, 7, 9, 4, 7, 9, ...} shows a single type of value repetition. We further subdivide this group below.

• Mixed_Pattern -- Instructions in this category may interleave both the value repetition and the value computability types of behavior. The value sequence {1, 2, 3, 5, 5, 5, 5, 1, 2, 3, 5, 5, ...}, for instance, shows both repetition of values and a stride-based computable pattern.

• Unknown -- This group includes static instructions whose values are not predictable during the program execution.

It is useful to further divide static instructions in the Single_Pattern group into the following three subgroups:

• The Single_Last subgroup includes those instructions that have a constant repetition distance of one. That is, the instruction usually produces the same value as the immediately previous instance.

• The Single_Stride subgroup includes instructions whose values are computable using a constant stride.

• The Single_Context subgroup includes instructions with a value repetition distance larger than one. For example, the sequence of dynamic instruction instances {1, 4, 17, 23, 1, 4, 17, 23, 1, ...} exhibits value repetition with a repetition distance of four. The corresponding static instruction thus belongs to the Single_Context subgroup.

Note that static instructions in both the Single_Last and Single_Context subgroups show similar value repetition behavior, differing only in the repetition distance. Logically, they could be combined into a single group with an associated value repetition distance. However, the Single_Last type is pulled out as a separate subgroup because instructions with this behavior occur quite frequently in the programs tested.

Each group and subgroup corresponds to a particular type of program structure. Instructions in the Single_Last group typically are assigned a value at the beginning of a loop's execution and are never changed within the loop. These instructions are said to be part of loop invariant computations. On the other hand, a loop induction variable, that is, a variable whose successive values form an arithmetic progression in a loop [17], produces the Single_Stride behavior. An instruction within the innermost loop of a set of nested loops can produce the Single_Context behavior when the inner loop causes the instruction to produce some arbitrary sequence of values and the outer loop causes this sequence to be repeated. The repetition distance for the values produced by such an instruction will be the number of iterations in the innermost loop.
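
This correspondence can be made concrete with a toy loop nest (an invented example of our own, not taken from the benchmarks), where three conceptual "instructions" produce one trace per Single_Pattern subgroup:

```python
# Value traces produced by three conceptual instructions inside a
# two-level loop nest, one trace per Single_Pattern subgroup.
inner_vals = [1, 4, 17, 23]    # arbitrary values computed by the inner loop
invariant_trace, stride_trace, context_trace = [], [], []

base = 100   # set once before the loops: a loop invariant computation
i = 0        # induction variable spanning all iterations
for outer in range(3):
    for inner in range(len(inner_vals)):
        invariant_trace.append(base)             # Single_Last: distance one
        stride_trace.append(i * 8)               # Single_Stride: stride 8
        i += 1
        context_trace.append(inner_vals[inner])  # Single_Context: distance 4
```

The context trace repeats {1, 4, 17, 23} once per outer iteration, so its repetition distance equals the inner loop's iteration count, exactly as described above.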

While the above simple program features cause the Single_Pattern behaviors, program structures that cause the Mixed_Pattern behaviors are more complicated. Figure 1 shows a fragment of a small program that illustrates one type of situation in which the Mixed_Pattern behaviors could occur. In the two-level nested loop, there are three static instructions, I1, I2, and I3, that produce different value predictability patterns, as noted in the comments in this figure. Instruction I1 cannot be moved outside of the loops by an ordinary loop invariant removal compiler optimization because I1 does not dominate the use of variable r1 in I3 [17]. It is obvious that this instruction will produce only a single value throughout the loop's execution. Instruction I2 contains an induction variable on the inner loop and so demonstrates a Single_Stride pattern. Note that the operand r1 of instruction I3 has two possible definitions, one defined by I1 and the other by I2. Since these definitions occur in complementary control paths that both converge at I3, the value predictability patterns of I1 and I2 will determine the value predictability pattern of I3. As a result, I3 shows a Mixed_Pattern behavior. Specifically, it shows the value repetition behavior with a repetition distance of one in the first iteration of the inner loop while showing the value computability behavior in the subsequent iterations of the inner loop.

Instructions in single-level loops that perform complex computations will produce an arbitrary sequence of values during execution. For example, traversing a linked list would often generate address values that have no value predictability. Instructions in this type of program structure would demonstrate an Unknown value predictability pattern.

Figure 1. An example of a program structure that causes a Mixed_Pattern value predictability behavior.

3.2 Compiler Classification

Now that we have identified an appropriate set of groups with which to classify the value predictability of instructions, we develop compiler algorithms to classify each static instruction in a program. We begin by describing an idealized compiler algorithm that uses the execution profile of a program to classify the instructions. This idealized compiler is used to provide an upper bound on the performance potential of this classification approach. We then describe our implementation of an approximate classification in the GCC compiler.

3.2.1 Idealized compiler

The classification information for the idealized compiler is collected by a profiler as the program is executed. The profiler stores in execution order the output values produced by up to 100 consecutive instances of each static instruction. Whenever a new instance of an instruction is encountered, the value repetition behavior (with a repetition distance up to 100) and the computable stride are checked simultaneously using three counters. One counter is used to determine value repetition behavior with a repetition distance of one, the second counter tracks repetition distances larger than one, and the third is used to determine if there is a computable stride.

Figure 2. The distribution of different value predictability patterns when using the idealized profile-based value predictability classification algorithm. The left-hand bar for each program corresponds to the static instruction count and the right-hand bar corresponds to the dynamic instruction count.

At the end of a program's execution, if only the counter for a repetition distance of one is non-zero, the instruction is assigned to the Single_Last group. Similarly, if only the stride counter is non-zero, or if only the counter for repetition distances larger than one is non-zero, the instruction is assigned to the Single_Stride group or the Single_Context group, respectively. Otherwise, if two or three different counters are non-zero, the instruction is assigned to the Mixed_Pattern group. Finally, if all of these counters are zero, the instruction is assigned to the Unknown group.
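
In outline, the three-counter bookkeeping can be sketched as follows (a simplification of our own: the actual profiler tracks up to 100 previous values per static instruction, whereas the stride test here uses only the two most recent values):

```python
def classify(values, window=100):
    """Assign one static instruction's value trace to a predictability
    group using three counters, mimicking the idealized profiler."""
    rep_one = rep_far = strided = 0
    history, prev, prev2 = [], None, None
    for v in values:
        if prev is not None:
            if v == prev:
                rep_one += 1                               # distance one
            elif prev2 is not None and v == prev + (prev - prev2):
                strided += 1                               # constant stride
            elif v in history[-window:]:
                rep_far += 1                               # distance > 1
        history.append(v)
        prev2, prev = prev, v
    active = sum(1 for c in (rep_one, rep_far, strided) if c > 0)
    if active == 0:
        return "Unknown"
    if active > 1:
        return "Mixed_Pattern"
    if rep_one:
        return "Single_Last"
    return "Single_Stride" if strided else "Single_Context"
```

Applied to the example traces from Section 3.1, a constant trace maps to Single_Last, {1, 2, 3, ...} to Single_Stride, {4, 7, 9, 4, 7, 9, ...} to Single_Context, and the interleaved trace {1, 2, 3, 5, 5, 5, 5, ...} to Mixed_Pattern.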

Figure 2 shows the distribution of instructions executed by the test programs when divided into the above five value predictability patterns using the idealized profile-based compiler algorithm. From these results, we can see that, on average, the Single_Last and Single_Stride categories together comprise about 40% of both the static and dynamic instruction counts. An average of 50-60% of both the static and dynamic instruction counts are classified as showing the Mixed_Pattern value predictability behavior. Typically, no more than about 5-10% of the instructions are classified as either Single_Context or Unknown.

3.2.2 Practical Compiler Heuristic

The statistics in Figure 2 suggest that a practical compiler algorithm to perform this static classification of value predictability behavior can ignore the instructions in the Unknown group and the Single_Context subgroup since they occur relatively infrequently. The task then becomes to partition the whole program into three categories: Single_Last, Single_Stride and Mixed_Pattern. In fact, we develop a compiler algorithm to classify instructions into only the Single_Last and Single_Stride categories. Every instruction that is not classified into either of these two categories must be a Mixed_Pattern, Unknown, or Single_Context instruction. Since the fraction of instructions that the profile-based algorithm determined were either Unknown or Single_Context is quite small, we simply combine instructions in these two categories with the Mixed_Pattern instructions to make a single category called Others.

In this paper, we describe an approximate static classification algorithm that we implemented in the GCC compiler, version 2.6.3. The algorithm uses only the information available within a single function at a time since this is the most basic level for practical compilers. It could therefore be implemented in any compiler, even one without interprocedural analysis, accurate alias analysis, or other modern control-flow and data-flow analysis algorithms. As a result, this function-level approximate algorithm provides a lower bound on what could actually be achieved in a real compiler on a real system.

As described in Section 3.1, loops are the basic source-level program structures that lead to the Single_Last and Single_Stride value predictability behavior. Consequently, identifying instructions that fall into the Single_Last and Single_Stride categories can be accomplished by using the compiler to identify loop invariant computations and loop induction computations, respectively.

Many loop invariant computations can be eliminated by a traditional loop invariant removal compiler optimization. This optimization tries to identify computations that never change within a loop and then moves them outside of the loop body. There are several compile-time limitations that can hinder this optimization, however. For example, if the compiler has no information about the life span of a particular result produced within a loop, it will be unable to determine the benefit of removing the computation from the loop. In this case, it simply may leave the computation within the loop. In GCC, for instance, a computation is moved out of a loop only once. As a result, a computation that has already been moved out of an inner loop will not be moved again outside of an outer enclosing loop, even though it may be invariant within the outer loop as well.

Another situation in which a compiler typically cannot move a loop invariant computation out of the loop body is when the loop invariant computation does not dominate (in a graph-theoretic sense) all of the uses of it within the loop and all the exits of the loop [17]. The compiler cannot move the computation out of the loop in this case since it cannot prove that correct program operation will be maintained.

A third situation that limits the compiler's ability to move loop invariant computations outside the loop is due to function boundaries. Although functions can be called within loops, a function-level compiler has no interprocedural information since it analyzes only one function at a time. As a result, a function-level compiler ignores all the loop invariant computations that occur within called functions. In Figure 3, for example, the function abc is called by the function foo in a for loop. Instruction I1 in function abc obviously is a loop invariant computation within the loop in the function foo. However, when the function-level compiler analyzes the function abc, instruction I1 does not appear within a loop, so it is not identified as a loop invariant computation. As a result, no loop invariant removal optimization is applied to instruction I1. Note that interprocedural analysis cannot always solve this problem if a function is called from multiple sites. Function in-lining can be used to make instruction I1 visible as a loop invariant computation in this example, although in-lining can introduce other difficulties.

The above three situations at least partially explain why the Single_Last value repetition behavior still exists in programs at run-time even after the compiler has aggressively tried to remove loop invariant computations. Similarly, the Single_Stride behavior still exists at run-time even though there are compiler algorithms to remove loop induction computations [17]. The basic optimization is to convert the arithmetic progression produced by a simple loop induction computation into a corresponding closed-form function of the loop index. However, the values of the induction variables themselves are typically used within the loop iterations. Consequently, most of the loop induction computations are retained within the loop after the compiler has completed its optimizations.

Figure 3. An example of a loop invariant computation that is not visible to a compiler that has no interprocedural analysis capabilities.

Based on the above analysis, the identification of instructions that belong to the Single_Last and Single_Stride categories can be divided into two steps in the function-level compiler. The first step is to apply the traditional loop invariant and loop induction computation identification algorithms after the normal loop optimization pass. These algorithms will find the Single_Last and Single_Stride instructions within the loops of the current function being analyzed. Next, we identify potential Single_Last and Single_Stride instructions by assuming that every function analyzed is called within a loop. This allows us to put instructions in the Single_Last group which would otherwise not be identified by a traditional compiler as being loop invariants.

Note that our classification of instructions is used only to assist a later value prediction at run-time and not to produce any program transformations that must be guaranteed to be correct to maintain the semantics of the program. As a result, if our assumption is wrong for some functions, that is, if the functions are not called within loops, the computation still will be correct. The worst that will happen is that the value prediction performed at run-time will be incorrect. This wrong prediction may impact performance, but not program correctness. Furthermore, notice that, if the function is not called within a loop, it will be executed a relatively small number of times. As a result, the computation that our algorithm identified as being potentially loop invariant will be executed only a relatively small number of times. Thus, misclassifying the instructions in this function as Single_Last will have very little impact on the overall classification.

The algorithm to identify the potential loop invariant computations in the second step above requires only simple function-level data-flow analysis. First, we define the original data source to be the type of variable that provides the input data for a function. We identify these original data sources to be immediate values (CONST), function parameters (ARG), returned values (RET), global values (GLOBAL) and stack pointer related values (SP). An original definition is a definition of the value of a variable within a function that has one of these original data sources on the right side of an assignment statement. Thus, the type for an original definition is one of CONST, ARG, RET, GLOBAL and SP. The data-flow analysis begins by finding all the original definitions in a function that have specific original definition types. The compiler then applies traditional def-use data-flow analysis on the entire function to associate all the original definitions with each instruction. Finally, each instruction is categorized as Single_Last if all of its original definitions are of the CONST type, even if they are not within loops in the current function.
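
For straight-line code, this propagation of original-definition types reduces to a single forward pass. The sketch below is our own invention (the IR and names are hypothetical; the real implementation runs def-use analysis over GCC's intermediate representation and handles control flow):

```python
# Original data sources as defined in the text.
ORIGINAL_SOURCES = {"CONST", "ARG", "RET", "GLOBAL", "SP"}

def mark_single_last(instructions):
    """For a straight-line list of (dest, srcs) pairs, where each src is
    either a register name or an original-source tag, label an instruction
    Single_Last when every original definition reaching it is CONST."""
    origins = {}    # register -> set of original-source types reaching it
    labels = []
    for dest, srcs in instructions:
        kinds = set()
        for s in srcs:
            kinds |= {s} if s in ORIGINAL_SOURCES else origins.get(s, {"?"})
        origins[dest] = kinds
        labels.append("Single_Last" if kinds == {"CONST"} else "Others")
    return labels
```

A value derived purely from constants stays Single_Last even through copies and arithmetic, while mixing in any ARG, RET, GLOBAL, or SP source demotes the instruction to Others.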

In addition to using the above algorithms to identify instructions in the Single_Last category, we can identify some Single_Last instructions using specific features of the instruction set [15]. For example, when the number of bits in the immediate field of an instruction format limits the size of the immediate value that can be handled by an instruction, larger constants often are manipulated using a standard sequence of instructions. In the MIPS architecture, for instance, the "lui" instruction commonly is used for generating this type of large immediate value. We take advantage of these instruction set architecture-specific behaviors by using the op-code generated by the compiler to identify these types of instructions as belonging to the Single_Last category. In this study, in particular, we optimistically classify all "lui" instructions as Single_Last.

3.3 Architectural support

To pass the information acquired by the compiler to the processor, we add a three-bit extension to each instruction. These extra three bits identify one of the five predictability patterns identified by the idealized profiler-based compiler (Single_Last, Single_Context, Single_Stride, Mixed_Pattern, Unknown), or one of the three predictability patterns identified by the real compiler (Single_Last, Single_Stride and Others).

4 Performance Evaluation

We use the above compiler-directed classification algorithm to statically assign instructions to one of three different component predictors within an overall value predictor. This combined predictor is used within a wide-issue superscalar processor to predict the values that will be produced by individual instructions. We evaluate the performance of this statically-classified predictor using both the idealized profile-based algorithm and the function-level algorithm implemented in the GCC compiler. These two implementations provide upper and lower bounds, respectively, on the performance that could be expected from this type of statically-classified predictor. We compare our results to both a simple hybrid value predictor and a predictor that dynamically distributes instructions to the component predictors using only run-time information.

4.1 Simulation methodology

We have developed a cycle-accurate execution-driven simulator derived from the sim-outorder simulator in the SimpleScalar tool set [1] for this study. The baseline superscalar processor supports out-of-order instruction issue and execution. To prevent resource and control dependences from becoming performance bottlenecks, and to instead allow us to focus on the performance potential of the value predictors themselves, we assume perfect branch prediction and a sufficient number of function units. The processor can issue 16 instructions per cycle out of a 256-entry instruction window, and the load-store queue has 64 entries.

There are many ways to incorporate the value predictors into a superscalar processor and it is beyond the scope of this paper to propose an optimal design. In this study, we adopt a conventional approach for integrating the value predictors into the processor pipelines. The value predictors are indexed by the instruction addresses during the instruction fetch stage. The predicted values then are inserted into the reservation stations of the data-dependent instructions after the instructions are decoded in the dispatch stage. These instructions in the reservation stations are marked as having one or more predicted input values. An instruction in the reorder buffer will be scheduled and executed when all of its operands are available and there is an available functional unit, regardless of whether or not the operands are predicted. When the actual results are produced, the predicted value is verified and updated during the write-back stage. If the actual result is the same as the predicted result, the execution is continued without any interruption. Otherwise, only those instructions that were issued using this incorrectly predicted value will be invalidated and selectively reissued with the correct value [11]. Values are predicted only for load instructions and for register instructions whose outputs are needed by other instructions in the current instruction window [3].
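The verify-and-reissue step at the end of the pipeline description can be sketched as follows. This is our own minimal abstraction, not the simulator's actual code; in the real machine this bookkeeping happens per reservation-station entry during write-back:

```python
# Minimal sketch (ours) of selective reissue at write-back: when the
# actual result disagrees with the prediction, only the instructions
# that already issued using the wrong predicted value are squashed and
# reissued with the correct value [11].
def verify_prediction(predicted, actual, dependents_issued):
    """Return the list of instructions that must be selectively reissued."""
    if predicted == actual:
        return []                    # speculation succeeded, no squash
    return list(dependents_issued)   # reissue consumers of the bad value

assert verify_prediction(42, 42, ["i1", "i2"]) == []       # correct guess
assert verify_prediction(42, 7, ["i1", "i2"]) == ["i1", "i2"]
```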

4.2 Benchmark programs

Benchmark   Input Set       No. of Inst.   No. of Inst. Need   IPC for     Full Speedup over
                            Executed       Prediction          Baseline    Baseline Machine
                            (million)      (million)           Machine     (%)
go          5,9             82             63                  7.29         8.0
m88ksim     dcrand.lit      240            164                 9.34        68.0
gcc         jump.i          38             25                  8.54        14.4
compress    10000 e 2231    35             23                  8.70        17.1
li          queen6.lsp      41             24                  9.98         8.0
ijpeg       spectriv.ppm    98             76                  10.61       33.2
perl        scrabbl.in      40             25                  7.50        30.6
vortex      persons.250     66             39                  8.67        18.0

Table 2. The run-time characteristics of the SPEC95 integer benchmark programs used in this study

We use eight integer programs from the SPEC95 benchmark suite for this performance evaluation. The input sets used, the total instruction counts, and the IPC of the baseline processor are shown in Table 2 for these test programs. The fourth column in this table shows the total number of instructions that need to query the value predictor. The last column shows the speedup obtained compared to the baseline processor when using a simple value predictor (the SVP described in Section 4.3). This comparison assumes that the value predictor always has a sufficient number of read/write ports available so that there are never any conflicts among instructions trying to access the predictor at the same time. All of the programs were compiled using our modified GCC 2.6.3 compiler with -O3 optimization and loop unrolling enabled. A few of the input sets were modified slightly to control the simulation time.

4.3 The comparison value predictors

We compare our Static Classification Value Predictor (SCVP) to two other value predictors. The first one is the Simple Value Predictor (SVP). This hybrid predictor consists of three component predictors: a last-value predictor, a stride predictor, and a context-based predictor. Every predicted instruction is allocated an entry in each of the three component predictors. When a value is to be predicted, each component predictor is queried. We assume an idealized predictor such that the correct prediction from among the three component predictors is always selected, provided that the 3-bit confidence counter for that component predictor exceeds the predefined prediction threshold value. This predictor can still mispredict when all of its component predictors mispredict. Furthermore, no prediction will be made when the confidence counters of all of its component predictors are below the prediction threshold, or if the instruction is not in the prediction table and so causes a table miss. This SVP serves as an ideal hybrid value predictor and is used in computing the average number of requests in Table 1 and the ideal speedups shown in Table 2.
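The idealized SVP selection rule above can be sketched in a few lines. The threshold value and the function names here are our own assumptions, used only to make the oracle-selection semantics concrete:

```python
# Sketch (ours) of the idealized SVP selection rule: each component
# predictor supplies a prediction and a 3-bit confidence counter. The
# ideal selector always picks the correct prediction if any confident
# component produced it; it mispredicts only when every confident
# component is wrong, and makes no prediction when no component is
# confident (or the instruction misses in the table).
THRESHOLD = 4  # assumed prediction threshold for the 3-bit counters

def svp_predict(components, actual):
    """components: list of (predicted_value, confidence) pairs.
    Returns 'correct', 'mispredict', or 'no_prediction'."""
    confident = [v for v, c in components if c >= THRESHOLD]
    if not confident:
        return "no_prediction"
    if actual in confident:          # ideal selector always finds it
        return "correct"
    return "mispredict"              # all confident components were wrong

assert svp_predict([(5, 7), (9, 2)], actual=5) == "correct"
assert svp_predict([(5, 7)], actual=6) == "mispredict"
assert svp_predict([(5, 1)], actual=5) == "no_prediction"
```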

The second predictor used in our comparisons is the Dynamic Classification Value Predictor (DCVP) [3]. This predictor uses a classification table to dynamically assign instructions into one of the three component predictors of the above SVP. The first time an instruction whose output needs value prediction is executed, it is assigned the Unknown type and forwarded to the classification table. The output value is stored in the classification table at this time. When it is executed the second time, a stride value is calculated by subtracting the old output value stored in the table from the new output value. The stride value calculated and the new output value then are stored in the table. If this instruction is executed again, a new stride value is calculated and compared with the old one stored in the table. If both of them are equal to zero, this instruction is assigned to the Last type. Otherwise, if they are equal but are not zero, this instruction is assigned to the Stride type. Finally, if they are not equal, this instruction is assigned to the Context type.
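The DCVP classification rule just described reduces to a comparison of two successive strides after the instruction's third execution. A minimal sketch (ours, following the description in [3]):

```python
# Sketch of the DCVP classification decision after three executions of
# an instruction: two successive strides between output values determine
# the access type, following the rules described in the text.
def classify(v1, v2, v3):
    """Classify an instruction from three successive output values."""
    s_old = v2 - v1                  # stride after the second execution
    s_new = v3 - v2                  # stride after the third execution
    if s_old == 0 and s_new == 0:
        return "Last"                # both strides zero: value repeats
    if s_old == s_new:
        return "Stride"              # equal, non-zero strides
    return "Context"                 # strides differ

assert classify(8, 8, 8) == "Last"
assert classify(4, 8, 12) == "Stride"
assert classify(4, 8, 5) == "Context"
```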

An instruction that has been assigned an access type is dispatched to the component value predictor with the corresponding access type and will be predicted by that value predictor the next time it is executed. When the dynamic instances of this instruction cannot be correctly predicted by that value predictor several times in a row, this instruction is removed from its current value predictor, is assigned the Unknown type again, and is forwarded back to the classification table. After another classification stage, this instruction is assigned a new access type that may be the same as or different from the previous access type.

Since the value predictability pattern of an instruction can change from time to time during a program's execution, its prediction classification may change frequently in this mechanism. Frequent reclassifications will lower the performance of this predictor due to the time required to warm up the predictor for an instruction after it has been reclassified. Note that the DCVP needs additional storage to hold the classification information for each instruction and a complicated control engine to accomplish the above tasks. However, we do not include the additional storage and the overhead of this control engine in the following simulations. As a result, the simulation results for the DCVP are somewhat optimistic.

Our Static Classification Value Predictor (SCVP) used in the following experiments also includes a last-value predictor, a stride predictor and a hybrid predictor similar to the SVP. The two different levels of compiler capability that we simulate, specifically, the idealized profile-based compiler scheme, denoted SCVP_P, and the actual GCC compiler implementation, denoted SCVP_C, use these component predictors in slightly different ways. Both compiler-based mechanisms use the last-value predictor to hold those instructions that were statically classified as showing the Single_Last value prediction behavior. Both of them also use the stride predictor for instructions in the Single_Stride category. With the idealized profile-based compiler scheme (SCVP_P), though, the hybrid predictor is used for instructions classified as either Mixed_Pattern or Single_Context. Instructions classified as Unknown are not predicted with the SCVP_P mechanism. However, with the real SCVP_C compiler mechanism, the hybrid predictor is used to predict all instructions classified as Others, which includes the Mixed_Pattern, Single_Context, and Unknown categories.
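The static dispatch used by the two SCVP variants amounts to a small fixed mapping from the compiler-assigned category to a component predictor. The table below is our own summary of the rules just stated (the category names follow the text; the dictionary form is illustrative):

```python
# Sketch of the static category-to-component dispatch for the two SCVP
# variants described above. A value of None means the instruction is
# never predicted.
DISPATCH_P = {                       # idealized profile-based SCVP_P
    "Single_Last":    "last",
    "Single_Stride":  "stride",
    "Single_Context": "hybrid",
    "Mixed_Pattern":  "hybrid",
    "Unknown":        None,          # SCVP_P never predicts Unknown
}
DISPATCH_C = {                       # GCC-based SCVP_C
    "Single_Last":   "last",
    "Single_Stride": "stride",
    "Others":        "hybrid",       # includes Context, Mixed, Unknown
}

assert DISPATCH_P["Unknown"] is None     # the key SCVP_P/SCVP_C difference
assert DISPATCH_C["Others"] == "hybrid"  # SCVP_C predicts everything else
```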

Predictor   Component predictor table sizes (entries)
SCVP        Last: 4k    Stride: 1k    Hybrid: Last 4k, Stride 1k, Context 8k
SVP         Last: 8k    Stride: 2k    Context: 8k
DCVP        Classify: 4k    Last: 4k    Stride: 2k    Context: 8k

Table 3. The table sizes used in the following simulations for the SCVP, SVP and DCVP value predictors

To fairly compare these different value predictors, we chose the sizes of the component predictors such that the total storage resources allocated to each of the three predictors being compared are constant. Table 3 shows the specific sizes used in each of the configurations. The table sizes for each component predictor in the classification predictors (SCVP and DCVP) were chosen based on the observed static and dynamic classification distributions (see Fig. 2 and [3]).
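As a quick arithmetic check of the equal-storage claim, the entry counts in Table 3 (as we read them) sum to the same total for each predictor:

```python
# Sanity check (ours) that the Table 3 configurations hold the total
# number of predictor entries constant across the three predictors.
# Sizes are in K entries, as read from the table.
scvp = {"last": 4, "stride": 1, "hybrid_last": 4,
        "hybrid_stride": 1, "hybrid_context": 8}
svp  = {"last": 8, "stride": 2, "context": 8}
dcvp = {"classify": 4, "last": 4, "stride": 2, "context": 8}

# Each predictor is allocated 18k entries in total.
assert sum(scvp.values()) == sum(svp.values()) == sum(dcvp.values()) == 18
```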

4.4 Evaluation Metrics

The most important evaluation metric is the overall machine performance obtained when the baseline processor incorporates one of the predictors being compared. Three additional metrics are defined to evaluate the properties of the value predictors themselves. The Correct prediction rate is the percentage of correct predictions out of the total number of instructions executed. Similarly, the Misprediction rate is the percentage of predictions that predict the wrong values out of the total number of instructions executed. The Non-prediction rate is the percentage of all instructions in which a prediction is attempted, but the predictor does not return a value. This type of nonprediction occurs when the confidence counter is below the predefined threshold required to make a prediction, or when the instruction being queried is not in the table, and so causes a table miss. The sum of these three rates reflects the bandwidth that a value predictor can provide. However, only the Correct prediction rate and the Misprediction rate influence the overall processor performance.
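The three rates defined above share a common denominator, the total number of instructions executed. A minimal sketch (ours) makes the definitions concrete:

```python
# Sketch of the three predictor metrics defined in the text. All three
# rates are percentages of the total number of instructions executed,
# so their sum reflects the bandwidth the predictor provides.
def rates(correct, wrong, no_pred, total_insts):
    return (100.0 * correct / total_insts,   # Correct prediction rate
            100.0 * wrong / total_insts,     # Misprediction rate
            100.0 * no_pred / total_insts)   # Non-prediction rate

c, m, n = rates(correct=40, wrong=5, no_pred=5, total_insts=100)
assert (c, m, n) == (40.0, 5.0, 5.0)
assert c + m + n == 50.0   # bandwidth: half the instructions were attempted
```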

The correct predictions can improve performance by breaking true data dependencies. Mispredictions, on the other hand, may decrease performance since instructions that have been previously issued with an incorrectly predicted value must be squashed and reexecuted. Note that not all the correct predictions actually produce a performance improvement, though. As a result, there is not necessarily a proportional relationship between the speedup and the Correct prediction rate. The value of the misprediction penalty is set to one clock cycle to correspond to the misprediction penalties commonly assumed in other studies [3,11].

5. Analysis of Results

In this section, we first compare the classification capabilities of the real GCC compiler implementation to the idealized profile-based compiler. We then compare the overall processor performance obtained when using the different value predictors assuming no resource constraints. Finally, we compare the performance obtained with the different predictors when the number of read-write ports in each predictor is limited.

5.1 The capability of the real compiler

Figure 4a compares the classification capabilities of both the idealized compiler and the real compiler that uses only function-level information. The Others category in this figure represents all instructions that are not Single_Last or Single_Stride. We observe that the real compiler is able to identify an average of approximately half of the instructions classified as Single_Last and Single_Stride by the idealized compiler. For go, the real compiler performs almost as well as the idealized compiler. For ijpeg, however, the real compiler can identify fewer than 30% of the instructions in the Single_Last and Single_Stride categories combined, even though these two categories account for approximately 50-70% of all of the instructions identified by the idealized compiler.

There are two primary parts of the Single_Last and Single_Stride categories that the real compiler cannot readily identify. The first part consists of those instructions that have invariant original definitions where the type of the original definition is one of GLOBAL, ARG, RET or SP. (Recall that our algorithm identifies as Single_Last only those instructions whose original definitions are all of type CONST.) The second part consists of those instructions that appear in function prologues or epilogues, denoted LOG. It is beyond the capability of a compiler with only function-level analysis to classify these types of instructions, since this classification requires interprocedural information.

(a)(b)

Figure 4. (a) The distribution of the value predictability pattern identified by the idealized compiler and the GCC-based compiler analysis. The four bars in each program show the following, starting at the left-most bar in each group: 1) static instruction count, idealized compiler; 2) dynamic instruction count, idealized compiler; 3) static instruction count, GCC-based compiler; 4) dynamic instruction count, GCC-based compiler.

(b) The dynamic instruction distribution of instructions in the Single_Last category using the GCC-based compiler showing the source of the Single_Last value predictability behavior.

Figure 4b shows a detailed distribution of the types of instructions in the Single_Last category as identified by the real compiler implementation. Each bar in this figure is divided into five components (from bottom to top): 1) instructions with only CONST original definitions; 2) those that are loop invariant computations within the current function; 3) those whose op-code is "lui"; 4) those with invariant original definitions other than CONST; and 5) those that appear in the function prologue or epilogue. From this figure, we can see that, on average, around 45% of the instructions in the Single_Last category have original definitions that are something other than CONST. The ijpeg program is the exception, with this portion above 90%. On average, another 5% of the instructions in the Single_Last category appear in function prologues and epilogues. In the go program, fewer than 10% of the instructions in the Single_Last category have an invariant original definition that is other than CONST or that is in a function prologue or epilogue. As a result, most of the Single_Last instructions in this program can be identified by a function-level compiler.

5.2 Performance comparisons with unlimited resources

Figure 5a shows the speedups obtained with no resource constraints when using the four value predictors compared to the baseline processor with no value prediction. In particular, there are a sufficient number of read/write ports in each predictor in these simulations to ensure that there are no conflicts among instructions trying to access the predictor simultaneously. We can see that the performance of all of the schemes is not substantially different in most cases. We do see that the performance of the DCVP scheme, which dynamically classifies instructions and distributes them to the various component predictors, exceeds the other schemes only for li, though, while it performs worse than the other schemes for the remaining programs.

Figure 5. (a) The speedup over the baseline processor for the different value predictor mechanisms tested.

(b) The corresponding value predictor characteristics. From left to right, the four bars in each program group are: SCVP_P, SCVP_C, DCVP and SVP.

In almost all cases, the idealized profile-based compiler classification scheme, SCVP_P, produces the best performance. The exception is that the real compiler-based scheme, SCVP_C, slightly exceeds the profile-based scheme for gcc. This anomaly occurs because in the SCVP_C scheme, all instructions that are not classified as Single_Last or Single_Stride are classified as Others. All instructions in this Others category then are predicted by the hybrid predictor. In the SCVP_P scheme, though, instructions categorized as Unknown are never predicted. Thus, the SCVP_C scheme actually attempts to predict more instructions than the SCVP_P scheme. In this particular program, these extra predictions produced slightly better performance for the SCVP_C scheme compared to the SCVP_P scheme. Except for compress and gcc, the performance of the real compiler scheme is almost the same as that of the profile-based compiler scheme. This shows that, in the absence of resource conflicts, the real compiler can do an excellent job of classifying instructions into the component predictors, even when limited to function-level analysis.

Figure 5b shows the corresponding correct prediction, misprediction, and nonprediction rates for the configurations compared in Figure 5a. We see from these results that, when there is no contention for read/write ports, the correct prediction rate of the DCVP mechanism tends to be somewhat lower than that of the other mechanisms. This lower correct prediction rate occurs because instructions are often dynamically reclassified by the DCVP, which also shows up in its higher nonprediction rate compared to the others. Consequently, the speedup obtained with the DCVP mechanism is lower than that obtained with the other mechanisms. It is interesting to note that the misprediction rate for the DCVP is the smallest among all the predictors. That is, the predictions produced by the DCVP tend to be more accurate than those of the other predictors, but it cannot predict as many instructions as the others due to the relatively large number of times it must reclassify instructions. This comparison suggests that higher overall performance can be obtained by predicting more instructions even with slightly lower overall accuracy.

5.3 Performance comparisons with limited read/write ports

When the number of read/write ports in the value predictor is limited, some of the instructions that try to access the predictor will conflict with other instructions trying to access the predictor simultaneously. As a result, the prediction must be delayed to the next issue cycle, or the value will not be predicted at all. These conflicts will lower the correct prediction rate. Additionally, these conflicts will prevent retiring instructions from updating the predictor with the actual value produced in time for a subsequent prediction. This will introduce more stale values into the predictor which then will produce a higher misprediction rate. Both of these effects will decrease the performance produced when using a value predictor.
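The first-order effect of limited ports can be illustrated with a simple contention model. This sketch is ours and assumes the drop-on-conflict policy mentioned above (requests beyond the port count simply lose their chance to be predicted that cycle); a real design might instead delay them to the next cycle:

```python
# Illustrative model (ours) of read-port contention: with N ports, only
# the first N prediction requests in a cycle are served; the rest are
# dropped (one of the two policies described in the text).
def serve_requests(requests_per_cycle, num_ports):
    """requests_per_cycle: list of per-cycle request counts.
    Returns (served, dropped) totals."""
    served = dropped = 0
    for r in requests_per_cycle:
        served += min(r, num_ports)
        dropped += max(0, r - num_ports)
    return served, dropped

# With 4 ports, a burst of 6 requests in one cycle loses 2 predictions.
assert serve_requests([3, 1, 6], num_ports=4) == (8, 2)
```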

Figure 6 shows the speedup obtained compared to the baseline processor for the different value predictors when the number of read/write ports available, N, is varied among 1, 2, 4, and 8. From these figures, we can see that the profile-based compiler scheme, SCVP_P, provides the best speedup for all programs except li. The speedup produced when using the real compiler-based scheme, SCVP_C, is better than the DCVP scheme for half of the programs, while it produces slightly lower performance than the DCVP scheme for the remaining programs. When N is 1 or 2, however, the SCVP_C scheme produces the same or slightly better performance than the DCVP scheme for six of the eight programs.

java static 的使用方法

类方法 方法被声明为static后,则称为类方法。类方法相对于实例方法,前者区别于后者的地方:前者为属于该类的所有实例对象共享,无须实例化对象,仅通过类名即可访问(当然,是否能够直接访问,还取决于所声明的访问权限)。 因为被static化后产生上述特殊性,所以static变量都会在类被加载时初始化,而类方法也同时随类加载而进驻内存。先来段代码热热身吧~ 上段代码,输出结果为: null A Class 由结果可知,即字符串prvateStr的值为空。嘿,可别认为值应该是下面那样啊。那样 的话,进行下去就太具挑战性了。 A Class A Class 请记住一点,类变量初始化的顺序与其在类中的赋值顺序一致。

重写(覆盖) 或许大家对于面向对象编程语言最初的印象就是其语言单元是以父类、子类的关系存在着,而构建这一关系的就是继承机制了。子类可以继承父类一切非private的变量与方法,并且可以添加自己的变量与方法。在构建一个系统时,这机制让我们强烈地感觉到编程是一 门优雅的艺术。 来段小小的代码简单地展示下: 结果如下: Jack I am a thinking animal, and a Programmer

如上,子类Programmer中并没定义字符串characteristic,但我们却在其方法printProfession()中调用了;同样,我们正常使用了父类定义的printName()方法。而这就 是继承的简单实现。 继承不仅仅带来以上特性。它还赋予子类重写(覆盖)父类方法的能力(因为旨在讲类方法的重写,所以这儿就不讲重载以及变量在继承机制中的问题了)。方法的重写(覆盖):继承父类的子类,可以通过拟具有相同方法名与参数组的方法来重写父类中对应的方法,从而让子类更个性化。又因为重写(覆盖)的出现,多态也随之产生。多态:通过父类变量可以引用其子类对象,从而调用子类中那些继承自自己并被重写(覆盖)的方法。

Java--static关键字的

static关键字 如果使用一个类则会在实例化对象时分别开辟栈内存及堆内存,在堆内存中要保存对象中的属性,每个对象都有自己的属性,如果你现在有些属性希望被共享,则就必行将其声明为static属性,而且一个属性声明为static属性,可以直接使用类名称进行调用,如果一个类中的方法想由类调用,则可以声明为static方法。 一.使用static声明属性 如果程序中使用static声明属性,则属性成为全局属性(有些也称为静态属性),那么声明为全局属性到底有什么用吶?观察以下代码: class Person{ // 定义Person类 String name ; // 定义name属性,暂时不封装 int age ; // 定义age属性,暂时不封装 String country = "A城" ; // 定义城市属性,有默认值 public Person(String name,int age){ https://www.doczj.com/doc/444179791.html, = name ; this.age = age; } public void info(){ // 得到信息 System.out.println("姓名:" + https://www.doczj.com/doc/444179791.html, + ",年龄:" + this.age + ",城市:" + country) ; } }; public class StaticDemo01{ public static void main(String args[]){ Person p1 = new Person("张三",30) ; // 实例化对象 Person p2 = new Person("李四",31) ; // 实例化对象 Person p3 = new Person("王五",32) ; // 实例化对象 https://www.doczj.com/doc/444179791.html,() ; https://www.doczj.com/doc/444179791.html,() ; https://www.doczj.com/doc/444179791.html,() ; } }; 运行结果: 姓名:张三,年龄:30,城市:A城 姓名:李四,年龄:31,城市:A城 姓名:王五,年龄:32,城市:A城 以上代码,为了观察方便没有使用private关键字进行封装。以上的程序是一个简单的程序,但是代码中有些不妥之处。 实际上,如果现在假设此城市不叫A城,而改为了B城,而且此类产生了200个对象,那么就意味着要把这些对象的城市属性全部修改一边。这样显然是不行的。最好的方法是修改一次就可以,此时可以把城市属性使用static关键字进行声明,将其变为公共属性。 使用static声明属性: class Person{ // 定义Person类

JAVA语言中的final修饰符

final关键字可用于修饰类,变量和方法,final关键字有点类似c#里的sealed关键字(如果大家学过C#就知道),它用于表示它修饰的类,方法和变量不可改变。 final变量 final修饰变量时,表示该变量一旦获得了初始值就不可改变,final既可修饰成员变量(包括类变量和实例变量),也可以修饰局部变量,形参。严格来说final修饰的变量不要被改变,一旦获得初始值之后,该final变量的值就不能被重新赋值。 因为final变量获得初始值之后不能被重新赋,因此final修饰成员变量和修饰局部变量时有一定的不同:下面我将会写到有哪些方面的不同,还有就是为什么会不同。 final修饰成员变量 成员变量是随类初始化或对象初始化而初始化的。当类初始化时,系统会为该类属性分配内存,并分配默认值;当创建对象时,系统会为该对象的实例属性分配内存,并分配默认值。也就是说,当执行静态初始化块时可以对类属性赋初始值,当执行普通初始块,构造器时可对实例属性赋初始值。因此,成员变量的初始值可以在定义该变量时指定默认值,可以在初始化块,构造器中指定初始值,否则,成员变量的初始值将是由系统自动分配的初始值。 对于final修饰的成员变量而言,一旦有了初始值之后,就不能重新赋值,因此不可以在普通方法中对成员变量重新赋值。成员变量只能在定义该成员变量时指定默认值,或者在静态初始化块,初始化块,构造器中为成员变量指定初始值,如果既没有在定义成员变量时指定初始值,也没有在初始化块,构造器中为成员变量指定初始值,那么这些成员变量的值将一直是0,\u0000,false null这些成员变量也就失去了存在的意义。 因此当使用final修饰成员变量的时候,要么在定义成员变量时候指定初始值,要么

c语言关键字的用法详解优选稿

c语言关键字的用法详 解 集团文件版本号:(M928-T898-M248-WU2669-I2896-DQ586-M1988)

1.Static用法 1.1static声明的变量在C语言中有两方面的特征: 1)、变量会被放在程序的全局存储区中,这样可以在下一次调用的时候还可以保持原来的赋值。这一点是它与堆栈变量和堆变量的区别。 2)、变量用static告知编译器,自己仅仅在变量的作用范围内可见。这一点是它与全局变量的区别。 1.2特点 A.若全局变量仅在单个C文件中访问,则可以将这个变量修改为静态全局变量,以降低模块间的耦合度; B.若全局变量仅由单个函数访问,则可以将这个变量改为该函数的静态局部变量,以降低模块间的耦合度; C.设计和使用访问动态全局变量、静态全局变量、静态局部变量的函数时,需要考虑重入问题; D.如果我们需要一个可重入的函数,那么,我们一定要避免函数中使用static变量(这样的函数被称为:带“内部存储器”功能的的函数) E.函数中必须要使用static变量情况:比如当某函数的返回值为指针类型时,则必须是static的局部变量的地址作为返回值,若为auto类型,则返回为错指针。 函数前加static使得函数成为静态函数。但此处“static”的含义不是指存储方式,而是指对函数的作用域仅局限于本文件(所以又称内部函

数)。使用内部函数的好处是:不同的人编写不同的函数时,不用担心自己定义的函数,是否会与其它文件中的函数同名。 扩展分析:术语static有着不寻常的历史.起初,在C中引入关键字st atic是为了表示退出一个块后仍然存在的局部变量。随后,static在C 中有了第二种含义:用来表示不能被其它文件访问的全局变量和函数。为了避免引入新的关键字,所以仍使用static关键字来表示这第二种含义。最后,C++重用了这个关键字,并赋予它与前面不同的第三种含义:表示属于一个类而不是属于此类的任何特定对象的变量和函数(与Java 中此关键字的含义相同)。 1.3关键字static的作用是什么? 1.4 这个简单的问题很少有人能回答完全。在C语言中,关键字static有三个明显的作用: 1.4.1在函数体,一个被声明为静态的变量在这一函数被调用过程中维持其值不变。 int testStatic() { int x=1; x++; return x; }

java基础知识点总结

Created by AIwen on 2017/5/14. java是面向对象的程序设计语言;类可被认为是一种自定义的数据类型,可以使用类来定义变量,所有使用类定义的变量都是引用变量,它们将会引用到类的对象。类用于描述客观世界里某一类对象的共同特征,而对象则是类的具体存在,java程序使用类的构造器来创建该类的对象。 java也支持面向对象的三大特征:封装、继承、和多态。java提供了private、protected、和public三个访问控制修饰符来实现良好的封装,提供了extends关键字让子类继承父类,子类继承父类就可以继承到父类的成员变量和和方法,如果访问控制允许,子类实例可以直接调用父类里定义的方法。继承是实现类复用的重要手段。使用继承关系来实现复用时,子类对象可以直接赋给父类变量,这个变量具有多态性。 面向对象的程序设计过程中有两个重要的概念:类(Class)和对象(object,也被称为实例,instance)。类可以包含三种最常见的成员:构造器、成员变量、和方法。 构造器用于构造该类的实例,java语言通过new关键字类调用构造器,从而返回该类的实例。构造器是一个类创建对象的根本途径,如果一个类没有构造器,这个类通常无法创建实例。因此java语言提供了一个功能:如果程序员没有为一个类编写构造器,则系统会为该类提供一个默认的构造器,这个构造器总是没有参数的。一旦程序员为一个类提供了构造器,系统将不再为该类提供构造器。 构造器用于对类实例进行初始化操作,构造器支持重载,如果多个重载的构造器里包含了相同的初始化代码,则可以把这些初始化代码放置在普通初始化块里完成,初始化块总在构造器执行之前被调用。静态初始化块代码用于初始化类,在类初始化阶段被执行。如果继承树里某一个类需要被初始化时,系统将会同时初始化该类的所有父类。 构造器修饰符:可以是public、protected、private其中之一,或者省略构造器名:构造器名必须和类名相同。 注意:构造器既不能定义返回值类型,也不能使用void声明构造器没有返回值。如果为构造器定义了返回值类型,或使用void声明构造器没有返回值,编译时不会出错,但java会把这个所谓的构造器当成方法来处理——它就不再是构造器。 实际上类的构造器是有返回值的,当使用new关键字来调用构造器时,构造器返回该类的实例,可以把这个类的实例当成构造器的返回值。因此构造器的返回值类型总是当前类,无须定义返回值类型。不要在构造器里显式的使用return来返回当前类的对象,因为构造器的返回值是隐式的。 java类名必须是由一个或多个有意义的单词连缀而成的,每个单词首字母大写,其他字母全部小写,单词与单词之间不要使用任何分隔符。 成员变量: 成员变量的修饰符:public、protected、private、static、final前三个只能出现一个再和后面的修饰符组合起来修饰成员变量,也可省略。 成员变量:由一个或者多个有意义的单词连缀而成,第一个单词首字母小写,后面每个单词首字母大写,其他字母全部小写,单词与单词之间不要使用任何分隔符。 类型:可以是java语言允许的任何数据类型,包括基本类型和引用类型。 成员方法: 方法修饰符:public、protected、private、static、final、abstract,前三个只能出现一个,static和final最多只能出现其中的一个,和abstract组合起来使用。也可省略。 返回值类型:可以是java语言的允许的任何数据类型,包括基本类型和引用类型。 方法名:和成员变量的方法命名规则相同,通常建议方法名以英文动词开头。 方法体里多条可执行语句之间有严格的执行顺序,排在方法体前面的语句总先执行,排在方法体后面的语句总是后执行。 static是一个特殊的关键字,它可用于修饰方法、成员变量等成员。static修饰的成员表明它属于这个类本身,而

Java关键字final使用总结

Java关键字final使用总结 一、final 根据程序上下文环境,Java关键字final有“这是无法改变的”或者“终态的”含义,它可以修饰非抽象类、非抽象类成员方法和变量。你可能出于两种理解而需要阻止改变:设计或效率。 final类不能被继承,没有子类,final类中的方法默认是final的。 final方法不能被子类的方法覆盖,但可以被继承。 final成员变量表示常量,只能被赋值一次,赋值后值不再改变。 final不能用于修饰构造方法。 注意:父类的private成员方法是不能被子类方法覆盖的,因此private 类型的方法默认是final类型的。 1、final类 final类不能被继承,因此final类的成员方法没有机会被覆盖,默认都是final的。在设计类时候,如果这个类不需要有子类,类的实现细节不允许改变,并且确信这个类不会载被扩展,那么就设计为final类。 2、final方法 如果一个类不允许其子类覆盖某个方法,则可以把这个方法声明为final 方法。 使用final方法的原因有二: 第一、把方法锁定,防止任何继承类修改它的意义和实现。 第二、高效。编译器在遇到调用final方法时候会转入内嵌机制,大大提高执行效率。 例如:

3、final变量(常量) 用final修饰的成员变量表示常量,值一旦给定就无法改变! final修饰的变量有三种:静态变量、实例变量和局部变量,分别表示三种类型的常量。 从下面的例子中可以看出,一旦给final变量初值后,值就不能再改变了。 另外,final变量定义的时候,可以先声明,而不给初值,这中变量也称为final空白,无论什么情况,编译器都确保空白final在使用之前必须被初始化。但是,final空白在final关键字final的使用上提供了更大的灵活性,为此,一个类中的final 数据成员就可以实现依对象而有所不同,却有保持其恒定不变的特征。

super关键字用法

使用super来引用父类的成分,用this来引用当前对象一、super关键字 在JAVA类中使用super来引用父类的成分,用this来引用当前对象,如果一个类从另 外一个类继承,我们new这个子类的实例对象的时候,这个子类对象里面会有一个父类对象。怎么去引用里面的父类对象呢?使用super来引用,this指的是当前对象的引用,super是当前对象里面的父对象的引用。 1.1.super关键字测试 1package cn.galc.test; 2 3/** 4 * 父类 5 * @author gacl 6 * 7*/ 8class FatherClass { 9public int value; 10public void f() { 11 value=100; 12 System.out.println("父类的value属性值="+value); 13 } 14 } 15 16/** 17 * 子类ChildClass从父类FatherClass继承 18 * @author gacl 19 * 20*/ 21class ChildClass extends FatherClass { 22/**

23 * 子类除了继承父类所具有的valu属性外,自己又另外声明了一个value属性, 24 * 也就是说,此时的子类拥有两个value属性。 25*/ 26public int value; 27/** 28 * 在子类ChildClass里面重写了从父类继承下来的f()方法里面的实现,即重写了f()方法的方法体。 29*/ 30public void f() { 31super.f();//使用super作为父类对象的引用对象来调用父类对象里面的f()方法 32 value=200;//这个value是子类自己定义的那个valu,不是从父类继承下来的那个value 33 System.out.println("子类的value属性值="+value); 34 System.out.println(value);//打印出来的是子类自定义的那个value的值,这个值是200 35/** 36 * 打印出来的是父类里面的value值,由于子类在重写从父类继承下来的f()方法时, 37 * 第一句话“super.f();”是让父类对象的引用对象调用父类对象的f()方法, 38 * 即相当于是这个父类对象自己调用f()方法去改变自己的value 属性的值,由0变了100。 39 * 所以这里打印出来的value值是100。 40*/ 41 System.out.println(super.value); 42 } 43 } 44 45/** 46 * 测试类 47 * @author gacl 48 * 49*/ 50public class TestInherit { 51public static void main(String[] args) { 52 ChildClass cc = new ChildClass();

java中如何使用Static的变量和方法

如何使用Static的变量和方法 有时你希望定义一个类成员,使它的使用完全独立于该类的任何对象。通常情况下,类成员必须通过它的类的对象访问,但是可以创建这样一个成员,它能够被它自己使用,而不必引用特定的实例。在成员的声明前面加上关键字static(静态的)就能创建这样的成员。如果一个成员被声明为static,它就能够在它的类的任何对象创建之前被访问,而不必引用任何对象。你可以将方法和变量都声明为static。static 成员的最常见的例子是main( ) 。因为在程序开始执行时必须调用main() ,所以它被声明为static。 声明为static的变量实质上就是全局变量。当声明一个对象时,并不产生static变量的拷贝,而是该类所有的实例变量共用同一个static变量。声明为static的方法有以下几条限制: 1.它们仅能调用其他的static 方法。 2.它们只能访问static数据。 它们不能以任何方式引用this 或super(关键字super 与继承有关)。 如果你需要通过计算来初始化你的static变量,你可以声明一个static块,Static 块仅在该类被加载时执行一次。下面的例子显示的类有一个static方法,一些static变量,以及一个static 初始化块: // Demonstrate static variables,methods,and blocks. class UseStatic { static int a = 3; static int b; static void meth(int x) { System.out.println("x = " + x); System.out.println("a = " + a); System.out.println("b = " + b); } static { System.out.println("Static block initialized."); b = a * 4; } public static void main(String args[]) { meth(42); } } 一旦UseStatic 类被装载,所有的static语句被运行。首先,a被设置为3,接着static 块执行(打印一条消息),最后,b被初始化为a*4 或12。然后调用main(),main() 调用meth() ,把值42传递给x。3个println ( ) 语句引用两个static变量a和b,以及局部变量x 。 注意:在一个static 方法中引用任何实例变量都是非法的。 下面是该程序的输出: Static block initialized. x = 42 a = 3 b = 12

c++static关键字

C/C++中的static关键字 C/C++中的static有两种用法: 面向过程程序设计中的static和面向对象程序设计中的static。前者应用于普通变量和函数,不涉及类的问题。 A. 面向过程程序设计中的stati c关键字 1) 静态全局变量 在全局变量前,加上关键字static,该变量就被定义成为一个静态全局变量。静态全局变量定义和使用类似:#include using namespace std; void fn(); static int n; //定义静态全局变量 void main() { n=20; cout< using namespace std; void fn(); void main()

{ fn(); fn(); fn(); } void fn() { static n=10; // 定义了静态局部变量,仅初始化一次! cout << n < using namespace std; static void fn(); //声明静态函数 void main() { fn(); } void fn() //定义静态函数 { int n=10;

Java 中的 static 使用之静态方法

Java 中的static 使用之静态方法 与静态变量一样,我们也可以使用 static 修饰方法,称为静态方法或类方法。其实之前我们一直写的 main 方法就是静态方法。静态方法的使用如: 需要注意: 1、静态方法中可以直接调用同类中的静态成员,但不能直接调用非静态成员。如: 如果希望在静态方法中调用非静态变量,可以通过创建类的对象,然后通过对象来访问非静态变量。如: 、在普通成员方法中,则可以直接访问同类的非静态变量和静态变量,如下所示:

3、静态方法中不能直接调用非静态方法,需要通过对象来访问非静态方法。如: ava 中的static 使用之静态初始化块 Java 中可以通过初始化块进行数据赋值。如: 在类的声明中,可以包含多个初始化块,当创建类的实例时,就会依次执行这些代码块。如果使用 static 修饰初始化块,就称为静态初始化块。 需要特别注意:静态初始化块只在类加载时执行,且只会执行一次,同时静态初始化块只能给静态变量赋值,不能初始化普通的成员变量。 我们来看一段代码:

运行结果: 通过输出结果,我们可以看到,程序运行时静态初始化块最先被执行,然后执行普通初始化块,最后才执行构造方法。由于静态初始化块只在类加载时执行一次,所以当再次创建对象 hello2 时并未执行静态初始化块。 封装 1、概念: 将类的某些信息隐藏在类内部,不允许外部程序直接访问,而是通过该类提供的方法来实现对隐藏信息的操作和访问 2、好处 a:只能通过规定的方法访问数据 b:隐藏类的实例细节,方便修改和实现。 什么是Java 中的内部类 问:什么是内部类呢? 答:内部类( Inner Class )就是定义在另外一个类里面的类。与之对应,包含内部类的类被称为外部类。 问:那为什么要将一个类定义在另一个类里面呢?清清爽爽的独立的一个类多好啊!! 答:内部类的主要作用如下:

Java中super的几种用法并与this的区别

4.super和this的异同: 1)super(参数):调用基类中的某一个构造函数(应该为构造函数中的第一条语句) 2)this(参数):调用本类中另一种形成的构造函数(应该为构造函数中的第一条语句) 3)super:它引用当前对象的直接父类中的成员(用来访问直接父类中被隐藏的父类中成员数据或函数,基类与派生类中有相同成员定义时如:super.变量名super.成员函数据名(实参) 4)this:它代表当前对象名(在程序中易产生二义性之处,应使用this来指明当前对象;如果函数的形参与类中的成员数据同名,这时需用this来指明成员变量名) 5)调用super()必须写在子类构造方法的第一行,否则编译不通过。每个子类构造方法的第一条语句,都是隐含地调用super(),如果父类没有这种形式的构造函数,那么在编译的时候就会报错。 6)super()和this()类似,区别是,super()从子类中调用父类的构造方法,this()在同一类内调用其它方法。 7)super()和this()均需放在构造方法内第一行。 8)尽管可以用this调用一个构造器,但却不能调用两个。 9)this和super不能同时出现在一个构造函数里面,因为this必然会调用其它的构造函数,其它的构造函数必然也会有super语句的存在,所以在同一个构造函数里面有相同的语句,就失去了语句的意义,编译器也不会通过。 10)this()和super()都指的是对象,所以,均不可以在static环境中使用。包括:static 变量,static方法,static语句块。 11)从本质上讲,this是一个指向本对象的指针, 然而super是一个Java关键字。1.静态方法 通常,在一个类中定义一个方法为static,那就是说,无需本类的对象即可调用此方法声明为static的方法有以下几条限制: 1)它们仅能调用其他的static 方法。 2)它们只能访问static数据。 3)它们不能以任何方式引用this 或super。 class Simple { static void Go() { System.out.println("Welcome"); } } public class Cal { public static void main(String[] args) { Simple.go(); } } 调用一个静态方法就是“类名.方法名”,静态方法的使用很简单如上所示。一般来说,静态方法常常为应用程序中的其它类提供一些实用工具所用,在Java的类库中大量的静态方法正是出于此目的而定义的。 2. 静态变量 声明为static的变量实质上就是全局变量。当声明一个对象时,并不产生static变量的拷贝,而是该类所有的实例变量共用同一个static变量。静态变量与静态方法类似。所有此类实例

c语言关键字的用法详解

1. Static用法 1.1 static声明的变量在C语言中有两方面的特征: 1)、变量会被放在程序的全局存储区中,这样可以在下一次调用的时候还可以保持原来的赋值。这一点是它与堆栈变量和堆变量的区别。 2)、变量用static告知编译器,自己仅仅在变量的作用范围内可见。这一点是它与全局变量的区别。 1.2 特点 A.若全局变量仅在单个C文件中访问,则可以将这个变量修改为静态全局变量,以降低模块间的耦合度; B.若全局变量仅由单个函数访问,则可以将这个变量改为该函数的静态局部变量,以降低模块间的耦合度; C.设计和使用访问动态全局变量、静态全局变量、静态局部变量的函数时,需要考虑重入问题; D.如果我们需要一个可重入的函数,那么,我们一定要避免函数中使用static变量(这样的函数被称为:带“内部存储器”功能的的函数) E.函数中必须要使用static变量情况:比如当某函数的返回值为指针类型时,则必须是static 的局部变量的地址作为返回值,若为auto类型,则返回为错指针。 函数前加static使得函数成为静态函数。但此处“static”的含义不是指存储方式,而是指对函数的作用域仅局限于本文件(所以又称内部函数)。使用内部函数的好处是:不同的人编写不同的函数时,不用担心自己定义的函数,是否会与其它文件中的函数同名。 扩展分析:术语static有着不寻常的历史.起初,在C中引入关键字static是为了表示退出一个块后仍然存在的局部变量。随后,static在C中有了第二种含义:用来表示不能被其它文件访问的全局变量和函数。为了避免引入新的关键字,所以仍使用static关键字来表示这第二种含义。最后,C++重用了这个关键字,并赋予它与前面不同的第三种含义:表示属于一个类而不是属于此类的任何特定对象的变量和函数(与Java中此关键字的含义相同)。 1.3 关键字static的作用是什么? 这个简单的问题很少有人能回答完全。在C语言中,关键字static有三个明显的作用:

static和this的理解和用法总结

static和this的理解和用法小结 关键字static和this是初学者比较头疼的知识点,自己也一直比较模糊.现在整理一下,既可以加深自己的印象也可以便于以后查询. 其实在think in java里关于为什么要使用static写的比较详细,不明白的多读几遍会有很大的收获.一般在两钟情形下需要使用static关键字:一种情形是只想用一个存储区域来保存一个特定的数据——无论要创建多少个对象,甚至根本不创建对象。另一种情形是我们需要一个特殊的方法,它没有与这个类的任何对象关联。也就是说,即使没有创建对象,也需要一个能调用的方法。一旦将什么东西设为static,数据或方法就不会同那个类的任何对象实例联系到一起.所以尽管从未创建那个类的一个对象,仍能调用一个static方法,或访问一些static数据。而在这之前,对于非static数据和方法,我们必须创建一个对象,并用那个对象访问数据或方法。这是由于非static数据和方法必须知道它们操作的具体对象.有这样的一个类,其中定义一个静态数据: class Test { Static int i = 47; } Test st1 = new StaticTest();Test st2 = new StaticTest();即使们new了两个Test对象,但它们仍然只占据Test.i的一个存储空间。这两个对象都共享同样的i。对方法来说,static一项重要的用途就是帮助我们在不必创建对象的前提下调用那个方法. 静态变量)一个静态对象属于一个类,但它不属于实例对象,也不是实例对象状态的一部分.每一个静态变量只存在一份.静态变量通常称为类变量(class variable).在实际中,经常需要这样的一个变量,它被一个类的所有实例对象所共享,如果它同时又是公有的,那么它就可以被这个类的任意访问者所使用.静态变量存在于类的作用域之内.通常声明为private.java中许多时候会用到public static final 这样的变量。静态变量可以被位于同一个作用域内的任意方或静态方法访问,访问时使用变量名称即可。如果在类作用域以外访问类,则要使用一个含有类名的表达式访问静态变量,例如: Integer.MAX_VALUE, 其中MAX_VALUE是在类库中声明过的。 静态方法)静态方法或类方法属于一个而不是属于某个实例对象实现的一部分。可以直接通过类来调用这种方法,而并不是只能由某个特定的实例对象调用。静态的方法不能用abstract声明,而且无论是否明确地指定实际上都是final型的。静态方法的声明格式: modifiers static typeName methodName (parameterList){ statementSequence } modifiers(可以从public,protect,private中选择一个),后面可以加上 final,nativc,synchronized中的一个或几个的组合。 static main是静态方法的一个特殊用法,用static main 方法来建立程序的初始状态,创建一组初始对象,并进行合理的方法调用,使得程序能够继续执行下去,static main方法使用String数组型参数包含了用户在运行时给出的任意命令行参数。

Java Study Notes

1. The difference between print, println, and printf

print: displays the text in the console window; the output cursor stays just after the last character.
println: displays the text in the console window and moves the output cursor to the beginning of the next line.
printf: displays formatted text in the console window, leaving the cursor after the last character; it comes from C, where it is the formatted-output function (from stdio.h).

2. Exceptions

If there are multiple catch clauses, a catch for a parent exception class must be placed after those for its subclasses; otherwise it would catch all of its subclass exceptions and the subclass catch clauses could never execute. In general, a finally block is placed after the last catch block; it executes whether or not the program throws an exception.

The difference between throw and throws:
Difference 1: throw is a statement that raises one exception; throws declares that a method may throw exceptions.
throw syntax: throw <exception object>
Adding a throws clause to a method declaration states that the method may throw those exceptions.
throws syntax: [<modifiers>] <returnType> <methodName>([<parameterList>]) [throws <exceptionClasses>]
where several exception classes may be declared, separated by commas.
Difference 2: throws can be used on its own, but throw cannot.
Difference 3: throw must be paired either with try-catch-finally or with throws, whereas throws can be used alone, leaving the exception to be caught by whichever method handles it.

Checked and unchecked exceptions

Java separates checked and unchecked exceptions sharply. Checked exceptions are verified by the compiler: for any method declared to throw one, the compiler enforces the handle-or-declare rule. Unchecked exceptions do not follow the handle-or-declare rule; when one occurs, no particular action is required, and the compiler does not check whether it has been dealt with. Two main classes define unchecked exceptions: RuntimeException and Error. An unchecked exception may be propagated without a throws declaration, whereas a checked exception must be declared with throws before it can be raised with throw.

Why are Error subclasses unchecked? Because there is no way to predict when they will occur. If a Java application runs out of memory, an OutOfMemoryError can appear at any time; the cause is usually not a particular call in the application but a problem in the JVM itself. Moreover, Error generally represents a serious problem that the application cannot resolve, so these classes are treated as unchecked.

RuntimeException is also unchecked, partly because run-time exceptions arising from ordinary JVM operations can occur at any time. Unlike Error, such an exception is typically triggered by a specific operation, but those operations occur constantly in a Java application. For example, if you had to write exception-handling code to check for a null reference every time you used an object, the whole application would soon become one enormous try-catch block. Run-time exceptions are therefore exempt from compiler checking and the handle-or-declare rule. Another reason RuntimeException is unchecked is that the problems it represents do not necessarily have to be handled as exceptions: you can handle a NullPointerException in a try-catch structure, but testing the reference for null before using it is simpler and cheaper; likewise you can check for a zero divisor instead of relying on ArithmeticException.
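The catch-ordering and throw/throws rules above can be sketched in one small program (the class and method names are illustrative; NumberFormatException is unchecked, so its throws clause here is for illustration only):

```java
// Demonstrates throw vs. throws and catch ordering:
// the more specific catch must precede the parent-class catch.
public class ExceptionDemo {
    // throws declares that this method may propagate the exception
    static int parse(String s) throws NumberFormatException {
        if (s == null) {
            // throw raises a single exception object
            throw new NumberFormatException("null input");
        }
        return Integer.parseInt(s);
    }

    public static void main(String[] args) {
        try {
            parse(null);
        } catch (NumberFormatException e) { // subclass first
            System.out.println("specific: " + e.getMessage());
        } catch (RuntimeException e) {      // parent class last
            System.out.println("general");
        } finally {
            System.out.println("finally always runs"); // runs either way
        }
    }
}
```

Swapping the two catch clauses would be a compile-time error, because the RuntimeException clause would make the NumberFormatException clause unreachable.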

Java Notes (Using the super Keyword)

Using the super keyword

The super keyword appears in a subclass; its main purpose is to let the subclass use the contents of the parent class, that is, to call the parent's fields or methods.

Using super to call a parent constructor:

    class Person {
        String name;
        int age;
        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    class Student extends Person {
        String school;
        public Student() {
            super("张三", 27);
        }
    }

    public class TestPersonStudentDemo {
        public static void main(String[] args) {
            Student s = new Student();
            s.school = "北京";
            System.out.println("我是:" + s.name + ",今年:" + s.age + "岁,学校:" + s.school);
        }
    }

Output: 我是:张三,今年:27岁,学校:北京

This program explicitly calls the parent's two-argument constructor from the subclass constructor, so at compile time the compiler no longer looks for a no-argument constructor in the parent class. A super(...) call to a parent constructor may appear only as the first line of the subclass constructor.

Calling the parent's fields and methods through super:

class Person

    {
        String name;
        int age;
        public Person() {
        }
        public String talk() {
            return "我是:" + this.name + ",今年:" + this.age + "岁";
        }
    }

    class Student extends Person {
        String school;
        public Student(String name, int age, String school) {
            // use super here to access the parent's fields
            super.name = name;
            super.age = age;
            // call the parent's talk() method
            System.out.print(super.talk());
            // access this class's own field
            this.school = school;
        }
    }

    public class TestPersonStudentDemo3 {
        public static void main(String[] args) {
            Student s = new Student("张三", 27, "北京");
            System.out.println(",学校:" + s.school);
        }
    }

Output: 我是:张三,今年:27岁,学校:北京

Restricting subclass access: sometimes a parent class does not want subclasses to be able to access all of the fields or methods in its own class, so some members need to be hidden from the subclass, for example by declaring them private.
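A small self-contained sketch (the Animal/Dog names are illustrative) of the rule that super(...) must be the first statement, and that when it is omitted the compiler inserts an implicit call to the parent's no-argument constructor:

```java
// If a subclass constructor does not call super(...) explicitly,
// the compiler inserts an implicit call to the parent's no-arg constructor.
class Animal {
    String kind;
    Animal() { this.kind = "unknown"; }      // no-arg constructor
    Animal(String kind) { this.kind = kind; }
}

class Dog extends Animal {
    Dog() {
        // implicit super() runs here first, setting kind to "unknown"
    }
    Dog(String kind) {
        super(kind);  // explicit call must be the first statement
    }
}

public class SuperDemo {
    public static void main(String[] args) {
        System.out.println(new Dog().kind);        // prints unknown
        System.out.println(new Dog("husky").kind); // prints husky
    }
}
```

If Animal had only the one-argument constructor, `Dog()` would fail to compile, which is exactly why the earlier example had to call super("张三", 27) explicitly.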

On the Usage of static

On page 86 of Thinking in Java there is this passage: "A static method has no this. You cannot call non-static methods from inside static methods (although the reverse is possible), and you can call a static method for the class itself, without any object. In fact, that is primarily what a static method is for." Although the passage only describes what is special about static methods, it shows the basic role of the static keyword, which can be summed up in one sentence: it makes a method or variable convenient to call or access without creating an object.

Obviously, a method or variable modified by the static keyword does not need an object in order to be accessed: as long as the class has been loaded, it can be reached through the class name. static can modify a class's member methods and member variables, and static blocks can also be written to improve program performance.

1) static methods

static methods are generally called static because they can be accessed without depending on any object. A static method therefore has no this: it is not attached to any object, and with no object there can be no this. Because of this property, a static method cannot access the class's non-static member variables or non-static member methods, since those must be invoked on a concrete object. Note, however, that while static methods cannot access non-static member methods and variables, non-static member methods can access static methods and variables freely. Here is a simple example:
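The example itself appears to have been lost from this copy (it was likely an image); the following is a plausible reconstruction, consistent with the names MyObject, print1, print2, and str2 used in the discussion that follows, and should be read as a sketch rather than the original code:

```java
// Reconstructed example: a static method cannot touch instance members,
// but an instance method can freely use static members.
public class MyObject {
    private String str1 = "instance field";
    private static String str2 = "static field";

    public void print1() {
        System.out.println(str1);      // ok: instance method sees both
        System.out.println(str2);
        print2();                      // ok: instance -> static
    }

    public static void print2() {
        System.out.println(str2);      // ok: static member
        // System.out.println(str1);   // compile error: no instance exists
        // print1();                   // compile error: no instance exists
    }
}
```

The two commented-out lines in print2 are exactly the accesses the text explains are forbidden.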

In the code above, the print2 method exists independently of any object and can be called directly through the class name. Suppose a static method could access non-static methods and variables; then if the main method contained the statement

    MyObject.print2();

there would be no object at all, so str2 would simply not exist, which is a contradiction. The same reasoning applies to methods: since there is no way to know whether print1 accesses non-static member variables, calling non-static member methods from a static member method is also forbidden. A non-static member method, on the other hand, may access static methods and variables without any restriction.

So if you want to call a method without creating an object, make that method static. The most familiar static method is main, and it is now clear why main must be static: when the program starts executing main, no object has yet been created, so the method can only be reached through the class name. Also remember that even though it is not explicitly declared static, a class's constructor is in effect a static method.

2) static variables

Assigning and Reading Static and Non-static Fields

    public class Test {
        static int age;   // modified by static: a static field
        int classNum;     // an instance (non-static) field

        public static void main(String[] args) {
            Test person = new Test();
            // 12 is assigned to classNum; as an instance field it exists only in this object's space
            person.classNum = 12;
            // 20 is assigned to the static field, which lives in the method area and is shared by all objects
            person.age = 20;
            // instance field
            System.out.println("person中classNum的值:" + person.classNum); // person中classNum的值:12
            System.out.println("person中age的值:" + person.age);           // person中age的值:20

            Test person2 = new Test();
            // in this newly created object, classNum does not carry over the earlier value,
            // but age still reads 20
            // instance field
            System.out.println("person2中classNum的值:" + person2.classNum); // person2中classNum的值:0
            System.out.println("person2中age的值:" + person2.age);           // person2中age的值:20
        }
    }

    package com.qianfeng.day07.demo4;

    public class Pet2 {
        static int i = 0;
        /* Typical scenarios for static:
         * 1. when all objects share one field, use a static field;
         * 2. together with final, it can serve as a constant;
         * 3. switching internationalized strings. */
        public Pet2() {
            i++;
            method();
        }
        public static void method() {
            System.out.println("生成的是第" + i + "个对象");
        }
    }
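The constructor-counting pattern of Pet2 can be exercised as follows (the class is renamed Pet here, with English output strings, to keep the sketch self-contained):

```java
// Each construction increments the shared counter, so the static
// field effectively numbers the objects as they are created.
class Pet {
    static int count = 0;

    Pet() {
        count++;
        System.out.println("Created object number " + count);
    }
}

public class PetDemo {
    public static void main(String[] args) {
        new Pet();   // Created object number 1
        new Pet();   // Created object number 2
        new Pet();   // Created object number 3
        System.out.println(Pet.count);  // prints 3
    }
}
```

Because count lives in the class rather than in any object, every constructor call sees and updates the same value, which is exactly scenario 1 in the comment above.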
