
arXiv:cs/0105022v1 [cs.AI] 11 May 2001

Corresponding Author: Li Min Fu, Department of Computer and Information Sciences, 301 CSE, P.O. Box 116120, University of Florida, Gainesville, Florida 32611. Phone: (352) 392-1485. E-mail: fu@cise.ufl.edu

Multi-Channel Parallel Adaptation Theory

for Rule Discovery

Li Min Fu

Abstract

In this paper, we introduce a new machine learning theory based on multi-channel parallel adaptation for rule discovery. This theory is distinguished from the familiar parallel-distributed adaptation theory of neural networks in terms of channel-based convergence to the target rules. We show how to realize this theory in a learning system named CFRule. CFRule is a parallel weight-based model, but it departs from traditional neural computing in that its internal knowledge is comprehensible. Furthermore, when the model converges upon training, each channel converges to a target rule. The model adaptation rule is derived by multi-level parallel weight optimization based on gradient descent. Since, however, gradient descent only guarantees local optimization, a multi-channel regression-based optimization strategy is developed to effectively deal with this problem. Formally, we prove that the CFRule model can explicitly and precisely encode any given rule set. Also, we prove a property related to asynchronous parallel convergence, which is a critical element of the multi-channel parallel adaptation theory for rule learning. Thanks to the quantizability nature of the CFRule model, rules can be extracted completely and soundly via a threshold-based mechanism. Finally, the practical application of the theory is demonstrated in DNA promoter recognition and hepatitis prognosis prediction.

Keywords: rule discovery, adaptation, optimization, regression, certainty factor, neural network, machine learning, uncertainty management, artificial intelligence.

1 Introduction

Rules express general knowledge about actions or conclusions in given circumstances and also principles in given domains. In the if-then format, rules are an easy way to represent cognitive processes in psychology and a useful means to encode expert knowledge. From another perspective, rules are important because they can help scientists understand problems and engineers solve problems. These observations would account for the fact that rule learning or discovery has become a major topic in both machine learning and data mining research. The former discipline concerns the construction of computer programs which learn knowledge or skill, while the latter is about the discovery of patterns or rules hidden in the data.

The fundamental concepts of rule learning are discussed in [16]. Methods for learning sets of rules include symbolic heuristic search [3, 5], decision trees [17-18], inductive logic programming [13], neural networks [2, 7, 20], and genetic algorithms [10]. A methodology comparison can be found in our previous work [9]. Despite the differences in their computational frameworks, these methods perform a certain kind of search in the rule space (i.e., the space of possible rules) in conjunction with some optimization criterion. Complete search is difficult unless the domain is small, and a computer scientist is not interested in exhaustive search due to its exponential computational complexity. It is clear that significant issues have limited the effectiveness of all the approaches described. In particular, we should point out that all the

algorithms except exhaustive search guarantee only local but not global optimization. For example, a sequential covering algorithm such as CN2 [5] performs a greedy search for a single rule at each sequential stage without backtracking and could make a suboptimal choice at any stage; a simultaneous covering algorithm such as ID3 [18] learns the entire set of rules simultaneously but searches incompletely through the hypothesis space because of attribute ordering; a neural network algorithm which adopts gradient-descent search is prone to local minima.

In this paper, we introduce a new machine learning theory based on multi-channel parallel adaptation that shows great promise in learning the target rules from data by parallel global convergence. This theory is distinct from the familiar parallel-distributed adaptation theory of neural networks in terms of channel-based convergence to the target rules. We describe a system named CFRule which implements this theory. CFRule bases its computational characteristics on the certainty factor (CF) model [4, 22] it adopts. The CF model is a calculus of uncertainty management and has been used to approximate standard probability theory [1] in artificial intelligence. It has been found that certainty factors associated with rules can be revised by a neural network [6, 12, 15]. Our research has further indicated that the CF model used as the neuron activation function (for combining inputs) can improve neural-network performance [8].

The rest of the paper is organized as follows. Section 2 describes the multi-channel rule learning model. Section 3 examines the formal properties of rule encoding. Section 4 derives the model parameter adaptation rule, presents a novel optimization strategy to deal with the local minimum problem due to gradient descent, and proves a property related to asynchronous parallel convergence, which is a critical element of the main theory. Section 5 formulates a rule extraction algorithm. Section 6 demonstrates practical applications. Then we draw conclusions in the final section.

2 The Multi-Channel Rule Learning Model

CFRule is a rule-learning system based on multi-level parameter optimization. The kernel of CFRule is a multi-channel rule learning model. CFRule can be embodied as an artificial neural network, but the neural network structure is not essential. We start with formal definitions about the model.

Definition 2.1 The multi-channel rule learning model M is defined by k (k ≥ 1) channels (Ch's), an input vector (M_in), and an output (M_out) as follows:

M ≡ (Ch_1, Ch_2, ..., Ch_k, M_in, M_out)    (1)

where −1 ≤ M_out ≤ 1 and

M_in ≡ (x_1, x_2, ..., x_d)    (2)

such that d is the input dimensionality and −1 ≤ x_i ≤ 1 for all i.

The model has only a single output because here we assume the problem is a single-class, multi-rule learning problem. The framework can be easily extended to the multi-class case.

Definition 2.2 Each channel (Ch_j) is defined by an output weight (u_j), a set of input weights (w_ji's), activation (φ_j), and influence (ψ_j) as follows:

Ch_j ≡ (u_j, w_j0, w_j1, w_j2, ..., w_jd, φ_j, ψ_j)    (3)

where w_j0 is the bias, 0 ≤ u_j ≤ 1, and −1 ≤ w_ji ≤ 1 for all i. The input weight vector (w_j1, ..., w_jd) defines the channel's pattern.

Definition 2.3 Each channel's activation is defined by

φ_j = f_cf(w_j0, w_j1 x_1, w_j2 x_2, ..., w_jd x_d)    (4)

where f_cf is the CF-combining function [4, 22], as defined below.

Definition 2.4 The CF-combining function is given by

f_cf(x_1, x_2, ..., y_1, y_2, ...) = f+_cf(x_1, x_2, ...) + f−_cf(y_1, y_2, ...)    (5)

where

f+_cf(x_1, x_2, ...) = 1 − ∏_i (1 − x_i)    (6)

f−_cf(y_1, y_2, ...) = −1 + ∏_j (1 + y_j)    (7)

in which the x_i's are nonnegative numbers and the y_j's are negative numbers.

As we will see, the CF-combining function contributes to several important computational properties instrumental to rule discovery.

Definition 2.5 Each channel's influence on the output is defined by

ψ_j = u_j φ_j    (8)

Definition 2.6 The model output M_out is defined by

M_out = f_cf(ψ_1, ψ_2, ..., ψ_k)    (9)
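To make the forward computation concrete, the following Python sketch implements Definitions 2.3 through 2.6 directly. It is an illustration only; the function and variable names are ours rather than part of CFRule.

```python
import numpy as np

def f_cf(values):
    """CF-combining function (Definition 2.4): combine the nonnegative
    arguments with f+ and the negative arguments with f-, then add.
    np.prod of an empty list is 1.0, so an empty group contributes 0."""
    pos = [v for v in values if v >= 0]
    neg = [v for v in values if v < 0]
    f_pos = 1.0 - np.prod([1.0 - v for v in pos])   # Eq. (6)
    f_neg = -1.0 + np.prod([1.0 + v for v in neg])  # Eq. (7)
    return f_pos + f_neg

def channel_activation(w, x):
    """Channel activation (Definition 2.3); w = (w_0, w_1, ..., w_d)
    with bias w_0, and x = (x_1, ..., x_d) with entries in [-1, 1]."""
    terms = [w[0]] + [w[i + 1] * x[i] for i in range(len(x))]
    return f_cf(terms)

def model_output(u, W, x):
    """Model output (Definitions 2.5-2.6): the channel influences
    psi_j = u_j * phi_j are combined by the same CF function."""
    influences = [u_j * channel_activation(w_j, x) for u_j, w_j in zip(u, W)]
    return f_cf(influences)
```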

We call the class whose rules are to be learned the target class, and define rules inferring (or explaining) that class to be the target rules. For instance, if the disease diabetes is the target class, then the diagnostic rules for diabetes would be the target rules. Each target rule defines a condition under which the given class can be inferred. Note that we do not consider rules which deny the target class, though such rules can be defined by reversing the class concept. The task of rule learning is to learn or discover a set of target rules from given instances called training instances (data). It is important that rules learned should be generally applicable to the entire domain, not just the training data. How well the target rules learned from the training data can be applied to unseen data determines the generalization performance.

Instances which belong to the target class are called positive instances; otherwise, they are called negative instances. Ideally, a positive training instance should match at least one target rule learned and vice versa, whereas a negative training instance should match none. So, if there is only a single target rule learned, then it must be matched by all (or most) positive training instances. But if multiple target rules are learned, then each rule is matched by some (rather than all) positive training instances. Since the number of possible rule sets is far greater than the number of possible rules, the problem of learning multiple rules is naturally much more complex than that of learning single rules.

In the multi-channel rule learning theory, the model learns to sort out instances so that instances belonging to different rules flow through different channels; at the same time, channels are adapted to accommodate their pertinent instances and learn the corresponding rules. Notice that this is a mutual process and it cannot occur all at once. In the beginning, the rules are not learned and the channels are not properly shaped, so both information flow and adaptation are more or less random; but through self-adaptation, the CFRule model will gradually converge to the correct rules, each encoded by a channel. The essence of this paper is to prove this property.

In the model design, a legitimate question is what the optimal number of channels is. This is just like the question raised for a neural network of how many hidden (internal computing) units should be used. It is true that too many hidden units cause data overfitting and make generalization worse [7]. Thus, a general principle is to use a minimal number of hidden units. The same principle can be equally well applied to the CFRule model. However, there is a difference. In ordinary neural networks, the number of hidden units is determined by the sample size, while in the CFRule model, the number of channels should match the number of rules embedded in the data. Since, however, we do not know how many rules are present in the data, our strategy is to use a minimal number of channels that admits convergence on the training data.

The model's behavior is characterized by three aspects:

• Information processing: compute the model output for a given input vector.

• Learning or training: adjust the channels' parameters (output and input weights) so that the input vector is mapped into the output for every instance in the training data.

• Rule extraction: extract rules from a trained model.

The first aspect has been described already.

3 Model Representation of Rules

The IF-THEN rule (i.e., if the premise, then the action) is a major knowledge representation paradigm in artificial intelligence. Here we analyze how such rules can be represented with proper semantics in the CFRule model.

Definition 3.1 CFRule learns rules in the form of

IF A+_1, ..., A+_i, ..., ¬A−_1, ..., ¬A−_j, ..., THEN the target class with a certainty factor.

where A+_i is a positive antecedent (in the positive form), A−_j a negated antecedent (in the negative form), and ¬ reads "not." Each antecedent can be a discrete or discretized attribute (feature), variable, or a logic proposition. The IF part must not be empty. The attached certainty factor in the THEN part, called the rule CF, is a positive real ≤ 1.

The rule's premise is restricted to a conjunction, and no disjunction is allowed. The collection of rules for a certain class can be formulated as a DNF (disjunctive normal form) logic expression, namely, the disjunction of conjunctions, which implies the class. However, rules defined here are not traditional logic rules because of the attached rule CFs meant to capture uncertainty. We interpret a rule by saying that when its premise holds (that is, all positive antecedents mentioned are true and all negated antecedents mentioned are false), the target concept holds at the given confidence level. CFRule can also learn rules with weighted antecedents (a kind of fuzzy rules), but we will not consider this case here.

There is increasing evidence to indicate that good rule encoding capability actually facilitates rule discovery in the data. In the theorems that follow, we show how the CFRule model can explicitly and precisely encode any given rule set. We note that the ordinary sigmoid-function neural network can only implicitly and approximately do this. Also, we note that although the threshold function of the perceptron model enables it to learn conjunctions or disjunctions, the non-differentiability of this function prohibits the use of an adaptive procedure in a multilayer construct.

Theorem 3.1 For any rule represented by Definition 3.1, there exists a channel in the CFRule model to encode the rule so that if an instance matches the rule, the channel's activation is 1, else 0.

(Proof): This can be proven by construction. Suppose we implement channel j by setting the bias weight to 1, the input weights associated with all positive attributes in the rule's premise to 1, the input weights associated with all negated attributes in the rule's premise to −1, the rest of the input weights to 0, and finally the output weight to the rule CF. Assume that each instance is encoded by a bipolar vector in which, for each attribute, 1 means true and −1 false. When an instance matches the rule, the following conditions hold: x_i = 1 if x_i is part of the rule's premise, x_i = −1 if ¬x_i is part of the rule's premise, and otherwise x_i can be of any value. For such an instance, given the above construction, it is true that w_ji x_i = 1 or 0 for all i. Thus, the channel's activation (by Definition 2.3),

φ_j = f_cf(w_j0 = 1, w_j1 x_1, w_j2 x_2, ..., w_jd x_d)    (10)

must be 1 according to f_cf. On the other hand, if an instance does not match the rule, then there exists i such that w_ji x_i = −1. Since w_j0 (the bias weight) = 1, the channel's activation is 0 due to f_cf. □

Theorem 3.2 Assume that the rule CFs > θ (0 ≤ θ ≤ 1). For any set of rules represented by Definition 3.1, there exists a CFRule model to encode the rule set so that if an instance matches any of the given rules, the model output is > θ, else 0.

(Proof): Suppose there are k rules in the set. As suggested in the proof of Theorem 3.1, we construct k channels, each encoding a different rule in the given rule set so that if an instance matches, say, rule j, then the activation (φ_j) of channel j is 1. In this case, since the channel's influence ψ_j is given by u_j φ_j (where u_j is set to the rule CF) and the rule CF > θ, it follows that ψ_j > θ. It is then clear that the model output must be > θ since it combines influences from all channels, each ≥ 0 and at least one > θ. On the other hand, if an instance fails to match any of the rules, all the channels' activations are zero, and so is the model output. □
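The constructions in these proofs are mechanical and easy to check numerically. Using the functions sketched in Section 2, the following snippet encodes a hypothetical rule (IF x1 and ¬x2 THEN the target class, with rule CF = 0.9) and reproduces the behavior claimed by Theorems 3.1 and 3.2:

```python
def encode_rule(pos_idx, neg_idx, d, rule_cf):
    """Build a channel for a rule as in the proof of Theorem 3.1:
    bias 1, +1 on positive antecedents, -1 on negated ones, 0 elsewhere."""
    w = [1.0] + [0.0] * d          # w[0] is the bias weight
    for i in pos_idx:
        w[i + 1] = 1.0
    for i in neg_idx:
        w[i + 1] = -1.0
    return rule_cf, w

# Hypothetical rule over d = 3 inputs: IF x1 and not x2 THEN target (CF = 0.9)
u1, w1 = encode_rule(pos_idx=[0], neg_idx=[1], d=3, rule_cf=0.9)
print(channel_activation(w1, [1, -1, 1]))    # instance matches     -> 1.0
print(channel_activation(w1, [1, 1, 1]))     # x2 true: mismatch    -> 0.0
print(model_output([u1], [w1], [1, -1, 1]))  # 0.9, i.e. > theta for theta < 0.9
```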

4 Model Adaptation and Convergence

In neural computing, the backpropagation algorithm [19] can be viewed as a multilayer, parallel optimization strategy that enables the network to converge to a local optimum solution. The black-box nature of the neural network solution is reflected by the fact that the pattern (the input weight vector) learned by each neuron does not bear meaningful knowledge. The CFRule model departs from traditional neural computing in that its internal knowledge is comprehensible. Furthermore, when the model converges upon training, each channel converges to a target rule. How to achieve this objective and what the mathematical theory is are the main issues to be addressed.

4.1 Model Training Based on Gradient Descent

The CFRule model learns to map a set of input vectors (e.g., extracted features) into a set of outputs (e.g., class information) by training. An input vector along with its target output constitute a training instance. The input vector is encoded as a 1/−1 bipolar vector. The target output is 1 for a positive instance and 0 for a negative instance.

Starting with a random or estimated weight setting, the model is trained to adapt itself to the characteristics of the training instances by changing weights (both output and input weights) for every channel in the model. Typically, instances are presented to the model one at a time. When all instances have been examined (called an epoch), the network starts over with the first instance and repeats. Iterations continue until the system performance has reached a satisfactory level.

The learning rule of the CFRule model is derived in the same way as the backpropagation algorithm [19]. The training objective is to minimize the sum of squared errors in the data. In each learning cycle, a training instance is given and the weights of channel j (for all j) are updated by

u_j(t+1) = u_j(t) + Δu_j    (11)

w_ji(t+1) = w_ji(t) + Δw_ji    (12)

where u_j is the output weight, w_ji an input weight, the argument t denotes iteration t, and Δ the adjustment. The weight adjustment on the current instance is based on gradient descent. Consider channel j. For the output weight (u_j),

Δu_j = −η (∂E/∂u_j)    (13)

(η: the learning rate) where E is the squared error on the current instance,

E = (1/2) D²,    D = T − M_out

with T the target output. Expanding ∂E/∂u_j by the chain rule, ∂E/∂u_j = (∂E/∂M_out)(∂M_out/∂u_j) = −D (∂M_out/∂u_j), which leads to the following definition.

Definition 4.1 The learning rule for output weight u_j of channel j is given by

Δu_j = η D (∂M_out/∂u_j)    (14)

For an input weight w_ji, gradient descent likewise gives Δw_ji = −η (∂E/∂w_ji), where by the chain rule

∂E/∂w_ji = (∂E/∂φ_j)(∂φ_j/∂w_ji)    (15)

Since φ_j is not directly related to E, the first partial derivative on the right hand side of the above equation is expanded by the chain rule again to obtain

∂E/∂φ_j = (∂E/∂M_out)(∂M_out/∂φ_j) = −D (∂M_out/∂φ_j)

Substituting these results into Eq. (15) leads to the following definition.

Definition 4.2 The learning rule for input weight w_ji of channel j is given by

Δw_ji = η d_j (∂φ_j/∂w_ji)    (16)

where

d_j = D (∂M_out/∂φ_j)

Assume that

φ_j = f+_cf(w_j1 x_1, w_j2 x_2, ..., w_jd′ x_d′) + f−_cf(w_j(d′+1) x_(d′+1), ..., w_jd x_d)    (17)

that is, the first d′ product terms w_jl x_l are nonnegative and the remaining ones are negative. Suppose d′ > 1 and d − d′ > 1. The partial derivative ∂φ_j/∂w_ji falls into two cases.

Case (a): If w_ji x_i ≥ 0,

∂φ_j/∂w_ji = ( ∏_{l≠i, l≤d′} (1 − w_jl x_l) ) x_i    (18)

Case (b): If w_ji x_i < 0,

∂φ_j/∂w_ji = ( ∏_{l≠i, d′<l≤d} (1 + w_jl x_l) ) x_i    (19)
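For illustration, the two cases translate into code as follows (continuing the sketch from Section 2; the bias w_j0 is treated as one of the product terms, as Definition 2.3 indicates, and d_j is taken as given since it is computed the same way one level up):

```python
def dphi_dw(w, x, i):
    """d(phi_j)/d(w_i) per Eqs. (18)-(19): differentiate through the
    sign group (f+ or f-) that the term w_i * x_i belongs to."""
    terms = [w[0]] + [w[l + 1] * x[l] for l in range(len(x))]
    t_i = w[i + 1] * x[i]
    if t_i >= 0:   # Case (a): the term sits in the f+ group
        others = [t for l, t in enumerate(terms) if l != i + 1 and t >= 0]
        return np.prod([1.0 - t for t in others]) * x[i]
    else:          # Case (b): the term sits in the f- group
        others = [t for l, t in enumerate(terms) if l != i + 1 and t < 0]
        return np.prod([1.0 + t for t in others]) * x[i]

def update_input_weight(w, x, i, d_j, eta=0.1):
    """One gradient step on w_i per Definition 4.2."""
    return w[i + 1] + eta * d_j * dphi_dw(w, x, i)
```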

4.2 Multi-Channel Regression-Based Optimization

It is known that gradient descent can only find a local minimum. When the error surface is flat or very convoluted, such an algorithm often ends up with a bad local minimum. Moreover, the learning performance is measured by the error on unseen data independent of the training set; such error is referred to as generalization error. We note that minimization of the training error by the backpropagation algorithm does not guarantee simultaneous minimization of generalization error. What is worse, generalization error may instead rise after some point along the training curve due to an undesired phenomenon known as overfitting [7]. Thus, global optimization techniques for network training (e.g., [21]) do not necessarily offer help as far as generalization is concerned. To address this issue, CFRule uses a novel optimization strategy called multi-channel regression-based optimization (MCRO).

In Definition 2.4, f+_cf and f−_cf can also be expressed as

f+_cf(x_1, x_2, ...) = Σ_i x_i − Σ_{i<j} x_i x_j + Σ_{i<j<k} x_i x_j x_k − ...    (20)

f−_cf(y_1, y_2, ...) = Σ_i y_i + Σ_{i<j} y_i y_j + Σ_{i<j<k} y_i y_j y_k + ...    (21)

When the arguments (the x_i's and y_i's) are small, the CF function behaves somewhat like a linear function. It can be seen that if the magnitude of every argument is < 0.1, the first-order approximation of the CF function is within an error of 10% or so; for instance, f+_cf(0.1, 0.1) = 1 − 0.9² = 0.19, versus 0.2 for the linear term alone, an error of 5%. Since all the weights take on small values when learning starts, this analysis has motivated the MCRO strategy for improving the gradient descent solution. The basic idea behind MCRO is to choose a starting point based on linear regression analysis, in contrast to gradient descent, which uses a random starting point.

If we can use regression analysis to estimate the initial influence of each input variable on the model output, how can we know how to distribute this estimate over multiple channels? In fact, this is the most intricate part of the whole idea since each channel's structure and parameters are yet to be learned. The answer will soon be clear.

In CFRule, each channel's activation is defined by

φ_j = f_cf(w_j0, w_j1 x_1, w_j2 x_2, ...)    (22)

Suppose we separate the linear component from the nonlinear component (R) in φ_j to obtain

φ_j = (Σ_{i=0}^{d} w_ji x_i) + R_j    (23)

We apply the same treatment to the model output (Definition 2.6),

M_out = f_cf(u_1 φ_1, u_2 φ_2, ...)    (24)

so that

M_out = (Σ_{j=1}^{k} u_j φ_j) + R_out    (25)

Then we substitute Eq. (23) into Eq. (25) to obtain

M_out = (Σ_{j=1}^{k} Σ_{i=0}^{d} u_j w_ji x_i) + R_acc    (26)

in which the right hand side is equivalent to

[Σ_{i=0}^{d} (Σ_{j=1}^{k} u_j w_ji) x_i] + R_acc

Note that

R_acc = (Σ_{j=1}^{k} u_j R_j) + R_out

Suppose linear regression analysis produces the following estimation equation for the model output:

M′_out = b_0 + b_1 x_1 + ...

(all the input variables and the output transformed to the range from 0 to 1).
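Comparing M′_out with Eq. (26), the regression coefficient b_i estimates the aggregate first-order influence Σ_j u_j w_ji of input x_i on the output. The following sketch shows one way to turn the regression estimates into a starting point for training; the equal split across channels with small symmetry-breaking noise is our assumption, a minimal reading of MCRO rather than CFRule's exact scheme.

```python
def mcro_init(X, T, k, u0=0.5, seed=0):
    """Regression-based starting point (a sketch of the MCRO idea):
    fit M'_out = b0 + b1*x1 + ... by least squares, then choose initial
    weights so that sum_j u_j * w_ji matches b_i, the first-order term
    of Eq. (26). Assumes X and T are already scaled to [0, 1]."""
    rng = np.random.default_rng(seed)
    A = np.hstack([np.ones((X.shape[0], 1)), X])  # intercept column -> b0
    b, *_ = np.linalg.lstsq(A, T, rcond=None)
    # Equal split: w_ji = b_i / (k * u0), so sum_j u0 * w_ji = b_i exactly;
    # the small noise lets the channels differentiate during training.
    W = np.tile(b / (k * u0), (k, 1)) + 0.01 * rng.standard_normal((k, b.size))
    u = np.full(k, u0)                            # initial output weights
    return u, W                                   # W[j] = (w_j0, ..., w_jd)
```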

Table 1: The target rules in the simulation experiment.

rule 1: IF x1 and ¬x2 and x7 THEN the target concept
rule 2: IF x1 and ¬x4 and x5 THEN the target concept
rule 3: IF x6 and x11 THEN the target concept

Table 2: Comparison of the MCRO strategy with random start for the convergence to the target rules. The results were validated by the statistical t test (degrees of freedom = 48) with significance levels < 0.01 and < 0.025 for the training and test error rates upon convergence, respectively.

                      Random Start    t value
Training error rate   0.010           2.47
Test error rate       0.012           2.34

4.3 Asynchronous Parallel Convergence

In the multi-channel rule learning theory, there are two possible modes of parallel convergence. In the synchronous mode, all channels converge to their respective target patterns at the same time, whereas in the asynchronous mode, each channel converges at a different time. In a self-adaptation or self-organization model without a global clock, the synchronous mode is not a plausible scenario of convergence. On the other hand, the asynchronous mode may not arrive at global convergence (i.e., every channel converging to its target pattern) unless there is a mechanism to protect a target pattern once it is converged upon. Here we examine a formal property of CFRule on this new learning issue.

Theorem 4.1 Suppose at time t, channel j of the CFRule model has learned an exact pattern (w_j1, w_j2, ..., w_jd) (d ≥ 1) such that w_j0 (the bias) = 1 and w_ji = 1, −1, or 0 for 1 ≤ i ≤ d. At time t+1, when the model is trained on a given instance with the input vector (x_0, x_1, x_2, ..., x_d) (x_0 = 1 and x_i = 1 or −1 for all 1 ≤ i ≤ d), the pattern is unchanged unless there is a single mismatched weight (weight w_ji is mismatched if and only if w_ji x_i = −1). Let Δw_ji(t+1) be the weight adjustment for w_ji. Then

(a) If there is no mismatch, then Δw_ji(t+1) = 0 for all i.

(b) If there is more than one mismatched weight, then Δw_ji(t+1) = 0 for all i.

(Proof): In case (a), there is no mismatch, so w_ji x_i = 1 or 0 for all i. For any i, there exists l such that w_jl x_l = 1 and l ≠ i (for example, w_j0 x_0 = 1 as given). The corresponding factor (1 − w_jl x_l) vanishes, so from Eq. (18),

∂φ_j/∂w_ji = 0

In case (b), the proof for matched weights is the same as that in case (a). Consider only mismatched weights w_ji such that w_ji x_i = −1. Since there are at least two mismatched weights, there exists l such that w_jl x_l = −1 and l ≠ i. The corresponding factor (1 + w_jl x_l) vanishes, so from Eq. (19),

∂φ_j/∂w_ji = 0

In the case of a single mismatched weight,

∂φ_j/∂w_ji = x_i

which is not zero, so the weight adjustment Δw_ji(t+1) may or may not be zero, depending on the error d_j. □
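The theorem is easy to verify numerically with the derivative sketch from Section 4.1 (hypothetical values):

```python
# A channel that has learned the exact pattern (1, -1, 0), with bias w_0 = 1
w = [1.0, 1.0, -1.0, 0.0]
x_match    = [1, -1, 1]   # no mismatched weight
x_two_miss = [-1, 1, 1]   # two mismatched weights (w_1 and w_2)
x_one_miss = [1, 1, 1]    # a single mismatched weight (w_2)
for x in (x_match, x_two_miss, x_one_miss):
    print([round(dphi_dw(w, x, i), 3) for i in range(3)])
# -> all zeros for the first two inputs; a nonzero entry appears only
#    for the single-mismatch input, exactly as the theorem states.
```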

Since model training starts with small weight values, the initial pattern associated with each channel cannot be exact. When training ends, the channel's pattern may still be inexact because of possible noise, inconsistency, and uncertainty in the data. However, from the proof of the above theorem, we see that as the nonzero weights in the channel's pattern grow larger, the error derivative (d_j ∂φ_j/∂w_ji) shrinks toward zero, so a pattern close to a target rule becomes increasingly resistant to change.

Table 3: Asynchronous parallel convergence to the target rules in the CFRule model. Channels 1, 2, 3 converge to target rules 1, 2, 3, respectively. w_{j,i} denotes the input weight associated with the input x_i in channel j. An epoch consists of a presentation of all training instances. (The table tracks the weights at epochs 1, 5, 10, 15, 20, 25, and 30.)

5 Rule Extraction

Exact weight convergence does not necessarily happen in practical circumstances involving data noise, inconsistency, uncertainty, and inadequate sample sizes. However, in whatever circumstances, it turns out that a simple thresholding mechanism suffices to distinguish important from unimportant weights in the CFRule model. Since the weight absolute values range from 0 to 1, it is reasonable to use 0.5 as the threshold, but this value does not always guarantee optimal performance. Searching for a good threshold in a continuous range is difficult. Fortunately, thanks to the quantizability nature of a system adopting the CF model [9], only a handful of values need to be considered. Our research has narrowed it down to four candidate values: 0.35, 0.5, 0.65, and 0.8. A larger threshold makes extracted rules more general, whereas a smaller threshold makes them more specific. In order to lessen data overfitting, our heuristic is to choose a higher value as long as the training error is acceptable. Using an independent cross-validation data set is a good idea if enough data is available. The rule extraction algorithm is formulated below.
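As an illustration of the threshold-based mechanism, the following sketch extracts one rule per channel by keeping the input weights whose magnitude exceeds the threshold; it is a simplified reading, and the full formulation may include additional steps such as channel pruning.

```python
def extract_rules(u, W, threshold=0.5):
    """Threshold-based rule extraction (sketch): per channel, weights
    above +threshold become positive antecedents, those below -threshold
    become negated ones; the output weight serves as the rule CF."""
    rules = []
    for u_j, w_j in zip(u, W):
        pos = [i + 1 for i, w in enumerate(w_j[1:]) if w > threshold]
        neg = [i + 1 for i, w in enumerate(w_j[1:]) if w < -threshold]
        if pos or neg:                 # the IF part must not be empty
            rules.append({"pos": pos, "neg": neg, "cf": round(u_j, 3)})
    return rules
```

Note that the cost of this procedure is linear in the total number of weights, which is the complexity advantage discussed next.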

The threshold-based algorithm described here is fundamentally different from the search-based algorithms in neural network rule extraction [7, 9, 20]. The main advantage of the threshold-based approach is its linear computational complexity in the total number of weights, in contrast to the polynomial or even exponential complexity incurred by the search-based approach. Furthermore, the former approach obviates the need for a special training, pruning,

Table 4: The promoter (of prokaryotes) consensus sequences.

Region      Consensus
Minus-35    @-36=T  @-35=T  @-34=G  @-33=A
Minus-10    @-9=A   @-8=T
or approximation procedure commonly used in the latter approach for complexity reduction. As a result, the threshold-based, direct approach should produce better and more reliable rules. Notice that this approach is not applicable to the ordinary sigmoid-function neural network where knowledge is entangled. The admissibility of the threshold-based algorithm for rule extraction in CFRule can be ascribed to the CF-combining function.

6 Applications

Two benchmark data sets were selected to demonstrate the value of CFRule on practical domains. The promoter data set is characterized by high dimensionality relative to the sample size, while the hepatitis data set has many missing values. Thus, both pose a challenging problem.

The decision-tree-based rule generator system C4.5 [18] was taken as a control since it (along with its later versions) is currently the most representative (or most often used) rule learning system, and the performance of C4.5 is optimized in a statistical sense.

6.1 Promoter Recognition in DNA

In the promoter data set [23], there are 106 instances, each consisting of a DNA nucleotide string of four base types: A (adenine), G (guanine), C (cytosine), and T (thymine). Each instance string is comprised of 57 sequential nucleotides, including fifty nucleotides before (minus) and six following (plus) the transcription site. An instance is a positive instance if the promoter region is present in the sequence; otherwise it is a negative instance. There are 53 positive instances and 53 negative instances. Each position of an instance sequence is encoded by four bits, with each bit designating a base type, so an instance is encoded by a vector of 228 bits along with a label indicating a positive or negative instance.
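For concreteness, one way to produce such an encoding in Python is sketched below; the base ordering and the bipolar 1/−1 convention of Section 4.1 are our assumptions, and the data set's actual bit layout may differ.

```python
BASES = "AGCT"  # assumed ordering: one bit per base type at each position

def encode_instance(seq):
    """Encode a 57-nucleotide string as a 57 x 4 = 228-entry vector,
    using 1 for the base present at a position and -1 elsewhere."""
    assert len(seq) == 57
    vec = []
    for ch in seq.upper():
        vec.extend(1 if b == ch else -1 for b in BASES)
    return vec

print(len(encode_instance("A" * 57)))  # 228
```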

In the literature of molecular biology, promoter (of prokaryotes) sequences have average constitutions of -TTGACA- and -TATAAT-, respectively, located at the so-called minus-35 and minus-10 regions [14], as shown in Table 4.

The CFRule model in this study had 3 channels, which was the minimal number of channels to bring the training error under 0.02 upon convergence. Still, the model is relatively small.

Table 5: The average two-fold cross-validation error rates of the rules learned by C4.5 and CFRule, respectively.

Domain                                 C4.5     CFRule
Promoters (without prior knowledge)    12.8%    7.1%

Table 6: The promoter prediction rules learned from 106 instances by CFRule and C4.5, respectively. ¬: not. @: at.

          Rule    DNA Sequence Pattern
CFRule    #1      @-34=G  @-33=¬G  @-12=¬G
          #2      @-36=T  @-35=T  @-31=¬C  @-12=¬G
          #3      @-45=A  @-36=T  @-35=T

Table 7: The hepatitis prognosis rule learned by both CFRule and C4.5.

Rule #1: MALE and NO STEROID and ALBUMIN < 3.7

6.2 Hepatitis Prognosis Prediction

In the data set concerning hepatitis prognosis, there are 155 instances, each described by 19 attributes. Continuous attributes were discretized, the data set was randomly partitioned into two halves (78 and 77 cases), and then cross-validation was carried out. CFRule and C4.5 used exactly the same data to ensure a fair comparison. The CFRule model for this problem consisted of 2 channels. Again, CFRule was superior to C4.5 based on the cross-validation performance (see Table 5). However, both systems learned the same single rule from the whole 155 instances, as displayed in Table 7. To learn the same rule by two fundamentally different systems is quite a coincidence, but it suggests the rule is true in a global sense.

7 Conclusions

If global optimization is a main issue for automated rule discovery from data, then current machine learning theories do not seem adequate. For instance, the decision-tree and neural-network based algorithms, which dodge the complexity of exhaustive search, guarantee only local but not global optimization. In this paper, we introduce a new machine learning theory

based on multi-channel parallel adaptation that shows great promise in learning the target rules from data by parallel global convergence. The basic idea is that when a model consisting of multiple parallel channels is optimized according to a certain global error criterion, each of its channels converges to a target rule. While the theory sounds attractive, the main question is how to implement it. In this paper, we show how to realize this theory in a learning system named CFRule.

CFRule is a parallel weight-based model, which can be optimized by weight adaptation. The parameter adaptation rule follows the gradient-descent idea, generalized in a multi-level parallel context. However, the central idea of the multi-channel rule-learning theory is not about how the parameters are adapted but rather how each channel can converge to a target rule. We have noticed that CFRule exhibits the necessary conditions to ensure such convergence behavior. We have further found that CFRule's behavior can be attributed to the use of the CF (certainty factor) model for combining the inputs and the channels.

Since the gradient descent technique seeks only a local minimum, the learning model may well settle in a solution where each rule is optimal only in a local sense. A strategy called multi-channel regression-based optimization (MCRO) has been developed to address this issue. This strategy has proven effective by statistical validation.

We have formally proven two important properties that account for the parallel rule-learning behavior of CFRule. First, we show that any given rule set can be explicitly and precisely encoded by the CFRule model. Secondly, we show that once a channel is settled in a target rule, it barely moves. These two conditions encourage the model to move toward the target rules. An empirical weight convergence graph clearly showed how each channel converged to a target rule in an asynchronous manner. Notice, however, that we have not been able to prove or demonstrate this rule-oriented convergence behavior in other neural networks.

We have then examined the application of this methodology to DNA promoter recognition and hepatitis prognosis prediction. In both domains, CFRule is superior to C4.5 (a rule-learning method based on the decision tree) based on cross-validation. The rules learned are also consistent with knowledge in the literature.

In conclusion, the multi-channel parallel adaptive rule-learning theory is not just theoretically sound and supported by computer simulation but also practically useful. In light of its significance, this theory would hopefully point out a new direction for machine learning and data mining.

Acknowledgments

This work is supported by the National Science Foundation under grant ECS-9615909. The author is deeply indebted to Edward Shortliffe, who contributed his expertise and time in the discussion of this paper.

References

1. J.B. Adams, "Probabilistic reasoning and certainty factors", in Rule-Based Expert Systems, Addison-Wesley, Reading, MA, 1984.

2. J.A. Alexander and M.C. Mozer, "Template-based algorithms for connectionist rule extraction", in Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA, 1995.

3. B.G. Buchanan and T.M. Mitchell, "Model-directed learning of production rules", in Pattern-Directed Inference Systems, Academic Press, New York, 1978.

4. B.G. Buchanan and E.H. Shortliffe (eds.), Rule-Based Expert Systems, Addison-Wesley, Reading, MA, 1984.

5. P. Clark and R. Niblett, "The CN2 induction algorithm", Machine Learning, 3, pp. 261-284, 1989.

6. L.M. Fu, "Knowledge-based connectionism for revising domain theories", IEEE Transactions on Systems, Man, and Cybernetics, 23(1), pp. 173-182, 1993.

7. L.M. Fu, Neural Networks in Computer Intelligence, McGraw-Hill, Inc., New York, NY, 1994.

8. L.M. Fu, "Learning in certainty factor based multilayer neural networks for classification", IEEE Transactions on Neural Networks, 9(1), pp. 151-158, 1998.

9. L.M. Fu and E.H. Shortliffe, "The application of certainty factors to neural computing for rule discovery", IEEE Transactions on Neural Networks, 11(3), pp. 647-657, 2000.

10. C.Z. Janikow, "A knowledge-intensive genetic algorithm for supervised learning", Machine Learning, 13, pp. 189-228, 1993.

11. G.B. Koudelka, S.C. Harrison, and M. Ptashne, "Effect of non-contacted bases on the affinity of 434 operator for 434 repressor and Cro", Nature, 326, pp. 886-888, 1987.

12. R.C. Lacher, S.I. Hruska, and D.C. Kuncicky, "Back-propagation learning in expert networks", IEEE Transactions on Neural Networks, 3(1), pp. 62-72, 1992.
