
Network Working Group                                        S. Yasukawa
Request for Comments: 5439                                           NTT
Category: Informational                                        A. Farrel
                                                      Old Dog Consulting
                                                             O. Komolafe
                                                           Cisco Systems
                                                           February 2009

        An Analysis of Scaling Issues in MPLS-TE Core Networks

Status of This Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

Traffic engineered Multiprotocol Label Switching (MPLS-TE) is deployed in providers' core networks. As providers plan to grow these networks, they need to understand whether existing protocols and implementations can support the network sizes that they are planning.

This document presents an analysis of some of the scaling concerns for the number of Label Switching Paths (LSPs) in MPLS-TE core networks, and examines the value of two techniques (LSP hierarchies and multipoint-to-point LSPs) for improving scaling. The intention is to motivate the development of appropriate deployment techniques and protocol extensions to enable the application of MPLS-TE in large networks.

This document only considers the question of achieving scalability for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint MPLS-TE LSPs are for future study.

Yasukawa, et al. Informational [Page 1]

Table of Contents

1. Introduction
   1.1. Overview
   1.2. Glossary of Notation
2. Issues of Concern for Scaling
   2.1. LSP State
   2.2. Processing Overhead
   2.3. RSVP-TE Implications
   2.4. Management
3. Network Topologies
   3.1. The Snowflake Network Topology
   3.2. The Ladder Network Topology
   3.3. Commercial Drivers for Selected Configurations
   3.4. Other Network Topologies
4. Required Network Sizes
   4.1. Practical Numbers
5. Scaling in Flat Networks
   5.1. Snowflake Networks
   5.2. Ladder Networks
6. Scaling Snowflake Networks with Forwarding Adjacencies
   6.1. Two-Layer Hierarchy
        6.1.1. Tuning the Network Topology to Suit the Two-Layer Hierarchy
   6.2. Alternative Two-Layer Hierarchy
   6.3. Three-Layer Hierarchy
   6.4. Issues with Hierarchical LSPs
7. Scaling Ladder Networks with Forwarding Adjacencies
   7.1. Two-Layer Hierarchy
   7.2. Three-Layer Hierarchy
   7.3. Issues with Hierarchical LSPs
8. Scaling Improvements through Multipoint-to-Point LSPs
   8.1. Overview of MP2P LSPs
   8.2. LSP State: A Better Measure of Scalability
   8.3. Scaling Improvements for Snowflake Networks
        8.3.1. Comparison with Other Scenarios
   8.4. Scaling Improvements for Ladder Networks
        8.4.1. Comparison with Other Scenarios
        8.4.2. LSP State Compared with LSP Numbers
   8.5. Issues with MP2P LSPs
9. Combined Models
10. An Alternate Solution
    10.1. Pros and Cons of the Alternate Solution
11. Management Considerations
12. Security Considerations
13. Recommendations
14. Acknowledgements
15. Normative References
16. Informative References

1. Introduction

Network operators and service providers are examining scaling issues as they look to deploy ever-larger traffic engineered Multiprotocol Label Switching (MPLS-TE) networks. Concerns have been raised about the number of Label Switched Paths (LSPs) that need to be supported at the edge and at the core of the network. The impact on control plane and management plane resources threatens to outweigh the benefits and popularity of MPLS-TE, while the physical limitations of the routers may constrain the deployment options.

Historically, it has been assumed that all MPLS-TE scaling issues can be addressed using hierarchical LSPs [RFC4206]. However, analysis shows that the improvement gained by LSP hierarchies is not as significant in all topologies and at all points in the network as might have been presumed. Further, additional management issues are introduced to determine the end-points of the hierarchical LSPs and to operate them. Although this does not invalidate the benefits of LSP hierarchies, it does indicate that additional techniques may be desirable in order to fully scale MPLS-TE networks.

This document examines the scaling properties of two generic MPLS-TE network topologies and investigates the benefits of two scaling techniques.

1.1. Overview

Physical topology scaling concerns are addressed by building networks that are not fully meshed. Network topologies tend to be meshed in

the core but tree-shaped at the edges, giving rise to a snowflake

design. Alternatively, the core may be more of a ladder shape with

tree-shaped edges.

MPLS-TE, however, establishes a logical full mesh between all edge

points in the network, and this is where the scaling problems arise

since the structure of the network tends to focus a large number of

LSPs within the core of the network.

This document presents two generic network topologies (the snowflake and the ladder) and attempts to parameterize the networks by making

some generalities. It introduces terminology for the different

scaling parameters and examines how many LSPs might be required to be carried within the core of a network.


Two techniques (hierarchical LSPs and multipoint-to-point LSPs) are

introduced and an examination is made of the scaling benefits that

they offer as well as of some of the concerns with using these

techniques.

Of necessity, this document makes many generalizations. Not least

among these is a set of assumptions about the symmetry and

connectivity of the physical network. It is hoped that these

generalizations will not impinge on the usefulness of the overview of the scaling properties that this document attempts to give. Indeed, the symmetry of the example topologies tends to highlight the scaling issues of the different solution models, and this may be useful in

exposing the worst case scenarios.

Although protection mechanisms like Fast Reroute (FRR) [RFC4090] are briefly discussed, the main body of this document considers stable

network cases. It should be noted that make-before-break

re-optimisation after link failure may result in a significant number of ’duplicate’ LSPs. This issue is not addressed in this document.

It should also be understood that certain deployment models where

separate traffic engineered LSPs are used to provide different

services (such as layer 3 Virtual Private Networks (VPNs) [RFC4110]

or pseudowires [RFC3985]) or different classes of service [RFC3270]

may result in ’duplicate’ or ’parallel’ LSPs running between any pair of provider edge nodes (PEs). This scaling factor is also not

considered in this document, but may be easily applied as a linear

factor by the reader.

The operation of security mechanisms in MPLS-TE networks [MPLS-SEC] may have an impact on the ability of the network to scale. For example, they may increase both the size and number of control plane messages. Additionally, they may increase the processing overhead as control plane messages are subject to processing algorithms (such as encryption), and security keys need to be managed. Deployers will need to consider the trade-offs between scaling objectives and security objectives in their networks, and should resist the temptation to respond to a degradation of scaling performance by turning off security techniques that have previously been deemed necessary. The effects of security measures on scalability are not considered further in this document.

This document is designed to help service providers discover whether existing protocols and implementations can support the network sizes that they are planning. To do this, it presents an analysis of some of the scaling concerns for MPLS-TE core networks and examines the value of two techniques for improving scaling. This should motivate the development of appropriate deployment techniques and protocol extensions to enable the application of MPLS-TE in large networks.

This document only considers the question of achieving scalability

for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint MPLS-TE LSPs are for future study.

1.2. Glossary of Notation

This document applies consistent notation to define various parameters of the networks that are analyzed. These terms are defined as they are introduced throughout the document, but are grouped together here for quick reference. Refer to the full definitions in the text for detailed explanations.

n     A network level. n = 1 is the core of the network. See Section 3 for more details on the definition of a level.

P(n)  A node at level n in the network.

S(n)  The number of nodes at level n. That is, the number of P(n) nodes.

L(n)  The number of LSPs seen by a P(n) node.

X(n)  The number of LSP segment states held by a P(n) node.

M(n)  The number of P(n+1) nodes subtended to a P(n) node.

R     The number of rungs in a ladder network.

E     The number of edge nodes (PEs) subtended below (directly or indirectly) a spar-node in a ladder network.

K     The cost-effectiveness of the network expressed in terms of the ratio of the number of PEs to the number of network nodes.

2. Issues of Concern for Scaling

This section presents some of the issues associated with the support of LSPs at a Label Switching Router (LSR) or within the network.

These issues may mean that there is a limit to the number of LSPs

that can be supported.

2.1. LSP State

LSP state is the data (information) that must be stored at an LSR in order to maintain an LSP. Here, we refer to the information that is necessary to maintain forwarding plane state and the additional

information required when LSPs are established through control plane protocols. While the size of the LSP state is implementation-

dependent, it is clear that any implementation will require some data in order to maintain LSP state.


Thus, LSP state becomes a scaling concern: as the number of LSPs at an LSR increases, the amount of memory required to maintain the LSPs increases in direct proportion. Since the memory capacity of an LSR is limited, there is a related limit placed on the number of LSPs that can be supported.

Note that techniques to reduce the memory requirements (such as data compression) may serve to increase the number of LSPs that can be

supported, but this will only achieve a moderate multiplier and may

significantly decrease the ability to process the state rapidly.

In this document, we define X(n) as "the number of LSP segment states held by a P(n) node." This definition observes that an LSR at the

end of an LSP only has to maintain state in one direction (i.e., into the network), while a transit LSR must maintain state in both

directions (i.e., toward both ends of the LSP). Furthermore, in

multipoint-to-point (MP2P) LSPs (see Section 8), a transit LSR may

need to maintain LSP state for one downstream segment (toward the

destination) and multiple upstream segments (from multiple sources). That is, we define LSP segment state as the state necessary to

maintain an LSP in one direction to one adjacent node.

2.2. Processing Overhead

Depending largely on implementation issues, the number of LSPs

passing through an LSR may impact the processing speed for each LSP. For example, control block search times can increase with the number of control blocks to be searched, and even excellent implementations cannot completely mitigate this fact. Thus, since CPU power is

constrained in any LSR, there may be a practical limit to the number of LSPs that can be supported.

Further processing overhead considerations depend on issues specific to the control plane protocols, and are discussed in the next

section.

2.3. RSVP-TE Implications

Like many connection-oriented signaling protocols, RSVP-TE (Resource Reservation Protocol - Traffic Engineering) requires that state is

held within the network in order to maintain LSPs. The impact of

this is described in Section 2.1. Note that RSVP-TE requires that

separate information is maintained for upstream and downstream

relationships, but does not require any specific implementation of

that state.


RSVP-TE is a soft-state protocol, which means that protocol messages (refresh messages) must be regularly exchanged between signaling

neighbors in order to maintain the state for each LSP that runs

between the neighbors. A common period for the transmission (and

receipt) of refresh messages is 30 seconds, meaning that each LSR

must send and receive one message in each direction (upstream and

downstream) every 30 seconds for every LSP it supports. This has the potential to be a significant constraint on the scaling of the

network, but various improvements [RFC2961] mean that this refresh

processing can be significantly reduced, allowing an implementation

to be optimized to remove nearly all concerns about soft-state

scaling in a stable network.
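The arithmetic behind this soft-state load is easy to sketch. The following illustrative Python fragment (the function name and structure are not from this document; the 30-second period and the two-sent, two-received messages per LSP per period follow the description above) estimates the per-second refresh message load on an LSR:

```python
def refresh_messages_per_second(num_lsps, refresh_period=30.0):
    # A transit LSR sends one refresh upstream and one downstream,
    # and receives one from each direction, per LSP per period.
    sent = 2 * num_lsps / refresh_period
    received = 2 * num_lsps / refresh_period
    return sent + received

# Around the observed 50,000-LSP threshold, and without refresh
# reduction [RFC2961], this is roughly 6,667 messages per second.
print(round(refresh_messages_per_second(50000)))  # -> 6667
```

Refresh reduction and Srefresh bundling change the message count substantially, so this is an upper-bound sketch for an unoptimized implementation.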

Observations of existing implementations indicate that there may be a threshold of around 50,000 LSPs above which an LSR struggles to

achieve sufficient processing to maintain LSP state. Although

refresh reduction [RFC2961] may substantially improve this situation, it has also been observed that under these circumstances the size of the Srefresh may become very large, and the processing required may

still cause significant disruption to an LSR.

Another approach is to increase the refresh time. There is a

correlation between the percentage increase in refresh time and the

improvement in performance for the LSR. However, it should be noted that RSVP-TE’s soft-state nature depends on regular refresh messages; thus, a degree of functionality is lost by increasing the refresh

time. This loss may be partially mitigated by the use of the RSVP-TE Hello message, and can also be reduced by the use of various GMPLS

extensions [RFC3473], such as the use of [RFC2961] message

acknowledgements on all messages.

RSVP-TE also requires that signaling adjacencies be maintained

through the use of Hello message exchanges. Although [RFC3209]

suggests that Hello messages should be retransmitted every 5 ms, in

practice, values of around 3 seconds are more common. Nevertheless, the support of Hello messages can represent a scaling limitation on

an RSVP-TE implementation since one message must be sent and received to/from each signaling adjacency every time period. This can impose limits on the number of neighbors (physical or logical) that an LSR

supports, but does not impact the number of LSPs that the LSR can

handle.

2.4. Management

Another practical concern for the scalability of large MPLS-TE networks is the ability to manage the network. This may be constrained by the available tools, the practicality of managing large numbers of LSPs, and the management protocols in use.

Management tools are software implementations. Although such

implementations should not constrain the control plane protocols, it is realistic to appreciate that network deployments will be limited

by the scalability of the available tools. In practice, most

existing tools have a limit to the number of LSPs that they can

support. While a Network Management System (NMS) may be able to

support a large number of LSPs, the number that can be supported by

an Element Management System (EMS) (or the number supported by an NMS per-LSR) is more likely to be limited.

Similarly, practical constraints may be imposed by the operation of

management protocols. For example, an LSR may be swamped by

management protocol requests to read information about the LSPs that it supports, and this might impact its ability to sustain those LSPs in the control plane. OAM (Operations, Administration, and

Management), alarms, and notifications can further add to the burden placed on an LSR and limit the number of LSPs it can support.

All of these considerations encourage a reduction in the number of

LSPs supported within the network and at any particular LSR.

3. Network Topologies

In order to provide some generic analysis of the potential scaling

issues for MPLS-TE networks, this document explores two network

topology models. These topologies are selected partly because of

their symmetry, which makes them more tractable to a formulaic

approach, and partly because they represent generalizations of real

deployment models. Section 3.3 provides a discussion of the

commercial drivers for deployed topologies and gives more analysis of why it is reasonable to consider these two topologies.

The first topology is the snowflake model. In this type of network, only the very core of the network is meshed. The edges of the

network are formed as trees rooted in the core.

The second network topology considered is the ladder model. In this type of network, the core is shaped and meshed in the form of a ladder, and trees rooted at the edge of the ladder are attached to it.

The sections that follow examine these topologies in detail in order to parameterize them.


3.1. The Snowflake Network Topology

The snowflake topologies considered in this document are based on a

hierarchy of connectivity within the core network. PE nodes have

connectivity to P-nodes as shown in Figure 1. There is no direct

connectivity between the PEs. Dual homing of PEs to multiple P-nodes is not considered in this document, although it may be a valuable

addition to a network configuration.

P

/|\

/ | \

/ | \

/ | \

PE PE PE

Figure 1 : PE to P-Node Connectivity

The relationship between P-nodes is also structured in a hierarchical way. Thus, as shown in Figure 2, multiple P-nodes at one level are

connected to a P-node at a higher level. We number the levels such

that level 1 is the top level (top in our figure, and nearest to the core of the network) and level (n) is immediately above level (n+1); we denote a P-node at level n as a P(n).

As with PEs, there is no direct connectivity between P(n+1) nodes.

Again, dual homing of P(n+1) nodes to multiple P(n) nodes is not

considered in this document, although it may be a valuable addition

to a network configuration.

P(n)

/|\

/ | \

/ | \

/ | \

P(n+1) P(n+1) P(n+1)

Figure 2 : Relationship between P-Nodes

At the top level, P(1) nodes are connected in a full mesh. In

reality, the level 1 part of the network may be slightly less well-

connected than this, but assuming a full mesh provides for

generality. Thus, the snowflake topology comprises a clique with

topologically equivalent trees subtended from each node in the

clique.


The key multipliers for scalability are the number of P(1) nodes and the multiplier relationship between P(n) and P(n+1) at each level,

down to and including PEs.

We define the multiplier M(n) as the number of P(n+1) nodes at level (n+1) attached to any one P(n). Assume that M(n) is constant for all nodes at level n. Since nodes at the same level are not

interconnected (except at the top level), and since each P(n+1) node is connected to precisely one P(n) node, M(n) is one less than the

degree of the node at level n (that is, the P(n) node is attached to M(n) nodes at level (n+1) and to 1 node at level (n-1)).

We define S(n) as the number of nodes at level (n).

Thus:

S(n) = S(1)*M(1)*M(2)*...*M(n-1)

So the number of PEs can be expressed as:

S(PE) = S(1)*M(1)*M(2)*...*M(n)

where the network has (n) layers of P-nodes.

Thus, we may depict an example snowflake network as shown in Figure

3. In this case:

S(1) = 3

M(1) = 3

S(2) = S(1)*M(1) = 9

M(2) = 2

S(PE) = S(1)*M(1)*M(2) = 18
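The node-count arithmetic above can be checked mechanically. This short Python sketch (illustrative only; the helper name is not defined by this document) computes S(1), S(2), ..., S(PE) from S(1) and the per-level multipliers:

```python
def snowflake_sizes(s1, multipliers):
    # sizes[0] = S(1); each step multiplies by M(n), so the last
    # entry is S(PE), the number of PEs.
    sizes = [s1]
    for m in multipliers:
        sizes.append(sizes[-1] * m)
    return sizes

# The example of Figure 3: S(1)=3, M(1)=3, M(2)=2.
print(snowflake_sizes(3, [3, 2]))  # -> [3, 9, 18]
```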


PE PE PE PE PE PE

\ \/ \/ /

PE--P(2) P(2) P(2) P(2)--PE

\ | | /

\| |/

PE--P(2)---P(1)------P(1)---P(2)--PE

/ \ / \

PE \ / PE

\/

P(1)

/|\

/ | \

/ | \

PE--P(2) P(2) P(2)--PE

/ /\ \

PE PE PE PE

Figure 3 : An Example Snowflake Network

3.2. The Ladder Network Topology

The ladder networks considered in this section are based on an

arrangement of routers in the core network that resembles a ladder.

Ladder networks typically have long and thin cores that are arranged as conventional ladders. That is, they have one or more spars

connected by rungs. Each node on a spar may have:

- connection to one or more other spars,

- connection to a tree of other core nodes,

- connection to customer nodes.

Figure 4 shows a simplified example of a ladder network. A core of

twelve nodes makes up two spars connected by six rungs.


PE PE PE PE

PE PE PE | PE | PE PE PE | PE | PE

\| \|/ |/ | \| \|/

PE-P-----P-----P-----P------P-----P--PE

| | | | | |\

| | | | | | PE

| | | | | |

PE-P-----P-----P-----P------P-----P

/| /|\ |\ |\ |\ \

PE PE PE | PE | PE | PE | PE PE

PE PE PE PE

Figure 4 : A Simplified Ladder Network

In practice, not all nodes on a spar (call them spar-nodes) need to

have subtended PEs. That is, they can exist simply to give

connectivity along the spar to other spar-nodes, or across a rung to another spar. Similarly, the connectivity between spars can be more complex with multiple connections from one spar-node to another spar. Lastly, the network may be complicated by the inclusion of more than two spars (or simplified by reduction to a single spar).

These variables make the ladder network non-trivial to model. For

the sake of simplicity, we will make the following restrictions:

- There are precisely two spars in the core network.

- Every spar-node connects to precisely one spar-node on the other

spar. That is, each spar-node is attached to precisely one rung.

- Each spar-node connects to either one (end-spar) or two (core-spar) other spar-nodes on the same spar.

- Every spar-node has the same number of PEs subtended. This does

not mean that there are no P-nodes subtended to the spar-nodes, but does mean that the edge tree subtended to each spar-node is

identical.

From these restrictions, we are able to quantify a ladder network as follows:

R - The number of rungs. That is, the number of spar-nodes on each spar.

S(1) - The number of spar-nodes in the network. S(1)=2*R.

E - The number of edge nodes (PEs) subtended to each spar-node.

The number of rungs may vary considerably. A number less than 3 is

unlikely (since that would not be a significantly connected network), and a number greater than 100 seems improbable (because that would

represent a very long, thin network).

E can be treated as for the snowflake network. That is, we can consider a number of levels of attachment from P(1) nodes, which are the spar-nodes, through P(i) down to P(n), which are the PEs. Practically, we need only consider n=2 (PEs attached directly to the spar-nodes) and n=3 (one level of P-nodes between the PEs and the spar-nodes).

Let M(i) be the number of P(i+1) nodes attached to each P(i) node, i.e., the connectivity between levels of P-nodes as defined for the snowflake topology. Hence, the number of nodes at any level (n) is:

S(n) = S(1)*M(1)*M(2)*...*M(n-1)

So the number of PEs subtended to a spar-node is:

E = M(1)*M(2)*...*M(n-1)

And the number of PEs can be expressed as:

S(PE) = S(1)*M(1)*M(2)*...*M(n-1)

      = S(1)*E

Thus, we may depict an example ladder network as shown in Figure 5.

In this case:

R = 5

S(1) = 10

M(1) = 2

S(2) = S(1)*M(1) = 20

M(2) = 2

E = M(1)*M(2) = 4

S(PE) = S(1)*E = 40
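The ladder parameterization can be checked in the same way as the snowflake. The sketch below is illustrative (the helper name is not from this document) and follows the Section 3.2 restrictions of two spars and one rung per spar-node:

```python
def ladder_sizes(rungs, multipliers):
    # S(1) = 2*R spar-nodes; E is the product of the fan-out
    # multipliers down to the PEs; S(PE) = S(1)*E.
    s1 = 2 * rungs
    e = 1
    for m in multipliers:
        e *= m
    return s1, e, s1 * e

# The example of Figure 5: R=5, M(1)=2, M(2)=2.
print(ladder_sizes(5, [2, 2]))  # -> (10, 4, 40)
```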


PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE

\| \| \| \| |/ |/ |/ |/

P(2) P(2) P(2) P(2) P(2) P(2) P(2) P(2)

\ \ | \ / | / /

PE \ \ | \ / | / / PE

\ \ \| \/ |/ / /

PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE

| | | | |

| | | | |

| | | | |

PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE

/ / / | /\ |\ \ \

PE / / | / \ | \ \ PE

/ / | / \ | \ \

P(2) P(2) P(2) P(2) P(2) P(2) P(2) P(2)

/| /| /| /| |\ |\ |\ |\

PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE

Figure 5 : An Example Ladder Network

3.3. Commercial Drivers for Selected Configurations

It is reasonable to ask why these two particular network topologies

have been chosen.

The most important consideration is physical scalability. Each node (Label Switching Router - LSR) is only able to support a limited

number of physical interfaces. This necessarily reduces the ability to fully mesh a network and leads to the tree-like structure of the

network toward the PEs.

A realistic commercial consideration for an operator is the fact that the only revenue-generating nodes in the network are the PEs. Other nodes are needed only to support connectivity and scalability.

Therefore, there is a desire to maximize S(PE) while minimizing the

sum of S(n) for all values of (n). This could be achieved by

minimizing the number of levels and maximizing the connectivity at

each layer, M(n). Ultimately, however, this would produce a network of just interconnected PEs, which is clearly in conflict with the

physical scaling situation.

Therefore, the solution calls for a "few" levels with "relatively

large" connectivity at each level. We might say that the cost-

effectiveness of the network can be stated as:

K = S(PE)/(S(1) + S(2) + ... + S(n)), where n is the level above the PEs.

We should observe, however, that this equation may be naive in that

the cost of a network is not actually a function of the number of

routers (since a router chassis is often free or low cost), but is

really a function of the cost of the line cards, which is, itself, a product of the capacity of the line cards. Thus, the relatively high connectivity decreases the cost-effectiveness, while a topology that tends to channel data through a network core tends to demand higher

capacity (and so, more expensive) line cards.
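As an illustration of how K behaves, the following hedged sketch (the function name is illustrative, not part of this document) evaluates K for a three-level snowflake; with S(1)=5, M(1)=10, and M(2)=20 (the example used later in Section 5.1), 1000 PEs are supported by 55 P-nodes:

```python
def cost_effectiveness(s1, multipliers):
    # K = S(PE) / (S(1) + S(2) + ... + S(n)): the ratio of
    # revenue-generating PEs to supporting P-nodes.
    sizes = [s1]
    for m in multipliers:
        sizes.append(sizes[-1] * m)
    s_pe = sizes[-1]             # the last level is the PEs
    p_nodes = sum(sizes[:-1])    # all levels above the PEs
    return s_pe / p_nodes

# S(1)=5, M(1)=10, M(2)=20: K = 1000/55.
print(round(cost_effectiveness(5, [10, 20]), 1))  # -> 18.2
```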

A further consideration is the availability of connectivity (usually fibers) between LSR sites. Although it is always possible to lay new fiber, this may not be cost-effective or timely. The physical shape and topography of the country in which the network is laid is likely to be as much of a problem. If the country is ’long and thin’, then a ladder network is likely to be used.

This document examines the implications for control plane and data

plane scalability of this type of network when MPLS-TE LSPs are used to provide full connectivity between all PEs.

3.4. Other Network Topologies

As explained in Section 1, this document is using two symmetrical and generalized network topologies for simplicity of modelling. In

practice, there are two other topological considerations.

a. Multihoming

It is relatively common for a node at level (n) to be attached to more than one node at level (n-1). This is particularly common at PEs that may be connected to more than one P(n).

b. Meshing within a level

A level in the network will often include links between P-nodes at the same level, including the possibility of links between PEs.

This may result in a network that looks like a series of

concentric circles with spokes.

Both of these features are likely to have some impact on the scaling of the networks. However, for the purposes of establishing the

ground rules for scaling, this document restricts itself to the consideration of the symmetrical networks described in Sections 3.1 and 3.2. Discussion of other network formats is for future study.

4. Required Network Sizes

An important question for this evaluation and analysis is the size of the network that operators require. How many PEs are required? What ratio of P to PE is acceptable? How many ports do devices have for

physical connectivity? What type of MPLS-TE connectivity between PEs is required?

Although presentation of figures for desired network sizes must be

treated with caution because history shows that networks grow beyond all projections, it is useful to set some acceptable lower bounds.

That is, we can state that we are interested in networks of at least a certain size.

The most important features are:

- The network should have at least 1000 PEs.

- Each pair of PEs should be connected by at least one LSP in each

direction.

4.1. Practical Numbers

In practice, reasonable target numbers are as follows.

S(PE) >= 1000

Number of levels is 3. That is: 1, 2, and PE.

M(2) <= 20

M(1) <= 20

S(1) <= 100
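Taken together, these targets bound the network size that the model needs to cover. A quick illustrative calculation (the 40,000 upper bound is implied by, not stated in, the text):

```python
# Section 4.1 limits: three levels (1, 2, PE),
# S(1) <= 100, M(1) <= 20, M(2) <= 20.
s1_max, m1_max, m2_max = 100, 20, 20

# S(PE) = S(1)*M(1)*M(2), so the model tops out at:
s_pe_max = s1_max * m1_max * m2_max
print(s_pe_max)  # -> 40000
```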

5. Scaling in Flat Networks

Before proceeding to examine potential scaling improvements, we need to examine how well the flat networks described in the previous

sections scale.

Consider the requirement for a full mesh of LSPs linking all PEs. That is, each PE has an LSP to and from every other PE. Thus, if there are S(PE) PEs in the network, there are S(PE)*(S(PE) - 1) LSPs. Define L(n) as the number of LSPs handled by a level (n) LSR.

L(PE) = 2*(S(PE) - 1)


5.1. Snowflake Networks

There are a total of S(PE) PEs in the network and, since each PE

establishes an LSP with every other PE, it would be expected that

there are S(PE) - 1 LSPs incoming to each PE and the same number of

LSPs outgoing from the same PE, giving a total of 2(S(PE) - 1) on the incident link. Hence, in a snowflake topology (see Figure 3), since there are M(2) PEs attached to each P(2) node, it may tempting to

think that L(2) (the number of LSPs traversing each P(2) node) is

simply 2*(S(PE) - 1)*M(2). However, it should be noted that of the

S(PE) - 1 LSPs incoming to each PE, M(2) - 1 originated from nodes

attached to the same P(2) node, and so this value would count the

LSPs between the M(2) PEs attached to each P(2) node twice: once when outgoing from the M(2) - 1 other nodes and once when incoming into a particular PE.

There are a total of M(2)*(M(2) - 1) LSPs between these M(2) PEs and, since this value is erroneously included twice in 2*(S(PE) - 1)*M(2), the correct value is:

L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)

= M(2)*(2*S(PE) - M(2) - 1)

An alternative way of looking at this, that proves extensible for the calculation of L(1), is to observe that each PE subtended to a P(2)

node has an LSP in each direction to all S(PE) - M(2) PEs in the rest of the system, and there are M(2) such locally subtended PEs; thus,

2*M(2)*(S(PE) - M(2)). Additionally, there are M(2)*(M(2) - 1) LSPs between the locally subtended PEs. So:

L(2) = 2*M(2)*(S(PE) - M(2)) + M(2)*(M(2) - 1)

= M(2)*(2*S(PE) - M(2) - 1)

L(1) can be computed in the same way as this second evaluation of
L(2).  Each PE subtended below a P(1) node has an LSP in each
direction to all PEs not below the P(1) node.  There are M(1)*M(2)
PEs below the P(1) node, so this accounts for
2*M(1)*M(2)*(S(PE) - M(1)*M(2)) LSPs.  To this, we need to add the
number of LSPs that pass through the P(1) node and that run between
the PEs subtended below the P(1).  Consider each P(2): it has M(2)
PEs, each of which has an LSP going to all of the PEs subtended to
the other P(2) nodes subtended to the P(1).  There are M(1) - 1 such
other P(2) nodes, and so M(2)*(M(1) - 1) other such PEs.  So the
number of LSPs from the PEs below a P(2) node is
M(2)*M(2)*(M(1) - 1).  And there are M(1) P(2) nodes below the P(1),
giving rise to a total of M(2)*M(2)*M(1)*(M(1) - 1) LSPs.  Thus:

   L(1) = 2*M(1)*M(2)*(S(PE) - M(1)*M(2)) + M(2)*M(2)*M(1)*(M(1) - 1)

        = M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1))

So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see:

   S(PE) = 1000
   L(PE) = 1998
   L(2)  = 39580
   L(1)  = 356000

Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:

   S(PE) = 2000
   L(PE) = 3998
   L(2)  = 79580
   L(1)  = 756000

In both examples, the number of LSPs at the core (P(1)) nodes is
probably unacceptably large, even though there are only a relatively
modest number of PEs.  In fact, L(2) may even be too large in the
second example.
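The figures above can be reproduced with a short script.  The sketch
below is illustrative (the function name and structure are not from
the original text); it evaluates the closed-form expressions derived
for the snowflake topology, using S(PE) = S(1)*M(1)*M(2):

```python
def snowflake_lsp_counts(s1, m1, m2):
    """Evaluate the snowflake-topology LSP counts derived above.

    s1 -- number of P(1) core nodes, S(1)
    m1 -- P(2) nodes subtended to each P(1), M(1)
    m2 -- PEs subtended to each P(2), M(2)
    """
    s_pe = s1 * m1 * m2                          # S(PE): total PEs
    l_pe = 2 * (s_pe - 1)                        # L(PE)
    l2 = m2 * (2 * s_pe - m2 - 1)                # L(2)
    l1 = m1 * m2 * (2 * s_pe - m2 * (m1 + 1))    # L(1)
    return s_pe, l_pe, l2, l1

print(snowflake_lsp_counts(5, 10, 20))   # → (1000, 1998, 39580, 356000)
print(snowflake_lsp_counts(10, 10, 20))  # → (2000, 3998, 79580, 756000)
```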

5.2.  Ladder Networks

In ladder networks, L(PE) remains the same at 2*(S(PE) - 1).

L(2) can be computed using the same mechanism as for the snowflake
topology because the subtended tree is the same format.  Hence,

   L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)

But L(1) requires a different computation because each P(1) not only
sees LSPs for the subtended PEs, but is also a transit node for some
of the LSPs that cross the core (the core is not fully meshed).

Each P(1) sees:

o  all of the LSPs between locally attached PEs,

o  less those LSPs between locally attached PEs that can be served
   exclusively by the attached P(2) nodes,

o  all LSPs between locally attached PEs and remote PEs, and

o  LSPs in transit that pass through the P(1).

The first three numbers are easily determined and match what we have
seen from the snowflake network.  They are:

o  E*(E - 1)

o  M(1)*M(2)*(M(2) - 1) = E*(M(2) - 1)

o  2*E*E*(S(1) - 1)

The number of LSPs in transit is more complicated to compute.  It is
simplified by not considering the ends of the ladders but by
examining an arbitrary segment of the middle of the ladder, such as
shown in Figure 6.  We look to compute and generalize the number of
LSPs traversing each core link (labeled a and b in Figure 6) and so
determine the number of transit LSPs seen by each P(1).

      :     :        :     :        :     :
      :     :        :     :        :     :
     P(2)  P(2)     P(2)  P(2)     P(2)  P(2)
        \   |         \   /         |   /
         \  |          \ /          |  /
          \ |           \/          | /
   ......P(1)--------P(1)--------P(1)......
           |     a    |            |
           |          |b           |
           |          |            |
   ......P(1)--------P(1)--------P(1)......
          / |           /\          | \
         /  |          /  \         |  \
        /   |         /    \        |   \
     P(2)  P(2)     P(2)  P(2)     P(2)  P(2)
      :     :        :     :        :     :
      :     :        :     :        :     :

     Figure 6 : An Arbitrary Section of a Ladder Network

Of course, the number of LSPs carried on links a and b in Figure 6
depends on how LSPs are routed through the core network.  But if we
assume a symmetrical routing policy and an even distribution of LSPs
across all shortest paths, the result is the same.

Now we can see that each P(1) sees half of 2a+b LSPs (since each LSP
would otherwise be counted twice as it passed through the P(1)),
except that some of the LSPs are locally terminated and so are only
included once in the sum 2a+b.

So L(1) = a + b/2 -
          (locally terminated transit LSPs)/2 +
          (locally contained LSPs)


Thus:

   L(1) = a + b/2 -
          2*E*E*(S(1) - 1)/2 +
          E*(E - 1) - E*(M(2) - 1)

        = a + b/2 +
          E*E*(2 - S(1)) - E*M(2)

So all we have to do is work out a and b.

Recall that the ladder length R = S(1)/2, and define X = E*E.

Consider the contribution made by all of the LSPs that make n hops on
the ladder to the totals of each of a and b.  If the ladder was
unbounded, then we could say that in the case of a, there are n*2X
LSPs along the spar only, and n(n-1)*2X/n = 2X(n-1) LSPs use a rung
and the spar.  Thus, the LSPs that make n hops on the ladder
contribute (4n-2)X LSPs to a.  Note that the edge cases are special
because LSPs that make only one hop on the ladder cannot transit a
P(1) but only start or end there.

So with a ladder of length R = S(1)/2, we could say:

        R
   a = SUM[(4i-2)*X] + 2RX
       i=2

     = 2*X*R*(R+1)

And similarly, considering b in an unbounded ladder, the LSPs that
only travel one hop on the ladder are a special case, contributing 2X
LSPs, and every other LSP that traverses n hops on the ladder
contributes 2n*2X/n = 4X LSPs.  So:

            R+1
   b = 2X + SUM[4X]
            i=2

     = 2*X + 4*X*R
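Under this unbounded-ladder approximation, a and b can be combined
with the earlier expression for L(1).  The sketch below is
illustrative only (the function name is not from the original text,
it assumes S(1) is even so that R = S(1)/2 is whole, and it does not
apply the end-of-ladder correction):

```python
def ladder_l1_unbounded(s1, m1, m2):
    """Estimate L(1) for a ladder core from the unbounded-ladder
    values of a and b (end-of-ladder effects are NOT corrected for).
    """
    e = m1 * m2               # E: PEs subtended below each P(1)
    r = s1 // 2               # ladder length R = S(1)/2 (s1 even)
    x = e * e                 # X = E*E
    a = 2 * x * r * (r + 1)   # LSPs crossing a spar link
    b = 2 * x + 4 * x * r     # LSPs crossing a rung link
    # L(1) = a + b/2 + E*E*(2 - S(1)) - E*M(2)
    return a + b // 2 + x * (2 - s1) - e * m2

print(ladder_l1_unbounded(10, 10, 20))  # → 2516000
```

Because the approximation ignores the ends of the ladder, this value
overestimates L(1) somewhat, as the bounded-ladder discussion below
explains.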

In fact, the ladders are bounded, and so the number of LSPs is
reduced because of the effect of the ends of the ladders.  The links
that see the most LSPs are in the middle of the ladder.  Consider a
ladder of length R; a node in the middle of the ladder is R/2 hops
away from the end of the ladder.  So we see that the formula for the
contribution to the count of spar-only LSPs for a is only valid up to
n=R/2, and for spar-and-rung LSPs, up to n=1+R/2.  Above these
limits, the contribution made by spar-only LSPs decays as
(n-R/2)*2X.
