

Selectively Acquiring Ratings for Product

Recommendation

Zan Huang

Pennsylvania State University

419 Business Building

University Park, PA 16802

1-814-863-1940

zanhuang@psu.edu

ABSTRACT

Accurate prediction of customer preferences on products is the key for any recommender system to realize its promised strategic values, such as improved customer satisfaction and therefore enhanced loyalty. In this paper, we propose proactively acquiring ratings from customers for a newly introduced product to quickly improve the accuracy of the predicted ratings generated by a collaborative filtering recommendation algorithm for the entire customer population. We formally introduce the problem of identifying the most informative ratings to acquire and term it the product rating acquisition problem. We propose an active learning sampling method for this problem that is generic to any recommendation algorithm. Using the Netflix Prize dataset, we experimented with our proposed method, a uniform random sampling method, and a degree-based sampling method that is biased toward customers with large numbers of ratings, for the user-based and item-based neighborhood recommendation algorithms. The experimental results showed that even with the random sampling method, acquiring 10% of all ratings in addition to a randomly selected 10% initial ratings achieved a 4.5% improvement in the overall rating prediction accuracy for the movie. In addition, our proposed active learning sampling method consistently outperformed the random and degree-based sampling for the better-performing item-based algorithm and achieved more than 8% improvement by acquiring 10% of the ratings.

Categories and Subject Descriptors

H.1.2 [Information Systems]: User/Machine Systems - Human information processing; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Information filtering; Relevance feedback; Retrieval models.

General Terms

Algorithms, Design, Experimentation.

Keywords

Recommender system, collaborative filtering, active learning, sampling.

1. INTRODUCTION

Recommender systems are being widely adopted in many application settings to suggest products, services, and information items to potential customers. A wide range of companies, such as Amazon.com, Netflix.com, eBay, CDNOW, J.C. Penney, and Procter & Gamble, have successfully deployed recommendation technologies to increase Web and catalog sales and improve customer loyalty [33]. For some of these online businesses, the recommendation service plays a central role in the business strategy, with Amazon.com and Netflix.com as prominent examples.

Accurate prediction of customer preferences on products is the key to any recommender systems. There has been a substantial literature on recommendation algorithm design and evaluation. There is also ever increasing industry interest and efforts in improving the performance of real-world recommender systems, highlighted by the Netflix Prize launched in October 2006.

The key input for a recommendation algorithm is the previously observed customer-product interactions, either implicit (such as the purchase of a product by an Amazon customer) or explicit (such as the rating of a movie by a Netflix customer). In this paper we limit our focus to recommendations derived from rating-based explicit interactions. Previous efforts on improving recommendation performance have primarily focused on designing a better recommendation algorithm given a set of customer-product interactions and characteristics of the customers and products. As is true for any data mining task, a set of competitive algorithms may deliver slightly varying levels of performance within a range largely dictated by the nature of the input data.

In the context of an operational recommender system, it is possible to influence the nature of the available customer-product interactions by proactively acquiring customers' ratings. Carefully acquired ratings have the potential to significantly improve the accuracy of the predicted ratings for the population of customers, especially at the stage when a new product has just entered the system.


Consider the example of Netflix's recommendation service. When a new movie becomes available on DVD but few customers have provided ratings for it, the accuracy of the predicted ratings on this movie is expected to be low. Netflix has an incentive to quickly improve the prediction accuracy to avoid customer dissatisfaction. In this situation, sending the movie free of charge to a set of selected customers to obtain their ratings could bring a significant return in customer satisfaction that outweighs the associated cost.

In this paper, we formalize the problem of selectively acquiring customer ratings to improve product recommendation quality: for a given number of ratings to acquire, identify which customers should be reached for their ratings in order to achieve the largest improvement in overall rating prediction accuracy for a particular product. We term this the product rating acquisition problem.

There is a substantial literature on selective sampling for building classifiers in a cost-sensitive context. Many algorithms have been proposed under the active learning scheme, in which labeled data are acquired incrementally, using the model learned from the data available so far to select particularly helpful additional training examples. In this paper, we propose an active learning sampling algorithm for the product rating acquisition problem and compare it with two benchmark sampling methods: a random sampling method and a degree-based sampling method that selects customers with the greatest numbers of ratings. We use data from the Netflix Prize to evaluate the proposed active learning method and assess the potential gain from proactive acquisition of customer ratings.

The remainder of the paper is organized as follows. Section 2 reviews relevant literature on recommendation algorithms, data acquisition and active learning research, and the limited previous studies on user-oriented active learning for recommendation. In Section 3 we formalize the product rating acquisition problem and describe the proposed active learning method and the random and degree-based sampling methods. Section 4 provides the details of the experimental study using the Netflix dataset. Section 5 concludes the paper by summarizing the contributions and pointing out the future directions of this research.

2. RELATED WORK

In this section, we first review the literature on recommendation algorithm research and describe the two commonly used algorithms employed in this study, the user-based and item-based neighborhood algorithms. We then summarize the literature on data acquisition and active learning research. Lastly, we review the few previous studies that apply the idea of active learning to recommendation.

2.1 Recommendation Algorithms

At the heart of recommender systems are the algorithms for making recommendations. Models based on item or user attributes attempt to explain user-item interactions using these intrinsic attributes (e.g., [23, 25]). Intuitively, these models learn, either explicitly or implicitly, rules such as "Joe likes science fiction books" and "well-educated users like Harry Potter." Techniques such as regression and classification algorithms have been used to derive such models. The performance of these approaches, however, relies heavily on high-quality user and item attributes that are often difficult or expensive to obtain. Collaborative filtering-based recommendation takes a different approach by utilizing only the customer-product interaction data and ignoring the customer and product attributes [26]. Based solely on the interaction data, customers and products are characterized implicitly by their previous interactions. Collaborative filtering has been reported to be the most widely adopted and successful recommendation approach. The literature on recommendation algorithm design has largely focused on collaborative filtering algorithms. A wide range of collaborative filtering algorithms have been proposed, including standard user-based and item-based neighborhood algorithms [26, 32], rule-based approaches [1, 21], cluster and generative models [12, 35], advanced matrix analysis approaches [11, 31], and graph-based algorithms [14, 17].

In this paper, we focus on collaborative filtering recommendation based on the rating data. We focus on the standard user-based and item-based neighborhood algorithms in this paper. The user-based algorithm was the first collaborative filtering algorithm proposed in the literature and often serves as a comparison benchmark for later proposed algorithms. A number of evaluation studies (e.g., [6]) have shown that this algorithm can achieve competitive performance with many other algorithms. The item-based algorithm is adopted by real-world systems such as Amazon’s [22] because of its relative computational efficiency when customers substantially outnumber the products in the system. It has also been shown to outperform the user-based algorithm for many datasets [9, 32].

The input for collaborative filtering algorithms is a set of ratings R = {r_{c,p}} given by a set of customers C = {c_1, c_2, ..., c_M} on a set of products P = {p_1, p_2, ..., p_N}.

The user-based neighborhood algorithm constructs a customer similarity matrix based on the customers’ co-rated products. For a target product, the predicted rating for a customer is derived from aggregating the observed ratings given by the customers similar to this customer. The item-based algorithm operates in a similar manner but constructs the product similarity matrix instead. The similarity for a product pair is based on the ratings from customers who rated both products. The predicted rating for a customer is derived from aggregating her ratings on products that are similar to the target product.

There are a variety of design choices on customer/product similarity matrix construction and prediction generation. We have adopted the designs in [6, 32], which have been demonstrated to have competitive performance with other designs.

For the user-based algorithm, the similarity between customers c_i and c_j is given by the Pearson-r correlation

w(c_i, c_j) = \frac{\sum_{p \in P_{i,j}} (r_{c_i,p} - \bar{r}_{c_i})(r_{c_j,p} - \bar{r}_{c_j})}{\sqrt{\sum_{p \in P_{i,j}} (r_{c_i,p} - \bar{r}_{c_i})^2} \sqrt{\sum_{p \in P_{i,j}} (r_{c_j,p} - \bar{r}_{c_j})^2}}    (1)

where P_{i,j} denotes the set of products both customers c_i and c_j have rated and \bar{r}_c denotes customer c's overall average rating. Similarly, for the item-based algorithm, the similarity between products p_i and p_j is given by

w(p_i, p_j) = \frac{\sum_{c \in C_{i,j}} (r_{c,p_i} - \bar{r}_c)(r_{c,p_j} - \bar{r}_c)}{\sqrt{\sum_{c \in C_{i,j}} (r_{c,p_i} - \bar{r}_c)^2} \sqrt{\sum_{c \in C_{i,j}} (r_{c,p_j} - \bar{r}_c)^2}}    (2)

where C_{i,j} denotes the set of customers who have rated both products p_i and p_j. Note that we do not follow the Pearson correlation formula exactly but adopt the formulation used in [32]. This adjusted version captures differences in rating scale between different customers and was reported to deliver superior performance over the Pearson correlation formulation.

We adopted the weighted sum approach for rating prediction generation. For the user-based algorithm, the predicted rating on product p for customer c is given by

r'_{c,p} = \bar{r}_c + \frac{\sum_{c' \in C_p} w(c, c')(r_{c',p} - \bar{r}_{c'})}{\sum_{c' \in C_p} |w(c, c')|}    (3)

where C_p denotes the set of customers who have rated product p. Similarly, the item-based algorithm generates the rating prediction by

r'_{c,p} = \bar{r}_c + \frac{\sum_{p' \in P_c} w(p, p')(r_{c,p'} - \bar{r}_c)}{\sum_{p' \in P_c} |w(p, p')|}    (4)

where P_c denotes the set of products rated by customer c.
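As an illustration, the user-based Eqs. (1) and (3) can be sketched in a few lines of Python (our own sketch, not the authors' code, assuming ratings are stored in a dict keyed by (customer, product)); the item-based Eqs. (2) and (4) follow by swapping the roles of customers and products.

```python
import math

def mean_rating(R, c):
    """Customer c's overall average rating; R maps (customer, product) -> rating."""
    vals = [r for (cc, _), r in R.items() if cc == c]
    return sum(vals) / len(vals)

def user_sim(R, ci, cj):
    """Eq. (1): Pearson-r correlation over the products co-rated by ci and cj."""
    co_rated = [p for (c, p) in R if c == ci and (cj, p) in R]
    if not co_rated:
        return 0.0
    mi, mj = mean_rating(R, ci), mean_rating(R, cj)
    num = sum((R[ci, p] - mi) * (R[cj, p] - mj) for p in co_rated)
    den = math.sqrt(sum((R[ci, p] - mi) ** 2 for p in co_rated)
                    * sum((R[cj, p] - mj) ** 2 for p in co_rated))
    return num / den if den else 0.0

def predict_user_based(R, c, p):
    """Eq. (3): mean-offset weighted sum over the customers c' who rated p."""
    raters = [cc for (cc, pp) in R if pp == p and cc != c]
    num = sum(user_sim(R, c, cc) * (R[cc, p] - mean_rating(R, cc)) for cc in raters)
    den = sum(abs(user_sim(R, c, cc)) for cc in raters)
    return mean_rating(R, c) + (num / den if den else 0.0)
```

For production-scale data, the pairwise similarities would of course be precomputed and cached rather than recomputed per prediction as in this sketch.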

The performances of the large set of recommendation algorithms proposed in the literature have been compared using real-world recommendation data in many algorithm and evaluation studies. Many of these studies (e.g., [6, 16]) confirmed the finding that across different recommendation datasets no single algorithm outperforms all the others, and that the general level of recommendation quality is related to certain inherent characteristics of the customer-product interaction data other than the number of ratings available (e.g., [15]). It is often found that the performance differences among a set of competitive algorithms are smaller than the performance differences across datasets. These findings indicate that shaping a proper structure of the available rating data is likely to significantly improve the overall rating prediction accuracy for the same number of ratings.

2.2 The Data Acquisition Problem and Active Learning

There is a substantial literature on information acquisition for predictive modeling, which targets building the most accurate model with a specified number of labeled training instances selected from a pool of examples. This idea of selecting the most 'useful' subset of examples to build a model comparable to one derived from the entire pool, as backed by the theoretical results in [2], is intriguing for domains where training examples are costly to acquire.

Information acquisition has been studied in a variety of settings, including the multi-armed bandit problem originally proposed by Robbins [27], optimal experiment design (OED) [3, 19] in statistics, and active learning [7] in machine learning research. The general objective is to select the most informative data to improve the predictive power of the model, where the predictive models range from simple univariate distribution estimates of a single random variable, to parametric statistical functions relating the distribution of the dependent variable to independent variables, to non-parametric machine learning models such as categorical classifiers. The major difference between OED and active learning is the parametric or non-parametric nature of the model under study [30]. OED studies optimal data acquisition for parametric statistical models, for which closed-form objective functions relating to certain notions of the utility of the acquired data can be specified [37]. In this context, the data acquisition problem is eventually formulated as an optimization problem. Active learning, on the other hand, typically deals with non-parametric machine learning models for which the closed-form utility function of a candidate data observation cannot be derived. Active learning acquires information sequentially based on the model learned so far. Most active learning research has focused on categorical classification. A major class of active learning methods relies on certain measures of the uncertainty of the currently held model with respect to individual data instances to guide acquisition. The general notion is that the data instances with the greatest model uncertainty should be acquired so that the model can be improved the most. A wide variety of uncertainty measures have been proposed in the literature, such as the voting-based measure in the Query by Committee algorithm [10], the probability of binary class membership [20], and the local class-probability-estimation error [29]. A second class of active learning methods directly optimizes the expected error on future test examples (e.g., [8, 28]). Both classes of methods rely on analytically or numerically derived probabilistic estimates (of the model uncertainty or the prediction error).

2.3 Active Learning for Recommendation

There have been only a few previous studies that explore the application of the active learning idea in the recommendation context. These studies follow the tradition in active learning research for categorical classification of formulating uncertainty measures of the current collaborative filtering model with respect to individual ratings to guide acquisition. The active learning methods proposed in these studies are designed specifically for particular types of probabilistic collaborative filtering models. Boutilier et al. [5] proposed acquiring ratings based on the expected value of information for the multiple-cause vector quantification model. Yu et al. [36] used a measure of the entropy of likemindedness for rating acquisition for a probabilistic memory-based collaborative filtering model, in which a rating on a product by a customer is related probabilistically to a set of hidden customer profile prototypes. Jin and Si [18] proposed using the entropy of the model parameters to guide rating acquisition, which is in principle applicable to the probabilistic latent class model [13], the personality diagnosis model [24], and mixture models [34].

To the best of our knowledge, no previous study has investigated a generic active learning method that is applicable to any recommendation algorithm, especially the heuristic memory-based algorithms such as the popular user-based and item-based neighborhood algorithms. This previous work is also limited to the "new user problem," trying to find informative additional products for a new customer to rate in order to quickly improve the recommendation quality for that particular customer. In this study, we focus on the "new product problem," trying to quickly improve the accuracy of the predicted ratings of all customers for a newly introduced product. In operational commercial recommender systems, such as Netflix, such a rating acquisition strategy would have a much wider impact on the customer base and the overall customer satisfaction derived from the recommendation service.

3. THE PRODUCT RATING ACQUISITION PROBLEM

3.1 Problem Setup

We formalize the generic product rating acquisition problem as follows. For a new product q, a small set of ratings may become available, provided naturally by the early raters C_0. We denote this set of ratings as the initial rating set R_0 = {r_{c,q}}, where c ∈ C_0. A set of target customers C = {c_1, c_2, ..., c_M} is identified for whom the rating prediction is needed. Denote the actual (unobserved) ratings of q by the target customers as R_q = {r_{c,q}}, where c ∈ C. In practice, the target customer set could be the entire customer population or a subset, depending on to which customers the product q is of potential interest. We assume C_0 ⊂ C. Based on the target customer set, the relevant product set P = {p_1, p_2, ..., p_N} is determined. Products in P have been rated by more than a specified minimum number k of customers in C. The set of ratings given by customers in C on products in P is denoted as R = {r_{c,p}}, where c ∈ C and p ∈ P.

In order to improve the accuracy of the rating predictions of product q for the target customers, we identify s customers C_1 = {c | c ∈ C, c ∉ C_0}, where |C_1| = s, and acquire their ratings R_q(C_1) = {r_{c,q}} (c ∈ C_1) on q to improve the overall rating prediction accuracy over the entire target customer set. With a particular product recommendation function, f: R × q → R_q', where R_q' = {r'_{c,q}}, c ∈ C, we can obtain the predicted ratings of the target customers on q. Our objective is to find the set C_1 such that the error of the predicted ratings of customers in C on q, Error(R_q, R_q'), is minimized. A number of forms of rating prediction error have been used in the literature, such as mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).

The input and output of the product rating acquisition problem formulation is summarized in Figure 1.

Input: target product q, target customer set C, relevant product set P, early raters C_0 and their ratings R_0 on q, number of additional ratings to acquire s, and a recommendation algorithm f that takes as input the rating set R and a target product q and produces q's predicted ratings by the customers, R_q'

Output: acquisition set C_1, where c ∈ C, c ∉ C_0, and |C_1| = s, such that Error(R_q, f(R ∪ R_0 ∪ R_q(C_1), q)) is minimized

Figure 1. The product rating acquisition problem
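To make the formulation concrete, the sketch below (our illustration, not part of the paper) solves a toy instance of the problem by brute force: it enumerates every size-s candidate set C_1 and keeps the one with the lowest error. The `evaluate_error` callback is hypothetical and stands in for Error(R_q, f(R ∪ R_0 ∪ R_q(C_1), q)); the exhaustive search is only feasible for tiny instances, which is why sampling heuristics are needed in practice.

```python
from itertools import combinations

def best_acquisition_set(candidates, s, evaluate_error):
    """Brute-force reference solution: try every size-s subset of the
    candidate customers (C \\ C0) and return the one whose acquired
    ratings minimize the prediction error over the target set."""
    return min(combinations(candidates, s), key=evaluate_error)
```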

3.2 Rating Acquisition Methods

In this section we describe the proposed active learning sampling method for rating acquisition. We also describe two straightforward benchmark sampling methods: uniform random sampling and degree-based sampling.

3.2.1 Active Learning Sampling

Our proposed method adopts the fundamental notion of active learning schemes for learning classifiers. The candidate customers from whom to acquire ratings are selected based on the prediction error of the recommendation function derived from the currently available ratings.

Most existing active learning methods identify examples for acquisition based on some notion of the uncertainty of the current model. Take Uncertainty Sampling [20] for binary classification as an example: the most informative examples are identified as the ones for which the current classifier estimates a class membership probability closest to 0.5. The idea is that the current classification model is most uncertain about the class membership of these examples, so their class labels should be acquired to achieve the largest improvement in the model. Such an approach relies on the classifier's ability to provide predictions in a probabilistic form. To follow the uncertainty sampling approach exactly when performing rating acquisition, we would be limited to recommendation algorithms that can provide some uncertainty measure of the predicted rating, such as the latent class model and mixture models. For a large number of other recommendation algorithms, such measures of rating prediction uncertainty have not been well studied, making such a sampling method rarely applicable.

In order to develop a generic active learning sampling method for any recommendation algorithm, we rely on the observed prediction errors for the customers whose actual ratings are available, C_0. Based on the recommendation function derived from the available ratings, we can obtain predicted ratings of the new product for the target customers, R_q' = f(R ∪ R_0, q). Among these target customers, for c ∈ C_0 the observed prediction error (r_{c,q} − r'_{c,q}) is available, but for the rest of the customers (candidates from whom to acquire ratings) the actual rating is not yet available to derive a prediction error. Here the challenge for active learning sampling is that the uncertainty of the prediction with respect to a particular customer only becomes known after his/her rating has already been acquired. As rating-based similarity is the fundamental data pattern any recommendation algorithm relies on to generate predictions, we propose to sample the close neighbors of the customers for whom a large rating prediction error is currently observed. Under the fundamental assumption of collaborative filtering, similar customers are expected to continue to show similarity in rating new products. Building upon this assumption, it is reasonable to expect that the current recommendation quality is poor for the close neighbors of the customers whose observed prediction error is large. These customers' ratings should be acquired to improve the overall recommendation quality.

Figure 2 presents this active learning sampling algorithm.

The algorithm starts by computing the customer similarities using Eq. (1) introduced in Section 2.1. Predicted ratings on the new product q are then generated by a given recommendation algorithm using the ratings on relevant products (R) and the available ratings on product q (R_0). The absolute prediction error is obtained for the customers in C_0, and for each of these customers a selection probability proportional to the absolute prediction error is computed. We then perform s random draws from the customers in C_0 based on their selection probabilities. Each time a customer c_i is drawn, her closest unselected neighbor is added to the selection set. Intuitively, we bias the sampling toward the areas of the customer similarity space with large observed prediction errors. The probabilistic selection process is used based on the success of such procedures in evolutionary computing, such as genetic algorithms.

Active Learning Sampling Algorithm

Input: target product q, target customer set C, relevant product set P, early raters C_0 and their ratings R_0 on q, number of additional ratings to acquire s, and a recommendation algorithm f that takes as input the rating set R and a target product q and produces q's predicted ratings by the customers, R_q'

1. Compute the customer similarities {w(i,j)} based on R ∪ R_0 following Eq. (1)
2. Apply the recommendation algorithm f to obtain predicted ratings: R_q' = f(R ∪ R_0, q)
3. Compute the absolute prediction errors for customers in C_0: E = {e_c = |r_{c,q} − r'_{c,q}|}
4. Compute selection probabilities based on E: PR = {pr_c}, where pr_c = |r_{c,q} − r'_{c,q}| / Σ_{c' ∈ C_0} |r_{c',q} − r'_{c',q}|
5. Set the selection set S to be empty
6. While |S| < s:
7.    Randomly select a customer c_i from C_0 according to the selection probabilities PR
8.    Find the closest unselected neighbor c_n of c_i and add it to S: n = argmax_{c_n ∈ C \ (C_0 ∪ S)} |w(i,n)|

Output: acquisition set C_1 = S, where c ∈ C, c ∉ C_0, and |C_1| = s

Figure 2. Active learning sampling algorithm
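A direct Python transcription of one round of the Figure 2 sampler might look as follows. This is our own sketch under assumed data structures: `sim` holds the pairwise customer similarities from Eq. (1), and `errors` holds the absolute prediction errors of step 3.

```python
import random

def active_learning_sample(C, C0, sim, errors, s, rng=None):
    """One round of the Figure 2 sampler (sketch).
    C: target customers; C0: early raters (ordered); sim[(i, j)]: customer
    similarity; errors[c]: |r_cq - r'_cq| for c in C0; s: ratings to acquire."""
    rng = rng or random.Random()
    total = sum(errors[c] for c in C0)
    # Step 4: selection probability proportional to absolute prediction error
    weights = [errors[c] / total for c in C0]
    S = set()
    while len(S) < s:
        # Step 7: probabilistic draw from the early raters
        ci = rng.choices(list(C0), weights=weights, k=1)[0]
        # Step 8: the closest not-yet-selected neighbor outside C0 joins S
        candidates = [c for c in C if c not in C0 and c not in S]
        if not candidates:
            break
        S.add(max(candidates, key=lambda c: abs(sim.get((ci, c), 0.0))))
    return S
```

Note that high-error early raters are drawn with replacement, so a customer with a very large error can contribute several of her nearest neighbors to the acquisition set.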

We next examine in detail the impact of the new ratings acquired following the active learning sampling for both recommendation algorithms. Under the user-based algorithm, a large observed rating prediction error for customer c could be due to a lack of information on the customer neighborhood (either none or very few of c's neighbors have rated product q, so there is no information from which to infer c's rating). Acquiring the rating of a close neighbor of c enhances the information available about the match between q and this customer neighborhood and improves the accuracy of the predicted ratings for the customers in this neighborhood. Under the item-based algorithm, a large observed rating prediction error for customer c could be related to a lack of information on q's relationship to the products previously rated by c (either none or very few products rated by c appear within the neighborhood of product q). Acquiring the rating of c's close neighbor c' brings the maximum amount of information on the relationship between q and these products, as c' by definition has strong correlation with c on co-rated products. With the additional rating r_{c',q}, each pair of highly correlated ratings by c and c' on the co-rated products will now contribute to forming the similarity between q and these co-rated products.

3.2.2 Sequential Active Learning Sampling

One major advantage of active learning sampling is that it can be performed sequentially to further improve the sampling performance. In the recommendation context, a relatively small set of ratings can be acquired at each step, and the algorithm in Figure 2 can be applied repeatedly to perform the sequential sampling. Measures can be taken to alleviate the heavy computing demand of this approach. Note that when there are a large number of relevant products, the additional ratings acquired on the new product along the active learning sampling process have a negligible effect on the customer similarities. Therefore, we can use the customer similarities computed from R ∪ R_0 throughout the entire sampling process. Depending on the specific recommendation algorithm used, the efficiency of the rating prediction generation process can also be improved by building on the model constructed previously.

3.2.3 Benchmark Sampling Methods

The most straightforward sampling method is the uniform random sampling method. Under this method, each unselected customer is chosen with equal probability.

Another intuitive method is to prefer the active customers who have previously rated large numbers of products. One may conjecture that these customers are experts who play the role of opinion leaders in the customer community. Following the terminology of networks, we refer to these customers as high-degree customers (who have large numbers of links in the customer-product rating graph) and to such a sampling method as degree-based sampling. Under this method, an unselected customer is chosen with a probability proportional to his/her degree.
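The two benchmarks can be sketched as follows (again our own illustration; the names are ours, and `degree[c]` is assumed to hold the number of products customer c has rated):

```python
import random

def random_sample(candidates, s, rng=None):
    """Uniform random sampling: every unselected customer is equally likely."""
    rng = rng or random.Random()
    return rng.sample(list(candidates), s)

def degree_based_sample(candidates, degree, s, rng=None):
    """Degree-based sampling: draw customers without replacement, with
    probability proportional to their degree in the rating graph."""
    rng = rng or random.Random()
    pool, chosen = list(candidates), []
    while len(chosen) < s and pool:
        c = rng.choices(pool, weights=[degree[x] for x in pool], k=1)[0]
        chosen.append(c)
        pool.remove(c)
    return chosen
```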

4. AN EXPERIMENTAL STUDY

4.1 Data

We use the rating data from the Netflix Prize competition [4] to empirically test the performance of our proposed active learning sampling algorithm in comparison with the benchmark sampling methods. The Netflix Prize competition was launched in October 2006, for which a dataset containing 100 million anonymous movie ratings was released.

We sampled eight movies of varying levels of popularity released in 2005 from the Netflix Prize training dataset. We did not include extremely popular movies, as proactive rating acquisition might not be necessary for them. For each movie, we identified all customers who gave ratings and treated this set of customers as the target customers C. Their ratings on the movie constitute the set R_q. The relevant movie set P is formed by the movies released prior to 2005 that were rated by more than 20 customers in C. The customers in C without any rating on the relevant movies were removed from C. The rating set R consists of the ratings given by customers in C on movies in P. Table 1 shows the sample movies used in this study.

Table 1. Sample movies used in the experiment

Movie Id | Title                                                         | |C|  | |P|  | |R|
1        | Thomas & Friends: Calling All Engines                         | 59   | 176  | 4917
2        | Johnny Cash: Ridin' the Rails: The Great American Train Story | 64   | 179  | 4834
3        | Kelly Clarkson: Behind Hazel Eyes                             | 100  | 421  | 14858
4        | The Work of Director Jonathan Glazer                          | 100  | 436  | 15589
5        | Fraggle Rock: Live by the Rule of the Rock                    | 510  | 2393 | 199978
6        | The Life and Times of Frida Kahlo                             | 514  | 1917 | 140083
7        | Barbie and the Magic of Pegasus                               | 999  | 1898 | 209161
8        | Mondovino                                                     | 1038 | 2057 | 210801

4.2 Experiment Procedure

For each target movie q, we first randomly select about 10% of the target customer set C and acquire their ratings. These ratings are intended to simulate the ratings that are naturally available in the system, and they form the initial rating set R_0. In order to get a complete picture of the impact of the acquired ratings on overall recommendation quality, we vary the acquisition size by deciles from 10% to 90% of the target customer set. The last acquisition set always exhausts the entire target customer set, as 10% of the customers were already used for the initial rating set. Presumably, as more ratings are acquired the recommendation quality should improve. As the acquisition size approaches 90% of the target set, the marginal improvement should diminish because the information value of additional ratings decreases when a large number of ratings are already available.

We perform sequential sampling as the acquisition size increases. For example, to reach the acquisition size of 50% we use the 10% initial ratings and the 30% additional ratings acquired in the previous step to form the current initial ratings. An additional 10% of the ratings are acquired following random, degree-based, and active learning sampling methods.

After each step of acquisition, the predicted ratings of all customers in the target set, R_q', are computed using the user-based and item-based algorithms. Accuracy measures of these predicted ratings are computed against the actual ratings R_q. In this study we adopt the commonly used MAE, MSE, and RMSE as rating prediction accuracy measures. These measures are computed as follows:

MAE = \frac{1}{|C|} \sum_{c \in C} |r_{c,q} - r'_{c,q}|

MSE = \frac{1}{|C|} \sum_{c \in C} (r_{c,q} - r'_{c,q})^2

RMSE = \sqrt{\frac{1}{|C|} \sum_{c \in C} (r_{c,q} - r'_{c,q})^2}

Note that the active learning sampling method relies on the

predicted ratings computed in the previous step. Therefore substantially different samples may be selected depending on the user-based or item-based algorithm is incorporated in the

sampling process if the two algorithms generate substantially

different rating predictions. In our experiments, when the user-based (item-based) algorithm is used to generate the recommendation after acquisition, the user-based (item-based) algorithm is also used during the acquisition process. It is possible though, for example, to perform active learning sampling using item-based algorithm and use the resulting rating set to perform user-based recommendation. We leave such a setup for future research.
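The three accuracy measures can be computed directly from the actual and predicted rating vectors; the sketch below assumes ratings are stored in dictionaries keyed by customer id.

```python
import math

def rating_errors(actual, predicted):
    """MAE, MSE, and RMSE between actual ratings r_{c,q} and predicted
    ratings r'_{c,q} over the target customer set C (the keys of `actual`)."""
    diffs = [actual[c] - predicted[c] for c in actual]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    mse = sum(d * d for d in diffs) / n
    return mae, mse, math.sqrt(mse)
```

For example, actual ratings {4, 3, 5} predicted as {3, 3, 3} give MAE = 1.0 and MSE = 5/3.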

4.3 Results

Figure 3 shows the results of our experiments on the 8 sample movies. We only report the MAE measure in this paper due to space limitations; the MSE and RMSE measures we obtained showed similar patterns.

Overall, we observed that as more ratings were acquired for recommendation, the prediction error of both algorithms generally decreased as expected, for all three sampling methods. There were several exceptions, though, typically at the early stages of the rating acquisition process. This general decreasing trend of the prediction error confirms that the fundamental assumption of collaborative filtering is consistent with reality to a certain extent and that the recommendation algorithms we used have some predictive power. We also observed that for large target customer sets, the additional ratings were less valuable in improving recommendation performance. This is likely a result of our experiment procedure. Because we always used 10% randomly selected ratings as the initial rating set, for a movie with a large target customer set the initial rating set was also large. Had we used a fixed batch size to acquire additional ratings, we would likely have observed the same steep decrease at the beginning for the movies with large target customer sets as well. However, in practical systems the popular movies are also likely to naturally receive larger numbers of ratings from early raters. Therefore, our experiment may capture the actual potential gain for Netflix from adopting the proactive rating acquisition strategy for different types of movies.

Table 2. MAE differences between using initial 10% ratings and all ratings

Movie Id    User-based    Item-based    Item-based/User-based
1           0.11118       0.19159       1.72321
2           0.13434       0.31285       2.32887
3           0.09793       0.22663       2.31425
4           0.09701       0.14361       1.48034
5           0.03171       0.10816       3.41036
6           0.03517       0.12754       3.62608
7           0.03451       0.13593       3.93924
8           0.01432       0.13643       9.52868
Avg         0.06952       0.17284       3.54388

We found that for the sample movies we studied, the item-based algorithm consistently outperformed the user-based algorithm as more ratings were acquired for recommendation. Across movies, the predicted ratings generated by the two algorithms from the 10% initial rating set had similar MAE measures, with differences less than 0.1. Of the eight movies, the item-based algorithm had a lower initial MAE than the user-based algorithm for six. However, as we acquired additional ratings for generating recommendations, the prediction error of the item-based algorithm decreased much faster than that of the user-based algorithm. Table 2 shows the MAE differences between the recommendations generated from the initial rating set and from the entire rating set for the user-based and item-based algorithms. The reduction in MAE achieved by the item-based algorithm was on average 3.54 times (ranging from 1.48 to 9.53) that achieved by the user-based algorithm.

Figure 3. MAE measures of predicted ratings by user-based and item-based algorithms with different training sample sizes using three rating acquisition methods. Panels (a)-(h) correspond to Movies 1-8.

Table 3 shows the percentage improvement in MAE from acquiring additional ratings to reach 20%, 40%, and 60% of all ratings, relative to using the 10% initial ratings, for the item-based algorithm. We report the random sampling and active learning sampling methods here. On average, randomly selecting additional ratings to acquire achieved 4.510%, 11.443%, and 15.751% improvements over the initial recommendations when the ratings used for recommendation reached 20%, 40%, and 60%. The corresponding improvements using the active learning sampling method were 8.023%, 15.007%, and 19.484%. These are substantial improvements considering that Netflix offered one million dollars for a 10% improvement.
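The improvement figures in Table 3 are relative MAE reductions over the baseline built from the 10% initial rating set; a minimal helper makes the relation explicit (the function name is ours, for illustration).

```python
def pct_improvement(mae_initial, mae_current):
    """Percentage reduction in MAE relative to the baseline MAE obtained
    from the 10% initial rating set."""
    return 100.0 * (mae_initial - mae_current) / mae_initial
```

For instance, a baseline MAE of 0.5 reduced to 0.45 is a 10% improvement.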

Table 3. Percentage improvements of using 20%, 40%, and 60% ratings over using 10% initial ratings for the item-based algorithm

% of ratings used    Movie Id    Random     Active     Difference
20%                  1           4.417%     6.236%     1.818%
                     2           -1.146%    16.155%    17.301%
                     3           2.454%     6.606%     4.152%
                     4           10.007%    10.877%    0.870%
                     5           3.678%     4.550%     0.872%
                     6           7.422%     8.388%     0.966%
                     7           5.402%     5.350%     -0.051%
                     8           3.843%     6.021%     2.178%
                     Avg         4.510%     8.023%     3.513%
40%                  2           16.650%    25.376%    8.727%
                     3           6.512%     15.234%    8.721%
                     4           14.862%    19.001%    4.139%
                     5           8.306%     8.592%     0.286%
                     6           12.670%    15.706%    3.036%
                     7           9.745%     13.391%    3.645%
                     8           7.120%     12.335%    5.215%
                     Avg         11.443%    15.007%    3.564%
60%                  1           24.158%    24.109%    -0.049%
                     2           23.395%    28.952%    5.557%
                     3           12.531%    19.528%    6.997%
                     4           15.790%    19.404%    3.614%
                     5           11.249%    11.232%    -0.017%
                     6           15.490%    18.550%    3.060%
                     7           13.595%    17.437%    3.842%
                     8           9.799%     16.660%    6.861%
                     Avg         15.751%    19.484%    3.733%

These experimental results show that improving rating prediction accuracy through rating acquisition can be a viable strategy. Even with random sampling, acquiring ratings from an additional 10 percent of the customers resulted in a 4.51% improvement in rating prediction accuracy. Our results also show that the proposed active learning sampling method substantially outperformed the random sampling method. On average, active learning sampling achieved over 3.5 percentage points more improvement than random sampling (3.513%, 3.564%, and 3.733% when 10%, 30%, and 50% additional ratings were acquired).

We also present the MAE measures of the item-based algorithm under random and active learning sampling in Table 4.

Table 4. MAE measures of using 20%, 40%, and 60% ratings for the item-based algorithm

% of ratings used    Movie Id    Random     Active     Improvement
20%                  1           0.56053    0.54987    0.01066
                     2           0.78032    0.64684    0.13347
                     3           1.02388    0.98030    0.04358
                     4           0.67496    0.66843    0.00652
                     5           0.74331    0.73658    0.00673
                     6           0.62789    0.62134    0.00655
                     7           0.63937    0.63972    -0.00035
                     8           0.74321    0.72638    0.01683
                     Avg         0.72418    0.69618    0.02800
40%                  1           0.49449    0.52532    -0.03083
                     2           0.64302    0.57570    0.06732
                     3           0.98129    0.88974    0.09154
                     4           0.63854    0.60750    0.03104
                     5           0.70759    0.70539    0.00221
                     6           0.59230    0.57171    0.02059
                     7           0.61002    0.58538    0.02464
                     8           0.71788    0.67758    0.04030
                     Avg         0.67314    0.64229    0.03085
60%                  1           0.44476    0.44505    -0.00029
                     2           0.59099    0.54812    0.04287
                     3           0.91811    0.84467    0.07344
                     4           0.63159    0.60448    0.02711
                     5           0.68489    0.68502    -0.00013
                     6           0.57317    0.55242    0.02075
                     7           0.58400    0.55803    0.02597
                     8           0.69717    0.64414    0.05303
                     Avg         0.64058    0.61024    0.03034

The comparison among the three sampling methods was less clear for the user-based algorithm. No single sampling method consistently outperformed the other two. Surprisingly, the active learning sampling generally had the worst performance for Movies 3, 4, 5, and 8. It is not yet clear why the sampling methods performed differently with the user-based versus the item-based algorithm. One possible explanation is that customer similarity is not as reliable as movie similarity in the Netflix data. That is to say, if two movies have been rated similarly by a large number of customers, another customer is likely to rate the two movies similarly as well. However, if two customers have rated a large number of movies similarly in the past, they may still disagree on a new movie. That explains the generally superior performance of the item-based algorithm in our experiments. Given this understanding, the relative performance of active learning sampling for the user-based and item-based algorithms can be explained by the fact that active learning sampling relies only on the static (less reliable) customer similarity for the user-based algorithm, but continually improves the reliability of movie q's similarity with other movies for the item-based algorithm.

5. CONCLUSIONS AND FUTURE DIRECTIONS

In this paper, we proposed an alternative strategy to improve the rating prediction accuracy on a new product in recommender systems by selectively acquiring informative ratings from customers. We formalized this product rating acquisition problem and proposed an active learning sampling method that is generic to any recommendation algorithm. Our approach relies on constructing customer neighborhoods based on similarities in past ratings and sequentially selects close neighbors of the customers whose acquired ratings deviate most from the predictions of the current model. Using the Netflix Prize dataset, we evaluated our proposed sampling method against two benchmark methods: uniform random sampling and degree-based sampling, which prefers customers who have previously rated a large number of products. Two versions of neighborhood collaborative filtering algorithms commonly used in research and practice, the user-based and item-based algorithms, were used in our experiments. The experimental results showed that proactively acquiring additional ratings from customers (even randomly) can quickly and substantially improve the overall rating prediction accuracy for a new product, especially with the item-based algorithm. This finding provides empirical support for our proposed rating acquisition strategy for improving recommendation quality on newly introduced products. Our proposed active learning sampling method also substantially outperformed the benchmark sampling methods for the item-based algorithm, which performed significantly better than the user-based algorithm in our experiments. This finding confirms that additional ratings should be acquired selectively to achieve the maximum improvement in recommendation quality.
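The selection step of the proposed active learning sampling can be sketched as follows. This is a minimal sketch of the idea described above, not a definitive implementation: the error scores, neighbor lists, and tie-breaking are illustrative, and `neighbors` is assumed to be precomputed from past-rating similarities, ordered from most to least similar.

```python
def active_learning_sample(acquired_errors, neighbors, pool, k):
    """Pick k not-yet-sampled customers for the next acquisition batch.

    Rank customers who already rated the new product by their observed
    prediction error |r - r'|, then walk down that ranking and pick each
    customer's closest not-yet-sampled neighbors until k are chosen.

    acquired_errors: {customer: observed prediction error on the new product}
    neighbors: {customer: [neighbor ids ordered by past-rating similarity]}
    pool: set of customers whose ratings have not been acquired yet
    """
    selected, chosen = [], set()
    # Customers whose ratings the current model predicted worst come first.
    for c in sorted(acquired_errors, key=acquired_errors.get, reverse=True):
        for nb in neighbors.get(c, []):
            if nb in pool and nb not in chosen:
                selected.append(nb)
                chosen.add(nb)
                if len(selected) == k:
                    return selected
    # Fall back to the remaining pool if neighborhoods are exhausted.
    for nb in pool:
        if nb not in chosen:
            selected.append(nb)
            if len(selected) == k:
                break
    return selected
```

The intuition is that ratings of customers similar to the poorly predicted ones are the most informative to acquire next, since they most directly correct the current model.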

To the best of our knowledge, this is the first study to investigate the application of the active learning notion to the recommendation of new products. Our proposed active learning sampling method also departs from the existing active learning methods in the sense that it relies on the observed prediction errors on the acquired examples to perform selection and is applicable to any recommendation algorithms.

We believe the rating acquisition problem introduced in this paper is an intellectually intriguing problem with large potential impact on the practice of e-commerce recommender systems. A wide variety of ideas are yet to be explored in designing effective sampling methods for product rating acquisition, which is a major future direction of this research. Our current formulation of the product rating acquisition problem is limited to a single new product; it would be interesting to explore whether correlations among multiple products can be exploited to improve acquisition effectiveness. Meanwhile, we are also extending the current work by experimenting on additional movie rating data and pursuing detailed analysis aimed at explaining the different behavior of the user-based and item-based algorithms.

6. ACKNOWLEDGMENTS

This research is partly supported by a grant from the Smeal School of Business, Pennsylvania State University. We also want to thank Netflix for making the Netflix Prize dataset available for research use.


[批处理]计算时间差的函数etime

[批处理]计算时间差的函数etime 计算时间差的函数etime 收藏 https://www.doczj.com/doc/305049416.html,/thread-4701-1-1.html 这个是脚本代码[保存为etime.bat放在当前路径下即可:免费内容: :etime <begin_time> <end_time> <return> rem 所测试任务的执行时间不超过1天// 骨瘦如柴版setlocal&set be=%~1:%~2&set cc=(%%d-%%a)*360000+(1%%e-1%%b)*6000+1%%f-1% %c&set dy=-8640000 for /f "delims=: tokens=1-6" %%a in ("%be:.=%")do endlocal&set/a %3=%cc%,%3+=%dy%*("%3>> 31")&exit/b ---------------------------------------------------------------------------------------------------------------------------------------- 计算两个时间点差的函数批处理etime 今天兴趣大法思考了好多bat的问题,以至于通宵 在论坛逛看到有个求时间差的"函数"被打搅调用地方不少(大都是测试代码执行效率的) 免费内容: :time0

::计算时间差(封装) @echo off&setlocal&set /a n=0&rem code 随风@https://www.doczj.com/doc/305049416.html, for /f "tokens=1-8 delims=.: " %%a in ("%~1:%~2") do ( set /a n+=10%%a%%100*360000+10%%b%%100*6000+10%% c%%100*100+10%%d%%100 set /a n-=10%%e%%100*360000+10%%f%%100*6000+10%%g %%100*100+10%%h%%100) set /a s=n/360000,n=n%%360000,f=n/6000,n=n%%6000,m=n/1 00,n=n%%100 set "ok=%s% 小时%f% 分钟%m% 秒%n% 毫秒" endlocal&set %~3=%ok:-=%&goto :EOF 这个代码的算法是统一找时间点凌晨0:00:00.00然后计算任何一个时间点到凌晨的时间差(单位跑秒) 然后任意两个时间点求时间差就是他们相对凌晨时间点的时间数的差 对09这样的非法8进制数的处理用到了一些技巧,还有两个时间参数不分先后顺序,可全可点, 但是这个代码一行是可以省去的(既然是常被人掉用自然体

to与for的用法和区别

to与for的用法和区别 一般情况下, to后面常接对象; for后面表示原因与目的为多。 Thank you for helping me. Thanks to all of you. to sb.表示对某人有直接影响比如,食物对某人好或者不好就用to; for表示从意义、价值等间接角度来说,例如对某人而言是重要的,就用for. for和to这两个介词,意义丰富,用法复杂。这里仅就它们主要用法进行比较。 1. 表示各种“目的” 1. What do you study English for? 你为什么要学英语? 2. She went to france for holiday. 她到法国度假去了。 3. These books are written for pupils. 这些书是为学生些的。 4. hope for the best, prepare for the worst. 作最好的打算,作最坏的准备。 2.对于 1.She has a liking for painting. 她爱好绘画。 2.She had a natural gift for teaching. 她对教学有天赋/ 3.表示赞成同情,用for不用to. 1. Are you for the idea or against it? 你是支持还是反对这个想法? 2. He expresses sympathy for the common people.. 他表现了对普通老百姓的同情。 3. I felt deeply sorry for my friend who was very ill. 4 for表示因为,由于(常有较活译法) 1 Thank you for coming. 谢谢你来。 2. France is famous for its wines. 法国因酒而出名。 5.当事人对某事的主观看法,对于(某人),对…来说(多和形容词连用)用介词to,不用for.. He said that money was not important to him. 他说钱对他并不重要。 To her it was rather unusual. 对她来说这是相当不寻常的。 They are cruel to animals. 他们对动物很残忍。 6.for和fit, good, bad, useful, suitable 等形容词连用,表示适宜,适合。 Some training will make them fit for the job. 经过一段训练,他们会胜任这项工作的。 Exercises are good for health. 锻炼有益于健康。 Smoking and drinking are bad for health. 抽烟喝酒对健康有害。 You are not suited for the kind of work you are doing. 7. for表示不定式逻辑上的主语,可以用在主语、表语、状语、定语中。 1.It would be best for you to write to him. 2.The simple thing is for him to resign at once. 3.There was nowhere else for me to go. 4.He opened a door and stood aside for her to pass.

of与for的用法以及区别

of与for的用法以及区别 for 表原因、目的 of 表从属关系 介词of的用法 (1)所有关系 this is a picture of a classroom (2)部分关系 a piece of paper a cup of tea a glass of water a bottle of milk what kind of football,American of soccer? (3)描写关系 a man of thirty 三十岁的人 a man of shanghai 上海人 (4)承受动作 the exploitation of man by man.人对人的剥削。 (5)同位关系 It was a cold spring morning in the city of London in England. (6)关于,对于 What do you think of Chinese food? 你觉得中国食品怎么样? 介词 for 的用法小结 1. 表示“当作、作为”。如: I like some bread and milk for breakfast. 我喜欢把面包和牛奶作为早餐。What will we have for supper? 我们晚餐吃什么?

2. 表示理由或原因,意为“因为、由于”。如: Thank you for helping me with my English. 谢谢你帮我学习英语。 Thank you for your last letter. 谢谢你上次的来信。 Thank you for teaching us so well. 感谢你如此尽心地教我们。 3. 表示动作的对象或接受者,意为“给……”、“对…… (而言)”。如: Let me pick it up for you. 让我为你捡起来。 Watching TV too much is bad for your health. 看电视太多有害于你的健康。 4. 表示时间、距离,意为“计、达”。如: I usually do the running for an hour in the morning. 我早晨通常跑步一小时。We will stay there for two days. 我们将在那里逗留两天。 5. 表示去向、目的,意为“向、往、取、买”等。如: let’s go for a walk. 我们出去散步吧。 I came here for my schoolbag.我来这儿取书包。 I paid twenty yuan for the dictionary. 我花了20元买这本词典。 6. 表示所属关系或用途,意为“为、适于……的”。如: It’s time for school. 到上学的时间了。 Here is a letter for you. 这儿有你的一封信。 7. 表示“支持、赞成”。如: Are you for this plan or against it? 你是支持还是反对这个计划? 8. 用于一些固定搭配中。如: Who are you waiting for? 你在等谁? For example, Mr Green is a kind teacher. 比如,格林先生是一位心地善良的老师。

延时子程序计算方法

学习MCS-51单片机,如果用软件延时实现时钟,会接触到如下形式的延时子程序:delay:mov R5,#data1 d1:mov R6,#data2 d2:mov R7,#data3 d3:djnz R7,d3 djnz R6,d2 djnz R5,d1 Ret 其精确延时时间公式:t=(2*R5*R6*R7+3*R5*R6+3*R5+3)*T (“*”表示乘法,T表示一个机器周期的时间)近似延时时间公式:t=2*R5*R6*R7 *T 假如data1,data2,data3分别为50,40,248,并假定单片机晶振为12M,一个机器周期为10-6S,则10分钟后,时钟超前量超过1.11秒,24小时后时钟超前159.876秒(约2分40秒)。这都是data1,data2,data3三个数字造成的,精度比较差,建议C描述。

上表中e=-1的行(共11行)满足(2*R5*R6*R7+3*R5*R6+3*R5+3)=999,999 e=1的行(共2行)满足(2*R5*R6*R7+3*R5*R6+3*R5+3)=1,000,001 假如单片机晶振为12M,一个机器周期为10-6S,若要得到精确的延时一秒的子程序,则可以在之程序的Ret返回指令之前加一个机器周期为1的指令(比如nop指令), data1,data2,data3选择e=-1的行。比如选择第一个e=-1行,则精确的延时一秒的子程序可以写成: delay:mov R5,#167 d1:mov R6,#171 d2:mov R7,#16 d3:djnz R7,d3 djnz R6,d2

djnz R5,d1 nop ;注意不要遗漏这一句 Ret 附: #include"iostReam.h" #include"math.h" int x=1,y=1,z=1,a,b,c,d,e(999989),f(0),g(0),i,j,k; void main() { foR(i=1;i<255;i++) { foR(j=1;j<255;j++) { foR(k=1;k<255;k++) { d=x*y*z*2+3*x*y+3*x+3-1000000; if(d==-1) { e=d;a=x;b=y;c=z; f++; cout<<"e="<

用c++编写计算日期的函数

14.1 分解与抽象 人类解决复杂问题采用的主要策略是“分而治之”,也就是对问题进行分解,然后分别解决各个子问题。著名的计算机科学家Parnas认为,巧妙的分解系统可以有效地系统的状态空间,降低软件系统的复杂性所带来的影响。对于复杂的软件系统,可以逐个将它分解为越来越小的组成部分,直至不能分解为止。这样在小的分解层次上,人就很容易理解并实现了。当所有小的问题解决完毕,整个大的系统也就解决完毕了。 在分解过程中会分解出很多类似的小问题,他们的解决方式是一样的,因而可以把这些小问题,抽象出来,只需要给出一个实现即可,凡是需要用到该问题时直接使用即可。 案例日期运算 给定日期由年、月、日(三个整数,年的取值在1970-2050之间)组成,完成以下功能: (1)判断给定日期的合法性; (2)计算两个日期相差的天数; (3)计算一个日期加上一个整数后对应的日期; (4)计算一个日期减去一个整数后对应的日期; (5)计算一个日期是星期几。 针对这个问题,很自然想到本例分解为5个模块,如图14.1所示。 图14.1日期计算功能分解图 仔细分析每一个模块的功能的具体流程: 1. 判断给定日期的合法性: 首先判断给定年份是否位于1970到2050之间。然后判断给定月份是否在1到12之间。最后判定日的合法性。判定日的合法性与月份有关,还涉及到闰年问题。当月份为1、3、5、7、8、10、12时,日的有效范围为1到31;当月份为4、6、9、11时,日的有效范围为1到30;当月份为2时,若年为闰年,日的有效范围为1到29;当月份为2时,若年不为闰年,日的有效范围为1到28。

图14.2日期合法性判定盒图 判断日期合法性要要用到判断年份是否为闰年,在图14.2中并未给出实现方法,在图14.3中给出。 图14.3闰年判定盒图 2. 计算两个日期相差的天数 计算日期A (yearA 、monthA 、dayA )和日期B (yearB 、monthB 、dayB )相差天数,假定A 小于B 并且A 和B 不在同一年份,很自然想到把天数分成3段: 2.1 A 日期到A 所在年份12月31日的天数; 2.2 A 之后到B 之前的整年的天数(A 、B 相邻年份这部分没有); 2.3 B 日期所在年份1月1日到B 日期的天数。 A 日期 A 日期12月31日 B 日期 B 日期1月1日 整年部分 整年部分 图14.4日期差分段计算图 若A 小于B 并且A 和B 在同一年份,直接在年内计算。 2.1和2.3都是计算年内的一段时间,并且涉及到闰年问题。2.2计算整年比较容易,但

常用介词用法(for to with of)

For的用法 1. 表示“当作、作为”。如: I like some bread and milk for breakfast. 我喜欢把面包和牛奶作为早餐。 What will we have for supper? 我们晚餐吃什么? 2. 表示理由或原因,意为“因为、由于”。如: Thank you for helping me with my English. 谢谢你帮我学习英语。 3. 表示动作的对象或接受者,意为“给……”、“对…… (而言)”。如: Let me pick it up for you. 让我为你捡起来。 Watching TV too much is bad for your health. 看电视太多有害于你的健康。 4. 表示时间、距离,意为“计、达”。如: I usually do the running for an hour in the morning. 我早晨通常跑步一小时。 We will stay there for two days. 我们将在那里逗留两天。 5. 表示去向、目的,意为“向、往、取、买”等。如: Let’s go for a walk. 我们出去散步吧。 I came here for my schoolbag.我来这儿取书包。 I paid twenty yuan for the dictionary. 我花了20元买这本词典。 6. 表示所属关系或用途,意为“为、适于……的”。如: It’s time for school. 到上学的时间了。 Here is a letter for you. 这儿有你的一封信。 7. 表示“支持、赞成”。如: Are you for this plan or against it? 你是支持还是反对这个计划? 8. 用于一些固定搭配中。如: Who are you waiting for? 你在等谁? For example, Mr Green is a kind teacher. 比如,格林先生是一位心地善良的老师。 尽管for 的用法较多,但记住常用的几个就可以了。 to的用法: 一:表示相对,针对 be strange (common, new, familiar, peculiar) to This injection will make you immune to infection. 二:表示对比,比较 1:以-ior结尾的形容词,后接介词to表示比较,如:superior ,inferior,prior,senior,junior 2: 一些本身就含有比较或比拟意思的形容词,如equal,similar,equivalent,analogous A is similar to B in many ways.

Excel中如何计算日期差

Excel中如何计算日期差: ----Excel中最便利的工作表函数之一——Datedif名不见经传,但却十分好用。Datedif能返回任意两个日期之间相差的时间,并能以年、月或天数的形式表示。您可以用它来计算发货单到期的时间,还可以用它来进行2000年的倒计时。 ----Excel中的Datedif函数带有3个参数,其格式如下: ----=Datedif(start_date,end_date,units) ----start_date和end_date参数可以是日期或者是代表日期的变量,而units则是1到2个字符长度的字符串,用以说明返回日期差的形式(见表1)。图1是使用Datedif函数的一个例子,第2行的值就表明这两个日期之间相差1年又14天。units的参数类型对应的Datedif返回值 “y”日期之差的年数(非四舍五入) “m”日期之差的月数(非四舍五入) “d”日期之差的天数(非四舍五入) “md”两个日期相减后,其差不足一个月的部分的天数 “ym”两个日期相减后,其差不足一年的部分的月数 “yd”两个日期相减后,其差不足一年的部分的天数

表1units参数的类型及其含义 图1可以通过键入3个带有不同参数的Datedif公式来计算日期的差。units的参数类型 ----图中:单元格Ex为公式“=Datedif(Cx,Dx,“y”)”得到的结果(x=2,3,4......下同) ----Fx为公式“=Datedif(Cx,Dx,“ym”)”得到的结果 ----Gx为公式“=Datedif(Cx,Dx,“md”)”得到的结果 现在要求两个日期之间相差多少分钟,units参数是什么呢? 晕,分钟你不能用天数乘小时再乘分钟吗? units的参数类型对应的Datedif返回值 “y”日期之差的年数(非四舍五入) “m”日期之差的月数(非四舍五入) “d”日期之差的天数(非四舍五入) “md”两个日期相减后,其差不足一个月的部分的天数 “ym”两个日期相减后,其差不足一年的部分的月数 “yd”两个日期相减后,其差不足一年的部分的天数 假设你的数据从A2和B2开始,在C2里输入下面公式,然后拖拉复制。 =IF(TEXT(A2,"h:mm:ss")

of和for的用法

of 1....的,属于 One of the legs of the table is broken. 桌子的一条腿坏了。 Mr.Brown is a friend of mine. 布朗先生是我的朋友。 2.用...做成的;由...制成 The house is of stone. 这房子是石建的。 3.含有...的;装有...的 4....之中的;...的成员 Of all the students in this class,Tom is the best. 在这个班级中,汤姆是最优秀的。 5.(表示同位) He came to New York at the age of ten. 他在十岁时来到纽约。 6.(表示宾格关系) He gave a lecture on the use of solar energy. 他就太阳能的利用作了一场讲演。 7.(表示主格关系) We waited for the arrival of the next bus. 我们等待下一班汽车的到来。

I have the complete works of Shakespeare. 我有莎士比亚全集。 8.来自...的;出自 He was a graduate of the University of Hawaii. 他是夏威夷大学的毕业生。 9.因为 Her son died of hepatitis. 她儿子因患肝炎而死。 10.在...方面 My aunt is hard of hearing. 我姑妈耳朵有点聋。 11.【美】(时间)在...之前 12.(表示具有某种性质) It is a matter of importance. 这是一件重要的事。 For 1.为,为了 They fought for national independence. 他们为民族独立而战。 This letter is for you. 这是你的信。

单片机C延时时间怎样计算

C程序中可使用不同类型的变量来进行延时设计。经实验测试,使用unsigned char类型具有比unsigned int更优化的代码,在使用时 应该使用unsigned char作为延时变量。以某晶振为12MHz的单片 机为例,晶振为12M H z即一个机器周期为1u s。一. 500ms延时子程序 程序: void delay500ms(void) { unsigned char i,j,k; for(i=15;i>0;i--) for(j=202;j>0;j--) for(k=81;k>0;k--); } 计算分析: 程序共有三层循环 一层循环n:R5*2 = 81*2 = 162us DJNZ 2us 二层循环m:R6*(n+3) = 202*165 = 33330us DJNZ 2us + R5赋值 1us = 3us 三层循环: R7*(m+3) = 15*33333 = 499995us DJNZ 2us + R6赋值 1us = 3us

循环外: 5us 子程序调用 2us + 子程序返回2us + R7赋值 1us = 5us 延时总时间 = 三层循环 + 循环外 = 499995+5 = 500000us =500ms 计算公式:延时时间=[(2*R5+3)*R6+3]*R7+5 二. 200ms延时子程序 程序: void delay200ms(void) { unsigned char i,j,k; for(i=5;i>0;i--) for(j=132;j>0;j--) for(k=150;k>0;k--); } 三. 10ms延时子程序 程序: void delay10ms(void) { unsigned char i,j,k; for(i=5;i>0;i--) for(j=4;j>0;j--) for(k=248;k>0;k--);

for和to区别

1.表示各种“目的”,用for (1)What do you study English for 你为什么要学英语? (2)went to france for holiday. 她到法国度假去了。 (3)These books are written for pupils. 这些书是为学生些的。 (4)hope for the best, prepare for the worst. 作最好的打算,作最坏的准备。 2.“对于”用for (1)She has a liking for painting. 她爱好绘画。 (2)She had a natural gift for teaching. 她对教学有天赋/ 3.表示“赞成、同情”,用for (1)Are you for the idea or against it 你是支持还是反对这个想法? (2)He expresses sympathy for the common people.. 他表现了对普通老百姓的同情。 (3)I felt deeply sorry for my friend who was very ill. 4. 表示“因为,由于”(常有较活译法),用for (1)Thank you for coming. 谢谢你来。

(2)France is famous for its wines. 法国因酒而出名。 5.当事人对某事的主观看法,“对于(某人),对…来说”,(多和形容词连用),用介词to,不用for. (1)He said that money was not important to him. 他说钱对他并不重要。 (2)To her it was rather unusual. 对她来说这是相当不寻常的。 (3)They are cruel to animals. 他们对动物很残忍。 6.和fit, good, bad, useful, suitable 等形容词连用,表示“适宜,适合”,用for。(1)Some training will make them fit for the job. 经过一段训练,他们会胜任这项工作的。 (2)Exercises are good for health. 锻炼有益于健康。 (3)Smoking and drinking are bad for health. 抽烟喝酒对健康有害。 (4)You are not suited for the kind of work you are doing. 7. 表示不定式逻辑上的主语,可以用在主语、表语、状语、定语中。 (1)It would be best for you to write to him. (2) The simple thing is for him to resign at once.

51单片机延时时间计算和延时程序设计

一、关于单片机周期的几个概念 ●时钟周期 时钟周期也称为振荡周期,定义为时钟脉冲的倒数(可以这样来理解,时钟周期就是单片机外接晶振的倒数,例如12MHz的晶振,它的时间周期就是1/12 us),是计算机中最基本的、最小的时间单位。 在一个时钟周期内,CPU仅完成一个最基本的动作。 ●机器周期 完成一个基本操作所需要的时间称为机器周期。 以51为例,晶振12M,时钟周期(晶振周期)就是(1/12)μs,一个机器周期包 执行一条指令所需要的时间,一般由若干个机器周期组成。指令不同,所需的机器周期也不同。 对于一些简单的的单字节指令,在取指令周期中,指令取出到指令寄存器后,立即译码执行,不再需要其它的机器周期。对于一些比较复杂的指令,例如转移指令、乘法指令,则需要两个或者两个以上的机器周期。 1.指令含义 DJNZ:减1条件转移指令 这是一组把减1与条件转移两种功能结合在一起的指令,共2条。 DJNZ Rn,rel ;Rn←(Rn)-1 ;若(Rn)=0,则PC←(PC)+2 ;顺序执行 ;若(Rn)≠0,则PC←(PC)+2+rel,转移到rel所在位置DJNZ direct,rel ;direct←(direct)-1 ;若(direct)= 0,则PC←(PC)+3;顺序执行 ;若(direct)≠0,则PC←(PC)+3+rel,转移到rel 所在位置 2.DJNZ Rn,rel指令详解 例:

MOV R7,#5 DEL:DJNZ R7,DEL; rel在本例中指标号DEL 1.单层循环 由上例可知,当Rn赋值为几,循环就执行几次,上例执行5次,因此本例执行的机器周期个数=1(MOV R7,#5)+2(DJNZ R7,DEL)×5=11,以12MHz的晶振为例,执行时间(延时时间)=机器周期个数×1μs=11μs,当设定立即数为0时,循环程序最多执行256次,即延时时间最多256μs。 2.双层循环 1)格式: DELL:MOV R7,#bb DELL1:MOV R6,#aa DELL2:DJNZ R6,DELL2; rel在本句中指标号DELL2 DJNZ R7,DELL1; rel在本句中指标号DELL1 注意:循环的格式,写错很容易变成死循环,格式中的Rn和标号可随意指定。 2)执行过程

excel中计算日期差工龄生日等方法

excel中计算日期差工龄生日等方法 方法1:在A1单元格输入前面的日期,比如“2004-10-10”,在A2单元格输入后面的日期,如“2005-6-7”。接着单击A3单元格,输入公式“=DATEDIF(A1,A2,"d")”。然后按下回车键,那么立刻就会得到两者的天数差“240”。 提示:公式中的A1和A2分别代表前后两个日期,顺序是不可以颠倒的。此外,DATEDIF 函数是Excel中一个隐藏函数,在函数向导中看不到它,但这并不影响我们的使用。 方法2:任意选择一个单元格,输入公式“="2004-10-10"-"2005-6-7"”,然后按下回车键,我们可以立即计算出结果。 计算工作时间——工龄—— 假如日期数据在D2单元格。 =DA TEDIF(D2,TODAY(),"y")+1 注意:工龄两头算,所以加“1”。 如果精确到“天”—— =DA TEDIF(D2,TODAY(),"y")&"年"&DATEDIF(D2,TODAY(),"ym")&"月"&DATEDIF(D2,TODAY(),"md")&"日" 二、计算2003-7-617:05到2006-7-713:50分之间相差了多少天、多少个小时多少分钟 假定原数据分别在A1和B1单元格,将计算结果分别放在C1、D1和E1单元格。 C1单元格公式如下: =ROUND(B1-A1,0) D1单元格公式如下: =(B1-A1)*24 E1单元格公式如下: =(B1-A1)*24*60 注意:A1和B1单元格格式要设为日期,C1、D1和E1单元格格式要设为常规. 三、计算生日,假设b2为生日

=datedif(B2,today(),"y") DA TEDIF函数,除Excel2000中在帮助文档有描述外,其他版本的Excel在帮助文档中都没有说明,并且在所有版本的函数向导中也都找不到此函数。但该函数在电子表格中确实存在,并且用来计算两个日期之间的天数、月数或年数很方便。微软称,提供此函数是为了与Lotus1-2-3兼容。 该函数的用法为“DA TEDIF(Start_date,End_date,Unit)”,其中Start_date为一个日期,它代表时间段内的第一个日期或起始日期。End_date为一个日期,它代表时间段内的最后一个日期或结束日期。Unit为所需信息的返回类型。 “Y”为时间段中的整年数,“M”为时间段中的整月数,“D”时间段中的天数。“MD”为Start_date与End_date日期中天数的差,可忽略日期中的月和年。“YM”为Start_date与End_date日期中月数的差,可忽略日期中的日和年。“YD”为Start_date与End_date日期中天数的差,可忽略日期中的年。比如,B2单元格中存放的是出生日期(输入年月日时,用斜线或短横线隔开),在C2单元格中输入“=datedif(B2,today(),"y")”(C2单元格的格式为常规),按回车键后,C2单元格中的数值就是计算后的年龄。此函数在计算时,只有在两日期相差满12个月,才算为一年,假如生日是2004年2月27日,今天是2005年2月28日,用此函数计算的年龄则为0岁,这样算出的年龄其实是最公平的。 本篇文章来源于:实例教程网(https://www.doczj.com/doc/305049416.html,) 原文链接:https://www.doczj.com/doc/305049416.html,/bgruanjian/excel/631.html

双宾语tofor的用法

1. 两者都可以引出间接宾语,但要根据不同的动词分别选用介词to 或for: (1) 在give, pass, hand, lend, send, tell, bring, show, pay, read, return, write, offer, teach, throw 等之后接介词to。 如: 请把那本字典递给我。 正:Please hand me that dictionary. 正:Please hand that dictionary to me. 她去年教我们的音乐。 正:She taught us music last year. 正:She taught music to us last year. (2) 在buy, make, get, order, cook, sing, fetch, play, find, paint, choose,prepare, spare 等之后用介词for 。如: 他为我们唱了首英语歌。 正:He sang us an English song. 正:He sang an English song for us. 请帮我把钥匙找到。 正:Please find me the keys. 正:Please find the keys for me. 能耽搁你几分钟吗(即你能为我抽出几分钟吗)? 正:Can you spare me a few minutes? 正:Can you spare a few minutes for me? 注:有的动词由于搭配和含义的不同,用介词to 或for 都是可能的。如: do sb a favou r do a favour for sb 帮某人的忙 do sb harnn= do harm to sb 对某人有害

Usage of for and of

Usage of for:

1. "As, in the role of". For example:
I like some bread and milk for breakfast.
What will we have for supper?

2. Reason or cause, "because of". For example:
Thank you for helping me with my English.
Thank you for your last letter.
Thank you for teaching us so well.

3. The target or recipient of an action, "for (someone)". For example:
Let me pick it up for you.
Watching TV too much is bad for your health.

4. Time or distance, "for (a duration or extent)". For example:

I usually do the running for an hour in the morning.
We will stay there for two days.

5. Destination or purpose, "for, to fetch, to buy". For example:
Let's go for a walk.
I came here for my schoolbag.
I paid twenty yuan for the dictionary.

6. Possession or intended use, "for, suited to". For example:
It's time for school.
Here is a letter for you.

7. "In support of, in favor of". For example:
Are you for this plan or against it?

8. In certain fixed collocations.

English Adjectives with of/for

Topic: preposition question — the difference between "It's + adjective + of sb. to do sth." and "It's + adjective + for sb. to do sth."

Content: It's very nice ___ pictures for me.
A. of you to draw  B. for you to draw  C. for you drawing  D. of you drawing

Submitted by: 杨天若  Time: 1/23/2008 20:05:54

Topic: telling for and of apart

Content: (the same question as above)

Answer: A.

Explanation: The question tests the difference between the patterns "It's + adjective + of sb. to do sth." and "It's + adjective + for sb. to do sth." In "It's + adjective + to do sth.", the doer of the infinitive is usually introduced by of or for; which one to use depends on the adjective before it.

1) If the adjective describes the character or quality of the doer, such as kind, good, nice, right, wrong, clever, careless, polite, foolish, use of sb. For example:
It's very kind of you to help me.
It's clever of you to work out the maths problem.

2) If the adjective merely describes the matter itself rather than evaluating the doer's character, use for sb.; such adjectives include difficult, easy, hard, important, dangerous, (im)possible. For example:
It's very dangerous for children to cross the busy street.
It's difficult for us to finish the work.

A way to tell for from of: make a sentence using the pronoun after the preposition as the subject and the adjective before the preposition as the predicate. If the sentence makes sense, use of; if not, use for. For example:
You are nice. (makes sense, so use of)
He is hard. ("the person is difficult" makes no sense, so use for)

Hence the correct answer is A.

Submitted by: f7_liyf  Time: 1/24/2008 11:18:42

How Do to and for Differ? (Part 1)

I. Differences when introducing an indirect object

Both prepositions can introduce an indirect object, but whether to use to or for depends on the verb; note the following three cases in particular:

1. After give, pass, hand, lend, send, tell, bring, show, pay, read, return, write, offer, teach, throw, etc., use to. For example:

"Please pass me that dictionary."
Correct: Please hand me that dictionary.
Correct: Please hand that dictionary to me.

"She taught us music last year."
Correct: She taught us music last year.
Correct: She taught music to us last year.

2. After buy, make, get, order, cook, sing, fetch, play, find, paint, choose, prepare, spare, etc., use for. For example:

"He sang an English song for us."
Correct: He sang us an English song.
Correct: He sang an English song for us.

"Please help me find the keys."
Correct: Please find me the keys.
Correct: Please find the keys for me.

"Can you spare me a few minutes?" (i.e., can you set aside a few minutes for me?)
Correct: Can you spare me a few minutes?

Correct: Can you spare a few minutes for me?

3. With some verbs, owing to differences in usage and meaning, either to or for is possible. For example:

do sb a favor = do a favor for sb (do someone a favor)
do sb harm = do harm to sb (harm someone)

In some cases neither for nor to is used, but some other preposition. For example:

play sb a trick = play a trick on sb (play a trick on someone)

Compare:
play sb some folk songs = play some folk songs for sb (play folk songs for someone)

Sometimes the same verb takes different prepositions in different uses. Take the pattern leave sb sth: to express leaving something for someone in the ordinary sense, the indirect object is introduced by for (leave sth for sb); to express leaving something behind at one's death, it is introduced by to (leave sth to sb). For example:

Would you like to leave him a message? / Would you like to leave a message for him?
Her father left her a large fortune. / Her father left a large fortune to her.

II. Differences when expressing a goal or direction

Both can express a goal, a destination, or a direction; here, too, the choice depends on the verb:

1. After come, go, walk, move, fly, ride, drive, march, return, etc., to is usually used to mark the goal or destination. For example:

He has gone to Shanghai.
They walked to a river.
