Our maintained assumption is that patents are a proxy for “bits of knowledge” and patent citations are a proxy for a given bit of knowledge being useful in the development of a descendant bit. This permits us to use the probability of citation as a proxy for the probability of useful knowledge flow, and empirical citation frequencies as a measure of that probability. Of course, the frequency with which generation of knowledge bits leads to a patent (the “propensity to patent”) varies over time and space, as does the frequency with which use of earlier knowledge produces a citation (the “propensity to cite”). These variations in the correspondence between the data and the underlying constructs of interest create problems of interpretation that must be dealt with via a combination of multiple measurements and identifying assumptions.

The nature of these issues can be seen in Figure 1, which plots empirical citation frequencies from other countries to the U.S., as a function of the time lag between the citing and cited patents. The citation frequency is calculated as the total number of citations divided by the product of the number of potentially citing patents and the number of potentially cited patents. For example, Japanese inventors took out about 22 thousand patents in 1993. U.S. inventors took out about 36 thousand patents in 1969. A total of about 800 citations were made from 1993 Japanese patents to 1969 U.S. patents. Hence the estimated citation frequency for this combination is about 1×10⁻⁶ (800/(22,000 × 36,000)). The citation frequencies plotted in Figure 1 are averages over all combinations with a given lag for which we have data; e.g., the calculated frequency at lag 30 derives from citations from 1993 to 1963 (our earliest data year) and from 1994 (our last data year) to 1964. We interpret the citation frequency as an estimate of the probability that a randomly drawn patent in the citing group will cite a randomly drawn patent in the cited group.
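The calculation described above can be illustrated with a minimal Python sketch. The function names, the `cells` data layout, and the use of a simple unweighted mean across cohort pairs are assumptions made for illustration; the text says only that the plotted frequencies are averages over all combinations with a given lag. The numbers in the example are the rounded figures quoted in the text.

```python
def citation_frequency(citations, n_citing, n_cited):
    """Citations divided by the number of potential citing-cited patent pairs."""
    if n_citing <= 0 or n_cited <= 0:
        raise ValueError("patent counts must be positive")
    return citations / (n_citing * n_cited)


def mean_frequency_at_lag(cells, lag):
    """Average the citation frequency over all cohort pairs with a given lag.

    `cells` (a hypothetical layout) maps (citing_year, cited_year) to a tuple
    (citations, n_citing, n_cited). An unweighted mean is assumed here.
    """
    freqs = [
        citation_frequency(*counts)
        for (citing_year, cited_year), counts in cells.items()
        if citing_year - cited_year == lag
    ]
    if not freqs:
        raise ValueError(f"no cohort pairs observed at lag {lag}")
    return sum(freqs) / len(freqs)


# Rounded figures from the text: 1993 Japanese patents citing 1969 U.S. patents.
example = citation_frequency(800, 22_000, 36_000)
print(f"{example:.2e}")  # about 1e-6
```

Because the denominator counts every possible citing-cited pair, the result can be read directly as a per-pair probability, which is the interpretation adopted in the text.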