
The average of the three cities with low income is not affected by density or temperature characteristics since the three cities contain all three levels of population density and all three levels of temperature. Thus, each of the foregoing averages is affected by only one characteristic at a time. Therefore, the difference between each of these three-city averages and the grand average of the nine cities measures the net effect on the average expenditure of each of the three levels within each classification. Using the net effects of each classification characteristic, the average expenditure can be estimated for a city in a class from which there was no sample city. The estimate is calculated by adding to or subtracting from the average for nine cities, the net effects measured by the three-city averages.

Suppose that in the example the 9-city average expenditure was $30, and differed from the 3-city averages as follows:

[The net effects for the remaining climate and density levels are not legible in the source.]

+$3 in the 3 Cold cities

-$3 in the 3 Thin cities

+$1 in the 3 High income cities

0 in the 3 Moderate income cities

-$1 in the 3 Low income cities

Then the estimate for a cold, thinly populated, high-income city would be $30 (the 9-city average) plus $3 (the net effect of cold), minus $3 (the net effect of thinly populated), plus $1 (the net effect of high income), or $31.
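The additive estimate worked through above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name are assumptions, and only the net effects actually quoted in the text are included.

```python
# Net effects (three-city average minus nine-city average), in dollars.
# Only the effects quoted in the text are listed; the function and
# dictionary names are illustrative, not taken from the survey.
GRAND_AVERAGE = 30.0
NET_EFFECTS = {
    "climate": {"cold": 3.0},
    "density": {"thin": -3.0},
    "income": {"high": 1.0, "moderate": 0.0, "low": -1.0},
}


def estimate_expenditure(climate, density, income):
    """Add the net effect of each characteristic to the nine-city average."""
    return (
        GRAND_AVERAGE
        + NET_EFFECTS["climate"][climate]
        + NET_EFFECTS["density"][density]
        + NET_EFFECTS["income"][income]
    )


print(estimate_expenditure("cold", "thin", "high"))  # 31.0
```

The result matches the $31 estimate given for a cold, thinly populated, high-income city.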

To estimate the average expenditure for all cities in the population size class 240,000 to 1 million, it is necessary only to estimate an average for each of the 27 city classes and weight the classes together by the total population of the cities contained in each class. To make an estimate for all cities, the 13 large cities (Group A) and the estimates of the three size groups (B, C, and D) of cities are weighted together by their total aggregates of population.
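The population weighting described above amounts to a weighted mean. The sketch below assumes each class is represented by an estimated average and a total population; the figures are hypothetical and are not survey results.

```python
def population_weighted_average(classes):
    """Combine class averages, weighting each by the class's total population.

    classes: list of (estimated average expenditure, total population) pairs.
    """
    total_population = sum(population for _, population in classes)
    weighted_sum = sum(average * population for average, population in classes)
    return weighted_sum / total_population


# Hypothetical figures for three of the 27 classes (not from the survey).
example_classes = [(31.0, 600_000), (28.0, 300_000), (30.0, 100_000)]
print(population_weighted_average(example_classes))  # 30.0
```

The same function would serve for combining Group A with the estimates for Groups B, C, and D, with each group's aggregate population as its weight.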

Estimates for individual cities not included in the sample are subject to four types of errors: (1) sampling error in the average of the sample city (within-city error); (2) error of using the sample city average to represent the average of its class; (3) error of using the average effects of each characteristic, additively, to estimate the average of a class from which no city was selected (error of the estimating formula); and (4) error of using the estimated average of a class not surveyed to estimate a given city in that class.

When the survey is completed it will be possible to estimate expenditure weights for price index purposes for cities not surveyed and to approximate the error of the estimate.

The success of this method depends, of course, largely on classifying the cities by variables which are closely related to expenditure patterns. Since the thousands of items of expenditures are affected by so many different characteristics (e. g., fuel by the climate, housing by density, and medical care by income level), it is difficult to find those few characteristics which are common to the greatest number of expenditures.

Also the modes of classification must be independent of one another; otherwise the three-way classification of the cities shows many blank cells and a balanced Latin Square cannot be selected. Cells in the classification diagram might be selected which contained no city. For instance, if the Bureau had used temperature as one mode of classification and geographic location for another mode of classification, cells classified as hot-northern and cold-southern would not likely contain any cities.

The problem of finding modes of classification which were closely related to expenditure patterns, but which were mutually unrelated, required study of many characteristics of cities before making the final choice for each particular group of cities. The selection of characteristics was further limited by the necessity of having comparable data for selected characteristics for all urban places. For the group of cities 240,000 to 1 million population, income level, climate, and population density were finally used; for the group of cities 30,500 to 240,000, city size, income level, and climate were used. For cities under 30,500, 4 modes of classification were used with 4 levels in each classification. The modes were income level, climate, population size, and distance to nearest major market area. The following paragraphs explain the exact sources and treatment of the data used.

Income level was based on the average quarterly pay for employees covered by Old-Age and Survivors Insurance tabulations by counties. These data can be found in Business Establishments, Employments and Taxable Payrolls under Old-Age and Survivors Insurance Program, First quarter 1947, by Industry Groups and Counties, U. S. Department of Commerce. The income classification of large cities, where the city population accounts for the major part of the county, was based on the published data without adjustment. For the smallest group of cities (under 30,500), the community income level was determined by a cross-classification of these average earnings data for the county in which the city is located and the 1940 Census average rent for the city. That is, cities were classified into five earnings levels (low, moderate low, moderate, moderate high, and high) by the average taxable earnings for the counties in which they were located. The moderate high level was observed to have a wide range in 1940 average city rents. It was therefore subdivided into low and high rent groups; the low rent portion was combined with the moderate income group and assigned to the "moderate high income" group; the high rent portion was assigned to the "high income" group.
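The cross-classification for the smallest cities can be written as a short decision rule. The sketch below uses the dollar cut-offs given in the notes to table 3; the function name and the treatment of values falling exactly on a boundary are assumptions.

```python
def income_level(county_earnings, city_rent_1940):
    """Classify a city under 30,500 population.

    county_earnings: county average annual earnings under OASI, in dollars.
    city_rent_1940: 1940 average monthly city rent, in dollars.
    Handling of exact boundary values is an assumption.
    """
    if county_earnings > 2360:
        return "high"
    if county_earnings >= 2136:
        # The wide-ranging earnings band is split by average rent.
        return "high" if city_rent_1940 >= 26 else "moderate high"
    if county_earnings >= 2036:
        return "moderate high"
    if county_earnings >= 1660:
        return "moderate low"
    return "low"


print(income_level(2200, 30))  # high
print(income_level(2200, 20))  # moderate high
```

Rent enters the rule only for the one earnings band that showed a wide range of rents; every other band is classified on county earnings alone.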

Climate was based on Average Monthly and Seasonal Degree Days-Base 65° F. as tabulated in U. S. Weather Bureau, Climatological Data. Degree days are defined as the sum of the deviations below 65° F. in the daily mean temperature. Population density is the ratio of 1947 estimated population to area in square miles.
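As a minimal sketch of the two measures just defined, the functions below compute degree days from a series of daily mean temperatures and density from population and area. The function names and the sample figures are illustrative assumptions, not Weather Bureau or Census data.

```python
def degree_days(daily_mean_temps_f, base=65.0):
    """Sum of the amounts by which each daily mean temperature falls below the base."""
    return sum(max(0.0, base - temp) for temp in daily_mean_temps_f)


def population_density(population, area_sq_miles):
    """Persons per square mile."""
    return population / area_sq_miles


# Three hypothetical days: 5 degrees below, above the base, 15 degrees below.
print(degree_days([60.0, 70.0, 50.0]))       # 20.0
# Hypothetical city of 300,000 persons on 150 square miles.
print(population_density(300_000, 150.0))    # 2000.0
```

Days with a mean temperature at or above 65° F. contribute nothing to the total.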

City size consists of 1947 population estimates.

TABLE 3.-Cities in Group A and Groups B-D cities selected from the three Latin Squares

[The body of the table is only partly legible in the source. The legible fragments are listed below; city names cannot be matched to rows with certainty.]

1 Surveyed for 1948. Surveyed for 1947.

Column headings, cities classified by population density: City; Population (1947 estimate); Climate; Population density; Income level.

964,000; Cold; Thick; Moderate
317,000; do.; Thin; High
240,000; do.
602,000; Mild
Medium; Low
Thick; High
829,000; do.; Medium; Moderate

Column headings, small cities: City; Population (1947 estimate); Climate; City size; Income level; Distance to market.

17,000; Cold
7,000; do.
4,000; do.
19,000; Med. cold
10,000; do.
7,000; do.
Large
Med. large
15,000; do.
do.; High

Cities named: Minneapolis-St. Paul, Minn.; Rawlins, Wyo.; Grand Island, Nebr.; Ravenna, Ohio; Garrett, Ind.; Shawnee, Okla.; Grand Forks, N. Dak.; Laconia, N. H.; Sandpoint, Idaho.

Notes to table 3.

Cities of 240,000 to 1 million population (Group B):
Climate classification (normal number of annual degree days): Hot, 185 to under 4,417; Mild, 4,417 to under 6,144; Cold, 6,144 and over.
Population density classification (persons per square mile): Thick, 1,773.8 to 3,913.3; Medium, 1,386.5 to 1,732.0; Thin, 514.1 to 1,269.2.
Income level classification (annual dollar earnings as reported under OASI): High, $2,468 and over; Moderate, $2,264 to $2,460; Low, under $2,240.
Surveyed in 1948.

Cities of 30,500 to 240,000 population (Group C):
Climate classification (normal number of annual degree days): Hot, 185 to 4,410; Mild, 4,417 to 5,936; Cold, 5,941 and over.
City size classification (population): Large, 154,455 to 235,275; Medium, 85,924 to 154,454; Small, 30,273 to 85,923.
Income level classification (annual dollar earnings as reported under OASI): High, $2,424 and over; Moderate, $2,136 to $2,240; Low, under $2,132.

Cities under 30,500 population (Group D):
Climate classification (normal number of annual degree days): Hot, under 3,224; Mild, 3,224 to under 5,232; Medium cold, 5,232 to under 6,282; Cold, over 6,282.
City size classification (population): Large city, 16,096 to 30,273; Medium large, 9,512 to 16,088; Medium small, 5,233 to 9,509; Small, 2,500 to 5,232.
Distance to market classification: A = Long distance to market (over 76 miles to any marketing area). B = Short distance to small market (less than 76 miles to marketing area with retail sales under $80,386,000). C = Short distance to medium market (less than 76 miles to marketing area with retail sales of $80,386,000 to $231,143,000). D = Short distance to large market (less than 76 miles to marketing area with retail sales over $231,143,000).
Income level classification (earnings as reported under OASI): High, cities in counties with average annual dollar earnings over $2,360 and cities with county average earnings between $2,136 and $2,360 and average city rent (1940) of $26 and over per month. Moderate high, cities with county average earnings between $2,136 and $2,360 and average city rent (1940) under $26 and cities with county average earnings between $2,036 and $2,136. Moderate low, cities with county average earnings between $1,660 and $2,036. Low, cities with county average earnings less than $1,660.

Distance to market center (for small cities) is the distance in road miles from the city to the nearest market center. A market center was defined as any city with retail sales over $40 million in 1947 as reported in Sales Management, March 1948.
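The four distance-to-market classes listed in the notes to table 3 can be expressed as a simple rule. In the sketch below the function name is hypothetical, and assigning a distance of exactly 76 miles to class A is an assumption, since the notes speak only of "over" and "less than" 76 miles.

```python
def distance_to_market_class(road_miles, market_retail_sales):
    """Return the class letter (A to D) for a small city.

    road_miles: road distance to the nearest market center.
    market_retail_sales: 1947 retail sales of that market center, in dollars.
    A distance of exactly 76 miles is treated as long distance (assumption).
    """
    if road_miles >= 76:
        return "A"  # long distance to market
    if market_retail_sales < 80_386_000:
        return "B"  # short distance to small market
    if market_retail_sales <= 231_143_000:
        return "C"  # short distance to medium market
    return "D"      # short distance to large market


print(distance_to_market_class(120, 500_000_000))  # A
print(distance_to_market_class(40, 100_000_000))   # C
```

Market size matters only for cities within 76 miles of a market center; beyond that distance every city falls in class A.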

Detailed tabulations of the 3 groups of cities under 1 million population, by the modes of classification, are given in table 2.

Sample Selection

For the three population groups of cities less than 1 million (Groups B, C, and D), a sample of cells was selected from each diagram to produce a balanced Latin Square as outlined above. The Latin Square for Groups B and C contained 9 cells and that for Group D, 16 cells.

Only one combination of cells was possible which would fulfill all the requirements of a balanced Latin Square for Group B. The reason is that the diagram contained a number of blank cells, the characteristic combinations of which did not describe any city of this size; for example, Group B contains no high-income, densely populated, hot city. The appearance of these blank cells in the diagram raises some question as to the efficiency of the design in estimating expenditure weights for cities not surveyed. Data obtained from cities added by purposive selection as described below will be used to test the estimates derived from the sample. For Group C there were 8 combinations possible, and for Group D a very large number of combinations.

The one combination of cells of Group B (just mentioned) was used in selecting the actual cities to be surveyed; of the 8 combinations of Group C, one was selected at random; and from the many combinations possible in Group D, the one which had the largest total population was selected. From each of the selected cells, cities were chosen at random.
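To see why blank cells limit the number of usable combinations, the following sketch enumerates every Latin square of a given order and keeps those whose selected cells all contain a city. It treats the selection as an ordinary Latin square on a three-way classification, with rows, columns, and entries standing for the levels of the three modes; any further balance requirement, and the four-way design used for Group D, are not modeled. The cell layout shown is hypothetical.

```python
from itertools import permutations


def latin_squares(n):
    """Yield every n x n Latin square as a tuple of rows of level indices."""
    candidate_rows = list(permutations(range(n)))

    def extend(rows):
        if len(rows) == n:
            yield tuple(rows)
            return
        for row in candidate_rows:
            # A new row may not repeat a level already used in the same column.
            if all(row[c] != prev[c] for prev in rows for c in range(n)):
                yield from extend(rows + [row])

    yield from extend([])


def feasible_selections(n, occupied_cells):
    """Squares whose n*n selected cells (row, column, level) all contain a city."""
    return [
        square
        for square in latin_squares(n)
        if all((r, c, square[r][c]) in occupied_cells
               for r in range(n) for c in range(n))
    ]


# Hypothetical diagram in which all 27 cells contain at least one city.
all_cells = {(r, c, k) for r in range(3) for c in range(3) for k in range(3)}
print(len(feasible_selections(3, all_cells)))                 # 12
# Blanking a single cell removes every square that would have used it.
print(len(feasible_selections(3, all_cells - {(0, 0, 0)})))   # 8
```

Each additional blank cell removes further squares, which is how a diagram with many blank cells can be left with a single admissible combination.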

The cities in Group A and those selected from the three Latin Squares are given in table 3.

Purposive Selection of Additional Cities

The sample of cities selected randomly from the Latin Square formed a Nation-wide urban sample and met the requirements for calculating estimates for any city in the United States. It did not, however, include a number of cities for which, for particular reasons, individual city data are important. For example, the probability of drawing many of the relatively small cities in some geographical regions, especially the Southwest and Mountain States, in a Nation-wide sample is slight.

[blocks in formation]

Experimentation with Latin Square designs, using geographic regions as a classification factor, indicated that the geographic distribution of cities selected from such designs would not be very different from that of the cities selected from the designs based on climate and income level. Therefore, it was decided that the need for individual data for such cities could be met best by purposive selection.

Furthermore, it was apparent that the variability in expenditure patterns among small places was considerably larger than that among the large cities. For this reason it seemed advisable to expand the coverage of the sample in the small city strata in order to provide estimates of expenditure patterns for small cities of various types. Additional cities outside the Latin Square cells were also needed in order to test estimated expenditure weights derived from the sample cities. To meet these needs, it was decided to survey the largest city in each State, providing the population was 30,500 or more and the State was not already represented by a city of over 30,500 population in the selection from the Latin Square.

1 For further discussion, see Revision of the Consumers' Price Index, Monthly Labor Review, July 1950 (p. 129), and Consumer Expenditure Study, 1950, Monthly Labor Review, January 1951 (p. 56).

See 16th Census of the U. S. 1940 Population, Volume I, Number of Inhabitants, Bureau of the Census and Urbanized Areas, Bureau of the Census, November 15, 1949. Some States do not incorporate places of less than 10,000 population. The Census Bureau designates places in these States as urban if (1) they are made up of towns (townships) containing a village having 2,500 inhabitants or more, or (2) they contain a thickly settled area of 2,500 inhabitants or more which comprises by itself or in combination with other villages within the same town, more than 50 percent of the total population of the town.

Another type of unincorporated area classified as urban by the Census Bureau is made up of townships and other political subdivisions which have a total population of 10,000 or more and a population density of at least 1,000 persons per square mile.

The Census has designated the closely settled urban fringe in and surrounding cities as urbanized areas for the 157 cities which had 50,000 or more inhabitants in 1940. Places are included in these areas if they are contiguous to the central city, or if they are contiguous to an area already included. These places are: (1) incorporated places with 2,500 inhabitants or more; (2) incorporated places with fewer than 2,500 inhabitants provided the incorporated place includes an area with a concentration of 100 dwelling units or more; (3) unincorporated areas with at least 500 dwelling units per square mile; (4) areas devoted to commercial, industrial, transportational, recreational, and other miscellaneous uses functionally related to the central city. In addition, all outlying areas within 1 1/2 miles of a central contiguous urban area measured along the shortest connecting highway are included as are those outlying areas within 1/2 mile of another outlying area which is within 1 1/2 miles of a central contiguous urban area.

The percentage change was obtained as follows: (1) If located in one of the metropolitan areas of which the population was estimated by the Bureau of Census Sample Survey of 1947 (p. 21, No. 35), the percentage derived by Census was used. This percentage was applied to all places in the metropolitan district. (2) Where a special census was taken (since 1946), that figure obtained by the special census was used. (3) All other places were assumed to have increased in population at the same rate as the whole State

In addition, a city was selected randomly, proportionate to size, from each of the 6 cells of the small city classification of Group C (under 86,000) not represented in the original Latin Square. For Group D another Latin Square combination of 16 cells, with no cell of the previous selection included, was selected at random. A city was chosen within each cell of this set giving preference to cities in States not represented or to important regions and areas not covered. These extra cities, shown above, complete the list of the 97 cities in the Survey of Consumer Expenditures.
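Selection "randomly, proportionate to size" means that each city in a cell has a chance of selection proportional to its population. A minimal sketch follows, using the standard library's weighted draw; the city names and populations are hypothetical, and the Bureau's actual drawing procedure is not described in the text.

```python
import random


def select_proportionate_to_size(cities, rng):
    """Draw one city with probability proportional to its population.

    cities: list of (name, population) pairs for one cell.
    rng: a random.Random instance, passed in so the draw can be reproduced.
    """
    names = [name for name, _ in cities]
    populations = [population for _, population in cities]
    return rng.choices(names, weights=populations, k=1)[0]


# Hypothetical cell of three cities (not from the survey).
cell = [("City X", 60_000), ("City Y", 30_000), ("City Z", 10_000)]
rng = random.Random(1950)
print(select_proportionate_to_size(cell, rng))
```

Under these weights City X would be drawn about six times in ten over many repetitions, City Y three times in ten, and City Z once in ten.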

-MARVIN KOGAN

Division of Prices and Cost of Living

after allowance for places coming under (1) and (2) above. State population increase from Bureau of Census Sample Survey (p. 25, No. 12, Aug. 9, 1948).

A comparison was made between the 1947 estimates and the 1950 preliminary population reports of the Census which have just become available. Most of the estimates were within 10 to 15 percent of the 1950 count with the exception of a number of small cities and certain cities located in the western and southwestern portions of the United States. In general, the differences between the estimated and actual figures do not change the relative position of cities with respect to the population-size classes used in the sample selection.

The Census urbanized area consists of a central city or cities by which it is designated and surrounding urban area both incorporated and unincorporated. For 17 of the 157 urbanized areas established by the Census, the actual delineation had not been completed when the selection of cities was made. The metropolitan district definition was used to designate the urban boundary for these 17 areas.

Six of the designated urbanized areas were separated into sub-areas which were considered as more appropriate units for expenditure and price studies. The sub-areas (other than the central city areas) which were treated as separate sampling units in the universe follow:

(1) The New Jersey portion of the New York urbanized area;
(2) The DuPage County, Ill., portion of the Chicago urbanized area;
(3) The Lake County, Ill., portion of the Chicago urbanized area;
(4) Will County, Ill., and Lake and Porter Counties, Ind., portion of the Chicago urbanized area;
(5) The New Jersey portion (other than Camden, N. J.) of the Philadelphia urbanized area;
(6) The New Kensington (and environs in Allegheny and Westmoreland Counties) portion of the Pittsburgh urbanized area;
(7) The Beaver County portion of the Pittsburgh urbanized area;
(8) The extreme northern part of the San Francisco urbanized area consisting of parts of Contra Costa, Solano, and Marin Counties, Calif.;
(9) The extreme southern part of the San Francisco urbanized area consisting of parts of San Mateo, Santa Clara, and Alameda Counties, Calif.;
(10) The Middlesex and Essex Counties, Mass., portion of the Boston urbanized area; and
(11) The extreme southern part of the Boston urbanized area consisting of parts of Norfolk and Plymouth Counties, Mass.

For further information on the Latin Square see R. A. Fisher, The Design of Experiments, 3d Edition (Oliver & Boyd Ltd., London, 1942), Chapter V; particularly p. 86.

Correction of New Unit Bias in Rent Component of CPI

THE UNDERSTATEMENT of the rise in rents during the past decade reflected by the rent component of the Consumers' Price Index, and by the CPI itself, has been corrected and is here described. It arose during the war and postwar years from the failure to reflect the difference between rents charged for new dwellings when they first enter the rental market and those of comparable dwellings already in the market. This difference is equivalent to a price change which properly should be reflected in an index of rents and prices. The 3-year revision program of the CPI, authorized in the fall of 1949, included comprehensive housing studies in each of the 34 city areas covered in the CPI and made the correction possible. From surveys conducted early in 1950, the Bureau of Labor Statistics is now able to announce that the correction to the rent index for the accumulated downward bias for 10 years, from 1940 to 1950, is 5.5 percent of the January 1950 rent index and 0.8 percent of the "all items" index for the 34 cities combined. Applying this correction to the January 1950 index would raise the rent index by 6.8 index points and the all items index by 1.3 index points. The amount of this correction is somewhat higher than the 1949 rough estimate which follows, because it takes into account the very high rate of new rental construction during 1949 and also because the measurement was more accurate.

Several rough estimates of the understatement had previously been made by the Bureau so that users of the CPI could appraise the extent of this "new unit" bias. However, they were not incorporated into the CPI because of the meager data upon which they were based. In July 1949, the Bureau made its last rough estimate that, as a result of this "downward bias" from 1940 to 1949, the rent index in February 1949 was too low by something between 3 1/2 and 5 index points, and that as a result the all-items index was too low by something between 0.6 and 0.9 index points.

Origin of New Unit Bias

The procedure used in making the correction for the "new unit" bias in the rent component of the CPI was of course conditioned by the basic concept of the index and can be clarified by a brief review of how the bias originates.

The CPI measures average changes in retail prices of a bill of goods and services of constant quantities and qualities, purchased by moderate income families. It is designed to show the influence of price changes only, and to exclude the effect of changes in the quantities or qualities purchased. Because of the difficulty of determining which houses are identical in quality, the Bureau has measured changes in rents for samples of identical houses as a means of arriving at the change in rent for dwellings of identical quality. If the rent for a unit is not reported at the beginning and the ending months of the period for which rental change is measured, that unit is excluded from the tabulation.

Additions to the rental market (created by new construction or conversion) do not have an "earlier" rent when they first come onto the market, and therefore the procedures for calculating the index do not reflect the difference in rent between "new" units and comparable existing units. Consequently the price change (between average rents for dwellings in one period and the average rent for identical qualities of housing, including new dwellings, in a later period), which properly should be reflected in the index, is missed.

Normally in a market free from rent controls there is no consistent differential in price between "new" units and comparable existing dwellings. However, during periods of rent control, those market forces which tend to equate the rents for "new" and "old" housing of identical quality are not permitted to function.

Thus, during the war and postwar years, a prolonged period of rent control and housing shortages, additions to the rental market almost always came on the market at higher rents than those for comparable dwellings already in existence. It is the failure of the index to reflect this difference which introduced the consistent downward bias that is referred to as the "new unit bias" in the rent index.
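A small numerical sketch may make the mechanism concrete. It compares a rent relative computed only from units reporting a rent in both months, as the index procedure requires, with the change in average rent when a new unit of comparable quality enters at a higher rent. The rents are hypothetical, all units are assumed to be of identical quality, and the function names are illustrative only.

```python
def matched_sample_relative(rents_start, rents_end):
    """Rent relative over units that report a rent in both months."""
    common_units = rents_start.keys() & rents_end.keys()
    start_total = sum(rents_start[unit] for unit in common_units)
    end_total = sum(rents_end[unit] for unit in common_units)
    return end_total / start_total


def average_rent_relative(rents_start, rents_end):
    """Ratio of average rents, letting new units enter the later average."""
    average_start = sum(rents_start.values()) / len(rents_start)
    average_end = sum(rents_end.values()) / len(rents_end)
    return average_end / average_start


# Hypothetical monthly rents in dollars; existing rents are frozen by control.
start = {"a": 40.0, "b": 40.0, "c": 40.0}
end = {"a": 40.0, "b": 40.0, "c": 40.0, "new": 60.0}

print(matched_sample_relative(start, end))  # 1.0
print(average_rent_relative(start, end))    # 1.125
```

The matched sample shows no change, while the average rent for housing of the same quality has risen 12.5 percent; the gap between the two figures is the kind of understatement the correction is meant to remove.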

At the same time, the Bureau has been unable to bring up to date frequently the sample of tenant dwellings from which rental data are obtained. Newly built rented dwellings are drawn into the samples only when a new sample is drawn. Since
