## Real Estate Appraisals: Art or Science?

I sometimes jokingly remark to my wife, Erin, that “appraising is much like building a house or making sausage…you don’t always want to see how they’re made.” Like the imperfections hidden behind painted drywall in the building process or the spooning cuts of Italian sausage neatly presented at the meat counter…the final appraisal report is often the product of a messy process. Few people realize that there is no such thing as being unbiased. Without exception, everyone is biased. Everyone sees life from a unique perspective. The simple act of choosing which facts to present in an appraisal report IS bias. The best we can do is to recognize our biases and be aware of them when appraising real estate. Which brings me to the point of this post.

While most of us who love to crunch numbers would like to reduce the appraisal process to a scientific endeavor, it will never work. Why? Well, for a number of reasons. Among them are the inability to predict human behavior, the lack of information, sampling error, etc. I could give you more reasons. Instead, I’ll demonstrate my point. The scatter graph below plots the sales prices of over 900 home sales in Hardin County, KY, in chronological order.

## Measuring Central Tendency

As shown in the graph, the prices appear to be quite random. That’s because they are very recent sales sorted only by chronological order. What happens during the appraisal process is that “comparable” sales are selected and adjusted for various factors to reflect the “most probable” price a buyer might pay for the property being appraised. Dealing with large volumes of data naturally lends itself to statistical analysis and in particular to regression analysis. So I ran a regression of these sales and developed a model to adjust each sale for differences with a given property. The results are shown in the graph below.
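For readers who want to see the mechanics, here is a minimal sketch of the idea of adjusting each sale toward a given property with a regression equation. It is not the model behind the graphs in this post: the six sales, the subject property, the two variables (living area and age) and the helper names `solve`, `fit_ols` and `adjust_to_subject` are all made up for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, prices):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, prices)) for i in range(k)]
    return solve(xtx, xty)


def adjust_to_subject(price, comp, subject, coefs):
    """Adjust a sale price for the differences between the comp and the subject."""
    return price + sum(b * (s - c) for b, s, c in zip(coefs[1:], subject, comp))


# Hypothetical sales: ((living area in sq ft, age in years), sale price).
sales = [
    ((1800, 10), 305000),
    ((2200, 5), 362000),
    ((1500, 30), 248000),
    ((2600, 2), 410000),
    ((2000, 15), 318000),
    ((2400, 20), 351000),
]
subject = (2100, 12)  # hypothetical subject property

coefs = fit_ols([features for features, _ in sales], [price for _, price in sales])
for features, price in sales:
    adjusted = adjust_to_subject(price, features, subject, coefs)
    print(f"sold {price:,} -> adjusted {adjusted:,.0f}")
```

Each adjusted price is the sale price plus the regression coefficient times the difference between the subject and the sale, summed over the variables. The adjusted prices cluster more tightly than the raw ones, but they do not collapse to a single number, because each sale keeps its own residual.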

As you can see, the indicated price range has narrowed considerably. The central tendency appears to be somewhere between $300,000 and $400,000. In fact, applied to the subject property, the predicted price of the regression equation is around $354,000. However, as you can see, there is still a relatively wide range in potential values. Ideally, the property with the fewest differences with the subject property would yield the most accurate value indication. So I sorted these sales from the lowest to the highest gross adjustments. These results are presented in the graph below.
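The sorting step can be sketched in a few lines. Gross adjustment is taken here as the sum of the absolute values of the individual adjustments, while the net (signed) sum is what gets applied to the price. The adjustment rates, the subject and the comparable sales below are hypothetical, not figures from the Hardin County data.

```python
# Hypothetical adjustment rates in dollars per unit of difference.
RATES = {"gla": 90.0, "age": -1200.0, "acres": 15000.0}


def adjustments(comp, subject, rates=RATES):
    """Dollar adjustment for each variable, moving the comp toward the subject."""
    return {k: rates[k] * (subject[k] - comp[k]) for k in rates}


def rank_by_gross(comps, subject, rates=RATES):
    """Return the comps sorted from lowest to highest gross adjustment."""
    ranked = []
    for comp in comps:
        adj = adjustments(comp, subject, rates)
        ranked.append({
            "id": comp["id"],
            "gross": sum(abs(v) for v in adj.values()),  # size of all changes
            "adjusted_price": comp["price"] + sum(adj.values()),  # net effect
        })
    return sorted(ranked, key=lambda r: r["gross"])


# Hypothetical subject and sales.
SUBJECT = {"gla": 2100, "age": 12, "acres": 1.0}
COMPS = [
    {"id": "A", "price": 305000, "gla": 1800, "age": 10, "acres": 0.8},
    {"id": "B", "price": 362000, "gla": 2200, "age": 5, "acres": 1.0},
    {"id": "C", "price": 248000, "gla": 1500, "age": 30, "acres": 2.5},
    {"id": "D", "price": 410000, "gla": 2600, "age": 2, "acres": 1.2},
    {"id": "E", "price": 318000, "gla": 2000, "age": 15, "acres": 1.0},
]

most_similar = rank_by_gross(COMPS, SUBJECT)[:3]
prices = [r["adjusted_price"] for r in most_similar]
for r in most_similar:
    print(f'{r["id"]}: gross {r["gross"]:,.0f}, adjusted {r["adjusted_price"]:,.0f}')
print(f"range {min(prices):,.0f} to {max(prices):,.0f}, mean {sum(prices) / len(prices):,.0f}")
```

Even among the sales with the smallest gross adjustments, the adjusted prices form a range, not a single value, and choosing a point within that range is still a judgment call.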

As you can see in the chart, there still appears to be considerable variation in price, even for the sales most like the subject property. In fact, here are the predicted prices of the ten most similar sales using the regression analysis.

## Appraisals are BOTH Art and Science

The sales with the fewest adjustments still range from less than $300,000 to over $500,000, with an arithmetic mean of just over $400,000. An experienced and honest appraiser should know about where within this range the value should fall. But this still introduces bias into the equation. My only point is this: As you can see, statistics is a very useful tool for analyzing property. But it has its limitations. The first graph presented above is an excellent picture of the central tendency for this property. However, there is still a very wide range of indicated values using regression analysis.

There is still the unpredictable human factor…and the volume of data factor…and the inability to verify 900 sales factor to deal with. There will always be some degree of error, or uncertainty, in appraising. That’s why appraising is as much art as it is science.

Wyatt Roberts says

I agree that there are limitations to multiple variable linear regressions. That said, I think I may be more optimistic about the reliability of results that can be obtained. So my question is, what were your independent variables on the dataset? What, if any, demographic variables did you use?

RussellRoberts says

Wyatt, the independent variables were: Date of Sale; Total GLA; Age; bedrooms; bathrooms; basement (y/n); acres; Garage (y/n); Finished BSMT SF; Total Bsmt SF.

In hindsight, I would have omitted DOS (low correlation) and would have omitted the last two basement categories.

This equation had an R-squared around 84%-85% – I don’t remember specifically. The highest I’ve been able to achieve was on a relatively limited dataset which was above 90% – using similar independent variables. As you alluded, I think there are some demographic variables that could increase the R-squared. I’m betting that the median household income for the census block group may do just that. There are several others that might bump up that R-squared – vacant housing, rent levels, etc.

A program with an iteration loop to maximize the R-squared value based on various physical characteristics and economic demographics would be cool. Just thinking out loud here…
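A rough sketch of what such a loop might look like, assuming greedy forward selection and adjusted R-squared as the score. Plain R-squared never goes down when a variable is added, so maximizing it directly would simply keep every variable. The variable names and all figures below are hypothetical, and the helpers `solve`, `r_squared` and `forward_select` are illustrative names, not part of any package.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y):
    """Return (R^2, adjusted R^2) for a least-squares fit with an intercept."""
    xs = [[1.0] + list(r) for r in rows]
    n, k = len(y), len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * v for x, v in zip(xs, y)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - sum(b * xi for b, xi in zip(beta, x))) ** 2
                 for x, v in zip(xs, y))
    r2 = 1 - ss_res / ss_tot
    return r2, 1 - (1 - r2) * (n - 1) / (n - k)


def forward_select(data, y, candidates):
    """Add one variable at a time while adjusted R^2 keeps improving."""
    chosen, best = [], float("-inf")
    improved = True
    while improved:
        improved = False
        for name in candidates:
            if name in chosen:
                continue
            cols = chosen + [name]
            _, adj = r_squared([[d[c] for c in cols] for d in data], y)
            if adj > best + 1e-9:
                best, pick, improved = adj, name, True
        if improved:
            chosen.append(pick)
    return chosen, best


# Hypothetical sales: GLA in thousands of sq ft, age in years,
# block-group median income in $10,000s, price in $1,000s.
names = ["gla", "age", "income"]
raw = [(1.8, 10, 5.5, 310), (2.2, 5, 6.0, 365), (1.5, 30, 4.0, 240),
       (2.6, 2, 7.5, 450), (2.0, 15, 5.0, 320), (2.4, 20, 6.5, 385),
       (1.6, 25, 4.2, 255), (2.8, 8, 7.0, 440)]
data = [dict(zip(names, row[:3])) for row in raw]
price = [row[3] for row in raw]

selected, score = forward_select(data, price, names)
print(selected, round(score, 4))
```

A loop like this only ranks variables by in-sample fit, so a high score by itself does not show that the model will predict new sales well. The sketch also assumes no two candidate variables are perfectly collinear; if they were, the elimination step would divide by zero.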

Wyatt Roberts says

3:28 am? What’s wrong with you, man???

RussellRoberts says

I guess I couldn’t sleep! :)

Wyatt Roberts says

This morning, I was thinking about this in the shower (where I do some of my best thinking). All of your independent variables concern the physical characteristics of the home. If your dataset is limited to a particular subdivision (or a group of very similar subdivisions), that would get you very close to an R^2 of 1. In that case, geography variables would probably have minimal effect, if any. However, in larger datasets having a great deal of geographical diversity, such as the one in your example, I think variables related to geography would improve the correlation. In a word, location. That would pertain to both geography and the demographics that could be attributed to geography. Is the property within the city limits? If not, how far is it? Drive time to work? What’s the median household income and per capita income in the area? What’s the vacancy rate within the neighborhood? How much multi-family housing is in the neighborhood? Individually, those variables may have very low correlations to home price, but I think your R^2 is going to improve if you include these in your independent variables.

You know all this, of course. Just thinking out loud.

RussellRoberts says

I absolutely agree, Wyatt. The best sales are those sales subject to the same, or similar, economic influences. My problem with using economic data has been the difficulty with entering the data for the individual sales. But you may have solved this problem, you statistical stud.

Russell Roberts says

Michael, I agree with your comments. I think regression is a single, but very useful, tool we have in the toolbox to understand and explain how the market tends to respond to various situations. As you pointed out, it has very little use for non-homogeneous property types. But it goes a long way to show, for example, how the market responds to larger building sizes.

That is exactly what this article is about – using regression as a tool but understanding that it can have some serious limitations.

Michael Jones says

I came across this thread and thought I would add my two cents’ worth. Thirty years ago, when regression analysis was supposed to be the answer, I bought a fancy regression program (don’t remember what it was), pumped a lot of data into it, and got a lot of mixed results that did not make much sense. After about 6 months of entering more data, deleting data, massaging the data, having it looked at by an expert, and finally getting so-called acceptable results that indicated a P value of less than .1, it dawned on me that this is nuts. The appraiser is supposed to simulate the market; in other words, put yourself in the shoes of the buyers and sellers. Buyers and sellers don’t use regression analysis, so why am I trying to make this a “science” when it clearly isn’t? Most buyers and sellers don’t know what regression analysis is and don’t want to know! Can a mathematical process calculate the unpredictable and sometimes irrational behavior of buyers and sellers in an imperfect market? Not really! It will give you an answer that is as good as the data going in, but that answer is still merely an opinion of what the adjustment should be. Better and more reliable results can be gotten from percentage of cost studies, cost differences, surveys, matched pairs, and my favorite – grouping analysis.

Fast forward 30 years and here I am, trying once again to utilize regression analysis because some non-appraisers further up the food chain are trying to make appraising a science. The data entry part is now easier with MLS exports, and it produces a lot of very scientific-looking results along with some sophisticated and colorful graphs and charts. But when you step back, look at it, and ask yourself if the results make sense, the answer is the same as it was 30 years ago. I find that regression analysis only works well when you have lots of really similar sales from the same or similar subdivisions with all outliers removed. The problem is I can use matched pairs in these subdivisions and get equally good, if not better, results. I have tried over and over to use regression analysis for the more complex submarkets. Unfortunately, it does not work well in rural non-homogeneous markets, urban areas, or for High Value Residential properties. I keep coming back to the primary objective of the appraiser, and that is to simulate the market (buyers’ and sellers’ motivations and expectations), which still does not include the use of regression analysis.