Document Type

Working Paper

Publication Date

8-7-2015

College/Unit

College of Business and Economics

Document Number

15-34

Department/Program/Center

Economics

Abstract

This paper incorporates text data from Multiple Listing Service (MLS) listings in Atlanta, GA, into a hedonic pricing model. Adding the text reduces pricing error by more than 25%. The text enters a linear model through a tokenization approach, which allows the implicit prices of individual words and phrases to be estimated. The estimation focuses on simultaneous variable selection and estimation for linear models with a large number of variables. The LASSO procedure and its variants are shown to outperform least squares in out-of-sample testing.
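
As a rough illustration of the approach the abstract describes, the sketch below turns listing remarks into 0/1 word indicators and fits a LASSO by coordinate descent. It is not the paper's code: the remarks, the log prices, the penalty value `lam`, and the helper names (`tokenize`, `build_design`, `soft_threshold`, `lasso`) are all hypothetical, and the objective is assumed to be (1/2n) * squared error + lam * sum of absolute coefficients with an unpenalized intercept.

```python
import re

def tokenize(remark):
    """Lowercase a listing remark and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", remark.lower()))

def build_design(remarks, vocab):
    """Build one 0/1 indicator column per vocabulary word."""
    return [[1.0 if w in tokenize(r) else 0.0 for w in vocab] for r in remarks]

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - a - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    beta = [0.0] * p
    resid = [yi - intercept for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(x * x for x in col) / n
            if norm == 0.0:
                continue  # word never appears, leave its coefficient at 0
            # Covariance of column j with the residual, adding back j's own fit.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / norm
            resid = [r - x * (new - beta[j]) for x, r in zip(col, resid)]
            beta[j] = new
        # Re-center: the intercept is not penalized.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta

# Hypothetical toy data: six made-up remarks and made-up log sale prices.
remarks = [
    "renovated kitchen with granite counters",
    "needs work sold as is",
    "granite counters and hardwood floors",
    "investor special needs work",
    "renovated hardwood floors",
    "sold as is",
]
log_price = [12.4, 11.3, 12.5, 11.2, 12.3, 11.4]

vocab = sorted(set().union(*(tokenize(r) for r in remarks)))
X = build_design(remarks, vocab)
intercept, beta = lasso(X, log_price, lam=0.05)

# Words with a nonzero coefficient are the ones the penalty kept.
selected = {w: round(b, 3) for w, b in zip(vocab, beta) if b != 0.0}
print(selected)
```

Each surviving coefficient plays the role of an implicit price for its word, in log-price units. Setting `lam` to 0 removes the penalty and gives a least-squares fit, which is the baseline that an out-of-sample comparison would use.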

Included in

Economics Commons
