Data Analysis with Stata

Automobile MPG and Price

This tutorial shows how to use Stata to produce Figure 2 below.

Step 1

Load the following dataset into Stata using the sysuse command.

sysuse auto, clear

The sysuse command loads into memory one of the example datasets installed with Stata. These datasets are very useful for experimenting with commands. To see a list of all available example datasets, type sysuse dir.
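Before moving on, it can help to look at what was loaded. The following is a minimal sketch, not part of the original steps, that lists the example datasets and then inspects the three variables this tutorial uses (make, price, and mpg).

```stata
* List the example datasets that sysuse can load
sysuse dir

* Load the auto dataset, replacing whatever is in memory
sysuse auto, clear

* Show storage types and labels for the variables used in this tutorial
describe make price mpg

* Summary statistics for the two numeric variables
summarize price mpg
```

The describe output should show that make is a string variable, which is why a string function is needed in Step 2.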

Step 2

Use the string function word() to extract the automaker, which is the first word of each value of the variable make, and save it in a new variable called automaker.

gen automaker = word(make,1)

To see a list of all the string functions available in Stata, type help string.
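To see how word() behaves before applying it to the whole variable, you can try it on a single string. This is a small sketch using "Buick Century", one of the values of make in the auto dataset; the second argument selects which word to return, and a negative number counts from the end of the string.

```stata
* First word of the string
display word("Buick Century", 1)
* → Buick

* Second word of the string
display word("Buick Century", 2)
* → Century

* A negative index counts from the end, so -1 is the last word
display word("Buick Century", -1)
* → Century

* Apply it to the dataset and check the resulting groups
sysuse auto, clear
gen automaker = word(make, 1)
tabulate automaker
```

The tabulate output shows how many models fall under each automaker, which is a quick way to confirm that the extraction worked as intended.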

Step 3

Next, collapse the data to obtain the average price and mpg across models for each automaker.

collapse (mean) avg_price = price avg_mpg = mpg, by(automaker)
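Note that collapse replaces the dataset in memory with the collapsed one. If you only want to preview the result, one option is to wrap the command in preserve and restore. The sketch below does that and also adds a count of models per automaker; the variable name n_models is an illustrative choice and is not used elsewhere in this tutorial.

```stata
sysuse auto, clear
gen automaker = word(make, 1)

* Save a temporary copy of the data in memory
preserve

* One row per automaker: mean price, mean mpg, and number of models
* (n_models is a hypothetical name chosen for this example)
collapse (mean) avg_price = price avg_mpg = mpg ///
    (count) n_models = price, by(automaker)

list automaker n_models avg_price avg_mpg, clean noobs

* Bring back the original, uncollapsed data
restore
```

To continue with Step 4, run the collapse command from Step 3 without preserve and restore, so that the collapsed data stay in memory.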

Step 4

Produce the graph using the twoway scatter command.

twoway scatter avg_mpg avg_price, mlabel(automaker) mlabangle(45) ytitle("Average MPG across models") xtitle("Average price across models ($)") title("Figure 2")

As long as you are working within a do-file, you can break up lengthy commands across multiple lines using three slashes (///).

twoway scatter avg_mpg avg_price, ///
mlabel(automaker) mlabangle(45) ///
ytitle("Average MPG across models") ///
xtitle("Average price across models ($)") ///
title("Figure 2")
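An alternative that some users prefer, and that is not part of the original tutorial, is the #delimit command. It changes the end-of-command marker from a line break to a semicolon, so a long command can span several lines without ///. Like ///, it only works inside a do-file. A minimal sketch:

```stata
sysuse auto, clear
gen automaker = word(make, 1)
collapse (mean) avg_price = price avg_mpg = mpg, by(automaker)

* From here on, each command ends at a semicolon
#delimit ;

twoway scatter avg_mpg avg_price,
    mlabel(automaker) mlabangle(45)
    ytitle("Average MPG across models")
    xtitle("Average price across models ($)")
    title("Figure 2");

* Switch back to one command per line
#delimit cr
```

The three-slash style is usually easier to mix with ordinary one-line commands, while #delimit can be convenient when a do-file contains many long graph commands in a row.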

You can export the graph to an external format like PNG or PDF.

graph export automakers_mpg_price.png, replace
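Stata infers the output format from the file extension, so exporting to PDF only requires changing the suffix. The sketch below rebuilds the graph and exports it both ways; the width(2000) value is an arbitrary illustration of how a higher-resolution PNG might be requested, and PDF export may not be available in older Stata versions on every platform.

```stata
sysuse auto, clear
gen automaker = word(make, 1)
collapse (mean) avg_price = price avg_mpg = mpg, by(automaker)
twoway scatter avg_mpg avg_price, mlabel(automaker) mlabangle(45)

* Vector format; the file type is taken from the .pdf extension
graph export automakers_mpg_price.pdf, replace

* PNG with an explicit width in pixels (2000 is an arbitrary example value)
graph export automakers_mpg_price.png, width(2000) replace
```

Both files are written to the current working directory, which you can check with the pwd command.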