# Linear Model Function

The `lm()` function (short for "linear model") is a function in base R that, as the name suggests, fits a linear model. The model can include multiple variables, as well as interaction terms and squared terms. A brief discussion of its use is provided below.

## Syntax

Much like we use '==' instead of just '=' in an 'if' statement, we do not use a lone equals sign when writing equations in R; instead we use a '~' (tilde), located at the top left of most US keyboards. The variable being modeled goes on the left of the '~' and the terms used to explain it go on the right. Otherwise, our equations follow the same format as when we write them by hand.
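As a small illustration of the '~' syntax, the sketch below stores an equation as a formula object and inspects it. The name `f` is only an illustrative choice; `mpg` and `wt` are columns of the built-in `mtcars` dataset.

```r
# A formula is an ordinary R object; creating one does not look up the variables yet
f = mpg ~ wt

class(f)     # the object's class is "formula"
all.vars(f)  # variable names in the formula: response first, then predictor
```

Because nothing is evaluated when the formula is created, it can be written before any data is supplied.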

## Create a Linear Model

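The `lm()` function is typically called with two arguments, as shown below:

`lm(formula, data)`
• formula: the equation you are using to create your model
• data: the name of the dataset (a data frame) from which you are building the model

Because `formula` comes first in the function's definition, name the `data` argument explicitly (for example, `lm(data = mtcars, mpg ~ wt)`) if you prefer to write the dataset first.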
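A minimal sketch of how the arguments are matched, using the built-in `mtcars` dataset: in R's definition of `lm()`, `formula` is the first parameter and `data` the second, so naming `data` lets the two be written in either order. The names `fitA` and `fitB` are illustrative.

```r
# Positional form: formula first, then the data frame
fitA = lm(mpg ~ wt, mtcars)

# Named form: with data named, the unnamed formula can come second
fitB = lm(data = mtcars, mpg ~ wt)

# Both calls fit the same model, so the coefficients match
all.equal(coef(fitA), coef(fitB))
```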

For this discussion, we will be using the mtcars dataset, included in base R, to demonstrate.
Let's begin by modeling a car's miles per gallon (mpg) as a function of its weight (wt):

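```r
library(ggplot2)

# Fit mpg as a linear function of weight, then print the model
myModel = lm(data = mtcars, mpg ~ wt)
myModel

# Plot the data in red and draw the fitted line
# (intercept and slope copied from the printed model)
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point(col = 'red') +
  geom_abline(intercept = 37.285, slope = -5.344)
```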

As you can see, creating a model with R is very simple. Also note that we do not need to use a `$` or enter the variable names as strings within the `lm()` function.
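Rather than copying the intercept and slope from the printed output, the estimates can be pulled from the model object. The following is a small sketch using `coef()` from base R; the name `b` and the 3,000 lb example car are assumptions made for illustration.

```r
myModel = lm(data = mtcars, mpg ~ wt)

# Named vector of estimates, with entries "(Intercept)" and "wt"
b = coef(myModel)
round(b, 3)

# Predicted mpg for a hypothetical car weighing 3,000 lbs
# (wt in mtcars is recorded in units of 1000 lbs)
unname(b["(Intercept)"] + b["wt"] * 3)
```

These values could also be passed to `geom_abline()` in place of the numbers typed by hand.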

The `fmodel()` function from the 'statisticalModeling' package is useful for plotting the line of the equation and requires only the name of the model (`myModel` in this case). However, it does not include the data points in its plot, so we use ggplot here to make the points easily visible for this demonstration.

## Multiple Terms

While the mtcars data serves as a simple example, the data you'll encounter in the workplace is far more complex, and a single variable is often not enough to build a useful model. Luckily, adding variables to our model is quite simple: we use the '+' symbol, followed by our new term, as demonstrated below:

```r
# Model mpg using both weight and horsepower
myModel = lm(data = mtcars, mpg ~ wt + hp)
myModel
```
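To see what the extra term changes, one option is to compare the one-variable and two-variable models side by side. This sketch uses the base R functions `coef()` and `summary()`; the names `oneTerm` and `twoTerms` are illustrative.

```r
oneTerm = lm(data = mtcars, mpg ~ wt)
twoTerms = lm(data = mtcars, mpg ~ wt + hp)

# The larger model has one coefficient per term, plus the intercept
names(coef(twoTerms))

# R-squared: the share of the variation in mpg explained by each model
summary(oneTerm)$r.squared
summary(twoTerms)$r.squared
```

Adding a term to a least-squares model never lowers R-squared, so a higher value on its own does not show that the new term is worth keeping.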

## Higher Order Terms

However, sometimes a relationship between variables isn't purely linear: it may have a quadratic term, or perhaps the effect of one variable depends on the value of another, and we need an interaction term in the model. While we could code these into our dataset as new columns using R and dplyr, we do not need to do that here.
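For the quadratic case, one common approach (sketched here, not the only option) is to wrap the squared term in `I()`. Inside a formula, `^` is treated as formula syntax rather than arithmetic, and `I()` tells R to evaluate the expression as ordinary math. The name `quadModel` is illustrative.

```r
# I() makes R treat wt^2 as arithmetic instead of formula syntax
quadModel = lm(data = mtcars, mpg ~ wt + I(wt^2))

# One coefficient each for the intercept, wt, and the squared term
names(coef(quadModel))
```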

To create an interaction term, we can use one of two methods: if we wish to include both terms and their interaction in the model, we can simply use the asterisk:

```r
# wt * hp expands to wt + hp + wt:hp
myModel = lm(data = mtcars, mpg ~ wt * hp)
myModel
```

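However, if we wish to include only one (or neither) of the individual terms alongside the interaction, we use a colon for the interaction term instead. The colon adds only the interaction itself, so any individual term we want must be added separately with '+':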

```r
# Only the wt-by-hp interaction, without wt or hp on their own
myModel = lm(data = mtcars, mpg ~ wt : hp)
myModel
```
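The difference between the asterisk and the colon is easiest to see by listing the coefficients each formula produces. In this sketch the names `starTerms`, `colonTerms`, and `mixedTerms` are illustrative.

```r
# Asterisk: both variables plus their interaction
starTerms = names(coef(lm(data = mtcars, mpg ~ wt * hp)))
starTerms

# Colon alone: only the interaction (plus the intercept)
colonTerms = names(coef(lm(data = mtcars, mpg ~ wt : hp)))
colonTerms

# One variable plus the interaction
mixedTerms = names(coef(lm(data = mtcars, mpg ~ wt + wt:hp)))
mixedTerms
```

The first model contains `wt`, `hp`, and `wt:hp`; the second contains only `wt:hp`; the third contains `wt` and `wt:hp`.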