MARYLAND MATH 107 Curve-fitting Project Linear Regression Model updated
Curve-fitting Project – Linear Regression Model
A. Summary

For this assignment you will be collecting data which exhibits a relatively linear trend, finding the line of best fit, plotting the data and the line, interpreting the slope, and using the linear equation to make a prediction. You will also find r² (coefficient of determination) and r (correlation coefficient). Finally, you will write a report discussing your findings. Your topic may be related to sports, your work, a hobby, or something you find interesting. If you choose, you may use the suggestions described below.

There are two assignments for you to complete with respect to this project:

1. A proposal for your project in a posting online in the Linear Model Project Proposal discussion group. In addition to describing your topic in a few sentences, your posting must include your data, how/where you obtained your data, and a rough scatterplot.
2. A report in which you document the linear regression you did on your data, what you found, and predictions based on your results.
B. Background

A linear regression is a technique for examining real-world data to determine whether the data follow a linear model. In other words, given some data points, can we reliably use a line to model the points and make predictions? There are tools available which will find the line that best approximates a set of data points. The tools also provide a measure of how well the line fits the data values. If a line exists that is a good fit, then we can use the line to make predictions for values we do not have.

There are a variety of reference materials available to help you complete the project.

• Your textbook has a brief introduction to mathematical models on pages 114–117.
• The following YouTube video is an introduction to linear regression. It provides background and motivation rather than showing how to actually compute a linear regression. Introduction to Linear Regression
• Suzanne Sands (a teacher at UMUC) has made two video tutorials that show you how to compute a linear regression using Excel. See: Excel Linear Regression Tutorial #1 and Excel Linear Regression Tutorial #2
• Suzanne has also done a video on using a free online tool (www.meta-calculator.com) to do a linear regression. See: Online Linear Regression Tutorial
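For reference, the "best" line that such tools report is usually the least-squares line, the one that minimizes the sum of squared vertical distances between the data points and the line. The formulas below are the standard textbook definitions, not something taken from the course materials; the symbols n (number of points), x_i, y_i (the data), m (slope), and b (intercept) are notation assumed here for illustration.

```latex
\[
\hat{y} = m x + b, \qquad
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
b = \bar{y} - m\,\bar{x}
\]
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
```

Here \bar{x} and \bar{y} are the means of the x and y values. For a line fitted to one explanatory variable, the coefficient of determination is simply the square of r.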
C. Instructions

For this assignment, collect data exhibiting a relatively linear trend, find the line of best fit, plot the data and the line, interpret the slope, and use the linear equation to make a prediction. Also, find r² (coefficient of determination) and r (correlation coefficient). Discuss your findings. Your topic may be related to sports, your work, a hobby, or something you find interesting. Several suggested topics are provided at the end of these instructions.

1. Describe your topic, provide your data, and cite your source. You must have at least 8 data points for this project. Post this information in the Linear Model Project Proposal (see the discussion group for a detailed list of requirements for this posting). This summary is also the first part of your project report. Each student must use different data. The idea with the discussion posting is two-fold: (1) to share your interesting project idea with your classmates, and (2) to give me a chance to give you a brief thumbs-up or thumbs-down about your proposed topic and data. Sometimes students get off on the wrong foot or misunderstand the intent of the project, and your posting provides an opportunity for some feedback. Remark: Students may choose similar topics, but must have different data sets. For example, several students may be interested in a particular Olympic sport, and that is fine, but they must collect different data, perhaps from different events or for a different gender.
2. Plot the points (x, y) to obtain a scatterplot. Use an appropriate scale on the horizontal and vertical axes and be sure to label the axes carefully, including units. Visually judge whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different topic or data set.)
3. Find the line of best fit (regression line) and graph it on the scatterplot. The equation of the line must be included on the graph or in the text.
4. State the slope of the line of best fit. Carefully interpret the meaning of the slope in a sentence or two.
5. Find and state the value of r², the coefficient of determination, and r, the correlation coefficient. Discuss your findings in a few sentences. Is r positive or negative? Why? Is a line a good curve to fit to this data? Why or why not? Is the linear relationship very strong, moderately strong, weak, or nonexistent?
6. Choose a value of interest and use the line of best fit to make an estimate or prediction. Show calculation work.
7. Write a brief narrative of a paragraph or two. Summarize your topic (same information that you posted online at the beginning of the project) as well as your findings. Be sure to mention any aspect of the linear model project (topic, data, scatterplot, line, r, or estimate, etc.) that you found particularly important or interesting. Do not just mimic what I have said in my sample project; thoughtfully describe your own project.

Items #1-#7 constitute your project report. You may submit all of your project report in one document or a combination of documents, which may consist of word processing documents or scanned handwritten work, provided it is clearly labeled where each task can be found. If you used Excel or other spreadsheet software to do the graphs, you must copy the resulting graphs into your word processing document. In the past, students have tried to hand in projects with the text portion written in a spreadsheet. This is confusing and poorly presented and will no longer be accepted. Be sure to include your name. Projects are graded on the basis of completeness, correctness, and strength of the narrative portions. While mathematics work can be hand-written, any descriptions, sentences, or paragraphs must be typed!
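The computational steps above (line of best fit, slope, r, r², and a prediction) can be sketched in a few lines of Python. This is only an illustration of the standard least-squares calculation, not the method the course requires; Excel or the online tool mentioned earlier do the same job. The function name linear_regression, the data values, and the prediction point x_new are all hypothetical and invented for this example.

```python
from math import sqrt


def linear_regression(xs, ys):
    """Return (slope, intercept, r) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data, invented for illustration only (8 points, the project minimum).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 5, 6, 9, 11, 12, 15, 17]

slope, intercept, r = linear_regression(xs, ys)
r_squared = r ** 2

# Use the fitted line to predict y at a value of interest (hypothetical choice).
x_new = 10
y_pred = slope * x_new + intercept

print(f"Line of best fit: y = {slope:.3f}x + {intercept:.3f}")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"Predicted y at x = {x_new}: {y_pred:.2f}")
```

In a report, the slope would then be interpreted in words: with this made-up data, y rises by about 2 units for each 1-unit increase in x. The sign of r matches the sign of the slope, and a value of r² close to 1 suggests that a line is a reasonable model for the data.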
D. Suggested Topics

You are welcome to use a topic of your own. Several ideas are listed below. If you are using your own topic, it is important to note that your topic cannot involve a physical law that is defined to be linear. For example, an inappropriate choice for a topic would be to relate the time it takes to travel somewhere with the distance travelled. The reason this is an inappropriate choice is that physical laws tell us that distance = speed × time, which is a linear relationship. Further, since we already know the equation of the line, doing a linear regression for this case is not interesting!

Another example of an inappropriate choice for a topic is data that exhibits a linear trend but has no apparent cause to do so. For example, if you graph the divorce rate in Maine vs. the consumption of margarine, you will find these values correlated. This is an example of an inappropriate topic for the project because we have no reason to believe that margarine causes divorces!

The goal of this project is to use data that appears to be roughly linear, but where the formula or equation is not known ahead of time, and to show how the data can be modelled with a line found through linear regression.

• Choose an Olympic sport and an event that interests you. Go to http://www.databaseolympics.com/ and collect data for winners in the event for at least 8 Olympic games (dating back to at least 1980). (Example: Winning times in Men’s 400 m dash.) Make a quick plot for yourself to “eyeball” whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different event.) After you find the line of best fit, use your line to make a prediction for the next Olympics (2014 for a winter event, 2012 or 2016 for a summer event). NOTE: Not all Olympic events lend themselves to this type of analysis. For instance, downhill skiing times from different Olympics cannot be compared because the race courses can be very different, unlike swimming events where the same swimming pool specifications are used with each Olympics.
• Choose a particular type of food. (Examples: Fish sandwich at fast-food chains, cheese pizza, breakfast cereal.) For at least 8 brands, look up the fat content and the associated calorie total per serving. Make a quick plot for yourself to “eyeball” whether the data exhibit a relatively linear trend. (If so, proceed. If not, try a different type of food.) After you find the line of best fit, use your line to make a prediction corresponding to a fat amount not occurring in your data set. Alternative: Look up carbohydrate content and associated calorie total per serving.
• Choose a sport that particularly interests you and find two variables that may exhibit a linear relationship. For instance, for each team for a particular season in baseball, find the total runs scored and the number of wins. Excellent websites: http://www.databasesports.com/ and http://www.baseballreference.com/
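The earlier warning in this section about topics governed by an exact linear law can be checked with a short sketch. Assuming a hypothetical constant speed of 60 miles per hour and made-up travel times, the distances fall exactly on a line, so the regression simply returns the formula we already knew and r² comes out as 1. The function name correlation and all values below are invented for illustration.

```python
from math import sqrt


def correlation(xs, ys):
    """Return the correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical values: travel times in hours at an assumed constant 60 mph.
speed_mph = 60
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
miles = [speed_mph * t for t in hours]  # distance = speed × time, exactly linear

r = correlation(hours, miles)
print(f"r^2 = {r ** 2:.4f}")
```

Real project data should not behave this way: some scatter around the line is expected, and discussing how much scatter there is (through r and r²) is part of the report.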