| > |
lab4.mws --- Fitting Linear, Exponential, Power, and Logarithmic Functions to Data
| > |
| > | restart; with( plots ): with( stats ): with( student ): |
| > |
Lab Overview
The purpose of this lab is for you to find the best linear (
), exponential (
), power (
), and logarithmic (
) function that fits a given set of data. In the process of finding these four functions, you should see that - in most instances - one of these four functions is better than the other three.
This should provide you with some practice applying the properties of exponentials and logarithms in addition to being helpful in other courses that you are taking (or will take).
After a short review of exponential and power functions, we will look at some sample data and determine the best power or exponential fit for the data. You will then repeat this analysis on several additional sets of data. It is not necessary for you to complete the analysis of all of the data sets. Do only as much work as is needed to answer the questions. Note that some data sets will need to be adjusted prior to performing some of the analysis.
Note that this analysis is related to the uses of (semi-)log and log-log plots, but this topic will not be explored in detail here.
| > |
Auxiliary Procedures (execute but do not change)
| > | slope := proc( F ) if type( F, equation ) then return lcoeff( rhs(F) ) else return lcoeff( F ) end if end proc: |
| > | intercept := proc( F ) if type(F,equation) then return tcoeff( rhs(F) ) else return tcoeff( F ) end if end proc: |
| > | bestlinearfit := (X,Y)->fit[leastsquare[[x,y]]]([evalf(X),evalf(Y)]): |
| > | dataplot := proc( X, Y ) local bestlinfit, opts, p1, p2, xlo, xhi; if nargs>2 then opts:= args[3..-1] else opts := NULL end if; p1 := plot( zip( (x,y)->[x,y], X, Y ), style=point ); bestlinfit := rhs(bestlinearfit(X,Y)); xlo := min(op(X),0); xhi := max(op(X),0); p2 := plot( bestlinfit, x=xlo..xhi, color=blue ); return display( [p1,p2], opts ) end proc: |
| > | datafitplot := proc( X, Y, F ) local bestlinfit, f, opts, p1, p2, xlo, xhi; if nargs>3 then opts:= args[4..-1] else opts := NULL end if; if type(F,equation) then f := rhs(F) else f := F end if; p1 := plot( zip( (x,y)->[x,y], X, Y ), style=point ); xlo := min(op(X),0); xhi := max(op(X),0); p2 := plot( f, x=xlo..xhi, color=blue ); return display( [p1,p2], opts ) end proc: |
| > |
Review of Exponential and Power Functions
We want to develop a simple (oftentimes visual) way to determine if a collection of data points is better approximated with an exponential function or a power function. Let's consider a general exponential function:
| > | e1 := y=a*exp(b*x): e1; |
If the natural logarithm is applied to both sides of this equation, and the right-hand side is simplified:
| > | e2 := map( ln, e1 ): e2; simplify( e2 ) assuming real; |
The assumption is necessary because of some technical concerns that are not relevant to us. (Be sure you can do this calculation by hand!)
The conclusion that we should draw is that if the plot of the
-values vs. the (natural) logarithm of the
-values is a straight line, then the data corresponds to a power function. Moreover, the slope of the line is the growth (or decay) rate of the exponential function and the y-intercept of the line is the (natural) logarithm of the coefficient of the exponential function.
| > |
Next, consider a general power function:
| > | p1 := y=a*x^b: p1; |
Taking the logarithm of both sides of this equation, and simplifying the right-hand side leads to:
| > | p2 := map( ln, p1 ): p2; simplify( p2 ) assuming positive; |
Again, be sure you can do these calculations by hand.
The conclusion to be drawn from this result is that if the plot of the logarithm of the
-values vs. the logarithm of the
-values is a straight line, then the data appears to follow a power function. The slope of the line is the power of the power function and the y-intercept of the line is the logarithm of the coefficient of the power function.
| > |
Examples
Fully-Worked Example with Fictitious Data #1
| > | X := [ 1.5, 2.0, 3.0, 4.5, 6.0, 8.0 ]; Y := [ 3.01, 3.45, 4.15, 5.00, 5.70, 6.50 ]; |
The first thing one might do is simply plot the data points. To use Maple's plot command we need to
zip
together the
- and
-values to form (
,
) pairs.
| > | XY := zip( (x,y)->[x,y], X, Y ); |
| > | plot( XY, style=point, view=[0..7,0..6], title="Plot of the data points" ); |
These points almost look like they could be fit by a straight line. The best linear function for this data is
| > | blf1 := bestlinearfit( X, Y ): blf1; |
I have prepared a Maple procedure, dataplot, to plot a set of data and the best linear function. Here is how it could be used for this problem:
| > | dataplot( X, Y, title="Plot of x vs. y and best linear function fit" ); |
Take a minute to check that this linear function corresponds to the one found above. The slope and intercept of the best linear fit can be extracted as follows:
| > | m = slope( blf1 ); b = intercept( blf1 ); |
| > |
While this fit looks pretty good, could we do better with a power or exponential function? To find the best exponential fit, we need the logarithms of the
-values:
| > | lnY := map( ln, Y ); |
Now, the plot of this data and the best linear fit to it is easy to obtain.
| > | dataplot( X, lnY, title="Plot of x vs. ln(y) and best linear fit to data" ); |
From this picture, the linear function for the transformed data points does not do a better job of approximating the data than the linear function for the original data points.
| > |
To consider the power function the logarithm of the
-values are needed:
| > | lnX := map( ln, X ); |
| > | dataplot( lnX, lnY, title="Plot ln(x) vs. ln(y) and best linear fit to data" ); |
Now that is a good fit! The equation of the linear function to this log-log data is
| > | blf2 := bestlinearfit( lnX, lnY ): blf2; |
To obtain the best fitting power function for this data, remember that this equation is really
| > | e1 := ln(y) = intercept(blf2) + slope(blf2)*ln(x): e1; |
| > | e2 := isolate( e1, y ); |
| > | simplify( e2 ); |
Note that the coefficient is
and the power is 0.4591.
| > | datafitplot( X, Y, e2, title="Plot of data and best fitting power function" ); |
This picture confirms that we have a good fit to the data.
| > |
| > |
Unworked Example with Fictitious Data #2
Source: http://www.ucl.ac.uk/Mathematics/geomath/logsnb/MHlogsnbloglog.html
| > | X := [ 0.2, 0.4, 0.6, 0.8, 1.2 ]; Y := [ 1.18, 1.71, 2.31, 2.71, 3.45 ]; |
| > |
General Data Processing
These commands can be used to visualize any set of data points stored in two vectors, X and Y . Note that all work necessary to ensure the data values are all positive and are appropriately scaled or shifted are assumed to have been completed prior to executing these commands. In some cases it might be necessary to add a view= option to some of the plot commands.
It is recommended that you copy this section to where you want to use it. The reason for this recommendation is that you will always have a good copy of these commands should you want to revert to the original form.
| > | 'X' = X; 'Y' = Y; |
| > |
Best Fit to Data by a Linear Function
| > | dataplot( X, Y, title="Data points and best linear fit" ); |
| > | bestlinearfit( X, Y ); |
| > |
Best Fit to Data by an Exponential Function
| > | lnY := map( ln, Y ); |
| > | dataplot( X, lnY, title="x vs. ln(y) and best linear fit to transformed data" ); |
| > | blf1 := bestlinearfit( X, lnY ): blf1; |
| > | e1 := ln(y) = intercept(blf1)+slope(blf1)*x: e1; |
| > | e2 := isolate( e1, y ): simplify( e2 ); |
| > | datafitplot( X, Y, e2, title="Plot of data points and best exponential function fit" ); |
| > |
Best Fit to Data by a Power Function
| > | lnX := map( ln , X ); |
| > | dataplot( lnX, lnY, title="ln(x) vs. ln(y) and best linear fit to transformed data" ); |
| > | blf2 := bestlinearfit( lnX, lnY ): blf2; |
| > | p1 := ln(y) = intercept(blf2)+slope(blf2)*ln(x): p1; |
| > | p2 := isolate( p1, y ): simplify( p2 ); |
| > | datafitplot( X, Y, p2, title="Plot of data points and best power function fit" ); |
| > |
Best Fit to Data by a Logarithmic Function
| > | dataplot( lnX, Y, title="ln(x) vs. y and best linear fit to transformed data" ); |
| > | blf3 := bestlinearfit( lnX, Y ): blf3; |
| > | L1 := y = intercept(blf3)+slope(blf3)*ln(x): L1; |
| > | datafitplot( X, Y, L1, title="Plot of data points and best logarithmic function fit" ); |
| > |
| > |
| > |
| > |
| > |
The following section contains the general commands needed to perform the graphical and numerical work needed to find each of the four best fitting functions. It is recommended that you put a copy of this section immediately after the data that it will be used to analyze. This was done for you in the Unworked Example above.
General Data Processing
These commands can be used to visualize any set of data points stored in two vectors, X and Y . Note that all work necessary to ensure the data values are all positive and are appropriately scaled or shifted are assumed to have been completed prior to executing these commands. In some cases it might be necessary to add a view= option to some of the plot commands.
It is recommended that you copy this section to where you want to use it. The reason for this recommendation is that you will always have a good copy of these commands should you want to revert to the original form.
| > | 'X' = X; 'Y' = Y; |
| > |
Best Fit to Data by a Linear Function
| > | dataplot( X, Y, title="Data points and best linear fit" ); |
| > | bestlinearfit( X, Y ); |
| > |
Best Fit to Data by an Exponential Function
| > | lnY := map( ln, Y ); |
| > | dataplot( X, lnY, title="x vs. ln(y) and best linear fit to transformed data" ); |
| > | blf1 := bestlinearfit( X, lnY ): blf1; |
| > | e1 := ln(y) = intercept(blf1)+slope(blf1)*x: e1; |
| > | e2 := isolate( e1, y ): simplify( e2 ); |
| > | datafitplot( X, Y, e2, title="Plot of data points and best exponential function fit" ); |
| > |
Best Fit to Data by a Power Function
| > | lnX := map( ln , X ); |
| > | dataplot( lnX, lnY, title="ln(x) vs. ln(y) and best linear fit to transformed data" ); |
| > | blf2 := bestlinearfit( lnX, lnY ): blf2; |
| > | p1 := ln(y) = intercept(blf2)+slope(blf2)*ln(x): p1; |
| > | p2 := isolate( p1, y ): simplify( p2 ); |
| > | datafitplot( X, Y, p2, title="Plot of data points and best power function fit" ); |
| > |
Best Fit to Data by a Logarithmic Function
| > | dataplot( lnX, Y, title="ln(x) vs. y and best linear fit to transformed data" ); |
| > | blf3 := bestlinearfit( lnX, Y ): blf3; |
| > | L1 := y = intercept(blf3)+slope(blf3)*ln(x): L1; |
| > | datafitplot( X, Y, L1, title="Plot of data points and best logarithmic function fit" ); |
| > |
| > |
| > |
Required Data Sets
Regional Drainage Area Data for Dillon, MT
X - drainage area (mi^2)
Y - bankfull depth (ft)
Parrett, Omang, Hull, 1983
Mean annual runoff and peak flow estimates based on channel geometry of streams in northeastern and western Montana, US Geological Survey Water-Resources Investigations Report 83-4086, 53 pp.
Source: http://www.homepage.montana.edu/~uessc/esci432/logexpgraph.htm
| > | X := [ 4.06, 62.7, 280, 36, 935, 2476 ]; Y := [ 1.4, 2.8, 4, 2.6, 5.5, 7.5]; |
| > |
Radium Emissions from a Watch Hand
X - count per minute (of radioactive ions)
Y - number of sheets of paper in gap between source and sensor
Source: http://www.blackcatsystems.com/GM/experiments/ex2.html
| > | Y := [ 30335, 26878, 23868, 21680, 19814, 18004, 16344, 15093, 13908, 12920, 11835, 10934, 10220, 9551, 8839, 8299, 7733, 7318 ]; X := [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 ]; |
| > |
Hint:
Note that it is not possible to take the natural logarithm of all of the x values. There are two main ways to overcome this problem:
i) drop the first x- and y-values
| > | Xnew := X[2..-1]; Ynew := Y[2..-1]; |
ii) change the y-values so that they are all positive - here, it make sense to add 1 to each y-value
| > | Xnew := X+[1$nops(X)]; |
For the remainder of this problem, the second approach will be used because it allows us to keep all data points:
| > | X := Xnew; Y := Y; |
| > |
| > |
U.S. Census Data
X - year
Y - average population (in millions) over 10-year period
Source: http://www.mste.uiuc.edu/malcz/ExpFit/data.html
| > | X := [ 1815, 1825, 1835, 1845, 1855, 1865, 1875, 1885, 1895, 1905, 1915, 1925, 1935, 1945, 1955, 1965, 1975 ]; Y := [ 8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9, 68.9, 83.2, 98.8, 114.2, 127.1, 140.1, 164.0, 190.9, 214.3 ]; |
| > |
Hint:
You might want to shift the
data to represent the number of years after 1800. For example:
| > | X := X - [1800$nops(X)]; |
| > |
| > |
| > |
Optional Data Sets
Experimental Diode Performance
X - voltage (V)
Y - current (milliamp)
Source: http://people.msoe.edu/~rather/me/100lab2.html
| > | X := [0.52,0.597,0.619,0.632,0.643,0.65,0.655,0.661,0.672,0.689,0.711,0.722,0.754,0.78,0.801,0.819,0.848,0.87,0.893,0.92]; Y := [0.01,0.1,0.2,0.3,0.4,0.5,0.6,0.7,1,1.6,3.03,4.04,8.37,15,23.4,33,53.7,75.8,102.5,137.5]; |
| > |
Stress-Strain Behavior of Steel
X - stress (lbf/in^2)
Y - strain (in)
Source: http://people.msoe.edu/~rather/me/100lab2.html
| > | X := [5000,10000,15000,20000,25000,30000,35000]; Y := [0.000172,0.0003490,0.0005092,0.00069,0.000856,0.001018,0.00119]; |
| > |
Voltage Drop Across a Capacitor
X - time (seconds)
Y - voltage (V)
Source: http://wwwfp.vuse.vanderbilt.edu:8888/es130/lectures/lecture6/exponential.html
| > | X := [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12 ]; Y := [ 10, 6.1, 3.7, 2.2, 1.4, 0.8, 0.5, 0.3, 0.2, 0.1, 0.07, 0.03 ]; |
| > |
Hint:
Why is it not possible to take the logarithm of the
values? What can you do to overcome this difficulty?
| > |
| > |
| > |
| > |
Lab Questions
1. Determine the best fitting power and logarithmic functions for the Regional Drainage Area data.
2. Determine the best fitting power and exponential functions for the Radium Emissions from a Watch Hand data.
3. Find one of the Optional Data Sets that is best fit with a power function; give the best fitting power function.
4. Find one of the Optional Data Sets that is best fit with an exponential function; give the best fitting power function.
5. Which of the four best fitting functions provides the best fit to the U.S. Census data?
Essay Question [2 points]: Explain how the behavior of the exponential, power, and logarithmic functions differ at the origin. How do these differences help select the most appropriate fit for the Regional Drainage Area and Radium Emissions data sets.
| > |
| > |