Skip to content Skip to navigation Skip to collection information

Connexions

You are here: Home » Content » The Art of the PFUG » The Physics of Springs

Navigation

Table of Contents

Lenses

What is a lens?

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

This content is ...

Affiliated with (What does "Affiliated with" mean?)

This content is either by members of the organizations listed or about topics related to the organizations listed. Click each link to see a list of all content affiliated with the organization.
  • Rice Digital Scholarship

    This collection is included in aLens by: Digital Scholarship at Rice University

    Click the "Rice Digital Scholarship" link to see all content affiliated with them.

Also in these lenses

  • Lens for Engineering

    This collection is included inLens: Lens for Engineering
    By: Sidney Burrus

    Click the "Lens for Engineering" link to see all content selected in this lens.

Recently Viewed

This feature requires Javascript to be enabled.
 

The Physics of Springs

Module by: Heather Williamson, Aneesh Mehta, Matthew Broussard, Jordon Cavazos, Jeffrey Bridge, Steven J. Cox. E-mail the authors

Summary: This report summarizes work done as part of the Physics of Strings PFUG under Rice University's VIGRE program. VIGRE is a program of Vertically Integrated Grants for Research and Education in the Mathematical Sciences under the direction of the National Science Foundation. A PFUG is a group of Postdocs, Faculty, Undergraduates and Graduate students formed round the study of a common problem. This module describes experiments done on a spring system.

A Network of Springs

Our research is centered on a network of springs, built by Jeff Hokansen and Dr. Mark Embree for use in the CAAM 335 Lab. Over the table, we set up a webcam on a beam and connected it to a computer running MATLAB. Springs are connected to pennies (nodes), two of which are fixed to the table. Along the outside pennies, strings run over pullies set along the edge of the table and are attached to hooks, upon which we hang masses. These masses cause the nodes to move. We use the webcam to capture an image of the network, then use a MATLAB script to find the center of each node; the pennies have been painted red to make it easier for MATLAB to detect them. This gives us the displacement of each node, from which we can compute the elongation of each spring. We also know the force applied to each node (9.8*mass9.8*mass in units of Newtons) and can calculate the spring constant k for each spring using Hooke's Law, f restoring =-( elongation )*kf restoring =-( elongation )*k

A Forward Problem

In the forward problem, we seek to compare results from our physical model to the results predicted by solving a linear system of equations. Specifically, we wish to predict our displacements, given we know the load forces and spring constants in our system of springs.

Let us begin with an easier system of just two springs, three nodes, and two forces. Since only two of the nodes are moving, we will have two horizontal displacements denoted in the vector x. There are two elongations, one for each spring, denoted in the vector e.

Figure 1: 2 Spring Network
Figure 1 (2Dsprings.png)
x = x 1 x 2 , e = e 1 e 2 , x = x 1 x 2 , e = e 1 e 2 ,
(1)

Each spring elongation is a linear combination of node displacements. The equations can be written in the following manner.

e = e 1 e 2 = x 1 x 2 - x 1 = 1 0 - 1 1 x = A x e = e 1 e 2 = x 1 x 2 - x 1 = 1 0 - 1 1 x = A x
(2)

Now we have our adjacency matrix, A. This translates us from node displacement to spring elongation. It will have one more property which will we shall see shortly. Now let us consider finding the restoring force, y, which will have one component for each spring.

y = y 1 y 2 , y = y 1 y 2 ,
(3)

We assume that each spring follows Hooke's Law, y=key=ke, where restoring force is directly proportional to elongation. Each spring has a corresponding stiffness, ki which comprise the the diagonal elements of matrix, K.

y = y 1 y 2 = k 1 e 1 k 2 e 2 = k 1 0 0 k 2 e = K e = K A x y = y 1 y 2 = k 1 e 1 k 2 e 2 = k 1 0 0 k 2 e = K e = K A x
(4)

The final step is to translate these restoring forces into the load forces acting on each node, denoted by vector f.

f = f 1 f 2 = y 1 - y 2 y 2 = 1 - 1 0 1 = A T y f = f 1 f 2 = y 1 - y 2 y 2 = 1 - 1 0 1 = A T y
(5)

Now we can see the second feature of the adjacency martrix. The transpose of A performs the reverse translation from edges to nodes. The final product of this example is the equation just shown: f=ATKAxf=ATKAx. Now we can expand the problem to any system of springs for which we can create an adjacency matrix A. For this project we focused on the spring network shown below.

Figure 2: Spring Network
Figure 2 (strang0000.png)

For this system, we have 16 strings and 7 seven nodes. So we will be working with vectors x, e, y, and f which contain the following information:

x: displacement of nodes

e: elongation of springs

y: restoring force in each spring

f: load force at each node

Since there are 7 nodes, each with a horizontal and vertical displacement, vectors x and f with be vectors of length 14. The 2n-1 entry will correspond to the horizontal displacement or load force on node n and the 2n entry will correspond to the vertical displacement or load force on node n. Since there are 16 springs, y and e will be vectors of length 16 with each entry corresponding to the restoring force or displacement of that spring.

x = x 1 x 2 x 13 x 14 x = x 1 x 2 x 13 x 14

x 1 : horizontal displacement at Node 1 x 2 : vertical displacementd at Node 1 x 13 : horizontal displacement at Node 7 x 14 : vertical displacement at Node 7 x 1 : horizontal displacement at Node 1 x 2 : vertical displacementd at Node 1 x 13 : horizontal displacement at Node 7 x 14 : vertical displacement at Node 7
(6)

As for the matrices translating us from one vector to the next, we will be working with A and K. A will have fourteen columns relating to the 14 degrees of freedom of the nodes of system, and it will have 16 rows, corresponding to the springs of the system. K will be a square matrix, 16x16. Diagonal elements of K will correspond to the stiffnesses of each spring

K = k 1 0 0 0 0 k 2 0 0 0 0 k 3 0 0 0 0 k 16 , k i = spring constant of spring i K = k 1 0 0 0 0 k 2 0 0 0 0 k 3 0 0 0 0 k 16 , k i = spring constant of spring i
(7)

To construct A, we use a pattern of three linear approximations. The first two assume that the elongation of horizontal springs will be approximately equal to the horizontal displacements of the two adjacent nodes. The same will hold true for vertical nodes. The third recurring equation in the adjacency matrix approximates the diagonal elongation with a first order Maclaurin expansion. This elongation is equal to 2/22/2 multiplied by the sum of the horizontal and vertical displacements. The specific adjacency matrix for this set up is the following:

A = 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 - α α 0 0 α - α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 , α = 2 2 , A = 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 α - α 0 0 - α α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 - α α 0 0 α - α 0 0 0 0 0 0 - 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 - 1 0 , α = 2 2 ,

Now that we have created our adjacency matrix for the system and know the relation f=ATKAxf=ATKAx, we can create the matrix defined by ATKAATKA. If this matrix is invertible, then we can solve what is referred to as the forward problem, finding x given f. This relationship is simply done by finding the inverse of our matrix and solving x=(ATKA)-1fx=(ATKA)-1f.

An Inverse Problem

We move now to our inverse problem: given the applied forces and the displacements of nodes in our network, can we determine the stiffnesses of the springs? Although the problem may at first appear trivially similar to the forward problem, we can see that this is not the case. From the previous section, we know that our calculations involve force vector f and displacement vector x,

f = f 1 f 2 f 13 f 14 , x = x 1 x 2 x 13 x 14 f = f 1 f 2 f 13 f 14 , x = x 1 x 2 x 13 x 14
(8)

for 14 degrees of freedom. However, the spring constants involved are

K = k 1 0 0 0 0 k 2 0 0 0 0 k 3 0 0 0 0 k 16 , k i = spring constant of spring i K = k 1 0 0 0 0 k 2 0 0 0 0 k 3 0 0 0 0 k 16 , k i = spring constant of spring i
(9)

for 16 springs. This dimensional mismatch leads to an underdetermined system when we attempt to solve for K.

Our solution to this problem involves experimental stacking. Due to the fact that our experiment is easily repeatable under different conditions, we can arrive at more observations by simply altering the forces acting on the table. This, however, should not alter any of the spring constants. By doing so, we generate a second set of data which is still dependent on the spring constants. We are left with 16 springs, but we now have 28 degrees of freedom: an overdetermined system. The experiment can be repeated s times to minimize experimental error as well, yielding 14s degrees of freedom and the same 16 values in K.

With this experimental technique in hand, we can revisit our primary equation:

A T K A x = f . A T K A x = f .
(10)

In order to solve for K, we re-express part of our equation:

K A x = K e = k 1 k 16 e 1 e 16 = e 1 e 16 k 1 k 16 . K A x = K e = k 1 k 16 e 1 e 16 = e 1 e 16 k 1 k 16 .
(11)

This simple substitution lets us rewrite our equation as

A T diag e ( i ) k = f ( i ) - f ( 0 ) A T diag e ( i ) k = f ( i ) - f ( 0 )
(12)

which allows us to solve for k.

At this point, we can apply our experimental technique of stacking. Using s experiments, we may construct the appropriate matrices such that

B = A T diag e ( 1 ) A T diag e ( 2 ) A T diag e ( s ) , f = f ( 1 ) - f ( 0 ) f ( 2 ) - f ( 0 ) f ( s ) - f ( 0 ) . B = A T diag e ( 1 ) A T diag e ( 2 ) A T diag e ( s ) , f = f ( 1 ) - f ( 0 ) f ( 2 ) - f ( 0 ) f ( s ) - f ( 0 ) .
(13)

We now recall Hooke's Law:

f = B k . f = B k .
(14)

With f and B, we are now ready to solve for k. However, due to the fact that our system is overdetermined, there does not have to be a unique solution. To find the solution of best fit, then, we turn to the least squares method so as to find k that satisfies

min k R 16 B k - f 2 . min k R 16 B k - f 2 .
(15)

We go about this using the standard method of normal equations: we multiply both sides of the Hooke equation by BT:

B T B k = B T f . B T B k = B T f .
(16)

This allows us to use the Moore-Penrose psuedoinverse of B solve for k, our vector of spring constants:

k = ( B T B ) - 1 B T f . k = ( B T B ) - 1 B T f .
(17)

Bar graphs of k are presented from a laboratory implementation of this technique. Figure 3 shows the calculated spring constants using the first two experiments from Data Set A. These spring constants have 246.7% error (see "Notes: Our Data Sets, Measuring Spring Constants, and Error" for notes on data sets and calculating error). Figure 3 shows the measured spring constants.

Figure 3: Results from Stacking Two Experiments
(a)
Figure 3(a) (invcomp.png)
(b)
Figure 3(b) (invmeas.png)

Due to our technique stacking of s experiments, our solution should minimize experimental error. However, it should be clear that, due to the amount of experimental error involved, we are extremely unlikely to arrive at an exact solution to the problem. It is important to note that, though wildly inaccurate, we can correctly identify the stiff spring in the system.

Our Question

In theory, this works out beautifully. Unfortunately, our experiments are carried out in the real world, so the entire process is rife with error. Measurements of both the masses and the positions of the nodes may be imprecise, and the alignment of the webcam may be off (See Figure 4), introducing the keystone effect, all of which dirties the displacement data. The forces may not be perfectly aligned along the horizontal and vertical (See Figure 4), and the masses we hang may not be exactly what we believe, introducing error in the force vector. Furthermore, Hooke's Law is an approximation, valid only in a certain range (although we keep our forces within that range). We also assume that the springs lie at angles of 0, π/4π/4, or π/2π/2 to the nodes, a belief that is reflected in the adjacency matrix and not at all accurate (See Figure 4). Our model also approximates the elongations of the springs linearly and assumes that the pennies are massless points, which introduces further error.

Figure 4: Sources of Error
(a)
Figure 4(a) (imperfect_camera_alignment.png)
(b)
Figure 4(b) (imperfect_load_direction.png)
(c)
Figure 4(c) (imperfect_angles.png)

Ideally, we think we are solving the equation ATEk=fATEk=f, where ER16×16ER16×16 has the elongation of each spring along the diagonal. In reality, we are actually attempting to solve the relation

A T + α E + ϵ k + κ f + γ , A T + α E + ϵ k + κ f + γ ,
(18)

where α, ϵ, κ, and γ are error due to either the measurements or the model. We would like to use statistical inference to find a solution.

Applications

Our experiment is designed to test the elastic properties of a very simple network by stretching it in two dimensions. This is an instance of the biaxial test, which is used to study the properties of polymers, plastics, and tissue by material scientists and physicians. Our setup simulatesa biopsy that seeks to identify a flaw in an otherwise homogeneous material.

Notes: Our Data Sets, Measuring Spring Constants, and Error

We ran two large sets of experiments. Data Set A consists of 104 experiments on Network g (see "Conditioning" for a list of all networks). All the even experiments have (theoretically) identical forces; all the odd experiments have another set of (theoretically) identical forces. We are confident of the magnitude of the forces; we are less confident of the alignment. These 104 experiments are actually just two experiments, each repeated 52 times. Data Set B consists of 120 unique experiments on Network a. No set of forces was repeated. These two data sets will allow us to explore which sets of experiments produce the best results, a most important question for applications. Data Set A will allow us to approach the problem statistically.

In order to be able to say which set of experiments or computational method gives us the “best” results, we need to know the actual values of the spring constants. We would like each spring of a particular class (horizontal/vertical, diagonal, stiff) to have the same k, but because these springs were purchased at our local hardware store, we have no guarantee of that and much measure each spring individually.

To measure k for a particular spring, we attach the spring to one of the fixed nodes, with a penny and string on the other end. We begin with only the 50g hook on the string and use the webcam and MATLAB to take the position. Next, we add 10g to the hook at a time, taking the position after adding each mass, until 100g are on the hook. This gives us 11 data points to work with. We plot these, then find the line of best fit through these points using three different methods: MATLAB's polyfit function; the least squares approach, MATLAB's backslash, which forms the normal equations k=eTf/etek=eTf/ete; and the total least squares approach, described in "Total Least Squares". The results from each of these methods is recorded in Table 1.

Table 1: Values from Methods of Computing Spring Constants
k, (N/m)
Spring Least Squares Total Least Squares Normal Equations
1 190.3 191.8 191.5
2 285.0 276.3 273.7
3 153.0 170.2 165.7
4 166.8 176.6 173.3
5 230.9 237.1 236.6
6 187.9 278.0 252.6
7 200.0 219.6 217.4
8 915.0 1064.2 1005.4
9 240.3 254.0 252.5
10 208.9 321.4 249.8
11 196.1 202.7 200.0
12 233.3 238.7 236.4
13 197.5 167.4 162.2
14 223.1 218.1 215.9
15 181.2 173.2 171.5
16 249.0 227.3 225.3

The least squares approach to calculating the actual values of k fits the experimental data best. We run the calculations described in "An Inverse Problem" three times, using the values of k from each of these approaches as the actual value once. The error from each of these methods is recorded in Table 2.

Table 2: Error in Methods of Computing Spring Constants
Method % Error
Least Squares 235.4
Total Least Squares 237.9
Normal Equations 246.6

Mathematically, the total least squares approach should be the most accurate (again, see "Total Least Squares" for a description of this approach), but the least squares approach gives the least error. The computations for total least squares and the normal equations both require subtracting off the first entry from the displacement and force vector from the rest of the vectors, so these two computations work with one less data point. This may result in less accurate results, so we will use the values of k that we compute with the least squares approach when we need the “measured” values of k, such as when we compute percent error.

Throughout this report, we will refer to the “percent error” in the results of a particular method. To compute this, we consider first the percent error in each spring, (ki,actual-ki,measured)/ki,actual(ki,actual-ki,measured)/ki,actual, arrange these into a vector, take the norm of the error, and multiply by 100. For reference, if the result has 100% error in each spring, the percent error of the entire result will be 400%.

Linear Algebra Approaches

Conditioning

The condition number, C, of a matrix A is the ratio of the largest singular value to the smallest singular value. If C=C=, A is singular; the closer C is to one, the better conditioned A is. In a linear equation of the form Ax=bAx=b, C gives a bound for how sensitive x is to small changes in b. If C is large, then even a small perturbation in b can produce a large change in x.

Our equation, Bk=fBk=f, has some error in f, so we are interested in the condition number of B. The smaller C of B is, the more confident we can be in our results.

Of the sixteen springs in the network, the twelve along the horizontal and vertical are important for holding the structure in place. The four diagonal springs can easily be switched back and forth from π/4π/4 to 3π/43π/4 without significantly changing the stability of the network. Switching these four back and forth creates 16 different networks. We would like to find the network with the lowest condition number and use that for our experiments.

Figure 5: 16 Possible Networks
Figure 5 (sp1.png)

Because B=ATdiagAxB=ATdiagAx, B depends on the displacements x and so varies with each trial. Because we cannot compute it directly, we run a simulation that, given the measured spring constants, generates random forces of the correct size in the correct locations, finds the displacement of each node, then calculates B. We computed the average condition number of B over several thousand trials for each network.

However, there is a slight catch. For networks g, h, o, and p, B has a nullspace. This corresponds to rigid-body motion: that is, some of the nodes can move without the rest of the network being displaced. We find that all of the networks where the Spring 2 and Spring 12 are connected at the center node have a nullspace corresponding to the free motion of these two nodes. We correct this problem by adding a row to the bottom of B (a 1 in the column corresponding to spring 2 and a -1-1 in the column corresponding to spring 12) and a 0 to the bottom of f that forces these two springs to have the same spring constant. Table 3 presents the results from these simulations.

Table 3: Condition Number of Potential Networks
Average Condition Number of Networks
Network C ¯ C ¯
1 17.2
2 31.9
3 29.5
4 26.5
5 27.4
6 28.1
7 3815.0
8 2538.4
9 30.0
10 25.8
11 28.0
12 29.7
13 25.8
14 31.3
15 2133.3
16 2219.0

It would seem that all networks that we had to modify are significantly worse than networks we did not have to modify. Looking more closely at the singular-value decomposition, we find that there is one very large singular value corresponding to the row we added because the entries in this row are much larger than all the other entries in B. There is no particular reason why we need a 1 and a -1-1 (or any other pair of opposite numbers), so we scale this row down. We would like to improve the condition number as much as possible, so we look at the condition number of B as a function of the scaling factor of the last row. We find that, as the scaling factor increases, the condition number decreases.

Figure 6 shows the relationship between the scaling factor on the additional row and the condition number of B for Network g, using the first 4 experiments of Data Set A.

Figure 6: Effect of Scaling Factor on Condition Number of B
Figure 6 (condscale.png)

We repeated the simulations, scaling the additional row by a factor of 1/2001/200. The scaled results are presented in Table 4; for reference, the unscaled results are also included. This method, then, greatly improves the condition number of the matrix.

Table 4: Condition Number of Potential Networks
Average Condition Number of Networks - Scaled
Network C¯C¯ not scaled C¯C¯ scaled
1 17.2 17.2
2 31.9 31.9
3 29.5 29.5
4 26.5 26.5
5 37.5 27.4
6 28.1 28.1
7 3815.0 44.5
8 2538.4 26.1
9 30.0 30.0
10 25.8 25.8
11 28.0 28.0
12 29.7 29.7
13 25.8 25.8
14 31.3 31.3
15 2133.3 22.3
16 2219.0 22.4

Ultimately, we decided to work with Network g because it was the tautest, so it eliminated some experimental error. It was unfortunate that this network had the worst average condition number, but the benefit of accurate data was greater than the benefit of an accurate matrix. For future experimentation, however, building a new structure and using Network a might be the best.

We were also interested in how stacking impacts the condition number of a network. To study this, we add two experiments from Data Set A at a time and compute the condition number after each step. Figure 7 shows the singular values of B as we add more experiments; the lighter the color, the more experiments have been added. Figure 8 shows how the condition number of B changes as we add more experiments; the red line is the minimum. From this, we conclude that adding more experiments improves the condition number to a certain point, but it is close to constant from there.

Figure 7: Singular Values of B
Figure 7 (svdmoreexp.png)
Figure 8: Condition Number of B
Figure 8 (condmoreexp.png)

When we stack all of the results from Data Set A together and scale them appropriately, we get 145.6% error.

Figure 9: Spring Constants with Best Condition Number
(a)
Figure 9(a) (kcondcalc.png)
(b)
Figure 9(b) (kcondmeas.png)
Table 5: Spring Constants with Best Condition Number
Spring Constants with Best Condition Number
Spring Measured Calculated
1 190.3 213.5
2 285.0 295.6
3 153.0 105.6
4 166.8 184.2
5 230.9 258.3
6 187.9 142.1
7 200.0 125.5
8 915.0 988.5
9 240.3 171.1
10 208.9 323.7
11 196.1 173.9
12 233.3 295.6
13 197.5 276.7
14 223.1 75.8
15 181.2 224.5
16 249.0 450.7

Total Least Squares

Consider the linear equation Ax=bAx=b, where ARm×n,bRm×1,xRn×1,m>nARm×n,bRm×1,xRn×1,m>n.

This equation is overdetermined and has no precise answer. The simplest approach to finding x is a least-squares fitting model, which finds the curve with the least difference between the value of the curve at a point and the value of the data at that point; i.e., it solves minxRnAx-b2minxRnAx-b2. This amounts to saying that the data may be slightly perturbed:

A x = b + r , A x = b + r ,
(19)

where r is some residual noise, and minimizing rr:

min r : A x = b + r r . min r : A x = b + r r .
(20)

When we compare this to our equation Bk=fBk=f, we see that this is an appropriate method: we are not entirely confident of f, and can perturb it slightly.

Looking more closely, B=ATdiagAxB=ATdiagAx. We are also not entirely certain of x, which means we are not entirely certain of ATdiagAxATdiagAx. This is best reflected in the total least squares approach, in which both the data (b in the simple equation, f in our equation) and the matrix (A in the simple equation, ATdiagAxATdiagAx in our equation) may be slightly perturbed:

A + E x = b + r , A + E x = b + r ,
(21)

where E is some noise in A and r is some noise in b, and minimizing ErEr:

min [ E r ] : ( A + E ) x = b + r [ E r ] | F . min [ E r ] : ( A + E ) x = b + r [ E r ] | F .
(22)

The last term in the singular value decomposition of [Ab][Ab], -sn+1un+1vn+1T-sn+1un+1vn+1T, is precisely what we want for [Er][Er].

At first glance, this exactly what we want. We can find the singular value decomposition of [Bf][Bf], take the last term as [Er][Er], and solve for k. When we implement this method, however, we get worse results compared to the measured data. Standard least squares returns a k with only 182.04% percent error (See Figure 10); total least squares returns a k with 269.17% percent error (See Figure 10). Looking at the structure of B=ATdiagAxB=ATdiagAx and E gives a hint as to why. The adjacency matrix, A, encodes information about the structure of the network, so it has a very specific pattern of zeros, which is reflected in B. There are no similar restrictions on E, allowing zeros in inappropriate places. This is physically equivalent to sprouting a new spring between two nodes, an absurdity. Figure 11 below compares the structure of B (Figure 11) and E (Figure 11). Light green entires correspond to a zero; everything else corresponds to a nonzero entry. E has many non-zero entries where there should not be any. Note the scale for the colorbar on the right: the entries of E are two orders of magnitude smaller than the entries in B. Though they are small, they represent connections between nodes and springs that do not exist, throwing off the entire result. Requiring that particular entries equal zero makes the problem combinatorally harder.

Figure 10: Results from Least Squares and Total Least Squares
(a)
Figure 10(a) (tlsls.png)
(b)
Figure 10(b) (tlstls.png)
(c)
Figure 10(c) (tlsmeas.png)
Figure 11: Total Least Squares: Structure of B and E
(a)
Figure 11(a) (tlsB.png)
(b)
Figure 11(b) (tlsE.png)

Statistical Approaches

Statistical Background

Because we would like to use statistical inference, it is important to have a basic understanding of several statistical concepts.

Definition 1 Probability Space

A space, Ω, of all possible events, ωΩωΩ

Example

Rolling a die is an event.

Flipping a coin is an event.

Loading forces onto the spring network is an event.

Definition 2 Random Variable

A mapping from a space of events into the real line, X:ΩRX:ΩR, or real n-dimensional space, X:ΩRnX:ΩRn

Example

Reading the number shown on a die is a random variable. The number shown is in R.

Calculating the displacements of the nodes is a random variable. The actual displacements are in R14

Definition 3 Probability Distribution

A function on a set BRnBRn describing the probability that X(ω)BX(ω)B.

Example

The distribution, the log-normal distribution, and the Poisson distribution are all well-known distributions.

Definition 4 Expectation, Variance, Covariance

The expectation of a distribution is the “center of mass”:

E X = R n x π x d x = x ¯ = μ = x 0 . E X = R n x π x d x = x ¯ = μ = x 0 .
(23)

The empirical expectation of a sample x1,x2,,xNx1,x2,,xN is the mean:

x 0 ^ = 1 N j = 1 N x j . x 0 ^ = 1 N j = 1 N x j .
(24)

The variance of a distribution is the expectation of the squared difference from the expectation of the distribution, varX=σ2=EX-x¯2=RX-x¯2πxdx.varX=σ2=EX-x¯2=RX-x¯2πxdx. This is only for a one-dimensional distribution.

The covariance of two single-variable distributions is the expectation of the deviation from the expectation of one, times the expectation of the deviation from the expectation of the other, cov(X,Y)=E(X-x¯)(Y-y¯)=EXY-EXEYcov(X,Y)=E(X-x¯)(Y-y¯)=EXY-EXEY.

The covariance matrix of a distribution has, for its (i,j)(i,j) entry the covariance of the i and j components of the distribution. The empirical covariance matrix of a sample x1,x2,,xNx1,x2,,xN is the mean of deviation from the mean:

Γ ^ = 1 N j = 1 N ( x j - x 0 ^ ) ( x j - x 0 ^ ) T . Γ ^ = 1 N j = 1 N ( x j - x 0 ^ ) ( x j - x 0 ^ ) T .
(25)

Definition 5 Normal (Gaussian) Distribution

In one dimension, the normal distribution has probability distribution function

π x = 1 σ 2 π exp - x - μ 2 2 σ 2 . π x = 1 σ 2 π exp - x - μ 2 2 σ 2 .
(26)

In n dimensions, the normal distribution has probability distribution function

π x = 1 2 π n det ( Γ ) 1 / 2 exp - 1 2 x - x 0 T Γ - 1 x - x 0 . π x = 1 2 π n det ( Γ ) 1 / 2 exp - 1 2 x - x 0 T Γ - 1 x - x 0 .
(27)

Definition 6 Maximum Likelihood Estimate

Given a sample from a known distribution depending on unknown parameter θ, the value of θ that maximizes the probability that the distribution returns the given sample.

Definition 7 Equiprobability Curve

Given an n-dimensional distribution, the locus of points in n-dimensions with equivalent probability.

Example

The equiprobability curves for a normal n-dimensional distribution are n-dimensional ellipses.

Normality

Because we wish to use statitsical inference to solve this problem, we begin by looking at the probability distribution of the displacements.

Calvetti and Somersalo [1] present a method for studying the normality of a 2-dimensional sample S, by using the equiprobability curves, which they also call credibility curves. Let π, corresponding to a random variable X, be a probability density. Let

B α = x R 2 | π ( x ) α , α > 0 ; B α = x R 2 | π ( x ) α , α > 0 ;
(28)

that is, the set of all points whose probability is greater than or equal to some positive number α. This corresponds to either the empty set (for α>1α>1) or the interior of an equiprobability curve; if π is normally distributed and α1α1, then Bα is the interior of an ellipse. We calculate the probability that X is in Bα, the credibility:

p = P X B α = B α π x d x , 0 < p < 1 . p = P X B α = B α π x d x , 0 < p < 1 .
(29)

We would like to find an explicit expression relating α and p, α=α(p).α=α(p). If S is normally distributed, then the number of points inside Bα(p)Bα(p) is about pn.pn.

The immediate goal, then is to calculate the number of points in S that lie within Bα for some α. We will evaluate the integral p=Bαπxdxp=Bαπxdx, but first we will need a change of variable. Let UD-1UTUD-1UT be the eigenvalue of decomposition of Γ^-1Γ^-1 (the inverse of the empirical covariance matrix), where

D - 1 / 2 = 1 / λ 1 1 / λ 2 . D - 1 / 2 = 1 / λ 1 1 / λ 2 .
(30)
( x - x 0 ^ ) T Γ ^ - 1 ( x - x 0 ^ ) = ( x - x 0 ^ ) T U D - 1 U T ( x - x 0 ^ ) = D - 1 / 2 U T ( x - x 0 ^ ) 2 . ( x - x 0 ^ ) T Γ ^ - 1 ( x - x 0 ^ ) = ( x - x 0 ^ ) T U D - 1 U T ( x - x 0 ^ ) = D - 1 / 2 U T ( x - x 0 ^ ) 2 .
(31)

So we can set up a change of variable:

w = f ( x ) = W ( x - x 0 ^ ) , W = D - 1 / 2 U T . w = f ( x ) = W ( x - x 0 ^ ) , W = D - 1 / 2 U T .
(32)

Notice that

d w = d e t ( W ) d x = 1 λ 1 λ 2 d x = 1 det ( Γ ^ ) 1 / 2 d x . d w = d e t ( W ) d x = 1 λ 1 λ 2 d x = 1 det ( Γ ^ ) 1 / 2 d x .
(33)

This change of variable transforms the equiprobability ellipses of x into circles centered at the origin:

f ( B α ) = w R 2 | w < δ , δ = δ ( α ) > 0 . f ( B α ) = w R 2 | w < δ , δ = δ ( α ) > 0 .
(34)

We are now ready to evaluate the integral. We will first perform a change of variable from x to w, then change to polar coordinates.

p = B α π x d x = 1 2 π det Γ ^ 1 / 2 B α exp - 1 2 ( x - x 0 ^ ) T Γ ^ - 1 ( x - x 0 ^ ) d x = 1 2 π det Γ ^ 1 / 2 B α exp - 1 2 W ( x - x 0 ^ ) 2 d x = 1 2 π f ( B α ) exp - 1 2 w 2 d w = 0 δ exp - 1 2 r 2 r d r = 1 - exp - 1 2 δ 2 . p = B α π x d x = 1 2 π det Γ ^ 1 / 2 B α exp - 1 2 ( x - x 0 ^ ) T Γ ^ - 1 ( x - x 0 ^ ) d x = 1 2 π det Γ ^ 1 / 2 B α exp - 1 2 W ( x - x 0 ^ ) 2 d x = 1 2 π f ( B α ) exp - 1 2 w 2 d w = 0 δ exp - 1 2 r 2 r d r = 1 - exp - 1 2 δ 2 .
(35)

Given this, we can solve for δ:

δ = δ ( p ) = 2 log 1 1 - p . δ = δ ( p ) = 2 log 1 1 - p .
(36)

This gives us a straightforward way to check whether or not a point in the sample, xj, lies in the credibility region. If

w j < δ ( p ) , w j = W ( x j - x 0 ^ ) w j < δ ( p ) , w j = W ( x j - x 0 ^ )
(37)

holds, then the point is in the credibility ellipse.

Recall that this expression is for two dimensions. If we repeat the integral in different dimensions, we get a different expression relating δ and p. We are also interested in distributions in R, R14, and R16, so we repeat the calculations for those spaces. In R14 and R16, we do not get a nice expression, so we approximate the values of δ for p={0.1,0.2,,0.9}p={0.1,0.2,,0.9} using a TI-89 calculator (MATLAB was unable to carry out the calculation symbolically). In R, using w=σ-1w=σ-1, we find that

δ = 2 erf - 1 p . δ = 2 erf - 1 p .
(38)

We can then plot the number of points inside a credibility ellipse against the credibility. A perfectly normal distribution will be a straight line from the origin to (1,1)(1,1). If a sample significantly deviates from this line, then this criterion suggests that the sample is not normal.

Distributions

Armed with these criteria for looking at the normality of a sample, we study the displacements of the nodes. First we look at the nodes individually. We look at how a single node moves under a particular load. We expect the sample to be normally distributed: under a fixed load, most of the displacements should be about the same, with a few outliers. Data Set A has 104 samples split evenly between two loads, so we have 14 samples in R2 with 52 points each. Figure 12 plots the number of points inside the credibility ellipse against the credibility for Node 4, Pattern 2. The red line is from the sample, the blue line is a randomly generated normally distributed sample with 52 points, and the black line is a reference line. The blue allows us to have an idea for how much we can expect a sample this size to deviate from the straight line. Figure 12 is a histogram of the positions of the node. This plot suggests that the distribution of this node is normally distributed. All of the nodes, under both loads, give similar results (See Figure 13). Because of this, we believe that the displacements are normally distributed, as we expected.

Figure 12: Node 4, Load 2
(a)
Figure 12(a) (dispnormb4big.png)
(b)
Figure 12(b) (disphistb4big.png)
Figure 13: Normality of Individual Nodes [Load Pattern:Node]
Figure 13 (sp2.png)

Next we look at the nodes collectively, giving us two samples in R14 with 52 points each. Because we believe that each node is normally distributed, we expect the combined displacements to be normally distributed. Figure 14 is the plot from Load 1; Figure 14 is the plot from Load 2. Figure 14 suggests that the displacements of the nodes are indeed normally distributed.

Figure 14: Normality of All Nodes
(a)
Figure 14(a) (dispnorma.png)
(b)
Figure 14(b) (dispnormb.png)

Spring Constants

Next we look at the calculated spring constants. First, we look at them individually, drawing from Data Set A. We take experiments pairwise, one from each load, and calculate the spring constants. This gives us 16 samples in R of 52 points each. Figure 15 plots the normality of each spring constant: all seem to be fairly normal. Because of this, we believe that the spring constants are normally distributed.

Figure 15: Normality of Individual Spring Constants
Figure 15 (sp3.png)

Next we look at the spring constants collectively, with one sample in R16. Because we believe that each individual spring constant is normally distributed, we expect the combined spring constants to be normally distributed. Figure 16 is the plot from this sample, suggesting that the spring constants are indeed normally distributed.

Figure 16: Normality of Collective Spring Constants
Figure 16 (knorm.png)

When we look at our equation, ATdiagAxk=fATdiagAxk=f, this makes sense. If A and f are fixed, then the relationship between x and k is simply linear, with no worries about the distributions of A and f. To be sure, A and f are not entirely accurate, but they are close enough to constant that we can expect k to come from the same kind of distribution as x.

Note that this criteria is fairly subjective. Plotting the random distribution helps us to get a feel for how much deviation from the reference line we can expect, but these in no way prove that the samples are from normal distributions. Rather, it suggests that, if they are not normally distributed, they can probably be reasonably approximated by a normal distribution.

Rewriting the Problem and the Maximum Likelihood Estimate

Rewriting the Problem

In our original approach to use statistical inference to solve the inverse problem (see "Our Question") our problem reduced to the equation ATEk=fATEk=f, where ER16×16ER16×16 has the elongation of each spring along the diagonal. If we consider the experimental error as well as the error in the model the equation becomes

A T + α E + ϵ k + κ f + γ , A T + α E + ϵ k + κ f + γ ,
(39)

where α, ϵ, κ, and γ are error due to either the measurements or the model. The error ϵ is the easiest error that we can describe with all our experimental data, specifically Data Set A (see "Notes: Our Data Sets, Measuring Spring Constants, and Error"). The error of α,α,γ,γ, and κ we either have no description or very little description from our experimental data. The problem that we can then study using our observed error is

A T E + ϵ k f . A T E + ϵ k f .
(40)

Because ϵ is in the middle of the equation, it is very hard to deal with. We wish our problem to take the form

y = F z + ϵ , y = F z + ϵ ,
(41)

where z is the unknown variable, y is the observed variable with error ϵ, and FRm×nFRm×n.

To reach an equation of this form we begin by looking at our original problem ATKAx=fATKAx=f (see "An Inverse Problem"). If AT is invertible, then our equation becomes Ax=K-1A-TfAx=K-1A-Tf, where

K - 1 = c 1 0 0 0 0 c 2 0 0 0 0 c 3 0 0 0 0 c n , c i = 1 / k i . K - 1 = c 1 0 0 0 0 c 2 0 0 0 0 c 3 0 0 0 0 c n , c i = 1 / k i .
(42)

Because A-TfA-Tf is a vector and K-1K-1 is a diagonal matrix, we see that our equation can be rewritten to

e = F c , e = F c ,
(43)

where e=Axe=Ax, F= diag (A-Tf)F= diag (A-Tf), and c is the compliance, or the vector of inverses of spring constants. Now if we include error of the observed data, e, we obtain a model for the system of the form

e = F c + ϵ , e = F c + ϵ ,
(44)

where ϵ is the error in the measurements. Note that if AT is not invertible we can replace A-TA-T with A+TA+T, where A+TA+T is the pseudo inverse of AT, so that F=diag(A+Tf)F=diag(A+Tf).

Maximum Likelihood Estimate

With our problem now rewritten in a statistically more usable form, we can easily apply a method involving the maximum likelihood estimate, described by Calvetti and Somersalo [1], pages 35-37. Applying this method to our rewritten problem yields the equation

F T Γ - 1 F c = F T Γ - 1 e ¯ , F T Γ - 1 F c = F T Γ - 1 e ¯ ,
(45)

where

e ¯ = 1 N j = 1 N e j , e ¯ = 1 N j = 1 N e j ,
(46)

and Γ is the covariance matrix of the random variable e. This method assumes that the error is normally distributed, which is a reasonable assumption (see "Distributions"). Now with this formulation we can solve our problem using statistical knowledge of the problem.

Problems with Maximum Likelihood Estimate Approach

The covariance matrix, Γ, comes from the distribution of ej=Axjej=Axj. Then, Γ(e)=AΓ(x)AT.Γ(e)=AΓ(x)AT. In our problem where AR16×14AR16×14 and Γ(x)R14×14Γ(x)R14×14, Γ(e)Γ(e) is a singular matrix. In fact in any system where ARm×nARm×n where m>nm>n, Γ(e)Γ(e) is singular. This presents a problem when solving the system

F T Γ ( e ) - 1 F c = F T Γ ( e ) - 1 e ¯ . F T Γ ( e ) - 1 F c = F T Γ ( e ) - 1 e ¯ .
(47)

We can avoid the issue of invertibility of Γ(e)Γ(e) by instead of solving the problem e=Fce=Fc, we solve the problem x=F'cx=F'c where

F ' = A + F , A + = pseudoinverse of A . F ' = A + F , A + = pseudoinverse of A .
(48)

Our problem then is

F ' T Γ ( x ) - 1 F ' c = F ' T Γ ( x ) - 1 x ¯ , F ' T Γ ( x ) - 1 F ' c = F ' T Γ ( x ) - 1 x ¯ ,
(49)

where

x ¯ = 1 N j = 1 N x j . x ¯ = 1 N j = 1 N x j .
(50)

However we see that even if F is nonsingular F'=A+FF'=A+F is singular, and our problem is not avoided.

Finding Optimal Force Vector

Setting up our problem with the equation e=Fce=Fc (see "Rewriting the Problem") helps us see an important realization. Because F is diagonal,

e = e 1 e 2 e n - 1 e n = F 1 , 1 c 1 F 2 , 2 c 2 F n - 1 , n - 1 c n - 1 F n , n c n . e = e 1 e 2 e n - 1 e n = F 1 , 1 c 1 F 2 , 2 c 2 F n - 1 , n - 1 c n - 1 F n , n c n .
(51)

Thus assuming Fi,i0Fi,i0,

c i = e i F i , i or, k i = 1 c i = F i , i e i . c i = e i F i , i or, k i = 1 c i = F i , i e i .
(52)

This formulation helps us see another important realization. Namely the importance of Fi,i0Fi,i0. Recall that F=diag(A-Tf)F=diag(A-Tf). We have control over the f we choose, so it would be wise to choose an f so that A-TfA-Tf has no zero elements. This happens only if every row of A-TA-T is not orthogonal to f. We would like to choose an f that had as many zero elements as possible, but also is not orthogonal to any of the rows of A-TA-T. We can formulate our search for an optimal f into the optimization problem,

min f T f + γ 1 δ + min ( A - T f ) , min f T f + γ 1 δ + min ( A - T f ) ,
(53)

where γ and δ are adjustable parameters to get the best result. Because of the unsmooth nature of this minimization problem, this is best solved using MATLAB's fminsearch function.

In some networks we may have limitations on the f that we choose. For example in our original network

f 1 = f 6 = f 7 = f 8 = f 10 = f 13 = 0 f 1 = f 6 = f 7 = f 8 = f 10 = f 13 = 0
(54)

for every f that we choose. In the case that fj=0fj=0, we can rewrite our optimization problem to

min f ' T f ' + γ 1 δ + min ( A ' f ' ) , min f ' T f ' + γ 1 δ + min ( A ' f ' ) ,
(55)

where

A ' = A 1 - T A j - 1 - T A j + 1 - T A n - T , A i - T = the i th column of A - T , A ' = A 1 - T A j - 1 - T A j + 1 - T A n - T , A i - T = the i th column of A - T ,
(56)

and

f ' = f 1 f j - 1 f j + 1 f n . f ' = f 1 f j - 1 f j + 1 f n .
(57)

Using these methods, we found optimal f's using MATLAB's fminsearch function on simulated data from small simple spring networks. In every network the A matrix was square and invertible. We tested networks having 2, 4, 6, and 8 springs. In each case our initial force vector was a perfectly reasonable guess that gave incomplete solutions for the spring constants. However when we inputted that initial force vector into fminsearch, the output was a force vector that made physical sense and gave accurate solutions for the spring constants.

Further Research

There are two main areas of further research that directly follow the research presented in this paper. One is applying the maximum likelihood estimate approach to spring systems having a singular A matrix. We saw in "Problems with Maximum Likelihood Estimate Approach" the problems we ran into when we applied the statistical technique to a system with a singular A matrix. But this problem is expected due to the underdetermined nature of the problem (see "An Inverse Problem"). We dealt with that problem by stacking originally, so further research could be done in using the statistical approach and stacking.

The other area for further research is in optimizing the f vector (see "Finding Optimal Force Vector"). Further work could be done on tweaking the minimization problem so that the optimal f can be found for larger networks and networks with more restrictions on f. Also, if our system is underdetermined then even an optimal f won't give us a complete solution without stacking. So more research could be done in trying to find a set of optimal f's that, when stacked, give the most accurate and complete solutions.

Acknowledgements

We would like to thank Dr. Steven Cox and Dr. Mark Embree for their guidance, as well as Dr. Derek Hansen and Jeffrey Hokanson for their help and support.

This Connexions module describes work conducted as part of Rice University's VIGRE program, supported by National Science Foundation grant DMS–0739420.

See Also

Pfieffer, P. Pfieffer Applied Probability. Connexions.org.

Cox, S. CAAM 335: Course Notes. www.caam.rice.edu.http://www.caam.rice.edu/ cox/main.pdf

Cox, S., Embree, M., & Hokanson, J. CAAM 335: Lab Manual.

www.caam.rice.edu.http://www.caam.rice.edu/ caam335lab/labman.pdf

References

  1. Calvetti, Daniela and Somersalo, Erkki. (2007). Introduction to Bayesian Scientific Computing: Ten Lectures on Subjective Computing. New York: Springer.

Collection Navigation

Content actions

Download:

Collection as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Module as:

PDF | EPUB (?)

What is an EPUB file?

EPUB is an electronic book format that can be read on a variety of mobile devices.

Downloading to a reading device

For detailed instructions on how to download this content's EPUB to your specific device, click the "(?)" link.

| More downloads ...

Add:

Collection to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks

Module to:

My Favorites (?)

'My Favorites' is a special kind of lens which you can use to bookmark modules and collections. 'My Favorites' can only be seen by you, and collections saved in 'My Favorites' can remember the last module you were on. You need an account to use 'My Favorites'.

| A lens I own (?)

Definition of a lens

Lenses

A lens is a custom view of the content in the repository. You can think of it as a fancy kind of list that will let you see content through the eyes of organizations and people you trust.

What is in a lens?

Lens makers point to materials (modules and collections), creating a guide that includes their own comments and descriptive tags about the content.

Who can create a lens?

Any individual member, a community, or a respected organization.

What are tags? tag icon

Tags are descriptors added by lens makers to help label content, attaching a vocabulary that is meaningful in the context of the lens.

| External bookmarks