hw:final

# Differences

This shows you the differences between two versions of the page.

hw:final [2012/05/02 17:04]
neu
hw:final [2014/04/29 15:19] (current)
neu
Line 1: Line 1:
====== Physics 2660 Final Exam ====== ====== Physics 2660 Final Exam ======

-**Final Exam Due Tuesday ​May 8** +**Final Exam Due Thursday ​May 8**
-  * Online Submissions by 12:00 PM  +  * There are three questions, a total of 100 points.
-  * Paperwork: ​4:00 PM (Turn in code/​write-ups ​at the reception desk in the Physics Office)+  * Online Submissions ​**by 9:00 AM on Thursday May 8**
+  * Paperwork: ​
+      * Turn in code/​write-ups ​**in our normal lecture room Phys 204 between 9:00 and 10:00 AM on Thursday May 8**
+      * You must turn in a hard copy, otherwise I will not grade your exam.
+      * Prepare a stapled hard copy for each problem individually,​ and hand in the stack of hard copies bound together with a staple.

-Rules:+**Rules:**
* All students must work individually,​ no collaboration.   * All students must work individually,​ no collaboration.
-  * You can ask Prof. Neu questions via email or chat but the insight will be limited to **conceptual assistance or clarifications**. Pose all questions **before ​Friday 4 May 6pm**. +  * You can ask Prof. Neu questions via email or chat but the insight will be limited to **conceptual assistance or clarifications**. Pose all questions **before ​Monday 5 May at 7:00 PM**.
-  * TA hours will be held on **Thursday ​May  3:30-5:30pm** and **Friday 4 May 2:00-4:00pm**. These will not be standard lab-TA hours: students can come **individually** to ask questions to Aaron but then you will go and work on the questions on your own. Conceptual and clarifications on content only, no assistance in determining the solutions. ​ No examination of code will be allowed. +  * TA hours will be held on **Thursday ​May 5:30-7:00 PM** and **Monday 5 May 5:30-7:00 PM**. These will not be standard lab-TA hours: students can come **individually** to ask questions to the TA but then you will go and work on the questions on your own. Conceptual ​questions ​and clarifications on content only, no assistance in determining the solutions. ​ No examination of code will be allowed.
-  * Allowed resources: class texts, class notes, class web page (including solutions), and only web links provided directly from the class web pages. ​ You can use functions available in the standard C libraries we have used in class/​lab/​homeworks. +  ​* **Allowed resources:** class texts, class notes, class web page (including solutions), and web links provided directly from the class web pages. ​ You can use functions available in the standard C libraries we have used in class/​lab/​homeworks.
-  * Your descriptions/​justifications are important for these problems, some questions will be open to interpretation. +  * Your descriptions/​justifications are important for these problems, some questions will be open to interpretation. You should provide descriptions accompanying your solutions where needed
-  * Pledge your source code/reports.+  * Pledge your source code and hard copies.

-:?: Email/chat any requests for problem clarifications to Prof. Neu by Friday, May 46pm This may be followed by a single FAQ email covering appropriate questions. **Watch for this email.** ​ After this, there will be no more discussion during the exam.+:?: Email/chat any requests for problem clarifications to Prof. Neu by Monday, May 57:00 PMAfter this period, watch for a single FAQ email covering appropriate questions. **Watch for this email.** ​ After this, there will be no more discussion during the exam period.

Tips: Tips:
+  * Not all questions are worth the same amount of points, so plan accordingly.
* Don't jump into coding before you have a good idea about how you want to write your program.  ​   * Don't jump into coding before you have a good idea about how you want to write your program.  ​
* Plan your work carefully. ​     * Plan your work carefully. ​
-  * Submit documented partial solutions if you don't have full solutions to the problems - we'll give credit for all reasonable work towards your solutions.+  * Submit documented partial solutions if you don't have full solutions to the problems - we'll give credit for reasonable ​yet incomplete ​work towards your solutions.

-===== 1. Chutes and Ladders (50) =====+===== 1. Making change (25 points) =====
+
+Did you know there are 292 ways to make change for $1? See [[http://​www.maa.org/​frank-morgans-math-chat-293-ways-to-make-change-for-a-dollar|this page]] for a discussion of this particularly interesting mathematical problem. ​ + +One can determine the various ways to "make change"​ algebraically,​ as discussed in the link. The mathematics can be cumbersome however. Instead we can write a computer algorithm to do the job for us. + +Assume the following denominations are available:​ + * 1 cent pennies + * 5 cent nickels + * 10 cent dimes + * 25 cent quarters + * 50 cent half-dollars + *$1 bills (do not distinguish between dollar bills and dollar coins, just think of them as the same thing)
+  * $2 bills + *$5 bills
+
+Note: As mentioned at [[http://​www.maa.org/​frank-morgans-math-chat-293-ways-to-make-change-for-a-dollar|the link above]], if you consider the **dollar bill** and the **dollar coin** to be distinct then the number of manners in which you can "make change"​ for $1 is actually 293, not 292. But we will not be making that distinction here. + +Your code, called <​username>​_makechange.cpp,​ should be capable of accepting any one of these monetary denominations listed above and calculating the different ways that change can be made for that denomination. + +Your code should accept the following command line argument: + * the denomination'​s value, in units of$, that you want to make change for.
+
+If no value is given, then the code should produce the output for all possible denominations listed above. For instance, if you want to calculate the number of ways to make change for $1, you execute the program this way: + + <​username>​_makechange 1.0 + +If you want to calculate the number of ways to make change for 25 cents, you execute the program this way: + + <​username>​_makechange 0.25 + +If you want to calculate the number of ways to make change for all the different denominations above, you execute the program this way: + + <​username>​_makechange + +Protection should be put into the code to prevent a user from trying to calculate the number of change combinations for any value not in the above list; combinations of denominations (like trying to do this for some random value like$13.07) should not be considered.
+
+The output should be a simple table of the format
+<​code>​
+Denomination ​     Number of ways to make change
+</​code>​
+
+...with the number of lines appropriate for the manner the user executes the program (ie, either for a single value or for all the denominations). The correct answers are given for you at [[http://​www.maa.org/​frank-morgans-math-chat-293-ways-to-make-change-for-a-dollar|the link above]]. Your code however should determine these numbers directly through a proper algorithm.
+
+The grading for this question will proceed in this way:
+  * 10 points for general code structure: compilation without warnings, adhering to the format of the command line arguments, protection in place against unexpected inputs, produces output as expected, etc
+  * 1 points each for getting the correct answer for 0.01, 0.05, 0.10 and 0.25
+  * 2 point each for getting the correct output for 0.50, 1.00 and 2.00
+  * 5 points for getting the correct output for 5.00
+
+
+!-
+===== 2. Shut the Box (25 points) =====
+
+{{:​hw:​shut-the-box.jpg|}}
+
+Variants of the game [[http://​en.wikipedia.org/​wiki/​Shut_the_Box|Shut the Box]] have been around since the 12th century. Today it goes by many names and some version of it is played on every continent.
+
+Shut the Box is a simple game played with two dice and some number of numbered tiles. The point of the game is to take successive turns rolling the two dice and eliminating numbered tiles, turning them over as the sum is achieved. Any number of tiles can be turned over after a given roll as long as the sum matches the sum of the dice from the roll. The version of Shut the Box we will consider here has tiles numbered 1-12.
+
+Play starts with all tiles turned up. As an example, consider rolling on the first roll a 3 and a 4. Their sum is 7. In this case, one could either flip down the tiles for 3 and 4, or 2 and 5, or 1 and 6, or 1 and 2 and 4, or any combination whose sum is 7. One could even flip down the **one** tile for the sum of the numbers, 7. The player'​s turn continues with more rolls. Later in the turn, if one rolls a 2 and a 3, but 1, 2, 3, 4 and 5 are already turned over, then the turn is over, since none of the remaining tiles can be summed to match the sum of the dice rolled. If one rolls a 1 and a 4, but 1, 2, 3, and 5 are turned over, then one might be tempted to turn over the 4 BUT in fact the turn is over. Play proceeds until one cannot flip any more tiles. A final score for a player'​s turn is determined by summing the values of the tiles that remain. Clearly, lower scores are optimal.
+
+The game is best played in a group of people, each person taking turns to acieve the best/lowest score, claiming as their prize **ultimate glory** or, in some cases, **a small amount of previously agreed-upon stakes**.
+
+For this portion of the final exam, you will run thousands of single-player "​pseudogames"​ of Shut the Box.
+
+Write your simulation, call it <​username>​_shut,​ such that it accepts the following command line arguments:​
+  * strategy number, either 1 or 2 (see below)
+  * number of pseudogames to run
+
+**Question:​** Is it more likely to shut-the-box depending on the **strategy** one employs in the game? There are two general schools of thought:
+  * **Strategy 1, "​More-tiles-first":​** In every case where there is a choice, prefer to flip down the most number of tiles that equal the sum instead of smaller combinations of tiles. ​ In cases where there are multiple choices with the same number of tiles to flip down, choose the tile combination that has the highest individual value among it.
+  * **Strategy 2, "​Fewer-tiles-first":​** In every case where there is a choice, prefer to flip down the smallest number of tiles that equal the sum instead of larger combinations of tiles. In cases where there are multiple choices with the same number of tiles to flip down, choose the tile combination that has the highest individual value among it.
+
+The output of your program should print to screen:
+  - The worst (highest) possible score
+  - An estimate of the probability of achieving the worst possible score
+  - An estimate of the probability of "​shutting the box", ie, in a given turn what is the probability of rolling the dice such that you turn down all the tiles over the course of a turn, effectively achieving a score of 0.
+
+**Make sure each piece of information is labeled properly and that you are clear on which of the two strategies you are testing**. These answers should be determined from your pseudogames.
+
+-!
+
+===== 2. Chutes and Ladders (35 points) =====

-Chutes and Ladders is a popular children'​s game in the US (alternatively known as Snakes and Ladders in the UK). It has simple rules and is good teaching how to count and understand simple directions.+Chutes and Ladders ​(C&​L) ​is a popular children'​s game in the US (alternatively known as Snakes and Ladders in the UK). It has simple rules and is good for teaching ​children ​how to count and understand simple directions.

-C&L is a classic land-accumulation game in which the objective is to move forward, covering a set amount of 100 units of space. The victor is the player who does this first, in the least amount of moves. ​Perturbations exist in one's path that can alter one's position in a non-linear way: +C&L is a classic land-accumulation game in which the objective is to move forward, covering a set amount of 100 units of space. The victor is the player who does this first, ie., in the least amount of moves.
-  * Backwards – falling down a slide/chute OR +
-  * Forwards – climbing up a ladder +Each player spins a fair 6-segment ​spinner or rolls a fair single 6-sided die, allowing a movement forward of between 1 and 6 units. Here is what the standard board looks like:
-Each player spins a fair spinner or rolls a fair single 6-sided die, allowing a movement forward of between 1 and 6 units. Here is what the standard board looks like:+

-Play starts ​**off the board** and **proceeds with the first roll**, each player taking a single turn. (2 May: There was an error in the original instructions regarding the starting ​position.)+All players start **off the board** and **play proceeds with the first roll**, each player taking a single turn. Perturbations exist in one's path that can alter one'​s ​position ​in a non-linear way:
+  * Backwards – falling down a slide/chute OR
+  * Forwards – climbing up a ladder
+In this game, one **cannot climb up a slide, nor can one climb down a ladder.**

Besides being a simple game for children, C&L is a nice, simple system that can easily be simulated using a Monte Carlo technique: Besides being a simple game for children, C&L is a nice, simple system that can easily be simulated using a Monte Carlo technique:
Line 43: Line 135:
Here you will simulate the game of C&L with the goal of answering some simple questions about the game play. Use the standard board layout that is displayed above (some online versions might be different, old versions might be different – it is your responsibility to make sure to use the version I expect you to use). NB: Assume the “winner” is the first with ≥100, not ==100. To answer the questions below **you will need to simulate thousands of C&L matches -- pseudomatches!** Here you will simulate the game of C&L with the goal of answering some simple questions about the game play. Use the standard board layout that is displayed above (some online versions might be different, old versions might be different – it is your responsibility to make sure to use the version I expect you to use). NB: Assume the “winner” is the first with ≥100, not ==100. To answer the questions below **you will need to simulate thousands of C&L matches -- pseudomatches!**

-Write your simulation, call it clsim, such that it accepts the following command line arguments:+Write your simulation, call it <​username>​_clsim, such that it accepts the following command line arguments:
* number of players   * number of players
* number of pseudomatches   * number of pseudomatches
+  * starting position for all players (take '​0'​ to mean off the board)

+Have your code output to screen the following information:​
+    - Number of players
+    - Number of pseudomatches
+    - Average game length in number of moves among the pseudomatches run
+    - Shortest game in number of moves among the pseudomatches run
+    - Longest game in number of moves among the pseudomatches run

-Questions:​ +**Make sure each piece of information is labeled properly**, and do not produce any other text output ​to the screen.
-  - On average, do games with four players last longer than those with two players? Why or why not?  ​**Here I mean total game duration, ie, total number ​of moves.** +
-  - In 10k pseudomatcheswhat is the shortest 4-player game in terms of total moves for all players? Shortest 2-player game? +
-  - In 10k pseudomatches,​ what is the longest 4-player game?  +
-  - Assuming 3.5 seconds per move per player, what is the average duration of a 4-player game? 2-player game? +
-  - The optimal location on the board is defined as the position from which the most-frequent path to winning is shortest. Clearly this is someplace between 95 and 99. However, what is the optimal location between squares 21 and 60? NB: this can be simulated using many single-player pseudomatches,​ starting from different locations in positions [21,60] and recording the number of moves needed ​to reach 100 in each match. A histogram of the number of moves required for winning is useful: the position with the distribution with the smallest mean is the most optimal.+

-===== 2. Signal and Background (50) =====+**What is the most optimal position for winning from the middle of the board in the shortest number of moves?**
+
+The optimal location on the board is defined as the position from which the most-frequent path to winning is shortest. Clearly this is someplace between 95 and 99. However, what is the optimal location between squares 31 and 70? Hint: this can be simulated using many single-player pseudomatches,​ starting from different locations in positions [31,70] and recording the number of moves needed to reach 100 in each match.
+
+Besides the textual output required above, your code can also produce some histograms designed for you to answer this question. Note however that I do not want histograms produced by default -- for instance when I am grading I do not want 40 hists popping up on my screen! You should make them on your own in order to answer the question and turn off that feature in the code you hand in.
+
+Hand in your code electronically **[<​username>​_clsim.cpp]**. ​ Make sure to answer the "​optimal location"​ question in the comments area near the beginning of your code. Optional: If you want to include **one histogram file (at most one!)** as support for your answer, submit it as <​username>​_clplots.pdf.
+
+
+===== 3. Signal and Background (40 points) =====

In particle physics, we build massive high-energy particle colliders in the hopes of harnessing the energy in these collisions to produce exotic forms of matter. These processes are rare, so we design our accelerator to collide particles together millions of times a second in the hopes of capturing these elusive new entities. When created, these new forms of matter decay to more common familiar particles; these mundane decay products then fly out from the collision region and interact with our particle detector. ​ This massive device, often bigger than a good-sized office building, collects all the signals for each collision, called an "​event"​ in our vocabulary; each collision event is characterized by a collection of reconstructed observables describing the momenta, energies, trajectories and multiplicities of the outgoing particles. For example, the file In particle physics, we build massive high-energy particle colliders in the hopes of harnessing the energy in these collisions to produce exotic forms of matter. These processes are rare, so we design our accelerator to collide particles together millions of times a second in the hopes of capturing these elusive new entities. When created, these new forms of matter decay to more common familiar particles; these mundane decay products then fly out from the collision region and interact with our particle detector. ​ This massive device, often bigger than a good-sized office building, collects all the signals for each collision, called an "​event"​ in our vocabulary; each collision event is characterized by a collection of reconstructed observables describing the momenta, energies, trajectories and multiplicities of the outgoing particles. For example, the file
Line 88: Line 191:

Start by **looking at the data** and doing the following: Start by **looking at the data** and doing the following:
-  * Write some code to run over the data events and fill nine histograms, one for each observable. Use bins of appropriate width such that the resulting plot is not too jagged bin-to-bin; for example, for Ht use bins that are 20 units wide spanning the range [0.0, 1000.0]. Several bins will be empty but the bins in the core of the distribution will be fairly full. +  * Write some code to run over the data events and fill a histogram ​for just one of the observables,​ Ht. Use bins of appropriate width such that the resulting plot is not too jagged bin-to-bin; for example, ​a good choice ​for Ht would be to use bins that are 20 units wide spanning the range [0.0, 1000.0]. Several bins will be empty but the bins in the core of the distribution will be fairly full.
-  * Produce a plot for each observable, plotting ​the data with black points.  +  * Plot the data with black points. Make sure the plot has all the desired elements of a good histogram
-  * Hand in these plots electronically as a single ​pdf **[username_data.pdf]**. ​There are several ways one can do this; a simple, inelegant-yet-effective way would be to insert each of your output png/jpeg files into a PowerPoint or Word document and subsequently save this collection of plots as a single pdf. **Then, secure-copy this file to galileo, and submit it electronically just like any other file you have submitted this semester.**+  * Hand in this Ht plot for data electronically as a pdf **[<​username>​_data.pdf]**. ​

Next we need to **inspect our signal and background models**: Next we need to **inspect our signal and background models**:
-  * Use similar code as in the above to make histograms ​for your simulated signal events ​ However, here normalize each of the nine histograms ​to the same number of events as observed in the data. Do this by first normalizing to unit area -- meaning scale the final contents of each histogram such that the area under each is 1.00. Then scale again, this time to match the area under the data curve (hint: the area under the data curve should be the same as the number of observed data events). Use the same binning as you used for your data histograms in each variable.   +  * Use similar code as in the above to make a histogram ​for the simulated signal events ​in the variable Ht.  ​However,​ here normalize each of the histogram ​to the same number of events as observed in the data. Do this by first normalizing to unit area -- meaning ​**scale the contents of each histogram** such that the area under the curve is //1.00//**Then scale again,** this time to match //the area under the data curve// (hint: the area under the data curve should be the same as the number of observed data events). Use the same binning as you used for your data histogram.
-  * Use the same code to run over the simulated background events, filling ​nine similar normalized-to-the-data ​histograms. +  * Use this same code to run over the simulated background events, filling ​similar normalized-to-the-data ​histogram ​for the Ht variable
-  * Produce a plot for each observable, overlaying ​the simulated signal and simulated background. Make the signal a solid red histogram, and the background an outline-only blue histogram.  +  * Hand in these plots electronically as pdfs as well **[<​username>​_SigShape.pdf]** and **[<​username>​_BkgShape.pdf]**. ​
-  * Hand in these plots electronically as a single ​pdf **[username_SBshapes.pdf]**. ​+

Now we need to determine how much signal and background the data favors. We will therefore perform a two-component "​fit"​ in one of the variables, Ht. The two components are signal and background; we seek for the sum of signal and background to match as closely as possible the data across all the bins of the Ht histogram. Ht is common shorthand in the field of collider physics; it typically represents the sum of the energies of each of the reconstructed objects in the event. From your signal vs. background shapes, it should be apparent that the signature of our signal is typically more energetic (ie, favors higher Ht values) than the background. ​ Now we need to determine how much signal and background the data favors. We will therefore perform a two-component "​fit"​ in one of the variables, Ht. The two components are signal and background; we seek for the sum of signal and background to match as closely as possible the data across all the bins of the Ht histogram. Ht is common shorthand in the field of collider physics; it typically represents the sum of the energies of each of the reconstructed objects in the event. From your signal vs. background shapes, it should be apparent that the signature of our signal is typically more energetic (ie, favors higher Ht values) than the background. ​

-Back to our fit. There are several ways to do this. Many are computationally expensive and can easily become intractable. We will do something simple here using a chi^2 fit (see [[http://​faculty.virginia.edu/​comp-phys/​phys2660/​wiki/​doku.php?​id=notes|lecture ​10 for a review]]. Define a summed chi^2 over all of the bins of the Ht distribution:​+Back to our fit. There are several ways to do this. Many are computationally expensive and can easily become intractable. We will do something simple here using a chi^2 fit (see [[http://​faculty.virginia.edu/​comp-phys/​phys2660/​wiki/​doku.php?​id=notes|lecture ​11 for a review]]). Define a summed chi^2 over all of the bins of the Ht distribution:​

\\ <​math>​\large \chi^2 = \sum_{i=1}^{k} \frac{(n_i - N_{i}^{pred})^2}{\sigma_i^2}</​math>​ \\ <​math>​\large \chi^2 = \sum_{i=1}^{k} \frac{(n_i - N_{i}^{pred})^2}{\sigma_i^2}</​math>​

where: where:
-  * <​math>​ n_i </​math>​ is the number of observed events in the data in bin <​math>​ i</​math>​+  * <​math>​ n_i </​math>​ is the number of observed events in the data in bin <​math>​ i </​math>​
* <​math>​ \sigma_i </​math>​ is the statistical uncertainty on the number of observed events in the data in bin <​math>​ i</​math>​   * <​math>​ \sigma_i </​math>​ is the statistical uncertainty on the number of observed events in the data in bin <​math>​ i</​math>​
* <​math>​ N_{i}^{pred} </​math>​ is the sum of the predicted signal and background contributions in bin <​math>​ i</​math>​ (see below for more)   * <​math>​ N_{i}^{pred} </​math>​ is the sum of the predicted signal and background contributions in bin <​math>​ i</​math>​ (see below for more)
Line 125: Line 227:
the problem becomes much easier. One can simply scan different values of <​math>​ f_s </​math>​ and <​math>​ f_b </​math>,​ consistent with this simple constraint, and find where the minimum of the chi^2 occurs. the problem becomes much easier. One can simply scan different values of <​math>​ f_s </​math>​ and <​math>​ f_b </​math>,​ consistent with this simple constraint, and find where the minimum of the chi^2 occurs.

-Write a piece of code that reads in the Ht histograms from signal, background and data and executes this simple chi^2 fit. You can do this in a crude manner (scanning all possible values) or an intelligent manner (starting with say a 50/50 mix and choosing subsequent pairs such the chi^2 is reduced, iterating until the best fractions are found). ​More credit will be give for intelligent solutions. Make sure that the code reports ​to the command line the preferred ​fractions of signal and background, <​math>​ f_s </​math>​ and <​math>​ f_b </​math>,​ that the data prefers. +Write a piece of code that reads in the Ht histograms from signal, background and data and executes this simple chi^2 fit. You can do this in a crude manner (scanning all possible values) or an intelligent manner (starting with say a 50/50 mix and choosing subsequent pairs such the chi^2 is reduced, iterating until the best fractions are found). ​
-Hand in this code electronically **[username_chi2fit.cpp]**.+
+The code should report ​to the screen:
+  - The chi2 value when assuming a 50/50 mix of S and B.
+  - The fractions of signal and background, <​math>​ f_s </​math>​ and <​math>​ f_b </​math>,​ that the data prefers, including the final chi2 value achieved
+
+Make sure these pieces of output are properly labeled when printed to screen. No other output should be produced.
+
+Hand in this code electronically **[<​username>​_chi2fit.cpp]**.