====== Physics 2660 Final Exam ======

**Final Exam Due Thursday May 10**
  * There are three questions, for a total of 100 points.
  * Online submissions **by 9:00 AM on Thursday May 10**
  * Paperwork:
      * Turn in code/write-ups **in our normal lab room Phys 022-C between 8:00 and 10:00 AM on Thursday May 10**
      * You must turn in a hard copy of all required files; otherwise I will not grade your exam.
      * Prepare a stapled hard copy for each problem individually, and hand in the stack of hard copies bound together with a paper clip / binder clip.

**Rules:**
  * All students must work individually; no collaboration.
  * You can ask questions via email, chat, or Piazza, but the insight will be limited to **conceptual assistance or clarifications**. Pose all questions **before Monday 7 May at 8:00 PM**.
  * TA hours will be held on **Saturday 6-8pm**, **Sunday 11am-1pm**, and **Monday 1-3pm**. These will not be standard lab-TA hours: students may come **individually** to ask the TA questions, but then you will go and work on the questions on your own. Conceptual questions and clarifications on content only; no assistance in determining the solutions. No examination of code will be allowed.
  * **Allowed resources:** class texts, class notes, the class web page (including solutions), and web links provided directly from the class web pages. You can use functions available in the standard C libraries we have used in class/lab/homeworks.
  * Your descriptions/justifications are important for these problems; some questions will be open to interpretation. You should provide descriptions accompanying your solutions where needed.
  * If your code produces huge amounts of incomprehensible text, **you will lose lots of points**. Such output is impossible to grade, so make sure the code produces the output described in the statement of the problem **but not much beyond that.**
  * Pledge your source code and hard copies.

:?: Email/chat/Piazza any requests for problem clarifications **by Monday, May 7, 8:00 PM.** After this period, watch for a **single FAQ email** covering appropriate questions with any necessary clarifications. After that, there will be no more discussion during the exam period.

Tips:
  * Not all questions are worth the same number of points, so plan accordingly.
  * Don't jump into coding before you have a good idea about how you want to write your program.
  * Plan your work carefully.
  * Submit documented partial solutions if you don't have full solutions to the problems -- we'll give partial credit for reasonable yet incomplete work towards your solutions.

===== 1. Projectile Motion Animation (25 points) =====

In this problem, we will revisit for the final time the topic of projectile motion, here synthesizing it with what we have learned about animating plots in gnuplot.

==== Part A (15 points) ====
Building upon your work on projectile motion in [[hw:hw11|HW 11]] and performing animations in gnuplot in [[hw:hw10|HW 10]], write an animation of projectiles launched at 5 degree increments in the range [5,85] degrees. The program should therefore launch 17 different projectiles, each at a different launch angle. Here assume that the projectile is fired from the surface of the Earth and that there is no air resistance.

Call your code [**username_launches.cpp**]. You can hardwire the loop through the launch angles -- no need to assume a different spectrum or range of launch angles. The trajectory of each projectile should be represented by a series of points, each point separated by a visually pleasing amount of space (ie, not so many points that the code takes more than 30 seconds to complete, but a sufficient number such that one can visually associate adjacent points with a given trajectory). **Update 7 May: Assume an initial launch speed of 100 m/s.**

Ideally each trajectory or range of points in flight would have a different color to aid in the visualization, but this is not a major point of the problem -- **do not waste hours of time trying to figure out how to do the colors**. However, to get full credit, the trajectories must be distinguishable through coloring.

The graphics should be created and displayed to screen. Make sure to label the axes properly; ideally, each trajectory should also be marked with its launch angle -- although again, this is not a major point of the problem -- **do not waste hours of time trying to figure out how to do the labels**. However, to get full credit, all labels must be present.

We are going for something here that is both accurate and visually pleasing, so there will be a small amount of credit assigned to this problem for an animation that is visually effective. This will be subjective -- but the bar will be low.

The code needs to be able to complete the animation of all trajectories in 30 sec or less, otherwise you will lose a significant fraction of the available points.

To see what the final product could look like, [[https://www.reddit.com/r/dataisbeautiful/comments/7vktxh/projectile_motion_at_complementary_angles_oc/|check out this reasonable version posted to reddit]]. I say "reasonable" here since the x-axis really should be labeled with the range value for each projectile instead of the launch angle. But the concept and execution are otherwise great.

**Edit 6 May:**
NB: To produce the animation, you should consult animate_answer.cpp from the solutions to HW10. You will likely need to submit your own version of this animation program to get the animation to work. Submit this file as **username_animation.cpp** and include explicit instructions in comments in the header portion of this file, or in the header portion of **username_launches.cpp**.

==== Part B (10 points) ====

Repeat Part A but for projectiles in the presence of air resistance. **Update 7 May: Assume an initial launch speed of 100 m/s and k/m=0.2.** Call your code [**username_resistlaunches.cpp**]. All other instructions from Part A apply here. **Edit 6 May:** Include animation instructions in the header portion of **username_animation.cpp** or **username_resistlaunches.cpp**.

To see what the final product could look like, [[https://www.reddit.com/r/dataisbeautiful/comments/7vrhec/projectile_motion_with_air_resistance_oc|here is another nice example from reddit]].

===== 2. Fourier Series (40 points) =====

In the early 19th century, French mathematician and physicist Jean-Baptiste Joseph Fourier was studying the heat equation -- a partial differential equation that describes heat dissipation in a given material, a problem for which, at the time, no general solution existed. In the examination of this problem, Fourier happened upon the fact that an arbitrary continuous periodic function could be expressed exactly as the sum of trigonometric functions -- simple sines and cosines.

The significance of this result cannot be overstated. This technique is applicable in a wide variety of problems involving often-nasty analytic functions -- where the first step often is to abandon the nasty function in favor of the relatively simple trig functions. This technique is used everywhere: electrical engineering, signal analysis, quantum mechanics, encryption, economics, market modeling, acoustics, image processing, sound analysis, etc.

The general form of the Fourier series for a given function can be found in many sources; a couple of good online ones are from [[http://mathworld.wolfram.com/FourierSeries.html|Wolfram MathWorld]] and [[https://en.wikipedia.org/wiki/Fourier_series|Wikipedia]].

In general, if a function $f(x)$ is continuous and integrable in the interval $[x_0,x_0+P]$ and periodic with period $P$, then $f(x)$ can be approximated by a sum of $N$ terms of the form:

$f_N(x) = \frac{a_0}{2} + \sum_{n=1}^{N} \Big(a_n \cdot \cos(\frac{2\pi n x}{P}) + b_n \cdot \sin(\frac{2\pi n x}{P})\Big)$

where $N$ is a non-negative integer and

$a_n = \frac{2}{P} \int_{x_0}^{x_0+P} f(x) \cos(\frac{2\pi n x}{P})dx$

$b_n = \frac{2}{P} \int_{x_0}^{x_0+P} f(x) \sin(\frac{2\pi n x}{P})dx$

$f_N(x)$ approximates $f(x)$, and the accuracy of the approximation increases with $N$, meaning

$\lim_{N \rightarrow \infty} f_N(x) = f(x)$

Here we will write a program that performs a Fourier decomposition of a few interesting periodic functions. For the purposes of this problem, let's only consider real-valued functions in 1D. Also, let's consider functions that are periodic in the interval $x \in [-\pi,\pi]$ and have vertical range restricted to $[-1.0,+1.0]$.

Under these assumptions, write a program [**username_fourier.cpp**] that will accept such a function on the interval $x \in [-\pi,\pi]$ and calculate its Fourier decomposition up to some value $N$ (see below for the definition of $N$ for each part).

There are two approaches one can use for this problem:
  - **Analytic approach:** calculate $a_n$ and $b_n$ from the expressions above; or
  - **Brute force approach:** scan values of $a_n$ and $b_n$ and find the best values in a 2-parameter optimization problem.

It is up to you which approach you choose to implement in your program. You have an additional option as well: you can either write specific versions of this code for each use case below, or you can write one version of the code that has a general function interface that performs this task on an arbitrary function (see, for example, our implementation of the [[hw:hw07|Trapezoid Rule for integration]] or the [[labs:lab07|Finite Difference Method for differentiation]], in which we developed general function interfaces that employed pointers to functions).

Your program should produce the following output:
  * Print to screen which technique you used, **Analytic** or **Brute Force**.
  * Print out a table of values of $a_n$ and $b_n$ for all $n=1$ to $N$ considered in the problem. Make sure this table is clear.
  * Use gnuplot to plot the input function in the range $[-2\pi,2\pi]$ (**note the different range for the plots**) and overlay the final Fourier series up to order $N$ in this range. **Update, 3 May: This means the sum of the N terms is what should be plotted and overlaid over the original function.** Make the original function and the Fourier series different colors -- **do not waste hours of time trying to figure out how to do the colors**. However, to get full credit, the original function and the Fourier series approximation must be distinguishable through coloring.

Make sure not to produce more output than this; excessive output makes the program difficult to understand/grade. Also, the program should be able to execute its task in under 30 seconds. If the program requires more time than this, many points will be taken off. Code that takes more than 1 minute to run will be ignored.

Note: Before diving into the tasks below, you might want to seek some inspiration from this [[http://www.falstad.com/fourier/|brilliant little online Fourier decomposition utility]], which does this for the user in real time via a web interface and makes very nice little displays of the various components.

==== Part A (10 points) ====

In this part of the problem, use your program to determine the Fourier decomposition up to $N=5$ for a specific function:

$f(x) = \sin(x)$

for $x \in [-\pi,\pi]$. You are probably saying to yourself, "Prof. Neu, you ding-dong! This is trivial! The Fourier decomposition for this function is just the function itself!" Yes, this is true -- but often, when developing a complicated piece of code, it is wise to run it first on a problem that is easily solvable, so that you can ensure that it does the right things under simple, understood conditions.

For this part, you can hardwire the function to what is required; if you do this, call the program [**username_fourierA.cpp**]. If you write a general function interface that allows for any function to be called, submit [**username_fourier.cpp**] and make sure to **include explicit instructions in the header portion of your .cpp file** describing how one should run it to get the desired output here.

==== Part B (10 points) ====

Here, use your program to determine the Fourier decomposition up to $N=7$ of:

$f(x) = \frac{x}{\pi}$

for $x \in [-\pi,\pi]$. Periodicity of this function is achieved by defining

$f(x + 2k\pi) = f(x)$ for all integers $k$.

This is the well-known **sawtooth function**.

Again for this part, you can hardwire the function to what is required; if you do this, call the program [**username_fourierB.cpp**]. If you write a general function interface that allows for any function to be called, submit [**username_fourier.cpp**] and make sure to **include explicit instructions in the header portion of your .cpp file** describing how one should run it to get the desired output here.

==== Part C (10 points) ====

Here, use your program to determine the Fourier decomposition up to $N=9$ of:

\begin{align*}
f(x) = \begin{cases} -1 & \text{for } -\pi \le x < 0 \\ +1 & \text{for } 0 \le x \le \pi \end{cases}
\end{align*}

for $x \in [-\pi,\pi]$. Periodicity of this function is achieved by defining

$f(x + 2k\pi) = f(x)$ for all integers $k$.

This is the well-known **square wave**.

Again for this part, you can hardwire the function to what is required; if you do this, call the program [**username_fourierC.cpp**]. If you write a general function interface that allows for any function to be called, submit [**username_fourier.cpp**] and make sure to **include explicit instructions in the header portion of your .cpp file** describing how one should run it to get the desired output here.

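
Whichever approach you implement, it can help to sanity-check your coefficient table against the textbook result for the unit-amplitude square wave (stated here under the assumption that $f(x) = -1$ on $(-\pi,0)$ and $+1$ on $(0,\pi)$):

$a_n = 0$ for all $n$, while $b_n = \frac{4}{n\pi}$ for odd $n$ and $b_n = 0$ for even $n$,

so the $N=9$ series contains only the $n = 1, 3, 5, 7, 9$ sine terms.
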
==== Part D (10 points) ====

Here, use your program to determine the Fourier decomposition up to $N=5$ of:

**(Update May 7:** small change in function)

$f(x) = 2\big(\frac{x}{\pi}\big)^2 - 1$

for $x \in [-\pi,\pi]$. Periodicity of this function is achieved by defining

$f(x + 2k\pi) = f(x)$ for all integers $k$.

This is the **periodic parabolic well**.

Again for this part, you can hardwire the function to what is required; if you do this, call the program [**username_fourierD.cpp**]. If you write a general function interface that allows for any function to be called, submit [**username_fourier.cpp**] and make sure to **include explicit instructions in the header portion of your .cpp file** describing how one should run it to get the desired output here.

===== 3. Identifying a Gravitational Wave Signal (35 points) =====

{{hw:blackhole_merger.jpeg?300 |Black hole merger}}The recent observation of the first gravitational wave by the [[https://www.ligo.caltech.edu/|LIGO experiment]] could quite possibly be the most important discovery in physics in our lifetime.

This first-observed gravitational wave, called **GW150914**, was caused by the violent collision of two black holes, with masses of 29 and 36 times that of our sun, a billion years in the past. The collision was so energetic that it shook spacetime itself -- and these ripples in spacetime have been propagating outward, dissipating their energy over the eons, seemingly forgotten to the cosmos -- until they were detected on Earth by humans, using two multi-kilometer-long laser interferometers separated by thousands of miles, through the correlated stretching and compression of their devices at the level of the width of a single proton. //If that is not amazing to you, I don't know what could be.//

The observation was the result of a magnificently designed and impossibly precise instrument -- and **careful signal extraction techniques, ie, scientific computing at its best.**

{{hw:Inspiral-merger-ringdown.jpg?300 |Inspiral, merger, ringdown}}In this problem we will explore how a gravitational wave signal such as GW150914 can be extracted from a noisy background. Start by watching this brief [[https://www.nytimes.com/video/science/100000004200661/what-are-gravitational-waves-ligo-black-holes.html|movie]], looking at [[https://www.ligo.caltech.edu/page/ligo-gw-interferometer|these links]] on how the interferometers were designed and the data collected, and reading this short [[https://www.ligo.caltech.edu/image/ligo20160211a|piece]] on the signal extraction.

There are two LIGO interferometer instruments: one in Hanford, WA, and one in Livingston, LA. The main measurement in each interferometer system is that of **strain** as a function of **time**. Strain is a measure of the bi-directional stretching the interferometer baseline arms would endure during a gravitational wave. Hence, strain vs. time distributions are collected from each of the LIGO interferometers in the search for gravitational waves.

These long-baseline interferometers are separated by thousands of miles in order to be sensitive to the incredibly tiny signals from gravitational waves, which manifest themselves in a characteristic shape in the strain vs. time distribution, as pictured [[https://losc.ligo.org/events/GW150914/|here]]. However, each instrument is subject to thermal noise that cannot be avoided; this thermal noise affects, for instance, the front face of each mirror, ever-so-slightly changing the path length each light beam traverses. This effect is similar to what happens when a gravitational wave interacts with the apparatus -- but **much more irregular**. Thermal noise looks quite random, whereas the gravitational wave signal has a pronounced yet brief periodic structure, as the different eras of the black hole collision **(inspiral, merger, ringdown)** affect the gravitational wave differently.

{{hw:ligo20160211a.jpg?300 |Signal extraction}}Hence the strain measurements will bounce around quite a bit in each instrument in any time interval, as the interferometers will mostly be collecting noise data -- gravitational waves are quite rare. Crucial, however, is the following: the thermal noise in each of the Hanford and Livingston interferometers is **independent**, yet an authentic gravitational wave signal should appear in **both interferometers simultaneously**.

Your task begins by obtaining the following strain vs. time data from the Hanford and Livingston interferometers:

<code>wget http://faculty.virginia.edu/comp-phys/phys2660/html/homework/final/hanford_waveform_complete.dat
wget http://faculty.virginia.edu/comp-phys/phys2660/html/homework/final/livingston_waveform_complete.dat
</code>

The files have the following format:

<code># time (s) strain * 1.e21
0.000000 -0.057048
0.000061 -0.006558
0.000122 -0.064206
0.000183 -0.066297
0.000244 -0.114568
0.000305 -0.092682
0.000366 -0.122600
0.000427 -0.206769
0.000488 -0.206988
0.000549 -0.194172
etc
</code>

The first column contains time readings in seconds; the second column contains the respective strain measurements. The files cover the same time interval recorded at both interferometers; the clock skew between the two devices has already been accounted for (ie, the time measurements have already been aligned for you -- no need to correct for the distance between the two locations). Note, however, that there is some ambiguity in the definition of positive and negative amplitude, just as in the observation of GW150914. Hence you should make sure to examine not only the strain values as provided, but also the **inverse**.

You can also examine an example of what a gravitational wave would look like in the data. Below are the representations of the prediction from general relativity for a generic black hole merger in each of the two interferometers individually:

<code>wget http://faculty.virginia.edu/comp-phys/phys2660/html/homework/final/hanford_GRmodel.dat
wget http://faculty.virginia.edu/comp-phys/phys2660/html/homework/final/livingston_GRmodel.dat
</code>

Note, however, that for a gravitational wave to be identified, it needs to be seen in **both interferometers** at the **same time values**.

Write a program **<username>_ligo.cpp** that attempts to identify whether a gravitational wave was recorded by the LIGO experiment in this time interval. Your code should work on this specific example, ie, this time window in the 2 files provided; however, it should be capable of extracting a gravitational wave signal from any such set of files with the same format and similar noise and signal characteristics.

You should answer the following questions:
  * (10 points) How many gravitational waves are there in this time interval? At what times were they recorded, in seconds?
  * (5 points) How long does each gravitational wave signal last?
  * (5 points) Are the Hanford or Livingston waveforms in need of an inversion in order to observe a gravitational wave?
  * (10 points) Are there any transient profiles in either the Hanford or Livingston detectors that appear to be gravitational waves but are not corroborated by the other detector?

The remaining 5 points will be distributed according to the quality of the approach and code you use to solve the above questions.

Your code should produce legible, useful output that is neither too abundant nor too sparse. Make sure your code produces the desired output in less than 30 sec of running, or significant points will be taken off.

+!-
+
+===== 3. Traveling Salesman (35 points) =====
+
+===== 2. Blackjack (30 points) =====
+
+{{hw:​how-do-you-play-blackjack.jpg?​300 ​ |Black hole merger}}[[http://​en.wikipedia.org/​wiki/​Blackjack|Blackjack]] is one of the simplest card games one can play, and is one of the most widely-played casino games in the world. The purpose of the game is to collect cards that come as close to or match the value or score **21**. Using a standard 52-card deck, each card has a specific value:
+   * cards numbered 2-10 are worth their numeric value
+   * jacks, queens and kings (J,Q,K) are worth 10
+   * aces (A) are worth either 1 or 11, and the choice of value is up to the player
+There is no difference in valuation depending on the suit of the cards.
+
+We will only consider a two-hand game with a single deck of cards, and the basic game play goes as follows: One hand is owned by the "​dealer",​ and the other hand is owned by the "​player"​. The player and dealer are adversaries. The dealer distributes one card face-up to the player and one face-up for himself or herself, followed by a second face-up to the player, and finally a second face-down to himself. At this point the player knows the value of his hand, but only one of the cards in the dealer'​s hand.
+
The player wants to beat the dealer, which can happen in one of three ways:
   * the player achieves 21 on the initial deal, otherwise known as "blackjack!", without a dealer blackjack, or
   * the player gets a final score higher than the dealer's, but not exceeding 21, or
   * the dealer gets a final score that is higher than 21, or "goes bust"

Once the initial hand is dealt, the player typically then has a variety of choices. In the simplified game we describe here, the choices available to the player are:
   * to add a card to his hand, or "hit", one by one
   * to refuse any more cards, or "stand"

There are other options typically available in a casino-sanctioned environment (choices like "doubling down", "splitting", and "surrendering"), but we will not consider those additional options here.

The player can add cards until he or she is satisfied with their score. There is no minimum score that the player must achieve. If the player "busts", or goes above 21, the player automatically loses.

Once the player decides to stand, the dealer then reveals his or her previously face-down card. The dealer must then continue to take cards until their hand has a score of 17 or higher. If the dealer achieves a hand of 17 that contains an Ace (interpreted as a value of 11), then the dealer must take one more card. Note that the dealer must stand once achieving a score of 17 or more -- except for this one loophole, colloquially known as "dealer must draw on soft 17". In the language of the game, a "soft" hand has an Ace in it, since the value of the hand can change as you draw more cards.

Once the dealer has a hand that totals 17 or more, the values are compared, and whoever is closer to 21 -- without going over -- is the winner of the hand.

From the player's perspective, the choice of when to stand depends critically on what is in the dealer's hand -- **the knowledge of which is incomplete, given that one of the dealer's cards is still face-down.** Depending on what the dealer has in their single face-up card, it is more optimal for the player to "hit" or "stand".

In this problem, we have two goals:
  - To write a realistic simulation of blackjack for one player versus the dealer, which can be run either in **interactive mode**, where a user decides whether to hit or stand and registers the decision via the keyboard, OR in **batch mode**, in which many hands of blackjack are simulated at once, with the hit vs. stand decision randomized;
  - To use the batch mode to analyze the strategy options for a given game condition (player's originally dealt hand vs. dealer's originally dealt up-card) by playing thousands of "pseudo-hands" of blackjack.

+Write your simulation, call it **<​username>​_blackjack.cpp**,​ such that it accepts the following command line arguments:​
+  * if there is no additional argument, then enter **interactive mode**
+  * if there are additional arguments then enter **batch mode** for two possibilities:​
+    * if four additional arguments are provided, then these arguments should signify
+      * the number of pseudohands to run
+      * the player'​s two cards as originally dealt
+      * the dealer'​s up-card as originally dealt
+    * if just one additional argument is provided, then the batch mode simulation will randomize the initial player and dealer hands and test whatever possibilities arise.
+
+So for instance, if I want to play along in interactive mode, I would run the program this way:
+
+<​code>​./​ccn4g_blackjack</​code>​
+
+If I want to simulate 10000 hands of blackjack in which the player is deal 14=10,4 and the dealer has a 6, I would run the program this way:
+
+<​code>​./​ccn4g_blackjack 10000 10 4 6</​code>​
+
+If I want to simulate 10000 hands of blackjack in which the player is deal 14=A,3 and the dealer has a 6, I would run the program this way:
+
+<​code>​./​ccn4g_blackjack 10000 A 3 6</​code>​
+
+If I want to simulate 50000 hands of blackjack in which all possibilities for the the player'​s originally dealt cards and the dealer'​s up card, I would run the program this way:
+
+<code>./ccn4g_blackjack 50000</code>
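Since the three run modes are distinguished purely by the argument count, the dispatch can be decided directly from `argc`. A minimal sketch -- the `Mode` names and the `select_mode` helper are illustrative, not required by the assignment:

```cpp
// Illustrative mode codes -- the names are assumptions, not part of the assignment.
enum Mode { INTERACTIVE, BATCH_FIXED, BATCH_ALL, BAD_USAGE };

// Decide the run mode from the argument count alone:
//   no extra arguments            -> interactive mode
//   1 extra argument  (N)         -> batch mode, randomized initial hands
//   4 extra arguments (N p1 p2 d) -> batch mode, fixed initial hands
Mode select_mode(int argc) {
    if (argc == 1) return INTERACTIVE;   // just the program name
    if (argc == 2) return BATCH_ALL;
    if (argc == 5) return BATCH_FIXED;
    return BAD_USAGE;
}
```

From `main(int argc, char* argv[])` one would switch on the returned mode and parse the card arguments only in the four-argument case.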
+
+Notes:
+  * In both modes of play, each hand is independent,​ meaning that the deck is shuffled after each hand concludes.
+  * In a single-deck game, after the initial deal, there are only 48 cards left from which to draw the subsequent cards. Make sure to take this into account when dealing additional cards.
+  * Do not consider the possibilities of "​doubling down", "​splitting",​ or "​surrender"​.
+  * One should be concerned with whether the player's hand is "soft" or "hard", ie, whether it is built with an Ace. The recommendations will differ for a player holding 17=10,7 versus 17=A,6. We want to be able to test both possibilities as different types of hands.
+  * Take care to write your code in steps, adding functionality as you go, hoping to achieve the most-general case. See the grading breakdown below for guidance. Make sure to achieve the simple deliverables first, as described below, before making the code completely general -- and infinitely harder to debug.
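The soft/hard bookkeeping is the trickiest part of scoring a hand. One possible helper, sketched here under the assumption that Aces are passed in as 1 and face cards as 10 (the function name and interface are illustrative):

```cpp
#include <vector>

// Score a blackjack hand given card values where an Ace is passed as 1 and
// face cards as 10. One Ace is promoted to 11 when that does not bust the
// hand; is_soft reports whether an Ace is currently counted as 11.
int score_hand(const std::vector<int>& cards, bool& is_soft) {
    int total = 0, aces = 0;
    for (int c : cards) {
        total += c;
        if (c == 1) ++aces;
    }
    is_soft = false;
    if (aces > 0 && total + 10 <= 21) {  // count one Ace as 11
        total += 10;
        is_soft = true;
    }
    return total;
}
```

With this interface, 17=A,6 scores as a soft 17 while 17=10,7 scores as a hard 17, so the two cases can be tabulated separately in batch mode.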
+
+
+In interactive mode, the output of your program should print to screen:
+  * Display the originally dealt cards to both the player and the dealer
+  * Report the final hand of cards and score of the player'​s hand
+  * Report the final hand of cards and score of the dealer'​s hand
+  * Report who the winner was
+
+In batch mode, the output of your program should print to screen:
+  * The recommendation whether a player should "​hit"​ or "​stand"​ under the conditions requested, OR
+  * The matrix of recommendations if testing all possibilities
+
+The recommendation of whether to hit or stand comes from the pseudohands -- which choice more often yields a winning hand? For the 'all' case, report to screen a streamlined summary of all combinations, a simpler version of what is [[http://en.wikipedia.org/wiki/Blackjack#Basic_strategy|displayed here]]. However, again here we will only take into account the difference between whether the player has a "soft" or "hard" hand, not considering the cases of "doubling down", "splitting" or "surrender".
+
+Grading: (**NOTE:** Updated 1:22pm 5 May 2017)
+   * 4 points: Does interactive mode work? Are the screen prompts, dealt cards and winner/​loser clear?
+   * 4 points: If the player has a score of 18=A,7, and the dealer has 9, should you hit or stand -- and why?
+   * 9 points: If the player has a score of 18=A,7, what is the recommendation for each of the possibilities for the dealer'​s initial face-up card?
+   * 9 points: What are the recommendations for all of the possibilities for the player'​s initial hand in light of each of the possibilities for the dealer'​s initial face-up card?  Again, look at this [[http://​en.wikipedia.org/​wiki/​Blackjack#​Basic_strategy|matrix]] for inspiration on how to report your findings.
+   * 4 points: How would your answers change if instead of playing with one deck, we were playing with two identical decks with cards intermingled?​
+
+Hand in your code electronically **<​username>​_blackjack.cpp**. Make sure the output of your code is clear, such that it is easy to understand what you are reporting. Answer clearly the questions above in the comments of your code near the top of your file.
+
+
+**Note:** We are not learning here how to "count cards", a practice that takes into account what other cards have been played from the deck and adjusts the probabilities accordingly. The legality of counting cards varies by jurisdiction. Here we are only looking at **probabilities in a statistical sense**: given no other knowledge of the sequence of the remaining cards in the deck, what is the best strategy for a given player's hand in light of what we know about the dealer's hand?
+
+
+===== 3. Flocking (30 points) =====
+
+To me, one of the most interesting natural phenomena is the flocking behavior of a collection of birds, as depicted in this [[https://​www.youtube.com/​watch?​v=t3Gp4vob_t8|cool youtube video from Norway]] and many many other sources on the web.
+
+From a physics point of view, it is really fascinating. I prefer to think of it in terms of **forces**. There must be some sort of **long-range attractive force** that brings the birds together. But then there must be some sort of **repulsion at short distances** to keep them from crashing into each other. And then clearly there must be some sort of **cohesion** among all the coalesced collection of birds. The ensemble then **moves in some sort of collective aligned motion** until some sort of perturbation (some insects to eat? an interloper bird?) is encountered,​ at which point some of them **scatter** and the **attractive force** comes back into play.
+
+Let's model this behavior. Now before each of you gets fired up about the solvability of the problem, I will tell you this: I do not yet have a solution. But I know it is possible. The solution itself is not what will get you the lion's share of the precious 30 points on this problem, but rather the process.
+
+Our goals here are:
+  * Develop an approach to solving the problem that is well-defined
+  * Use the tools that we have used throughout the semester
+  * Carry through our simulation in a thoughtful manner
+
+==== Part A: 20 points ====
+
+We will perform the simulation in two dimensions. To start out, consider a small number of birds, say 10. Consider the birds to be restricted to some confined simple 2D space.
+
+
+**Question 1:** What **specific rules will you define** that govern the motion of the birds? This can be expressed in terms of forces and resulting acceleration via Newton'​s Second Law, or it can be a simple set of spatial- and velocity-dependent rules.
+
+**Question 2:** What **simple 2D bounding space** will you consider in the simulation?​
+
+**Question 3:** How will we **define the initial conditions** of the birds? Think about positions as well as velocities.
+
+**Question 4:** We need to use the specific rules from above to **update each bird's position and velocity.**
+
+**Question 5:** Each bird will be affected by some or all of the other birds in the collection. In an N=10 collection of birds, **is it important to take into account the effect of all other birds** on each individual? Or just the M<9 (5? 2? 8?) closest neighbors? How do we define closest?
+
+Once you have thought about these questions, develop a simulation **<​username>​_flock.cpp** for a flock of N birds that implements the interaction rules and evolves them from their initial conditions. Start with N=10 but make the simulation capable of handling up to N=100. Make sure that if they encounter the bounding barrier that they bounce back -- no birds leave the bounding space. Animate the simulation using reasonable time steps. Submit any additional files you may need (.cpp,​.hpp,​.pdf,​.plt,​.dat extensions only) pre-pended with your <​username>​. If there are special instructions required for compiling / running / animating, **be sure to submit all necessary files** and **write the instructions in the comments section** of your **<​username>​_flock.cpp** file.
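The no-birds-leave-the-box requirement can be handled by reflecting both position and velocity whenever a bird crosses a wall. A minimal 1D sketch, applied independently to x and y each time step (variable names are assumptions):

```cpp
// Reflect a 1D coordinate and its velocity off the walls of [0, L].
// Applied independently to x and y, this keeps every bird inside the
// bounding box after each position update.
void bounce(double& pos, double& vel, double L) {
    if (pos < 0.0) { pos = -pos;         vel = -vel; }
    if (pos > L)   { pos = 2.0 * L - pos; vel = -vel; }
}
```

Calling this after each Euler update of position and velocity guarantees the bird re-enters the box with its wall-normal velocity component flipped.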
+
+Your initial ideas to the answers for the above questions can and should evolve. At the end of your thought process, make sure to briefly answer Questions 1-5 in the comments section of your **<​username>​_flock.cpp** file, describing the approach you used in the simulation and any failed/​intractable/​interesting previous attempts you might have made.
+
+
+==== Part B: 10 points ====
+
+Once your N-bird simulation is written, write another simulation called **<​username>​_flockpredator.cpp** which includes an additional feature: introduce k **predators** which cause the flock to scatter. Start with k=1 but write the code such that up to k=10 can be accommodated. Consider the predators to be kind of dim-witted, such that they initially are on some trajectory and then they just bounce against the bounding surface and continue with same speed. Make sure the predator speed is not very different from the average speed of the collection of N birds. When an individual bird encounters a predator, **how do you think the bird will react?** Implement that in the simulation and animate it as well. Describe your approach to this aspect of the problem in the comments section of your **<​username>​_flockpredator.cpp** file.
+
+
+**General note:** For this problem I suggest referring back to the gravity problems in [[hw:​hw07|hw07]] and [[hw:​hw07|hw08]] for both inspiration on how to define interaction rules, updating the locations and velocities of the objects in question and how to create an animation of your simulation. The source code for //​animate_answer//​ is located [[http://​faculty.virginia.edu/​comp-phys/​phys2660/​html/​homework/​hw08/​solutions/​|here]] which may or may not be helpful.
+
+
+
+
+
+
+**Note:** Do not explore any outside resources on bird behavior or computer simulations of bird behavior. Really do this on your own, using your own creativity.
+
+
+
+
+
+===== 1. Shut the Box (30 points) =====
+
+{{:​hw:​shut-the-box.jpg|}}
+
+Variants of the game [[http://​en.wikipedia.org/​wiki/​Shut_the_Box|Shut the Box]] have been around since the 12th century. Today it goes by many names and some version of it is played on every continent.
+
+Shut the Box is a simple game played with two dice and some number of numbered tiles. The point of the game is to take successive turns rolling the two dice and eliminating numbered tiles, turning them over as the sum is achieved. Any number of tiles can be turned over after a given roll as long as the sum matches the sum of the dice from the roll. The version of Shut the Box we will consider here has tiles numbered 1-12.
+
+Play starts with all tiles turned up. As an example, consider rolling on the first roll a 3 and a 4. Their sum is 7. In this case, one could **either** flip down the tiles for 3 and 4, or 2 and 5, or 1 and 6, or 1 and 2 and 4, or any combination whose sum is 7. One could even flip down the **one** tile for the sum of the numbers, 7. The player'​s turn continues with more rolls. Later in the turn, if one rolls a 2 and a 3, but 1, 2, 3, 4 and 5 are already turned over, then the player'​s turn is over, since none of the remaining tiles can be summed to match the sum of the dice rolled. If one rolls a 1 and a 4, but 1, 2, 3, and 5 are turned over, then one might be tempted to turn over the 4 BUT in fact the turn is over. Play proceeds until one cannot flip any more tiles. A final score for a player'​s turn is determined by summing the values of the tiles that remain. Clearly, lower scores are optimal.
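The core mechanical question on each roll is: which subsets of the still-up tiles sum to the dice total? With at most 12 tiles, brute-force enumeration over bitmasks is cheap. A sketch (the function name and bitmask encoding are assumptions):

```cpp
#include <vector>

// Enumerate every subset of the still-up tiles whose values sum to the
// dice roll, encoding each subset as a bitmask over the tiles vector.
// With at most 12 tiles, scanning all 2^n - 1 non-empty subsets is cheap.
std::vector<unsigned> subsets_with_sum(const std::vector<int>& tiles, int target) {
    std::vector<unsigned> out;
    unsigned n = static_cast<unsigned>(tiles.size());
    for (unsigned mask = 1; mask < (1u << n); ++mask) {
        int s = 0;
        for (unsigned i = 0; i < n; ++i)
            if (mask & (1u << i)) s += tiles[i];
        if (s == target) out.push_back(mask);
    }
    return out;
}
```

For the example above, with 1, 2, 3 and 5 already down, no subset of the remaining tiles sums to 5, so the returned list is empty and the turn ends.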
+
+The game is best played in a group of people, each person taking turns to achieve the best/lowest score, claiming as their prize **ultimate glory** or, in some cases, **a small amount of previously agreed-upon stakes**.
+
+For this portion of the final exam, you will run thousands of single-player "​pseudogames"​ of Shut the Box.
+
+Write your simulation, call it <​username>​_shutthebox,​ such that it accepts the following command line arguments:​
+  * number of pseudogames to run
+  * strategy number, either 1 or 2 (see below); if no value is given, run both strategies simultaneously
+
+
+The output of your program should print to screen:
+  - The strategy tested, including its name
+  - The number of pseudogames tested
+  - The worst (highest) possible score
+  - An estimate of the probability of achieving the worst possible score
+  - An estimate of the probability of "shutting the box", ie, the probability of rolling the dice such that you turn down all the tiles over the course of a turn, effectively achieving a score of 0.
+
+**Question:​** Is it more likely to shut-the-box depending on the **strategy** one employs in the game? There are two general schools of thought:
+  * **Strategy 1, "​More-tiles-first":​** In every case where there is a choice, prefer to flip down the most number of tiles that equal the sum instead of smaller combinations of tiles. ​ In cases where there are multiple choices with the same number of tiles to flip down, choose the tile combination that has the highest individual value among it.
+  * **Strategy 2, "​Fewer-tiles-first":​** In every case where there is a choice, prefer to flip down the smallest number of tiles that equal the sum instead of larger combinations of tiles. In cases where there are multiple choices with the same number of tiles to flip down, choose the tile combination that has the highest individual value among it.
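Given the list of candidate subsets for a roll, both strategies reduce to the same selection rule with the preference flipped. One possible sketch, with the interface assumed and subsets encoded as bitmasks over a `tiles` vector:

```cpp
#include <algorithm>
#include <vector>

// Count the set bits in a subset mask (number of tiles flipped).
int n_tiles(unsigned mask) {
    int c = 0;
    for (; mask; mask >>= 1) c += mask & 1u;
    return c;
}

// Pick a subset per the strategy: prefer_more = true selects the subset
// with the most tiles ("more-tiles-first"), false the fewest. Ties are
// broken by the highest individual tile value. Assumes cands is non-empty.
unsigned pick_subset(const std::vector<unsigned>& cands,
                     const std::vector<int>& tiles, bool prefer_more) {
    unsigned best = cands[0];
    for (unsigned m : cands) {
        int dm = n_tiles(m) - n_tiles(best);
        if ((prefer_more && dm > 0) || (!prefer_more && dm < 0)) { best = m; continue; }
        if (dm == 0) {  // tie-break: highest single tile value
            int hm = 0, hb = 0;
            for (unsigned i = 0; i < tiles.size(); ++i) {
                if (m & (1u << i))    hm = std::max(hm, tiles[i]);
                if (best & (1u << i)) hb = std::max(hb, tiles[i]);
            }
            if (hm > hb) best = m;
        }
    }
    return best;
}
```

For a roll of 7 with all of 1-6 still up, "more-tiles-first" picks {1,2,4} while "fewer-tiles-first" picks {1,6} (the two-tile option containing the highest single tile).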
+
+
+**Make sure each piece of information is labeled properly and that you are clear on which of the two strategies you are testing**. These answers should be determined from your pseudogames.
+
+
+===== 2. Texas Hold 'Em Poker (40 points) =====
+
+{{:​hw:​poker.png|}}
+
+Before you start, [[https://​en.wikipedia.org/​wiki/​Texas_hold_%27em|read about the rules and game play]] of the popular poker variant called Texas Hold 'Em.
+
+The goal of this game is to achieve the best [[http://​www.cardplayer.com/​rules-of-poker/​hand-rankings|5-card poker hand]] among all the players. The game is played with [[https://​en.wikipedia.org/​wiki/​Standard_52-card_deck|a standard deck of 52 playing cards]].
+
+Players are dealt two face-down cards that belong to them alone, and five community cards which are dealt in three stages and are shared by all players. Typically there is a round of betting:
+  * after being dealt the two "hole cards"​
+  * after the first three community cards are revealed, called "the flop"​
+  * after the fourth community card is revealed, called "the turn"​
+  * and after the fifth and final community card is revealed, called "the river"​
+
+As designed then, players have more of an idea about the strength of their hand on //the river// than they do before //the flop//, for instance.
+
+For this portion of the final exam, you will run thousands of two-player "​pseudohands"​ of Texas Hold 'Em poker. Your general task here is the following: Model a 1- or 2-player, single-deck Texas Hold 'Em game **from the perspective of one of the players**. The goal of your code is to answer some probability questions, and provide **win probabilities** for various hole-card / community-cards configurations at defined stages of the game. Recall each player knows nothing about the other player'​s hole cards -- hence the need to simulate lots of possible outcomes in order to know what the win probabilities are. No ante / betting / raising / calling assumptions need to be included to answer these questions.
+
+Write your simulation, call it <​username>​_poker,​ such that it accepts the following command line arguments:​
+  * number of pseudohands to run, followed by
+  * the two hole-card identities, followed by
+  * three, four, or five community card identities
+
+For the card identities, use //A// for Ace, 2....9 for number cards, //T// for 10, //J,Q,K// for Jack, Queen, King, respectively. Also use //C,D,H,S// for the suits Clubs, Diamonds, Hearts, Spades. Hence the ten of Spades would be denoted //TS//, the four of Hearts //4H//, etc.
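A small parser can turn these two-character codes into numeric ranks and suits. A sketch, under the assumption that Aces are encoded high as 14 (the interface is illustrative):

```cpp
#include <string>

// Parse a two-character card code like "TS" or "4H" into a rank (2-14,
// Ace high as 14) and a suit index 0-3 for C,D,H,S.
// Returns false on malformed input.
bool parse_card(const std::string& id, int& rank, int& suit) {
    if (id.size() != 2) return false;
    const std::string ranks = "23456789TJQKA";  // index 0 -> rank 2
    const std::string suits = "CDHS";
    std::string::size_type r = ranks.find(id[0]);
    std::string::size_type s = suits.find(id[1]);
    if (r == std::string::npos || s == std::string::npos) return false;
    rank = static_cast<int>(r) + 2;
    suit = static_cast<int>(s);
    return true;
}
```

The ten of Spades "TS" then parses to rank 10, suit 3, and malformed inputs like "1S" are rejected rather than silently misread.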
+
+Your code should expect a variable number of community cards depending on what situation you are modeling without any extra prompting.
+
+The output should minimally state the following:​
+
+<​code>​
+Simulated mmmmm pseudohands of Texas Hold 'Em:
+  Given hole cards xx and yy, and community cards c1,​c2,​c3,​(c4,​c5),​ the win probability is nnnnn%.
+  In mmmmm pseudohands of this situation, there were rrrrr instances of a Royal Flush (rf%).
+</​code>​
+
+[You can add other lines reporting other single-hand outcomes like Straight Flush, Four of a Kind, etc., if you like. But take care not to have any other cluttered output printed to screen.]
+
+Questions to answer. Do so briefly in words in the header of your code. These answers should be consistent with the output reported to screen:
+  - 10 points: What is the probability of achieving a [[http://​www.cardplayer.com/​rules-of-poker/​hand-rankings|Royal Flush of any suit]] on or before the river in a standalone, single-player sense? //**Note** (updated 5/6) this used to read: "What is the probability of any single player in this 2-player game achieving a [[http://​www.cardplayer.com/​rules-of-poker/​hand-rankings|Royal Flush of any suit]] on or before the river?"​ updated for clarity.//​
+  - 20 points: If one is dealt KC,7H and the community cards at the river are 2H,​4S,​KH,​7D,​3S,​ what is the win probability? ​
+  - 10 points: If one is dealt KC,7H and the community cards at the flop are 2H,4S,KH, what is the win probability? ​
+
+Although one could calculate the probabilities by hand, the answers should come from the simulated pseudohands in your code.
+
+Hints:
+  * Start by simulating the game at //the river// first, since the combinatorics are simplest, ie, there are fewer possibilities to consider for the opponent'​s hand.
+  * Also, you should start by simulating the game at //the river// first since questions 1 and 2 above only pertain to the river -- and are worth 2/3-rds of the points for this question.
+  * The first question requires simply counting the number of Royal Flush outcomes among the pseudohands you run. This will of course depend on what you enter for the hole and community cards. Hence I recommend writing two poker modeling functions:
+     - One that simply deals out many raw combinations of 2 hole cards and 5 community cards and counts how often a Royal Flush is achieved (this is what I mean by a 1-player game above), and
+     - One that simulates the player vs. player nature of the 2-player game
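Counting Royal Flush outcomes then reduces to checking whether the 7 available cards (2 hole + 5 community) contain T,J,Q,K,A of one suit. A sketch, assuming cards are stored as (rank, suit) pairs with T=10 ... A=14 and suits 0-3:

```cpp
#include <utility>
#include <vector>

// Check whether any 5 of the available cards form a Royal Flush:
// T,J,Q,K,A of a single suit.
bool has_royal_flush(const std::vector<std::pair<int,int>>& cards) {
    for (int suit = 0; suit < 4; ++suit) {
        int found = 0;  // bit (r-10) set when rank r of this suit is present
        for (const auto& c : cards)
            if (c.second == suit && c.first >= 10)
                found |= 1 << (c.first - 10);
        if (found == 0x1F) return true;  // all of T,J,Q,K,A present
    }
    return false;
}
```

Running this over each dealt pseudohand and dividing the count by the number of pseudohands gives the Royal Flush probability estimate requested in question 1.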
+
+
+===== 3. Etiquette and Efficiency on the London Underground (30 points) =====
+
+{{:​hw:​holborn.jpg|}}
+
+Start by reading [[https://​tfl.gov.uk/​info-for/​media/​press-releases/​2016/​march/​london-underground-planspilot-of-new-escalator-arrangement-at-holborn-tube-station|this important notice]] about escalator protocols at Holborn Tube station on the London Underground.
+
+In short, the [[https://tfl.gov.uk/|Transport for London]] authorities have mandated that a couple of their up escalators will eschew the common convention of "walk on the left, stand on the right" in favor of the "stand on both sides" paradigm. The claim is that the "walk on the left, stand on the right" approach, somewhat surprisingly, is in fact less efficient at transporting passengers from the bottom to the top of an escalator.
+
+Actually, is this so surprising? Consider the following [[http://​www.bbc.com/​news/​magazine-23444086|statistics]]:​
+  * Only ~25% of Tube users choose to walk up the escalator, when the option is available, meaning a great deal of the time half the escalator is unused
+  * While 90% of Tube users formerly adhered to the "walk on the left, stand on the right" convention, that means 10% ignored the convention -- and a single left-side standing commuter halts all left-side walkers temporarily until their very impolite escalator ride is over. //**Note:** changed 5/6 from "a single left-side standing commuter halts all left-side walkers."​ for clarity.//​
+
+
+So which approach is best? Sounds like a job for some **talented and clever University of Virginia PHYS 2660 students!**
+
+In this part of the final you will model this physical system. Consider the problem of needing to move N people from the bottom to the top of an escalator, where N is large, say N=5000. Imagine the flux of people entering the escalator is 2 persons per second and is constant. The escalator at Holborn station is really tall -- nearly 25m! -- and hence a trip from bottom to top while standing requires about 30 seconds. Walking up the escalator requires half that amount of time. The probabilities listed above should be considered to be Gaussian distributed,​ adhering to a distribution with a width that is 10% of the mean.
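As a starting point for a figure of merit, one deliberately crude quantity is the time at which the last of the N passengers reaches the top, ignoring escalator capacity and the walker/stander split; your model should add those, plus the Gaussian smearing described above. The function name and interface are assumptions:

```cpp
// Baseline clearance time: the last passenger boards at n_people/flux
// seconds and then rides for ride_time seconds, assuming the escalator
// itself never saturates. Deviations from this baseline under each
// convention are one possible figure of merit.
double clearance_time(int n_people, double flux, double ride_time) {
    return n_people / flux + ride_time;
}
```

With N=5000, a flux of 2 persons per second and a 30-second standing ride, this baseline is 2530 seconds; the interesting physics lies in how far each convention departs from it once queuing and the walker/stander mix are modeled.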
+
+Unlike the above problems, this one is going to be **a bit free-form** -- you are free to answer the problem in any C-programming-based way you like.
+
+All files you submit should be named in the format **[<​username>​_escalator.xx]** where xx = cpp, txt or pdf. You should write a brief report in a .txt file answering the following:
+  - What was your hypothesis? Ie, which escalator convention do you predict to be most efficient?​
+  - What was your approach to the problem?
+  - What figure of merit did you design or rely on to measure the efficiency of the two conventions?​
+  - How do you run your code?
+  - What do you conclude from your modeling?
+  - Are there features of the description of this model that are missing that make it less-than realistic?​
+  - Are there additional tests one could do with the model to take the understanding of this system beyond the baseline? What other questions could you consider pursuing?
+
+There is no length requirement for the written portion, but **strong preference is for succinct clarity** rather than lengthy responses -- minimum word counts are //patently absurd// (IMO it is juvenile to tell students they need to respond in a minimum number of words -- this is not middle school). You know what you need to say to represent your work properly.
+
+The problem will be graded by all three of us (Eric, Rob and Prof. Neu) and your grade will be the average of the three. We will be grading your responses based on soundness of scientific approach, completeness and clarity.
+
+For more background information,​ check out:
+  * http://​www.slate.com/​blogs/​future_tense/​2016/​01/​22/​standing_on_escalators_faster_than_walking_according_to_transport_for_london.html
+  * http://​www.bbc.com/​news/​magazine-23444086
+  * http://​www.citylab.com/​commute/​2016/​01/​subway-escalator-standing-study-tfl-london/​424950/​
+  * Chaos! http://​metro.co.uk/​2016/​04/​18/​chaos-as-passengers-are-told-to-stand-on-the-right-and-left-at-holborn-tube-station-5823640/​
+
+
+===== 3. Signal and Background (40 points) =====

In particle physics, we build massive high-energy particle colliders in the hopes of harnessing the energy in these collisions to produce exotic forms of matter. These processes are rare, so we design our accelerator to collide particles together millions of times a second in the hopes of capturing these elusive new entities. When created, these new forms of matter decay to more common familiar particles; these mundane decay products then fly out from the collision region and interact with our particle detector. This massive device, often bigger than a good-sized office building, collects all the signals for each collision, called an "event" in our vocabulary; each collision event is characterized by a collection of reconstructed observables describing the momenta, energies, trajectories and multiplicities of the outgoing particles. For example, the file

Start by **looking at the data** and doing the following:
+  * Write some code to run over the data events and fill a histogram for just one of the observables, Ht. Use bins of appropriate width such that the resulting plot is not too jagged bin-to-bin; for example, a good choice for Ht would be to use bins that are 20 units wide spanning the range [0.0, 1000.0]. Several bins will be empty but the bins in the core of the distribution will be fairly full.
+  * Plot the data with black points. Make sure the plot has all the desired elements of a good histogram.
+  * Hand in this Ht plot for data electronically as a pdf **[<username>_data.pdf]**.

Next we need to **inspect our signal and background models**:
+  * Use similar code as in the above to make a histogram for the simulated signal events in the variable Ht. However, here normalize the histogram to the same number of events as observed in the data. Do this by first normalizing to unit area -- meaning **scale the contents of each histogram** such that the area under the curve is //1.00//. **Then scale again,** this time to match //the area under the data curve// (hint: the area under the data curve should be the same as the number of observed data events). Use the same binning as you used for your data histogram.
+  * Use this same code to run over the simulated background events, filling a similar normalized-to-the-data histogram for the Ht variable.
+  * Hand in these plots electronically as pdfs as well: **[<username>_SigShape.pdf]** and **[<username>_BkgShape.pdf]**.
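The fill-and-normalize recipe can be sketched in two small helpers; the function names are assumptions, and the fixed 50-bin, width-20 binning follows the earlier suggestion for Ht:

```cpp
#include <vector>

// Fill a fixed-width histogram: 50 bins of width 20 spanning [0, 1000),
// as suggested for Ht. Out-of-range values are simply dropped here.
std::vector<double> fill_hist(const std::vector<double>& values) {
    std::vector<double> bins(50, 0.0);
    for (double v : values) {
        int b = static_cast<int>(v / 20.0);
        if (b >= 0 && b < 50) bins[b] += 1.0;
    }
    return bins;
}

// Scale a histogram so its total content equals target_area. A single
// multiplication is equivalent to normalizing to unit area and then
// scaling up to the area under the data curve.
void normalize_to(std::vector<double>& bins, double target_area) {
    double area = 0.0;
    for (double b : bins) area += b;
    if (area <= 0.0) return;
    for (double& b : bins) b *= target_area / area;
}
```

Passing the number of observed data events as `target_area` gives the normalized-to-the-data signal and background shapes requested above.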

Now we need to determine how much signal and background the data favors. We will therefore perform a two-component "fit" in one of the variables, Ht. The two components are signal and background; we seek for the sum of signal and background to match as closely as possible the data across all the bins of the Ht histogram. Ht is common shorthand in the field of collider physics; it typically represents the sum of the energies of each of the reconstructed objects in the event. From your signal vs. background shapes, it should be apparent that the signature of our signal is typically more energetic (ie, favors higher Ht values) than the background.

+Back to our fit. There are several ways to do this. Many are computationally expensive and can easily become intractable. We will do something simple here using a chi^2 fit (see [[http://faculty.virginia.edu/comp-phys/phys2660/wiki/doku.php?id=notes|lecture 11 for a review]]). Define a summed chi^2 over all of the bins of the Ht distribution:

\\ <math>\large \chi^2 = \sum_{i=1}^{k} \frac{(n_i - N_{i}^{pred})^2}{\sigma_i^2}</math>

where:
+  * <math> n_i </math> is the number of observed events in the data in bin <math> i </math>
  * <math> \sigma_i </math> is the statistical uncertainty on the number of observed events in the data in bin <math> i </math>
  * <math> N_{i}^{pred} </math> is the sum of the predicted signal and background contributions in bin <math> i </math> (see below for more)
the problem becomes much easier. One can simply scan different values of <math> f_s </math> and <math> f_b </math>, consistent with this simple constraint, and find where the minimum of the chi^2 occurs.
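With the constraint that the two fractions sum to one, the prediction in each bin depends on a single parameter and the crude scan is only a few lines. A sketch, assuming sigma_i^2 = n_i (Poisson) and signal/background histograms already normalized to the data area (the interface is an assumption):

```cpp
#include <cstddef>
#include <vector>

// Chi^2 between data and the prediction f_s*S + (1 - f_s)*B, with
// sigma_i^2 taken as n_i (Poisson) and empty data bins skipped.
double chi2(const std::vector<double>& data, const std::vector<double>& sig,
            const std::vector<double>& bkg, double fs) {
    double c = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] <= 0.0) continue;
        double pred = fs * sig[i] + (1.0 - fs) * bkg[i];
        double d = data[i] - pred;
        c += d * d / data[i];
    }
    return c;
}

// Crude scan of f_s in steps of 0.01, returning the value minimizing chi^2.
double best_fs(const std::vector<double>& data, const std::vector<double>& sig,
               const std::vector<double>& bkg) {
    double best = 0.0, cbest = chi2(data, sig, bkg, 0.0);
    for (int k = 1; k <= 100; ++k) {
        double fs = k / 100.0;
        double c = chi2(data, sig, bkg, fs);
        if (c < cbest) { cbest = c; best = fs; }
    }
    return best;
}
```

An "intelligent" version would replace the fixed grid with an iterative search that shrinks the step size around the current minimum.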

+Write a piece of code that reads in the Ht histograms from signal, background and data and executes this simple chi^2 fit. You can do this in a crude manner (scanning all possible values) or an intelligent manner (starting with say a 50/50 mix and choosing subsequent pairs such that the chi^2 is reduced, iterating until the best fractions are found).
+The code should report to the screen:
+  - The chi2 value when assuming a 50/50 mix of S and B.
+  - The fractions of signal and background, <math> f_s </math> and <math> f_b </math>, that the data prefers, including the final chi2 value achieved.
+
+Make sure these pieces of output are properly labeled when printed to screen. No other output should be produced.
+
+Hand in this code electronically **[<​username>​_chi2fit.cpp]**.
