Difference between revisions of "General statistics notes on counting experiments"

From Vacuum Ultra Violet
(Add section about event timing)
m (Fix problems)
 
(One intermediate revision by the same user not shown)
== Experiments (a bit more involved) ==
 
In the simplest case, we looked at dark count measurements. These are nice because we (in theory) only have single p.e. events. Therefore, no two events can happen at the same time and we are looking at a true Poisson process. As soon as we introduce light, we get multiple p.e. peaks, which are basically multiple events happening at the same time. Now our measurement is no longer a Poisson process!
 

Latest revision as of 15:36, 14 March 2023

Since VULCAN will measure light signals, whether they are reflected, fluoresced or other, basically all measurements will look the same. In the simplest case we look at the signal from a single SiPM, and we want to calculate the rate. This is a simple counting experiment.

Counting experiments

In a counting experiment, you count the number of occurrences of an event in a certain time interval. Currently, we record 16 µs waveforms, but the exact length of the time interval is irrelevant. In these 16 µs, we want to know how much light was seen. So, we count the number of photoelectron (p.e.) peaks. These peaks have a very distinct shape and a well-defined height. Which peaks are relevant depends on the measurement, e.g. for dark count measurements only the single p.e. peaks are interesting; the rest is "background". In the end, we'll have a number of events in a certain time interval. These counts are Poisson distributed. So, if we take multiple measurements, and our process is truly a Poisson process, we'll be able to fit a nice Poisson distribution to the counts we measured. The maximum likelihood estimator (MLE) for the Poisson parameter λ is then simply the sample mean. This MLE is unbiased, efficient, complete and sufficient. λ is the expected value as well as the variance, so once we have λ, we are basically done.
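As a minimal sketch of this estimate: the function name `estimate_poisson` and the example counts are made up for illustration, and only the 16 µs window comes from the text. The standard error uses the fact that the variance of the sample mean of n Poisson counts is λ/n.

```python
import math

WINDOW_S = 16e-6  # waveform length from the text (16 µs)


def estimate_poisson(counts, window_s=WINDOW_S):
    """Return (lambda_hat, standard error of lambda_hat, rate in Hz).

    counts: number of p.e. peaks found in each recorded waveform.
    """
    n = len(counts)
    if n == 0:
        raise ValueError("need at least one waveform")
    lam_hat = sum(counts) / n          # MLE of lambda = sample mean
    std_err = math.sqrt(lam_hat / n)   # Var(sample mean) = lambda / n
    rate_hz = lam_hat / window_s       # counts per window -> events per second
    return lam_hat, std_err, rate_hz


# Hypothetical peak counts from eight waveforms (not real data).
example_counts = [24, 22, 27, 25, 21, 26, 23, 24]
lam_hat, std_err, rate_hz = estimate_poisson(example_counts)
```

The rate follows from λ by dividing by the window length, which is why the exact window length does not matter for the statistics.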

The rates we expect for our measurements are rather high, especially at room temperature. For example, the VUV-sensitive SiPMs have a dark count rate of ~1.5-2 MHz at room temperature. Since we can see a Poisson(100) variable as the sum of 100 independent Poisson(1) variables, we can apply the central limit theorem and approximate the Poisson distribution by a Gaussian with mean λ and variance λ if we have more than ~20 counts per time interval (for our 16 µs waveforms, a rate above 1.25 MHz). We'll need to apply a continuity correction too. For lower rates, we can simply take longer measurements in order to still be able to use the Gaussian approximation. But we like to use Poisson statistics whenever possible, because they are simple.
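To get a feeling for how good the Gaussian approximation is, the sketch below compares the exact Poisson CDF with the continuity-corrected Gaussian one. The function names are made up for illustration; the continuity correction is the usual shift of k to k + 0.5 before standardising.

```python
import math


def expected_counts(rate_hz, window_s=16e-6):
    """Expected number of counts (lambda) per waveform for a given rate."""
    return rate_hz * window_s


def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Poisson(lam), summed term by term."""
    if k < 0:
        return 0.0
    # Each term is computed in log space to avoid overflow in k! for large k.
    return sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k + 1)
    )


def gaussian_cdf_cc(k, lam):
    """Gaussian approximation to P(X <= k) with continuity correction."""
    z = (k + 0.5 - lam) / math.sqrt(lam)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare both CDFs near the threshold quoted in the text (1.25 MHz).
lam = expected_counts(1.25e6)
for k in (10, 15, 20, 25, 30):
    print(k, poisson_cdf(k, lam), gaussian_cdf_cc(k, lam))
```

The two columns agree to within a couple of percent around the mean; the remaining difference comes from the skewness of the Poisson distribution, which shrinks as λ grows.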

Noise effects

So, from our peak count we get an estimate for the rate. However, we also have noise in our measurements. It can happen that we confuse a noise peak with a p.e. peak, or that noise makes a p.e. peak unrecognizable, causing us to miss it. Both of these types of noise are in principle also Poisson processes, therefore our measured distribution would still be Poissonian. Unfortunately, sometimes a couple of waveform recordings in a row go wrong, e.g. because of connection problems, and then our peak events are no longer independent of time. Similar time dependencies of the measurement broaden our Poisson distribution. If our Poisson parameter λ is large enough that we can use the Gaussian approximation, we will most likely still find our measurements can be approximated by a Gaussian distribution. However, the variance will be larger than the Poissonian value of λ (equivalently, the standard deviation will be larger than the square root of λ).
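One simple way to spot this broadening is to compare the sample variance with the sample mean. The sketch below computes their ratio (often called the Fano factor or index of dispersion); for a true Poisson process it should be close to 1, and values well above 1 hint at extra, non-Poissonian spread. The function name and the example counts are made up for illustration.

```python
from statistics import variance


def fano_factor(counts):
    """Sample variance divided by sample mean of the per-waveform counts.

    Close to 1 for Poisson-distributed counts; larger than 1 if the
    distribution is broadened (overdispersed).
    """
    if len(counts) < 2:
        raise ValueError("need at least two waveforms")
    m = sum(counts) / len(counts)
    if m == 0:
        raise ValueError("mean count is zero, ratio is undefined")
    return variance(counts) / m  # variance() is the unbiased sample variance


# Hypothetical counts (not real data): the second list mimics two
# consecutive waveforms that went wrong and recorded no peaks at all.
clean = [24, 22, 27, 25, 21, 26, 23, 24]
with_dropouts = [24, 22, 27, 25, 0, 0, 23, 26]
print(fano_factor(clean), fano_factor(with_dropouts))
```

With only a handful of waveforms the ratio fluctuates a lot, so this is a quick sanity check rather than a proper test.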