Bumped by Community user
added 88 characters in body
ECR

I encountered the following problem (more details are given at the end of the post) and I am trying to figure out the best way to perform a null hypothesis test. I looked for similar questions (like this) but they do not fit my problem exactly.

I have a random vector $X = (X_1,...,X_N)$ of $N$ random binary variables, not necessarily independent and not identically distributed. These $N$ variables are divided into two subsets: $A$ with $N_A$ random variables and $B$ with $N_B$ variables ($N_A + N_B = N$), so I can also write $X = (X_A,X_B)$. I know the marginal distribution of each of the binary variables, as well as the first and second moments of the random vector.

Now I consider another random vector $Y = (Y_1,...,Y_N) = (Y_A, Y_B)$, from which I can only sample in two steps: first sample $Y_A$ (obtaining some string $(a_1,...,a_{N_A})$), and then sample $(Y_B | Y_A = (a_1,...,a_{N_A}))$. The null hypothesis is that $Y$ follows the same distribution as $X$. 

The problem that arises here is that the set of possible outcomes for $A$ is too large, which means that the probability of obtaining the same $a = (a_1, ..., a_{N_A})$ twice is negligible. Thus, the distribution of the random variables in subset $B$ changes in each iteration. Since I cannot repeat the sampling under identical conditions, I cannot use the usual central limit theorem to approximate the experimental mean by a Gaussian and perform typical Gaussian hypothesis tests.

You can imagine this as having $N$ biased coins, each bias being different, and the coins may not be independent. First I throw $N_A$ of the coins, which conditions the possible outcomes of the second set $B$.
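
A minimal sketch of this two-step procedure, assuming a small toy joint distribution in place of the real one (the helper names `make_toy_joint` and `sample_two_step`, the random weights, and the sizes $N = 4$, $N_A = 2$ are all hypothetical and only serve the illustration):

```python
import itertools
import random


def make_toy_joint(n, rng):
    """Hypothetical joint distribution over n-bit strings (random weights, normalized)."""
    outcomes = list(itertools.product((0, 1), repeat=n))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {x: w / total for x, w in zip(outcomes, weights)}


def sample_two_step(joint, n_a, rng):
    """Draw a from the marginal of A, then b from the distribution of B given a."""
    # Marginal of subset A: sum the joint over all outcomes of subset B.
    marginal_a = {}
    for x, p in joint.items():
        marginal_a[x[:n_a]] = marginal_a.get(x[:n_a], 0.0) + p
    a = rng.choices(list(marginal_a), weights=list(marginal_a.values()))[0]

    # Conditional of B given a: restrict the joint to strings starting with a.
    # The weights are left unnormalized; rng.choices only needs relative weights.
    cond_b = {x[n_a:]: p for x, p in joint.items() if x[:n_a] == a}
    b = rng.choices(list(cond_b), weights=list(cond_b.values()))[0]
    return a, b


rng = random.Random(0)
joint = make_toy_joint(n=4, rng=rng)
runs = [sample_two_step(joint, n_a=2, rng=rng) for _ in range(5)]
for a, b in runs:
    print("a =", a, " b =", b)
```

In this toy version the conditional distribution of $B$ is rebuilt in every run from the outcome $a$, which is the situation described above; for realistic $N_A$ the table `joint` could not be enumerated like this.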

How can I test my null hypothesis under these restrictions?

More details of the problem: I am dealing with a problem in quantum mechanics, having a state of $N$ spins that might be entangled (thus non-independent variables). The data corresponds to measuring part of the system first (subsystem $A$), thus collapsing the whole state and conditioning the possible outcomes of the rest of the system (subsystem $B$). Because the set of possible outcomes for subsystem $A$ is very large and because measuring destroys the state, sampling subsystem $A$ twice and obtaining the same result is highly unlikely.

Thank you very much in advance! Any idea or suggestion is highly appreciated!

ECR

Hypothesis testing for not identically distributed random variables conditioned on the outcome of a subset

I encountered the following problem (more details are given at the end of the post) and I am trying to figure out the best way to perform a null hypothesis test. I looked for similar questions (like this) but they do not fit my problem exactly.

I have a random vector of $N$ random binary variables, not necessarily independent and not identically distributed. These $N$ variables are divided into two subsets: $A$ with $N_A$ random variables and $B$ with $N_B$ variables ($N_A + N_B = N$). I know the marginal distribution of each of the binary variables, as well as the first and second moments of the random vector, which form my null hypothesis. Now I would like to test it using data obtained by sampling in two successive steps:

First I sample the set $A$, obtaining the outcome $a = (a_1, ..., a_{N_A})$, and then, conditioned on this outcome, I sample the second subset $B$. The problem that arises here is that the set of possible outcomes for $A$ is too large, which means that the probability of obtaining the same $a = (a_1, ..., a_{N_A})$ twice is negligible. Thus, the distribution of the random variables in subset $B$ changes in each iteration. Since I cannot repeat the sampling under identical conditions, I cannot use the usual central limit theorem to approximate the experimental mean by a Gaussian and perform typical Gaussian hypothesis tests.

You can imagine this as having $N$ biased coins, each bias being different, and the coins may not be independent. First I throw $N_A$ of the coins, which conditions the possible outcomes of the second set $B$.

How can I test my null hypothesis under these restrictions?

More details of the problem: I am dealing with a problem in quantum mechanics, having a state of $N$ spins that might be entangled (thus non-independent variables). The data corresponds to measuring part of the system first (subsystem $A$), thus collapsing the whole state and conditioning the possible outcomes of the rest of the system (subsystem $B$). Because the set of possible outcomes for subsystem $A$ is very large and because measuring destroys the state, sampling subsystem $A$ twice and obtaining the same result is highly unlikely.

Thank you very much in advance! Any idea or suggestion is highly appreciated!