You have a set of samples and you are interested in learning something about the probability distribution from which they are drawn. That something is the parameter of interest. It might be the mean. If you do something to the samples (add them together, for example), then you might lose some of the information they contain about the parameter. But you also might not. Whether you lose information by manipulating your samples depends on what you do to them.

For example, if each sample is a single success-or-failure trial (a Bernoulli trial, the building block of the binomial distribution) for which success has value 1 and failure value 0, then adding up the results of the samples won’t destroy information about the mean of the distribution (i.e., the probability of success). That’s because the mean affects how many successes you get, not the order in which they occur. You know just as much about the mean of the distribution if your first nine samples are successes and your tenth a failure as if your first is a failure and the next nine successes. In other words, when you add up the results, you lose information on the order in which the successes occurred, but the mean does not determine that order, and so you don’t lose any information relevant to determining the mean.
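To make the point about order concrete, here is a minimal Python sketch. The function name `sequence_likelihood` and the trial values of p are chosen purely for illustration. It computes the probability of each exact ten-sample sequence described above and checks that the two orderings are equally likely whatever the probability of success is.

```python
from math import isclose


def sequence_likelihood(sample, p):
    """Probability of this exact 0/1 sequence when each draw succeeds with probability p."""
    likelihood = 1.0
    for x in sample:
        likelihood *= p if x == 1 else (1.0 - p)
    return likelihood


# Nine successes then a failure, versus a failure then nine successes.
nine_then_fail = [1] * 9 + [0]
fail_then_nine = [0] + [1] * 9

for p in (0.2, 0.5, 0.9):  # illustrative values of the parameter
    a = sequence_likelihood(nine_then_fail, p)
    b = sequence_likelihood(fail_then_nine, p)
    print(p, a, b, isclose(a, b))
```

Because the two sequences are equally likely at every p, seeing one rather than the other cannot favor any value of p, which is why the order can be dropped.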

When the mean increases, the sample results change because you end up with more successes. So a statistic that counts successes changes too. Both the sample and the statistic change in the same way. That is what happens when a statistic is “sufficient.”

That’s why for a sufficient statistic the probability of drawing a particular sample, conditional on a particular result for the statistic, is independent of the parameter. As the parameter changes, both the sample and the statistic change in the same way. So their relationship to each other remains constant regardless of what happens to the parameter. In a sense, the sufficient statistic transforms the sample, instead of altering it. So any change to the parameter doesn’t change the relationship between the samples and the statistic. Sample and statistic are just different ways of expressing the same thing about the parameter.
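As a rough check of this claim, the following Python sketch computes the probability of one particular sample conditional on its number of successes, for several values of the probability of success. The function names, the five-draw sample, and the chosen values of p are all illustrative assumptions.

```python
from math import comb, isclose


def prob_sample(sample, p):
    """Probability of this exact 0/1 sequence of independent draws."""
    t, n = sum(sample), len(sample)
    return p**t * (1 - p) ** (n - t)


def prob_statistic(t, n, p):
    """Probability that n draws contain exactly t successes (binomial)."""
    return comb(n, t) * p**t * (1 - p) ** (n - t)


def prob_sample_given_statistic(sample, p):
    """P(sample | number of successes): a ratio, since the sample fixes the count."""
    return prob_sample(sample, p) / prob_statistic(sum(sample), len(sample), p)


sample = [1, 0, 1, 1, 0]  # three successes in five draws
for p in (0.1, 0.5, 0.8):
    print(p, prob_sample_given_statistic(sample, p))
```

Up to floating-point rounding, every value of p gives the same answer, 1 divided by the number of ways to arrange three successes among five draws (that is, 1/10). The parameter has dropped out.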

The probability of the sample conditional on the statistic is just the ratio of the probability of the sample to that of the statistic, because the sample fully determines the value of the statistic. This means that if the statistic is sufficient, the parameter must enter both probabilities through the same factor, so that it cancels in the ratio and therefore has no effect on this conditional probability.
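For the success/failure example, the cancellation can be written out explicitly. The notation here is introduced only for this sketch: n is the number of samples, x = (x_1, ..., x_n) is the observed sequence of 0s and 1s, T is the number of successes, t is its observed value, and p is the probability of success.

```latex
\begin{align*}
P(X = x \mid T = t)
  &= \frac{P(X = x)}{P(T = t)}
     && \text{since } T(x) = t \\
  &= \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t}\, p^{t}(1-p)^{n-t}} \\
  &= \frac{1}{\binom{n}{t}}
\end{align*}
```

The factor p^t (1-p)^(n-t) appears in both the numerator and the denominator, so it cancels, and what remains depends only on n and t.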