Today I read a comment from Mr. Nguyen Dinh Cong, and I would like to take this opportunity to discuss mortality from a different angle. I still think the death rate in Vietnam is lower than the figure reported by the state.
On the surface, as of today (September 8, 2021), the death rate in Vietnam among positive cases is 2.48% (13,701/550,996). But I think this figure is higher than the reality. So is there a way to obtain an estimate closer to reality?
Mr. Nguyen Dinh Cong thinks that the death rate figure of 0.26% that I use as a reference point (to calculate the number of people infected in the community) is lower than the reality, because “There are many people who die unjustly from other causes.”
Having someone notice and comment on a note I wrote means a great deal to me, because at least someone has read it and thought about it. What’s more, the commenter is Mr. Cong, whose views and thinking I respect very much. So I would like to discuss Mr. Cong’s point further.
The ncov website provides, for each province, the number of infections (in fact, the number of positive cases) and the number of deaths. I summarize those numbers in the chart below so you can see which provinces are at the top and which are at the bottom of the ‘death’ table. It goes without saying that Ho Chi Minh City is at the top (4.11%). Among the provinces that have recorded deaths, the lowest is Hai Duong (0.1%). Below that, there are 23 provinces with 0 deaths.
What can we infer from the above numbers?
There are a few points to note. First, having 0 deaths so far does not mean that a province will go on having 0 deaths. Second, many provinces show a low death rate only because their counts of cases and deaths are very small, so the resulting rate is not very reliable. For example, in Hai Duong, with 957 infections and 1 death, the 0.1% rate is statistically unreliable. Therefore, simply dividing the total number of deaths across the provinces by the total number of infections does not properly reflect the situation.
The problem here is that the death rate (deaths divided by the number of infections, or CFR) differs greatly from province to province. We need a way to account for that variation. I thought of using a Bayesian method to combine the death rates from the provinces.
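To make the Hai Duong point concrete, here is a minimal sketch that puts an approximate 95% range around a rate based on 1 death in 957 cases. It uses the Wilson score interval, a standard frequentist tool that I am swapping in purely for illustration; it is not the Bayesian method discussed in this note, and the function name wilson_interval is my own.

```python
from math import sqrt


def wilson_interval(deaths, cases, z=1.96):
    """Approximate 95% Wilson score interval for the proportion deaths/cases."""
    p_hat = deaths / cases
    denom = 1 + z**2 / cases
    centre = (p_hat + z**2 / (2 * cases)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / cases + z**2 / (4 * cases**2)) / denom
    return centre - half_width, centre + half_width


# Hai Duong figures quoted in the text: 957 positive cases, 1 death.
low, high = wilson_interval(1, 957)
print(f"crude rate: {1 / 957:.2%}, plausible range: {low:.2%} to {high:.2%}")
```

The range runs from roughly 0.02% to 0.6%, so the upper end is dozens of times the lower end. A single extra death would double the crude rate, which is why a rate built on such small counts cannot be taken at face value.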
With the Bayesian method, let d(i) be the number of deaths and n(i) the number of infections in province i. We assume that d(i) follows a binomial distribution with n(i) trials and death probability p(i). We also assume that the logit of p(i) across the provinces follows a normal distribution with mean Theta and variance Tau. The task is to estimate Theta and Tau. In the Bayesian approach, we give p(i) a prior distribution that reflects that we know nothing about the true proportion in each province. To reflect that ignorance, I use a normal prior with mean 0 and a variance as high as 10,000. Theta here represents the death rate for the country (on the logit scale), and Tau is the variance between provinces. From this, we can estimate a 95% credible interval for the death rate. That is the theory and the model in a few lines.
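For readers who prefer symbols, the model described above can be written out roughly as follows. This is only my sketch of it: d_i, n_i, p_i, theta and tau stand for d(i), n(i), p(i), Theta and Tau, and I am assuming that the vague normal prior is placed on theta, the common mean on the logit scale. The prior on tau is not stated here, so it is left out.

```latex
\begin{align*}
d_i \mid p_i &\sim \mathrm{Binomial}(n_i,\, p_i), \\
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} &\sim \mathcal{N}(\theta,\, \tau), \\
\theta &\sim \mathcal{N}(0,\, 10^{4}) \quad \text{(assumed placement of the vague prior)}, \\
\text{country-level rate} &= \frac{1}{1 + e^{-\theta}} .
\end{align*}
```

The last line is the inverse logit, which turns theta back into a proportion. Because each province's logit is pulled toward the common mean theta, provinces with very few cases contribute less to the overall estimate than they would in a simple average of provincial rates.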
What were the results?
The results are shown in chart 2 below. The chart is a bit hard to read, so let me put it simply: the average fatality rate for the country is 0.64%, with a 95% probability that it lies between 0.47% and 0.87%.
Thus, even though the crude national rate is 2.48%, once you account for the variation between provinces in the number of cases and the number of deaths, the true death rate is perhaps about 0.64%, plausibly ranging from 0.47% to 0.87%.
But the above calculation still has shortcomings, because there is no data on age or on the number of tests in each province. In addition, the reported number of deaths may not match reality (as Mr. Cong said), so the actual figure may turn out higher or lower in the future. When I have enough data on age, I will write a scientific paper.
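Since the model works with the logit of the death rate, it may help to see how percentages like these relate to values on the logit scale. The sketch below is only an illustration using the three figures quoted above; the helper names logit and inv_logit are my own, and this is not the code that produced the results.

```python
from math import exp, log


def logit(p):
    """Map a proportion in (0, 1) to the logit (log-odds) scale."""
    return log(p / (1 - p))


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1 / (1 + exp(-x))


# The three figures quoted in the text, written as proportions.
reported = {"estimate": 0.0064, "lower": 0.0047, "upper": 0.0087}

on_logit_scale = {name: logit(p) for name, p in reported.items()}
back = {name: inv_logit(x) for name, x in on_logit_scale.items()}

for name in reported:
    print(f"{name}: logit = {on_logit_scale[name]:.3f}, back to rate = {back[name]:.2%}")
```

The interval is not symmetric around 0.64% on the percentage scale, which is what one would expect when it is constructed on the logit scale and then converted back.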