Statistics

Why use stratification?

Published on 2021-06-27 In statistical sampling, stratification is a strategy used to improve the precision (lower the standard error) of an estimate while maintaining a reasonable sample size. Before we can talk about stratification, we first have to cover what simple random sampling is. Simple Random Sampling Simple random sampling is perhaps one of the most well-known sampling strategies to […]

Tae 
Statistics

Basics of Bayesian Inference

Published on 2021-06-20 Bayesian statistics is heavily used in decision theory and data science; however, many university statistics programs do not require any Bayesian statistics courses at any point in the curriculum. I know “traditional” statisticians are not comfortable with Bayesian statistics. I’m hoping this post will help explain why they think that […]

Tae 
Privacy

The Importance of Digital Privacy

Published on 2021-06-13 A few weeks ago, the privacy-focused messaging app Signal tried to purchase some ads on a popular social media platform. You can see examples in their blog post, which show that the platform and advertisers can target you based on your occupation, personal preferences, location, major life events, and probably more. Signal […]

Tae 
Statistics

Did you just assume independence?

Published on 2021-05-30 Introduction Many popular statistical procedures, such as hypothesis testing and linear regression, include the assumption of independence. Assuming independence makes the math much simpler and can be a reasonable assumption depending on how the data was collected. Even though I lumped them together, hypothesis tests and regression have different assumptions of independence. […]

Tae 
Statistics

Confidence Intervals

Published on 2021-05-23 In my previous post about the Central Limit Theorem, I said that I would make a dedicated post about confidence intervals. Well, here it is. Introduction Before we jump into what confidence intervals are, it’s important to go over some basic terminology. In the field of statistics, a population is the entire […]

Tae 
Statistics

Central Limit Theorem

Published on 2021-05-16 Anyone who has taken an introductory statistics course probably learned about the Central Limit Theorem (CLT). During my time as a Teaching Assistant at George Mason University, I noticed that students don’t quite understand what the CLT is or why it is important. Hopefully, this post can serve as another reference for those […]

Tae 
Tools

Basics of Git

Published on 2021-05-15 Git is a free and open source source control management (SCM) system (I explain what open source means in this post) originally developed by Linus Torvalds to help him manage the development of the Linux kernel. SCM can also mean source code management, and it’s more commonly known as version control. Git […]

Tae 
Privacy

What is FOSS?

Published on 2021-05-15 FOSS stands for “Free and Open Source Software”. Many people probably haven’t heard the term FOSS, and even those who have may not fully understand what it means. The open source part is not too difficult to understand. The source code of the program or application is open or available to […]

Tae 
Statistics

Faster Bayesian Inference with INLA

Published on 2021-05-14 Introduction It is no secret that Bayesian statistics – Markov chain Monte Carlo (MCMC) methods in particular – has been gaining popularity in the past couple of decades. I remember making memes in undergrad about how “Bayesian inference is so hot right now!” Yes, I made statistics memes. MCMC samplers like NIMBLE, […]

Tae