HackerNoon’s Learn Repo has assembled 118 articles covering everything from basic probability to advanced Bayesian modeling. The collection offers practical code snippets, real‑world case studies, and clear explanations that help readers build a solid statistical foundation for data science, finance, and machine learning.
Why a 118‑article list matters
Statistics is the glue that holds modern data‑driven decision making together. Yet many practitioners learn it piecemeal, jumping from one tutorial to another without a clear roadmap. HackerNoon’s Learn Repo tackles this problem by curating 118 stories that span introductory concepts, industry‑specific applications, and deep dives into niche techniques. The list is not a random bibliography; each entry was selected for its practical relevance, code‑first approach, and clear exposition.

How the collection is organized
The articles fall into three broad buckets:
- Foundations – probability, distributions, hypothesis testing, and core metrics like variance and standard deviation.
- Applied techniques – outlier detection, time‑series feature engineering, Bayesian tail‑risk modeling, and causal inference without A/B tests.
- Tool‑specific guides – using R datasets, installing RStudio on WSL, retrieving NHL stats via undocumented APIs, and building Gaussian blurs in Python.
This structure lets a reader start with the basics, then move to the methods they need for a particular domain, and finally see concrete implementations in their favorite language.
Highlights from the list
| # | Topic | Practical takeaway |
|---|---|---|
| 1 | Cross‑entropy, Logloss, Perplexity | Shows how these loss functions are mathematically linked, helping you pick the right metric for classification models. |
| 3 | Radial Basis Functions | Explains the intuition behind RBF kernels and provides a simple NumPy implementation for non‑linear regression. |
| 6 | Time‑Series Feature Engineering | Walks through Fourier transforms, wavelet decomposition, and autocorrelation as feature generators for forecasting models. |
| 12 | Hypothesis Testing | Breaks down p‑values, Type I/II errors, and power analysis with reproducible R scripts. |
| 18 | Randomness in Science | Clarifies the difference between everyday “random” and the statistical definition, with Monte‑Carlo examples. |
| 24 | Propensity Score Matching | Offers a step‑by‑step guide to estimate treatment effects when a classic A/B test isn’t feasible. |
| 28 | Statistics Cheat Sheet | A quick‑reference PDF that covers probability, distributions, and common test statistics – perfect for interview prep. |
| 34 | Model Calibration | Demonstrates reliability diagrams and temperature scaling to improve probability estimates from classifiers. |
| 41 | Counterfactual Forecasting | Shows how to simulate “what‑if” scenarios using causal impact models in Python. |
| 59 | Principal Component Analysis | Provides a visual walkthrough of variance explained, eigenvectors, and reconstruction error. |
| 70 | No‑Code/Low‑Code Market Stats | Uses publicly available datasets to illustrate growth trends and investment opportunities. |
| 84 | Dynamic Truncated Mean in Power BI | A DAX pattern that automatically excludes extreme outliers from KPI calculations. |
| 97 | Three Outlier‑Handling Strategies | Compares trimming, Winsorizing, and model‑based detection, with code snippets for each. |
| 110 | Heavy‑Tailed Metrics with Cross‑Fitted CUPED | Shows how to reduce variance in ARPU experiments without over‑fitting. |
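The first entry's point, that cross‑entropy, log‑loss, and perplexity are mathematically linked, boils down to a single identity: perplexity is the exponential of cross‑entropy, and for classification the cross‑entropy of the true labels is exactly the log‑loss. A minimal sketch, using made‑up per‑example probabilities rather than anything from the articles themselves:

```python
import math

def cross_entropy(true_class_probs):
    """Average negative log-likelihood (in nats) a model assigns to the
    correct class; for classification this is identical to log-loss."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)

# Hypothetical probabilities a classifier gave the true class on 4 examples.
probs = [0.9, 0.7, 0.5, 0.8]
ce = cross_entropy(probs)    # log-loss
perplexity = math.exp(ce)    # perplexity is simply exp(cross-entropy)
print(f"log-loss = {ce:.4f}, perplexity = {perplexity:.4f}")
```

A perplexity of 1.0 would mean the model is always certain and correct; higher values mean the model is, on average, as confused as a uniform guess over that many options.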
Each story includes live code, links to the original GitHub repos where applicable, and a short discussion of trade‑offs, such as why Winsorizing may bias the mean but improve model stability.
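The trimming‑versus‑Winsorizing trade‑off is easy to see in a few lines. This sketch (with made‑up data, not taken from the articles) compares a trimmed mean, which drops the k extreme values on each side, against a winsorized mean, which clamps them and so keeps the sample size but pulls the estimate toward the clamping bounds:

```python
import statistics

def trim(xs, k):
    """Trimmed sample: drop the k smallest and k largest values."""
    s = sorted(xs)
    return s[k:len(s) - k]

def winsorize(xs, k):
    """Winsorized sample: clamp the k extreme values on each side to the
    nearest value that survives trimming (original order preserved)."""
    s = sorted(xs)
    lo, hi = s[k], s[-k - 1]
    return [min(max(x, lo), hi) for x in xs]

data = [1, 10, 11, 12, 20, 100]  # small sample with outliers on both sides
print(statistics.mean(data))                # raw mean, dragged up by the 100
print(statistics.mean(trim(data, 1)))       # trimmed mean: 13.25
print(statistics.mean(winsorize(data, 1)))  # winsorized mean: ~13.83
```

Here the winsorized mean sits slightly above the trimmed mean because the clamped values (10 and 20) are asymmetric around the center, which is the bias the articles warn about; in exchange, every observation stays in the sample, which tends to stabilize downstream models.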
What readers gain
- A coherent learning path – Start at the bottom of the statistical ladder and climb to advanced causal inference without feeling lost.
- Hands‑on experience – Nearly every article ships with a runnable notebook or script, so you can experiment immediately.
- Contextual understanding – The collection ties statistical concepts to real business problems, such as fraud detection (imbalanced data) or marketing budget allocation (tail‑risk modeling).
- Community‑tested resources – Articles are authored by practitioners who have applied the techniques at startups, banks, and research labs, providing credibility beyond textbook theory.
How to use the list effectively
- Identify your current skill gap – If you’re comfortable with descriptive statistics but stumble on causal inference, begin with the “Hypothesis Testing” and “Propensity Score Matching” entries.
- Pick a language – Most tutorials offer both R and Python versions. Choose the one that aligns with your stack and follow the code step‑by‑step.
- Apply immediately – Take a small dataset from your own work and try the technique described. The immediate feedback loop solidifies learning.
- Bookmark the cheat sheets – The “Statistics Cheat Sheet” and “Dynamic Truncated Mean in Power BI” articles serve as quick references when you’re in the middle of a project.
Where to find the full collection
The complete list is hosted on the Learn Repo section of HackerNoon. You can browse the articles directly or download a CSV of titles and URLs for offline planning. Each entry is linked to its original story, so you always land on the most up‑to‑date version.
Explore the full 118‑article list on HackerNoon
If you’re looking for a structured syllabus, consider grouping the articles into weekly milestones:
- Week 1‑2: Foundations – probability, distributions, hypothesis testing.
- Week 3‑4: Data preparation – outlier detection, imbalanced data handling, feature engineering.
- Week 5‑6: Modeling – loss functions, model calibration, PCA.
- Week 7‑8: Causal analysis – propensity scores, counterfactual forecasting, Bayesian tail‑risk.
- Week 9‑10: Domain applications – finance, marketing, sports analytics, NLP.
By the end of the ten‑week sprint you’ll have a portfolio of notebooks that demonstrate a full statistical workflow, from raw data to actionable insight.
Happy learning, and may your data always tell the right story.
