Video/Text

Subscribers only

1 Assessment, 42 Lessons

The certified Biostatistician (English Version)

The certified Biostatistician module 1

1 Assessment, 13 Lessons

Premium course

Lesson 1 (Intro, installing R)

In Module 1 Lecture 1, we start by defining the biostatistician’s core tasks (cleaning and organizing data, designing studies, choosing and applying appropriate statistical tests, interpreting results, and building reproducible workflows) and then compare how those responsibilities play out in academic settings (teaching, method development, grant writing) versus industry roles (clinical trial analysis, safety monitoring, real-world evidence projects). Next, we trace R’s evolution from its S-language roots at Bell Labs to the modern open-source ecosystem, so you understand why R has become a standard tool for data analysis. We then guide you step by step through installing R and RStudio on your computer, ensuring you have a working environment. Finally, we explore the RStudio Source Editor (how to save and run scripts, execute selected code, diagnose errors, use multiple cursors, and leverage code snippets) so that by the end of this lecture you’re ready to write and execute your first R script.

Important links:
https://cloud.r-project.org/
https://posit.co/download/rstudio-desktop/

Lesson 2: Foundations of Data in R: Types, Structures, and Access

In Module 1 Lecture 2, we dive into R’s simplest building blocks, its atomic data types. You’ll learn how to create and inspect numeric, integer, character, and logical vectors, understand how R stores each type under the hood, and see why choosing the right type matters for analysis and memory use. We then apply these basics in a hands-on exercise: given two vectors of heights and weights, use R’s vectorized arithmetic to compute BMI for each pair. This session turns you from a code installer into someone who truly speaks R’s native language, preparing you to work with more complex data structures next.
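
As a rough illustration of the BMI exercise described above, the sketch below uses made-up heights and weights (the variable names and values are assumptions, not course data) to show vectorized arithmetic and type inspection.

```r
# Hypothetical example data: heights in metres, weights in kilograms
height_m  <- c(1.60, 1.75, 1.82)
weight_kg <- c(55, 80, 95)

# Vectorized arithmetic: the formula is applied element by element,
# so no loop is needed to get one BMI per height/weight pair
bmi <- weight_kg / height_m^2
round(bmi, 1)

# Inspecting the atomic type of a vector
typeof(height_m)   # "double" (class() reports it as "numeric")
typeof(3L)         # "integer"
typeof("A")        # "character"
typeof(TRUE)       # "logical"
```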

Lesson 3: Structuring Clinical Data in R: Dimensions, Binding, and Hidden Pitfalls

In Lecture 3, we move beyond defining data types and start actively shaping our data in R. You’ll learn how to inspect and reshape objects using dim() and length(), rename rows and columns for clarity, and combine data using cbind() and rbind()—just like you would when merging clinical lab results or adding new patients to a dataset. We also confront one of R’s quiet traps: implicit coercion. You’ll see how mixing data types inside a matrix can silently convert your numeric values into text—and how to fix it by switching to data frames. By the end of this session, you’ll be confidently manipulating R structures in ways that mirror how real clinical datasets are handled, preparing you for importing, cleaning, and analyzing real-world data in the next lecture.
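
The coercion trap mentioned above can be sketched as follows. The patient IDs and haemoglobin values are invented for illustration; only base R functions are used.

```r
# Hypothetical patient data
id <- c("P01", "P02", "P03")
hb <- c(13.2, 11.8, 14.5)    # haemoglobin, g/dL

m <- cbind(id, hb)           # a matrix can hold only one type
typeof(m)                    # "character": hb was silently converted to text
dim(m)                       # 3 rows, 2 columns

df <- data.frame(id = id, hb = hb)  # each column keeps its own type
class(df$hb)                 # "numeric"
mean(df$hb)                  # arithmetic works again

# rbind() adds a new patient as an extra row
df <- rbind(df, data.frame(id = "P04", hb = 12.9))
dim(df)                      # 4 rows, 2 columns
```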

Lesson 4: Writing Your Own Tools: Functions in R

In this lecture, we move from using R to customizing it. You’ll learn how to write your own functions (your own mini-tools) to automate repetitive tasks, clean up your code, and make it reusable across projects. We’ll start with built-in functions like mean() and sum(), then show you how to create your own using the function() keyword. You’ll see how to define inputs, write logic inside the function body, return results, and even give default values to arguments. Expect hands-on examples from clinical practice, like calculating BMI, mean arterial pressure (MAP), and expected blood loss during surgery. By the end of this session, you won’t just be calling functions, you’ll be writing them.
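
A minimal sketch of what such functions might look like is given below. The function and argument names are hypothetical, and MAP uses the common approximation DBP + (SBP - DBP) / 3; the lecture's own versions may differ.

```r
# BMI with a default argument: height is assumed to be in centimetres
# unless the caller says otherwise (argument names are hypothetical)
bmi <- function(weight_kg, height, height_in_cm = TRUE) {
  height_m <- if (height_in_cm) height / 100 else height
  weight_kg / height_m^2   # the last evaluated expression is returned
}

# Mean arterial pressure, using the approximation DBP + (SBP - DBP) / 3
map_bp <- function(sbp, dbp) {
  dbp + (sbp - dbp) / 3
}

bmi(70, 175)                         # uses the default (centimetres)
bmi(70, 1.75, height_in_cm = FALSE)  # same result, height given in metres
map_bp(120, 80)
```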

Lesson 5: Efficient Data Summaries in Clinical Research Using the apply() Family

By the end of this session, participants will be able to:
• Differentiate between the apply(), lapply(), and sapply() functions in R, including their input types, output structures, and optimal use cases in clinical data analysis.
• Apply vectorized functions to summarize patient-level or variable-level clinical data, such as calculating averages, standard deviations, or custom metrics across rows or columns.
• Integrate apply() functions into clinical workflows, such as cleaning datasets, generating summary tables, or flagging outliers based on lab values or vitals.
• Write and apply custom R functions within the apply() family to handle more complex or domain-specific analysis tasks (e.g., risk stratification, score calculation).
• Interpret and troubleshoot common output types and error messages from the apply() family, especially in real-world, messy datasets typical in clinical research.
• Demonstrate efficient, reproducible code practices using the apply() family, avoiding unnecessary loops and improving script readability.
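
To make the differences in input and output concrete, here is a small sketch on invented vitals and lab values. The 85 bpm cut-off is an illustrative assumption, not a clinical rule.

```r
# Hypothetical vitals: one row per patient, one column per measurement
vitals <- matrix(
  c(120, 80, 72,
    135, 88, 90,
    110, 70, 64),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P01", "P02", "P03"), c("sbp", "dbp", "hr"))
)

apply(vitals, 2, mean)   # MARGIN = 2: one mean per column (variable)
apply(vitals, 1, max)    # MARGIN = 1: one maximum per row (patient)

labs <- list(hb = c(13.2, 11.8, NA), wbc = c(6.1, 7.4, 5.9))
lapply(labs, mean, na.rm = TRUE)  # always returns a list
sapply(labs, mean, na.rm = TRUE)  # simplifies to a named numeric vector

# A custom function inside sapply(): flag heart rates above 85 bpm
high_hr <- sapply(vitals[, "hr"], function(hr) hr > 85)
high_hr
```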

Lesson 6: From Theory to Practice: Using apply, sapply, vapply, and tapply in Real Clinical Data

This lesson introduces the practical use of the apply, sapply, vapply, and tapply functions in R, focusing on real clinical datasets. Through structured examples using vital signs, lab results, demographics, and diagnoses, learners explore how these functions help efficiently summarize, transform, and analyze patient-level data. The goal is to bridge statistical concepts with hands-on R coding, building the skills needed to handle common healthcare data challenges, from calculating means and standard deviations to grouping diagnoses and extracting useful patterns.

By the end of this lesson, learners will be able to:
• Understand the differences between each member of the apply family.
• Know when and why to use each one in clinical research workflows.
• Implement them confidently on real-world datasets in R.
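
The sketch below shows vapply() and tapply() on a small invented data frame. The column names and values are assumptions made for illustration and are not the lesson's dataset.

```r
# Hypothetical patient-level data
patients <- data.frame(
  diagnosis = c("HTN", "DM", "HTN", "DM", "HTN"),
  sbp       = c(150, 128, 142, 135, 160),
  glucose   = c(95, 180, 102, 210, 99)
)

# vapply() is like sapply(), but the expected type and length of each
# result are declared up front, so unexpected output raises an error
col_sd <- vapply(patients[c("sbp", "glucose")], sd, FUN.VALUE = numeric(1))

# tapply() splits a vector by a grouping variable and summarizes each group
sbp_by_dx <- tapply(patients$sbp, patients$diagnosis, mean)

col_sd
sbp_by_dx
```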

Lesson 7 (Cross-tabulation, if, if-else)

• Different methods for cross-tabulation
• The main structure of the if-else statement
• If only
• If with else
• if-else
• ifelse
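
A brief sketch of these topics on invented data follows. The blood pressure cut-offs are illustrative assumptions only.

```r
# Hypothetical data
sex    <- c("F", "M", "F", "M", "F")
smoker <- c("yes", "no", "no", "yes", "no")
sbp    <- c(118, 145, 132, 150, 121)

# Cross-tabulation: table() counts each combination of categories
tab <- table(sex, smoker)
tab
xtabs(~ sex + smoker)   # formula interface, same counts

# if / else if / else evaluates ONE value at a time
x <- sbp[1]
if (x >= 140) {
  status <- "high"
} else if (x >= 120) {
  status <- "elevated"
} else {
  status <- "normal"
}
status

# ifelse() is vectorized: it classifies the whole vector at once
sbp_class <- ifelse(sbp >= 140, "high", "not high")
sbp_class
```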

Lesson 8 (For loop, while loop)

Control structures in R:
• If-else
• For loop
• While loop
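
As a small illustration of the two loop types, the sketch below walks through invented heart-rate readings. The 85 bpm threshold is an assumption chosen for the example.

```r
# Hypothetical heart-rate readings
hr <- c(72, 90, 64, 101)

# for loop: repeat a block once per element
for (i in seq_along(hr)) {
  cat("Reading", i, "=", hr[i], "bpm\n")
}

# while loop: repeat for as long as a condition holds
# (here: advance until the first reading above 85 bpm is found)
i <- 1
while (i <= length(hr) && hr[i] <= 85) {
  i <- i + 1
}
i   # index of the first reading above 85, or length(hr) + 1 if there is none
```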

Lesson 9 (nested loop, all in one)

Control structures in R (continued):
• Nested loops
• All-in-one examples

Lesson 10 (KAP questionnaire validity and reliability)

KAP questionnaire:
• Validity
• Reliability

Lesson 11 (KAP questionnaire domains)

KAP questionnaire:
• Basic information
• Knowledge
• Attitude
• Perception

Lesson 12 (KAP questionnaire correlation, covariance, predictors)

KAP questionnaire:
• Covariance
• Correlation
• Predictors

Project#1 Discussion

The certified Biostatistician module 2

15 Lessons

Lecture 1 (Probability)

Why probability?
• Nothing in life is certain. In everything we do, we gauge the chances of successful outcomes, from business to medicine to the weather.
• A probability provides a quantitative description of the chances or likelihoods associated with various outcomes.

Lecture 2 (Statistical reasoning)

Statistical reasoning is the way people reason with statistical ideas and make sense of statistical information. Statistical reasoning may involve connecting one concept to another or may combine ideas about data and chance. Reasoning means understanding and being able to explain statistical processes, and being able to fully interpret statistical results.

Lesson 3 (t-test intro)

• Difference between a statistic and a parameter
• The history of z
• Why the t-test
• One-sample t-test
• Independent-samples t-test
• Paired t-test
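
The three t-test variants listed above can be sketched with base R's t.test(). The data are simulated purely for illustration; the means and standard deviations are arbitrary assumptions.

```r
set.seed(1)  # simulated data, for illustration only
before  <- rnorm(20, mean = 140, sd = 10)        # e.g. SBP before treatment
after   <- before - rnorm(20, mean = 5, sd = 4)  # same patients after treatment
group_b <- rnorm(20, mean = 132, sd = 10)        # an unrelated second group

one_sample  <- t.test(before, mu = 130)              # mean vs. a fixed value
independent <- t.test(before, group_b)               # two groups (Welch by default)
paired      <- t.test(before, after, paired = TRUE)  # same patients measured twice

one_sample
independent
paired
```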

Lesson 4 (t-test assumptions)

Normality:
• Histogram
• Boxplot
• Skewness
• Kurtosis

Lesson 5 (Chi-Square)

Basic logic: we look for significant differences between the observed cell frequencies in a table (fo) and those that would be expected by random chance (fe).
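
A worked sketch of this logic on an invented 2 x 2 table: expected counts are computed from the margins, the statistic is the sum of (fo - fe)^2 / fe, and the result is compared with base R's chisq.test(). The counts and labels are assumptions.

```r
# Hypothetical 2 x 2 table of observed counts (fo)
fo <- matrix(c(30, 20,
               10, 40),
             nrow = 2, byrow = TRUE,
             dimnames = list(treatment = c("A", "B"),
                             outcome   = c("improved", "not improved")))

# Expected counts under independence: (row total * column total) / grand total
fe <- outer(rowSums(fo), colSums(fo)) / sum(fo)

# Chi-square statistic: sum over all cells of (fo - fe)^2 / fe
chi_sq <- sum((fo - fe)^2 / fe)

# chisq.test() gives the same statistic when the continuity correction is off
res <- chisq.test(fo, correct = FALSE)
chi_sq
res$statistic
res$expected
```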

Lesson 6 (Chi-square cont., Fisher's exact test, McNemar's test)

The Chi-Square Test of Independence assesses statistical independence or association between two or more categorical variables. It can only compare categorical variables; it cannot make comparisons between continuous variables or between categorical and continuous variables. Additionally, it only assesses associations between categorical variables and cannot provide any inferences about causation.

Lesson 7 ANOVA Family

ANOVA FAMILY

Lesson 8 (Nonparametric alternative to ANOVA)

Nonparametric alternatives to:
• ANOVA
• MANOVA
• ANCOVA
• MANCOVA

Lesson#9 (Factor Analysis)

Factor analysis is a statistical technique used to understand the relationships among a large number of variables. In healthcare research it is often used to identify underlying (latent) factors behind a set of measured variables related to a particular health outcome. For example, suppose a healthcare professional is interested in studying the factors that contribute to the development of heart disease. They may collect data on various risk factors such as age, gender, smoking status, blood pressure, cholesterol levels, and family history. Factor analysis can help show which of these risk factors vary together: it groups variables that are highly correlated with each other and separates them from variables that are less correlated. This reduces the complexity of the data and provides insight into the underlying factors behind a particular health outcome, which can in turn inform which groups of risk factors to target in order to prevent or treat a health condition. In summary, factor analysis is a useful statistical tool that can help healthcare professionals better understand the complex relationships between health-related variables and ultimately improve patient outcomes.

Lesson#10 Bonus Lecture (Factor Analysis)

Thanks to our dear colleagues for their outstanding lecture:
Dr. Madonna Ibraam
Dr. Rash Emad
Dr. Alaa Hamdan
Dr. Nada Alfarra

Our talk today:
1. What is factor analysis (FA)?
◦ Concepts of FA
◦ Assumptions of FA
◦ Data evaluation
2. How to conduct FA?
◦ Factor number (n) determination
◦ Factor extraction
◦ Factor loadings
◦ Factor scoring
◦ Factor rotation
3. How to interpret FA results?
4. Conclusion

Lesson#11 (Principal Component Analysis)

Principal Component Analysis

Lesson#12 (Principal Component Analysis, Part 2)

Principal Component Analysis, Part 2 (practical session)

Lesson#13 (ADaM, SDTM) & (CRF)

The SDTM standard provides a structured format for organizing and presenting data collected during clinical trials. It defines a set of domains, variables, and relationships to represent various aspects of the study, such as demographics, adverse events, and laboratory measurements. SDTM standardizes how the data is structured and labeled, ensuring consistency and facilitating data sharing and integration across different studies and organizations.

Lesson#14 (Compliant table generation)

Compliant table generation:
1- Generating the dataset
2- Descriptive statistics
3- Comparative statistics
4- Regression models

Module II Project's Answer

The Certified Biostatistician Module 3

14 Lessons

Lesson#1 (Fundamental theory behind linear regression)

1. Are relations between variables always linear?
2. Why and how was the correlation coefficient developed?
3. Pearson and Spearman correlation formulas
4. How to interpret the correlation coefficient
5. Hint on linear regression
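
As a sketch of the two coefficients, the example below computes Pearson and Spearman correlations with base R's cor() on invented height and weight values, and also writes Pearson's r out from its definition.

```r
# Hypothetical paired measurements
height_cm <- c(150, 160, 165, 172, 180, 185)
weight_kg <- c(50, 58, 63, 70, 82, 95)

r_pearson  <- cor(height_cm, weight_kg, method = "pearson")   # linear association
r_spearman <- cor(height_cm, weight_kg, method = "spearman")  # rank-based, monotonic association

# Pearson's r from its definition:
# sum of cross-deviations divided by the root of the product of squared deviations
dx <- height_cm - mean(height_cm)
dy <- weight_kg - mean(weight_kg)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

r_pearson
r_spearman
r_manual
```

Because both invented vectors increase together without ties, the ranks agree perfectly and the Spearman coefficient is 1, while Pearson's r is slightly below 1 since the relationship is not exactly linear.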

Lesson#2 (simple linear regression)

You can use simple linear regression when you want to know:
1- How strong the relationship is between two variables (e.g. the relationship between weight and height).
2- The value of the dependent variable at a certain value of the independent variable (e.g. the expected weight at a specific height).
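
Both questions can be sketched with base R's lm() and predict(). The data frame below is invented for illustration.

```r
# Hypothetical data: weight (kg) modelled as a function of height (cm)
dat <- data.frame(
  height_cm = c(150, 160, 165, 172, 180, 185),
  weight_kg = c(50, 58, 63, 70, 82, 95)
)

fit <- lm(weight_kg ~ height_cm, data = dat)

# 1- Strength of the relationship: proportion of variance in weight explained
r2 <- summary(fit)$r.squared
r2
coef(fit)   # intercept and slope

# 2- Value of the dependent variable at a given height
pred_170 <- predict(fit, newdata = data.frame(height_cm = 170))
pred_170
```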

Lesson#3 (Multiple Linear Regression)

1- What is multiple linear regression? When to use multiple linear regression?
2- Difference between simple and multiple linear regression
3- Decomposition of the total deviation in multiple linear regression
4- Terms used in modeling multiple linear regression
5- Let's apply it to simulated data (PS: it is simulated from the results of real data)
6- How to identify a confounder? (Interaction effect in the multiple linear regression equation)

Lesson#4 (logistic regression)

1- When to use logistic regression?
2- What are the underlying calculations of logistic regression?
3- Example and how to implement in R: univariate
4- How to interpret?
5- Example and how to implement in R: multivariate

Lesson#5 (Conditional Logistic Regression)

• Conditional logistic regression
• Multinomial logistic regression
• Ordinal logistic regression

Lesson#6 (Ordinal Logistic Regression)

• What is ordinal logistic regression?
• When to use ordinal logistic regression?
• Don’t use an ordinal model if …
• How to label dummy variables in R
• Fit an ordered logit model using the clm function
• Fit an ordered logit model using the polr and brant functions

Lesson#7 (Poisson regression)

• What is Poisson regression, and when to use it?
• Hint on the Poisson distribution
• Let's discover our data
• R code for Poisson regression
• How to interpret?

Lesson#8 (Repeated Measure ANOVA)

• When to use a repeated measures ANOVA
• Hypotheses for repeated measures ANOVA
• Logic of the repeated measures ANOVA
• Assumptions
• Computing and visualization

Lesson#9 (GEE)

• What is GEE (generalized estimating equations)?
• When to use GEE
• Computation
• Choosing the best model
• Computing the confidence interval

Lesson#10 Bonus Lecture (Multinomial logistic regression)

Lesson#11 (survival analysis)

The survival probability (also called the survivor function) S(t) is the probability that an individual survives from the time origin (e.g. diagnosis of cancer) to a specified future time t. It is fundamental to survival analysis because survival probabilities for different values of t provide crucial summary information from time-to-event data.
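
One common way to estimate S(t) from censored data is the Kaplan-Meier method. The sketch below assumes the survival package (a recommended package that ships with standard R installations) and its built-in lung dataset; it is an illustration, not necessarily the lecture's own example.

```r
library(survival)

# Kaplan-Meier estimate of S(t) for all patients in the built-in `lung` data
# (time is in days; status marks whether death was observed or censored)
fit <- survfit(Surv(time, status) ~ 1, data = lung)

# Estimated survival probability at 180 and 365 days
s <- summary(fit, times = c(180, 365))
data.frame(time = s$time, survival = s$surv)
```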

Lesson#12 (Cox & Survival)

Project

Module 3 project

Deadline: two weeks.

This is a safety dataset used to assess abnormal values at any point during the study and to find increasing or decreasing trends in any measurement during the trial. It comes from a Phase IV clinical trial comparing the safety of drug A versus drug B.

1- Separate the blood pressure into DBP and SBP.
2- For every column, make a new one labeled normal/abnormal (look up the normal value ranges).
3- Find the right test for the repeated measures based on the distribution of every value (hint: long format and wide format).
4- Find an appealing visualization method, as if you were presenting the results to clinicians who are not familiar with statistics.
5- Five of you will be assigned to present the project; Dr Rania Ibrahim will post the names within 7 days.

Project answer

About the teacher

MARS GLOBAL TEAM

A dynamic and highly skilled researcher and entrepreneur who is making waves in the biomedical industry, with a Master's in Epidemiology and a Ph.D. in Biostatistics.

NOURAN HAMZA, CEO and Biostatistics Expert