The Certified Biostatistician (English Version)
Premium course
Intermediate
Video/Text
Subscribers only
1 Assessment, 42 Lessons
In Module 1 Lecture 1, we start by defining the biostatistician’s core tasks (cleaning and organizing data, designing studies, choosing and applying appropriate statistical tests, interpreting results, and building reproducible workflows) and then compare how those responsibilities play out in academic settings (teaching, method development, grant writing) versus industry roles (clinical trial analysis, safety monitoring, real-world evidence projects). Next, we trace R’s evolution from its S-language roots at Bell Labs to the modern open-source ecosystem, so you understand why R has become a standard tool for data analysis. We then guide you step by step through installing R and RStudio on your computer, ensuring you have a working environment. Finally, we explore the RStudio Source Editor (how to save and run scripts, execute selected code, diagnose errors, use multiple cursors, and leverage code snippets) so that by the end of this lecture you’re ready to write and execute your first R script.
Important links:
https://cloud.r-project.org/
https://posit.co/download/rstudio-desktop/
In Module 1 Lecture 2, we dive into R’s simplest building blocks, its atomic data types. You’ll learn how to create and inspect numeric, integer, character, and logical vectors, understand how R stores each type under the hood, and see why choosing the right type matters for analysis and memory use. We then apply these basics in a hands-on exercise: given two vectors of heights and weights, use R’s vectorized arithmetic to compute BMI for each pair. This session turns you from a code installer into someone who truly speaks R’s native language, preparing you to work with more complex data structures next.
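The BMI exercise can be sketched as follows. The values and units (metres, kilograms) are made up for illustration and are not the lecture's dataset.

```r
# Hypothetical heights (metres) and weights (kilograms) for four patients
height_m  <- c(1.60, 1.75, 1.82, 1.68)
weight_kg <- c(55, 80, 95, 62)

# Vectorized arithmetic: the formula is applied to each height/weight pair
bmi <- weight_kg / height_m^2

# Inspect the storage type and the rounded result
typeof(bmi)
round(bmi, 1)
```

No loop is needed: R divides the two vectors element by element.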
In Lecture 3, we move beyond defining data types and start actively shaping our data in R. You’ll learn how to inspect and reshape objects using dim() and length(), rename rows and columns for clarity, and combine data using cbind() and rbind()—just like you would when merging clinical lab results or adding new patients to a dataset. We also confront one of R’s quiet traps: implicit coercion. You’ll see how mixing data types inside a matrix can silently convert your numeric values into text—and how to fix it by switching to data frames. By the end of this session, you’ll be confidently manipulating R structures in ways that mirror how real clinical datasets are handled, preparing you for importing, cleaning, and analyzing real-world data in the next lecture.
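A minimal sketch of the coercion trap described here, using invented patient IDs and haemoglobin values:

```r
# Hypothetical lab values for three patients
id  <- c("P01", "P02", "P03")
hgb <- c(13.2, 11.8, 14.5)

# cbind() builds a matrix, which can hold only one type,
# so the numeric values are silently converted to character
m <- cbind(id, hgb)
class(m[, "hgb"])

# A data frame keeps each column in its own type
df <- data.frame(id = id, hgb = hgb)
class(df$hgb)
mean(df$hgb)
dim(df)
```

Calling mean() on the matrix column would return NA with a warning, because the values are now text. The data frame version works as expected.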
In this lecture, we move from using R to customizing it. You’ll learn how to write your own functions (your own mini-tools) to automate repetitive tasks, clean up your code, and make it reusable across projects. We’ll start with built-in functions like mean() and sum(), then show you how to create your own using the function() keyword. You’ll see how to define inputs, write logic inside the function body, return results, and even give default values to arguments. Expect hands-on examples from clinical practice, like calculating BMI, MAP, and expected blood loss during surgery. By the end of this session, you won’t just be calling functions; you’ll be writing them.
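As an illustration, here is one way a MAP function with a default argument might look. The name calc_map and the digits argument are hypothetical, and the formula is the common approximation MAP = (SBP + 2 × DBP) / 3, which may differ from the one used in the lecture.

```r
# Mean arterial pressure (mmHg) from systolic and diastolic readings,
# using the common approximation MAP = (SBP + 2 * DBP) / 3.
# `digits` has a default value, so callers may omit it.
calc_map <- function(sbp, dbp, digits = 1) {
  map <- (sbp + 2 * dbp) / 3
  round(map, digits)
}

calc_map(120, 80)                              # uses the default digits = 1
calc_map(c(120, 140), c(80, 90), digits = 0)   # vectorized, default overridden
```

Because the body uses only vectorized arithmetic, the same function works on a single reading or on whole columns.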
By the end of this session, participants will be able to:
• Differentiate between the apply(), lapply(), and sapply() functions in R, including their input types, output structures, and optimal use cases in clinical data analysis.
• Apply vectorized functions to summarize patient-level or variable-level clinical data, such as calculating averages, standard deviations, or custom metrics across rows or columns.
• Integrate apply() functions into clinical workflows, such as cleaning datasets, generating summary tables, or flagging outliers based on lab values or vitals.
• Write and apply custom R functions within the apply() family to handle more complex or domain-specific analysis tasks (e.g., risk stratification, score calculation).
• Interpret and troubleshoot common output types and error messages from the apply() family, especially in real-world, messy datasets typical in clinical research.
• Demonstrate efficient, reproducible code practices using the apply() family, avoiding unnecessary loops and improving script readability.
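A small sketch of a custom function used with lapply() and sapply() to flag out-of-range lab values. The data, the helper name flag_abnormal, and the reference ranges are illustrative assumptions only, not clinical guidance.

```r
# Hypothetical lab values; the ranges below are placeholders for illustration
labs <- list(
  potassium = c(3.9, 4.4, 6.1, 4.0),
  sodium    = c(139, 128, 141, 137)
)
ranges <- list(potassium = c(3.5, 5.0), sodium = c(135, 145))

# Custom function: TRUE where a value falls outside its range
flag_abnormal <- function(x, range) x < range[1] | x > range[2]

# lapply() always returns a list, one element per input
flags <- lapply(names(labs), function(nm) flag_abnormal(labs[[nm]], ranges[[nm]]))
names(flags) <- names(labs)

# sapply() simplifies the result to a named vector when it can
n_abnormal <- sapply(flags, sum)
n_abnormal
```

The list from lapply() keeps the per-patient flags, while the sapply() call collapses them into one count per lab test.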
This lesson introduces the practical use of the apply, sapply, vapply, and tapply functions in R, focusing on real clinical datasets. Through structured examples using vital signs, lab results, demographics, and diagnoses, learners explore how these functions help efficiently summarize, transform, and analyze patient-level data. The goal is to bridge statistical concepts with hands-on R coding, building the skills needed to handle common healthcare data challenges, from calculating means and standard deviations to grouping diagnoses and extracting useful patterns.
By the end of this lesson, learners will be able to:
• Understand the differences between each member of the apply family.
• Know when and why to use each one in clinical research workflows.
• Implement them confidently on real-world datasets in R.
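To make the differences concrete, the sketch below runs all four functions on a small invented vitals table. The column names and diagnosis labels are assumptions for illustration.

```r
# Hypothetical vitals for six patients
vitals <- data.frame(
  sbp = c(118, 135, 142, 110, 128, 150),
  dbp = c(76, 85, 92, 70, 82, 95),
  hr  = c(72, 80, 88, 64, 76, 90)
)
diagnosis <- c("HTN", "HTN", "HTN", "Healthy", "Healthy", "HTN")

# apply(): work over rows (1) or columns (2) of a matrix-like object
row_means <- apply(vitals, 1, mean)

# sapply(): apply a function to each column and simplify to a vector
col_sd <- sapply(vitals, sd)

# vapply(): like sapply(), but the expected output type is declared up front
col_mean <- vapply(vitals, mean, FUN.VALUE = numeric(1))

# tapply(): summarize one variable within the groups of another
sbp_by_dx <- tapply(vitals$sbp, diagnosis, mean)
```

vapply() is the stricter choice in scripts, since it raises an error if a result does not match the declared type and length.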
Different methods for cross-tabulation • The main structure for the if-else statement • If only • With else • if-else • ifelse
Control structures in R • If-else • For loop • While loop
Control structures in R (continued) • Nested loop • All-in-one examples
KAP Questionnaire • Validity • Reliability
KAP Questionnaire: Basic information • Knowledge • Attitude • Perception
KAP Questionnaire: Covariance • Correlation • Predictors
Why Probability?
• Nothing in life is certain. In everything we do, we gauge the chances of successful outcomes, from business to medicine to the weather.
• A probability provides a quantitative description of the chances or likelihoods associated with various outcomes.
Statistical reasoning is the way people reason with statistical ideas and make sense of statistical information. Statistical reasoning may involve connecting one concept to another or may combine ideas about data and chance. Reasoning means understanding and being able to explain statistical processes, and being able to fully interpret statistical results.
Difference between a statistic and a parameter • The history of z • Why the t-test • One-sample t-test • Independent-samples t-test • Paired t-test
Normality • Histogram • Boxplot • Skewness • Kurtosis
Basic logic: we are looking for significant differences between the actual (observed) cell frequencies in a table (fo) and those that would be expected by random chance (fe).
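The comparison of fo and fe can be worked through by hand in R. The 2x2 counts below are invented, and the expected counts use the standard formula (row total × column total) / grand total.

```r
# Hypothetical 2x2 table of observed counts (fo): treatment group by outcome
fo <- matrix(c(30, 20,
               15, 35),
             nrow = 2, byrow = TRUE,
             dimnames = list(group   = c("A", "B"),
                             outcome = c("improved", "not improved")))

# Expected counts (fe) under independence:
# (row total * column total) / grand total
fe <- outer(rowSums(fo), colSums(fo)) / sum(fo)

# Chi-square statistic: sum of (fo - fe)^2 / fe over all cells
chi_sq <- sum((fo - fe)^2 / fe)

# The built-in test, without continuity correction, gives the same statistic
test <- chisq.test(fo, correct = FALSE)
test$statistic
```

The larger the gap between fo and fe, the larger the statistic and the smaller the p-value reported by chisq.test().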
The Chi-Square Test of Independence tests for statistical independence or association between two or more categorical variables. It can only compare categorical variables; it cannot make comparisons between continuous variables or between categorical and continuous variables. Additionally, it only assesses associations between categorical variables and cannot provide any inferences about causation.
ANOVA FAMILY
Nonparametric alternatives to: ANOVA • MANOVA • ANCOVA • MANCOVA
Factor analysis is a statistical technique used to understand the relationships among a large number of variables. It is often used in healthcare research to identify underlying (latent) factors that explain why groups of observed variables move together. For example, suppose a healthcare professional is interested in studying the factors that contribute to the development of heart disease. They may collect data on various risk factors such as age, gender, smoking status, blood pressure, cholesterol levels, and family history. Factor analysis can help show how these risk factors cluster into a smaller number of underlying dimensions. It does this by grouping together variables that are highly correlated with each other and separating them from variables that are less correlated. This process reduces the complexity of the data and provides insight into the underlying structure of the risk factors; the resulting factors can then be carried into further analyses of a particular health outcome. In summary, factor analysis is a useful statistical tool that can help healthcare professionals better understand the complex relationships between health-related variables.
Thanks to our dear colleagues for their outstanding lecture: Dr. Madonna Ibraam, Dr. Rash Emad, Dr. Alaa Hamdan, Dr. Nada Alfarra
====================
Our talk today
1. What is factor analysis (FA)?
◦ Concepts of FA
◦ Assumptions of FA
◦ Data evaluation
2. How to conduct FA?
◦ Factor number determination
◦ Factor extraction
◦ Factor loadings
◦ Factor scoring
◦ Factor rotation
3. How to interpret FA results?
4. Conclusion
Principal Component Analysis
Principal Component Analysis, Part 2 (practical session)
The Study Data Tabulation Model (SDTM) standard provides a structured format for organizing and presenting data collected during clinical trials. It defines a set of domains, variables, and relationships to represent various aspects of the study, such as demographics, adverse events, and laboratory measurements. SDTM standardizes how the data are structured and labeled, ensuring consistency and facilitating data sharing and integration across different studies and organizations.
Compliant table generation:
1- Generating the dataset
2- Descriptive statistics
3- Comparative statistics
4- Regression models
1. Are relations between variables always linear?
2. Why and how was the correlation coefficient developed?
3. Pearson and Spearman correlation formulas
4. How to interpret the correlation coefficient
5. Hint on linear regression
You can use simple linear regression when you want to know:
1- How strong the relationship is between two variables (e.g. the relationship between weight and height).
2- The value of the dependent variable at a certain value of the independent variable (e.g. the expected weight at a specific height).
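Both questions can be answered from a single lm() fit. The heights and weights below are invented for illustration, and R-squared is used here as one possible measure of strength.

```r
# Hypothetical heights (cm) and weights (kg) for eight people
height <- c(150, 155, 160, 165, 170, 175, 180, 185)
weight <- c(52, 55, 60, 63, 68, 72, 77, 80)

# Fit weight as a linear function of height
fit <- lm(weight ~ height)

# 1- Strength of the relationship: R-squared
#    (with one predictor this equals the squared Pearson correlation)
r_squared <- summary(fit)$r.squared

# 2- Expected weight at a specific height
pred <- predict(fit, newdata = data.frame(height = 172))
```

coef(fit) returns the intercept and slope, where the slope is the estimated change in weight per additional centimetre of height.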
1- What is multiple linear regression? When to use multiple linear regression?
2- Difference between simple and multiple linear regression
3- Decomposition of the total deviation in multiple linear regression
4- Terms to be used in modeling multiple linear regression
5- Let's apply it to simulated data (note: it is simulated from the results of real data)
6- How to identify a confounder? (Interaction effect in the multiple linear regression equation)
1- When to use logistic regression?
2- What are the underlying calculations of logistic regression?
3- Example and how to implement in R: univariate
4- How to interpret?
5- Example and how to implement in R: multivariate
Conditional Logistic Regression • Multinomial Logistic Regression • Ordinal Logistic Regression
What is ordinal logistic regression? • When to use ordinal logistic regression? • Don’t use an ordinal model if … • How to label dummy variables in R • Fit an ordered logit model using the clm function • Fit an ordered logit model using the polr and brant functions
What is Poisson regression, and when to use it? • Hint on the Poisson distribution • Let's discover our data • R code for Poisson regression • How to interpret?
When to use a Repeated Measures ANOVA • Hypothesis for Repeated Measures ANOVA • Logic of the Repeated Measures ANOVA • Assumptions • Computing and visualization
What is GEE? • When to use GEE • Computation • Choosing the best model • Computing the confidence interval
The survival probability (which is also called the survivor function) S(t) is the probability that an individual survives from the time origin (e.g. diagnosis of cancer) to a specified future time t. It is fundamental to a survival analysis because survival probabilities for different values of t provide crucial summary information from time to event data.
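One common way to estimate S(t) from censored data is the Kaplan-Meier estimator. The text above does not name a method, so treat this as an assumed illustration. The follow-up times and event indicators below are invented, and the calculation uses base R only.

```r
# Hypothetical follow-up times (months) and status (1 = event, 0 = censored)
time   <- c(2, 3, 3, 5, 7, 8, 10, 12)
status <- c(1, 1, 0, 1, 0, 1, 1, 0)

# Distinct times at which at least one event occurred
event_times <- sort(unique(time[status == 1]))

# Number still at risk and number of events at each event time
n_risk  <- sapply(event_times, function(tt) sum(time >= tt))
n_event <- sapply(event_times, function(tt) sum(time == tt & status == 1))

# Kaplan-Meier estimate of S(t): cumulative product of (1 - events / at risk)
surv <- cumprod(1 - n_event / n_risk)

km <- data.frame(time = event_times, n_risk, n_event, surv)
km
```

Each row of km gives the estimated probability of surviving beyond that time. Censored patients leave the risk set without lowering the curve.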
Module 3 project
The deadline is two weeks.
This is a safety dataset used to assess abnormal values at any point during the study and to find increasing or decreasing trends in any measurement during the trial. It comes from a Phase IV clinical trial comparing the safety of drug A versus drug B.
1- Separate the blood pressure into DBP and SBP.
2- For every column, make a new one labeled normal/abnormal (look up the normal value ranges).
3- Find the right test for the repeated measures based on the distribution of every value (hint: long format and wide format).
4- Find an appealing visualization method, as if you were presenting the results to clinicians who are not familiar with statistics.
5- Five of you will be assigned to present the project; Dr Rania Ibrahim will post the names within 7 days.
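As a starting point for steps 1 and 2 and the long/wide hint in step 3, here is a rough sketch. Everything in it is an assumption: the column names, the "SBP/DBP" text format, and above all the 90 to 139 threshold, which is a placeholder and not a clinical reference range. The real project data may look quite different.

```r
# Hypothetical wide-format safety data: blood pressure stored as "SBP/DBP" text
safety <- data.frame(
  id       = c(1, 2, 3),
  drug     = c("A", "B", "A"),
  bp_week0 = c("120/80", "145/95", "118/76"),
  bp_week4 = c("125/82", "138/88", "121/79"),
  stringsAsFactors = FALSE
)

# Wide to long: one row per patient per visit
long <- reshape(safety, direction = "long",
                varying = c("bp_week0", "bp_week4"),
                v.names = "bp", timevar = "week", times = c(0, 4),
                idvar = "id")

# Step 1: split "SBP/DBP" into two numeric columns
parts <- strsplit(long$bp, "/", fixed = TRUE)
long$sbp <- as.numeric(sapply(parts, `[`, 1))
long$dbp <- as.numeric(sapply(parts, `[`, 2))

# Step 2: flag each value (placeholder threshold, replace with a sourced range)
long$sbp_flag <- ifelse(long$sbp >= 90 & long$sbp <= 139, "normal", "abnormal")
```

The long format puts every visit on its own row, which is the shape most repeated-measures functions and plotting tools expect.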