Thursday, December 5, 2019

Linear Regression using R and Python

Sometimes descriptive analysis is not enough and we also want to predict future values from past trends. In this post we will look at techniques for predicting the number of studies submitted or registered in future years.

We will see how to use libraries such as pandas, statsmodels and matplotlib in Python.
The Python code is available in my GitHub repository:
https://github.com/kalehdoo/clintrials/blob/master/ctrials_1.py
I will also try to explain the steps and procedures for the analysis later.
Here is the final outcome:
Model Summary:

Predicted number of studies submitted:
For 2019: 31,022
For 2020: 32,479

The data used for the regression covers 2005-2018.
2019 is not over yet, so I will come back next year to compare the 2019 actuals with the predicted numbers here. The actual count for 2019 so far is 19,990 (data through Aug 2019), as posted in one of the previous posts.
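
To make it concrete how a prediction for a year outside the fitted range is produced, here is a small sketch that fits a straight line with the closed-form least-squares formulas and then extends it to 2019 and 2020. The helper `fit_line` is a hypothetical name, and the counts are a toy, exactly linear series chosen so the result is easy to check by hand. They are not the real data.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


years = list(range(2005, 2019))  # 2005 through 2018
# Toy counts (NOT real data): start at 1000 and grow by 100 per year.
studies = [1000 + 100 * (y - 2005) for y in years]

intercept, slope = fit_line(years, studies)

# Extrapolate: plug a future year into intercept + slope * year.
predictions = {year: intercept + slope * year for year in (2019, 2020)}
for year, value in predictions.items():
    print(year, round(value))
```

With real data the points will not sit exactly on the line, and predictions further from the fitted years carry more uncertainty.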

I have also shown how to run the regression in R and how to interpret the results. The link below has the complete code and analysis:
http://rpubs.com/kalehdoo/sponsor_analytics


Source data extracted from: https://aact.ctti-clinicaltrials.org