R and Python
Sometimes descriptive analysis is not enough, and we also want to predict future values from past trends. In this post we look at how to predict the number of studies submitted or registered in future years, using Python libraries such as pandas, statsmodels and matplotlib.
The Python code is available in my GitHub repository:
https://github.com/kalehdoo/clintrials/blob/master/ctrials_1.py
I will also try to explain the steps and procedures to perform the analysis later.
Here is the final outcome:
Model Summary:
Predicted number of studies submitted:
For 2019 : 31,022
For 2020 : 32,479
The data used for the regression is from 2005-2018.
2019 is not over yet, so I will come back next year to compare the 2019 actuals with the predicted numbers here. The actual count for 2019 so far is 19,990 (data through August 2019), as reported in one of the previous posts.
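Until the full-year number is available, a rough sanity check is possible with simple arithmetic. The sketch below compares the partial-year count with the prediction and scales it up to twelve months. It assumes the 19,990 figure covers eight full months and that submissions are spread evenly across the year; neither assumption comes from the data, so treat the result as a ballpark only.

```python
# Figures quoted in this post.
actual_so_far = 19_990      # studies submitted in 2019, data through August
predicted_2019 = 31_022     # regression prediction for the full year

# ASSUMPTION: the partial count covers exactly 8 full months,
# and submissions arrive at an even monthly rate.
months_elapsed = 8

share_reached = actual_so_far / predicted_2019
annualized = actual_so_far * 12 / months_elapsed
gap = predicted_2019 - annualized

print(f"Share of prediction reached so far: {share_reached:.1%}")
print(f"Naive full-year estimate: {annualized:,.0f}")
print(f"Prediction minus naive estimate: {gap:,.0f}")
```

If submissions are seasonal or the August data is incomplete, the naive estimate will be off, which is why the proper comparison has to wait for the full-year actuals.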
I have also shown how to do the regression in R and how to interpret the results. The link below has the complete code and analysis:
http://rpubs.com/kalehdoo/sponsor_analytics
Source data extracted from: https://aact.ctti-clinicaltrials.org