R and Python
Sometimes descriptive analysis is not enough, and we also want to predict future values from past trends. In this post we look at techniques for predicting the number of studies submitted or registered in future years.
We will see how to use Python libraries such as pandas, statsmodels and matplotlib.
The Python code is available in my GitHub repository:
https://github.com/kalehdoo/clintrials/blob/master/ctrials_1.py
I will also try to explain the steps and procedures to perform the analysis later.
Here is the final outcome:
Model Summary:
Predicted number of studies submitted:
For 2019 : 31,022
For 2020 : 32,479
The data used for the regression is from 2005-2018.
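To make the idea of a trend regression concrete, here is a minimal sketch. The linked script uses pandas and statsmodels; to stay dependency-free, this version computes an ordinary least squares line by hand instead. It assumes a straight-line trend of yearly counts on the year, which may not match the model in the repository. The function names and the sample series are hypothetical, and the numbers are placeholders, not the AACT counts.

```python
def fit_linear_trend(years, counts):
    """Fit counts = intercept + slope * year by ordinary least squares."""
    n = len(years)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two (year, count) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    # Sum of squared deviations of the years, and the cross-deviation term.
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, year):
    """Extrapolate the fitted line to a given year."""
    return intercept + slope * year


# Placeholder data only: a made-up series, NOT the real study counts.
years = list(range(2005, 2019))
counts = [1000 + 50 * (y - 2005) for y in years]

intercept, slope = fit_linear_trend(years, counts)
for future_year in (2019, 2020):
    print(future_year, round(predict(intercept, slope, future_year)))
```

To reproduce the numbers in this post, replace the placeholder series with the yearly counts extracted from AACT. A library such as statsmodels additionally reports the model summary (R-squared, standard errors), which this sketch does not compute.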
2019 is not over yet, so I will come back next year to compare the 2019 actuals with the predicted numbers here. The actual count for 2019 so far is 19,990 (data until Aug 2019), as reported in one of the previous posts.
I have also shown how to do the regression in R and how to interpret the results. The link below has the complete code and analysis:
http://rpubs.com/kalehdoo/sponsor_analytics
Source data extracted from: https://aact.ctti-clinicaltrials.org
Study Activity Dashboard
In this post, we will try to understand clinical study activity across the globe. We will gather some inputs such as population, GDP and health spending as % of GDP, and then compare countries by their involvement in clinical studies. A study counts toward a country's activity if it has a clinical site in that country. Keep in mind that those clinical sites or facilities may or may not have enrolled any participants. Also, the demographic data is for the year 2017, and we are considering all the clinical studies registered on the US registry clinicaltrials.gov as of Aug 2019. With that in mind, we will try to get a sense of overall study activity and compare it across countries. We will also look at study activity at the region level. So, just sit back and relax.
Figure 1.1
Figure 1.1 shows study activity per 100 K population for each country. Denmark tops the list with the highest number of studies per 100 K population. This is not the complete list; I have displayed as many countries as I could fit in one picture. You will notice that some countries have a very small population and therefore a high activity per capita, while a few countries with very large populations have a low ratio.
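The per-capita measure is simple arithmetic: the number of studies divided by the population, scaled to 100 K people. A small sketch is below. The function name and the two countries are hypothetical, and the figures are placeholders chosen only to show how a small population can produce a high ratio.

```python
def studies_per_100k(study_count, population):
    """Return the number of studies per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return study_count / population * 100_000


# Placeholder rows (name, studies, population): NOT real country data.
rows = [
    ("Country A", 500, 5_000_000),      # small population
    ("Country B", 2_000, 200_000_000),  # large population
]

# Rank countries from highest to lowest activity per 100 K population.
ranked = sorted(
    ((name, studies_per_100k(studies, pop)) for name, studies, pop in rows),
    key=lambda item: item[1],
    reverse=True,
)
for name, rate in ranked:
    print(f"{name}: {rate:.2f} studies per 100 K")
```

Here Country B has four times as many studies as Country A, yet Country A ranks first because its population is forty times smaller.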
In Figure 1.2, the countries are grouped by geographic region. It also shows each region's percentage share of the population and of the number of studies. The dashboard lets you drill down into a region and see the details by country.
In the next post, we will look at study activity in relation to countries' GDP and health spending as % of GDP.
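The regional percentage shares can be sketched as a roll-up of country rows. This is only an illustration: the function name, region labels and numbers are hypothetical. It also assumes that country-level study counts can simply be added up, so a study with sites in several countries would be counted once per country. The dashboard may handle that differently.

```python
from collections import defaultdict


def region_shares(rows):
    """rows: iterable of (region, population, studies) per country.

    Returns {region: (population_share_pct, study_share_pct)}.
    """
    population = defaultdict(int)
    studies = defaultdict(int)
    for region, pop, count in rows:
        population[region] += pop
        studies[region] += count
    total_pop = sum(population.values())
    total_studies = sum(studies.values())
    if total_pop == 0 or total_studies == 0:
        raise ValueError("totals must be non-zero to compute shares")
    return {
        region: (
            100 * population[region] / total_pop,
            100 * studies[region] / total_studies,
        )
        for region in population
    }


# Placeholder rows: NOT real data.
sample = [
    ("Region 1", 100, 30),
    ("Region 1", 100, 10),
    ("Region 2", 200, 60),
]
for region, (pop_pct, study_pct) in region_shares(sample).items():
    print(f"{region}: {pop_pct:.1f}% of population, {study_pct:.1f}% of studies")
```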
Keep thinking till next time.