Importance of learning statistical software

There is good news! Many statistical software options are free and available online, including R and Python. Researchers are not limited to proprietary and often expensive packages such as Stata and SPSS. This availability means that anyone can learn and apply statistical methods in their research. The power of statistical software lies in its ability to manipulate and analyze large amounts of data efficiently, often in a matter of seconds.

Coding in R or Python can seem intimidating at first, but it all begins with a dataset prepared for analysis. Our Research Methods Program teaches analysis using R, which works well with comma-separated values (.csv) files. Excel can save and open .csv files, but a very large dataset with many observations and variables may be difficult to open in Excel. Let’s look at some real data from the World Bank's World Development Indicators for 2016. The variable names in the header of the table below represent the following variables: country (of course!), GDP_growth (the annual growth of the Gross Domestic Product in %), GDP_percapita (GDP per capita, adjusted for Purchasing Power Parity), oil_rents (oil rents as a % of GDP), and hdi (Human Development Index). Here are some of the countries from the dataset:

country        GDP_growth  GDP_percapita  oil_rents  hdi
Côte d’Ivoire   5.27       3665.55        0.67       0.486
The Gambia     -2.55       1558.70        0          0.457
Ghana           1.15       4135.65        1.07       0.588
Kenya           3.35       3121.83        0          0.585
Malawi         -0.26       1230.42        0          0.474
Nigeria        -4.17       5882.99        2.80       0.53
Uganda          1.02       1908.30        0          0.508
Zimbabwe       -0.79       2687.39        0          0.532

Even in this small sample of countries, there is considerable variation across the variables: annual GDP growth ranges from a high of over 5% in Côte d’Ivoire to a low of about -4% in Nigeria. There are different ways to load this dataset in R. Here is one way:

worldbankdata <- read.csv(file.choose())

This code creates a new data frame called “worldbankdata.” The file.choose() function opens a dialogue box in which you can select the data file on your computer, and read.csv() reads the selected file into R. Now, let’s see what it looks like in R:

[Figure: the dataset displayed in R, showing eight observations (countries) with 5 variables]
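For readers who want to try this without a data file at hand, here is a minimal sketch. It assumes the eight rows listed earlier are typed in as CSV text; the name csv_text is introduced only for this illustration and is not part of the original dataset. With a real file you would pass the file path, or file.choose(), to read.csv() instead.

```r
# The eight sample rows typed in as CSV text so the example runs
# without a file on disk (csv_text is a name made up for this sketch).
csv_text <- "country,GDP_growth,GDP_percapita,oil_rents,hdi
Côte d’Ivoire,5.27,3665.55,0.67,0.486
The Gambia,-2.55,1558.70,0,0.457
Ghana,1.15,4135.65,1.07,0.588
Kenya,3.35,3121.83,0,0.585
Malawi,-0.26,1230.42,0,0.474
Nigeria,-4.17,5882.99,2.80,0.53
Uganda,1.02,1908.30,0,0.508
Zimbabwe,-0.79,2687.39,0,0.532"

# read.csv() treats the first line as the header (variable names)
worldbankdata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

dim(worldbankdata)      # rows (countries) and columns (variables): 8 and 5
str(worldbankdata)      # the type of each variable
head(worldbankdata, 3)  # the first three rows

# Lowest and highest annual GDP growth in the sample, and which countries
range(worldbankdata$GDP_growth)
worldbankdata$country[which.min(worldbankdata$GDP_growth)]
worldbankdata$country[which.max(worldbankdata$GDP_growth)]
```

The calls to dim(), str(), and head() are a quick way to confirm that the data loaded as expected before any analysis begins.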

Opening the data is just the beginning of the analytical journey. The Research Methods Program teaches how to prepare data for analysis, from data entry to opening the data in R. The good news is that the statistical tools are accessible, so let us learn together.
