Germany, officially the Federal Republic of Germany, is located in the heart of Europe and is one of the 195 independent sovereign states in the world. In Europe, Germany is the seventh-largest country by area but the second most populous, with a population of approximately 82.3 million people (Central Intelligence Agency, 2018). The purpose of this report is to provide a statistical analysis of population census data collected at five points in time for 182 cities in Germany. Various statistical methods will be used to analyze these data and identify trends such as the population growth rate.
The census figures reported for the following five dates will be used: January 1990, December 1995, December 2001, May 2011 and December 2012. Descriptive statistics have been computed to summarize each population data set, as shown below. The values for each computation have been rounded to the nearest whole number.
Note that population data were missing for eight of the 182 cities in the January 1990 census, so the total population shown for that year may not be an accurate representation. No data were missing for the latter four years. Note also that the gaps between the census years are not uniform, which could affect the accuracy of the trend shown. The analysis will nonetheless be performed using the data given.
The total population calculated for each of the five years shows that the population of the cities grew from 30.5 million in 1990 to 32.6 million in 1995, then declined to 32.1 million in 2001 and further to 31.6 million in 2011. It rose again in 2012 to 32.1 million, which is less than half of the total population of Germany today. The same trend can be identified using the other statistical measures shown in the table. The population of Germany as a whole continues to grow slowly, with a current growth rate of 0.22 percent, compared with -0.19 percent in 2010, when the population was declining (United Nations Department of Economic and Social Affairs, 2017). A visual representation of the population growth trend over the years is shown below.
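The changes between census totals described above can be expressed as percentage changes. The following is a minimal sketch, taking the first census as January 1990 and using only the rounded totals quoted in the text; the function name `percent_changes` is illustrative, and the 1990 total covers fewer cities than the later ones, so the first change is likely overstated.

```python
# Total city populations in millions, as quoted in the text.
# Assumption: the first census year is 1990; its total omits eight cities.
totals = {1990: 30.5, 1995: 32.6, 2001: 32.1, 2011: 31.6, 2012: 32.1}


def percent_changes(series):
    """Return {(start_year, end_year): percent change} for consecutive years."""
    years = sorted(series)
    return {
        (start, end): (series[end] - series[start]) / series[start] * 100
        for start, end in zip(years, years[1:])
    }


changes = percent_changes(totals)
for (start, end), pct in changes.items():
    print(f"{start} -> {end}: {pct:+.2f}%")
```

Because the intervals span 5, 6, 10 and 1 years respectively, these figures are changes over the whole interval and are not directly comparable as annual rates.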
Another observation is that the mean is almost twice the median in all five data sets. When the mean is greater than the median, the distribution is positively skewed (skewed to the right), and for a skewed distribution the median is the best measure of central location (University of Texas, 2016). The median is also preferable because each data set contains extreme values for a few cities, which is evident in the ranges calculated. While not the best measure of variability, the large range shows that the minimum and maximum values are extremely far apart in each of the five years. However, since the range is highly influenced by extreme values and does not consider all the data, it does not accurately describe the variability of the entire distribution. In this case, it may be best to use the standard deviation to determine the spread, since it measures variability by considering the distance between each data point and the mean. The standard deviation calculated for each year is very high, which indicates that the data are widely spread around the mean.
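The measures discussed above can be computed with Python's standard `statistics` module. This is a minimal sketch on made-up figures, not the census data; the `describe` helper is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the report does not say which form was used.

```python
import statistics


def describe(populations):
    """Summarize a list of city populations with the measures used in the report."""
    values = sorted(populations)
    mean = statistics.mean(values)
    median = statistics.median(values)
    return {
        "mean": mean,
        "median": median,
        "range": values[-1] - values[0],
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std_dev": statistics.stdev(values),
        # A mean above the median suggests a right-skewed distribution.
        "right_skewed": mean > median,
    }


# Hypothetical populations: five mid-sized cities and one very large one.
sample = [50_000, 60_000, 75_000, 90_000, 120_000, 1_500_000]
summary = describe(sample)
print(summary)
```

In this illustration a single large city pulls the mean far above the median, which is the same pattern the report describes for the German data.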
Extreme values are often considered outliers since they lie an abnormal distance from the other values in a data set. A common way of detecting outliers is to calculate the z-score, a standardized value interpreted as the number of standard deviations a value lies from the mean (Anderson, Sweeney & Williams, 2016). The z-score method will be used to determine which cities in Germany, if any, should be considered outliers in each of the five years. With this method, any city with a z-score of less than -3 or greater than 3 is considered an outlier. A few cities had z-scores close to -3 or 3, but only three were identified as outliers by the z-score method: Berlin, Hamburg, and Munich. Berlin had the highest z-score, which was expected since it is the capital city and is known to have the largest population in Germany. The following table shows the z-scores calculated for the three cities.
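The z-score rule described above can be sketched as follows. The city names and populations are invented for illustration, the helper names `z_scores` and `outliers` are hypothetical, and the sample standard deviation is again an assumption.

```python
import statistics


def z_scores(populations):
    """Map each city to (population - mean) / standard deviation."""
    values = list(populations.values())
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # assumption: sample standard deviation
    return {city: (pop - mean) / std for city, pop in populations.items()}


def outliers(populations, threshold=3.0):
    """Return the cities whose z-score is below -threshold or above +threshold."""
    scores = z_scores(populations)
    return {city: z for city, z in scores.items() if abs(z) > threshold}


# Hypothetical data: 19 similar-sized cities and one much larger city.
cities = {f"City {i}": 80_000 + 2_000 * i for i in range(19)}
cities["Metropolis"] = 3_000_000

flagged = outliers(cities)
print(flagged)
```

With only a handful of observations no value can reach a z-score of 3, so the rule is meaningful here only because the data set is reasonably large, as is the case for the 182 cities in the report.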
The correlation coefficient was calculated between the first year and each of the four other years to determine what type of relationship exists between the variables. The following table shows the results of each calculation.
As shown for each comparison, the correlation coefficient is equal to 1, which indicates that a perfect positive linear relationship exists between the year 1990 and each of the latter four years. A visual representation of this relationship is shown in the scatter diagram below. The majority of the data points are close together, which makes it difficult to see the individual points; however, the trendline shows the strong positive linear relationship that exists between the two years shown in the diagram.
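The correlation coefficient referred to above is assumed here to be Pearson's r, which the report does not state explicitly. A minimal sketch of the calculation, using invented populations for five cities in a first and a later census year:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical populations of the same five cities in two census years.
first_year = [50_000, 75_000, 120_000, 500_000, 1_500_000]
later_year = [52_000, 74_000, 125_000, 510_000, 1_480_000]

r = pearson_r(first_year, later_year)
print(f"r = {r:.4f}")
```

City populations change little relative to their size, so a coefficient very close to 1 is expected. If the coefficients were rounded in the same way as the descriptive statistics, a value slightly below 1 would also be reported as 1.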
The statistical analysis performed on the population census data for each of the five years revealed that the total population grew from 30.5 million people in 1990 to 32.1 million in 2012. Despite the declines in 2001 and 2011, the growth over the years has been very consistent. The three cities with the largest populations were identified as outliers using the z-score method: Berlin, Hamburg and Munich. The correlation coefficients calculated also showed that a positive linear relationship exists between the year 1990 and each of the latter four years.