Methodology

This is the methodology for the Forenames of Ireland 1911 and the Surnames of Ireland 1911.

Introduction

This section describes the methodology used to produce the statistics presented in the Forenames of Ireland 1911 series and the Surnames of Ireland 1911 series. Both series use the same experimental dataset, which originates from transcriptions of the 1911 census of Ireland, currently made available by the National Archives of Ireland. For each question asked, the aim was to group the transcribed answers into categories. The 1911 questions which have been transcribed are as follows:

Forename: The first name of the member of the household.
Surname: The last name of the member of the household.
Relation to Head of Family: This is how the member of the household was related to the person considered the Head of Family such as "Wife", "Son" and "Visitor".
Religious Profession: This was the religious denomination of the member of the household such as "Church of Ireland" or "Roman Catholic".
Education: This was whether the member of the household could "Read and Write", "Cannot Read", or "Read Only".
Age: The member of the household's age. If this member was under 1, months should have been recorded instead of years. This was where gender was determined, as the ages were in two categories, "Ages of Males" and "Ages of Females".
Rank, Profession, or Occupation: This was the occupation of the member of the household, such as "Farmer" or "Seamstress", or if in school, "Scholars".
Particulars as to Marriage: This category is divided into four parts. The first was the marital status, such as "Married", "Single", and "Widow/Widower". The second was the number of years married, the third was the total children born alive, and the fourth was the children still living. The last three parts were only to be filled in for the married woman.
Where Born: This was the county in which the member of the household was born or, if outside Ireland, the name of the country.
Irish Language: Only to be filled in if the member of the household could speak Irish only ("Irish") or both Irish and English ("Irish and English"). For example, if the member of the household only spoke English, this part of the form should have been blank.
Infirmities: Only to be filled in if the member of the household had some sort of disability known in 1911 such as being "Blind" or "Deaf".

Unfortunately, a document such as a census contains errors in what was written, such as "English" being entered in the Irish Language section when the instructions said not to. Furthermore, for questions such as Occupation and Birthplace, multiple answers could mean the same thing. For example, if one person claimed to be from "Westmeath", another from "Co. Westmeath", and another used shorthand such as "W Meath", all of these would be grouped into a single item, "Westmeath". As a result, where possible, the dataset generated for the two series groups census entries into items for questions such as religion and birthplace. The transcriptions are also not perfect: for some questions it was not possible to compile data because not all of the information was transcribed. For example, not all disabilities were transcribed, so it would be quite difficult to obtain statistics without looking through every census document. The following sections go through each of the questions listed above and the statistics obtained from the data.

Forenames

The forenames, or first names, have had slight corrections. Firstly, all non-letters (such as full stops and apostrophes) have been removed, and the names have been grouped separately for males and females. Then, to avoid mistakes with spelling and transcription, only first names which matched the CSO first name database, which covers 1964 to the present day, were retained. This database contains names where more than three were registered each year, for both males and females. It is estimated that the majority of actual names from the 1911 census are contained in this database.

In some instances, multiple names were given for a person, such as the inclusion of a middle name, as in "Michael Patrick". While it is uncertain whether this could instead be a multi-part first name like "Mary-Anne", the decision was made to remove any name but the first one unless the entire name is contained in the CSO database. For example, Mary Rose exists in the database as Maryrose, and thus the second name Rose was retained.
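The forename steps above can be sketched in Python as follows. This is only an illustration of the rules as described, not the code used to build the dataset. The small `CSO_NAMES` set is a hypothetical stand-in for the CSO database, and keeping spaces so that multiple names can be split is an assumption.

```python
# Hypothetical stand-in for the CSO first-name database (lower case, letters only).
CSO_NAMES = {"michael", "patrick", "mary", "rose", "maryrose"}

def letters_only(text):
    """Drop every character that is not a letter or a space (assumed rule)."""
    return "".join(ch for ch in text if ch.isalpha() or ch.isspace())

def clean_forename(raw, known_names=CSO_NAMES):
    """Return the cleaned forename, or None if it is not in the reference set."""
    parts = letters_only(raw).split()
    if not parts:
        return None
    # Keep the whole name only if the joined form is itself a known name.
    joined = "".join(parts).lower()
    if len(parts) > 1 and joined in known_names:
        return joined.capitalize()
    # Otherwise keep the first name only, and only if it is a known name.
    first = parts[0].lower()
    return first.capitalize() if first in known_names else None
```

With this sketch, "Michael Patrick" is reduced to "Michael", "Mary Rose" is kept as "Maryrose", and a name missing from the reference set is dropped.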

For the Forenames of Ireland 1911 series, each of the detailed statistics below is produced for all first names held by more than 1,000 people. For some forenames held by fewer than 1,000 people, due to the smaller population size, only the proportions of the population by district and electoral division are provided, along with the religious breakdown and birthplace. For the Surnames of Ireland 1911 series, a table of the top first names is compiled for each surname on the list. A very important note is that the order of the most popular forenames is highly dependent on the methodology described here. As a result, it may differ from other lists which have used different approaches or methodologies.

Surnames

Like the forenames, the surnames have also had some slight corrections to help group surnames which are closely linked, such as those with an O prefix and those without. Firstly, as with the first names, all non-letters have been removed. Then, for any surname starting with Mac, Mc or O, the prefix has been removed. For example, O'Brien and Brien are considered the same name, so the O is removed from the front and both are counted as Brien. Given the number of different surnames and variations, no further changes are made to surnames other than small character changes that were noticed.
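A minimal Python sketch of this surname rule is given below, purely as an illustration. How a prefix was recognised is not stated above, so the sketch assumes a prefix only counts when it is followed by a capital letter. Under this assumption "OBrien" loses its O, while "Owens" and "Mackey" are left alone.

```python
PREFIXES = ("Mac", "Mc", "O")

def normalise_surname(raw):
    """Remove non-letters, then strip a leading Mac, Mc or O prefix."""
    letters = "".join(ch for ch in raw if ch.isalpha())
    for prefix in PREFIXES:
        rest = letters[len(prefix):]
        # Assumption: a real prefix is followed by a capital letter.
        if letters.startswith(prefix) and rest[:1].isupper():
            return rest
    return letters
```

For example, `normalise_surname("O'Brien")` and `normalise_surname("Brien")` both give "Brien", so the two spellings are counted together.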

For the Forenames of Ireland 1911 series, the top surnames which correspond to each name are shown. For the Surnames of Ireland 1911 series, detailed statistics are provided for those last names with a population greater than 1,000. Some surnames with a population under 1,000 will have statistics on the proportion of the population by DED/district, religious breakdown, and birthplace. A very important note is that the order of the most popular surnames is highly dependent on the methodology described here. As a result, it may differ from other lists which have used different approaches or methodologies.

Proportion of the Population, by District and DED

For both series, the proportion of the population with a particular first name or last name is shown by district (also known as poor law unions) and by district electoral division (DED). DEDs are among the smallest statistical units, other than townlands, which are still in use today. Quite simply, the people with a given first name or surname are counted in each district or DED, and the proportion is obtained by dividing this count by the total population of that district/DED. Note that for first names, the proportion is taken from the total number of males or females in each district/DED.
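The calculation for a forename can be sketched in Python as below. The records are made-up examples and the function name is hypothetical; the sketch only illustrates dividing the count of a name by the number of people of the same sex in each district.

```python
from collections import Counter

# Made-up records for illustration: (district, sex, forename).
records = [
    ("Mullingar", "M", "Patrick"),
    ("Mullingar", "M", "John"),
    ("Mullingar", "F", "Mary"),
    ("Athlone", "M", "Patrick"),
    ("Athlone", "M", "Patrick"),
    ("Athlone", "M", "Thomas"),
    ("Athlone", "F", "Bridget"),
]

def forename_proportions(records, forename, sex):
    """Proportion of people of one sex in each district with the forename."""
    totals = Counter(d for d, s, _ in records if s == sex)
    matches = Counter(d for d, s, n in records if s == sex and n == forename)
    return {district: matches[district] / totals[district] for district in totals}

proportions = forename_proportions(records, "Patrick", "M")
```

Here one of the two males in the first district and two of the three males in the second are named Patrick. For a surname, the same idea would apply with the whole population of the district as the total.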

The maps shown for both district and DED use what is called a logarithmic norm when displaying the proportion. The only difference from a normal percentage display is that a logarithmic norm shows the contrast between areas better, and not just where names are most prominent. A colour bar is displayed next to each map, which uses scientific notation to display the proportion. For interest, the following should be known when interpreting the notation for the graphs:
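As a general guide to scientific notation, a colour bar label of 10^-1 is a proportion of 0.1 (10%), 10^-2 is 0.01 (1%), and 10^-3 is 0.001 (0.1%). The Python sketch below is an illustration only and is not the code used to draw the maps. It converts such values to percentages and compares a logarithmic norm with a linear one. The colour bar limits used in the example (1e-4 and 1e-2) are assumed values.

```python
import math

def as_percent(proportion):
    """Format a proportion such as 1e-2 as a percentage string."""
    return f"{proportion * 100:g}%"

def log_norm(value, vmin, vmax):
    """Position of value between vmin and vmax on a logarithmic scale (0 to 1)."""
    return (math.log10(value) - math.log10(vmin)) / (math.log10(vmax) - math.log10(vmin))

def linear_norm(value, vmin, vmax):
    """Position of value between vmin and vmax on a linear scale (0 to 1)."""
    return (value - vmin) / (vmax - vmin)

# A proportion of 1e-3 on an assumed colour bar running from 1e-4 to 1e-2.
log_position = log_norm(1e-3, 1e-4, 1e-2)
linear_position = linear_norm(1e-3, 1e-4, 1e-2)
```

On the logarithmic scale the value 1e-3 sits halfway along the colour bar, whereas on a linear scale it would sit within the lowest tenth. This is why areas with small proportions are easier to tell apart on these maps.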

Marital Status

As mentioned in the introduction, there were four separate parts to the marital status section of the census. The question on marital status itself should have been filled out for every member of the household. The options were "Married", "Single", "Widow" and "Widower". When filling out the census, people used different words to describe their marital situation. For example, single people were also written in as "Not Married", "Spinster" or "Bachelor". Furthermore, in some cases, widow and widower were used interchangeably. Hence, when categorising, anyone who was a widow or widower was placed into a single group. The categories which each entry was assigned to were (if possible):

Married
Single
Widow or Widower

Only people over 15 years old were considered when counting the people with a certain name and their marital statuses. A pie chart was chosen to illustrate this.

The other three questions, which were supposed to be filled in only on the entry for the married woman, were not always completed as instructed. In some cases, they were filled in for both the husband and the wife, or for the husband only. It would have been difficult to correct this in the statistics produced here and, as a result, these questions are not presented in either series.
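The grouping and the age filter can be sketched in Python as below. This is an illustration under stated assumptions: the word lists contain only the variants mentioned above, matching is on the whole entry, and entries which cannot be categorised are left out of the total.

```python
from collections import Counter

# Word lists limited to the variants mentioned in the text (illustrative only).
MARITAL_WORDS = {
    "Married": {"married"},
    "Single": {"single", "not married", "spinster", "bachelor"},
    "Widow or Widower": {"widow", "widower"},
}

def marital_category(raw):
    """Return the category for a census entry, or None if it is not recognised."""
    text = " ".join(raw.lower().split())
    for category, words in MARITAL_WORDS.items():
        if text in words:
            return category
    return None

def marital_shares(people):
    """people: iterable of (age_in_years, raw_entry). Uses those over 15 only."""
    counts = Counter()
    for age, raw in people:
        category = marital_category(raw)
        if age > 15 and category is not None:
            counts[category] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}
```

The resulting shares are the slices of the pie chart for a given name.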

Religious Breakdown

For the religious breakdown, a similar task to that for Marital Status was carried out by grouping each census entry into a religion. For example, those who were Catholic were also recorded as "RC", "Roman Catholic" or "Cath", to name but a few. For each of the grouped religions, a set of matching words was created to identify these variants. The religions which were grouped were:

Catholic
Church of Ireland
Epis (Episcopalian or Protestant generally)
Presbyterian
Methodist
Church of England
Church of Scotland
Baptist
Unitarian
Jewish
No Religion
Brethren

For each name, the proportion of people in each religion is calculated. These proportions are expressed as percentages and rounded to the nearest whole number (e.g. 45.2% would be rounded to 45%). Any religion under 3% is grouped into an "Other" category.
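A minimal Python sketch of the rounding and the "Other" rule follows. It is illustrative only: the word lists are a small hypothetical subset, unmatched entries are simply skipped, and the 3% threshold is applied before rounding, which is an assumption.

```python
from collections import Counter

# Small illustrative subset of matching words; the real sets are larger.
RELIGION_WORDS = {
    "Catholic": {"catholic", "roman catholic", "rc", "cath"},
    "Church of Ireland": {"church of ireland"},
    "Presbyterian": {"presbyterian"},
    "Methodist": {"methodist"},
    "Baptist": {"baptist"},
}

def group_religion(raw):
    """Return the grouped religion for an entry, or None if unmatched."""
    text = " ".join(raw.lower().replace(".", "").split())
    for religion, words in RELIGION_WORDS.items():
        if text in words:
            return religion
    return None

def religious_breakdown(entries, threshold=3.0):
    """Whole-number percentages, with groups under the threshold merged into Other."""
    counts = Counter(g for g in map(group_religion, entries) if g is not None)
    total = sum(counts.values())
    breakdown, other = {}, 0.0
    for religion, n in counts.items():
        pct = 100 * n / total
        if pct < threshold:
            other += pct
        else:
            breakdown[religion] = round(pct)
    if other > 0:
        breakdown["Other"] = round(other)
    return breakdown
```

Because each percentage is rounded separately, the displayed values will not always add up to exactly 100.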

Occupations

The aim here was to get statistics on the occupations of each member of the household, excluding scholars. Due to the large variety of occupations and of the names used for them, it was decided to use the occupation as reported rather than try to group them. As a result, for each forename/surname, a table is provided with the top 5 occupations.
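The table of top occupations can be sketched in Python with a simple count, as below. This is an illustration only. The set of words used to exclude scholars is an assumption, and occupations are otherwise counted exactly as reported.

```python
from collections import Counter

# Assumed spellings used to exclude scholars.
SCHOLAR_WORDS = {"scholar", "scholars"}

def top_occupations(occupations, n=5):
    """Return the n most common reported occupations, excluding scholars."""
    counts = Counter(
        entry.strip()
        for entry in occupations
        if entry.strip() and entry.strip().lower() not in SCHOLAR_WORDS
    )
    return counts.most_common(n)
```

Since no grouping is attempted, different spellings of the same occupation are counted as separate occupations in this sketch.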

Birthplace by County

The aim here was to group each birthplace entry on the census form into the county in which each person was reported to have been born. The data was mapped as the proportion of people with each forename/surname who were born in each county, over the total population in it (for forenames, this is also done for each gender). Note that there are some differences from the historic counties of 1911:

QueensCo/Queen's County/Queens is now known as Co. Laois.
KingsCo/King's County/Kings is now known as Co. Offaly.
Londonderry is still known by that name by some, but it is also known as Co. Derry.
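Grouping birthplace entries into counties can be sketched in Python with a lookup of known variants, as below. The table holds only the variants mentioned in this document, and mapping the 1911 names to their present-day names is an assumption made for the illustration.

```python
# Variants are stored lower case with non-letters removed (illustrative subset).
COUNTY_ALIASES = {
    "westmeath": "Westmeath",
    "cowestmeath": "Westmeath",
    "wmeath": "Westmeath",
    "queensco": "Laois",
    "queenscounty": "Laois",
    "queens": "Laois",
    "kingsco": "Offaly",
    "kingscounty": "Offaly",
    "kings": "Offaly",
    "londonderry": "Derry",
    "derry": "Derry",
}

def normalise_county(raw):
    """Return the grouped county for a birthplace entry, or None if unknown."""
    key = "".join(ch for ch in raw.lower() if ch.isalpha())
    return COUNTY_ALIASES.get(key)
```

With this lookup, "Co. Westmeath" and "W Meath" both map to "Westmeath", while an entry which is not in the table, such as a country outside Ireland, returns None.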

As with the proportions by DED and district, a logarithmic norm was chosen as the best approach to display the data. It provides greater contrast in areas with a lower proportion, which makes regional differences easier to see. Furthermore, any births outside the island of Ireland are not displayed on the map, but are included in the total from which the proportions were generated.

Literacy

As described in the introduction, there was a question on the literacy status of each member of the household. For this statistic, the aim was to group the returns into either:

Read and Write,
Cannot Read and Write,
Read Only.

This is carried out by matching common words used to describe each category. A proportion is then calculated for each of the three categories for people aged 9 and above, with unknowns removed.
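The word matching and the proportion can be sketched in Python as below. The keyword rules are assumptions made for illustration; the actual matching words are not listed in this document.

```python
from collections import Counter

def literacy_category(raw):
    """Assign an entry to a literacy category by simple keyword rules (assumed)."""
    text = raw.lower()
    if "cannot" in text:
        return "Cannot Read and Write"
    if "only" in text:
        return "Read Only"
    if "read" in text and "write" in text:
        return "Read and Write"
    return None  # unknown

def literacy_shares(people):
    """people: iterable of (age_in_years, raw_entry). Uses those aged 9 and above."""
    counts = Counter()
    for age, raw in people:
        category = literacy_category(raw)
        if age >= 9 and category is not None:
            counts[category] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}
```

Unknown entries are dropped before the total is taken, so the three shares add up to one.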

Irish Language

The last statistic presented for each forename/surname is the proportion of people who claimed to speak Irish. As mentioned in the introduction, people were to fill in "Irish" if they only spoke Irish, "Irish and English" if they spoke both, and to leave the section blank otherwise. Due to the way the form was filled out, it was decided to get a proportion of Irish speakers, whether they spoke English or not. Furthermore, as people were asked not to fill in this section if they did not speak Irish, the proportion of Irish speakers is out of the total with that forename/surname. Thus, the two categories shown are:

Speaks Irish,
Does not Speak Irish/Unknown.
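This proportion can be sketched in Python as below. As an assumption for illustration, any entry containing the word "Irish" is treated as an Irish speaker, and everything else, including blanks, counts as "Does not Speak Irish/Unknown".

```python
def speaks_irish(raw):
    """True if the entry mentions Irish, e.g. "Irish" or "Irish and English"."""
    return "irish" in raw.lower()

def irish_speaker_share(entries):
    """Share of Irish speakers out of everyone with the name, blanks included."""
    if not entries:
        return 0.0
    return sum(speaks_irish(entry) for entry in entries) / len(entries)
```

Note that an entry of "English" alone, which should not have been written on the form, is counted as not speaking Irish.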