{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Recoding Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preliminaries\n",
"I include the data import and library import commands at the start of each lesson so that the lessons are self-contained."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"bank = pd.read_csv('Data/Bank.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Appending a column\n",
"As in R, we can add a column (\"Series\") to our Pandas data frame. In the code below, I add a new column called \"Dummy\" and set every value in the series to zero."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary Dummy \n",
"0 32.0 0 \n",
"1 39.1 0 \n",
"2 33.2 0 \n",
"3 30.6 0 \n",
"4 29.0 0 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank['Dummy'] = 0\n",
"bank.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Setting all values of \"Dummy\" to a constant is not very useful, so I drop the column using the `drop` method. Like many Pandas methods, `drop` takes an `axis` argument (0 = rows, which is the default; 1 = columns). The `inplace=True` argument is also common in Pandas: it has the same effect as `bank = bank.drop(...)`. That is, the change is applied to the original data frame rather than returned as a new data frame."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary \n",
"0 32.0 \n",
"1 39.1 \n",
"2 33.2 \n",
"3 30.6 \n",
"4 29.0 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank.drop('Dummy', axis=1, inplace=True)\n",
"bank.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Recoding using the ternary operator\n",
"Recoding is easy in R because R naturally manages arrays and vectors. Based on our experience with R, we might expect the following expression to work. The core of the expression is Python's conditional expression (an inline `if`, often called the ternary operator), which takes the form:\n",
"`<value_if_true> if <condition> else <value_if_false>`\n",
"\n",
"To remap \"Female\" and \"Male\" to 1 and 0, we might think we could use the following ternary operator:\n",
"`1 if bank['Gender'] == \"Female\" else 0`\n",
"\n",
"Unfortunately, although this approach works in R, it does not work in Python: the expression raises a `ValueError`. The ternary operator expects a single true/false value, but `bank['Gender'] == \"Female\"` produces a whole Series of Boolean values, and Pandas refuses to guess how to collapse that Series into one truth value. Of course, we have some alternatives.\n",
"\n",
"### The numpy where method\n",
"NumPy is another useful Python library (which means it has to be imported before it is used in our code). Its `where()` function works like the ternary operator, except that it operates element by element on arrays of data:\n",
"`np.where(<condition>, <value_if_true>, <value_if_false>)`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary GenderDummy_F \n",
"0 32.0 0 \n",
"1 39.1 1 \n",
"2 33.2 1 \n",
"3 30.6 1 \n",
"4 29.0 0 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"bank['GenderDummy_F'] = np.where(bank['Gender'] == \"Female\", 1, 0)\n",
"bank.head()"
]
},
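{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the contrast concrete, here is a minimal sketch of why the ternary operator fails where `np.where()` succeeds. It uses a small made-up Series named `gender` (an assumption for illustration, not the Bank data) so that it runs on its own:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Made-up stand-in for bank['Gender'] (illustrative values only)\n",
"gender = pd.Series(['Male', 'Female', 'Female'])\n",
"\n",
"# The comparison is vectorized and yields a Boolean Series\n",
"is_female = gender == 'Female'\n",
"\n",
"# A conditional expression needs a single True/False, so this raises\n",
"try:\n",
"    result = 1 if is_female else 0\n",
"except ValueError as err:\n",
"    result = None\n",
"    print('ValueError:', err)\n",
"\n",
"# np.where evaluates the condition element by element instead\n",
"dummy = np.where(is_female, 1, 0)\n",
"print(dummy)"
]
},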
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Applying a function\n",
"Pandas has a method called `apply()` that calls a function on each element of a Series object. Which function? The easiest way to see how this works is to start with a named function that implements the if/else logic. What follows is a standard function definition in Python. The code defines a new function called `my_recode` that takes a single parameter, `gender`, and returns 1 or 0 depending on the value passed to it:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def my_recode(gender):\n",
"    if gender == \"Female\":\n",
"        return 1\n",
"    else:\n",
"        return 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once defined, we can call the function anywhere within our notebook. The code below tests the function on both of the expected inputs. We get 1 and 0 in response, as intended:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 0)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_recode(\"Female\"), my_recode(\"Male\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use the Pandas `apply()` method to call the function for each value of the \"Gender\" column:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary GenderDummy_F \n",
"0 32.0 0 \n",
"1 39.1 1 \n",
"2 33.2 1 \n",
"3 30.6 1 \n",
"4 29.0 0 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank['GenderDummy_F'] = bank['Gender'].apply(my_recode)\n",
"bank.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Applying a lambda function\n",
"A slightly more elegant approach is to apply a _lambda_ function in Python. A lambda function is simply a short, anonymous (unnamed), inline function. It saves us from having to define a separate function (as we did with `my_recode`). In addition, the lambda function makes the argument (in this case _x_) explicit. The explicit, non-array argument allows us to use the ternary operator:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary GenderDummy_F \n",
"0 32.0 0 \n",
"1 39.1 1 \n",
"2 33.2 1 \n",
"3 30.6 1 \n",
"4 29.0 0 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank['GenderDummy_F'] = bank['Gender'].apply(lambda x: 1 if x == \"Female\" else 0)\n",
"bank.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The obvious advantage with the `apply()` method is that the function (be it explicitly named or lambda) can be arbitrarily complex."
]
},
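{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of what a more complex function might look like, the cell below uses `apply()` to bin years of prior experience into three labels. The function name `experience_band`, its cut-offs, and the sample values are all hypothetical, chosen only for illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up values standing in for bank['YrsPrior']\n",
"yrs_prior = pd.Series([0, 1, 7, 12])\n",
"\n",
"def experience_band(years):\n",
"    # Hypothetical cut-offs, chosen only for illustration\n",
"    if years == 0:\n",
"        return 'none'\n",
"    elif years < 5:\n",
"        return 'some'\n",
"    else:\n",
"        return 'extensive'\n",
"\n",
"# apply() calls experience_band once per element of the Series\n",
"bands = yrs_prior.apply(experience_band)\n",
"print(bands.tolist())"
]
},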
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Replacing values from a list\n",
"Pandas has a `replace()` method that can take lists. For example, we could create a list of job grades (1-6) and a corresponding list of \"managerial status\" for each of the job grades. Thus, when the `replace()` method sees a job grade of 1, it replaces it with the corresponding value in the other list."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"170 171 2 4 79 42 Female 1 No \n",
"171 172 3 4 84 58 Female 0 No \n",
"172 173 2 4 82 55 Female 2 No \n",
"173 174 5 5 88 61 Male 0 No \n",
"174 175 5 5 87 58 Female 0 No \n",
"\n",
" Salary GenderDummy_F Manager \n",
"170 45.5 1 non-mgmt \n",
"171 44.5 1 non-mgmt \n",
"172 51.2 1 non-mgmt \n",
"173 47.5 0 mgmt \n",
"174 44.5 1 mgmt "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grades = [1,2,3,4,5,6]\n",
"status = [\"non-mgmt\", \"non-mgmt\", \"non-mgmt\", \"non-mgmt\", \"mgmt\", \"mgmt\"]\n",
"\n",
"bank['Manager'] = bank['JobGrade'].replace(grades, status)\n",
"bank[170:175]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here I create a list of six job grades and six managerial statuses (the lists have to be the same length, and the _i_ th job grade has to correspond to the _i_ th managerial status). Since the `inplace = True` argument is not passed to `replace()`, no change is made to the underlying \"JobGrade\" column. Instead, I assign the output of the `replace()` method to a new column called \"Manager\".\n",
"\n",
"Instead of calling `head()` (or `tail()`) to preview the results, I use a Python slice to show rows 170 through 174 (the end of the slice, 175, is excluded). This gives me a sample of managerial and non-managerial employees.\n",
"\n",
"Of course, it doesn't take much imagination to see how the `replace()` function could be used to create dummy variables. Returning to the \"Gender\" example:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Employee EducLev JobGrade YrHired YrBorn Gender YrsPrior PCJob \\\n",
"0 1 3 1 92 69 Male 1 No \n",
"1 2 1 1 81 57 Female 1 No \n",
"2 3 1 1 83 60 Female 0 No \n",
"3 4 2 1 87 55 Female 7 No \n",
"4 5 3 1 92 67 Male 0 No \n",
"\n",
" Salary GenderDummy_F Manager \n",
"0 32.0 0 non-mgmt \n",
"1 39.1 1 non-mgmt \n",
"2 33.2 1 non-mgmt \n",
"3 30.6 1 non-mgmt \n",
"4 29.0 0 non-mgmt "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"genders=[\"Female\", \"Male\"]\n",
"dummy_vars=[1,0]\n",
"\n",
"bank['GenderDummy_F'] = bank['Gender'].replace(genders, dummy_vars)\n",
"bank.head()"
]
},
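{
"cell_type": "markdown",
"metadata": {},
"source": [
"A related technique, not used above, is to hold the recoding rules in a dictionary and pass it to the Series `map()` method. This keeps each original value next to its replacement instead of relying on the order of two lists. The sketch below uses a made-up Series (an assumption for illustration). Note one difference from `replace()`: `map()` returns `NaN` for any value that is missing from the dictionary, whereas `replace()` leaves such values unchanged:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up stand-in for bank['Gender'], with one value outside the mapping\n",
"gender = pd.Series(['Male', 'Female', 'Female', 'Unknown'])\n",
"\n",
"# Each original value sits next to its replacement\n",
"gender_codes = {'Female': 1, 'Male': 0}\n",
"\n",
"# map() looks up every element in the dictionary; 'Unknown' becomes NaN\n",
"dummy = gender.map(gender_codes)\n",
"print(dummy)"
]
},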
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Logging variables\n",
"As we have seen, we occasionally want to transform a numerical column in order to increase the linearity of our models. For this, we can use the NumPy `log()` function, which returns the natural (base $e$) logarithm:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"bank['logSalary'] = np.log(bank['Salary'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we want, we can plot the results. In this case, a log transform does not really improve the normality of the salary data. The underlying issue appears to be bimodality—there are actually two salary distributions: workers and managers."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEGCAYAAAB/+QKOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAxp0lEQVR4nO3deXxU93no/88zo31HIAlt7DtmMWC8xVvsONixQ+I4jV0naXLjErdJmjbN1vx+NzdNe/tLr3tvaydOE+okTtJeb/ES78a7MWBALGJHCLEJJLSiDbTO8/tjRniQR9Ig6ejM8rxfr3kx8z3naB6Gwzz67qKqGGOMMQN53A7AGGNMZLIEYYwxJiRLEMYYY0KyBGGMMSYkSxDGGGNCSnA7gLE0adIknTZtmtthGGNM1Ni2bVuDquaFOhZTCWLatGmUlZW5HYYxxkQNETk22DFrYjLGGBOSJQhjjDEhWYIwxhgTkiUIY4wxIVmCMMYYE5IlCGOMMSFZgjDGGBOSJQhjjDEhxdREOeO+itNtPLW9mm1Hm0lK8DC/MIt7Lp/CjLwMt0MzxlwkSxBmTPT5lJ++eYgH3ziEL2gPqo2HG/nVe0e477qZfPvmOSR4rdJqTLSwBGFGrc+n/NVjO3hxVw0C3DA3n5XTcwFly5Em3qmo5xfvHGZfTStrv7CclESv2yEbY8JgCcKM2j+8sI8Xd9WQmujlbz42h0XF2eePLS2dwEdm5/HA6xW8W1HP3zy+k5/96TK8HnExYmNMOKy+b0blhV2neGTjURI8wt/efGFy6LegMIsf3DqftCQvL++p5X+vO+hCpMaYi2UJwoxYfVsX//3ZPQB88cppLCz6cHLoN3ViOt/62Bw8Av/+zmE2Hm4YrzCNMSNkCcKM2D+8sI/msz1cUpzNTfPzhz1/YVE2n7q0GFX41uPltHX2jEOUxpiRsgRhRmRX9RmeKz9FoldYc80MRMLrU7jj0hJm5qVT29rJ/15X4XCUxpjRcCxBiMivRaRORPYMcvw7IrIz8NgjIn0ikhs4dlREdgeO2Q5AEUZV+cnLBwD4+MLJ5GUmh32t1yPce80MPAK/23SUPSdbnArTGDNKTtYgHgFWDXZQVe9X1aWquhT4O+AdVW0KOuWGwPEVDsZoRmBTVSMbDzeSnuRl9dLii75+2sR0Pr5wMj6FHz+/D1Ud/iJjzLhzLEGo6rtA07An+t0NPOpULGZsrX23CoBbFhWSkTyykdJ3Li8hMzmBLUebeG3f6bEMzxgzRlzvgxCRNPw1jaeCihVYJyLbRGTNMNevEZEyESmrr693MlQDHKxt4+2D9SR5PXxsQcGIf05aUgJ3LPPXPn7yygF6+3xjFaIxZoy4niCA24ENA5qXrlbVZcAtwNdE5NrBLlbVtaq6QlVX5OXlOR1r3PuP9f7aw/Vz88hKSRzVz7ppfgEFWclU1Xfw9I6TYxGeMWYMRUKCuIsBzUuqeirwZx3wDLDShbjMAC1ne3i+/BQAty4qHPXPS/B6uHN5KQAPvH6I7l6rRRgTSVxNECKSDVwH/DGoLF1EMvufAzcDIUdCmfH1zI5qunp9LCrOpiArZUx+5lUzJlKck8rJM+d4fOvxMfmZxpix4eQw10eBTcBcEakWka+IyH0icl/QaZ8G1qlqR1BZAfCeiJQDW4AXVfUVp+I04VFVHt1yAoAb5w0/KS5cHo/w2eUlAPz87cN09faN2c82xoyOY4v1qerdYZzzCP7hsMFlVcASZ6IyI7XjxBkOnm4jKyWB5VMnjOnPvmx6LqW5aZxoOssTZdV84YqpY/rzjTEjEwl9ECYKPBvoRP7I7Lwx39PBI8JnLvWPaPr5W5VWizAmQliCMMPq6fPxwq4aAD4ya5Ij73HZ9FxKJ6RS09LJk2XVjryHMebiWIIww3rvUANNHd0U5aQwbWKaI+/hEeGOZf6+iH9/+7CNaDImAliCMMP6405/89LVMyeFvSjfSKycnkvJBP
+Ipj9ss1qEMW6zBGGG1NnTd34pjKsdal7q5xHhjkBfxENvVVotwhiXWYIwQ9pQ2UBHdx/TJqaN2dyHoVw+feL5WsTjZSccfz9jzOAsQZghvbKnFoDLpuWOy/t5PMKdgb6Ih96spLPHRjQZ4xZLEGZQvX0+Xtvvb15aOX18EgT4RzRNyU2jtrWT/3z/2Li9rzHmQpYgzKC2HGnizNkeirJTKM5JHbf39YjwuRX+NZoeeqvStiY1xiWWIMygXt9fB8CKabmOjl4K5dIpOcwtyKT5bA//Edh/whgzvixBmJBUlTcO+JuXLp2SM+7vLyLcvXIKAGvXV3HqzLlxj8GYeGcJwoRU1dDBscazZCQnMDs/05UY5k7O5PLpuXT2+PjnVw64EoMx8cwShAnprQP+5qUlpTl4PePbvBTsnsunkOgV/rjzFFuPhruDrTFmLFiCMCG9Eeh/uLQ0x9U48jJTuH1JEQB/9/RuW8jPmHFkCcJ8SHtXL2XHmhCBJSU5bofD6iXFFGanUFnXzi/etg5rY8aLJQjzIZurGunpU2bmZZCR4tiWIWFLSvBw70emA/DTNw+xu7rF5YiMiQ+WIMyHrD/UAMDikmyXI/nAgqJsVi2cTK9P+eZjOzjb3et2SMbEPEsQ5kPePVQPwOLiHHcDGeDulVMonZBKVUMH3/3DLlTV7ZCMiWmWIMwFqpvPUlXfQWqil5n56W6Hc4GkBA/fvHEOqYleXthVw8/fPux2SMbENMcShIj8WkTqRGTPIMevF5EWEdkZePww6NgqETkoIpUi8n2nYjQf9l6geWlhURYJnsj7/aF4Qip/ecNMAO5/9SCPbz3uckTGxC4nvwEeAVYNc856VV0aePwYQES8wEPALcAC4G4RWeBgnCbIxsONACwqjpz+h4FWTM3lz66cCviHvj693TYXMsYJjiUIVX0XGMnMppVApapWqWo38BiwekyDMyGp6vkEsbAochMEwKpLCvns8hJ8Ct96opyH11dZn4QxY8ztNoQrRaRcRF4WkYWBsmIgeKeY6kBZSCKyRkTKRKSsvr7eyVhj3uH6dhrau8hJTaQox/nNgUbrjmUl3HO5f72mf3xxPz94ZrftQmfMGHIzQWwHpqrqEuCnwLOB8lDrOgz6q6GqrlXVFaq6Ii8vb+yjjCP9tYcFRVnjvnrrSN22uIiv3zCLRK/w6JYTfPaXm6huPut2WMbEBNcShKq2qmp74PlLQKKITMJfYygNOrUEOOVCiHFnU1CCiCZXz5rE/7h9IZMykig/cYZPPPge6/bWuh2WMVHPtQQhIpMl8GuqiKwMxNIIbAVmi8h0EUkC7gKecyvOeOHzKe9XBfofCiO7/yGUmXkZ/H+fXsyyKTm0nOthze+38T9f3EdvnzU5GTNSTg5zfRTYBMwVkWoR+YqI3Cci9wVOuRPYIyLlwIPAXerXC3wdeBXYDzyhqnuditP4Vda303y2h9z0JAqykt0OZ0QyUhL49s1zuefyKXg9wn+sP8IXfrWFlrO2I50xI+HYQjuqevcwx38G/GyQYy8BLzkRlwlt8xH/gLN5kzOjpv8hFBHhtsVFzMrP4IHXD7GpqpHP/GIjv/1vK8d121RjYoHbo5hMhNhyPkFEV//DYOZNzuIfP3UJJRNSqaxr5+6171Pb0ul2WMZEFUsQBlVlyxF//8P8Qnd2j3PCxIxkfnT7QmZMSud401nuefh9a24y5iJYgjAcbzrL6dYuMlMSYq4ZJj05gb+7ZT6luWkcru/gG4/toM9nE+qMCYclCBPUvBTd/Q+DyUhJ4Ds3zyUzJYF3K+r5t9cr3A7JmKhgCcJQdrQZgLkFsdH/EEpeZjLfvHE2Ajz0ViXlJ864HZIxEc8ShKHsmL8GMXdyhsuROGthUTa3LirEp/C3T5bb/tbGDMMSRJxr7ujmcH0HiV5h2sTI2v/BCX+yopSiwP7Wj2w46nY4xkQ0SxBxbvtxf/PSzLwMEryxfzskJXj44pXTAPjZm5U0tH
e5G5AxESz2vxHMkMqO+RPEnILYGd46nCWlOSwtzaGtq5cH3zjkdjjGRCxLEHFu27H+Dur4SRAAf7pyCgI8tuUEp1ttAp0xoViCiGPdvb7zo3lmF8R2B/VApblpXDY9l+4+H//xbpXb4RgTkSxBxLEDta109fooyk4hMyXR7XDG3aeW+veh+q/Nx2nq6HY5GmMijyWIOLbj+BkAZuXHV+2h3/RJ6SwuyeZcTx+Pbz0x/AXGxBlLEHFsR2AE06z8+Op/CPbxhZMB+K/Nx2wJDmMGsAQRx3bEaf9DsKUlOeRnJlPdfI63D9a5HY4xEcUSRJxqaO/iWONZkhM8lE5Iczsc13g8wk3zCwD4z/ePuRyNMZHFEkSc2hnof5iRl47XE3sL9F2M6+bm4fUI7x5qoK7Nhrwa088SRJzaccLf/zA7jvsf+mWlJHJpaQ59PuW5nafcDseYiGEJIk6Vn2gBYFZe/PY/BPvI7EkAPL39pMuRGBM5HEsQIvJrEakTkT2DHL9HRHYFHhtFZEnQsaMisltEdopImVMxxiufT9lVfQaAmXE6xHWgZVMmkJ7kZV9NKwdqW90Ox5iI4GQN4hFg1RDHjwDXqepi4B+AtQOO36CqS1V1hUPxxa2jjR20dvYyIS2R3PQkt8OJCIleD1fMmAjAi7tqXI7GmMjgWIJQ1XeBpiGOb1TV5sDL94ESp2IxFyrvrz1Y89IFLu9PELtrULU5EcZESh/EV4CXg14rsE5EtonImqEuFJE1IlImImX19fWOBhkr+vsfLEFcaEFhFhnJCVTVd1Bxut3tcIxxnesJQkRuwJ8gvhdUfLWqLgNuAb4mItcOdr2qrlXVFaq6Ii8vz+FoY0O59T+E5PUIl03LBeCl3dbMZIyrCUJEFgMPA6tVtbG/XFVPBf6sA54BVroTYezp7vWx95S/E3bGpNjfQe5iXT7dnyBe3mMJwhjXEoSITAGeBr6gqhVB5ekiktn/HLgZCDkSyly8itNtdPf6mJyVQnpygtvhRJyFRVmkJnqpON3OiaazbodjjKucHOb6KLAJmCsi1SLyFRG5T0TuC5zyQ2Ai8PMBw1kLgPdEpBzYAryoqq84FWe82VXd3/9gtYdQErweFpdkA/D6/tMuR2OMuxz7FVJV7x7m+L3AvSHKq4AlH77CjIXdJ88AMMM6qAe1fOoENh9p4s0DdXz56uluh2OMa1zvpDbjq78GMcNqEINaWpqDCLxf1UhbZ4/b4RjjGksQcaSzp4+DtW2IwLSJliAGk5mSyJz8THr6lPWHGtwOxxjXWIKIIwdq2+j1KcU5qaQket0OJ6ItnZIDwLsVNrfGxC9LEHFkd2D+gw1vHd6SkhwA3qmot1nVJm5Zgogj/f0P0ydZB/Vwpk5MIys1kZqWTirrbFa1iU+WIOLI7pPWQR0ujwhLiv3DXd+xZiYTpyxBxInOnj4O1bUj4v/t2AxvcWkOYAnCxC9LEHFif00rfYEO6uQE66AOx+JADWLLkSY6e/pcjsaY8WcJIk7sOdnf/2DNS+HKSk1kam4aXb0+th9rHv4CY2KMJYg4seekLdA3EgsDtYgNh20+hIk/YSUIEXlKRD4hIpZQotTukzaCaSQWFWcB8F5l4zBnGhN7wv3C/3fgT4FDIvITEZnnYExmjHX29FFxug3BOqgv1rzJWXhF2F19hpZztuyGiS9hJQhVfV1V7wGWAUeB10Rko4h8WUQSnQzQjF7Faf8M6iKbQX3RUhK9zMrPwKf+tZmMiSdhNxmJyETgS/hXYN0BPIA/YbzmSGRmzPT3P0yz/ocRuSTQzLTpsCUIE1/C7YN4GlgPpAG3q+onVfVxVf0GYI3aEW7PqUD/gy3QNyILivwd1VaDMPEm3P0gHlbVl4ILRCRZVbtUdYUDcZkxtPd8B7X1P4zErLwMEr3Cgdo2mjq6yU1PcjskY8ZFuE1M/xiibNNYBmKc0dPnY39tGwBTrQYxIkkJHmbnZwKw5YjVIkz8GDJBiMhkEV
kOpIrIpSKyLPC4Hn9zk4lwlXXtdPf6KMhKtj2oR2FhkfVDmPgz3DfGx/F3TJcA/yeovA34gUMxmTG091Sgg9pqD6OyoCgLtsEm64cwcWTIGoSq/lZVbwC+pKo3BD0+qapPD3WtiPxaROpEZM8gx0VEHhSRShHZJSLLgo6tEpGDgWPfH9HfzAAfLLFhI5hGZ1ZeBkleDxWn22nq6HY7HGPGxXBNTJ8PPJ0mIt8a+BjmZz8CrBri+C3A7MBjDf7JeIiIF3gocHwBcLeILBj2b2JC2hsYwWQ1iNFJ8HqYXeAfsGf9ECZeDNdJ3f+tkgFkhngMSlXfBZqGOGU18Dv1ex/IEZFCYCVQqapVqtoNPBY411wkn0/ZF2hiskX6Rm/eZH8/xPtVQ93WxsSOIfsgVPWXgT//3oH3LgZOBL2uDpSFKr98sB8iImvw10CYMmXK2EcZxY42dtDR3UduehLZqTbhfbQWFGbyFLD5iCUIEx/CnSj3v0QkS0QSReQNEWkIan4aKQlRpkOUh6Sqa1V1haquyMvLG2VIseWDDmobcDYWZuVnkuARDtS20nLW1mUysS/ceRA3q2orcBv+3+jnAN8Z5XtXA6VBr0uAU0OUm4tkI5jGVlKCh1n5GajC1qNWizCxL9wE0d8+cSvwqKqOxf+O54AvBkYzXQG0qGoNsBWYLSLTRSQJuCtwrrlI5zuorf9hzMwv9PdDbLaOahMHwp059byIHADOAX8pInlA51AXiMijwPXAJBGpBv4HgUSjqr8AXsKfcCqBs8CXA8d6ReTrwKuAF/i1qu69yL9X3FNVq0E4YN7k/hnVVoMwsS+sBKGq3xeRfwZaVbVPRDoYZmSRqt49zHEFvjbIsZfwJxAzQrWtnTR1dJORnMCkDFs7aKzMKcjEI7DnVCvtXb1k2Ox0E8Mu5u6ej38+RPA1vxvjeMwY2Xvygw5qkVD9/mYkUhK9TJ+UzuH6DrYfa+baOTYwwsSucEcx/R74F+AjwGWBh63iGsH2WP+DY/r7IayZycS6cGsQK4AFgWYhEwWs/8E58yZn8cKuGksQJuaFO4ppDzDZyUDM2NpnCcIxcydnIsDOE2fo7OlzOxxjHBNuDWISsE9EtgBd/YWq+klHojKj0tzRzckz50hO8FCYneJ2ODEnIzmB0tw0jjedpfzEGS6fMdHtkIxxRLgJ4kdOBmHGVn/z0pTcNDwe66B2wrzJmRxvOsuWI02WIEzMCquJSVXfAY4CiYHnW4HtDsZlRsEmyDnvfEe1zag2MSzcUUx/DvwB+GWgqBh41qGYzChZB7Xz+ifMbTvWTE+fz+VojHFGuJ3UXwOuBloBVPUQkO9UUGZ0zg9xtUX6HJOTlkRhdgpnu/vOJ2RjYk24CaIrsDcDAIHJcjbkNQJ1dPVypKEDrwiluZYgnNS/P8Rm24bUxKhwE8Q7IvIDIFVEPgY8CTzvXFhmpPbXtKIKJbmpJHrD/ec1IzG/0NZlMrEt3G+Q7wP1wG7gq/jXSfp/nQrKjFx/c8d0639wXHBHdZ/PKtQm9oS7WJ9PRJ4FnlXVemdDMqOx56SNYBovkzKSyctIpr69i/01rVxSnO12SMaMqSFrEIG9Gn4kIg3AAeCgiNSLyA/HJzxzsfbYHtTjqr+ZybYhNbFouCamv8Y/eukyVZ2oqrn494e+WkT+xungzMXp6u3j0Ok2BP8kOeO88xsIWUe1iUHDJYgvAner6pH+AlWtAj4fOGYiSEVtO70+pTA7hZREr9vhxIUPdphrwmf9ECbGDJcgElW1YWBhoB8iMcT5xkX98x+seWn85GcmMzE9iZZzPRyobXM7HGPG1HAJonuEx4wLdlsH9bgTERYEahHvWzOTiTHDJYglItIa4tEGLBqPAE349gYSxAxLEONqQZE/QWyyBGFizJDDXFV1VA3ZIrIKeADwAg+r6k8GHP8OcE9QLPOBPFVtEpGjQBvQB/Sqqu1gN4SePh/7A00cU20OxLhaWPRBR3WfT/HaCr
omRjg21VZEvMBDwC3AAuBuEVkQfI6q3q+qS1V1KfB3wDuqGjxe8IbAcUsOwzh0up3uXh8FWcmkJ1/MVuNmtPIyU5iUkURrZy/7a2xdJhM7nFyLYSVQqapVgXWcHgNWD3H+3cCjDsYT06yD2l0Li/yT5DYdtmYmEzucTBDFwImg19WBsg8RkTRgFfBUULEC60Rkm4isGexNRGSNiJSJSFl9ffxO8u6fQW1LbLijv6Pa+iFMLHEyQYRqiB1soPjtwIYBzUtXq+oy/E1UXxORa0NdqKprVXWFqq7Iy8sbXcRRzEYwuat/mY3NVY22P4SJGU4miGqgNOh1CXBqkHPvYkDzkqqeCvxZBzyDv8nKhNDb52OfLbHhqtz0JIpyUujo7qP8xBm3wzFmTDiZILYCs0Vkuogk4U8Czw08SUSygeuAPwaVpYtIZv9z4GZgj4OxRrVDde109frIz0wmM8XmL7rlkkA/xHuVH5pbakxUcixBqGov8HXgVWA/8ISq7hWR+0TkvqBTPw2sU9WOoLIC4D0RKQe2AC+q6itOxRrt+puXrPbgrv5mpo2V1g9hYoOj4yFV9SX8e0cEl/1iwOtHgEcGlFUBS5yMLZbsrg5MkMvLcDmS+LagMAsR2HGimY6uXhtubKKebTkWA3bbDOqIkJ6cwMy8DHr61JbdMDHBEkSU6+nzsS8wOctGMLlvcYm/mendivgdcm1ihyWIKFdxuo3uXh+Ts1LIsCYN1y0pyQHgHUsQJgZYgohy/f0P1kEdGWbmZZCe5OVo41mONXYMf4ExEcwSRJQrrz4D+L+YjPu8Hjk/msmamUy0swQR5cpP+GsQM/OsBhEp+puZ3j5oCcJEN0sQUexcdx8HT7chYh3UkWTplBzAP2HuXHefu8EYMwqWIKLYvpoW+nxKyYQ024M6gkxIS2LGpHS6en1sPGyzqk30sgQRxfqbl2ZZ81LEuXTKBADeOFDnciTGjJwliCjW30FtM6gjz/Kp/gTx5v46VAdbxNiYyGYJIor1rxpqI5giz7SJaUxIS6S2tfP8THdjoo0liCjV3NHN0cazJHk9lOamuh2OGUBEWDEtF4BX9tS6HI0xI2MJIkrtDNQeZuSlk+Cxf8ZItDIoQVgzk4lG9s0SpXYcbwZgVr41L0Wq+YVZZCQnUNXQQcXpdrfDMeaiWYKIUjsCNQhLEJHL6xEuC9QiXt5T43I0xlw8SxBRyOdTdh4/A8Ds/Ex3gzFDuny6P0E8X37KmplM1LEEEYWqGtpp6+olNz2J3PQkt8MxQ7ikOJuslAQO13ew52Sr2+EYc1EsQUSh7cfOANa8FA28HuHKmZMAeHbnSZejMebiWIKIQmXHmgCYW2DNS9HgI7P8CeK58lP09vlcjsaY8DmaIERklYgcFJFKEfl+iOPXi0iLiOwMPH4Y7rXxrOyYfwTTnAKrQUSDmXnpTM5Kob6ti/WHbG0mEz0cSxAi4gUeAm4BFgB3i8iCEKeuV9WlgcePL/LauNPU0U1VfQeJXmHaRFuDKRqICNfPzQPgvzYfdzkaY8LnZA1iJVCpqlWq2g08Bqweh2tj2rZA7WFmXgYJXmshjBbXzcnD6xHePHCampZzbodjTFic/IYpBk4Eva4OlA10pYiUi8jLIrLwIq+NO/0JYu5k63+IJjlpSayYOgGfwuNbTwx/gTERwMkEISHKBg4E3w5MVdUlwE+BZy/iWv+JImtEpExEyurrY38Hr22BDuo51kEddW6aXwD4m5m6em0jIRP5nEwQ1UBp0OsS4FTwCaraqqrtgecvAYkiMimca4N+xlpVXaGqK/Ly8sYy/ojT2dNH+YkWBJhjE+SizsKiLKbkplHf1sUfd4S8nY2JKE4miK3AbBGZLiJJwF3Ac8EniMhkEZHA85WBeBrDuTYelZ84Q3efj9LcNDJSEtwOx1wkEeG2xYUA/PLdw/h8NrPaRDbHEoSq9gJfB14F9gNPqOpeEblPRO4LnHYnsEdEyoEHgbvUL+S1Ts
UaLTYf8TcvzS/McjkSM1JXzpzIxPQkDtd3sG6fLQNuIpujv4YGmo1eGlD2i6DnPwN+Fu618W5Lf4KwDuqoleDx8MklRfxm41H+ZV0FN80vsNFoJmLZnRklevp850cwzbMaRFT76Lx88jOTqaxr5+kdtvyGiVyWIKLE7pMtnOvpoygnhezURLfDMaOQ4PXw2RX+MRj/8upBWjt7XI7ImNAsQUSJTYcbAZg/2WoPseCqmROZnZ9BXVsX979y0O1wjAnJEkSU2HjYv4bPwqJslyMxY8Ejwr3XzMDrEf5z87Hz/UvGRBJLEFGgs6ePsqP+/oeFRVaDiBVTctO4fXERqvBXj+6gsb3L7ZCMuYAliCiw/VgzXb0+puamkWX9DzHlM8uLmVuQSW1rJ994dIfNsDYRxRJEFNjQ37xUbM1LsSbB4+GvbpxNdmoiGw838teP7bQ9I0zEsAQRBd6r9HdQX2LNSzEpNz2J798yj7QkLy/vqeWrv99GR1ev22EZYwki0p05283u6jN4PWIzqGPYtInpfG/VPDKSE3jjQB2f/vkG9pxscTssE+csQUS49Yca8Kl/e9GURK/b4RgHzSnI5MerF1KYnULF6XY+9dAGfvjHPdS1drodmolTliAi3NsH/UuYLy3NcTcQMy4Ks1P5p08vYtXCyfT5lN9tOsbV//wm33h0B28frKPH+ifMOLIlQSOYz6e8U2EJIt6kJHr5s6um8dF5+fxhezVbjzbxfPkpni8/RXZqIjfMzeOmBQVcNyePzBQb1WacYwkigu2raaWhvYvc9CRKJqS6HY4ZZ6W5afzNTXNoaO/i7YP1vF/VyMkz53h25yme3XmKpAQP183J4zPLirlxfgGJtuifGWOWICLY2wfrAFhSkk1g2wwThyZlJHPn8hLuXF7CqTPn2HasmW3Hmqk43cZr+07z2r7T5Gcmc+8107nn8qmkJ9t/azM27E6KYK/tOw3AsikTXI7ERIqinFSKclK5fUkRTR3dvF/VyJsH6jh55hz/9NIBHl5/hO+umsdnlhXbLxVm1KxOGqFqWzopr24hyethUYlNkDMflpuexK2LCrn/zsV8b9VcZualU9fWxbefLOeLv95CTcs5t0M0Uc4SRIR6bb+/9rC4JJvkBBveagYnIiwtncCPV1/CX1w3k4zkBNYfauATD77HxsoGt8MzUcwSRITqb15aPtWal0x4PCJcOyeP++9czOLibJo6uvn8rzbzX5uPuR2aiVKWICJQy7keNh1uQMT6H8zFy0lL4nur5rF6aRE+hf/nmT3862sVqKrboZkoYwkiAr26t5aePmVBYZat3mpGxOMR7rpsCvdeMx2PwANvHOJf1h20JGEuiqMJQkRWichBEakUke+HOH6PiOwKPDaKyJKgY0dFZLeI7BSRMifjjDQv7KoB4MqZE12OxES7G+cV8PUbZuMReOitwzzwxiG3QzJRxLFhriLiBR4CPgZUA1tF5DlV3Rd02hHgOlVtFpFbgLXA5UHHb1DVuOpla2zvYkNlA16PsHJartvhmBhw5cyJiMBP3zzEv71+iAlpSfzZVdPcDstEASdrECuBSlWtUtVu4DFgdfAJqrpRVZsDL98HShyMJyq8sreWPp+yqDjbllEwY+aKGRO595oZAPzo+b28sqfG5YhMNHAyQRQDJ4JeVwfKBvMV4OWg1wqsE5FtIrJmsItEZI2IlIlIWX19/agCjgRPbasG/JvaGzOWbpibz+dWlKIK33xsJzuONw9/kYlrTiaIUNM4Q/aQicgN+BPE94KKr1bVZcAtwNdE5NpQ16rqWlVdoaor8vLyRhuzqyrr2th+/AypiV4us+Yl44DVS4v46Lx8unp9/Pnvyjh5xibTmcE5mSCqgdKg1yXAqYEnichi4GFgtao29per6qnAn3XAM/ibrGLak4HawxUzcm3vB+MIEeHLV0/jkqIsGtq7ufe3ZbZ7nRmUkwliKzBbRKaLSBJwF/Bc8AkiMgV4GviCqlYEla
eLSGb/c+BmYI+Dsbqup8/H09tPAnDdnHyXozGxLMHj4Zs3zaEwO4X9Na18+8lyfD4b/mo+zLEEoaq9wNeBV4H9wBOquldE7hOR+wKn/RCYCPx8wHDWAuA9ESkHtgAvquorTsUaCdbtPU19WxdF2SnMKchwOxwT4zKSE/j2zXPP74Ntw19NKI6u5qqqLwEvDSj7RdDze4F7Q1xXBSwZWB7LfrvxKAA3L5xsq3CacVGUk8o3Pjqb+189wANvHGJ+YSarLil0OywTQWwmdQTYd6qVLUebSE30cu3s6O5oN9FlaWkOd102BYBvPVHO/ppWlyMykcQSRAT49YYjAFw7J4/UJOucNuPrtsWFXDNrEme7+7j3t2U0tHe5HZKJEJYgXFbdfJZnd5xEBFYtnOx2OCYOiQj3XjODWfkZnDxzjq/+fhudPX1uh2UigCUIl619t4pen3LVjIlMzk5xOxwTp5ISPPztx+YwMT2Jbcea+e4fdtnCfsYShJtqWzp5bKt/svnqpUNNMjfGeTlpSXzn43NJSfTwXPkp/terB90OybjMEoSL/vW1Crp7fVw+PZfS3DS3wzGGqRPT+esb5+AR+Pe3D/ObQP+YiU+WIFxysLaNJ7edwOsRPndZ6fAXGDNOlpTmsOZa/8J+f//8Pp4sOzHMFSZWWYJwgary4xf24lO4cV4+hdmpbodkzAWum5PP5y+fCsD3ntrFMzuqXY7IuMEShAue3XmSDZWNZCYn8Jnlcb/CuYlQn1hcyGeXl+BT/xyJJ7ZaTSLeWIIYZ43tXfzjC/sBuOeKKWTZng8mgt2xrOT8EuHffWoXD71VaaOb4ogliHGkqnz7yXIaO7pZWJRls6ZNVPjUpcV8+appCHD/qwf52yfKbZ5EnHB0LSZzoV+9d4S3DtaTnuzlL66baWsumahx88LJTEhL4qG3K3l6x0kO1Lbx4N2XMis/OheWbDnXw7HGDk6d6aT5bDddPX34FFKTvExIS6IwO4UZeelxv6ujJYhx8taBOv7pJX/T0levmcnEjGSXIzLm4lw2PZe/z1rIv75ewb6aVm776Xq+ffNcvnTVNBK8kdsY4fMp+2pa2XS4kW3Hmtl9siXsjZKm5KZx6ZQcrpo5kWvn5MXdgBKJpfbEFStWaFlZ2fAnjrPyE2e45+HNtHf1cseyYj673Ia1muh1truXRzYcZX1lAwDzJmfyg1vnc83sSRFTK27r7OGdinre2F/HuxX1NHZ0X3A80StMzk4lLyOJzJREkhM8iAhdPX20dvZS395Fbcs5evou/H5cVJzNJxYX8sklRRTlxEayEJFtqroi5DFLEM4qP3GGz/9qM22dvVw9cyJfu2FWxPwnMmY0th9v5jcbjtDQ7v/yXT51AmuuncGN8/JdqVHUtJzj9X2nWbfvNO9XNV7w5T4pI4mFRdnML8xkVl4mhdkpeDxD/z/s8ynVzWc5UNvGnpMt7D7ZQlevDwAR/77xn11eyqpLJkf1DpCWIFyybm8t33xsJ+d6+lg5PZdvfHQWCZ7IrYobc7G6e328vKeGF3bV0B7YunRyVgqfXFrELZdMZnFJDt5hvohH8947jjez/lADbx2sY++pD5YqF4G5BZksnzqBS0snUJSTMupfzLp7fZSfOMOGww1sP958PgFlpSRwx7IS7lpZyrzJWaN6DzdYghhnXb19/J91FaxdX4UqXDt7En9+7QxLDiZmdfb08eaBOl7ff5qals7z5bnpSayclsvyqRO4pDibOQUZI+p/8/mUk2fOsb+mld0nW9h2rJkdx89wLmg0VXKCh0XF2ayYNoFLp0xwdAh5R1cvm6oaeetAHVUNHefLL52Sw+dWlHLbkiIykqOji9cSxDjaUNnAj57by6G6djwCf7KilE8uKbJmJRMXVJWK0+1sqmpk+7Fm6kPsLZGVkkDJhDQKspLJTU8mKzWB1EQvCV4Pgn9/9rPdfbSe6wn0BXRyvOns+eadYMU5qSwqyWZJSQ4LCrNIShj/X8
KONnbwxv46NlQ2nE9YqYlePr6wgNWXFvORWZNIjOBOfEsQDuvzKe8eqmftO1VsqmoEoCArmb+8fhZzCjLHPR5jIoGqUtvayYHaNg7XtXM0MKz03AjnUOSkJlKSm8b0iWnMys9kTkEGOWlJYxz1yHX29LH5SBNvH6zjQG3b+fLs1ERuml/ATfPzuXr2pIibHOtaghCRVcADgBd4WFV/MuC4BI7fCpwFvqSq28O5NpTxTBCdPX1sP97MG/vreHl3DacC1erURC+3LyniE4sKXfltxphIpqq0dvbS0N5F89lu2jp7OdvVR3efjz6fDwUSPR6SEjykJXnJTk1kQnoS+ZnJpCVFR5MNwOnWTjZUNrDxcOMFQ2q9HmFxSTYrp+Vy6ZQJLCnNZnLW6PtHRsOVBCEiXqAC+BhQDWwF7lbVfUHn3Ap8A3+CuBx4QFUvD+faUMYyQagq53r6aD7bQ32bf8jbiaZzHK5vZ39NK/tr2uju+6DKm5+ZzEfn5XPT/ALSo6Tt0RjjvJPN5yg71sSO42c4VNeGb8BXbnZqIrPzM5g+KZ3S3DQKs1MoyEphUkYyuelJZKcmkpLocSyJDJUgnPwmWwlUqmpVIIjHgNVA8Jf8auB36s9S74tIjogUAtPCuHbMrNtby5rfbxvx9enJXopyUqk43UbF6bbhLzDGxJ30ZC9zCjKpOH1hkmg510PZsWbKjjWP/GcnefnNl1eycnruGET6AScTRDEQvPxjNf5awnDnFId5LQAisgZYE3jZLiJDbYM1CWgYWJiQXTDVk5o5aYjrhlU5mosH6Dvbgjctewx/onMsVmdYrM6I5ViveqDldF9r/UjWZZ862AEnE0So+tDA9qzBzgnnWn+h6lpgbVgBiZQNVpWKJCJS1ttSF/FxgsXqFIvVGRbrxXEyQVQDwWtKlACnwjwnKYxrjTHGOMjJYTZbgdkiMl1EkoC7gOcGnPMc8EXxuwJoUdWaMK81xhjjIMdqEKraKyJfB17FP1T116q6V0TuCxz/BfAS/hFMlfiHuX55qGvHIKywmqIiQLTECRarUyxWZ1isFyGmJsoZY4wZOzaTyxhjTEiWIIwxxoQU9QlCRFJEZIuIlIvIXhH5+xDniIg8KCKVIrJLRJYFHVslIgcDx74fAbHeE4hxl4hsFJElQceOishuEdkpIo6uKRJmrNeLSEsgnp0i8sOgY5H2uX4nKM49ItInIrmBY+P2uQbezysiO0TkhRDHIuJeDTPWiLhXw4w1Iu7VMGONmHsVVY3qB/45ExmB54nAZuCKAefcCrwcOPcKYHOg3AscBmbgH1pbDixwOdargAmB57f0xxp4fRSYFEGf6/XACyGujbjPdcD5twNvuvG5Bt7vW8D/HeSzi4h7NcxYI+JeDTPWiLhXw4l1wHmu3qtRX4NQv/bAy8TAY2DP+/klPVT1faB/SY/zy4GoajfQv6SHa7Gq6kZV7Z9z/z7+OSDjLszPdTAR97kOcDfwqFPxDEVESoBPAA8PckpE3KvhxBop9yqE9bkOJuI+1wFcu1chBpqY4Hx1bSdQB7ymqpsHnHIxS3oUOxhqOLEG+wr+3yb7KbBORLaJf4kRR4UZ65WBpp2XRWRhoCxiP1cRSQNWAU8FFY/n5/pvwHeBD29u4Bcx9yrDxxrM1XuV8GKNiHuVMD/XCLhXYyNBqGqfqi7F/xvMShG5ZMApo17SY6yEESsAInID/v903wsqvlpVl+Gvzn9NRK51OdbtwFRVXQL8FHi2P/xQP86pOCH8zxV/lX2DqjYFlY3L5yoitwF1qjrUypARca+GGWv/ua7eq2HGGhH36sV8rrh4r/aLiQTRT1XPAG/jz7rBBlvSI5zlQBwxRKyIyGL81c/VqtoYdM2pwJ91wDP4q8euxaqqrf1NO6r6EpAoIpOI0M814C4GVNnH8XO9GvikiBzF35TxURH5zwHnRMq9Gk6skXKvDhtrBN2rYX2uAW7eq+ffMKofQB6QE3ieCqwHbhtwzi
e4sONvS6A8AagCpvNBB9VCl2Odgn9m+VUDytOBzKDnG4FVLsc6mQ8mW64Ejgc+44j7XAPHsoEmIN2tzzXofa8ndKdpRNyrYcYaEfdqmLFGxL0aTqyRdK/Gws42hcBvxb/JkAd4QlVfEHeX9BhNrD8EJgI/F/8GIb3qX4G2AHgmUJYA/F9VfcXlWO8E/kJEeoFzwF3qv3sj8XMF+DSwTlU7gq4d78/1QyL0Xg0n1ki5V8OJNVLu1XBihQi5V22pDWOMMSHFVB+EMcaYsWMJwhhjTEiWIIwxxoRkCcIYY0xIliCMMcaEZAnCmCAi0j78WYNee1tghc5yEdknIl8d5vwvicjPRvp+xjgtFuZBGOM6EUnEv0XkSlWtFpFkYNoYv0eCqvaO5c80ZihWgzAmBPG7P7Ae/24R+Vyg3CMiPxf/vhMviMhLInInkIn/F65GAFXtUtWDgWtuF5HNgdrF6yJSEOL9Qp4jIj8SkbUisg74nYisF5GlQddtCCx3YcyYswRhTGh3AEuBJcBNwP2BZbfvwF8zWATcC1wJoP4F1Z4DjonIo+LfTKf//9d7+PenuBT/+jvfDfF+Q52zHP9aR3+Kf92jLwGIyBwgWVV3jdHf2ZgLWBOTMaF9BHhUVfuA0yLyDnBZoPxJVfUBtSLyVv8FqnqviCzCn1C+DXwM/5d5CfB4IMEkAUdCvN9Q5zynqucCz58E/ruIfAf4b8AjY/T3NeZDrAZhTGihloEeqhwAVd2tqv+KPzl8JlD8U+BnqroI+CqQEuLSoc45vx6Pqp4FXsO/qc2f4N+VzBhHWIIwJrR3gc8FNiLKA64FtuBvCvpMoC+iAP+KnIhIhohcH3T9UuBY4Hk2cDLw/M8Geb9wzun3MPAgsFUv3CvAmDFlTUzGhPYM/v6FcvwbyHxXVWtF5CngRmAPUIF//+sW/DWL74rIL/GvFtpBoK8A+BHwpIicxL815/QQ7xfOOQCo6jYRaQV+M7q/ojFDs9VcjblIIpKhqu0iMhF/reJqVa0dx/cvwr8p0rxAX4gxjrAahDEX7wURycHfmfwP45wcvgj8T+BblhyM06wGYYwxJiTrpDbGGBOSJQhjjDEhWYIwxhgTkiUIY4wxIVmCMMYYE9L/DwtKYBmTzntGAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"sns.kdeplot(x=bank['logSalary'], fill=True, linewidth=2);"
]
},
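{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to probe the two-distribution explanation is to compare log salaries by group rather than pooling them. The sketch below does this with `groupby()` on a tiny made-up frame named `toy` (hypothetical values, not the Bank data). With the real data, the same call could presumably be applied to `bank` using the `Manager` column created earlier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Tiny made-up frame (illustrative values only, not the Bank data)\n",
"toy = pd.DataFrame({\n",
"    'Salary': [30.0, 32.0, 34.0, 60.0, 64.0],\n",
"    'Manager': ['non-mgmt', 'non-mgmt', 'non-mgmt', 'mgmt', 'mgmt'],\n",
"})\n",
"\n",
"# Natural (base e) logarithm of each salary\n",
"toy['logSalary'] = np.log(toy['Salary'])\n",
"\n",
"# Mean log salary within each group; a large gap hints at two distributions\n",
"group_means = toy.groupby('Manager')['logSalary'].mean()\n",
"print(group_means)"
]
},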
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}