Patryk Czarnik / kurs_alx_pcz
Commit 9308b618, authored Dec 13, 2023 by Patryk Czarnik
analiza_danych - half of the emps examples
parent adcbf13b
Showing 1 changed file with 1903 additions and 1 deletion
jupyter/analiza_danych.ipynb +1903 -1
jupyter/analiza_danych.ipynb
@@ -730,9 +730,1911 @@
]
},
{
"cell_type": "markdown",
"id": "f8cc6bb1-b7c8-4b56-a086-b4e0c05ecb5c",
"metadata": {},
"source": [
"## Indexing\n",
"\n",
"that is, accessing data by coordinates.\n",
"\n",
"- ``.iloc`` - access by integer position, as in `numpy`, counting from zero\n",
"- ``.loc`` - access by index label and column name"
]
},
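A minimal sketch of the difference between the two accessors. The small `df` frame below is a hypothetical stand-in for `emps` (its values are made up for illustration), indexed by `employee_id` in the same way.

```python
import pandas as pd

# Hypothetical miniature of the emps table, indexed by employee_id.
df = pd.DataFrame(
    {
        "first_name": ["Steven", "Neena", "Lex"],
        "last_name": ["King", "Kochhar", "De Haan"],
        "salary": [24000, 17000, 17000],
    },
    index=pd.Index([100, 101, 102], name="employee_id"),
)

# .iloc: row position and column position, both counted from zero.
by_position = df.iloc[1, 2]

# .loc: index label and column name.
by_label = df.loc[101, "salary"]

print(by_position, by_label)
```

Both expressions address the same cell: row position 1 carries the label 101, and column position 2 is `salary`.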
{
"cell_type": "code",
"execution_count": 18,
"id": "4be9c4e3-5db4-4aef-ab9b-8848cf6f33f5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Steven'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.iloc[0, 0]"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "4f8411c4-4735-4712-b541-1f6d3dd86efe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"17000"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.iloc[2, 3]"
]
},
{
"cell_type": "markdown",
"id": "0efb33fa-aaff-4996-80bc-ac21f9a05ee5",
"metadata": {},
"source": [
"`DataFrame` and `Series` objects are mutable: individual cells can be modified in place."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "c12fa03f-4fbc-464f-8a59-cbb0a9ca2d35",
"metadata": {},
"outputs": [],
"source": [
"emps.iloc[0, 3] += 1"
]
},
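Because the change happens in place, it can be worth keeping an independent copy when the original values still matter. A small sketch, assuming a hypothetical two-row frame `df` rather than the real `emps`:

```python
import pandas as pd

# Hypothetical two-row frame standing in for emps.
df = pd.DataFrame(
    {"salary": [24000, 17000]},
    index=pd.Index([100, 101], name="employee_id"),
)

backup = df.copy()             # independent copy, unaffected by later changes

df.iloc[0, 0] += 1             # modify a cell addressed by position
df.loc[101, "salary"] = 18000  # modify a cell addressed by label

print(df)
print(backup)
```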
{
"cell_type": "code",
"execution_count": 23,
"id": "cc3281fc-f38b-4add-8850-11f5c8649b8a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" <th>hire_date</th>\n",
" <th>department_name</th>\n",
" <th>address</th>\n",
" <th>postal_code</th>\n",
" <th>city</th>\n",
" <th>country</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>Steven</td>\n",
" <td>King</td>\n",
" <td>President</td>\n",
" <td>24001</td>\n",
" <td>1997-06-17</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>Neena</td>\n",
" <td>Kochhar</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>1999-09-21</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>Lex</td>\n",
" <td>De Haan</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>2003-01-13</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>Alexander</td>\n",
" <td>Hunold</td>\n",
" <td>Programmer</td>\n",
" <td>9000</td>\n",
" <td>2000-01-03</td>\n",
" <td>IT</td>\n",
" <td>2014 Jabberwocky Rd</td>\n",
" <td>26192</td>\n",
" <td>Southlake</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>Bruce</td>\n",
" <td>Ernst</td>\n",
" <td>Programmer</td>\n",
" <td>6000</td>\n",
" <td>2001-05-21</td>\n",
" <td>IT</td>\n",
" <td>2014 Jabberwocky Rd</td>\n",
" <td>26192</td>\n",
" <td>Southlake</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary \\\n",
"employee_id \n",
"100 Steven King President 24001 \n",
"101 Neena Kochhar Administration Vice President 17000 \n",
"102 Lex De Haan Administration Vice President 17000 \n",
"103 Alexander Hunold Programmer 9000 \n",
"104 Bruce Ernst Programmer 6000 \n",
"\n",
" hire_date department_name address postal_code \\\n",
"employee_id \n",
"100 1997-06-17 Executive 2004 Charade Rd 98199 \n",
"101 1999-09-21 Executive 2004 Charade Rd 98199 \n",
"102 2003-01-13 Executive 2004 Charade Rd 98199 \n",
"103 2000-01-03 IT 2014 Jabberwocky Rd 26192 \n",
"104 2001-05-21 IT 2014 Jabberwocky Rd 26192 \n",
"\n",
" city country \n",
"employee_id \n",
"100 Seattle United States of America \n",
"101 Seattle United States of America \n",
"102 Seattle United States of America \n",
"103 Southlake United States of America \n",
"104 Southlake United States of America "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.head(5)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "f83593f7-7a4f-4501-b733-2aa8c30a4634",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>120</th>\n",
" <td>Matthew</td>\n",
" <td>Weiss</td>\n",
" <td>Stock Manager</td>\n",
" <td>8000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>121</th>\n",
" <td>Adam</td>\n",
" <td>Fripp</td>\n",
" <td>Stock Manager</td>\n",
" <td>8200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>122</th>\n",
" <td>Payam</td>\n",
" <td>Kaufling</td>\n",
" <td>Stock Manager</td>\n",
" <td>7900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>123</th>\n",
" <td>Shanta</td>\n",
" <td>Vollman</td>\n",
" <td>Stock Manager</td>\n",
" <td>6500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>124</th>\n",
" <td>Kevin</td>\n",
" <td>Mourgos</td>\n",
" <td>Stock Manager</td>\n",
" <td>5800</td>\n",
" </tr>\n",
" <tr>\n",
" <th>125</th>\n",
" <td>Julia</td>\n",
" <td>Nayer</td>\n",
" <td>Stock Clerk</td>\n",
" <td>3200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>126</th>\n",
" <td>Irene</td>\n",
" <td>Mikkilineni</td>\n",
" <td>Stock Clerk</td>\n",
" <td>2700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>127</th>\n",
" <td>James</td>\n",
" <td>Landry</td>\n",
" <td>Stock Clerk</td>\n",
" <td>2400</td>\n",
" </tr>\n",
" <tr>\n",
" <th>128</th>\n",
" <td>Steven</td>\n",
" <td>Markle</td>\n",
" <td>Stock Clerk</td>\n",
" <td>2200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>129</th>\n",
" <td>Laura</td>\n",
" <td>Bissot</td>\n",
" <td>Stock Clerk</td>\n",
" <td>3300</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary\n",
"employee_id \n",
"120 Matthew Weiss Stock Manager 8000\n",
"121 Adam Fripp Stock Manager 8200\n",
"122 Payam Kaufling Stock Manager 7900\n",
"123 Shanta Vollman Stock Manager 6500\n",
"124 Kevin Mourgos Stock Manager 5800\n",
"125 Julia Nayer Stock Clerk 3200\n",
"126 Irene Mikkilineni Stock Clerk 2700\n",
"127 James Landry Stock Clerk 2400\n",
"128 Steven Markle Stock Clerk 2200\n",
"129 Laura Bissot Stock Clerk 3300"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.iloc[20:30, :4]"
]
},
{
"cell_type": "markdown",
"id": "44dcabe7-9e7e-4789-a55a-6ba6b4869dd4",
"metadata": {},
"source": [
"`.loc` selects by the “business” index (here `employee_id`) and by column names. Unlike `.iloc`, label slices include both endpoints."
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "7506e308-23b4-453b-87dc-934040bb9516",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Kochhar'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.loc[101, 'last_name']"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "79604102-9c56-402f-95b4-fbd6dc6d6a8c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>Lex</td>\n",
" <td>De Haan</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>Alexander</td>\n",
" <td>Hunold</td>\n",
" <td>Programmer</td>\n",
" <td>9000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>Bruce</td>\n",
" <td>Ernst</td>\n",
" <td>Programmer</td>\n",
" <td>6000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>105</th>\n",
" <td>David</td>\n",
" <td>Austin</td>\n",
" <td>Programmer</td>\n",
" <td>4800</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary\n",
"employee_id \n",
"102 Lex De Haan Administration Vice President 17000\n",
"103 Alexander Hunold Programmer 9000\n",
"104 Bruce Ernst Programmer 6000\n",
"105 David Austin Programmer 4800"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.loc[102:105, 'first_name':'salary']"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "20d4942e-0a49-42e2-8ea7-860b7cfb22cb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>salary</th>\n",
" <th>city</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>Steven</td>\n",
" <td>King</td>\n",
" <td>24001</td>\n",
" <td>Seattle</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>Neena</td>\n",
" <td>Kochhar</td>\n",
" <td>17000</td>\n",
" <td>Seattle</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>Lex</td>\n",
" <td>De Haan</td>\n",
" <td>17000</td>\n",
" <td>Seattle</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>Alexander</td>\n",
" <td>Hunold</td>\n",
" <td>9000</td>\n",
" <td>Southlake</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>Bruce</td>\n",
" <td>Ernst</td>\n",
" <td>6000</td>\n",
" <td>Southlake</td>\n",
" </tr>\n",
" <tr>\n",
" <th>105</th>\n",
" <td>David</td>\n",
" <td>Austin</td>\n",
" <td>4800</td>\n",
" <td>Southlake</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name salary city\n",
"employee_id \n",
"100 Steven King 24001 Seattle\n",
"101 Neena Kochhar 17000 Seattle\n",
"102 Lex De Haan 17000 Seattle\n",
"103 Alexander Hunold 9000 Southlake\n",
"104 Bruce Ernst 6000 Southlake\n",
"105 David Austin 4800 Southlake"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.loc[:105, ['first_name', 'last_name', 'salary', 'city']]"
]
},
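The slicing rules of the two accessors differ, which is easy to overlook. The sketch below uses a hypothetical six-row frame `df` with made-up values: a positional slice excludes its end, while a label slice includes it.

```python
import pandas as pd

# Hypothetical frame with labels 100..105, mimicking the employee_id index.
df = pd.DataFrame(
    {"first_name": list("ABCDEF"), "salary": [1, 2, 3, 4, 5, 6]},
    index=pd.Index(range(100, 106), name="employee_id"),
)

# Positional slice: positions 2, 3, 4 (the end position 5 is excluded).
by_position = df.iloc[2:5]

# Label slice: labels 102 through 105, with 105 included.
by_label = df.loc[102:105]

print(list(by_position.index))
print(list(by_label.index))
```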
{
"cell_type": "markdown",
"id": "73a9600a-9078-4bb0-a376-cffd7b402482",
"metadata": {},
"source": [
"Reading a whole column is even simpler:\n",
"- attribute notation, available only when the column name is a valid Python identifier (no spaces or other special characters) and does not clash with an existing `DataFrame` attribute or method:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "30c45466-5e94-49bb-b330-4388189f5f14",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"employee_id\n",
"100 24001\n",
"101 17000\n",
"102 17000\n",
"103 9000\n",
"104 6000\n",
" ... \n",
"202 6000\n",
"203 6500\n",
"204 10000\n",
"205 12000\n",
"206 8300\n",
"Name: salary, Length: 107, dtype: int64"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary"
]
},
{
"cell_type": "markdown",
"id": "5a858350-06b3-4ad3-a726-7bcc9e2ded36",
"metadata": {},
"source": [
"- dictionary-style (bracket) notation, which works for any column name; a list of names selects several columns:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "ac767f05-1b0f-43f6-b11b-3b3809636e7f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"employee_id\n",
"100 1997-06-17\n",
"101 1999-09-21\n",
"102 2003-01-13\n",
"103 2000-01-03\n",
"104 2001-05-21\n",
" ... \n",
"202 2007-08-17\n",
"203 2004-06-07\n",
"204 2004-06-07\n",
"205 2004-06-07\n",
"206 2004-06-07\n",
"Name: hire_date, Length: 107, dtype: datetime64[ns]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps['hire_date']"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "71eadaec-b8a8-4d44-918e-a3062549d319",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>last_name</th>\n",
" <th>hire_date</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>King</td>\n",
" <td>1997-06-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>Kochhar</td>\n",
" <td>1999-09-21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>De Haan</td>\n",
" <td>2003-01-13</td>\n",
" </tr>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>Hunold</td>\n",
" <td>2000-01-03</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>Ernst</td>\n",
" <td>2001-05-21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>202</th>\n",
" <td>Fay</td>\n",
" <td>2007-08-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>203</th>\n",
" <td>Mavris</td>\n",
" <td>2004-06-07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>204</th>\n",
" <td>Baer</td>\n",
" <td>2004-06-07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>205</th>\n",
" <td>Higgins</td>\n",
" <td>2004-06-07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>206</th>\n",
" <td>Gietz</td>\n",
" <td>2004-06-07</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>107 rows × 2 columns</p>\n",
"</div>"
],
"text/plain": [
" last_name hire_date\n",
"employee_id \n",
"100 King 1997-06-17\n",
"101 Kochhar 1999-09-21\n",
"102 De Haan 2003-01-13\n",
"103 Hunold 2000-01-03\n",
"104 Ernst 2001-05-21\n",
"... ... ...\n",
"202 Fay 2007-08-17\n",
"203 Mavris 2004-06-07\n",
"204 Baer 2004-06-07\n",
"205 Higgins 2004-06-07\n",
"206 Gietz 2004-06-07\n",
"\n",
"[107 rows x 2 columns]"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[['last_name', 'hire_date']]"
]
},
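A short sketch of where the two notations diverge. The frame `df` and its column names `hire date` and `count` are hypothetical, chosen only to trigger the two problem cases: a name with a space, and a name that collides with a `DataFrame` method.

```python
import pandas as pd

# Hypothetical frame; "hire date" contains a space and "count" shadows a method.
df = pd.DataFrame(
    {
        "salary": [24000, 17000],
        "hire date": ["1997-06-17", "1999-09-21"],
        "count": [1, 2],
    }
)

# For an ordinary name both notations return the same Series.
same = df.salary.equals(df["salary"])

# A name with a space is reachable only through brackets.
hire = df["hire date"]

# df.count is the DataFrame.count method, not the column.
count_is_method = callable(df.count)
count_column = df["count"]

# A list of names returns a DataFrame rather than a Series.
subset = df[["salary", "count"]]

print(same, count_is_method, type(subset).__name__)
```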
{
"cell_type": "markdown",
"id": "50c95e65-80ab-476e-aa37-0cff3fc6f9b0",
"metadata": {},
"source": [
"Iterating over all rows\n",
"(rarely used in practice; if at all, then in a `.py` program rather than in a Jupyter Notebook)."
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "0e7725bb-bfac-404f-8e16-25033562a435",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Osoba Steven King zarabia 24001\n",
"Osoba Neena Kochhar zarabia 17000\n",
"Osoba Lex De Haan zarabia 17000\n",
"Osoba Alexander Hunold zarabia 9000\n",
"Osoba Bruce Ernst zarabia 6000\n",
"Osoba David Austin zarabia 4800\n",
"Osoba Valli Pataballa zarabia 4800\n",
"Osoba Diana Lorentz zarabia 4200\n",
"Osoba Nancy Greenberg zarabia 12000\n",
"Osoba Daniel Faviet zarabia 9000\n",
"Osoba John Chen zarabia 8200\n",
"Osoba Ismael Sciarra zarabia 7700\n",
"Osoba Jose Manuel Urman zarabia 7800\n",
"Osoba Luis Popp zarabia 6900\n",
"Osoba Den Raphaely zarabia 11000\n",
"Osoba Alexander Khoo zarabia 3100\n",
"Osoba Shelli Baida zarabia 2900\n",
"Osoba Sigal Tobias zarabia 2800\n",
"Osoba Guy Himuro zarabia 2600\n",
"Osoba Karen Colmenares zarabia 2500\n",
"Osoba Matthew Weiss zarabia 8000\n",
"Osoba Adam Fripp zarabia 8200\n",
"Osoba Payam Kaufling zarabia 7900\n",
"Osoba Shanta Vollman zarabia 6500\n",
"Osoba Kevin Mourgos zarabia 5800\n",
"Osoba Julia Nayer zarabia 3200\n",
"Osoba Irene Mikkilineni zarabia 2700\n",
"Osoba James Landry zarabia 2400\n",
"Osoba Steven Markle zarabia 2200\n",
"Osoba Laura Bissot zarabia 3300\n",
"Osoba Mozhe Atkinson zarabia 2800\n",
"Osoba James Marlow zarabia 2500\n",
"Osoba TJ Olson zarabia 2100\n",
"Osoba Jason Mallin zarabia 3300\n",
"Osoba Michael Rogers zarabia 2900\n",
"Osoba Ki Gee zarabia 2400\n",
"Osoba Hazel Philtanker zarabia 2200\n",
"Osoba Renske Ladwig zarabia 3600\n",
"Osoba Stephen Stiles zarabia 3200\n",
"Osoba John Seo zarabia 2700\n",
"Osoba Joshua Patel zarabia 2500\n",
"Osoba Trenna Rajs zarabia 3500\n",
"Osoba Curtis Davies zarabia 3100\n",
"Osoba Randall Matos zarabia 2600\n",
"Osoba Peter Vargas zarabia 2500\n",
"Osoba John Russell zarabia 14000\n",
"Osoba Karen Partners zarabia 13500\n",
"Osoba Alberto Errazuriz zarabia 12000\n",
"Osoba Gerald Cambrault zarabia 11000\n",
"Osoba Eleni Zlotkey zarabia 10500\n",
"Osoba Peter Tucker zarabia 10000\n",
"Osoba David Bernstein zarabia 9500\n",
"Osoba Peter Hall zarabia 9000\n",
"Osoba Christopher Olsen zarabia 8000\n",
"Osoba Nanette Cambrault zarabia 7500\n",
"Osoba Oliver Tuvault zarabia 7000\n",
"Osoba Janette King zarabia 10000\n",
"Osoba Patrick Sully zarabia 9500\n",
"Osoba Allan McEwen zarabia 9000\n",
"Osoba Lindsey Smith zarabia 8000\n",
"Osoba Louise Doran zarabia 7500\n",
"Osoba Sarath Sewall zarabia 7000\n",
"Osoba Clara Vishney zarabia 10500\n",
"Osoba Danielle Greene zarabia 9500\n",
"Osoba Mattea Marvins zarabia 7200\n",
"Osoba David Lee zarabia 6800\n",
"Osoba Sundar Ande zarabia 6400\n",
"Osoba Amit Banda zarabia 6200\n",
"Osoba Lisa Ozer zarabia 11500\n",
"Osoba Harrison Bloom zarabia 10000\n",
"Osoba Tayler Fox zarabia 9600\n",
"Osoba William Smith zarabia 7400\n",
"Osoba Elizabeth Bates zarabia 7300\n",
"Osoba Sundita Kumar zarabia 6100\n",
"Osoba Ellen Abel zarabia 11000\n",
"Osoba Alyssa Hutton zarabia 8800\n",
"Osoba Jonathon Taylor zarabia 8600\n",
"Osoba Jack Livingston zarabia 8400\n",
"Osoba Kimberely Grant zarabia 7000\n",
"Osoba Charles Johnson zarabia 6200\n",
"Osoba Winston Taylor zarabia 3200\n",
"Osoba Jean Fleaur zarabia 3100\n",
"Osoba Martha Sullivan zarabia 2500\n",
"Osoba Girard Geoni zarabia 2800\n",
"Osoba Nandita Sarchand zarabia 4200\n",
"Osoba Alexis Bull zarabia 4100\n",
"Osoba Julia Dellinger zarabia 3400\n",
"Osoba Anthony Cabrio zarabia 3000\n",
"Osoba Kelly Chung zarabia 3800\n",
"Osoba Jennifer Dilly zarabia 3600\n",
"Osoba Timothy Gates zarabia 2900\n",
"Osoba Randall Perkins zarabia 2500\n",
"Osoba Sarah Bell zarabia 4000\n",
"Osoba Britney Everett zarabia 3900\n",
"Osoba Samuel McCain zarabia 3200\n",
"Osoba Vance Jones zarabia 2800\n",
"Osoba Alana Walsh zarabia 3100\n",
"Osoba Kevin Feeney zarabia 3000\n",
"Osoba Donald OConnell zarabia 2600\n",
"Osoba Douglas Grant zarabia 2600\n",
"Osoba Jennifer Whalen zarabia 4400\n",
"Osoba Michael Hartstein zarabia 13000\n",
"Osoba Pat Fay zarabia 6000\n",
"Osoba Susan Mavris zarabia 6500\n",
"Osoba Hermann Baer zarabia 10000\n",
"Osoba Shelley Higgins zarabia 12000\n",
"Osoba William Gietz zarabia 8300\n"
]
}
],
"source": [
"for idx, row in emps.iterrows():\n",
" print(f'Osoba {row.first_name} {row.last_name} zarabia {row[\"salary\"]}')"
]
},
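When a row loop is really needed, `itertuples()` is a commonly used lighter alternative to `iterrows()`, and simple text like the above can often be built without any loop. A sketch on a hypothetical two-row frame `df` (the English message text is an illustration, not the notebook's output):

```python
import pandas as pd

# Hypothetical two-row frame standing in for emps.
df = pd.DataFrame(
    {
        "first_name": ["Steven", "Neena"],
        "last_name": ["King", "Kochhar"],
        "salary": [24000, 17000],
    },
    index=pd.Index([100, 101], name="employee_id"),
)

# Loop variant: itertuples() yields one namedtuple per row.
lines_loop = [
    f"{row.first_name} {row.last_name} earns {row.salary}"
    for row in df.itertuples()
]

# Vectorized variant: string concatenation on whole columns, no explicit loop.
lines_vectorized = (
    df.first_name + " " + df.last_name + " earns " + df.salary.astype(str)
).tolist()

print(lines_loop)
```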
{
"cell_type": "markdown",
"id": "c35cd855-26e2-4b5a-b614-5aeab4ac6422",
"metadata": {},
"source": [
"## Filtering data\n",
"\n",
"by a logical condition"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "b499577f-04a8-4779-a699-27ad6bd65a89",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" <th>hire_date</th>\n",
" <th>department_name</th>\n",
" <th>address</th>\n",
" <th>postal_code</th>\n",
" <th>city</th>\n",
" <th>country</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>Steven</td>\n",
" <td>King</td>\n",
" <td>President</td>\n",
" <td>24001</td>\n",
" <td>1997-06-17</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>Neena</td>\n",
" <td>Kochhar</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>1999-09-21</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>Lex</td>\n",
" <td>De Haan</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>2003-01-13</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary \\\n",
"employee_id \n",
"100 Steven King President 24001 \n",
"101 Neena Kochhar Administration Vice President 17000 \n",
"102 Lex De Haan Administration Vice President 17000 \n",
"\n",
" hire_date department_name address postal_code city \\\n",
"employee_id \n",
"100 1997-06-17 Executive 2004 Charade Rd 98199 Seattle \n",
"101 1999-09-21 Executive 2004 Charade Rd 98199 Seattle \n",
"102 2003-01-13 Executive 2004 Charade Rd 98199 Seattle \n",
"\n",
" country \n",
"employee_id \n",
"100 United States of America \n",
"101 United States of America \n",
"102 United States of America "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[emps.salary >= 15000]"
]
},
{
"cell_type": "markdown",
"id": "12666367-22b8-4cdd-8cf1-3d963ee418a1",
"metadata": {},
"source": [
"Technically, the expression `emps.salary >= 15000` evaluates to a `Series` of True/False values"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "b884e0ff-a189-4cec-ba17-720bb43e654d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"employee_id\n",
"100 True\n",
"101 True\n",
"102 True\n",
"103 False\n",
"104 False\n",
" ... \n",
"202 False\n",
"203 False\n",
"204 False\n",
"205 False\n",
"206 False\n",
"Name: salary, Length: 107, dtype: bool"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary >= 15000"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "ffe6b0e2-6295-451a-a4fb-35652d3e842a",
"metadata": {},
"outputs": [],
"source": [
"warunki = emps.salary >= 15000"
]
},
{
"cell_type": "markdown",
"id": "2867f3a0-d8a5-4f4d-966b-f10b48a918c7",
"metadata": {},
"source": [
"When such a series is passed inside the square brackets, the result contains those records for which the corresponding position holds True."
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "d8249bbf-e26b-457d-bb55-d3eb9153859a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" <th>hire_date</th>\n",
" <th>department_name</th>\n",
" <th>address</th>\n",
" <th>postal_code</th>\n",
" <th>city</th>\n",
" <th>country</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>100</th>\n",
" <td>Steven</td>\n",
" <td>King</td>\n",
" <td>President</td>\n",
" <td>24001</td>\n",
" <td>1997-06-17</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>101</th>\n",
" <td>Neena</td>\n",
" <td>Kochhar</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>1999-09-21</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>Lex</td>\n",
" <td>De Haan</td>\n",
" <td>Administration Vice President</td>\n",
" <td>17000</td>\n",
" <td>2003-01-13</td>\n",
" <td>Executive</td>\n",
" <td>2004 Charade Rd</td>\n",
" <td>98199</td>\n",
" <td>Seattle</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary \\\n",
"employee_id \n",
"100 Steven King President 24001 \n",
"101 Neena Kochhar Administration Vice President 17000 \n",
"102 Lex De Haan Administration Vice President 17000 \n",
"\n",
" hire_date department_name address postal_code city \\\n",
"employee_id \n",
"100 1997-06-17 Executive 2004 Charade Rd 98199 Seattle \n",
"101 1999-09-21 Executive 2004 Charade Rd 98199 Seattle \n",
"102 2003-01-13 Executive 2004 Charade Rd 98199 Seattle \n",
"\n",
" country \n",
"employee_id \n",
"100 United States of America \n",
"101 United States of America \n",
"102 United States of America "
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[warunki]"
]
},
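The mechanism can be traced step by step on a tiny example. The frame `df` below is hypothetical; the point is that the mask is an ordinary boolean `Series` aligned with the index, so it can be stored, counted, and reused.

```python
import pandas as pd

# Hypothetical four-row frame standing in for emps.
df = pd.DataFrame(
    {"salary": [24000, 17000, 9000, 6000]},
    index=pd.Index([100, 101, 102, 103], name="employee_id"),
)

mask = df.salary >= 15000   # boolean Series with the same index as df

high_paid = df[mask]        # rows where the mask is True
same_rows = df.loc[mask]    # .loc accepts the same mask

# True counts as 1, so the sum is the number of matching rows.
n_matching = int(mask.sum())

print(mask.tolist(), n_matching)
```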
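{
"cell_type": "markdown",
"id": "6a1bd071-6a02-49ba-a88a-87523b4f0872",
"metadata": {},
"source": [
"### Compound logical conditions\n",
"\n",
"These must be built with the ``&`` and ``|`` operators (plus ``~`` for negation), not `and`/`or`, and each comparison needs its own parentheses."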
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "c8fc1a87-2aab-40d6-8a52-18386b0ae922",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>first_name</th>\n",
" <th>last_name</th>\n",
" <th>job_title</th>\n",
" <th>salary</th>\n",
" <th>hire_date</th>\n",
" <th>department_name</th>\n",
" <th>address</th>\n",
" <th>postal_code</th>\n",
" <th>city</th>\n",
" <th>country</th>\n",
" </tr>\n",
" <tr>\n",
" <th>employee_id</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>103</th>\n",
" <td>Alexander</td>\n",
" <td>Hunold</td>\n",
" <td>Programmer</td>\n",
" <td>9000</td>\n",
" <td>2000-01-03</td>\n",
" <td>IT</td>\n",
" <td>2014 Jabberwocky Rd</td>\n",
" <td>26192</td>\n",
" <td>Southlake</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" <tr>\n",
" <th>104</th>\n",
" <td>Bruce</td>\n",
" <td>Ernst</td>\n",
" <td>Programmer</td>\n",
" <td>6000</td>\n",
" <td>2001-05-21</td>\n",
" <td>IT</td>\n",
" <td>2014 Jabberwocky Rd</td>\n",
" <td>26192</td>\n",
" <td>Southlake</td>\n",
" <td>United States of America</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" first_name last_name job_title salary hire_date \\\n",
"employee_id \n",
"103 Alexander Hunold Programmer 9000 2000-01-03 \n",
"104 Bruce Ernst Programmer 6000 2001-05-21 \n",
"\n",
" department_name address postal_code city \\\n",
"employee_id \n",
"103 IT 2014 Jabberwocky Rd 26192 Southlake \n",
"104 IT 2014 Jabberwocky Rd 26192 Southlake \n",
"\n",
" country \n",
"employee_id \n",
"103 United States of America \n",
"104 United States of America "
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[(emps.job_title == 'Programmer') & (emps.salary >= 5000)]"
]
},
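A sketch of the three element-wise operators on a hypothetical four-row frame `df` (values made up). Python's `and`/`or` would try to reduce a whole `Series` to a single truth value and raise `ValueError`, which is why the bitwise operators are used instead.

```python
import pandas as pd

# Hypothetical frame standing in for emps.
df = pd.DataFrame(
    {
        "job_title": ["Programmer", "Programmer", "President", "Stock Clerk"],
        "salary": [9000, 4800, 24000, 2700],
    }
)

# AND: both conditions must hold; each comparison sits in its own parentheses.
well_paid_programmers = df[(df.job_title == "Programmer") & (df.salary >= 5000)]

# OR: at least one condition must hold.
extremes = df[(df.salary < 3000) | (df.salary > 20000)]

# NOT: invert a condition.
not_programmers = df[~(df.job_title == "Programmer")]

print(well_paid_programmers)
```

The parentheses matter because `&` and `|` bind more tightly than `==` and `>=`.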
{
"cell_type": "markdown",
"id": "05ac332e-00ab-48e6-bb60-c0c188f0ffa5",
"metadata": {},
"source": [
"## Aggregate functions\n",
"\n",
"Statistics and the like.\n",
"\n",
"The easiest way is to call the function on a single column:"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "67d95ad3-04a1-41f4-ac16-124e8c0c02e2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6461.691588785046"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.mean()"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "e31c1359-33c7-4724-9547-fd17460cb51f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(2100, 24001, 6461.691588785046, 6200.0, 691401, 107, 3909.408070359112)"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.min(), emps.salary.max(), emps.salary.mean(), emps.salary.median(), emps.salary.sum(), emps.salary.count(), emps.salary.std()"
]
},
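The same set of statistics can also be requested in one call with `Series.agg`, which returns a labelled `Series` instead of a bare tuple. A minimal sketch on a hypothetical four-element `salary` series:

```python
import pandas as pd

# Hypothetical salary values, not taken from emps.
salary = pd.Series([2000, 4000, 6000, 8000], name="salary")

# One call, one labelled result per statistic.
stats = salary.agg(["min", "max", "mean", "median", "sum", "count", "std"])

print(stats)
```

Note that `std` here is the sample standard deviation (pandas divides by n - 1 by default).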
{
"cell_type": "markdown",
"id": "614c91fb-35a8-4859-a276-d23b1344bcaf",
"metadata": {},
"source": [
"Combining filtering with computing statistics, we can, for example, get the average salary of programmers:"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "06eab5e5-3393-4900-beea-0829baf098a8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5760.0"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[emps.job_title == 'Programmer'].salary.mean()"
]
},
{
"cell_type": "markdown",
"id": "c33b801b-a4f2-4666-a42a-7a6f4167d7b5",
"metadata": {},
"source": [
"As a curiosity, the same can be done by selecting the column first and filtering afterwards:"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "83bda502-f8e7-49da-97fc-6dc0a74ef68b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5760.0"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary[emps.job_title == 'Programmer'].mean()"
]
},
{
"cell_type": "markdown",
"id": "5b6c0a5f-1c90-4af3-9c36-b367dff5a60a",
"metadata": {},
"source": [
"### Functions that compute several statistics at once"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "c02553b9-a581-4552-9e46-1218d7cb8fd6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 107.000000\n",
"mean 6461.691589\n",
"std 3909.408070\n",
"min 2100.000000\n",
"25% 3100.000000\n",
"50% 6200.000000\n",
"75% 8900.000000\n",
"max 24001.000000\n",
"Name: salary, dtype: float64"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.describe()"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "e302a438-f917-4131-b388-999b982d5583",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 107.000000\n",
"mean 6461.691589\n",
"std 3909.408070\n",
"min 2100.000000\n",
"10% 2560.000000\n",
"20% 2900.000000\n",
"50% 6200.000000\n",
"max 24001.000000\n",
"Name: salary, dtype: float64"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.describe(percentiles=[0.1, 0.2, 0.5])"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "9a38a54d-e31e-4907-a9c4-f1bd7bb971b6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 107.000000\n",
"mean 6461.691589\n",
"std 3909.408070\n",
"min 2100.000000\n",
"0% 2100.000000\n",
"10% 2560.000000\n",
"20% 2900.000000\n",
"30% 3200.000000\n",
"40% 4040.000000\n",
"50% 6200.000000\n",
"60% 7260.000000\n",
"70% 8200.000000\n",
"80% 9500.000000\n",
"90% 11000.000000\n",
"max 24001.000000\n",
"Name: salary, dtype: float64"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.describe(percentiles=np.arange(0, 1, 0.1))"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "c34e86d4-541d-416e-94c7-0483be5bd9e9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 106\n",
"unique 7\n",
"top South San Francisco\n",
"freq 45\n",
"Name: city, dtype: object"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.city.describe()"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "8dd0410f-9b35-48a1-91cc-333916f79ef8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"city\n",
"South San Francisco 45\n",
"Oxford 34\n",
"Seattle 18\n",
"Southlake 5\n",
"Toronto 2\n",
"London 1\n",
"Munich 1\n",
"Name: count, dtype: int64"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.city.value_counts()"
]
},
{
"cell_type": "markdown",
"id": "5cc30484-9027-48e7-b4ad-223ad27090f9",
"metadata": {},
"source": [
"The `agg` operation computes several aggregate functions over the same data.\n",
"\n",
"It is especially useful in combination with grouping, which is covered shortly..."
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "6447e85e-2078-4667-b536-b49fce6f7e4b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"min 2100.000000\n",
"mean 6461.691589\n",
"max 24001.000000\n",
"Name: salary, dtype: float64"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps.salary.agg(['min', 'mean', 'max'])"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "b49cd299-4d7d-4dae-90e2-13e4a11afc99",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 5.0\n",
"min 4200.0\n",
"mean 5760.0\n",
"median 4800.0\n",
"max 9000.0\n",
"Name: salary, dtype: float64"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"emps[emps.job_title == 'Programmer'].salary.agg(['count', 'min', 'mean', 'median', 'max'])"
]
},
{
"cell_type": "markdown",
"id": "28a14de8-ddb8-4f16-a2be-cfc300bdf2f2",
"metadata": {},
"source": [
"## Working with the sales example (`sprzedaz`)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "cfa4857f-99bb-4e89-a7e4-6c916efdcb11",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>data</th>\n",
" <th>miasto</th>\n",
" <th>sklep</th>\n",
" <th>kategoria</th>\n",
" <th>towar</th>\n",
" <th>cena</th>\n",
" <th>sztuk</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2014-11-23</td>\n",
" <td>Łódź</td>\n",
" <td>Wdowiak</td>\n",
" <td>meble</td>\n",
" <td>biurko</td>\n",
" <td>149.99</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2017-05-07</td>\n",
" <td>Radom</td>\n",
" <td>Czarnecki</td>\n",
" <td>wyposażenie szkolne</td>\n",
" <td>tablica</td>\n",
" <td>590.00</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2017-05-05</td>\n",
" <td>Kraków</td>\n",
" <td>Kozłowski</td>\n",
" <td>szkolno-biurowe</td>\n",
" <td>flamaster</td>\n",
" <td>0.99</td>\n",
" <td>51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2016-10-19</td>\n",
" <td>Kraków</td>\n",
" <td>Wróbel</td>\n",
" <td>wyposażenie szkolne</td>\n",
" <td>gąbka</td>\n",
" <td>4.00</td>\n",
" <td>250</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2016-04-08</td>\n",
" <td>Poznań</td>\n",
" <td>Borowik</td>\n",
" <td>meble</td>\n",
" <td>biurko</td>\n",
" <td>149.99</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9995</th>\n",
" <td>2016-05-22</td>\n",
" <td>Katowice</td>\n",
" <td>Gaińska</td>\n",
" <td>szkolno-biurowe</td>\n",
" <td>dziurkacz</td>\n",
" <td>7.50</td>\n",
" <td>178</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9996</th>\n",
" <td>2016-11-19</td>\n",
" <td>Kraków</td>\n",
" <td>Kozłowski</td>\n",
" <td>meble</td>\n",
" <td>biurko</td>\n",
" <td>149.99</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9997</th>\n",
" <td>2016-09-30</td>\n",
" <td>Łódź</td>\n",
" <td>Wdowiak</td>\n",
" <td>szkolno-biurowe</td>\n",
" <td>długopis</td>\n",
" <td>1.49</td>\n",
" <td>87</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9998</th>\n",
" <td>2015-05-01</td>\n",
" <td>Kraków</td>\n",
" <td>Kozłowski</td>\n",
" <td>meble</td>\n",
" <td>biurko</td>\n",
" <td>149.99</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9999</th>\n",
" <td>2016-08-26</td>\n",
" <td>Kraków</td>\n",
" <td>Kozłowski</td>\n",
" <td>wyposażenie szkolne</td>\n",
" <td>gąbka</td>\n",
" <td>4.00</td>\n",
" <td>152</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>10000 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" data miasto sklep kategoria towar cena \\\n",
"0 2014-11-23 Łódź Wdowiak meble biurko 149.99 \n",
"1 2017-05-07 Radom Czarnecki wyposażenie szkolne tablica 590.00 \n",
"2 2017-05-05 Kraków Kozłowski szkolno-biurowe flamaster 0.99 \n",
"3 2016-10-19 Kraków Wróbel wyposażenie szkolne gąbka 4.00 \n",
"4 2016-04-08 Poznań Borowik meble biurko 149.99 \n",
"... ... ... ... ... ... ... \n",
"9995 2016-05-22 Katowice Gaińska szkolno-biurowe dziurkacz 7.50 \n",
"9996 2016-11-19 Kraków Kozłowski meble biurko 149.99 \n",
"9997 2016-09-30 Łódź Wdowiak szkolno-biurowe długopis 1.49 \n",
"9998 2015-05-01 Kraków Kozłowski meble biurko 149.99 \n",
"9999 2016-08-26 Kraków Kozłowski wyposażenie szkolne gąbka 4.00 \n",
"\n",
" sztuk \n",
"0 4 \n",
"1 2 \n",
"2 51 \n",
"3 250 \n",
"4 9 \n",
"... ... \n",
"9995 178 \n",
"9996 7 \n",
"9997 87 \n",
"9998 10 \n",
"9999 152 \n",
"\n",
"[10000 rows x 7 columns]"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sprzedaz"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6554eb49-a75f-4a99-bee3-22077bc36162",
"metadata": {},
"outputs": [],
"source": []
...
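The filter-then-aggregate pattern used in the notebook can be tried in isolation with the minimal sketch below. It assumes a made-up miniature table, `emps_demo`, standing in for the real `emps` data; the job titles and salaries are hypothetical and serve only to illustrate `agg` on a filtered column.

```python
import pandas as pd

# Hypothetical miniature stand-in for the notebook's `emps` table;
# the values below are invented for illustration only.
emps_demo = pd.DataFrame(
    {
        "job_title": ["Programmer", "Programmer", "Programmer", "Accountant", "Accountant"],
        "salary": [9000, 6000, 4800, 8200, 7700],
    }
)

# Filter the rows first, then apply several aggregate functions to one column.
programmers = emps_demo[emps_demo.job_title == "Programmer"]
stats = programmers.salary.agg(["count", "min", "mean", "max"])
print(stats)

# Filtering the column directly with the same boolean mask gives the same mean.
mean_alt = emps_demo.salary[emps_demo.job_title == "Programmer"].mean()
print(mean_alt)
```

Both forms select the same rows, so choosing between them is mostly a matter of readability: filtering the whole frame first is convenient when several columns are needed afterwards.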